Okay, hello everybody. First of all, I want to thank you for giving me the opportunity to present my work, which I did together with Professor Lars Schäfer from Ruhr University Bochum, Germany, and with Professor Frans Mulder from Aarhus University in Denmark, about how we used NMR relaxation data to improve the dynamics of methyl groups in AMBER and CHARMM force fields.

This is a collaborative project between NMR relaxation experiments and MD simulations. Our original idea was not to improve the force fields. Our original idea was to describe the stability of proteins in solution, and the reason that we reparametrized the force fields was just that, on the way, we found out that the force field parameters for methyl group rotations are actually incorrect.

That's why I want to start by describing a little bit the thermostability of T4 lysozyme and connecting this thermostability with the role of the conformational entropy for the stability of this protein. Then I'd like to make the connection between the conformational entropy and NMR order parameters and relaxation rates. From these relaxation rates I will then show that the current force fields, at least the current AMBER and CHARMM force fields, do not describe the dynamics of the rotation of methyl groups in protein side chains correctly, and then I want to show you how we reparametrized these force fields with quantum chemical calculations. After this I'd like to comment on the applicability of the famous Lipari-Szabo model, which is used to extract NMR order parameters, to the description of the dynamics of methyl groups, and I want to finish the talk with an evaluation of current AMBER and CHARMM force fields for the description of the dynamics of proteins, based on NMR relaxation data for protein backbone and side chains.

So let's start with the thermostability of T4 lysozyme. One of the main challenges in biochemistry is the discovery of new drugs, and one of the
approaches to finding new drugs is to mutate already existing ones in order to improve their functionality or their behavior in response to some external stimulus. Here we focused on a protein that is very stable in solution, T4 lysozyme, which is also used for many force field studies, and we looked especially at different leucine-to-alanine mutations of this protein. All the leucine residues which were mutated to alanine point toward the interior of a specific domain of the protein. When you mutate such a bigger side chain to a smaller side chain like alanine, you create additional cavity volume in the vicinity of the side chain, and you would assume that the increase in cavity volume which you get if you always perform the same kind of mutation in a stable protein would be similar. However, this is not the case. These increases in cavity volume range from about 30 cubic angstroms to more than 100 cubic angstroms, depending on where you make the mutation.

An immediate explanation for this would be some structural change. But if you compare the crystal structures of these mutants with the crystal structure of the wild type, you see that the Cα RMSD is lower than two angstroms, so there is no real structural change which could explain the different increases in cavity volume upon mutation. Our assumption was therefore that the reason for these different increases in cavity volume is probably an entropic effect and not an enthalpic effect.

What we then did was some free energy calculations with pmx, whose main developer, Vytautas Gapsys, gave, I think, one of the last talks in the webinars of the Open Force Field Initiative, and we also did some entropy calculations upon mutation. We found that there was just a modest correlation between the increase in cavity volume and the free energy difference between mutant and wild type, but there was a
clear linear relationship between the change in entropy and the change in the cavity volume.

This talk is not intended to be a free energy talk, but this result motivated us to look more closely at the reason why we have this correlation between the entropy and the cavity volume, and therefore we wanted to calculate the conformational entropy on a per-residue basis.

In general, if you calculate the entropy difference between two systems, in this case mutant and wild type, you can have different kinds of contributions. You can have a change in entropy coming from the conformational entropy of the protein, from the rotational and translational motion of the full protein, from the solvent, or from other influences which are not further specified here. In the case of T4 lysozyme, which is very stable and does not change its structure or shape upon mutation, one can easily see that the change in conformational entropy dominates the full entropy change. The change in conformational entropy can then be described as the change in entropy of the protein backbone plus the change in entropy of the protein side chains. One also has a coupling term, but usually this coupling term is small in comparison to the individual contributions from backbone and side chains.

Changes in conformational entropy are in general connected to changes in the dynamics in phase space, and this dynamics can be expressed by the orientational motions of a representative bond.
So in the case of the protein backbone one could look at amide bonds, because they are representative of the dynamics of the protein backbone of a specific residue. In the case of side chains one would look at a bond at the end of the protein side chain, because the dynamics of bonds at the end of protein side chains is influenced by the dynamics of all bonds of the side chain.

We chose methyl groups for three reasons. First, we can easily measure them in NMR relaxation experiments by using deuterium spin relaxation measurements. Second, six of the 20 amino acids contain methyl groups, so the conformational entropy which you get from all side chains which contain methyl groups is representative of the conformational entropy of all side chains of the protein. And the third reason is that methyl groups are always at the end of protein side chains, so their dynamics is representative of the dynamics of the full side chain.

If you want to describe the dynamics of these bonds, you can do NMR measurements and describe the dynamics with NMR order parameters, which take values between zero and one. Zero means that all orientations of the bond are equally populated, so you have a lot of flexibility. One means that the bond always points in the same direction, so there is very limited flexibility.
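As a small illustration of the two limits just mentioned, here is a minimal sketch that computes an order parameter from a set of bond unit vectors. It assumes the vectors are already expressed in the molecular frame (overall tumbling removed) and uses the standard equilibrium expression S² = (3/2) Σᵢⱼ ⟨uᵢuⱼ⟩² − 1/2. The function name and the toy vector sets are illustrative assumptions, not taken from the talk.

```python
def order_parameter(unit_vectors):
    """Return S^2 = 3/2 * sum_ij <u_i u_j>^2 - 1/2 for a list of 3D unit vectors.

    The vectors are assumed to be bond orientations in the molecular frame,
    i.e. after overall rotation has been removed.
    """
    n = len(unit_vectors)
    total = 0.0
    for i in range(3):
        for j in range(3):
            mean_ij = sum(u[i] * u[j] for u in unit_vectors) / n
            total += mean_ij ** 2
    return 1.5 * total - 0.5


# Toy data (hypothetical): a bond that never moves, and a bond that
# samples the six axis directions with equal weight.
rigid = [(0.0, 0.0, 1.0)] * 100
isotropic = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

print("rigid bond:     S^2 =", order_parameter(rigid))
print("isotropic bond: S^2 =", order_parameter(isotropic))
```

The rigid set gives a value of one and the evenly distributed set gives a value close to zero, matching the two limits described above.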
These order parameters can then be directly connected to the conformational entropy. Amide order parameters have been reported frequently in the literature, from MD simulations and from NMR experiments, and they correlate quite well, because describing the internal motions of amide bonds of the protein backbone is relatively easy. There are also studies which calculate order parameters for methyl groups in protein side chains, but there are only a few, and these studies either look at very simple test cases or their correlation with NMR order parameters is relatively poor. I'd like to describe the reason for this.

In general, the order parameter is defined as the long-time limit of the internal time correlation function. That means one focuses only on the internal motion, which you can easily extract from an MD simulation by superimposing every snapshot of the trajectory onto a reference structure. In the case of methyl groups in protein side chains, there are several motions which contribute to this internal time correlation function. The first kind of motion is very fast librational motion on the femtosecond time range. Then we have the rotation of the carbon-hydrogen bonds around the symmetry axis of the methyl group, which for a freely rotating methyl group usually happens on a timescale of about five picoseconds. In addition, because the methyl groups are at the end of the protein side chains, their dynamics is also influenced by the dynamics of the other bonds of the side chain, which themselves have fast librational motions and can also undergo one or more rotamer jumps on a picosecond to nanosecond timescale. And if you want to describe the dynamics of these bonds completely, you also have to include the overall motion of the protein, which happens on a typical timescale of 10 nanoseconds for
the global tumbling of a protein of the size of T4 lysozyme.

Motions one, two and three contribute to the decay of the internal time correlation function. The fast methyl rotation decreases the internal time correlation function to a value of one over nine, and then the time correlation function decreases further to its constant plateau value, which corresponds to one over nine times the methyl axis order parameter, which is the value usually reported in the literature. This order parameter is easily extractable from MD simulations. However, this is not the order parameter which is measured in NMR experiments, and this is one reason why there are differences between methyl axis order parameters from MD simulations and from NMR.

In NMR you usually put your protein in solution and label it, in our case with deuterium. Then you put it in the spectrometer and apply a strong magnetic field, so that the spins populate the energy levels defined by this strong magnetic field. Then you apply a second magnetic field with a radio-frequency pulse. The spins are then out of equilibrium, and after the perturbation is released they relax back. What you measure are relaxation curves, and the inverses of their time constants are the relaxation rates. Usually you measure a set of relaxation rates, which can then be converted to a set of points of the spectral density, which describe the contributions of different frequencies to the motion of the bond of interest.

However, there are two problems on the NMR side. The first problem is that you can only measure specific points of the spectral density; you cannot measure the full spectral density. You can get more points of the spectral density if you measure at different magnetic field strengths or use something like field shuttling, but you will never get the full
spectral density. And the second point is that the spectral density is the Fourier transform of the time correlation function. Even if you could get the full spectral density, and from it the full time correlation function, you could not extract the time correlation function of the internal motion, because the time correlation function will always be a superposition of the internal motion of the bond and the overall motion of the full protein.

Therefore, in NMR you usually use a motional model, the so-called Lipari-Szabo model. The Lipari-Szabo model assumes that the overall motion of the protein and the internal motion of the bond of interest are statistically independent. If this is the case, you can write the time correlation function as a product of the time correlation function of the overall motion of the protein and that of the internal motion of your bond of interest. For an isotropic protein, the overall motion can be described by an exponential decay with the rotational tumbling time of the protein, and the internal motion also decays exponentially, on a characteristic timescale tau_f, to its final value of one-ninth times the axis order parameter. If you assume such a model, you can easily Fourier transform it, and you can use this model to fit your points of the spectral density. That's how the NMR experimentalists get their order parameters.
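Written out, the model described in words above takes roughly the following form. This is a sketch under stated assumptions: the symbols C_O, C_I, tau_R, tau_f and S²_axis are labels chosen here, the protein is taken to be isotropic, and the spectral density is normalized as a one-sided cosine transform without the prefactors that some conventions include.

```latex
\begin{align}
  C(t)      &= C_O(t)\, C_I(t), \qquad C_O(t) = e^{-t/\tau_R}, \\
  C_I(t)    &= \tfrac{1}{9} S^2_{\mathrm{axis}}
               + \left(1 - \tfrac{1}{9} S^2_{\mathrm{axis}}\right) e^{-t/\tau_f}, \\
  J(\omega) &= \int_0^\infty C(t)\cos(\omega t)\,\mathrm{d}t
             = \tfrac{1}{9} S^2_{\mathrm{axis}}\,
               \frac{\tau_R}{1 + \omega^2 \tau_R^2}
             + \left(1 - \tfrac{1}{9} S^2_{\mathrm{axis}}\right)
               \frac{\tau}{1 + \omega^2 \tau^2},
  \qquad \frac{1}{\tau} = \frac{1}{\tau_R} + \frac{1}{\tau_f}.
\end{align}
```

Fitting the measured spectral density points with the last expression yields S²_axis and tau_f, given the tumbling time tau_R.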
I want to mention three aspects here. The first aspect is the basic assumption that the overall motion and the internal motion are statistically independent. This is usually valid if these two kinds of motion happen on different timescales, and therefore this model is perfectly suited for the description of amide order parameters, because amide bonds in structured regions like alpha helices, for example, have only very fast internal motions, these librational motions on a femtosecond to picosecond time range, which are much faster than the overall motion of the protein. This doesn't have to be the case for methyl groups at the end of protein side chains, because their dynamics is also influenced by rotamer jumps, which can happen on the same timescale as, or even slower than, the overall motion of the full protein. So one aspect I want to comment on is whether this model is also valid for describing the dynamics of methyl groups in protein side chains.

The second aspect I want to mention is that this model describes the overall motion with an exponential decay with the rotational tumbling time of the protein. This is strictly valid only if you have an isotropic protein. For an anisotropic protein like T4 lysozyme, for example, one has to use bond-specific rotational tumbling times, because one has to consider the principal axis frame of the protein.

And the third aspect I'd like to mention is that, as you see from this approach, the order parameters extracted from NMR experiments are different from the order parameters from MD simulations as described on the previous slide. Therefore our motivation was to create a model with which you can extract these order parameters in a similar way as they are extracted from NMR. Here's how our model works.
We start with MD simulations, and then we remove the overall tumbling by superimposing every snapshot of the trajectory onto a reference structure. Then we calculate the internal time correlation function of the bond of interest. These internal time correlation functions decay due to different kinds of motion: librational motions, rotation of the methyl group around its symmetry axis, and also rotamer jumps, and all these kinds of motion can be described as single-exponential decays. So what we did was fit these internal time correlation functions with a multi-exponential decay, to get a smooth time correlation function in an analytical form.

In order to get the full time correlation function, we also have to add the overall motion. But here we need the rotational tumbling time of the specific methyl group, which we cannot get directly, because the orientation of the methyl group changes in the principal axis frame of the protein during the MD simulation. But we know that for a structured protein like T4 lysozyme the amide bonds of the protein backbone keep their orientation in the principal axis frame of the protein, and we also know that for these amide bonds the Lipari-Szabo model definitely works. That's why we took the tumbling times from the protein backbone. We used the Lipari-Szabo model to fit the full time correlation function, including tumbling, for the amide bonds. Then we used structural information to convert these rotational tumbling times to rotational tumbling times in the principal axis frame of the protein, and then we used the orientation of the methyl group relative to the orientation of the amide bond of the same residue in the protein backbone to convert the rotational tumbling times to the methyl groups. Those are the rotational tumbling times which we used in the formula on the left side.

Of course, there are some assumptions in this approach, but we made sure that the time correlation functions which we obtain
with this model are very close to the time correlation functions which we get directly from the MD simulations.

Once we have these time correlation functions, we can easily Fourier transform them to get the spectral density, and then we used only the points which are measured in NMR to fit them with the Lipari-Szabo model, in the same way as the NMR values are extracted, to get order parameters and timescales. But because we now have these points of the spectral density, we can also back-convert them to calculate relaxation rates. These relaxation rates are directly measured in NMR, so they do not rely on the assumption that the overall motion and the internal motion are separable, which we don't know to be valid for methyl groups at the end of protein side chains. Therefore we did not compare order parameters between MD and NMR; we compared the relaxation rates directly.

Here I want to show you the relaxation rates which we got from MD simulations with the AMBER ff99SB*-ILDN force field, compared to the relaxation rates from measurements of the group of [name unclear]. At the top you see the relaxation rate R(Dy), which reports on the slow dynamics of these methyl bonds in protein side chains, and at the bottom you see R(Dz), which reports on the fast dynamics. The results which we got with the original ff99SB*-ILDN force field are the red dots. You can see that for both relaxation rates, most of the relaxation rates from MD are higher than the NMR values.
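The step from a fitted multi-exponential correlation function to spectral density points, as described above, can be sketched as follows. This is not the speaker's code. It assumes a single isotropic tumbling time, a normalization in which J(ω) is the one-sided cosine transform of C(t), and made-up fit parameters; the field strength of 14.1 T is likewise only an example. Deuterium relaxation samples the spectral density at 0, ω_D and 2ω_D.

```python
GAMMA_D = 4.1066e7  # deuterium gyromagnetic ratio in rad s^-1 T^-1


def spectral_density(omega, plateau, amplitudes, taus, tau_r):
    """Analytic J(omega) for C(t) = exp(-t/tau_r) * (plateau + sum_i a_i * exp(-t/tau_i)).

    Each exponential term contributes a Lorentzian with an effective
    correlation time 1/tau_eff = 1/tau_r + 1/tau_i.
    """
    j = plateau * tau_r / (1.0 + (omega * tau_r) ** 2)
    for a, tau in zip(amplitudes, taus):
        tau_eff = tau * tau_r / (tau + tau_r)
        j += a * tau_eff / (1.0 + (omega * tau_eff) ** 2)
    return j


# Hypothetical fit of an internal correlation function (values in seconds):
plateau = 0.08                 # long-time limit of the internal function
amplitudes = [0.82, 0.10]      # fast methyl rotation, slower side-chain motion
taus = [5e-12, 1e-9]
tau_r = 10e-9                  # overall tumbling time (assumed isotropic)

field_tesla = 14.1             # example field strength, an assumption
omega_d = GAMMA_D * field_tesla

for label, omega in (("J(0)", 0.0), ("J(wD)", omega_d), ("J(2wD)", 2.0 * omega_d)):
    print(label, spectral_density(omega, plateau, amplitudes, taus, tau_r))
```

With bond-specific tumbling times, as in the approach described in the talk, `tau_r` would differ per methyl group rather than being one global value.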
This overestimation is especially remarkable for the relaxation rate R(Dz) at the bottom. It means that especially the fast dynamics of methyl bonds in protein side chains is too slow in MD simulations. If you go back to the motions which influence the dynamics of these carbon-hydrogen bonds in methyl groups, the fast dynamics is either the fast librational motions or the rotation of the methyl group around its symmetry axis. The fast librational motions do not contribute to this strong overestimation of the relaxation rates, but the rotation of the carbon-hydrogen bonds around the symmetry axis of the methyl group does. So this rotation of the methyl group around its symmetry axis is too slow in current MD force fields.

In the force field this rotation is described by dihedral angles which end in these carbon-hydrogen bonds. These dihedral terms were introduced into the AMBER force fields in ff94, based on DFT calculations from 1986 in which ethane was used as a test case. To our surprise these dihedral terms have not been changed in most of the force fields since, except for the ff15ipq force field, which reparametrized all dihedral angles. So probably the same dihedral terms are still in the newest AMBER force fields. There is one paper from Chatfield in 2002, who also found that the energy barriers of methyl rotation are too high for alanine side chains in the CHARMM22/CMAP force field. He suggested new force field parameters for alanine, which were then also transferred to the methyl groups of the other side chains. We thought that this transfer from the methyl groups of alanine to the methyl groups of other side chains is probably not the best idea, and that one may have to reparametrize these dihedral angles side-chain-specifically. And that's actually what we did.
We created blocked dipeptides for every protein side chain which contains methyl groups. For these blocked dipeptides we fixed the backbone conformation, via the phi and psi dihedral angles, in either an extended conformation or a helical conformation, and we did not really see an influence of the backbone conformation on the height of the energy barriers of methyl rotation. For the side chains we used the most populated rotamer state based on a rotamer library, but we also tested other rotamer states, and their influence on the energy barriers of methyl rotation is not negligible but relatively small: about 0.5 to 1 kilojoule per mole, depending on which rotamer state you are in.

For these rotamer states we rotated the methyl group around its symmetry axis in five-degree steps over an interval of 120 degrees to get the full profile. For all these structures we did a DFT potential energy scan with the M06-2X functional and a double-zeta basis set to get the profile of methyl rotation. In the case of alanine for all structures, and in the case of the longer side chains for the structures close to the maximum and minimum, we then performed coupled-cluster CCSD(T) calculations with a triple-zeta basis set in order to get an accurate estimate of the energy barriers. The energy barriers from the coupled-cluster calculations are actually lower than those from the DFT calculations.

We then did the same with the force field, the ff99SB*-ILDN force field, and found that in the original force field these energy barriers are higher for all side chains except threonine. Subsequently we reduced the prefactors of the dihedral terms which contribute to the methyl rotation accordingly, specifically for every methyl group in every side chain.
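To make the effect of reducing a dihedral prefactor concrete, here is a minimal sketch of a threefold cosine torsion term scanned in five-degree steps over 120 degrees. It covers only the torsional part of the energy; in a real force field the rotation barrier also contains nonbonded contributions, which is why the talk compares full force-field scans to CCSD(T). The functional form k·(1 + cos(nφ − phase)), the number of terms, the force constants and the scaling factor are all illustrative assumptions.

```python
import math


def torsion_energy(phi_deg, k, n=3, phase_deg=0.0):
    """Energy of one cosine dihedral term, k * (1 + cos(n*phi - phase))."""
    return k * (1.0 + math.cos(math.radians(n * phi_deg - phase_deg)))


def scan_barrier(k_terms, step_deg=5, span_deg=120):
    """Scan the methyl rotation angle and return max minus min of the summed torsion energy."""
    energies = []
    for phi in range(0, span_deg + 1, step_deg):
        energies.append(sum(torsion_energy(phi, k) for k in k_terms))
    return max(energies) - min(energies)


# Hypothetical example: nine equal dihedral terms contribute to one methyl rotation.
original_terms = [0.65] * 9   # made-up prefactors, arbitrary energy units
scale = 0.8                   # made-up reduction factor
scaled_terms = [k * scale for k in original_terms]

print("torsional barrier, original prefactors:", scan_barrier(original_terms))
print("torsional barrier, reduced prefactors: ", scan_barrier(scaled_terms))
```

Because the torsional contribution is linear in the prefactors, scaling them by a factor scales this part of the barrier by the same factor, while any nonbonded contribution would stay unchanged.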
We then repeated these calculations with the reparametrized force field and got a nice correspondence to the CCSD(T) calculations. With these new parameters we then redid all the MD simulations of ubiquitin and calculated the relaxation rates again, and the results are now the blue triangles that you see. What you see is that these blue triangles now have lower values in comparison to the values from the original force field, and they are at least now in the correct range. One sees that the correlation for the fast dynamics, so for the R(Dz) relaxation rate, is still very poor. The reason is that we have very complicated dynamics in these protein side chains, and this complicated dynamics on the picosecond time range results in the poor correlation.

We repeated the same parametrization for the ff15ipq force field and also for the CHARMM36 force field, and found that the rotational barriers in these force fields are also too high compared to the CCSD(T) calculations. So we also reparametrized these parameters, and could reproduce much better relaxation rates for both force fields.

We then looked at the spectral densities and the time correlation functions, because they describe the full dynamics of these carbon-hydrogen bonds in methyl groups. On the left side you see the results for the original force field for a specific methyl group, in this case the Cγ methyl group of isoleucine 61 of ubiquitin, and on the right side you see the spectral density which we extracted with the new force field. Especially the high frequencies, which we reparametrized, are now much better captured with the new force field than with the old force field. The reason that we get a discrepancy at zero frequency is that our MD simulations are performed in normal water and the NMR experiments are done in heavy water,
and in heavy water the protein of course tumbles a little bit more slowly than in normal water. If you include this effect, we can also reproduce J(0) correctly. More important for the order parameters is the reproduction of the internal time correlation function. Here you see the same methyl group as an example. With the original force field the function decays too slowly in comparison to the new force field, in green, which gives an internal time correlation function very close to the NMR values, which are represented here by the blue line.

With this new force field we can now investigate whether the Lipari-Szabo model is valid for the description of the dynamics of methyl groups in protein side chains. What we did was compare two internal time correlation functions. One is the internal time correlation function which we got directly from the MD simulation. The other is the internal time correlation function which we got after we calculated the spectral density, fitted it with the Lipari-Szabo model, and used the fitting parameters of the Lipari-Szabo model to calculate the internal time correlation function. We calculated the root mean square relative error between these two internal time correlation functions over the first ten nanoseconds. What you see here are the results for all methyl groups of T4 lysozyme. You can see that for most of these methyl groups the root mean square relative error between the two internal time correlation functions is lower than 0.5 percent, but for some methyl groups the difference between the two is relatively high.

I'd like to give you an example for both cases. Both examples show a methyl group which is connected to the Cδ atom of an isoleucine side chain. In the first case I present the results for the isoleucine 150 side chain. This side chain does not have rotamer jumps around the side-chain dihedral angles χ1 and χ2 during the MD simulation. This means
for the internal time correlation function that it decreases very fast to a value of one over nine, due to the fast methyl rotation around the symmetry axis, but then stays constant for the rest of the time, and the Lipari-Szabo model is able to fit this internal time correlation function correctly.

In the case of the isoleucine 27 side chain in T4 lysozyme, we find several rotamer jumps between two rotamer states around the side-chain dihedral angle χ2, on a timescale of one to two nanoseconds. This means for the internal time correlation function that it also decreases to a value of one over nine on a timescale of about 5 picoseconds, but then decreases further on longer timescales due to the slow dynamics, and the Lipari-Szabo model can only fit an intermediate timescale. That means that the Lipari-Szabo model is not valid for the description of all methyl groups in protein side chains, and one should be cautious when comparing MD-derived values with NMR values that were extracted with this model.

But how close are we now with the new force field in comparison to the NMR values? Here I show you the relaxation rates extracted from MD simulations of T4 lysozyme, compared to our measured relaxation rates. What you see is that we get quite a good correlation.
Of course, we have some noise for the side chains, which is more than when you look at the relaxation rates of the protein backbone, which I show you on the next slide, but we are much better than the previous force fields. Quantitatively, you can now reach correlation coefficients of about 0.7 to 0.8 between MD-derived relaxation rates and NMR-derived relaxation rates for a protein of the size of T4 lysozyme.

Finally, I'd like to show you some results for ubiquitin, which is an easier protein than T4 lysozyme, and show you how far we are now with common force fields. The results which I show you now are for all the force fields which I showed previously: the ff99SB*-ILDN force field, the ff15ipq force field, and the CHARMM36 force field. On the left side you see the relaxation rates for the protein backbone, so the transverse and the longitudinal relaxation rates at the top and the NOE at the bottom. All force fields are able to reproduce the structural features of ubiquitin, which is no surprise, because ubiquitin is a very easy protein and also used in many force field development studies. But all force fields are able to reproduce not just the correct structural features but also the correct quantitative values, which means that the dynamics of the protein backbone bonds is also correctly captured.

Two points I'd like to point out here. The first is the red peak in the transverse relaxation rate R2. This red peak comes from chemical exchange on a timescale of about 10 to 100 microseconds, which is not captured in our MD simulations, which had an aggregate simulation time of about one microsecond. The second point is the green line in the transverse relaxation rate R2 and in the NOE. This green line is the result of the CHARMM36 simulations with the TIP3P water model. The TIP3P water model has a self-diffusion which is 2.7 times faster than
the experimental self-diffusion of water, and therefore these relaxation rates are much lower than the relaxation rates from the other force fields and also than the NMR values. We did not want to redo the CHARMM36 simulations with another water model, because we don't know whether the protein-solvent interactions change when we go to another water model. What we did instead was just scale the diffusion time by this factor of 2.7, and then we also got the correct relaxation rates R2 and NOE values. Quantitatively, we are now able to get correlation coefficients of about 0.9 for the backbone relaxation rates, and for the NOEs we are able to get correlation coefficients of 0.99, which means that the dynamics of the protein backbone is now correctly described in current force fields.

The situation is a little bit different for the protein side chains. Here are the results for the relaxation rates of methyl groups. The correlation for the slow dynamics, so for the relaxation rate R(Dy), is still quite okay, not as good as for the protein backbone, but we are on the way to reproducing the nanosecond timescale correctly for the protein side chains. For the fast dynamics, however, the correlation is still bad, even with the new reparametrized force fields, and this is due to the complicated dynamics that you have in protein side chains. Quantitatively, we now get correlation coefficients between 0.8 and 0.9 for the slow relaxation rate R(Dy) and for the order parameter, but the correlation for the fast dynamics, R(Dz), is much worse.

We can now ask why this correlation is much worse than the correlation on the previous slide for T4 lysozyme, if T4 lysozyme is a much more complicated protein than ubiquitin. The reason is that T4 lysozyme contains 16 alanine residues and ubiquitin contains only two. Alanine residues have much faster dynamics than the other side chains and can also be separated
easily from each other, and therefore the correlation coefficient is much higher if you have a lot of alanines in your protein.

Based on these results, I'd like to draw some consequences for future force field development. First of all, our reparametrizations were done for methyl group rotations in protein side chains, and chemically these are always sp3-hybridized carbons connected to sp3-hybridized carbons, except for methionine, where the carbon is bonded to the sulfur atom. One might expect that similar chemistry should give similar force field parameters for the energy barrier of methyl rotation, but this is not the case. What we see is that we get similar energy barriers based on the distance to the protein backbone. For example, we get clearly different energy barriers for the Cγ and Cδ methyl groups within isoleucine, but we get similar values for the Cγ methyl groups in isoleucine and valine, and we also get similar values for the barriers of methyl rotation between the Cδ methyl group in isoleucine and the Cδ methyl groups in leucine. Another aspect which one may have to consider in the future is that we got slightly different values for the energy barriers of methyl rotation for the different rotameric states, and one may have to include this in future force field development.

In general, our research shows that the dynamics of protein backbone bonds is well captured in modern protein force fields, but the dynamics of side chains still has to be improved, especially the fast dynamics on the picosecond time range. With this I would like to summarize my talk.
First, we reparametrized the rotation of carbon-hydrogen bonds around the symmetry axis of methyl groups, and these new force field parameters better capture NMR deuterium relaxation rates, spectral densities and time correlation functions. Second, we created a new spectral density mapping approach to calculate order parameters. I did not show this in this presentation, but this new approach reproduces methyl order parameters better than just taking the long-time limit of the internal time correlation function; it is described in the PCCP paper from 2018. Third, we investigated whether the Lipari-Szabo model describes the dynamics of carbon-hydrogen bonds correctly. This is the case for most of the methyl groups, but for some methyl groups the Lipari-Szabo model is not valid, and one may have to think about creating a better model to describe this dynamics and to fit the NMR data. And finally, the MD force fields are able to capture the amplitudes of motion better than their precise timescales.

I'd like to thank my PI, Professor Schäfer from Ruhr University Bochum, Germany, and also Professor Frans Mulder from Aarhus University in Denmark for their support, and also Dr. Mengjun Xue from Aarhus University, who is now in Seattle in Washington state and who did the deuterium NMR experiments on T4 lysozyme. I also want to thank the whole Schäfer group, and my funding organization, RESOLV, for funding. And I also want to thank you for your attention.

Before I finish, I'd like to mention that the code for these new force field parameters is available on our web page, simulation.org, and it will soon also be available on my GitHub account, which at the moment is still empty. I will also soon upload to this GitHub account the code with which we calculated the order parameters and relaxation rates of methyl side chains. Again, I want to thank you for your attention, and now I am open for questions.