Okay, let me share my screen.
And the floor is yours.
Okay, can you see my screen?
Yeah.
Okay, perfect. So, hi everyone, my name is Mattia Bernetti, and I'm a postdoc in the group of Giovanni Bussi at SISSA. Today I will tell you about RNA structure and its relevance for function. I will also speak about the experiments that we can use to study RNA structure, and how we can combine experiments and molecular dynamics simulations to characterize RNA structure. I will show you an application to a real-life case, the GTPase-associated center, which is an RNA found in the ribosome, the cellular machinery where the biosynthesis of proteins takes place. And then I will close with a quick summary.
Okay, so as Giovanni already introduced yesterday, when we think about RNA, its biological relevance has historically been based on its role as an intermediate in the flow of genetic information from DNA to functional proteins. However, over the years after this central dogma was formulated in the late 50s, we've seen that RNA molecules can actually have multiple functions. They can have enzymatic activity, like in ribozymes, or they can self-recognize and replicate, as formulated in the RNA world theory. Or they can have regulation and signaling functions, like in RNA interference and riboswitches. So we've seen over the years that a large fraction of the transcribed RNA is actually not coding for functional proteins, but is functional per se.
To be functional, RNA molecules, for instance ribozymes, riboswitches or transfer RNAs, typically need to interact with molecular partners such as small molecules or other macromolecules. In turn, this interaction typically requires the RNA to adopt a specific structure. To remind you very quickly, the structure of RNA is hierarchical, and we can distinguish a primary structure, which is the sequence of nucleobases.
We have the secondary structure, which is the base-pairing pattern. But here we focus on the tertiary structure, which is the three-dimensional organization of the RNA molecule and is typically the one associated with a specific function.
Okay, from an experimental standpoint, there are several methods available to get insights into RNA structure and, in general, into the structure of biomolecules. We have X-ray crystallography, NMR or cryo-EM, and these are excellent techniques that achieve very high resolution. However, they come with non-trivial technical requirements that make their use far from straightforward. In contrast, the technique called small-angle X-ray scattering, or SAXS, is considered simple and versatile, in particular because it is robust against changes in the sample conditions, like pH or ionic conditions. Its major drawback is a very low resolution.
So let's look at this technique in a bit more detail. In a SAXS experiment, X-ray radiation illuminates a sample and the scattered radiation is recorded by a detector as a function of the scattering angle. This is a typical plot of the intensity that you get, the SAXS spectrum. In particular, I would like to focus on one aspect. The SAXS experiment is actually based on two measurements: one on the sample with the biomolecule, the RNA in this specific case, and a second one on the buffer only. The final spectrum that you see is obtained from the difference of these two. A convenient representation of the spectrum is the Kratky plot, which is considered to be more informative about the structure, and we'll see this in a moment.
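As a minimal sketch of the two operations just described, buffer subtraction and the Kratky representation, the following uses synthetic data only. The Gaussian-shaped signal, the flat buffer background, the q grid, and the value of `rg` are all assumptions made for illustration; they are not the data from the talk.

```python
import math

def subtract_buffer(i_sample, i_buffer):
    """Return the buffer-subtracted intensity, point by point."""
    if len(i_sample) != len(i_buffer):
        raise ValueError("sample and buffer must share the same q grid")
    return [s - b for s, b in zip(i_sample, i_buffer)]

def kratky(q, intensity):
    """Return the Kratky transform q^2 * I(q) for each point."""
    return [qi * qi * ii for qi, ii in zip(q, intensity)]

# Synthetic example (assumption): a Guinier-like signal on a flat buffer background.
q = [0.01 * k for k in range(1, 31)]   # scattering vector, 1/Angstrom
rg = 20.0                              # hypothetical radius of gyration, Angstrom
i_buffer = [0.5 for _ in q]
i_sample = [math.exp(-(qi * rg) ** 2 / 3.0) + 0.5 for qi in q]

i_rna = subtract_buffer(i_sample, i_buffer)
kratky_curve = kratky(q, i_rna)

# Position of the Kratky peak on this grid.
q_peak = q[kratky_curve.index(max(kratky_curve))]
print(f"Kratky peak at q = {q_peak:.2f} 1/Angstrom")
```

For this idealized compact-particle signal the Kratky curve has a single peak near sqrt(3)/Rg; real spectra of partially extended molecules show the peak-plus-shoulder shape discussed in the talk.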
So in this case, for instance, the SAXS experiment was repeated several times while changing the ionic conditions. As you can see, the shape of the SAXS spectrum in the Kratky form changes as a result. This corresponds to a change in the conformation of the biomolecule, an RNA molecule in this case, and we know that typically a higher peak and a lower shoulder correspond to more compact conformations, while a lower peak and a higher shoulder correspond to more extended ones.
This specific experiment was actually conducted on the GTPase-associated center that I already mentioned, or GAC in brief. It is a 58-nucleotide-long RNA found in the ribosome. In particular, we know that the presence of magnesium ions is essential for it to achieve its tertiary structure.
Okay, so given these experimental spectra, we don't know what the structures behind them actually look like; that's why we were discussing the resolution of the technique. If we want to assign structures to these experimental spectra, we first need to generate a pool of possibly different conformations of the RNA molecule. We know that we can achieve this using molecular dynamics simulations. I will not discuss MD because it has already been widely discussed by the other speakers, but let me just remind you that in our simulations in this example we simulate our system at the atomistic scale, as you can see here, with the solvent modeled explicitly; in other words, water and ions are explicitly simulated. Here you can see a short trajectory of GAC RNA simulated with MD directly in solution with water and ions, but of course in this case what we're mostly interested in is the structure of the RNA.
And then, again, to assign structures to the experimental spectra we have devised a procedure, which starts with molecular dynamics simulations to generate a heterogeneous ensemble of possible structures of GAC RNA. Then we compute the SAXS spectra from these structures, and then we try to find the best match between the computed spectrum and the experimental one, to understand what structures are behind the experimental spectra. So let's go through all these steps.
Let me start from the simulations. We started by trying plain molecular dynamics on a long time scale, the microsecond time scale, which for a system of this size is considered to be rather long. Here is a sketch of the two systems that we simulated: one in the absence of magnesium ions (these are the potassium ions) and one in the presence of magnesium ions as well. On the microsecond time scale, the structure basically didn't change remarkably. You can see it from these plots of the radius of gyration, which measures the degree of compactness of the RNA and is shown in this sketch right here, but also from the root mean square deviation, which is a measure of the distance of the structure from the initial state. You can see that in both cases we didn't get much farther than the initial state.
So what we thought to do to sample this heterogeneous ensemble of possible conformations for GAC RNA was to use enhanced sampling, which has also been widely discussed, but let me remind you that these are methods that accelerate the sampling. In particular we used metadynamics, and when using metadynamics you typically have to specify some variables describing the system, which are called collective variables, or CVs in short.
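The two structural measures mentioned above, radius of gyration and RMSD from the initial state, can be sketched in a few lines. This is an illustration under stated assumptions: coordinates are plain lists of (x, y, z) tuples, and the RMSD here is computed without optimal superposition, so it assumes the frames were already aligned (MD analysis tools normally do the fitting for you).

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    center = [sum(m * c[k] for m, c in zip(masses, coords)) / total
              for k in range(3)]
    msd = sum(m * sum((c[k] - center[k]) ** 2 for k in range(3))
              for m, c in zip(masses, coords)) / total
    return math.sqrt(msd)

def rmsd_no_fit(coords, reference):
    """RMSD between two frames, assuming they are already superimposed."""
    if len(coords) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    sq = sum(sum((a[k] - b[k]) ** 2 for k in range(3))
             for a, b in zip(coords, reference))
    return math.sqrt(sq / len(coords))

# Toy frames (assumption): two points on the x axis, then the same points displaced.
frame0 = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
frame1 = [(4.0, 4.0, 0.0), (2.0, 4.0, 0.0)]

print(radius_of_gyration(frame0))   # compactness of the first frame
print(rmsd_no_fit(frame1, frame0))  # distance from the initial state
```

Flat time series of both quantities, as described in the talk, indicate that the trajectory stayed close to its starting structure.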
In this case we chose to enhance the sampling along two collective variables, from which we obtained this free energy surface. The first collective variable was used to enhance the sampling of the distribution of tertiary contacts, which are those that keep the tertiary structure in place, while the second collective variable was used to sample conformations corresponding to SAXS spectra of different shape. Let me try to explain this better. The shape of the SAXS spectrum was assessed through the ratio of the peak over the shoulder, and you can see it represented here on a sample SAXS spectrum.
So these are spectra computed for different regions of the obtained free energy surface. Here you can see that the shape of the spectrum changes, and so does the ratio between peak and shoulder, and we have the corresponding conformations of GAC RNA. On the left you have more compact ones, while going to the right we have, for instance, a more extended one where you can see that the tertiary contacts are mostly broken.
Okay, so we got to the delicate point of this whole story, which is the computation of the SAXS spectra from the structures. As you can see, to compute a SAXS spectrum we have to compute all the pairwise distances between the atoms in the system. This problem grows quadratically with the number of atoms, so it very quickly becomes computationally intensive. Imagine having to do it many, many times during a simulation to use it as a collective variable: this becomes really, really tough. To mitigate this issue, the group of Paissoni and co-workers devised a very practical solution, a very handy strategy, where we conduct our simulation at the atomistic scale as usual.
We simulate the water, the ions and everything, but then when we have to compute the collective variable, in particular the SAXS spectrum, we convert our structure into its coarse-grained representation. In this way, the pairwise summation runs over the number of beads, which are these spheres here. They are fewer than the atoms, so it becomes way more tractable. As you can imagine, we lose a bit in terms of accuracy of the spectrum that we compute, but we gain a lot in terms of efficiency. I should probably stress that at this stage, the enhanced sampling simulation, the goal, I remind you, is to sample a heterogeneous conformational ensemble with structures of different compactness and structure. Given this goal, we don't care much about computing very accurate SAXS spectra. So for us it's a very good option.
Having discussed this, we can move to the second component of our procedure, which is the computation of SAXS spectra from the MD-sampled structures. We have already discussed the efficiency problem, but another major critical point when computing SAXS spectra is the inclusion of the solvent contribution. Let me try to explain this. When we compute a SAXS spectrum from an MD structure, we can use only the atoms of the solute, the biomolecule. We can also include a contribution from the solvent in an implicit manner, and this is for instance implemented in a very popular software from the 90s, I think, which is called CRYSOL. Or we can include the contribution from the solvent explicitly, like it's done in the WAXSiS and Capriqorn software.
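The pairwise sum discussed above, and the reason a coarse-grained representation makes it cheaper, can be illustrated with the Debye formula, I(q) = Σ_i Σ_j f_i f_j sin(q r_ij)/(q r_ij). The sketch below is a toy under several assumptions: form factors are constants (real atomic and bead form factors depend on q), beads are plain centroids of consecutive groups of six atoms (a hypothetical mapping, not the one used in the work presented), and the helix coordinates are made up.

```python
import math

def debye_intensity(q, coords, form_factors):
    """Debye sum over all pairs; cost grows as N(N-1)/2 distances."""
    n = len(coords)
    total = 0.0
    for i in range(n):
        total += form_factors[i] ** 2            # i == j term, sin(x)/x -> 1
        for j in range(i + 1, n):
            x = q * math.dist(coords[i], coords[j])
            sinc = 1.0 if x == 0.0 else math.sin(x) / x
            total += 2.0 * form_factors[i] * form_factors[j] * sinc
    return total

def coarse_grain(coords, group_size):
    """Toy mapping: one bead at the centroid of each consecutive atom group."""
    beads = []
    for start in range(0, len(coords), group_size):
        chunk = coords[start:start + group_size]
        beads.append(tuple(sum(c[k] for c in chunk) / len(chunk)
                           for k in range(3)))
    return beads

# Made-up "atomistic" structure: 60 points on a helix (Angstrom).
atoms = [(10.0 * math.cos(0.6 * i), 10.0 * math.sin(0.6 * i), 1.5 * i)
         for i in range(60)]
beads = coarse_grain(atoms, group_size=6)

f_atoms = [1.0] * len(atoms)
f_beads = [6.0] * len(beads)   # each bead carries the scattering of 6 atoms at q = 0

n_atom_pairs = len(atoms) * (len(atoms) - 1) // 2
n_bead_pairs = len(beads) * (len(beads) - 1) // 2
print(n_atom_pairs, n_bead_pairs)

for q in (0.0, 0.05, 0.2):
    print(q, debye_intensity(q, atoms, f_atoms), debye_intensity(q, beads, f_beads))
```

The two descriptions agree at q = 0 by construction and drift apart as q grows, which mirrors the trade-off described in the talk: fewer pairs to evaluate, at the price of some accuracy.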
And I would say that for MD people, in this way we really take advantage of all the information that we have in our simulations, because the solvent in our simulations is modeled explicitly, so we use all the information that we have. Also, in this procedure that explicitly includes the solvent, a second simulation with the solvent only is conducted and, exactly as in the experiments, the final spectrum is computed as the difference of these two. We can look at these methods essentially as forward models that we have available to compute the experimental observable, the SAXS spectrum, from MD-sampled structures. Of course, the more detail you add to the computation about the solvent, the more intensive the computation becomes.
There is another aspect that we considered tackling here. If you remember the experiments that I showed at the beginning, where with different ionic conditions the shape of the spectrum in the Kratky plot changed, we thought about checking whether changing the ionic conditions in our simulations could also produce a noticeable effect on the final spectrum. So we built four different systems at increasing concentration of magnesium, we froze the RNA coordinates, fixing them to exclude RNA dynamics from the simulation, and then we computed the spectra with the different methods that I briefly discussed a few moments ago. Without going into the details, you can see that all the methods produce compatible results for the different systems. So from this quick analysis we concluded that, first of all, the computation of the spectra is rather robust across the different methods.
And also, in particular, that the differences in the SAXS spectra depend mostly on the RNA structure rather than on the different ionic conditions.
So, we finally got to the last step of our procedure. Once we have done the MD simulations and computed the spectra, we have to match the computed spectra with the experimental one, to assign structures to the experimental profiles. At this point we exploited a reweighting procedure based on the maximum entropy principle. This was already introduced a bit in the talk by Giovanni yesterday, but let me very quickly recontextualize it here. Using the maximum entropy principle, we seek a posterior distribution that is as close as possible to our prior knowledge, which in this case is the MD-sampled structures, the original ensemble that we have harvested, and that at the same time agrees with the experimental data.
In the reweighting procedure that we conducted, we aimed at reweighting against the experimental spectra for GAC RNA obtained in the presence of magnesium or potassium. You can see the spectra here: the gray one is for GAC RNA in the presence of magnesium and the pink one in the presence of potassium. What we have done now is to use accurate SAXS spectra. In what way? We took all our MD structures that we sampled with enhanced sampling, and we computed the spectra in the accurate way that explicitly includes the solvent contribution, which I discussed not long ago. And then we enforced the matching with the experiments, in particular at four specific points: the famous peak and shoulder, which by now you should be familiar with.
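To make the maximum entropy idea concrete, here is a minimal sketch for a single observable. The posterior weights take the form w_i ∝ w0_i exp(-λ F_i), and λ is tuned until the reweighted average matches the experimental value. Everything specific here is an assumption for illustration: one observable instead of the several points on the spectrum used in the talk, no experimental error, a bisection solver, and made-up numbers for the observable, the prior, and the target.

```python
import math

def reweight(values, prior, lam):
    """Maximum-entropy weights w_i ~ w0_i * exp(-lam * F_i), normalized."""
    exponents = [-lam * v for v in values]
    shift = max(exponents)                       # avoid overflow in exp
    unnorm = [p * math.exp(e - shift) for p, e in zip(prior, exponents)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def average(values, weights):
    return sum(v * w for v, w in zip(values, weights))

def solve_lambda(values, prior, target, lo=-50.0, hi=50.0, tol=1e-10):
    """Find lam such that the reweighted average equals the target.

    The reweighted average decreases monotonically with lam
    (its derivative is minus the variance), so bisection is enough.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if average(values, reweight(values, prior, mid)) > target:
            lo = mid          # average too high: need a larger lam
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Made-up data: one observable per frame (e.g. a peak/shoulder ratio), uniform prior.
values = [1.0, 1.5, 2.0, 2.5, 3.0]
prior = [0.2] * 5
target = 2.4                  # hypothetical experimental value

lam = solve_lambda(values, prior, target)
posterior = reweight(values, prior, lam)
print(lam, posterior, average(values, posterior))
```

With several restrained points and experimental errors, the same idea leads to a multidimensional minimization over one λ per data point, which is beyond this sketch.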
And also two points from the low-q region. This is important because you can perform a fit in this region and get the value of the radius of gyration directly from the spectrum. So you can see here what the reweighted spectrum for GAC RNA in the presence of magnesium looks like: it is the green profile that you see here, and you can see how nicely it agrees with the experimental one. The predicted value of the radius of gyration also agrees with the experimental one within statistical error. And here you can see the reweighted spectrum for GAC RNA in the presence of potassium, which is the pink one and also agrees with the experimental one within error.
If you look at the statistical error here, you can see that it's remarkably higher than the one obtained for the reweighting of GAC RNA in magnesium. The reason is that in our simulations the sampling of more extended conformations was rather limited, and these extended conformations are actually essential to build a representative ensemble for potassium. So of course this limitation affected the statistical error that we obtained for the reweighted spectrum.
Okay, so here you can see the reweighted ensembles corresponding to these reweighted spectra. We tried to identify the fraction of compact and extended conformations for the two reweighted ensembles. You can see that in the case of magnesium the reweighted ensemble is mostly represented by compact conformations of this GAC RNA molecule, complemented by a small population, about 1%, let's say, of extended GAC RNA conformations.
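The low-q fit mentioned above is commonly done with the Guinier approximation, ln I(q) ≈ ln I(0) - q² Rg² / 3, valid for small q·Rg. The sketch below is an assumption about the kind of fit meant, not the exact procedure used in the work; the data are synthetic, generated from hypothetical values of Rg and I(0).

```python
import math

def guinier_fit(q, intensity):
    """Least-squares line through (q^2, ln I); returns (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0.0:
        raise ValueError("non-negative slope: data not in the Guinier regime")
    return math.sqrt(-3.0 * slope), math.exp(intercept)

# Synthetic low-q data (assumption): Rg = 18 Angstrom, I(0) = 2, q*Rg below 1.
true_rg, true_i0 = 18.0, 2.0
q = [0.005 * k for k in range(1, 11)]
intensity = [true_i0 * math.exp(-(qi * true_rg) ** 2 / 3.0) for qi in q]

rg_fit, i0_fit = guinier_fit(q, intensity)
print(rg_fit, i0_fit)
```

On noisy experimental data the fit range matters: it is usually restricted to the region where q·Rg stays small, and the result carries a statistical error.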
If we then look at the reweighted ensemble for GAC RNA in the presence of potassium, we can see that this fraction increases significantly, up to about 40%. So in both cases we get a mixture of compact and extended conformations, and what changes is the relative population of the two.
Okay, to get to the conclusion and to summarize, let me give you a whole picture of the procedure that we employed to reconstruct these conformational ensembles of GAC RNA. First of all, we used enhanced sampling simulations to sample an ensemble of GAC RNA structures that was as heterogeneous as possible. To this end, we used metadynamics with a collective variable that relied on an approximate forward model to compute the SAXS spectra. This, again, made the on-the-fly computation of the spectrum more accessible in terms of efficiency. In this case the approximate forward model doesn't take the solvent into account and uses a coarse-grained representation of the RNA molecule. As we said, this is not really critical at this stage, because what we want is to sample a lot rather than to accurately compute the SAXS spectra.
So from the enhanced sampling simulation we obtained this conformational ensemble of GAC RNA, these structures with their different weights. We took these structures and computed the SAXS spectra using an accurate model that explicitly includes the solvent and has an atomistic representation of the RNA molecule, which is the largest biomolecule that we have in the system. Then we conducted the reweighting procedure, enforcing the experimental SAXS spectra.
And as an outcome, you can see that we achieve a reweighted ensemble that starts from a prior distribution and is obtained after the reweighting with the maximum entropy principle, using as input the accurate spectra computed with the solvent. We obtained this reweighted ensemble for GAC RNA in the presence of magnesium and of potassium as experimental references. So again, differently from the previous stage, at this stage we employed an accurate forward model to compute the SAXS spectra, where the solvent was included and the RNA was described at the atomistic level. Interestingly, just to conclude, we also tried to do the reweighting using the SAXS spectra computed with the approximate forward model, and we've seen that no successful reweighting was achievable in this case.
If you want to read about it and find more details, you can look at this work, which was published recently, in June. So you can take a look and also give some feedback for future work; you never know, you can always improve.
And we have a summary at the end of this presentation. We have discussed how RNA structure is important for function. We have discussed how we can use small-angle X-ray scattering, or SAXS, experiments, even if they have a rather low resolution, and combine them with molecular dynamics simulations using the maximum entropy principle to reconstruct the conformational ensembles of relevant biomolecules, in this case of the GAC RNA. In particular, in our specific case we have observed how critical it is to include the solvent contribution to achieve a successful reweighting. And finally, both ensembles for GAC RNA, in the presence of magnesium or potassium, were composed of a mixture of compact and extended conformations, but with different relative populations.
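As a small illustration of how the relative populations of compact and extended conformations can be read off a weighted ensemble, the sketch below sums the weights of the frames classified as extended. The classification criterion (a radius-of-gyration cutoff), the cutoff value, and all numbers are hypothetical; the criterion actually used in the work is not stated in the talk.

```python
def extended_fraction(rg_values, weights, rg_cutoff):
    """Weighted fraction of frames whose Rg exceeds the cutoff."""
    if len(rg_values) != len(weights):
        raise ValueError("one weight per frame is required")
    total = sum(weights)
    extended = sum(w for rg, w in zip(rg_values, weights) if rg > rg_cutoff)
    return extended / total

# Made-up ensemble of five frames (Rg in Angstrom) with two sets of weights.
rg_values = [15.0, 16.0, 17.0, 25.0, 30.0]
weights_a = [0.50, 0.30, 0.19, 0.005, 0.005]   # mostly compact ensemble
weights_b = [0.25, 0.20, 0.15, 0.20, 0.20]     # larger extended population

cutoff = 20.0                                   # hypothetical threshold
print(extended_fraction(rg_values, weights_a, cutoff))
print(extended_fraction(rg_values, weights_b, cutoff))
```

The same structures appear in both ensembles; only the weights, and therefore the populations, differ.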
So with this said, thanks a lot, and thanks also to the organizers for inviting me, and to Michael, the people in the group, the PI Giovanni, the other students and the visiting students, and also to the group with which we did this work in collaboration, which performed the experimental work. Thank you.
Thank you very much, Mattia, for a very interesting talk. All right, so the floor is open for questions. I have one, which I will start off with. So, this computing of the SAXS spectra and the subtraction of the solvent. It's interesting. From what I understood, on the experimental side they do a SAXS experiment on the salt in solvent alone.
Yes.
They get that, and then you have the one with the RNA and whatever, but that subtraction is not an equal sign, right? There are some assumptions that one makes when doing the subtraction, even experimentally, which is that you're assuming that the structural correlations in the liquid induced by the ions are relatively similar in the buffer solution and with the RNA. And I'm having trouble accepting that.
Let me try to put it in another perspective. This is as far as I know from the experimental side, because I've never done this experiment myself, I've only read a bit about it. First off, to be honest, I know that the measurement is done so that the result is of course a positive intensity. Beyond that, the difference that we expect to see is the one given by the structure of the biomolecule, of course, and then by what we can consider to be the structure of water and ions around, or in close proximity to, the biomolecule. So let's say that everything that is far apart cancels out with the signal that you get in the solution with only the ions, and that's the reason why you want to do the subtraction and keep the rest.
Of course it's not exact, and there are also some other approximations, which I cannot tell you a lot about.
Okay, so I buy that argument now. So I think for high q values you're probably okay. But as you get to low q, that stuff probably gets much noisier.
Yes, more complicated. I mean, I know that the regime in which you start losing the correct information about the water, or let's say the point where the information about the water starts mattering, is rather close to the end of what we consider now, so we may have some effect here, but not at very low q. So it really depends on the regions that you look at.
Okay, thanks.
Thank you.
Welcome. Okay, questions from the audience. You can unmute yourself and ask.
Maybe I can ask a question, if I may.
Go ahead, please.
Yeah, I had a question about the reweighting procedure. How much do you have to reweight in the first place? In other words, how much do your final posterior distributions change from the initial prior distributions? Is this a dramatic change or only a marginal change? Just to understand also how capable the force fields actually are.
Yeah, that's a very good point. Actually we discussed it a lot with Giovanni, and the rerun of the simulation and all the setup of the enhanced sampling procedure basically depended on this aspect. The point here is that in the case of magnesium the ensemble is closer to the prior, or rather, the prior is close to the ideal one, let's put it this way, because it's more populated by compact structures. We can talk about this in terms of Kish size. I don't remember the exact value now, but we had a good percentage of maintained structures, or maintained weight, going from the prior to the posterior.
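The Kish size mentioned in this answer is a standard way to quantify how much an ensemble changes upon reweighting: N_eff = (Σ w_i)² / Σ w_i². The sketch below uses made-up weights; reporting it as a fraction of the number of frames is one common convention and is an assumption here, not necessarily the one used in the paper.

```python
def kish_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return s1 * s1 / s2

# Made-up examples with four frames.
uniform = [0.25, 0.25, 0.25, 0.25]      # nothing changed: all frames count
mild = [0.4, 0.3, 0.2, 0.1]             # moderate reweighting
collapsed = [1.0, 0.0, 0.0, 0.0]        # a single frame carries all the weight

for name, w in (("uniform", uniform), ("mild", mild), ("collapsed", collapsed)):
    n_eff = kish_size(w)
    print(f"{name}: N_eff = {n_eff:.2f} ({100.0 * n_eff / len(w):.0f}% of frames)")
```

A value close to the number of frames means the posterior stays close to the prior; a small value means the reweighting concentrated on a few structures.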
So basically the ensemble was not changed very much there. In the case of potassium this was more dramatic because, as I discussed very briefly when I showed the table, we found out that the reweighted ensemble for potassium was characterized by a higher fraction, a remarkable fraction, of more extended conformations, and the sampling of those was rather complicated and difficult to achieve. In that case the posterior distribution was rather different from the original one and focused mostly on this part of the original ensemble. We have the percentages and the Kish sizes in the paper; right now I don't remember the exact values.
Okay. Thanks.
Thank you.
More questions from the audience? You can also write them in the chat if you would like. Okay. So I think we can close. Thank you very much to all the speakers of the session. I just want to remind all of you that, even though we don't have this in the official program, after the afternoon session the Gather.town virtual space will be open, and I encourage all the speakers and participants to go there and have more discussions in a more informal way, virtually. All right, so thank you very much, and we will see you again at 2pm. Thank you very much.