[Before the lecture, about the exam.] So for that question you have to write, let's say, a small paragraph: a few lines to answer the question. We do it in the computer room because it is a larger space.

[Unintelligible exchange in Slovenian.]

So let's save the trees. OK, let's continue. More or less we stopped here, at the models for real chains. I want to make a quick recap of Flory theory, because this is what we need next.

Flory theory is basically an effective theory to compute the typical size of a real chain, that is, a chain with excluded volume interactions. The free energy consists of two parts, an energy part and an entropy part. I am dividing by $k_B T$ to work with a dimensionless quantity, where $k_B$ is the Boltzmann constant and $T$ the temperature:

$F/(k_B T) = E/(k_B T) - S/k_B$

So this is the free energy, this is my energy, and then there is the entropy. Let me check: yes, the units are right.

Now I need to estimate these two terms. The energy is estimated in the following way: it is the square of the monomer density inside the volume occupied by the polymer, times the volume of the polymer itself. I am doing this in a generic dimension $d$.
Later I will specialize to 3D, but the argument works in any dimension. Because this is a density squared (number of monomers divided by volume) times a volume, I need a prefactor that has units of volume as well. This prefactor is basically the second virial coefficient $v_2$, which I think was explained earlier. It is roughly the same second virial coefficient that you have between two molecules in a gas.

Here I am not writing a true equality: I am neglecting numerical prefactors, because I do not know them and they do not enter the estimate. The density is $\rho = N/R^d$, the number of monomers divided by the volume occupied by the chain, and it appears squared simply because the interactions are pairwise. Taken together, if you do the math:

$E/(k_B T) \simeq v_2 \rho^2 R^d = v_2 N^2 / R^d$

The entropy requires a bit more attention, because it is $k_B$ times the logarithm of the total number of conformations with size $R$:

$S = k_B \ln \Omega(R)$

To estimate this, Flory basically assumed that he could use the Gaussian approximation. As I said, this is an approximation; in fact, we know that it is not true for a real chain. So $\Omega(R)$ is proportional to the total number of conformations of a polymer with $N$ monomers, and that is just a prefactor.
It is an exponentially growing quantity, because we know that the total number of possible conformations of a polymer with $N$ monomers is exponentially large. But it does not depend on $R$; it is just a prefactor. So it does not enter the final computation, and we keep it only for logical reasons. To it I have to add a factor proportional to the fraction of conformations with size $R$, which is the Gaussian

$\exp(-R^2/(N b^2))$

up to numerical prefactors, where $b$ is the typical monomer size. This gives a sort of probability.

If I do all this and take the logarithm, in the end I get the expression written there:

$F/(k_B T) \simeq v_2 N^2/R^d + R^2/(N b^2) + \text{const}$

The minus sign in front of $k_B$ times the logarithm of the Gaussian gives $+R^2/(N b^2)$, and the constants do not depend on $R$.

Then I have to minimize: I take the derivative of this quantity with respect to $R$ and set it to zero, $\partial F/\partial R = 0$. From this I get

$R \sim b N^{\nu}$, with $\nu = 3/(d+2)$.

One thing I did not tell you yesterday concerns $d = 4$. The exponent now depends on the dimension, and that is already a non-trivial result, because it has to be so: excluded volume interactions are more important in lower dimensions, simply because the polymer has less space, so it feels the effect of excluded volume more. And the effect decreases as $d$ becomes larger and larger.
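The minimization step is only stated in words in the lecture. A short worked version, written under the same convention of dropping numerical prefactors (and, as an added assumption for the last step, taking $v_2 \sim b^d$), could read:

```latex
\begin{align*}
\frac{F}{k_B T} &\simeq v_2 \frac{N^2}{R^d} + \frac{R^2}{N b^2}, \\
\frac{\partial}{\partial R}\frac{F}{k_B T}
  &= -d\, v_2 \frac{N^2}{R^{d+1}} + \frac{2R}{N b^2} = 0
  \;\Longrightarrow\;
  R^{d+2} = \frac{d}{2}\, v_2\, b^2 N^3, \\
R &\sim b\, N^{3/(d+2)} \qquad (v_2 \sim b^d).
\end{align*}
```

For $d = 3$ this gives $\nu = 3/5$, for $d = 2$ it gives $\nu = 3/4$, and for $d = 1$ it gives $\nu = 1$, a fully stretched chain.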
However, the validity of this formula stops at some dimension, and that dimension is $d = 4$. If you insert $d = 4$ you get $\nu = 1/2$, and one half is the exponent of an ideal polymer. So the correct result is

$\nu = 3/(d+2)$ for $d \le 4$, and $\nu = 1/2$ for $d > 4$.

This $d = 4$ is called the critical dimension associated with the excluded volume interaction. It means that for $d$ larger than 4 you still have excluded volume interactions, but they are not important: the polymer is described by the Gaussian exponent. This is something quite important.

The argument can be made more rigorous, but this is the more hand-waving way of deriving it. The more rigorous version would require too much time, which I do not have.

[Audience question.] Sorry? Why? OK, good question. There is again a hand-waving explanation. Look at this formula. You can easily check that $3/(d+2)$ is never smaller than one half, provided $d$ is smaller than or equal to 4: for $d = 4$ it is exactly $3/6$, so one half, and for $d$ smaller than 4 it is always larger than one half. This means, correctly, that excluded volume interactions make the polymer more swollen with respect to ideal conditions. For $d$ larger than 4 the formula would give less than one half, which would mean excluded volume interactions making the polymer less swollen than in ideal conditions, and that is absurd.
So for $d$ larger than 4, one half is the right exponent: excluded volume interactions simply stop being important in this respect.

If you want to see it slightly more rigorously, look at the estimate of the excluded volume interaction. (Maybe green is not the best choice, let me take another colour.) The way to estimate the so-called critical dimension is to replace $R$ by its ideal size, $R \sim b N^{1/2}$. Making the substitution you get

$v_2 N^2 / (b^d N^{d/2}) = (v_2/b^d) N^{2 - d/2} = (v_2/b^d) N^{-(d-4)/2}$

Now, for $d$ smaller than 4 and $N$ large, that is, in the limit of very long chains, this quantity goes to infinity, which means that this term cannot be neglected compared to ideal conditions when you have excluded volume. Vice versa, for $d$ larger than 4 this term goes to zero, which means that it is not important. This is another way of seeing the same thing; actually, it is typically the more solid argument. And in fact $d = 4$ is called the critical dimension with respect to the two-body excluded volume interaction. By now this is well understood.

I repeated the Flory argument because it is necessary for what follows. Can I move on? Do you have questions? OK.
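The two statements above (the piecewise Flory exponent and the power of $N$ carried by the excluded volume term at the ideal size) can be tabulated in a few lines. This is only an illustrative sketch: the function names `flory_nu` and `excluded_volume_exponent` are made up for this example, and numerical prefactors are ignored exactly as in the lecture.

```python
from fractions import Fraction


def flory_nu(d):
    """Flory exponent of a linear chain with excluded volume in d dimensions.

    Below or at the critical dimension d = 4 the Flory value 3/(d+2) applies;
    above it the ideal (Gaussian) value 1/2 takes over.
    """
    if d <= 4:
        return Fraction(3, d + 2)
    return Fraction(1, 2)


def excluded_volume_exponent(d):
    """Power of N in v2 N^2 / R^d evaluated at the ideal size R ~ b N^(1/2).

    The result is 2 - d/2: positive means the term grows with N (relevant),
    negative means it vanishes for long chains (irrelevant).
    """
    return Fraction(4 - d, 2)


for d in range(1, 7):
    print(d, flory_nu(d), excluded_volume_exponent(d))
```

The sign of the second column changes exactly at $d = 4$, which is the content of the critical dimension argument.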
What I am going to discuss now is an application of polymer theory, but for branched polymers, applied to RNA secondary structure. This is work we did with Anže. But it is broader, in the sense that the theory of branched polymers had already been applied to RNA molecules, especially to RNA secondary structure. You have seen with Anže that RNA molecules of course live in 3D, so what matters is not only the secondary structure but especially the tertiary structure, when these molecules really fold in 3D space. I am not talking about that; I am talking about this part here, the secondary structure. One can model or understand the secondary structure by using the physics of branched polymers. There are other applications as well, but they are not important at this level.

So what are branched polymers? They are a more complicated class of polymers, because they are not linear: they can branch. The typical view of a polymer is that you have monomer units followed by other monomer units; they keep growing and form linear chains. However, you may have monomers that are not bifunctional but trifunctional. That means that two further monomers can be attached here, something like this, and so on and so forth. The structure keeps growing, with some monomers bifunctional and others trifunctional, and you form a branched structure.
Why is this relevant for RNA? If the RNA folds like this, then on large scales you can forget the double-stranded folding and just focus on the backbone of the structure. The backbone can be thought of as a branched structure. So the idea is: for the time being, let's forget about the base pairing, focus on the backbone, and see whether we can understand or predict something about its structure in terms of what we know about the physics of branched polymers.

The physics of branched polymers is, as you may imagine, quite a bit more complicated than that of linear polymers. In fact, you can introduce a larger set of scaling exponents. For linear polymers we introduced the exponent $\nu$, which describes how the total polymer size increases with the total number of monomers. This definition is still perfectly valid for branched polymers, because we have a number of monomers $N$ and we can always ask what the typical radius of gyration is as a function of $N$: $R_g \sim N^{\nu}$. Now you see why it is important to define the radius of gyration: the end-to-end distance is ill-defined here, because you have many ends.
Then you have another quantity, which is interesting for what I am going to say. Suppose you take two monomers along the chain and ask what the average path length on the backbone between these two monomers is, averaged over all possible realizations of your polymer chain. This quantity also scales with the total number of monomers, with a specific exponent called $\rho$:

$\langle L \rangle \sim N^{\rho}$

Is this clear? I pick, for instance, a monomer here and another monomer here, and I compute this path in terms of the number of monomers between the two. Notice one thing: this path is unique, because I do not have cycles in my molecule. So if I pick two monomers, the minimal path that connects them is unique: I go from here to here to here, and I have no other choice. I do this for every pair and average over all possible conformations, because the branches can be rearranged everywhere. This is the quantity I am looking at, and I can define this exponent because it has to scale with the number of monomers of the chain.

And you have two more exponents. One is defined as follows. Suppose I have my molecule and I cut one bond at random, for instance this one. Then I consider how many monomers are on the left part and how many are on the right part; I just count them. And I average over all possible cuts in the chain. This is called the mean branch size, and it also has a scaling property.
It also scales with the total number of monomers, or bonds, in the chain, with an exponent called $\epsilon$: $\langle N_{br} \rangle \sim N^{\epsilon}$. Now, these two guys, Janse van Rensburg and Madras, proved that whatever the molecule and whatever the ensemble, $\epsilon = \rho$. If you are interested, this is a very nice piece of work. It is rather mathematical: they prove it using statistical physics reasoning. It is not a simple argument, so for the time being just accept it, but it is very impressive work.

Then you have another exponent. It is less interesting for what I am going to say, but it is good for your general culture. Suppose again that you take any two monomers on the chain, you compute the path between them and you measure their spatial distance. This quantity scales with the path length $\ell$ as

$\langle R^2(\ell) \rangle \sim \ell^{2\nu_{path}}$

So I introduced an exponent analogous to $\nu$, but related to the paths on the chain: it is as if the path were an independent polymer and you were measuring its exponent. By scaling it is easy to see that $\nu_{path}$ is not independent of $\nu$ and $\rho$: it is just the ratio of the two, $\nu_{path} = \nu/\rho$. This is not important for what I am going to say, but keep it in mind.

[Audience question.] Yes, this one. You take two points on the chain, two monomers, for instance these two, and you measure how many steps you have to walk on the molecule to go from here to here; and you do it for every possible pair. That's it. The next one, you mean this one? Suppose I cut this one bond: you count how many monomers are left here and how many are left here. The cut separates the molecule into two independent molecules.
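The two tree observables described here, the mean path length between pairs of monomers and the mean branch size after cutting a bond, can be sketched for a single tree as follows. This is a minimal illustration under stated assumptions: the tree is given as a list of bonds between monomers numbered from 0, path lengths are counted in bonds, the branch size is taken as the smaller of the two pieces, and the function names are hypothetical. The ensemble average over all conformations discussed in the lecture would require repeating this over many trees and is not shown.

```python
from collections import deque


def adjacency(n, edges):
    """Neighbour lists of a tree with n monomers and the given bonds."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def mean_path_length(n, edges):
    """Average number of bonds on the unique path between two monomers."""
    adj = adjacency(n, edges)
    total = 0
    for source in range(n):
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search from this monomer
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist)
    return total / (n * (n - 1))  # ordered pairs, so each pair counts twice


def mean_branch_size(n, edges):
    """Average size of the smaller piece obtained by cutting one bond."""
    adj = adjacency(n, edges)
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    order, stack = [], [0]
    while stack:  # depth-first traversal rooted at monomer 0
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                stack.append(v)
    size = [1] * n
    for u in reversed(order):  # children are visited before their parents
        if parent[u] >= 0:
            size[parent[u]] += size[u]
    # cutting the bond above monomer u leaves pieces of size[u] and n - size[u]
    smaller = [min(size[u], n - size[u]) for u in range(n) if parent[u] >= 0]
    return sum(smaller) / len(smaller)


# A small branched example: monomer 1 is a trifunctional branch point.
tree = [(0, 1), (1, 2), (1, 3), (3, 4)]
print(mean_path_length(5, tree), mean_branch_size(5, tree))
```

Measuring both quantities for trees of increasing size and fitting the slopes on a log-log scale would give estimates of $\rho$ and $\epsilon$.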
Each of the two has some size. You do the same operation for all bonds and you average over all possible conformations of the polymer, and you get this quantity: the typical size of the sub-branches. To be honest, in my calculations I always took the smaller of the two, to avoid finite-size effects, but you can take just one, because the procedure is repeated for all possible polymer conformations. You have to imagine that the ensemble I am talking about is one where the branch points are random, so they can be arranged everywhere, and you have a huge number of possible conformations, even larger than for a linear chain. I do not have only the exponentially large contribution from the fact that the chain can take all possible spatial conformations; I also have another entropy, associated with the fact that the branches can be arranged everywhere. This creates a larger ensemble. Is that OK? So this is the physical meaning of this exponent.

Now, this can also be understood in terms of Flory theory, with some additional ingredients. That was done by Isaacson and Lubensky, and I think it is this work here. It is a fairly simple work, based on the use of Flory theory. So let's look again at the free energy, now for randomly branched polymers. It contains the same kind of terms: always an energy contribution and an entropy contribution. Let me explain the terms that are on the slide. The first term, the energy, is the same.
It is an approximation, of course, but it does not change with respect to the one I introduced for linear chains. It is still $v_2 N^2 / R^d$, really the same. Why doesn't it change? Because conceptually I can make the same approximation: to compute this term I have not really used any information about the fact that my monomers are connected in a chain. It is just like a collection of particles with excluded volume interactions, and that's it. So the fact that it is a branched polymer does not really matter for estimating this term, and this is why it remains the same.

The trickiest part is the entropic part. As I said, the entropy is now more complicated, and this is basically the main contribution of this work. By the way, it is not a long paper, I think four or five pages, and it is pretty clear, so if you want to understand a bit more you can look at it.

How is it estimated? Sorry, I erased the molecule, so let me draw it again; it is basically the scheme I had there. According to these authors, the entropy term actually contains two contributions. I write it as

$S/k_B = S_1/k_B + S_2/k_B$

The two contributions are highlighted here: this one and this one. The reason I call them this way is the following. The first part, $S_1$, is reminiscent of the entropy contribution that comes from a linear chain. For a linear chain, you remember, I had the term $R^2/(N b^2)$.
That term comes from the fact that I have a linear strand of $N$ monomers. Here, because I have a branched polymer, my effective linear strand is not made of $N$ monomers but, on average, of $L/b$ monomers, because the linear strand that effectively plays a role is the typical length $L$ of a single path on the chain. So I replace $N$ by $L/b$, the number of monomers belonging to an effective path, and I get $R^2/((L/b)\, b^2) = R^2/(L b)$. This is the first term, and it is more or less easy. Of course I have many possible paths, but the hypothesis of these authors is that this eventually enters only in some prefactor, and here again I am completely neglecting prefactors.

The second term, which is the less obvious one, is related to the entropy of branching, because all these branches can be arranged everywhere. It is similar to the first one in its power-law form, but it reads $L^2/(N b^2)$. What is its physical meaning? Now I have to work in the space of branching, so to speak, where the role of the typical size is played not by $R$ but by the length of the path, $L$. And I divide by $N b^2$, because this is analogous to the usual Flory term $R^2/(N b^2)$, with $L$ in place of $R$: it is the entropy associated with how I can rewire the branches of the molecule. I don't know if I have been clear. It is a sort of Gaussian term, but one where the typical size is $L$, the average path length, and not $R$.
So I have two entropic contributions: one is the elastic entropy associated with all possible sizes in real space, and the other is the entropy associated with all possible path lengths, due to the rewiring of the branches in the space of all possible connections.

The free energy can then be written like this (let me erase this):

$F/(k_B T) \simeq v_2 N^2/R^d + R^2/(L b) + L^2/(N b^2)$

This is more complicated, in the sense that the free energy is now a function of two variables: I do not only have to minimize with respect to $R$, I also have to minimize with respect to $L$. One variable is the spatial size of the polymer and the other is the typical path length on the branched molecule. It is not complicated, but it is more sophisticated, of course, because the molecule itself is more sophisticated.

First of all, we can see a simplification in this formula: the excluded volume term, as I said, does not depend on $L$.
So you have to perform the following two minimizations: the derivative of $F$ with respect to $L$ set equal to zero, and the derivative of $F$ with respect to $R$ set equal to zero. We have these two conditions, $\partial F/\partial L = 0$ and $\partial F/\partial R = 0$. The first is the one you can do first, because the excluded volume term does not enter, as it does not depend on $L$. So you minimize with respect to $L$ first and then with respect to $R$.

[Audience question.] Yes. Maybe the drawing is not ideal, but this defines what $R(\ell)$ is. Actually you do not have to worry about it, because it cannot be measured in RNA secondary structures. But if you want to know: take a pair of monomers along the branched polymer and compute their typical spatial distance as a function of the linear path between them on the molecule. That is this quantity. It is a function of the path length, not of the total number of monomers. Here, instead, $R$ is the typical size of the whole molecule as a function of the total number of monomers. They are two separate quantities. I am not saying that they are unrelated; they are related by the scaling relation, which is why the exponent $\nu_{path}$ defined for the paths is the ratio between $\nu$ and $\rho$.

I understand that this may not be obvious, but it is a bit critical if we want to apply this to RNA molecules. If I assume that the typical secondary structure of an RNA molecule looks like this, then in a secondary structure analysis what I can effectively measure is the length of the paths and the branch size, which are the quantities I am going to talk about.
So this one and this one (the path length and the branch size, hence $\rho$ and $\epsilon$). But I do not have access to this one, nor to this one (the exponents $\nu$ and $\nu_{path}$), because that would involve a measurement of the molecule in space, and if I am only looking at the secondary structure I do not have access to that. I would need spatial information, which I do not have.

So now you basically have to do this exercise; I think I have it on the next slide. If you minimize with respect to $L$ (you can try it yourself), the result is, up to prefactors,

$R^2 \sim L^3/(b N)$

Of course, since I have minimized only with respect to $L$, $R$ from this relation still contains a term that depends on $L$ and a term that depends on $N$. Then you put this relation back into $F$, so that $F$ depends only on $L$ and $N$, and you minimize again. This gives $L$ as a function of $N$, which defines the exponent I called $\rho$. Once you have that, you reinsert it in the relation above and you get a relationship between $\rho$ and $\nu$, which is the one that appears here.

So now we have a prediction, or actually a relationship, between the exponent $\rho$ and the exponent $\nu$. You can compare it with the numerical results available in the literature, and this slide shows such a comparison, but it is a bit technical, so I think we can skip it.
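The exercise described here can be checked numerically. The sketch below solves the two stationarity conditions for $F(R, L) = N^2/R^d + R^2/L + L^2/N$ (units with $b = v_2 = 1$, an assumption made only to keep the code short) and reads off the exponents from two values of $N$. The closed forms in `predicted_exponents` are what this particular free energy implies when prefactors are dropped; they are derived here, not quoted from the lecture slides, and all function names are hypothetical.

```python
import math
from fractions import Fraction


def stationary_point(n_monomers, d, sweeps=200):
    """Solve dF/dL = 0 and dF/dR = 0 for F = N^2/R^d + R^2/L + L^2/N.

    The two conditions are applied alternately in closed form until they
    agree; in log variables this is a contraction, so it converges fast.
    """
    size, path = 1.0, 1.0
    for _ in range(sweeps):
        # dF/dL = 0 at fixed R:  -R^2 / L^2 + 2 L / N = 0
        path = (n_monomers * size**2 / 2.0) ** (1.0 / 3.0)
        # dF/dR = 0 at fixed L:  -d N^2 / R^(d+1) + 2 R / L = 0
        size = (d * n_monomers**2 * path / 2.0) ** (1.0 / (d + 2))
    return size, path


def fitted_exponents(d, n_small=1.0e4, n_large=1.0e8):
    """Log-log slopes of R and L versus N between two chain sizes."""
    r1, l1 = stationary_point(n_small, d)
    r2, l2 = stationary_point(n_large, d)
    span = math.log(n_large / n_small)
    return math.log(r2 / r1) / span, math.log(l2 / l1) / span


def predicted_exponents(d):
    """Exponents implied by the two conditions: rho, then nu = (3 rho - 1)/2."""
    rho = Fraction(d + 6, 3 * d + 4)
    nu = (3 * rho - 1) / 2  # from R^2 ~ L^3 / N
    return nu, rho


for d in (2, 3):
    print(d, fitted_exponents(d), predicted_exponents(d))
```

The relation $\nu = (3\rho - 1)/2$ used in the last function is just $R^2 \sim L^3/(bN)$ rewritten in terms of exponents.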
What is more important is that you can go beyond this Flory theory and compute the corresponding distribution functions for these quantities. They all obey certain scaling relations. Maybe let's go to RNA, and hopefully that will become a bit clearer.

This is the kind of secondary structure that I guess you have already discussed with Anže. We know that RNA, because of base pairing, takes this peculiar form: because of base-pair formation you have what is called the secondary structure. It is basically like a graph. This is a pairing. Of course this is for a very short RNA fragment, but with a longer one you can form many of these base pairs, and you get a larger molecule that resembles a branched structure. Then this structure will presumably fold in 3D space, and you have the so-called tertiary structure. But what I am going to talk about here is the secondary structure, not the tertiary one, because this is the goal of this lecture.

This is something related to larger molecules. You have talked with Anže about multiloops: a multiloop is a region that is not base-paired and from which several arms protrude. This one is called a multiloop of degree 5, because you have 1, 2, 3, 4, 5 arms protruding from it. And here is one of the bigger monsters, from bigger RNA molecules that produce this kind of branching structure.
It comes from a long non-coding RNA, a specific RNA molecule about 600 nucleotides long that forms this kind of structure. I actually do not know why it is called Braveheart, but that is the name of this molecule.

I guess Anže has told you that these structures are the ones predicted on the basis of some empirical, let's call it this way, force field. You have your linear RNA sequence, and what you would like to know is how many folds this long RNA may produce. There are empirical programs in the literature that can be fed with your favourite RNA molecule and that produce a certain number of typical folds, that is, secondary structures. So they give you an idea of what the secondary structure of that molecule is. We know that these secondary structures are not unique: there is a huge number of them, simply because the pairing between the different nucleotides is not unique. I think that has been stressed many times already. Because it is not unique, you can produce a large number of these structures.

The idea is that this large number of structures can be mapped onto a correspondingly large number of branched molecules, and maybe by comparing with this kind of theory we learn something about the typical statistics of these molecules. This is not completely new; it is quite well known in the literature. This slide is a collection of precisely this kind of result. People in the literature have measured the following.
Suppose you have many, many RNA molecules, and you use precisely these programs, these tools, to predict their secondary structure, OK. Once you have the structures produced by these programs you can analyze the output, of course. In particular, people have introduced a quantity called the maximum ladder distance. Basically this quantity measures the largest path on the resulting branched molecule, as a function of the total size of the molecule. Actually they measured not the typical L but the average largest L. But in the end it's the same thing, because by scaling arguments the maximum ladder distance and the typical ladder distance should obey the same scaling relationship as a function of the total number of bases in the RNA molecule, namely as a function of N. Here N is not the number of monomers but the number of nucleotides, but it's the same thing, OK.

So people have measured this quantity for RNA molecules, and here you have a picture of it. Why is this so interesting? I have been so excited about it because you can do two experiments here. One experiment is: I measure this quantity for specific RNAs; in particular they measured it for RNA viruses. So many RNA viruses; here I think the largest one is the tobacco mosaic virus, but you can do it for many of them. And you can compare the same quantity for random RNA molecules. By random I mean that you construct them artificially; they don't exist in nature. Artificial RNA molecules made, for instance, with a homogeneous composition, that is 25% each of A, C, G and U, OK.
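To make the "you just count" idea concrete, here is a minimal sketch of how a maximum ladder distance could be computed from a predicted secondary structure. It assumes the structure is given in dot-bracket notation (the usual text output of folding programs, with no pseudoknots), treats every loop as a node and every base pair as an edge of a tree, and returns the largest number of base pairs crossed between any two loops. The function name `maximum_ladder_distance` is hypothetical, and this simplified counting is an assumption of the sketch, not necessarily the exact definition used in the papers.

```python
def maximum_ladder_distance(structure: str) -> int:
    """Largest number of base pairs crossed between any two loops.

    The structure is a dot-bracket string: '(' and ')' are paired bases,
    '.' is an unpaired base. Loops are tree nodes, base pairs are edges,
    so the result is the diameter of that tree counted in base pairs.
    """
    best = 0
    # Each entry stores the two largest depths (in base pairs) found so far
    # among the branches hanging below the current loop.
    stack = [[0, 0]]  # the first entry is the exterior loop
    for symbol in structure:
        if symbol == "(":
            stack.append([0, 0])  # a base pair opens a new loop
        elif symbol == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure: extra ')'")
            first, second = stack.pop()
            # Longest path whose highest point is the loop just closed.
            best = max(best, first + second)
            depth = first + 1  # crossing the closing pair adds one rung
            parent = stack[-1]
            if depth > parent[0]:
                parent[1] = parent[0]
                parent[0] = depth
            elif depth > parent[1]:
                parent[1] = depth
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced structure: missing ')'")
    return max(best, stack[0][0] + stack[0][1])


if __name__ == "__main__":
    # Two hairpins with 3 and 4 stacked pairs hanging off the exterior loop.
    print(maximum_ladder_distance("(((..)))..((((...))))"))
```

Averaging this number over the ensemble of predicted folds for each sequence gives the "average largest L" discussed here.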
By random I mean this: you say that the total number of nucleotides is some number, so 25% is A, 25% is C, 25% is G, 25% is U, and you just make a random sequence out of it, and you use the same programs to compute the folds, OK. And the interesting result they got is that the non-random RNA, namely the RNA of viruses, has a more compact path distance than random RNA. So the results for viruses are these ones: this is the maximum ladder distance, while the results for random RNA are here. The results for random RNA are systematically higher, which means, of course not surprisingly, that viruses, let's say living organisms, have an RNA which cannot be considered random; it is different. And this also means that presumably evolution has favored this. Of course this should be expected, but this is one of the first quantitative experiments showing that there is an amount of non-randomness that can be quantified for real viruses. And the quantity that people have measured is precisely this one. Is it clear?

I don't know why I didn't put the references. Ah yes, it's these people here. A lot of the work was done in the group of Bill Gelbart in the United States, plus collaborators of course. To my knowledge, the first work that used this sort of relationship is this one, "Predicting the sizes of large RNA molecules", published in PNAS in 2008, as you can see, so 16 years ago now. This is another plot showing basically the same thing; this one has fewer points, so it is clearer. Here you have the so-called maximum ladder distance as a function of the sequence length, OK.
What these people found is the following relationship. If you measure the maximum ladder distance for random RNA, the curve you find is the one shown here, OK. These are the points for random RNA, and they found a dependence on the total number of nucleotides with an exponent of about two-thirds, 0.67. It's a fit, OK, but this relationship seems very robust. And these are the points for the real viruses; as you can see, they are for different families. You have Bromoviridae RNA2, Bromoviridae RNA1, this kind of thing. The tobacco mosaic virus is here; it is one of the largest ones.

Just a few details on how they get these points, that is, how they generate the random RNA and then compare it to real RNA. They actually did two things. They did it the way I said: they considered a homogeneous composition of nucleotides, namely 25%, 25%, 25%, 25%, at fixed N, the number of nucleotides. Or they took the non-homogeneous composition of each viral RNA and reshuffled it. So the amount of each nucleotide is the same as in the real sequence, but you randomize the positions along the sequence. This is another way of doing randomness, if you want. And if you compare these two operations, you get basically the same exponent. However, for the real, non-random ones, taken from nature with the sequence order they have in real molecules, you get these data. As you can see, their maximum ladder distance is comparatively lower than that of the random ones.
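The two ways of building a random RNA described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the homogeneous case is implemented here with exactly equal counts of the four nucleotides (up to rounding), which is one possible reading of "25% each". The shuffled case keeps the composition of a given sequence and randomizes only the positions.

```python
import random

NUCLEOTIDES = "ACGU"


def homogeneous_random_sequence(length: int, rng: random.Random) -> str:
    """Random RNA with equal amounts of A, C, G and U (up to rounding)."""
    per_base, remainder = divmod(length, len(NUCLEOTIDES))
    letters = list(NUCLEOTIDES * per_base)
    # If the length is not a multiple of 4, fill up with distinct random bases.
    letters += rng.sample(list(NUCLEOTIDES), remainder)
    rng.shuffle(letters)
    return "".join(letters)


def shuffled_copy(sequence: str, rng: random.Random) -> str:
    """Same nucleotide composition as the input, positions randomized."""
    letters = list(sequence)
    rng.shuffle(letters)
    return "".join(letters)


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed, only for reproducibility
    print(homogeneous_random_sequence(20, rng))
    # A made-up fragment, not a real viral sequence.
    print(shuffled_copy("GGGAAACCCUUUGGGAAACC", rng))
```

Each artificial sequence would then be passed to the same folding program used for the viral sequences, so that the only difference between the two data sets is the order of the nucleotides.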
And this, according to their interpretation of course, but I think it's something you can believe in, is a sort of information saying that evolution has pushed strongly for really non-random sequences. Non-random not only in terms of the choice of the right base composition, but also in terms of how exactly the nucleotides are placed along your sequence. Because then you have to form these base pairs, and the base pairing has to be functional, of course, because the virus has to, sort of, live.

This is something that you can also see a bit qualitatively. This is a phage. A phage is a virus that infects bacteria, OK. It is a single-stranded RNA virus and contains about 4,000 nucleotides. This is the sort of conformation you get for its secondary structure, and this is the secondary structure you get for a similarly composed but random RNA, randomized in one of the ways I mentioned. The size is more or less the same: 4,000 nucleotides versus 4,200. You see that the random one has a much larger ladder distance than this one. You can appreciate that just by inspection. And this is a bit the message contained here.

What these gentlemen also did, and this is a hypothesis, is the following. The ladder distance is a sort of measure of the size of your molecule, but it's not a measure of the size in space, because the structures you get from these programs only tell you how frequently two bases interact and form base pairs. They don't tell you the extension of the molecule in space, because they don't know anything about that.
OK, so this is the important thing. However, what these people have done, and this is a hypothesis, is the following. Now I have my maximum ladder distance. Assuming that the maximum ladder distance is a measure of the average path on the molecule, I can assume that this path folds randomly in space. Then the associated gyration radius, so now this is the gyration radius of the molecule in space, has to grow like the square root of this ladder distance. If the path is folding randomly in space, it is just like a random polymer, so its size has to go like the square root of its length.

And this is a very interesting conclusion for the following reason. If you take the square root of this power law, whose exponent is very close to two-thirds, you get that the size of the typical RNA molecule in space goes like N to the one-third, OK. And such a folding means that the RNA molecule, seen as a polymer, is basically space filling; it has to be compact. You see this? Because if you compute the typical monomer density within the volume occupied by the molecule, this goes like N divided by the typical size R cubed. And because I'm predicting that R goes like N to the one-third, R cubed is N, so N over N just cancels, OK. This means that the density is of order one: the typical density of my RNA molecule inside its own volume is constant, and the polymer is space filling, it just occupies all the available space.
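The chain of estimates above can be written compactly as follows. This is only a restatement of the spoken argument, with numerical prefactors dropped; the symbols $L_{\max}$ for the maximum ladder distance, $R_g$ for the gyration radius and $\rho$ for the monomer density are notation introduced here.

```latex
\begin{align}
  \langle L_{\max} \rangle &\sim N^{2/3}
      && \text{(fit for random RNA, exponent} \approx 0.67\text{)} \\
  R_g &\sim \langle L_{\max} \rangle^{1/2} \sim N^{1/3}
      && \text{(the path folds like a random walk in space)} \\
  \rho &\sim \frac{N}{R_g^{3}} \sim \frac{N}{N} \sim N^{0}
      && \text{(constant density: space-filling, compact molecule)}
\end{align}
```

With the fitted value 0.67 instead of exactly 2/3, the size exponent would be 0.335, so the density would depend on N only very weakly.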
So this is more a prediction of this work than a hypothesis, but it's an interesting one for two reasons. The first is that, because the molecule is compact, it can fit efficiently inside its own volume. The second, in my opinion more interesting, is that by measuring something which doesn't know anything about the spatial folding of the molecule you are actually giving a prediction about how the molecule would fold.

Why a constant density is important has also another aspect, because these are RNA viruses. In RNA viruses, I don't know if I mentioned it to you, the RNA is contained in a structure which is the capsid of the virus, and the capsid is a very tight environment. So if the RNA attains maximum compaction, it means that it has done this job efficiently: the RNA can really stay inside its own volume. And again, this is a conclusion that you deduce only from the properties of the ladder distance, which doesn't know anything about the spatial size. So it is somehow self-consistent with the physical expectation, because you are only applying the physics of these paths, OK.

In the end we have repeated this analysis a bit, but, sorry, let me just check the time. No, we would probably enter into too much detail, and this is the thing that I wanted to tell you. So maybe it's good if you have some questions at this point. Yes?

Ah, this is simple. What I can measure from the outcome of my numerical experiment, I hope this is clear, is only this maximum ladder distance, which is basically the longest path on this branched tree, OK. It's not complicated to measure; you just count. This is what you get. These are just examples, but you can have many. Again, this is for a real virus, a phage, a virus that attacks bacteria, and this is for an RNA of similar size, I think with more or less the same composition, but randomized: I just shuffle the positions of the different nucleotides along the sequence. And I get this thing here, which as you can see has a larger maximum ladder distance.

Now, you measure this quantity as a function of the number of nucleotides, and you get this scaling relationship: the maximum ladder distance scales like N to about two-thirds, if the sequence is random. There is a complication here. This is the work we did with Ange, but it's a bit too detailed; if I had more time I could explain it. If you have only this kind of measurement, it's difficult to get any scaling for a specific molecule, because to get a scaling you need many values of N, and for a single molecule N is fixed, right? So you cannot do any real scaling out of it. It can be done with a bit more work, but I don't have the time to talk about that.

But you can certainly do it for random molecules, because a random molecule you can create artificially at any N you want. Or you can do what these people have done here, and I think Ange has also done this kind of work, for specific RNA sequences: RNA sequences in viruses span an enormous range, so you can have small RNAs, small in terms of nucleotides, or very large RNAs. So you have this kind of data. On top of that, you can take the same sequence composition and randomize it: you move nucleotides around and create your own ensemble of folds. This is also a sort of randomness, if you want. Then you repeat the game, you compute the maximum ladder distance as a function of the number of nucleotides, and you observe this power law.

So now we have this power law. What we can do next, and this is the discussion in these works, I think it was already in the first two, is to say: OK, I have my maximum ladder distance as a function of N. Now this ladder has to fold in space, but how can it fold? It can fold just randomly, like a random walk in space. Then I have no other possibility: the gyration radius of the average RNA molecule in space has to be the square root of the maximum ladder distance, exactly like for a random polymer, a linear chain, where it goes like the square root of the number of monomers. And if I take the square root of the maximum ladder distance, it has to go like N to the one-third. You see, it's just the square root of this. Is it clear?
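As an illustration of how such a power law could be read off the data, here is a minimal sketch that fits a straight line in log-log coordinates by ordinary least squares. The function name `fit_power_law` is hypothetical, and the numbers in the demo are synthetic values generated from an exact power law; they are not the data of the paper. On a log-log plot a different prefactor shifts the line up or down, while the exponent is the slope.

```python
import math


def fit_power_law(sizes, values):
    """Fit values ~ prefactor * sizes**exponent by least squares in log-log."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx  # slope of the line in the log-log plot
    prefactor = math.exp(mean_y - exponent * mean_x)  # vertical offset
    return prefactor, exponent


if __name__ == "__main__":
    sizes = [500, 1000, 2000, 4000, 8000]
    # Synthetic "maximum ladder distance" values, for illustration only.
    synthetic = [1.5 * n ** (2 / 3) for n in sizes]
    prefactor, exponent = fit_power_law(sizes, synthetic)
    print(f"ladder-distance exponent: {exponent:.3f}")
    # Random-walk assumption: the size in space goes like the square root.
    print(f"implied size exponent:    {exponent / 2:.3f}")
```

A fit like this only makes sense when many different values of N are available, which is why it is done over families of sequences and not for a single molecule.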
Well, because all the random points fall on this curve, OK. Also, if you take these points here, which are non-random, and you randomize them in the way I said, you just reshuffle the sequences, they all fall more or less here. I don't think these data are shown here. Actually there is a small difference, but I will not complicate things. I think this curve refers to the homogeneous composition of nucleotides. If you have a non-homogeneous one, but still random in the sense that you randomize along the sequence, the curve I think becomes a bit lower, but the scaling is the same, so the exponent is still about two-thirds. This is the important thing.

This is a log-log plot, OK. Maybe it's not evident, but you see that the spacing is not uniform, so it's a log-log plot. If you draw the same curve for folds whose composition is not homogeneous, you get another set of points; the curve I think becomes a bit lower, but the slope is the same, which is the important thing. So basically this means that the prefactor is different, and that may be related to the composition, but the exponent remains the same, which is of course the important information. Of course the prefactor is also interesting; if you want, it is a sort of renormalization of the effective bond. But as long as you don't touch the scaling behavior, it's fine. By the way, this is what matters in all polymer theories: the scaling, not the prefactor. OK, yes?

Ah, yes, they are error bars. This of course depends on the statistics of your sample. Here I'm not sure, because this is reproduced from the paper, whether they are standard errors or standard deviations. I think they should be error bars, I just don't remember, but I'm sure it's written in the paper. Why, do you think they are too large? OK. But consider that, when producing these folds, the statistics is not really enormous. It's not like you have an exponentially large ensemble. Even for random RNA, the number of folds you have with this kind of program is, if you are lucky, maybe about 1000. And 1000 means that the standard error of the mean is about one-thirtieth of the standard deviation, so it's possible that the error bars are not negligible.

Ange has mentioned to you the presence of pseudoknots. What these programs do is compute an energy score over all possible base pairings and minimize it, but of course they produce many structures. I think I have it here... ah, yes. It's very empirical, there's nothing really holy in it. This is called ViennaRNA; it's called Vienna, I think, because it was developed in Vienna, and there are different versions of it. It's very widely used. In the work with Ange we used this one, but if I remember correctly there are others. Anyway, you have an energy score that you need to minimize: you insert your favorite RNA sequence and the program minimizes this energy score. You have a term which is just an offset, which has to be there for some technical reasons; then a term which favors or penalizes, depending on its value, the number of branches, I mean real branches like this; and a term that favors or penalizes the number of unpaired nucleotides. And that produces a more or less large number of these folds, and that's it.

But out of it you don't get only one possible structure, you get many. And, well, it's a bit of a rule of thumb: you decide how many of those you want to retain, because the energies you get are all very close to each other. In the end, I think, in the best-case scenario you keep about 1000, as an order of magnitude. As you can see, this is the newest version, but the first version was published in 1981, so not exactly recent. There are new versions now, and in each new release they have changed these weights a bit, I mean these energy terms. It's very empirical: these terms were established from the best agreement with experiment, I think from calorimetric measurements.

A very important assumption of this kind of model is that it completely neglects the presence of pseudoknots. As I guess Ange has mentioned to you, pseudoknots make things much more complicated. If you introduce pseudoknots in the calculation, you can do it only for very short RNA molecules. If you go to large RNA molecules like the ones analyzed here, the problem becomes impossible to solve: the number of conformations you have to estimate becomes exponentially large and you just cannot do it, OK. I can give you the slides, but this is a bit too much detail.

Now, what was fun, what we found, is that these terms, I didn't put the precise values here, have changed over the years. In the first version branches were penalized more compared to unpaired nucleotides; in the newer version branches are penalized less. So it's a bit of an empirical business. Still, if you want a first understanding of how your RNA molecule should fold, you use this kind of tool. There are others, and they also produce different results, but this is basically the first thing you use, OK. And there is a server for it, in case you want to play with it; again, it's called ViennaRNA.

Well, there is a lot of stuff we have done, and these are the acknowledgements to the different people, who I think deserve them. Otherwise I think we can stop here, because I think it's about time. Do you have questions about this? OK, so I guess I'll see you tomorrow for the exam. As I said, it's just a basic question, so don't be too worried. What time is it, at 11? At 11, right? OK, I'll come a bit earlier. That's it.