What else can I predict? People have done experiments of the following kind. Take a polymer, some random polymer, and look at two sites along its backbone, one site here and another site there. If I walk along the backbone, these two sites are separated by some distance, let us say s base pairs. That is the genomic separation between the two sites. But I could also ask what the spatial separation between them is, the actual distance in space. People have done experiments to measure these quantities, using a technique called fluorescence in situ hybridization, or FISH. If you want to know where a particular sequence of the DNA is, you create a probe which is exactly the complementary sequence to that region. So, if this is A T T A, the probe would be T A A T, and you attach a fluorescent marker to it. This probe will go and attach to the specific region of the DNA that you are interested in, so you can tell where exactly inside the nucleus this segment of the DNA is. You can do this for every segment. If you tag this other segment, using a different probe of course, depending on what the sequence is over there, you would get where that sequence is on average inside the nucleus. And then you can say what the average separation between these two segments is.
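As a small illustration of the probe design step, here is a minimal Python sketch of base pairing. The function names `complement` and `probe_for` are made up for this example. The first writes the partner base under each base, which is how the A T T A example is read on the board. The second reverses that string, on the assumption that you want the probe written in its own 5' to 3' direction, since the two strands of a duplex run antiparallel.

```python
# Watson-Crick base pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq: str) -> str:
    """Partner base for each base, written in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in seq.upper())

def probe_for(target: str) -> str:
    """Reverse complement: the probe written in its own 5'->3' direction
    (an assumption about notation, not something stated in the lecture)."""
    return complement(target)[::-1]

print(complement("ATTA"))  # TAAT, the example used in the lecture
print(probe_for("ATGC"))   # GCAT
```

A real FISH probe is tens of bases long, not four, but the pairing rule is the same.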
If you did this for a standard polymer, let us say a random walk polymer, then the separation between two base pairs which are a distance s apart along the contour would typically go as R squared going as s to the power 2 nu, where nu is equal to half for a random walk polymer. So, if nu is half, this is s to the power 1, and R squared goes as s. That is what I would expect for a typical polymer: if I plotted R squared of s versus s, it would keep growing linearly. Now you can do this experiment for actual genomes. Here is a plot of actual FISH data: this is R squared as a function of the genomic separation between base pairs. What is this averaged over? It is not for one particular pair of base pairs. If these two base pairs are separated by s, I take the average for this pair over many, many cells, but I also average over all pairs which are separated by s. So, this is one pair, one and two, and this could be another pair: if I start somewhere over here, the partner would be somewhere over there, and these two would also be separated by s. So, I average over all pairs of base pairs which are separated by this genomic distance s and find the average spatial separation between them. It turns out that yes, it initially does increase, but then after a point, after around 1 megabase or so, it quickly approaches saturation. It does not keep growing indefinitely as I would naively expect; it reaches some sort of saturation pretty quickly. What could the reasons for this saturation be? A trivial reason, of course, is that these polymers are confined. You are putting them inside a confined volume, so they cannot go beyond the confines of the nuclear radius. So, there is some sort of a confinement effect.
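To see the two behaviours side by side, here is a rough Python sketch, not a model of any real genome. It assumes a lattice random walk in three dimensions with unit steps. Confinement is mimicked very crudely by rejecting any step that would leave a cubic box. The box size, chain length and number of chains are arbitrary choices. The average of R squared is taken over all pairs of sites separated by s and over all chains, the same kind of average as in the FISH plot.

```python
import random

def walk(n_steps, box=None, rng=random):
    """3D lattice random walk; returns the list of visited positions.
    If box is given, each coordinate is kept in 0..box by rejecting
    any step that would leave the box (a crude stand-in for the nucleus)."""
    pos = [0, 0, 0] if box is None else [box // 2] * 3
    path = [tuple(pos)]
    for _ in range(n_steps):
        axis = rng.randrange(3)
        new = pos[axis] + rng.choice((-1, 1))
        if box is None or 0 <= new <= box:
            pos[axis] = new
        path.append(tuple(pos))
    return path

def mean_r2(paths, s):
    """Average squared distance over all site pairs (i, i+s) and all chains."""
    total, count = 0.0, 0
    for path in paths:
        for i in range(len(path) - s):
            a, b = path[i], path[i + s]
            total += sum((x - y) ** 2 for x, y in zip(a, b))
            count += 1
    return total / count

rng = random.Random(0)
free = [walk(2000, rng=rng) for _ in range(40)]
confined = [walk(2000, box=10, rng=rng) for _ in range(40)]

for s in (10, 100, 1000):
    print(s, round(mean_r2(free, s), 1), round(mean_r2(confined, s), 1))
```

For the free walk the second column should grow roughly in proportion to s, which is the nu equal to half behaviour. For the confined walk the third column should level off at a value set by the box size, which is the kind of saturation seen in the data.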
There could be other effects, like tethering. It is known that there are proteins on the surface of the nucleus, which I will discuss, which like to bind to these chromosomes, and that again will restrict what sort of conformations are allowed. And then there are also protein-DNA interactions. If a protein came and bound two regions of DNA like this, then these two base pairs, over here and here, even though they are very widely apart on the backbone, would be extremely close in spatial separation. So, this leveling out of R squared could in principle be due to confinement, it could be due to tethering, it could be due to protein interactions, or more likely due to all of these together. But whatever model you build should reproduce this feature, that the curve should flatten out after around 1 or 2 megabases. Yes? No, I think this paper was for a mouse genome, and this is averaged. What do you mean by a single DNA? Let us say this and this. I have a fluorescent marker over here which tells me the location x, y, z of this segment. It is not at the level of a single base pair. The target strand has, I have forgotten, I think around 30 to 50 base pairs, and you build a complementary strand. So, that was the confusion. It is not a single base pair, it is a target sequence. The probe will come and seek its complementary target sequence and bind. But at the level of this genome, which is like 10 to the power of 9 base pairs, the 30 to 50 base pair probe is effectively almost a single point. It is a very small point on this whole DNA backbone. Yes? I see, ok. So, that is true.
So, you have this DNA double helix, which has, let us say, A T, whatever, something like this, and I am designing a probe which has the complementary sequence, T A. Now, in order for it to come and actually bind here: I am not sure of the exact answer to this, but one thing is that you therefore have two possible binding partners. The target strand could bind to its own complementary strand in the helix, or it could bind to the probe, with roughly equal energies if I have not done anything else. So, even if I were to do nothing else, the probes would occasionally bind to the target strand and give a fluorescence signal, maybe in only half the samples. But I am not sure that is the answer. I think they might do something so that the binding energy to the probe is slightly lower, so that the target preferentially wants to bind to it. I am not sure if they do it by opening up the DNA somewhat in this region or through some other method, but I can look that up. But even if I were to do nothing, I would still expect some fluorescent signal, because I have introduced this probe. Yes? No, right, you have two copies of every chromosome. When I say humans have 23 pairs of chromosomes, you have two copies of chromosome 1, two copies of chromosome 2, and so on, and then one X and one Y. So, 22 pairs, and then one X and one Y. And then when the cells divide, you know, this chromosome sort of... ok. So, confinement is sort of obvious: because you have a nuclear volume, you have some confinement effects. Let us look a little bit at these tethering proteins. Here is a particular class of proteins called lamin proteins, and these lamin proteins are preferentially found along the periphery of the nucleus. Sorry, forget this curve, just look at these things. There are different species of lamin proteins, closely related; for example, this is lamin B1, this is lamin B2, whatever.
But the key thing is that these lamin proteins are found distributed along the periphery of the nucleus, and people have done experiments to show that there are regions of the chromosome which bind very strongly to these lamin proteins at the periphery. So, you have these tethering interactions, which say that this polymer is not simply a free polymer inside this volume: there are constraints which say that certain regions of the polymer will actually be bound to this boundary, which is the nuclear wall, or the nuclear lamina as it is called. That is one class of interactions that can also provide some sort of compaction. If you had a polymer which was interacting with the wall in this way, the statistics, if you were to calculate the average R squared of s, would be different from the statistics of a free polymer floating inside this bounded volume. So, that is another thing, you have these tethering interactions. And thirdly, you have these protein-mediated interactions, which come under this very interesting thing called DNA looping. People have often found that there are proteins which loop distant segments of DNA. So, this is some protein which takes two segments of DNA which are very far apart and brings them into close proximity to one another. What I have got now over here is what I would call a DNA loop, and these loops play a lot of functional roles. For example, let us say I have a gene sequence like this, and in order for this gene to be active there is an activator region which sits very far away, somewhere over here, far away along the backbone. Suppose now there was a looping protein which bound this region to that one.
My conformation would then look like this, and a protein could come and bind to this activator region and start transcription of the DNA, or stop transcription of the DNA; it need not be an enhancer, it could be a repressor. But by bringing together distant segments of the DNA, you can play a role in regulating the genome itself. So, the structure is important in order to understand how the genomic information that is there in the DNA itself is interpreted, and it is interpreted through this higher-order regulation. If you bring together distant parts, some genes may start expressing, some genes may become silent, and so on. Let me say something about these looping proteins. It is actually quite interesting, or rather it is something that I am interested in, so I might as well tell you. What is known is that you form loops on all length scales. It is not just the short loops that you would expect randomly: you form extremely large loops, of the order of hundreds of kilobases, and occasionally even megabase-long loops. These are very well known in the literature, and a lot of the functional roles of these loops are also known. What was not known is what sort of proteins form these loops. What people hypothesized a few years back, 3 or 4 years back, is that you have some sort of a protein, which people call an extrusion complex, which has a structure that looks like two rings. It comes and binds to the DNA topologically. If I have two rings, imagine a thread passing through both of these rings; then I could pull on this thread and get a longer and longer loop.
And this loop formation would stop at certain specific sequence markers, which are called CTCF motifs. These are certain sequences on the DNA which tell these proteins to stop, and that is where the loop stabilizes. It was not really known what these proteins were; this was more of a hypothetical model. But recently, a couple of years back, people have identified certain candidates for these so-called loop extrusion complexes. For example, this is one such protein, a protein called cohesin, which has a structure like this. This is a dimer, actually. So, it could be that one arm of the DNA threads through here and the other arm threads through there, and you form this sort of a loop. Here is a cartoon: you form a loop, you thread through it, and you keep growing the loop until you hit the markers at the ends, which tell the proteins when to stop. So, how do these proteins actually extrude loops? People have done experiments on this as well, a nice set of experiments. This is what is called a DNA curtain. You take many, many DNA strands like this, you put these cohesin proteins on them, and you tag the cohesin proteins with some fluorescent marker. As the cohesins move about on the DNA, you can follow the trajectory of these markers and see what sort of a trajectory you get. These multiple DNA strands are the DNA curtain. You could think of this like an ensemble: it is an experiment with many, many copies happening simultaneously, and you can average over this ensemble to get average statistics.
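The extrusion picture can be written as a toy model in a few lines of Python. This is only a cartoon under stated assumptions, not the actual mechanism. The DNA is a line of numbered sites, and the complex loads at one site with two legs. Each leg moves outward one site per step. A leg stops for good when it reaches a CTCF site or the end of the DNA. The function name `extrude` and all the numbers are hypothetical.

```python
def extrude(load_site, ctcf_sites, length, max_steps=10_000):
    """Toy loop extrusion on sites 0..length-1.
    Both legs start at load_site and walk outward one site per step.
    A leg stalls at a CTCF site or at the end of the DNA (assumed rule).
    Returns the (left, right) anchors of the final loop."""
    stops = set(ctcf_sites)
    left = right = load_site
    for _ in range(max_steps):
        left_stalled = left in stops or left == 0
        right_stalled = right in stops or right == length - 1
        if left_stalled and right_stalled:
            break
        if not left_stalled:
            left -= 1   # loop grows on the left side
        if not right_stalled:
            right += 1  # loop grows on the right side
    return left, right

left, right = extrude(load_site=50, ctcf_sites=[20, 90], length=100)
print(left, right, right - left)  # 20 90 70
```

Wherever the complex loads between the two markers, the loop ends up anchored at the same pair of CTCF sites, which is the sense in which the markers stabilize the loop.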
Forget about the lower figure. If you look at cohesin on any such strand, you can look at its trajectory. You have to imagine that this axis is time, and this axis is where my DNA strand is. These are continuous snapshots of that fluorescent marker, so as I move along here, what I am seeing is the trajectory of that cohesin molecule on the backbone of the DNA. Let us say I start from somewhere; I can calculate how far I have gone in a certain amount of time. By averaging over all these DNA strands and all times, I can calculate how much this cohesin moves on the DNA backbone as a function of time, and it turns out it moves perfectly diffusively. For diffusion, the exponent with which the mean squared displacement grows with time should be one, and experimentally I think they determine it to be 0.98, which is pretty impressive. So, it actually moves diffusively. Not only that, you can therefore measure the diffusion coefficient, and they have done that experiment at different salt concentrations. Physiological salt concentration is roughly somewhere over here, between 100 and 250; inside the cell you have a certain salt concentration. At physiological salt, the diffusion coefficient of this cohesin molecule, which is forming this loop, would be somewhere around 0.1 to 1 micron squared per second. Again, disregard these lower figures. Further, what you see is that these cohesins are topologically bound to DNA; they are not chemically bound. How do I see that? For example, I can look at the locations where binding and unbinding of cohesin happen. So, here is my DNA strand; tell me if I am not being clear.
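The test for diffusive motion described above can be sketched as follows. The mean squared displacement is assumed to grow as t to the power alpha, with alpha equal to one for pure diffusion, and in one dimension it equals 2 D t. The Python below generates synthetic one-dimensional random-walk tracks, which stand in for the measured cohesin trajectories and are not real data. It averages the squared displacement over tracks and start times, then fits alpha as a slope on a log-log plot. Track count, track length and lag values are arbitrary choices.

```python
import math
import random

def simulate_tracks(n_tracks, n_steps, step=1.0, rng=random):
    """Synthetic 1D random-walk tracks (a stand-in for measured trajectories)."""
    tracks = []
    for _ in range(n_tracks):
        x, track = 0.0, [0.0]
        for _ in range(n_steps):
            x += rng.choice((-step, step))
            track.append(x)
        tracks.append(track)
    return tracks

def msd(tracks, lag):
    """Mean squared displacement at one lag, averaged over tracks and start times."""
    total, count = 0.0, 0
    for tr in tracks:
        for i in range(len(tr) - lag):
            total += (tr[i + lag] - tr[i]) ** 2
            count += 1
    return total / count

def fit_exponent(lags, values):
    """Least-squares slope of log(MSD) against log(lag)."""
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(v) for v in values]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)

rng = random.Random(1)
tracks = simulate_tracks(200, 500, rng=rng)
lags = [1, 2, 4, 8, 16, 32, 64]
values = [msd(tracks, lag) for lag in lags]
alpha = fit_exponent(lags, values)
D = values[0] / (2 * lags[0])  # 1D: MSD = 2 D t, in units of step^2 per time step
print("alpha =", round(alpha, 3), " D =", D)
```

With real tracks you would replace `simulate_tracks` by the measured positions. An alpha well below one would point to subdiffusion, and an alpha well above one to directed motion.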
Let us say here is my wall, to which I have attached my DNA strand, and here is my cohesin protein, which is doing a diffusive random walk on this DNA backbone. Now, you could figure out where these cohesin proteins bind, at what region they bind. Look at the histograms: the y axis is the position along the DNA, and the histograms on your left are the binding events. They bind sort of everywhere along the DNA. There is some sequence specificity, but on average they bind everywhere. On the other hand, if I look at where they come off, they come off only at this end of the DNA, the free end which is floating around. So, once a cohesin is bound, it does not unbind from random positions; it travels to the end and then slips out from there, which is consistent with the ring idea. It is not a chemical bond but a sort of topological bond. It forms a ring like this, that ring does a diffusive walk, and only when it reaches the end does it fall off. You can show that, for example, if you capped this end of the DNA as well, then the cohesin would actually not fall off. These experiments are done in nanofluidic channels, so you have some fluid flowing to push the cohesin along this direction. When the flow is on, all the cohesins are at this end. They have all reached the end, but they cannot come out because I have put a cap over here. When I take the flow off, they go back to doing a diffusive random walk. Again I start the flow, everything gets pushed here, and again when I turn it off, they go back to doing a random walk. So, it has extremely long lifetimes.
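The ring picture can be put into a toy simulation. The Python below is a sketch with the topological assumption built in by hand, so it illustrates the hypothesis and does not test it. The ring loads at a random site and does an unbiased random walk. It cannot pass the tethered end at site 0, and it can leave only by sliding off the free end, unless that end is capped. The name `ring_on_dna`, the strand length and the step limits are made up for this example, and the flow is not modelled.

```python
import random

def ring_on_dna(length, capped, max_steps, rng):
    """Random walk of a ring on sites 0..length-1 of a tethered strand.
    Site 0 is at the wall (reflecting). The free end is beyond site length-1.
    Returns (load_site, exit_site, steps); exit_site is None if it never left."""
    load = rng.randrange(1, length)  # binding can happen anywhere along the strand
    pos = load
    for step in range(1, max_steps + 1):
        pos += rng.choice((-1, 1))
        if pos < 0:
            pos = 0                  # wall: the ring cannot slide past the tether
        if pos >= length:
            if capped:
                pos = length - 1     # cap: the ring is pushed back onto the strand
            else:
                return load, length, step  # ring slips off the free end
    return load, None, max_steps

rng = random.Random(2)
open_runs = [ring_on_dna(20, False, 100_000, rng) for _ in range(200)]
capped_runs = [ring_on_dna(20, True, 2_000, rng) for _ in range(20)]

print("distinct loading sites:", len({load for load, _, _ in open_runs}))
print("exit sites (open end):", {site for _, site, _ in open_runs})
print("exit sites (capped):", {site for _, site, _ in capped_runs})
```

Loading sites are spread along the strand while every exit happens at the free end, and with the cap in place no ring leaves within the simulated time. That is the qualitative pattern of the binding and unbinding histograms.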
So, it is not a chemical bond that binds and unbinds. It is actually a topological bond, or at least some of them form a topological bond, which once it is bound stays on for a very long period of time. What is the y axis? This is the DNA strand. Let us say it is 0 bases at this end and then whatever, 50 kilobases, at that end. So, you have to imagine the y axis as the DNA strand, that is, the position of the cohesin molecule on the DNA strand, and the x axis as time. So, you could ask: ok, this is how cohesin moves on DNA, it diffuses on DNA, that is nice. But we just saw that DNA inside cells is not just bare DNA; it comes in the form of these nucleosomes. These are histone complexes with the DNA wrapped around them, and the histone complex is a protein octamer, so it is a large object. So, what happens when this diffusing protein, this cohesin, faces a large object on its way? Again they have done the same set of experiments; I actually have a reference, this reference over here, from 2017.