So, again, they have done experiments where, instead of only nucleosomes, you put barriers of different sizes on the DNA. You put some sort of a barrier and then you ask how it affects the motion of this cohesin along the DNA backbone. For very small barriers you would expect that nothing happens; for very large barriers you would expect that cohesin never crosses. Do we see that or not? For example, here is a very small protein which I have put at this location on the DNA. If my whole DNA is around 50 kilobases, at the 20 kilobase mark I have attached a protein, and then I look at what the cohesin trajectories show. This protein is, if I remember correctly, around 2 nanometers. Because it is so small, the cohesin ring does not even see it; it continues to do its random walk happily on its own. Now, on the same protein you put a quantum dot, whose size is written somewhere. It is a much bigger obstacle now, and you put it at the same location, the location of the arrow, so at all times there is this obstacle at this location. Then you ask what happens to the trajectory. As you can see, it goes, it hits this obstacle, and then it comes back: because the obstacle is large, it cannot cross over. It comes back, it does its random walk again, and when it reaches the boundary again it has to turn back. You can do the same thing with different obstacles, and by doing this in a controlled fashion you can ask what the effective size of this ring is. When an obstacle becomes so big that the ring cannot cross over it, that gives me an upper limit on the size of the ring itself.
So, they do this and they find that this cohesin pore is roughly, I think, between 10 and 20 nanometers. What happens if you actually put nucleosomes? So, this is a nucleosome; remember, this is DNA wrapped around the histone complex. Here is where I have placed a single nucleosome on the DNA backbone, and I observe the trajectories. What I see is that there are crossings: occasionally when cohesin comes here it crosses over to the other side. But sometimes when it comes to this boundary it does not cross, it comes back, and it does so more often than you would expect purely on the basis of a random walk. So this is roughly like a semi-permeable barrier. I have a one-dimensional random walk of this cohesin protein on the DNA backbone, and somewhere over here I have placed an obstacle. When the random walker meets this obstacle, with some probability it will cross over and with some probability it will go back. This crossing probability is not 0, but it is pretty small. If there were no obstacle here, of course, both sides would be half and half. So it is definitely not half, it is much smaller than that, but it is not 0. I do not know if you can see this figure, but you can place this nucleosome at different locations and you can see that the permeability takes a dip exactly at the location of the nucleosome. If you place the nucleosome here, the permeability takes a dip here; if you place it at some other location, it takes a dip there. So nucleosomes function as obstacles, but not complete obstacles: they do not completely stop the motion, but they definitely slow it down. These are experiments for a single nucleosome, so you could ask what would happen if I put more and more nucleosomes.
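The semi-permeable barrier picture can be sketched as a small simulation. This is only an illustrative toy, not the analysis used in the experiments: the lattice of 50 sites (think of 1 kilobase per site), the barrier on the bond in front of site 20, the crossing probability `p_cross`, and the function name are all assumptions chosen for illustration.

```python
import random

def simulate_barrier_walk(n_steps, n_sites=50, barrier=20,
                          p_cross=0.1, start=10, seed=0):
    """Unbiased 1D lattice walk with reflecting ends and one semi-permeable bond.

    The barrier sits on the bond between sites `barrier - 1` and `barrier`.
    Each attempt to hop across that bond succeeds with probability `p_cross`.
    Returns (number of crossing attempts, number of successful crossings).
    """
    rng = random.Random(seed)
    x = start
    attempts = 0
    crossings = 0
    for _ in range(n_steps):
        new_x = x + rng.choice((-1, 1))
        if new_x < 0 or new_x >= n_sites:
            continue  # reflecting ends of the DNA: the walker stays put
        # Does this hop go across the barrier bond (in either direction)?
        if {x, new_x} == {barrier - 1, barrier}:
            attempts += 1
            if rng.random() >= p_cross:
                continue  # bounced back from the obstacle
            crossings += 1
        x = new_x
    return attempts, crossings

if __name__ == "__main__":
    for p in (0.0, 0.1, 1.0):
        attempts, crossings = simulate_barrier_walk(200_000, p_cross=p)
        print(p, attempts, crossings, crossings / attempts)
```

Setting `p_cross=0.0` mimics the large quantum-dot obstacle (the walker never gets past), `p_cross=1.0` mimics the small protein that the ring does not see, and intermediate values mimic the nucleosome.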
So, they put many, many nucleosomes now. This lower figure is for an extremely high density of nucleosomes: everywhere along here there are many nucleosomes. Then you look at the cohesin trajectory and, as expected, because of this extremely high density the cohesin does not really move at all. It stays roughly fixed at a particular position on the DNA, because even if it moves a little bit it sees another obstacle, and so on. You can calculate various things like survival probabilities, but in particular you can estimate how the diffusion constant falls as you put in more and more obstacles, or equivalently as you decrease the linker length between two nucleosomes. The answer that these people get is a diffusion coefficient as a function of linker length. Realistic linker lengths, like I said, are roughly 30, 200, 150 maybe, so somewhere along this region of the curve. There, whereas for completely bare DNA the diffusion coefficient is around 1 micron squared per second, you get something of the order of 10 to the power of minus 3 to minus 4 micron squared per second. And this is actually a puzzle. Suppose you have an object which is diffusing on a 1D lattice with something like 10 to the power of minus 4 micron squared per second. I know from Hi-C experiments, which I will talk about in a bit, that these very large loops are formed, of the order of kilobases and even megabases. I could ask, given this diffusion coefficient, what would be the typical time required to form such a large loop. It turns out that if you put in this diffusion coefficient and the typical sizes of the loops, the time scales that you get are of the order of hours.
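A rough back-of-the-envelope version of this estimate can be written down as follows. It is a sketch under stated assumptions, not the calculation from the paper: it uses the standard 1D diffusion estimate t ≈ L²/(2D), it converts base pairs to length using the bare B-DNA contour length of about 0.34 nm per base pair (ignoring any compaction by nucleosomes), and the 10 kilobase loop size is just an illustrative choice.

```python
BP_LENGTH_UM = 0.34e-3  # contour length of bare B-DNA per base pair, in microns

def diffusion_time_seconds(loop_bp, d_um2_per_s):
    """Time for 1D diffusion to cover a contour length of `loop_bp` base pairs.

    Uses <x^2> = 2 D t, so t ~ L^2 / (2 D). This is an order-of-magnitude
    estimate, assuming distance along the DNA equals bare contour length.
    """
    length_um = loop_bp * BP_LENGTH_UM
    return length_um ** 2 / (2.0 * d_um2_per_s)

if __name__ == "__main__":
    loop_bp = 10_000  # illustrative 10 kb loop
    for d in (1.0, 1e-3, 1e-4):  # bare DNA vs. nucleosome-covered DNA
        t = diffusion_time_seconds(loop_bp, d)
        print(f"D = {d:g} um^2/s -> t = {t:.3g} s = {t / 3600:.3g} h")
```

With these assumptions a 10 kb loop takes a few seconds on bare DNA (D around 1 micron squared per second) but about 1.6 hours at D = 10⁻³ and about 16 hours at D = 10⁻⁴, which is the mismatch in time scales described above.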
Whereas typically you would need to form these within less than a second, because the cell has to organize the genome very quickly. So you would need to form these loops on the order of seconds, but if you put in the numbers which are measured, it seems that it would take minutes or hours at least to form these large loops. So, again, this is something that is not completely understood. We understand that these loops form because of loop extrusion factors like cohesin, but we are not sure how to reconcile the time scales that we get from here with the time scales that we observe in experiments. Now, one more set of experiments. These are a very nice class of experiments called chromosome conformation capture, and again they tell us something about the large-scale statistical properties of the chromosome. What I want to ask is a related question. From the FISH experiments, one thing that we could measure was the average separation between two elements on the genome which are s base pairs apart. I want to ask something in a similar spirit. If I take a chromosome, or any polymer, then purely randomly two segments of this polymer could come very close to each other. If these two segments are very close together along the backbone, they have a high probability of being found together. If they are very far apart, they have a low probability of being found together. That is called the contact probability, and it is denoted by P_c(s). It asks: what is the probability that two base pairs which are a distance s apart on the backbone have come into close proximity to each other in 3D space?
So, that is called the contact probability, and there are these very nice experiments called chromosome conformation capture experiments which measure it. I will just give a quick run-through of what they do. Here is my DNA, and let us say that this region and this region are very close to each other. What I do is cross-link these regions so that they stay close to each other. Then I lyse the cells so that I can take out the chromosome. I then chop up the chromosome into small bits, so the regions which were very close to one another stay together in one cross-linked piece and the remaining parts are chopped off. I wash away all the other strands which do not have this protein marker, and then I ligate these two strands, the red and the yellow. Let us say this one was red and this one was yellow. Remember, they are initially very far apart along the chromosome, but at the moment I did this experiment, for whatever reason, they were close together, whether that was due to some loop extrusion factor or purely randomly. So I freeze them in that snapshot in time, I do this restriction and this ligation so that I join them together, and then I wash away the proteins and sequence the DNA that I have. This DNA now contains one segment which is from the red region and another segment which is from the yellow region. I do this across millions of cells, many times, and I get an ensemble of experiments which tells me how often I find this segment and that segment, which were some s base pairs apart, in contact.
So, if I have done a million such experiments and in a thousand of them I find that these two are in contact like this, I would say that gives me my contact probability for that separation s, and so on and so forth. You can build these curves now, and you can do this on actual genomes. For example, here is a normalized contact probability as a function of the genomic separation along the backbone, for the yeast genome and for the human genome. Here is what it looks like: it falls off roughly as a power law, so P_c(s) goes as s to the power of minus beta. In yeast, beta appears to be roughly 3/2 on average, and in humans beta appears to be roughly 1 (since I have written minus beta in the exponent, beta itself is positive). This also tells you a little bit about how tightly the chromosome is packaged. If you did it for different chromosomes, 1, 2, 3, 4, up until 23, it would tell you if there is a difference between how tightly these different chromosomes are packed. For example, in the figure I showed earlier, for the chromosomes that were on the periphery versus the ones in the interior, you would see different contact probability curves for this chromosome versus that chromosome. And this is something that you can actually calculate from simple polymer models. Remember, humans give s to the power of minus 1 and yeast gives s to the power of minus 3/2. Now, I know at least one simple model for a polymer, which is the random walk polymer. So I can ask: what does the random walk polymer tell me about the contact probability as a function of the separation? Here is my probability of loop formation: it is the number of looped conformations divided by the total number of conformations.
So, let me again do an idealized case first, a 1D polymer. Remember, my 1D random walk can take steps to the right or to the left with probability half and half, if it is an unbiased random walk. A looped conformation then corresponds to the case where the number of steps it has taken to the right is exactly equal to the number of steps it has taken to the left, so it comes back to the origin. I can calculate the probability of that. The number of looped conformations is just n choose n/2, and the total number of conformations is 2 to the power of n. Because these n are very large, I can use Stirling's approximation to simplify this. If you do that, what you get for the probability of loop formation for this one-dimensional random walk is that it goes as n to the power of minus 1/2. Now, of course, I did this in the discrete sense. You could also write down the full probability distribution, like we wrote down for the end-to-end vector: it is this Gaussian, e to the power of minus r squared by 2 n a squared, where a is the step length. Let us say that I have some cutoff: when two monomers approach each other within some distance delta, I will say that a loop has formed. Then I can calculate the probability of loop formation. I take this probability distribution and integrate over all end-to-end vectors which lie between minus delta and plus delta, and that gives me the probability of loop formation in this 1D polymer. Again, if you do this simple Gaussian integration from minus delta to delta, what you get is that the contact probability scales as n to the power of minus 1/2, that is, s to the power of minus 1/2, s being the genomic separation.
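The 1D counting argument can be checked numerically. The sketch below compares the exact return probability, (n choose n/2) / 2ⁿ, with the leading Stirling result, sqrt(2 / (π n)), and estimates the scaling exponent from two values of n. The function names are illustrative choices, not something from the lecture.

```python
import math

def loop_probability_exact(n):
    """Exact probability that an unbiased n-step 1D walk is back at the origin."""
    if n % 2:
        return 0.0  # an odd number of steps can never return to the origin
    return math.comb(n, n // 2) / 2 ** n

def loop_probability_stirling(n):
    """Leading-order Stirling approximation of the same quantity."""
    return math.sqrt(2.0 / (math.pi * n))

def scaling_exponent(f, n1, n2):
    """Slope of log f(n) versus log n between n1 and n2."""
    return math.log(f(n2) / f(n1)) / math.log(n2 / n1)

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, loop_probability_exact(n), loop_probability_stirling(n))
    print("exponent:", scaling_exponent(loop_probability_exact, 100, 10000))
```

The estimated exponent comes out very close to -1/2, as expected for a 1D random walk.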
So, for a 1D random walk polymer you get that P_c(s) goes as s to the power of minus 1/2. But of course real polymers are not 1D, they are 3D polymers. That is fine: we know the probability distribution for a random walk polymer in 3D, that also we did last class. So let us take a three-dimensional polymer. This is my three-dimensional probability distribution, e to the power of minus 3 r squared by 2 n a squared, with this normalization factor. Again, I will say that a loop has formed if two monomers approach within some distance delta of one another. Therefore I can calculate the probability of loop formation, or this contact probability, in 3D: it is the integral of 4 pi r squared times this P_3D, dr, from 0 to delta, where delta is my cutoff. You can do this integration; it is an r squared times e to the power of minus r squared sort of Gaussian integral. I am not interested in all the factors of delta and a and so on, I simply want to see how it scales with the separation along the backbone, that is, how it scales with n. If I put in the probability distribution for the 3D random walk, the contact probability scales as n to the power of minus 3/2. This is nice, because if I now go back and see what the experiments measure, it seems that I could actually explain the yeast data, at least by this measure; there are other measures which may or may not agree. If I simply look at the contact probability for the yeast genome, what I would predict from a simple random walk calculation is actually what is observed: it scales as s to the power of minus 3/2. On the other hand, that is not true for humans.
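The 3D integral can be done in closed form and checked numerically. Writing x = δ sqrt(3 / (2 n a²)), the integral of 4πr² P_3D(r) from 0 to δ, with the standard Gaussian normalization (3 / (2π n a²))^(3/2), equals erf(x) − (2x/√π) e^(−x²), which for small x behaves as x³ and hence as n^(−3/2). The sketch below evaluates this; the choice a = δ = 1 and the function names are illustrative assumptions.

```python
import math

def contact_probability_3d(n, a=1.0, delta=1.0):
    """Probability that the ends of an n-step 3D Gaussian chain lie within delta.

    Closed form of  integral_0^delta 4 pi r^2 P_3D(r) dr  with
    P_3D(r) = (3 / (2 pi n a^2))^(3/2) * exp(-3 r^2 / (2 n a^2)).
    """
    x = delta * math.sqrt(3.0 / (2.0 * n * a * a))
    return math.erf(x) - (2.0 * x / math.sqrt(math.pi)) * math.exp(-x * x)

def scaling_exponent(f, n1, n2):
    """Slope of log f(n) versus log n between n1 and n2."""
    return math.log(f(n2) / f(n1)) / math.log(n2 / n1)

if __name__ == "__main__":
    for n in (1e2, 1e4, 1e6):
        print(n, contact_probability_3d(n))
    print("exponent:", scaling_exponent(contact_probability_3d, 1e4, 1e6))
```

For n much larger than (δ/a)² the estimated exponent approaches -3/2, independent of the particular values of δ and a, which only set the prefactor.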
So, you cannot use a simple random walk model to explain the contact probability of humans, and people have come up with various sorts of models in order to get this minus 1 exponent. I will not discuss them; if you are interested I will tell you later. So there is a question: what sort of a polymer model could you build which would explain P_c(s) going roughly as s to the power of minus 1, and ideally also all of the other things that we have noted? These are all things that whatever model you build in order to explain the higher-order organization of chromosomes would need to reproduce. Now, I said that different chromosomes might fold in different ways. Is that true? Again, let us look at the experiments. If I remember which cell line this is, I think this is a human fibroblast cell and this is a human embryonic stem cell. Let us look at the fibroblast cell first. The different lines are for different chromosomes, chromosome X, 11, 12, 19, and you plot the contact probability for each of these chromosomes. You will see that each chromosome lies on a slightly different line; each has a different alpha. I was calling that beta, they are calling it alpha, so this is the same as my beta over here. Each of them has a different exponent. They lie roughly around 1, from 0.93 to 1.3, but they are distinctly different, which means that each chromosome has folded to a different degree of compaction. If you were to do it in a different cell line, you would see different exponents for each of these chromosomes. In particular, if you look at human embryonic stem cells, it seems that this distinction has actually gone away. All these chromosomes now fall on the same line, which is roughly, which line is this, around 1.65. The higher the exponent, the more loosely it is packed.
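An exponent like these is the negative slope of the contact probability curve on a log-log plot. How the exponents in the figure were actually fitted is not stated here; the sketch below simply assumes an ordinary least-squares line through (log s, log P_c), and it is run on synthetic, exactly power-law data rather than on any real measurements.

```python
import math

def fit_power_law_exponent(separations, probabilities):
    """Least-squares slope of log P_c versus log s; returns beta in P_c ~ s^(-beta)."""
    xs = [math.log(s) for s in separations]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

if __name__ == "__main__":
    # Synthetic curves only, to illustrate the fit (not experimental data).
    s_values = [10 ** k for k in range(3, 8)]
    for true_beta in (1.0, 1.5):
        pc = [s ** (-true_beta) for s in s_values]
        print(true_beta, fit_power_law_exponent(s_values, pc))
```

On real contact probability curves the power law only holds over a limited range of s, so the fitted value depends on which range is used.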
So, remember what I said: when you pack something very tightly, it is more likely that this region does not contain any genes necessary for that cell type. Now, the embryonic cell type is in the embryo, so it has not yet, how should I say, specialized to a particular cell type. It is not differentiated. This embryonic stem cell could become whatever, heart, liver, whatever, but in this state it has not. So there is no distinction between which genes I need for the functioning of a particular cell type and which genes I do not, and therefore all of the chromosomes compress or package to roughly the same extent. There is no distinction between the chromosomes. As you go through differentiation, as these embryonic cells divide and differentiate into different cell types, you will see them start to diverge from this behavior, and ultimately in a completely differentiated cell type like this fibroblast you will see different chromosomes folding to completely different extents. So these are dynamic. This information is not there in the sequence, because the sequence is the same in this cell and in that cell. This is something apart from the sequence, apart from the genetic information, which tells you how the sequence information is to be used by the cell in a functional way. Moreover, these are all obtained from the chromosome conformation capture experiments that I described. Technically they come in various forms depending on the sophistication of the experiment; let us say these are called Hi-C experiments. So these are all from these sorts of experiments, but you can do even better. Remember, these curves are averaged over the chromosome: they are averaged over all base pairs which are a distance s apart.
So, I took this base pair and this base pair which were s apart, and I found out the probability of them coming together. I took this one and this one, which were also s apart, and I found out the probability of them being in contact. I averaged over all such pairs in order to generate this contact probability curve. But I have more granular information than this, which is how often exactly this pair and this pair come in contact, and you can show that information as well. Because the figures are so bad, let me just draw it. Both axes are, let us say, some particular chromosome, say chromosome 20. The x axis is base pair 1, 2, 3, 4, all the way up to whatever number of bases that particular chromosome has. The y axis is the same, 1, 2, 3, all the way up to n. So if a chromosome has n base pairs, I can construct an n cross n matrix, and the color intensity tells me how often this pair has come in contact, how often that pair has come in contact, and so on. Ordinarily, what would you expect? It is usually drawn differently, but since I have drawn it like this, I would expect that the highest intensity would be along the diagonal, because those pairs are anyway close along the backbone itself, and then as I move away from the diagonal the intensity would fall off. What you see is that this is not strictly true. There are domains like this where, even though the regions are far apart, they have very strong interactions with each other, and we now know that these domains have an important functional role to play.
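The averaging that turns the matrix into the contact probability curve is simply an average along each off-diagonal. Below is a minimal sketch on a made-up 4 × 4 matrix of contact counts; the numbers and the function name are purely illustrative, and real maps are binned (each row and column is a genomic bin of many base pairs, not a single base pair) and need normalization before the averages can be read as probabilities.

```python
def contact_curve_from_map(contact_map):
    """Average contact frequency at each separation s (the s-th off-diagonal).

    `contact_map` is a square, symmetric list of lists; entry [i][j] counts
    how often positions i and j were found in contact.
    """
    n = len(contact_map)
    curve = {}
    for s in range(1, n):
        # All pairs (i, i + s) are a distance s apart along the backbone.
        values = [contact_map[i][i + s] for i in range(n - s)]
        curve[s] = sum(values) / len(values)
    return curve

if __name__ == "__main__":
    toy_map = [
        [0, 8, 2, 1],
        [8, 0, 6, 3],
        [2, 6, 0, 4],
        [1, 3, 4, 0],
    ]
    print(contact_curve_from_map(toy_map))  # {1: 6.0, 2: 2.5, 3: 1.0}
```

A domain of unusually strong long-range contacts shows up in the matrix as a bright off-diagonal patch, but is largely washed out in this averaged curve, which is why the full map carries more information.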
So, for example, if there is a protein which always brings these two into contact, then this base pair and this base pair would show a very bright red spot in these figures, and that tells us something about the structure of these chromosomes. You can do this for all the different chromosomes. For example, these are all the human chromosomes, 1 to 22 and so on. So it tells us how each individual chromosome is organized for a particular cell type. If we did this for another cell type, you would see some other sort of feature emerging, and even in a particular cell type these features change depending on the state of the cell. This sort of matrix that I have constructed is called a contact map, a chromosome contact map. So if I plot these chromosome contact maps, let us say for chromosome 18, for the same cell type but at different points in the life cycle of the cell, you will see that the features are actually different depending on what stage of the life cycle it is in. So these contact maps depend not only on the cell type; even in a single cell type they depend on time, on where in the cell cycle the cell is. These are, how should I say, very rich data that one has now, because of these extremely beautiful experiments that people have devised.
The challenge is to come up with a polymer model, or whatever model you want, that will take all of these disparate experiments and tell us exactly what is going on when this chromosome goes inside the nucleus and organizes itself. That is still an open question; we do not know. But at least whatever model we come up with must agree simultaneously with all of these different experimental observations that we have. By the way, all of this data is open source, and this website over here lets you explore it. If you go to this program called Juicebox, you can load whichever human chromosome you want, see how the contact maps look for each of the human chromosomes, and average them to generate the contact probability curves that I showed, and so on. It is a good resource to play around with if you are interested. So, I think I will stop here for the day, and I will continue with polymer models in different contexts, proteins and so on.