OK, so I just wanted to finish a couple of slides from yesterday; there was also something I didn't get to yesterday that I need as an introduction for today, and then I'll start today's talk. I wanted to show you where we are with this material from yesterday and where we are headed. Our goal, eventually, is to develop a theory that can take all of this data together. The way we approach it is slightly different, because we want a theory that explains the phenomenology of the network from first principles, and not just by taking all the data, fitting all the means, and adjusting 38 different parameters or so. For this you need an anchor, and our anchor is to work with realistic constraints for these biochemical signaling networks, constraints that are based on first principles. What we want to achieve is this: given the inputs into your network, like the transcription factors, the positions, the ligands, or whatnot, we want to maximize the transmission of information in the sense of Shannon. That information can be defined rigorously, mathematically; it is based on the same kinds of probability distributions that I showed you yesterday. And, as we already alluded to yesterday, there are signatures in our data that lead us to believe that information is optimized in our system: starting from those broad gradients that contain positional information, set up by the mother, that information is passed on to the different layers of the network in an optimal manner. So that is the principle we would like to put behind our network. We then optimize the parameters under the constraint that the number of signaling molecules is finite. That, of course, leads to biochemical noise, and you then need to look at how you deal with the noise in gene
expression in your system. Very briefly, the key ideas for this theory are the following. Because we want to characterize not just the means but also the covariance matrix, which we have access to in our data, you need to understand the biophysical constraints that govern noise in gene regulation. So you need a noise model; we are testing several, trying to make them as realistic as possible but also as rich as possible, so that we have a lot of flexibility. The same goes for the gene regulation dynamics, for which we also have a rich set of parameters: enough parameters to fit the model, but not so many that you overfit. And because we eventually want to recover not just the means but also the covariance matrix, we have a much richer set of data with which to constrain our parameters. In the end this is all optimized by maximizing one objective function, the transmission of positional information. That is a number with a unit, bits, so you have a real measure of how well you are doing, and you can compare it to how much information is inherent in your model. What is different is that we don't fit the profiles that the model spits out directly to our means and use that to adjust the connectivity matrix. What we want is to set up the system with plausible parameters, as few as possible, most of which we have hopefully measured, and have it give us candidates that spit out profiles resembling what we have right now. Eventually that will be based on optimizing the
transmission of information, which is a much broader principle than writing down a set of differential equations for each of your players and fitting those to your means. All of this is a bit abstract, and I don't want to talk too much about it today. This is work that has been going on for almost ten years with our collaborators Gašper Tkačik in Vienna, Bill Bialek in Princeton, and Aleksandra Walczak in Paris. One person who is right now at the core of the continuation is Thomas Sokolowski, who sits in the audience; he has a poster tomorrow, and I'm sure he's happy to tell you much more about the details of how we are proceeding.

All right, I want to go on to talking about transcriptional regulation today. So let's switch gears and think back to what I talked about yesterday. I told you that the fly system seems to be extremely precise and reproducible, to the greatest extent that it could be: to the level of individual cells. There is no need for the system to be any more reproducible or precise. And this is a network based on transcription; all the players are transcription factors. But transcription is inherently noisy; at least, we know that from single-celled organisms. So, as I alluded to yesterday, we wanted to check whether there is something special about transcription in development that makes the system as precise as we see in our data. We developed a system to look at this in vivo. Here I'm looking again at the transition from this broad input gradient, this exponential gradient, and how it switches on this step-like function, which splits the embryo in half at its center and shows you an on-off transition, if you want. We developed a method with which you can measure the activity of transcription in vivo, and the way this works is that essentially you can make a
reporter for this gene. The gene is called hunchback. You take the regulatory sequences of hunchback and fuse a little reporter to them at the five-prime end of the promoter. Now, whenever polymerase goes by this reporter, it attaches a little stem loop, or a cassette of stem loops, to the unfinished piece of mRNA. These stem loops are called MS2; it's just a small sequence, about one kb, that gives you 24 of those stem loops, which are now dangling on your elongating polymerase. Okay? Oh, something didn't work when I redid this talk.

Yes, sorry? How does it become a reporter, exactly? I haven't finished yet. What is reported is this: these stem loops are recognized by a protein, a recognition protein for those stem loops. Both come from a bacterial system, so bacteria make these things naturally: they have an mRNA stem loop that they just generate, and there's a protein that happens to recognize that stem loop. Now you can take that protein, fuse it to GFP, and provide it through the mother. This protein will bind to those stem loops, it has GFP attached to it, and so each polymerase gives you one more unit of fluorescence at the site in the genome where you're making this gene.

Here in this movie, yesterday it was white, now it's red. In red you see the DNA again; yesterday this was histone-GFP, now it's histone-RFP. It's the same movie as yesterday, just with a different color. In red you see these nuclei, they divide, and so on. It's a zoom-in on what I showed yesterday, not the entire embryo. But now in each nucleus you discover two green spots, and those green spots correspond to the two transcription sites. Because flies are diploid, with a chromosome from the mother and one from the father, you see two of
them. What you can do now is go in and ask what the intensity of these spots is as a function of time. If you take that one-hour window that I looked at yesterday with the gap genes, after the last division has happened, what happens to the intensity? It comes out of mitosis, it increases, and then it fluctuates. Those fluctuations show you that there is noise in transcription: there are bursts, or however you want to call them. So there's nothing really special about transcription in the fly system; the fly has just found a way to cope with the noise. Is there any question about this? Because I know you're going to use this system quite a bit.

The advantage of this system is the following. Yesterday we measured gene expression by looking at the protein output in a dead animal. So why don't I look at the protein output of a living animal? I could just fuse GFP to the protein that is made. The problem is that GFP takes a while to actually become bright; that's called the maturation or folding time of the protein. If you measured the protein output you would have a time delay of something like 10 to 15 minutes, which you can't afford in a system that runs as fast as the Drosophila embryo. In this system that's not the case, because the GFP that's fused to the stem-loop binding protein has been provided by the mother during oogenesis. When she made the embryo, just like with the histone yesterday, she put it in the embryo and it was ready to go. Now, whenever you have a stem loop that gets generated zygotically, after the MBT, after about two hours, those proteins are ready and they bind to the stem loops. That increases your signal-to-noise ratio at the stem loops, because now you concentrate those GFPs that are
floating around all at that stem-loop location. Is that clear?

Yes, sorry? That's a very good question. First of all, about the binding: the binding affinity is super high and it just stays stuck. It's almost like a covalent bond; you need a lot of kT to get rid of that stem-loop and protein combination. That has been measured; I mean, not by us but by collaborators.

Yes, that's a good question. Each polymerase has a cassette of 24 stem loops, but each stem loop can bind two GFPs, so that's 48 per polymerase. Then you can count how many polymerases you have on there, and you have to ask what your detection threshold is. You have a sea of GFP molecules floating around, and you need enough polymerases that the local concentration of GFP is high enough to stick out above that sea, so that you can actually see it. That's around three polymerases; so starting at something like a hundred to a hundred and fifty GFP molecules condensed here, I can see it. It also turns out that the binding efficiency is not a hundred percent: if you have a cassette of 24 loops, you only bind something like 17 plus or minus 2.

Yes? When the polymerase falls off, the whole mRNA that's elongating falls off the polymerase as well and then gets shoved out of the nucleus. I'm not sure I understand the question. This piece of mRNA is the natural piece that the fly wants to make, which will encode the future protein that gives you this green pattern.

No, no. This stem-loop DNA was developed in a bacterium, like GFP was developed in a jellyfish, and I just put it in the fly. In the same way, that couple, the stem-loop system plus the protein, was developed in bacteria; I went into the bacteria, cloned the sequences out, and put them in two different flies. In one fly I put the stem loops; in the other
fly I put the protein that will bind the stem loops. One I put in the mother, one I put in the father; I mix them together and they make this. Why did I do that? Because I want to measure the transcriptional activity: I want to know how many polymerases are on that particular gene at any given time during development. So now I can go in and ask what the intensity level of these spots is. What's on my y-axis here says arbitrary units right now, because I haven't told you yet about a different method that I need in order to turn arbitrary units into numbers of polymerases that are actively working away.

What is this? This is the GFP signal, right? Because the stem loop is there to bind a protein that I fused to a GFP. See, in black you have the stem loop, in yellow you have the protein that binds the stem loop, and I fused that to a GFP. That dimer of protein plus GFP is floating around in the cytoplasm, and as soon as a stem loop appears, one of those proteins is going to bind to it; actually two.

You still don't seem convinced by what the heck I'm doing here. What's lacking? Yes. Yes, and I trick that polymerase into making not just the normal mRNA but the normal mRNA with a cassette, with a pocket lamp kind of hanging out there. I trick it. That's what I'm saying: yesterday my slight modification to development was death; today my slight modification to development is that I put these weird stem loops in the system, which are unnatural. Yesterday I looked at the natural system, but a dead one. If you do physics of living systems and you do it on dead ones, that's kind of a contradiction, but nevertheless you get something out of it. Today we are looking at life, but it's not the actual life that you see in nature, because the fly couldn't care less about having a stem-loop system.

Yes, so all of them glow. You mean in the cytoplasm? They all glow, yes. But because I put, as we
computed just before, hundreds of stem loops on this thing (you have many polymerases that run in parallel, right?), there are now hundreds of GFP molecules all within the vicinity of a few nanometers. You see, it's a little green here, right? Let's call that background. All right, can I move on?

We talked a lot about technique. The scientific point I wanted to make is that transcription in development is very noisy: you see these bursts and so on, just what people see in yeast and in bacteria. So all is good. The only thing that isn't good: how do you now go from this noisy transcription to something as precise as what we talked about yesterday? For that we developed a different method, based on single-molecule detection of mRNA molecules, again in dead fly embryos. To see single molecules you need to kill the bugger, so that you have enough time to collect photons with your detector and get the resolution of individual molecules. How did we do that? We used something called in situ hybridization. In situ hybridization is where you have a little piece of DNA that will bind to a piece of mRNA in your embryo. We take a little piece of DNA that is 20 nucleotides long and put something on it that fluoresces if I shine light on it. Then we take many of these 20-mers that have been pre-attached to fluorophores and tile them along the RNA that we want to visualize. So now I have many 20-mers on my mRNA, and again that's the same trick as with the MS2 stem loops: by that means I put a lot of fluorophores on the same mRNA, and not just a lot but actually a known amount. The known amount is important because it means I can quantify. Sorry, this thing just advances at will. Anyway, if you do this and you do the same measurement that we just did before, you can look at individual nuclei.
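A known number of fluorophores per mRNA is what makes counting possible. As a minimal sketch, and purely as an illustrative assumption rather than the calibration actually used in this work, one could take the typical intensity of isolated single-mRNA spots in the cytoplasm as the unit and express a bright transcription site in those units. The function name `mrna_equivalents` and all numbers below are hypothetical.

```python
import statistics


def mrna_equivalents(site_intensity, single_molecule_intensities):
    """Express a transcription-site intensity in units of one mRNA.

    Assumption: every mRNA carries the same known number of probes, so
    the median intensity of isolated cytoplasmic spots can serve as the
    intensity of a single molecule.
    """
    unit = statistics.median(single_molecule_intensities)
    if unit <= 0:
        raise ValueError("single-molecule unit intensity must be positive")
    return site_intensity / unit


# Hypothetical intensities in arbitrary camera units, for illustration only.
cytoplasmic_spots = [95.0, 102.0, 100.0, 98.0, 105.0]
transcription_site = 1500.0
print(mrna_equivalents(transcription_site, cytoplasmic_spots))
```

The median is used here instead of the mean so that a few overlapping spots, which would look like unusually bright single molecules, do not bias the unit intensity.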
They have two transcription sites, which are very bright because there are many polymerases. Each polymerase has an unfinished piece of mRNA, and all of these unfinished pieces will be tiled by the 20-mers that carry a fluorophore. If you take the intensity in all of the nuclei and plot it as a function of egg length, you see that the intensities of these nascent transcription sites (that's what they're called, nascent, because it's just nascent RNA) are all over the place. Very noisy. And you see a very sharp transition when you go beyond the boundary: all of them stop at about 0.5 egg lengths. You can quantify that noise by looking at the standard deviation over the mean, and it's roughly 50 percent, which again is what people see in yeast or in bacteria. Nothing special about transcription in development.

Now how do you get rid of it? First of all, each nucleus doesn't have just one site. Here I show you the noise of an individual transcription site, but each nucleus has at least two, because flies are diploid: a site from the mother and a site from the father. But it turns out that the fly embryo is so fast that the DNA is replicated immediately for the next cell cycle, so you have four spots: two alleles, and each allele already has its sister chromatid ready for the next cell cycle. That means there are four spots making the same mRNA, and if you get your statistics right, the noise should drop by a factor of two if you average over the four sites in a nucleus. That's roughly what you see: one over the square root of four is one half. But I also told you yesterday that we are living in a world where nuclei share their cytoplasm with all the other nuclei; there are no cell walls, or, if you prefer, no cell boundaries. So mRNAs can freely diffuse, and there is naturally spatial averaging going on, because an
mRNA that's produced by this nucleus can go toward its neighboring nuclei. We have also measured that the mRNAs at that stage are very long-lived, about an hour. So there is both spatial and temporal averaging going on, and when you now measure cytoplasmic numbers of single mRNA molecules as a function of egg length, you see that the noise goes down to something on the order of 8 percent. Here I now have absolute numbers of mRNA, roughly 600, dropping to zero when you cross the boundary, and the noise here is on the order of 8 percent. Yesterday I told you that at the protein level you have something on the order of 10 percent, which we had measured earlier with antibody stainings. So for the step from here to here I don't need to explain anything, and that 10 percent in protein corresponds to 1 percent in space. So you don't need anything fancy for transcription here. All you really need is physical processes like spatial and temporal averaging, which help you reduce the noise from 50 percent all the way to the 10 percent that you need in development to be precise.

Yes? In the third column? In this one here? No, in this one here I just look at columns around nuclei and count single molecules. I look at the column around the nucleus, but not the nucleus itself, just the cytoplasm around the nucleus, and I count.

I just wanted to show you those two techniques because I'm going to need them, and to show you by the same token that in development transcription is noisy, but in the fly embryo there is a very simple mechanism that helps you reduce that noise.

Well, it sets a limit. Yes, it sets a minimum for how much spatial averaging you need. Exactly, precisely, yes: how many shells of neighboring nuclei do you need to diffuse across in order to get to that noise level? It turns out the number is one, which is very plausible, right? You're a nucleus, you make an mRNA, and, well, you don't keep it
all to yourself, right? It will naturally diffuse at least to the neighboring shell. We have shown that this is in fact the case, and we have also shown that purely temporal averaging is not enough; you need a little bit more, and that little bit more is spatial averaging.

All right. Yes? At least... what do you mean? There are at least two transcription sites, yes. Right, so each dot was a single nucleus, or rather a single transcription site. In fact the two intensities are completely decorrelated; there's not even a correlation within the nuclear environment. There is some extrinsic component of the noise that is correlated, yes. The fluctuations are not the same, but if you look at the nominal values, they are decorrelated within an individual nucleus. Yes, exactly. No, let's talk about this later. Well, certain noises are, others aren't, so it's a little more complicated. Some, yes: the transcription noise is larger, and you can actually see over time how it comes down. We use this to stage embryos as well, because the embryos we use for the hybridization protocol are very fussy and it's much harder to stage them in time. Maybe we can talk about this later.

Anyway, we now have these two nice methods to probe transcription in the embryo, and that leads you to recognize that the embryo lends itself perfectly as a laboratory to study transcription. Transcription is ubiquitous; it's the same in yeast and bacteria and flies and humans. But this system lends itself really nicely to probing things, because you have optical access and a setup that's extremely reproducible. If you work with single cells, it's very hard to get your culture the same way from day to day; in this case the mother does it for you. And you not only measure a single transcription site in a single nucleus, you have a kind of high-throughput imaging pipeline, because you can
measure, whatever, 500 or 6,000 nuclei at once. So it's a very powerful system to study transcription, and the setup is extremely reproducible, which is what you want for a quantitative study. Here on the left I show you again the live version, with the molecules floating around and binding to the transcription sites, and on the right I show you a stack from a fixed embryo, where you see the nuclei in blue and individual molecules of mRNA in yellow. At some point there should be very bright yellow inside the nuclei, which are the transcription sites. Here they are; of course they appear on only a few slices of your stack.

What we can do with this now is probe transcription at different levels. You can probe transcription at the level of the promoter and ask how noisy things are: transcription factors bind the promoter and then instruct the polymerase to bind the promoter and transcribe. This is how physicists and mathematicians have been thinking about transcription for ages: you have a promoter in a bacterial system, transcription factors bind there, and we need to understand what the output is given a certain input. But it turns out that in metazoans, so everything that is more than one cell, the places where transcription factors bind to instruct whether to make a gene and the promoter where the polymerase eventually binds are disjoint; I think I alluded to that yesterday. That means there is a sequence of DNA, 200 to 1,000 base pairs long, called an enhancer (Ken talked about these), where a cocktail of transcription factors can bind, which can be away from the promoter and somehow still instructs the promoter to accept polymerase or not. Since it can be a thousand base pairs long, a lot of information from different transcription factors can be integrated at a
single enhancer. That's in metazoans. But to increase the complexity even further, it turns out that in higher-order metazoans you have many enhancers that can instruct the same promoter, at the same time, in the same cell, to transcribe or not. So they somehow have to interact with each other, and we know nothing about any of this. We know a little about how things work at the promoter level, and there are some models that help us. We know even less when you look at an entire enhancer: in these enhancers you have many, many binding sites, and even if you have an enhancer with six binding sites for the same protein and you permute them around, you still don't understand what's going on. So there's a lot to learn. And if you go even further, to multiple enhancers, it becomes even trickier to understand how multiple enhancers talk to the promoter; the complexity is just much larger. But at each of these levels we can now go in and use reporter constructs, in both the live and the fixed setup, measure, and then test models.

I could give you an entire lecture about this today, but I won't, because we have started to bring yet another angle to transcription. We started three years ago and we haven't really published anything on it, but I'm very excited about it and I want to talk about it today. We have a few papers on these other things. We don't understand much; they are just observations, correlative things, the models aren't really tight, and so on, so it's still work in progress. The new stuff has to do with the fact that there are multiple enhancers. We recently learned from the ENCODE project that there are 10 to 40 enhancers in mammalian systems that can all interact with the same promoter, in the same cell, at the same time. It's gigantic, right? Or, at another extreme, you have
one enhancer that can interact with hundreds of promoters, like in your nose, when the different receptors are made for the different chemicals that you can smell. In both cases you have a huge amount of complexity just from the fact that you have one guy interacting with multiple others. But how does this happen? You have to spread these enhancers out all over your genome, and people have shown that you can have an enhancer that's a million base pairs away from a promoter and it's still able to instruct that promoter to do something or not. That brings us to linking transcriptional activity in a nucleus with the underlying polymer physics of the DNA, or of chromatin. So there's a whole new perspective that we have to take into account, because all of this DNA is crumpled somewhere in the nucleus, and somehow these enhancers nevertheless have to find their target.

Yes, they could be very close, and we have no idea, because it's very hard to measure: it's very small. Those nuclei are, whatever, 10 microns at most in size; in many systems they are three microns, and your imaging resolution is almost nothing; it's like a third of that, a sixth of that. So it's very hard to see.

Yes, this number here? I made that up. I mean, I didn't make it up: I used ENCODE. They tell you how many genes, they tell you how many enhancers; I divide one by the other and I get this. That's not rocket science. On average that's what it is, but we know that there are individual promoters that have at least 10 enhancers, and that we know from functional assays. That's in the mouse. Well, you can take them and test that. We don't know; we have no idea. We also don't know, for instance, whether all of these enhancers loop or crumple the DNA
such that you have your promoter and then there are 10 enhancers piling up around it. We have no idea. That takes us to the question of structure: how do you fold or pack the DNA in your nuclei? If you look at a bacterium and you explode it, the DNA comes out all over the place. You have something like two meters of DNA that you have to smush into one cubic micron, and it's seemingly ordered, which is quite astounding, right?

For the best information we have so far, there are two ways. One way is to take this FISH approach that I showed you, tag two different places in the genome with two different colors, and measure the distance. Of course this is in dead tissue, so you need to do it many times to get a mean and a standard deviation. If you do this, you see that the average spatial distance, in micrometers, as a function of the genomic distance in base pairs lies on a line. So there is at least some structure there that you can work with; it's not completely random. You could have had this stuff all over the place, right? Here you have one micron, and here you have your imaging resolution, and now you can try to see whether you can do anything with this in live imaging. Our goal is to look at this live, because we want to see the dynamics of how these enhancers interact with a single promoter. So we want to tag a bunch of individual enhancers with different colors, and we have our promoter tagged with this MS2 stem-loop system. Then we can see how these enhancers interact with the stem-loop system, and how the spatial interactions correlate with how much activity you see. That's the grand goal of what we're trying to achieve, and hopefully the data will lead us to models
that tell us something about the underlying transcription dynamics, but at the same time about the underlying dynamics of the movements of DNA, and hence the polymer physics and the structure of the DNA. This is all from fixed tissue, so each data point here is a mean, but with a huge standard deviation, because it comes from an ensemble. That's one way to get this kind of data.

At the other extreme, what people do is take nuclei and cross-link everything together. You take formaldehyde, I think it's formaldehyde, or whatever, some chemical that cross-links two pieces of DNA when they are nearby. If you do this with enough nuclei, there are now tricks by which you can see where these cross-links are. People then build a probability matrix for chromosomes, or even the entire genome, of the probability that two pieces of the genome are close by. That's a very rough method, right? It's based on statistics only, it's based on smushing cells, you don't have any dynamics, and most importantly you don't have any functional biological relevance, because if there is a cross-link you don't know whether it's meaningful for biology or not. It just gives you, somehow, the architecture of your genome. What we would like to bring to the table here is the dynamics, so that you can actually see whether interactions are meaningful or not, how multiple enhancers interact with the promoter, and how you then link that to the activity of the promoter. That is the premise, kind of. Okay? Is this too much? As long as I don't show any GFP, you don't ask me questions.

All right. As I said, the optical limit with the best microscopy you can find nowadays is roughly down here, at something like 220 nanometers, which is already quite good. The hope is to reduce this. So we built
such a thing, and with it you notice in your movies that you now see the four spots, and you see the chromosomes much more nicely. Here you see the sister chromatids, which we couldn't resolve previously. So before, when I told you 'here's a transcription trace', I actually lied to you, because this was not one transcription spot but two, and that already makes it more difficult to model this stuff. Now we can actually resolve individual transcription sites, because we can separate the sister chromatids. So that's at least a step in the right direction with the microscopy. Then of course you can combine this with some super-resolution techniques, which I'm not going to go into too much, and hopefully get to something on the order of 40 to 50 nanometers.

Now it turns out that endogenous enhancer-promoter distances in the fly, the system that I like to work with, are around here. In kb, what is it, 20 to 30 kb or so; that's the longest you can find in the fly. But because I want to do this in the fly, we needed to resort to something else. To test whether this entire approach has any legs, and to test how our microscopy works, we designed a synthetic system that lets us make progress and maybe already understand something about the polymer physics, even if the distances are slightly longer. The system I want to introduce you to now is at roughly 142 kb, and I use a trick to make this work. All right, so far so good?

This is now the beginning of the talk; all the other stuff was introduction. What we're doing is placing ourselves at the locus, in the DNA, of this gene called eve, which nicely makes seven stripes. We saw it yesterday; we use the same thing over and over, just for different things. So here's eve, and here are the enhancers of eve, and it turns out they are very close. These
enhancers, all of the enhancers of eve, are within 20 kb: hopeless for my imaging. By the way, everything I'm showing you today is with a regular microscope. It's confocal microscopes; you can all do this at home. You don't need to build a fancy lattice light sheet or anything. The guy who developed this is a developmental biologist, and he used confocal. Okay. There's going to be an upgrade when we have better microscopy, but I'm just going to show you all the stuff that you can already do with confocal.

So resolving 20 kb is hopeless with a confocal, and so we need to use a trick. The trick is that at this locus, just downstream of the enhancers, there's a little DNA element that functions as a tether. It has the property that when you take it and throw it somewhere else in the genome, that somewhere else in the genome will bind back to its counterpart. Okay. And so that's, whatever, like 20 base pairs or something. It's called an insulator, or a boundary element; it doesn't really matter. For our purpose here it's a tether. It has the biological function, at least in this system, of blocking any enhancers that are downstream of it from access to the promoter; that's why it's called a boundary element or an insulator. All right, for our purpose, for now, it's just glue. I have a piece of DNA that's at the endogenous locus of eve, I have a piece of DNA that I throw somewhere else in the genome, and they want to glue to each other. They like each other.

And so... yeah? How these enhancers were discovered? Yeah, a long time ago, Levine and company.

All right, so we put a copy of this 142 kb downstream of its counterpart and fused a little promoter to it, but no enhancers. Actually, this promoter here, I wrote lacZ, but it's the same promoter as the eve promoter, and lacZ is just a little gene that's encoded here, because I didn't want to have this be a fly gene that interferes with
anything that I want to do. So this promoter here is the same promoter as this guy here, but it has no enhancers. Okay. And so if you do FISH again, you can see sporadic activity of this promoter, but only within the pattern of eve, which means that the eve enhancers are able to recruit that promoter and transcribe it. It also means that this is sporadic, because it's not in all of them. So if you think of looping back and forth, this can give you a stable and an unstable state, and only in the few stochastic events that are stable do you actually see expression. So that tells us that we're kind of on the right track: we have a system that has some stochastic events over a distance of 142 kb, and that can still be controlled by enhancers to be transcribed.

By the way, if I replace this element here with, let's say, lambda DNA, which is DNA from a phage that lives in bacteria (it has nothing to do with the fly, so it's some really weird DNA for the fly), you don't see any of those spots. So you need this tethering element to bind here, to bring this promoter close to these enhancers so that they can act. And so the model then is that some looping has to happen.

Now, of course, this is fixed, and we want a live version of this. How do you do this? Well, you use this MS2 stem-loop system and you use CRISPR. Have you heard about this, the genome editing? Anybody want me to go into genome editing? So there's a way to basically go into the natural, endogenous fly and stick something into that fly's genome, and that's what we did with the stem-loop cassette at the endogenous locus of eve. And so what you see here is a live version of how eve comes on. That's as live as it can get, because you need to tag it somehow. You see the seven stripes here, and then the thing just gastrulates, and eve still continues to be on. It's the same method that I talked about before. So that takes care of one locus, and now you need to tag the
other locus. Well, it turns out there's an orthogonal system that's not based on MS2 stem loops but on so-called PP7 stem loops, and you can fuse them next to the lacZ gene. And now you see again that there's sporadic activity of lacZ within the eve stripes. Okay, so we have now recapitulated the fixed system live. But that doesn't help us yet, because we can't see anything. All we see is, okay, sometimes it's red and sometimes it isn't. Well, you could compute probabilities of how often that happens, and we do that later.

But the first thing we would like to see, and that's a question that has been open since the 80s, when enhancers were first discovered, is whether physical proximity between an enhancer and a promoter is actually necessary in order for transcription events to happen. That's actually the sole question, really, that I want to convince you today that we nailed. Is physical proximity between an enhancer and a promoter, and the maintenance of that physical proximity for that matter, necessary in order for transcription to happen? Or is it sporadic enough that you don't need that?

But for this I now need a third tag that allows me to see this locus here when it's not active, because I only see the locus when it's active, and I would like to see it when it's searching, looping, and finally finding and turning on. And so for this we use a third system that we stole from bacteria. It's based on a sequence called parS that binds a protein called ParB. That ParB protein we can fuse to GFP, and label naked DNA, if you want. So that's a sequence of, whatever, 500 base pairs or so; it has a bunch of binding sites for these ParB proteins. We put this right next to the PP7 loops, and now you should see this, of course, in all of the nuclei, because it has nothing to do with eve, right? So you see in all nuclei a green spot, you see in the striped nuclei a blue spot, and
you see lacZ activity sporadically in a few. Okay. So this thing here, to make that fly took us three years. It's not easy. You need to figure out three different colors. If you look, there's very little live imaging with three colors, and that is because, if you want to have quantitative specificity of your three colors, it's very hard to pull them far enough apart, to get it all genetically into the same system, et cetera.

All right. Pardon? The parS? Yeah, good point. So that goes back to what I said earlier. All of the fluorescent proteins, so this blue fluorescent protein, this red fluorescent protein, and there's a green fluorescent protein, all of these are ready to go in the mother. I load the mother up: during oogenesis, during those two and a half days, the mother puts in all of these proteins, folds them, matures them. So when the embryo gets out of the pipeline, off the production line, and gets fertilized, boom, those are ready to go. And as soon as you have stem loops, or a piece of DNA that this protein can attach to, you'll see foci. Okay, good point. And we need this; otherwise we would have these delays, right?

All right, so now if you zoom in and look at a few nuclei, you see there's one here expressing and two that are not. And if you take the distance between the blue and the green fluorescent protein, you see that the one that is expressing is almost always lower than the two that are not expressing. So they're definitely closer together. This is now a live trace. You can do some statistics: if you average over the traces, you see it goes down by roughly a factor of two, depending on whether you are expressing or not. Now, this is for three nuclei, but as I said, in a single embryo you have many nuclei in parallel; you can do high-throughput imaging, if you want. And if you now average over 66 loci, you see that the blue-green distance, if you have no red, if this is not on, is roughly at 750
nanometers, and if you are on, you're down to like 340 nanometers. And so that's kind of a live version, and live proof, that in order for transcription to happen you kind of need that proximity.

Why should it be what? Yeah, a good point. So that was my next point. Any other question before I tell you why it's half and not zero?

So this is a confocal microscope, right, and so I'm limited in what I can actually image. As a control experiment, what we did was to co-localize, in a different fly, the green and the blue spots at a given site. I'll show you on the next slide how it is done. And so you know that physically your spots are glued together, right on top of each other. You know that. And now you measure, and you measure a distance of 200 nanometers. So that means that that's the best I can do. My microscopy doesn't allow me to do any better than that.

And so here's the control; I just want to show you how to do this. Here I'm looking at just one dimension. With this 3D imaging you have x, y and z. Are you guys interested in a little imaging, how you control that? Okay. So you have x, y and z, and here I just show you the x axis, and I show you the x that I measure in the blue channel minus the x in the green channel. And you see it's fluctuating. These here are individual pixels, so it's very noisy, but it should be noisy, because that's the search, right? Things are 142 kb apart; they should have quite a few pixels to search, otherwise I don't measure anything. And what you see is that there's a slight decay here, and there's a fit. That white line really should be zero, because on average that distance should be zero; if there's a random walk, that's what we should have. The reason why it's not zero is that there's chromatic aberration.
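As a rough sketch of the kind of correction this fit implies: assuming the blue-minus-green offset is linear in the x position in the image, one can fit a straight line to the measured offsets and subtract it. The function names, field-of-view size, slope, and noise level below are illustrative assumptions with synthetic numbers, not values or code from the talk.

```python
import numpy as np

def fit_chromatic_offset(x_pos, dx):
    """Least-squares straight line through dx (blue x minus green x) vs. x position."""
    slope, intercept = np.polyfit(x_pos, dx, deg=1)
    return slope, intercept

def correct_offset(x_pos, dx, slope, intercept):
    """Subtract the fitted position-dependent offset from the measured offsets."""
    return dx - (slope * x_pos + intercept)

# Synthetic data only (hypothetical numbers, not measurements):
rng = np.random.default_rng(0)
x_pos = rng.uniform(0.0, 50_000.0, size=2000)     # nm across an assumed field of view
true_slope, true_intercept = 4e-3, -100.0          # assumed aberration, nm per nm and nm
noise = rng.normal(0.0, 200.0, size=x_pos.size)    # assumed scatter from motion, in nm
dx = true_slope * x_pos + true_intercept + noise

slope, intercept = fit_chromatic_offset(x_pos, dx)
residual = correct_offset(x_pos, dx, slope, intercept)
print(f"fitted slope = {slope:.2e}, mean corrected offset = {residual.mean():.2e} nm")
```

After the subtraction the mean offset sits at zero by construction, and the remaining scatter is what would be read as measurement error plus real motion.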
The blue and the green, through your objective, shoot at slightly different angles, and so you have to correct for that, especially if you're after nanometer resolution. The correction is a function of the x position in my image: there's somewhere here where indeed we have zero, but elsewhere we overcompensate or undercompensate. And so you then have to go back in and correct your data for that chromatic aberration. But this is, of course, for each embryo, each time, and so I need to do something so that I don't always have to do that control measurement.

The first thing we did is we generated a fly, which I alluded to earlier, where we can see all three colors combined in one spot. Here we basically take an alternation of these MS2 and PP7 stem loops and give them two different colors. So I have 24 stem loops of PP7 and 24 stem loops of MS2, and I don't put them next to each other, I interlace them. And I put a parS next to it as well, so that we have the third color. All right. And so now I know that all three colors are localized in one spot, and now the data looks like that. You still see that same slope (that was taken on the same day, under the same conditions), but now it's of course much, much tighter, because all we're looking at now is measurement noise.

And so then you ask: well, it's 200 nanometers, can't you do better? Can't you just sit there, make your things brighter, and reduce those 200? And why 200 nanometers, where are they coming from? Well, for this you then do the third experiment, and now you take TetraSpeck beads, beads that have three colors. Beads don't move, right? Beads are fixed; they're just sitting there in some resin. And so the difference between what we see here with the beads and what we see here comes from movement, because my confocal is slow. It takes stacks from time point to time point, and while it's taking that stack, or even while it's taking a
single image, those spots are moving around. This is like a random walk; it's kT movement, right? It's fast, much faster than the multiple-second resolution that my microscopy has. So this is really the imaging resolution that we have, the imaging power that we have with the kind of setup we have. It depends on the power, it depends on the quantum yield of your fluorophore, et cetera. This one here tells us how much movement we have. But note that all three slopes are the same, and so I can use the internal slope of my measurement each time to correct for chromatic aberration, and call this here my measurement error. And that corresponds to the offset from zero that you saw on the previous slide. Is that okay, or too much? You want more? I'll move on.

So this was for one embryo. Now of course I can do this with many embryos, and so here each box is kind of an individual embryo. What I'm showing you on the y-axis is the root mean square distance. So I take my trace: my enhancer and my promoter are jiggling around over time, and they have a spatial separation in 3D. I take that spatial separation, square it, average over time, and take the root, and then you get something that's measured in nanometers. And you see it's kind of bimodal, and it turns out that all the times that we see activity of the reporter, they are in the lower mode of your bimodal distribution. Well, there's one outlier, but it's good to have one outlier, right?

You're right, yes, very good observation. Well, the rest of my talk is about those guys. Not exactly, but almost.

All right. And so, because we have so much data (we have 2,500 loci here), you can actually extract a histogram, and it's a bimodal distribution. You see a big bump that corresponds to the search space of your two tethering elements; most of the time that's what's going on. And you see a small bump that corresponds to
when these two guys are tethered. And you see an even smaller bump for when, in the tethered case, you get activity. So it seems like there's a two-step process. At first these two tethering elements find each other, and once they have found each other there's still a search for the promoter, which is now somehow hanging out here, right? It's fused right next to the tethering element, it's hanging out here, and it has to find the enhancers, because this is, as I said, 20 kb. And so that's what you see here. I told you before that optically we have absolutely no access to those 22 kb; it's hopeless. But now, with enough statistics, I'll show you that I do have access to actually understand something even about the endogenous eve locus, and that's in fact what the rest of the talk is about.

Yes, absolutely, I'll get back to that a little bit in the end. Yes, there are right now no dynamics; this is the average of the trace in the 30-minute time window that I observe.

How many? Two. On each second chromosome, of which you have two, there are two DTEs. But to be honest with you, your question really was whether there is competition between different DTEs, and the answer is yes, but we don't have access to it.

Pardon? Yes, there are others. They are not the same DTE, but there's another one, actually, on this side here, which is continuously bound to that guy. So the entire eve locus, those 20 kb, is already a big loop, but we kind of gloss over that for now, because we haven't taken that into account so far. It does not seem to interfere with the possibility of this guy talking here.

Yeah, the peak that I gave you was the peak of this guy, and you see there's actually a little difference, right, between this peak and that peak. It might be that in this case here we just didn't use the chromatic aberration correction; that is the raw data. But it's a good observation. I'll ask someone who knows. The answer I just gave you was just,
maybe that's the answer. All right, good observation. Here it's the 300, but I'll show you in future slides that 340 is the one. But it is true.

So anyways, you see there's a certain percentage of transcription events, and of all the tethered guys you have two-thirds that are actually active. Now, we did a few controls. This is the same data again. If you replace this guy with lambda DNA, you don't get the second bump. And if you replace this guy with the DNA tethering element, but inverted, then you get the second bump but you don't get any red. Because this guy is inverted, and there's an orientation in how these two guys like to bind, when it lands here the lacZ promoter is going to be on this side, and so it has no access to the enhancers. And that's kind of why we know that the element that's here, which likes to bind to this guy, doesn't really interfere with our system, or with what we want to do with our system.

All right. So again, here I show you all transcription sites that I see in my embryo. But the power of the fly embryo is that I have an imaging system where I can identify individual stripes, and I can now split this histogram up into individual stripes. If you look at individual stripes, you see again the same kind of bimodal histogram. Here, I think, in yellow you see stripe five, in red you see stripe four, and in blue you see stripe three, and you see that the fraction of active, lacZ-expressing nuclei drops down. It's highest in five and lowest in three. That is essentially the difference in what we saw in your little bump here, right? There is a part of the bump that is not active, and that goes down. And so what we saw on the previous slide was the average of all of this, and now I've kind of resolved
this into individual stripes. The reason why this is the case is that the stripes are actually ordered along the eve locus such that stripe five is closest to the tethered lacZ promoter, then come stripes four and six, and then come stripes three and seven. Stripe two we don't measure, because it's too far out here; that's for another experiment.

All right, and so we can be a little bit more quantitative about that and ask what the distance between the enhancer and the promoter is for the different stripes, because we can measure the distances, right? We have a measurement power of 200 nanometers, and these distances are different. And so here, Alessandro, I have your answer, if you continue to pay attention. Basically, the 340 that I mentioned before is just stripe five. Stripes four and six are at 380, and stripes three and seven are at 400, and so what I showed you before was the mixture of all of these. Maybe there's still a discrepancy, and that might be chromatic aberration, but this is certainly part of the answer. If you resolve individual stripes you see a difference, and you see that these are further away than these guys.

All right. This was the case when red is off. If you measure that same distance when red is on, you see that now they collapse down; they all come closer to each other. Actually, these guys come all the way down to the same level as stripe five, so there's a loop that will loop out between here. These guys don't make it all the way; they come down a little bit, and we really don't understand why. Right now the hand-waving answer is that these guys are upstream and those guys are downstream of the promoter, and maybe there's something else about the topology that we don't understand and that we don't have access to. But nevertheless, in each case you have a tightening of the locus in cases
where you are actually actively transcribing. Again, more power to the fact that you need the damn enhancer to be close to, or in connectivity with, the damn promoter in order to have damn polymerases giving you transcription and output. Okay. And here again is the control: when you invert things, then things are of course not on, but you get more or less the same numbers. You actually get even a little bit less here, and this is because now this is on the other side, and we even have a little bit of access to that. But the error bars are large, and so I don't want to go too much into this.

And finally, if you plot again this engagement probability that I had on the previous slide, now as a function of enhancer-promoter distance in the case that the red is off, so these distances here, you see that you have a nice decaying curve. Note that it's a linearly decaying curve. This is, in a way, the contact probability given as a function of distance, and it's decaying. In the last slide, if I get there, I will try to link this to polymer physics, and that's really in its infancy. Questions?

All right, so the next thing you can do: you also have access to the endogenous eve intensity, right, in all of these stripes. That's how we know where to look; we have the eve activity. And so now you can ask: what's going on with eve activity if that same enhancer in that stripe, whatever, five, also wants to activate lacZ? Suddenly you have two promoters that are hanging out and want access to that same enhancer. You can ask whether, and how, that affects the endogenous activity of eve. And that's what we see here. What we're doing here is looking at the activity of endogenous eve, the activity of the blue channel, in nuclei where there's no red activity, and comparing that to the eve activity in nuclei where you have red activity. And you see that there is, for stripe five, a 25% reduction, for stripe
four and six a 17% reduction, and for stripe three and seven, whatever, a 10% reduction. So there's some competition going on: this enhancer suddenly needs to interact with two promoters and can't really cope with it. It doesn't have the power, the potency, to be able to do it as well as if the lacZ weren't hanging out there. So that's a really nice quantitative measurement, right? I measured a 25% reduction in activity at hour 2.5 of the embryo's development.

It turns out that these eve stripes, as I said yesterday, are the blueprint, if you want, for structures in the adult fly; we can see a trace of them. So the question is whether, when you reduce the activity of one eve stripe by, let's say, 25%, you can see a phenotype in the adult. That would be quite astounding, right? You've reduced the thing by 25% at hour 2.5 of the embryo's development. Well, I wouldn't be making such a big story of it if we didn't see it. If you look very closely: in wild type, these are the segments of the fly body, and you see that in this case here you're missing abdominal segment 4, and in this case you're missing abdominal segment 6, and they correspond to stripe 5 and stripe 7 being, not missing, but reduced. So this is quite astounding, right? You have a little bit less activity very early on, and you see a phenotypic effect of that quantitative reduction in the number of proteins that you have in the early embryo.

Yeah, that's what we have done. This is not done in the stem-loop system, et cetera; this is done in an eve heterozygote without the stem-loop system. So this is not there and this is not there. All we have is a promoter here, the eve promoter, with this element, and we CRISPRed this into this site.

Okay, so coming back to your earlier point: in all of this we glossed over time. You can now of course also look at the time traces of this lacZ, and here are 219 of
those time traces. The intensity should tell you how many polymerases you currently have working away. This here is roughly a 60-minute window, but you see that not all spots are on all the time, right? And so you can now order these spots by the time at which they turn on or off, and you see that these spots here all turn on and these spots here all turn off. What you can do now is align those spots at the time when they turn on and call that zero. All of the spots you align at the time they turn on, which you see in the red channel, and then you can get the transcription dynamics of a spot going on. So here you see a spot going on, and here you see a spot going off. See, this is much faster. This corresponds to spots that are coming out of mitosis; in hunchback or in eve we see the same kind of dynamics. And this one is slightly slower, but remember, polymerases are also running off, right? So even when the activity goes off, when the signal comes to not transcribe anymore, there are still polymerases that need to run off.

But what you can do now, because you have aligned your traces at some time t zero (which wasn't arbitrary; it was the time when the thing turns on or off), is look in the other channels and ask about the distance. And there you see that as long as the thing is off, the distance is large. It gets closer and closer, and by the time it gets to 340 nanometers, Alessandro, they're turning on. And the same goes for turning off. In turning off, it seems to start already four minutes earlier, and you wonder what those four minutes are. But you know how long your gene is, and we previously measured what the elongation rate of the polymerase is, and you divide one by the other and you get something like 3.5 minutes. And so the fact that we see this a little bit earlier, we understand quantitatively why that happens. Okay. And so what this demonstrates is that not
only do you need enhancer-promoter proximity for transcription events to happen, you need stable enhancer-promoter proximity, because as soon as the thing goes away, transcription halts, at least in this locus. And so models where you have an enhancer put an epigenetic mark there and then go away, but because the mark is there you continue to transcribe, are kind of refuted by those measurements. But again, this is in a special case, the eve locus, and for now we don't know any more. Okay. So what do I have, five more minutes?

So this distance here was completely arbitrary, right? Why 142 kb? Well, because the thing happened to land there. But nothing precludes us from inserting this cassette anywhere else on the second chromosome, or even in the genome for that matter. And so we have a bunch of other lines, and we can now probe all of these numbers that I just threw at you as a function of distance from the eve locus. And that's really what you want if you want to do polymer physics: you want to know how things change as a function of distance, et cetera, and that will then give you some understanding of the underlying DNA.

So on the next slide I'm going to show you just three movies, for three insertion sites. One is the one that we already saw. There's one that is much closer; note that there are now more spots. And there's one that is much further away; note that in this movie I think you don't see a single spot. And so you can now ask what the number of spots is as a function of distance, or what the probability of seeing a spot is as a function of distance. Here you see, in a log-log plot, the looping probability, which is essentially just the number of red spots over the total number that we see, as a function of genomic distance, and you see a nice linear decay. You can go a step further and compute again the root mean square displacement, or actually the mean square distance, and plot the mean square distance as a function of genomic distance, and so here
again you see, somehow, the spots lying on a line. This is for the case when they are not red; this is for the case when they are red. So these blue guys here are the distance between eve MS2 (blue) and parS (green) in the case when they are not red, and this is the case when they are red. But of course we know what the distance of this guy here is, right? We know that these guys here are still 18.5 kb apart from each other, and so we get another point for free, at 18.5 kb, which is the average of these guys, and you see that it very nicely falls on a straight line. Also note that this is the only data set where I don't put error bars, because I don't believe in a single one of those data points. This is just the first round of experiments. It goes in the right direction, but we need more in order to really do some physics interpretation. And so the last word hasn't been spoken on this.

So what do you do with this now? Well, you go and learn a bit about polymer physics, and I haven't learned so much about it yet. There's a very nice book by de Gennes, which I'm reading. You can view the DNA essentially as an elastic polymer that can be modeled by a random walk. So let's say you tag two pieces of DNA: there are two spots, and these two spots do a random walk. It's called the worm-like chain model, and within that model you can explain many, many things about polymers in nature. Mostly it has been applied to inert polymers, but it has also been applied to DNA in vitro, and here we are now, for the first time, in a position to be able to apply it to DNA in vivo. What you have to consider there is that if you want to ask about the looping probability, you essentially need to look at the free energy function, and what prevents you from looping is a tug of war between your bending energy, your elastic energy, and your entropy. And so with those considerations, within the worm-like
chain model you can show, and I'm not going to go through this in detail right now, that first of all a straight line makes sense. But then you can go further and ask what the slope of that straight line is, and there you now have different variants of your model, and you can try to understand something about the compaction of DNA given those variants of the model. And so that's what we are after, but again, I'm not going to go much deeper into this, because there are no error bars on my data points.

You can do something similar for the mean square distance, where you can show that, within the framework of those models, that distance is linearly proportional to the number of base pairs that separate the two spots, and it's also proportional to the persistence length of your DNA, and it's proportional to the packing fraction. And so from the slope of this guy here you then get access to both the persistence length and the packing fraction of the relevant pieces of DNA that you want to look at.

We know the persistence length of naked DNA in vitro; people can measure this. In fact, here it is. Here's a bacterium that has exploded, and the DNA is all over the place, and you zoom in and you see there are little arrows here, and that arrow corresponds to roughly 50 nanometers, which is the persistence length of naked DNA in vitro. However, in our case what we are looking at is not naked DNA; we are looking at highly ordered DNA. First of all, there are nucleosomes, so there are little pieces of histone around which the DNA is wrapped twice. You see those here: this is an EM picture where you see those little balls, the nucleosomes. That certainly changes the rigidity of your DNA polymer. And moreover, there are models where even this might be compacted further, into something of the order of 30 nanometers or so. And so all of these models can then be
tested with this data, by taking appropriate values for the persistence length and the packing fraction, or by actually extracting the persistence length and the packing fraction.

All right, okay, I'm done. So we are now here. We have done this, and we are now at the point where we are trying to really understand quantitatively the underlying dynamics of DNA and how that affects transcription. But eventually we are of course interested in endogenous enhancer-promoter interactions. So we really want to take not this synthetic construct, where we have a DNA tethering element as a helper in order to increase our statistics; we want to see this without those tethering elements, and that's what's currently going on. All right, thank you very much.
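A straight line on a log-log plot, as described for the looping probability versus genomic distance, is a power law, so its slope is an exponent that can be compared between polymer models. The sketch below shows one simple way to read off that exponent, assuming a form P = A * s^(-alpha). The helper name, the exponent, the prefactor, and two of the three distances are hypothetical placeholders (only 142 kb appears in the talk), and the probabilities are generated from the assumed law, not measured.

```python
import numpy as np

def fit_power_law(genomic_distance_kb, probability):
    """Fit P = A * s**(-alpha) with a straight line in log-log space.

    Returns (alpha, A); the slope of the log-log line equals -alpha.
    """
    log_s = np.log10(np.asarray(genomic_distance_kb, dtype=float))
    log_p = np.log10(np.asarray(probability, dtype=float))
    slope, intercept = np.polyfit(log_s, log_p, deg=1)
    return -slope, 10.0 ** intercept

# Placeholder numbers only, NOT measurements from the talk.
distances_kb = np.array([50.0, 142.0, 500.0])    # hypothetical insertion distances
assumed_alpha, assumed_A = 1.5, 100.0             # made-up exponent and prefactor
looping_prob = assumed_A * distances_kb ** (-assumed_alpha)

alpha, A = fit_power_law(distances_kb, looping_prob)
print(f"fitted exponent alpha = {alpha:.2f}")
```

With real data one would want error bars on each point before interpreting the exponent, which is the same caveat raised in the talk for the distance plot.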