I'm talking about how to do these homework problems, since we weren't able to do the tutorial yesterday. OK, so I'm going to talk about randomness in cell division and cell death, one of the clear examples of randomness in biology. And before we do that, just a plug for all you guys. So here's two smart people, Freeman Dyson and Paul Nurse. Freeman Dyson, one of the people responsible for QED; Paul Nurse, Nobel Prize winner for his study of how the cell cycle, cell growth, and division work. And I've got these two quotes, one more recent and one less recent, from both of them. So Dyson says: biology is moving very fast. The same kinds of people who became physicists in those days, and he's talking about the '50s, now tend to become biologists. A lot of people who are really computer scientists are doing biology. The line between math and biology is not so sharp anymore. I would say that's where the young people are. They're having the same good times now that we had 50 years ago. And this is actually a call to arms. It says: people with your backgrounds and interests and skills should really jump into the study of living systems, because it contains a huge number of open and very, very interesting and deep problems. And if you don't want to take the word of a theoretical physicist, then you can take the word of an experimental biologist. So Paul Nurse says: the idea is to look for ways that can transform molecular interactions, biochemical activities, and biophysical mechanisms, all that physical stuff, into logical and informational structures and processes. So he says, this will lead to the understanding of the cell as a logical and computational machine. And it will shift biology away from the common sense and familiar world it has generally occupied in the past to one that is more abstract. And I think this is a very important quote.
He's really saying that the way to understand a cell is to understand it from the point of view of structures and information and a certain degree of abstraction. Now, of course, it's not an easy thing to do. You don't want to do a spherical-cow type of cell. So the idea is to figure out how we can bring together the tools from chemistry, physics, mathematics, and so on to understand how cells work. And the way I like to focus my mind on this topic is to ask: what would life look like on other planets? So you are living in an amazing time, where we now have a catalog of confirmed planets that are orbiting other stars; in fact, solar systems around other stars, with multiple planets. It used to be that every time a planet was discovered, there would be a newspaper article. Now it happens so often that we have thousands. So what would life look like on other planets? Of course, we don't know. But it's nice to think about it. And in particular, since we have one example of what life looks like, how much of what life looks like here should we extrapolate to what it looks like on other planets? Now, of course, it doesn't have to be that life on other planets is made of carbon and has the same length scales and time scales that we do, right? But there must be some common features. So I'm just putting out there a few things. So of course, as Pauling said, physics and chemistry is how stuff works. And it'll be the same on other planets. But somehow, the physics and chemistry itself is not life. Life emerges from some collection of interactions. So one of the main things that we might guess about life on other planets is the idea that there's some kind of process of evolution that resembles the features of natural selection on Earth. The reason is that we have no other candidate kind of process that leads to this increase in complexity and internal structure, and to organisms that are able to interact in this active manner with their environment.
So evolution through something like natural selection, we might guess, is operating on some other substrate through some other process, right? But generally those features of variation, selection, and modification should be there anywhere else. Now, this piece is interesting. So life on Earth, it didn't have to be this way, but life on Earth is built around the idea of information. So a cell is actually an object that's constructed from a recipe that's written on a molecule called DNA. So the idea of separating the object from the information is something which maybe is a nice way to organize things. And so you might want to think about what the benefits of that are, and what the costs are. So a mountain is not based on information. The ocean, rivers, natural objects that are not living don't separate their information from their basic structure. But somehow life does that. You might want to think about it. The third thing is more controversial. So all life on Earth is made of cells. This wasn't obvious until a couple of hundred years ago. Until the advent of good microscopes, we didn't realize, first of all, that there were forms of life that are unicellular, which we can't see; and that even multicellular life, elephants, dolphins, humans, trees, is made up of the same type of cells. Now, will life on other planets be made up of cells? Not necessarily. But what is a cell? A cell is a way to separate your private DNA and the private goodies that you have, your private goods, from the external environment. So what a cell is is a membrane and everything inside the membrane. And this is very important, because evolution would not really work unless you were benefiting from your own innovations. So the privatization of innovation, and the benefits that derive from it, is the reason why evolution gets amplified. If evolution were happening acellularly, where some cool innovation happened in one part of some prebiotic pool, everybody else would share the benefits.
And it wouldn't spread as rapidly; counterintuitively, by this kind of cheating, it gets suppressed. So these are the kinds of questions that a lot of people think about, abstractly and not so abstractly. Nowadays, there are actually experiments where people start to look at things like cooperation, growth of complexity, and so on. So a lot of cool stuff is going on. So one lesson from all this, which was a very surprising one, one of the major discoveries of the 20th century, is that life is digital. And this is also surprising because, in a sense, in the 20th century there were two independent threads. One was from the areas of computer science and communication, driven by radio communications, the war effort, decoding: people invented information theory, computation, and so on. Nothing to do with life. They had to do it in order to transmit and receive information and decode things. And that's where the idea of a digital bit and so on was born. But somehow, evolution and life also operate with the same principles. So let me explain. You all know these people: Charles Darwin, Gregor Mendel. Mendel, by the way, was trained as a physicist. And he was hired because he was good at counting. So he was actually doing these experiments on peas. But those experiments were running in that monastery before he got there. So numbers are very important. So Darwin had this idea that evolution works by parents transmitting some amount of heredity, where the offspring look like the parents, but not exactly. And somehow the offspring that are best suited for the environment get selected. That's Darwin's theory of natural selection. Darwin speculated, but he didn't hinge his theory on the mechanism of heredity. He didn't actually worry about it molecularly; even that phrase didn't exist then. He didn't worry about how, physically, the parents' traits are passed to the offspring. Now, Darwin and Mendel were contemporaries. And Mendel actually had a copy of the Origin of Species.
Mendel had a completely different point of view. He was just looking at simple traits: color, size, and so on. And he discovered that there are statistical laws for how these traits get passed from parents to offspring. These are the famous Mendel's laws. And his inference from this, and it's a bit like doing statistical physics, was that the passage of traits from parents to offspring obeys these laws as if all the information to make an organism were in small, discrete particles: certainly discrete, and small because they're transmitted in the seed. Now, what are these discrete particles? He didn't speculate; of course he didn't know. And that's when this other parallel thread comes in, in the 20th century. This is Turing, this is Shannon: the idea of a computer and the idea of a bit of information. While all this was going on, in parallel it turns out that DNA is in fact a manifestation, an implementation, of a sort of digital information storage medium. In fact, today there are companies that will sell you their services to back up your data on DNA. This is actually true, and you have to pay them money for it. And the claim is that DNA will last longer than, let's say, magnetic media. These are very interesting times. Now, all of you know about DNA; anybody here not know about the structure of DNA and how it works? So one of the most important things is that DNA is this stretch of nucleotide bases, A, T, G, and C. And the actual letters on the DNA correspond to some sort of recipe to make the organism that the DNA is encoding. So just a little game here. Since we have from Shannon the unit of information, we can ask how much information it takes to encode organisms. So a bit is a yes or no; it's a one or a zero. But the unit of information these days is a byte. And a byte is basically one letter. That's why we use it. It's eight bits, right? But to order of magnitude, a bit is a tenth of a byte.
A byte is a unit, one letter. A word is about 10 letters. Order of magnitude, a page is about 1,000 letters. A gene takes less than a page. And it's very easy to see, because you can write the sequence of a gene on a page. It's even more compressed than that, because the gene contains only four letters, whereas the English alphabet contains 26 letters and change. Digital photographs take more space. Music takes much more space. Now, it turns out that a bacterium is not so big. It's 1 MB. And you know that because that's the size of its genome. A CD, I don't know if people know what CDs are, but there were these things that we used to play songs on, and they were designed to be just big enough to encode one album, or in fact one piece of music. The person who decided the size of the CD had one symphony, I think, that he wanted to fit on the whole CD, and that's why he picked the diameter of the thing. Anyway, and of course a DVD is much bigger. A CD can store an album; a DVD can store a movie. But a human, an entire human genome, can be stored in about that size. Sort of: not with all the raw data and the raw reads when you get your genome sequenced, that's much bigger. But you know, as compressed information. So the way I like to think about it is that it takes more information to encode an Arnold Schwarzenegger movie than to encode Arnold Schwarzenegger. So I mean, you might want to think about why this is the case. So somehow, biological information is operating in a highly compressed and sort of powerful manifestation of information. But look, DNA is nothing. It is a passive molecule. If you put it in a buffer like this, you won't see anything. That's how you transport DNA: in a sort of buffer of salt with the DNA on the inside. It's nothing on its own. DNA cannot make a cell without an existing cell. So all life is made of cells. But how does one cell make another?
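The byte-counting game above can be checked with a quick back-of-envelope calculation; here is a minimal sketch in Python, assuming round genome sizes (about 4.6 million bases for a bacterium like E. coli, about 3.2 billion for a human) and 2 bits per base, since there are four letters:

```python
# Back-of-envelope information content of genomes:
# 4 letters (A, T, G, C) => 2 bits per base, 8 bits per byte.
def genome_bytes(n_bases: float) -> float:
    """Raw information content of a genome, in bytes (2 bits/base)."""
    return n_bases * 2 / 8

# Approximate genome sizes (assumed round numbers, for illustration only).
e_coli_bases = 4.6e6   # E. coli: ~4.6 million base pairs
human_bases = 3.2e9    # human: ~3.2 billion base pairs

print(genome_bytes(e_coli_bases) / 1e6)  # ~1.15 MB: "a bacterium is 1 MB"
print(genome_bytes(human_bases) / 1e9)   # ~0.8 GB: roughly CD/DVD scale
```

This is only the raw sequence; as noted above, the raw reads from an actual sequencing run are much larger.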
It turns out that we know a lot about how DNA works now. We can synthesize it, we can translate it, we can mutate it, whatever. But we don't really have a good understanding, even now, of how one cell makes another. We have a lot of ideas, a lot of very active threads of research. But this central question, we don't have much idea about. So I was discussing this with a few people over lunch yesterday. Here's a little thought experiment. And I want you guys to think as physicists, because life is physical. So here's the thought experiment. In my lab, I actually have this experiment running. So here's a beaker. And in the beaker, there's sugar, salt, and water. Sugar: glucose or whatever. Salt: I mean lots of salts, ammonium chloride, calcium sulfate, and so on. And I have a pump that pumps the sugar, salt, and water into a black box here. And I have another pump that pulls out the same fluid at the same rate. So the buffer moves from here, to here, to here. And all this machine does is move this sugar-salt-water solution from the left side of the screen to the right side of the screen. It's a boring machine. And then I do one little thing, one tiny little thing, to the black box. And suddenly, this boring machine that just transports the material from the left to the right of the screen now becomes this machine, which takes sugar, salt, and water, and makes living cells. So this should be astonishing. This should be astonishing. I haven't changed anything about the system except one microscopic change. How many people agree this is astonishing? I have this machine in the lab. Do you believe me? You do believe me. So what did I do to this little black box to turn this machine from a passive transporter into an active generator of living things? What did I do? What could it possibly be? One cell. That's right. I added one living cell.
And as soon as I say I add one living cell, and I tell this to people when I give this talk, they say, oh, yeah, well, then it's a very boring question. Somehow, when you think of it as a physical question, it seems magically astonishing. But when you think of it as a biology question, you're like, it's totally obvious. You put in one cell; that cell eats sugar, salt, and water, and makes more cells. So somehow, we're willing to give biology a sort of suspension of disbelief. When biology does something totally ridiculous, we think, yeah, that's OK, even though we can't understand it from some sort of laws of physics. So don't do that. You need to question the biology. You need to constantly question it. Because after all, it's nothing but the same kinds of molecules here as here. Nothing but the same thing. One little point, by the way. Which species of living cells comes out of here? Obviously, the same species as the one that I put in, right? At least on short time scales. It's not that I can put in a bacterial cell and bring out a dinosaur, right? But can I put in a bacterial cell and then use the same machine to make a different species of bacterial cell, without putting in any more cells? Now, this experiment has been done by Craig Venter. So he's actually shown that you can take bacterial cell A and put in the genome of bacterium B. The cell A is like the DVD reader. The genome is like the DVD, right? So once you have the DVD reader and the DVD, the DVD is the thing that controls which movie is playing. So you can actually take cell species A, put in genome B, and after a while, what do you get? You get cells of species B. So you can actually transmute species. This can be done. And in principle, you can then convert a bacterial cell to a dinosaur, but it might take billions of years, because you'd have to go through a series of steps. How do I know that? Because it happened at least once, right? So evolution on Earth is precisely this process. Okay.
Any questions about this magical experiment? Yeah. Sorry. Yes? Sugar, salt, and water. Yeah, the sugar contains the carbon, right? The salts contain various kinds of salts; you need trace metal elements. I mean, there's a few other things, right? This is called minimal medium, and you can just make it with stuff you buy off the shelf, yeah? It's honestly just that. But where do amino acids come from? In fact, good question, right? Before you had cells, how would you take sugar, salt, and water, and make amino acids? So there's a very famous experiment, the Miller-Urey experiment. And this experiment was meant to simulate some sort of prebiotic environment, right? You take molecules that you expect to be found in high quantities before life existed. And then, you know, you buffet it with high temperatures, with electric discharge, right? And eventually, what you do is you look in this tube, which turns into this brown gunk. And it turns out that there are complex organic molecules in there, including things like amino acids, or lipid-related things, you know, complex carbohydrates. So the fact is that making organic molecules of the type that life is made up of is not that hard using abiotic factors, right? And in fact, many of the asteroids in the solar system, and the fragments of meteorites that crash on Earth, contain complex organic molecules. So that's not the issue, right? That stuff is there. It's the arrangement of the molecules that makes a big difference. Okay, so von Neumann, who is famous for many things, but one of the things he's famous for is the idea of a physical, self-reproducing automaton. So the idea is that, you know, molecules on their own are nothing, but can you make a machine, a physical machine? I'm talking about an ordinary-sized machine that somehow scavenges for material from the environment and puts it together in just such a way as to make an exact copy of itself.
Now surprisingly, we're actually quite close to that goal, right? Because now we have 3D printers, and it turns out 3D printers these days can print many of the parts that are needed to make themselves. Not all of them, because there's this microcircuitry and so on, for which you need big silicon foundries, but we're very close. So in a sense, physical self-reproducing automata are almost here. In another sense, by the way, software self-reproducing automata are already here. So if you want to think about digital life, that's another thread of thought in these kinds of fields. But von Neumann was not particularly concerned with the microscopic details of this. That's why we come back to Freeman Dyson again. So Freeman Dyson has a very influential monograph where he talks about molecular replicators. He says: I can imagine that at big, meter length scales, with enough energy, I could achieve self-replication, right? But can you do it at the microscopic scale, when really you're dealing with thermal fluctuations, all the mess of statistical physics, right down at the molecular level, yeah? So his idea is that life is based on molecular replicators. So how could that possibly work? And just to plug: this is my colleague, Sandeep Krishna, when he was working with Sanjay Jain at the Indian Institute of Science. They had a very influential paper where they just thought about a prebiotic soup where you had chemical reactions of the type that we've been discussing in class. You know, A plus B makes C; C catalyzes the ability for D to make E; and so on. And what they do is they make a random network of these kinds of interactions. And they find that these random networks generically contain autocatalytic cycles. These are cycles where molecules can take small precursors and make larger, more complex molecules, which then act as enzymes or catalysts to speed up the cycle.
And you know, this is one example of how a molecular replicator might be envisaged. It's pre-cellular, but the idea of molecular replication and positive feedback is there. So Jacques Monod was also obsessed with the idea of randomness. If you read his Nobel Prize lecture, he also talks about chance and contingency in evolution and in cell biology. And I really like this picture, because he's holding these dice, right? So the idea of randomness is somehow front and center in this image. So this entire school of bacteriologists and phage biologists, they decided to just grab the bull by the horns. They said: how do cells replicate? The only way we can know is to do very reproducible, quantitative experiments about how cells grow. And this is, I think, the moment that captures the transition of biology from an observational subject to the sort of reproducible, quantitative discipline it is today. It's captured in this era. So the story I'm going to tell you over the next half an hour is a story about bacterial growth and division. Very simple stuff. It's stuff that every biologist learns in school and college. So let's start. What do we learn from bacterial growth? I mean, why is it interesting? So, the way you do a bacterial growth experiment is you take the kind of beaker that I showed a couple of slides ago, you put in the sugar-salt water, you put in bacterial cells, and then you count how many cells there are in the tube as a function of time. And there are many ways to count how many cells there are. You can use microscopy, turbidity, whatever. And now what happens to the number of cells? It turns out the number of cells grows exponentially: N of t is some e to the lambda times t. So if you plot the log of the number of cells as a function of time, you get a straight line.
Very confusingly, in the bacterial literature, exponential growth is called logarithmic growth, because you have to plot it on log paper to get a straight line; it's a perverse naming. And also in the bacterial literature, a funny thing is that cell multiplication is the same as cell division, right? But don't let these things confuse you. What's happening is one cell is becoming two, two are becoming four, four are becoming eight, and so on. The point of this slide is to say that exponential growth of bacterial cells is not a confusing concept. As long as you accept that one cell can make two identical copies of itself over some time frame, let's say 20 minutes for E. coli, or several hours for slow-growing cells, and then two can make four, and four can make eight, then as long as the food doesn't run out, you get exponential growth. It's a very simple thing. It turns out that the growth exponent, the lambda in e to the lambda t, contains a lot of information about the underlying metabolism of the bacterial cell. You can measure lambda under various different conditions: of antibiotics, or food deprivation, and so on and so forth. It turns out that these growth rates, as a function of antibiotics and food and various other things, obey very reproducible, stringent mathematical laws. These are phenomenological laws. They are straight lines that can be fit reproducibly, and it requires some explanation to see why these laws emerge. So, Terry Hwa and Matt Scott have done a lot of work in this area to show how much fantastic information you can extract from looking at the growth rate of a cell, just the growth rate. These are fabulous series of papers; you can look them up. Sanjay Jain has also done a series of theoretical works in the same area, okay?
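The statement that exponential growth is a straight line on log paper is exactly a least-squares fit of log N against t; here is a minimal sketch, assuming ideal noise-free data and an E. coli-like 20-minute doubling time (both assumptions, for illustration):

```python
import math

def fit_growth_rate(times, counts):
    """Least-squares slope of ln(N) versus t: the growth rate lambda."""
    ys = [math.log(n) for n in counts]
    n = len(times)
    xbar = sum(times) / n
    ybar = sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(times, ys))
    den = sum((x - xbar) ** 2 for x in times)
    return num / den

# Synthetic data: N(t) = N0 * exp(lambda * t), doubling time 20 minutes.
lam = math.log(2) / 20.0                     # lambda for a 20-min doubling time
ts = [0, 20, 40, 60, 80]                     # minutes
ns = [1000 * math.exp(lam * t) for t in ts]  # ideal exponential growth

print(fit_growth_rate(ts, ns))  # recovers lambda, ~0.0347 per minute
```

With real data the points scatter around the line, but the fitted slope is the same lambda that carries the metabolic information discussed above.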
So that's all about growth, and that's not what I'm gonna talk about. I'm gonna talk about bacterial death, yes? Yes? Okay, so good. So we are going to do these experiments on time scales of hours, where the mutants, and of course there are always a small number of mutants when there are billions of cells, do not visibly impinge on the dynamics, okay? So good question, but for the moment, all the cells in the tube are identical copies of the parent cell that you put in. Okay, so I'm gonna talk about cell death, not cell division. So these are a series of experiments; a lot of such experiments were done in the 1970s. This is a paper from 1990. So what do we have here? We have hours on the x-axis, and we have log base 10 CFU per mL; let me explain what that means. How do you count the number of cells in a tube? You can't see them if you don't have a microscope. So what you do is you take out a sample of the liquid and you dilute it highly, yeah? And then you spread that liquid onto a very, very food-rich plate, yeah? And then you put that plate in a warm incubator and come back the next day. The idea is that you diluted it so much that individual bacterial cells fall very far apart from each other. And there's so much food that every living cell grows and divides, and grows and divides, and by the time you come back in the morning, it makes a colony, yeah? So that's called a colony-forming unit. So a colony-forming unit is actually evidence of a living cell. And when you say colony-forming units per mL, you should think in your mind: that's the number of living cells per mL of the solution. And this is the log of that, and these are all so-called growth curves. So here's exponential growth, exponential growth, exponential growth; that's in the absence of antibiotic. Then these experiments add various types of antibiotic at higher and higher and higher concentrations.
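The CFU bookkeeping described above, count the colonies and undo the dilution, can be sketched in a couple of lines; the colony count, dilution factor, and plated volume here are hypothetical numbers, not data from the experiment:

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Living cells per mL of the original culture.

    Each colony is evidence of one living cell in the plated sample,
    so divide by the plated volume and multiply back by the dilution.
    """
    return colonies / plated_volume_ml * dilution_factor

# Hypothetical example: 230 colonies from plating 0.1 mL of a
# 10^6-fold dilution of the culture.
print(cfu_per_ml(230, 1e6, 0.1))  # ~2.3e9 CFU/mL
```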
And what you find is that at high concentrations, over the course of several hours, you get these curves which, before they flatten out here, are actually a fairly good approximation to exponential decay of bacterial cells, okay? Just digest that for a second. Exponential decay: the standard, prototypical example of exponential decay is radioactivity. So when you plot the log of the number of radioactive nuclei as a function of time, you get this beautiful straight line, just like that. So we know why radioactive nuclei decay exponentially. It's because the decay of an individual nucleus is a probabilistic event, a quantum mechanical event. It happens with a constant probability per unit time. We also worked this out yesterday: if you have an event that happens at a constant probability per unit time, these exponentials naturally arise, okay? So now, exponential growth of bacteria happens because one becomes two, becomes four, becomes eight. No problem. Can somebody tell me how exponential decay of bacteria can happen? If you have 1,024 bacteria at the beginning, then after a certain amount of time, you have 512 bacteria. And after the same amount of time again, you have 256 bacteria. That's really strange, right? Why am I saying it's strange? Because, yes? Where's this? There's no surface; this is a liquid. This is a liquid, and the culture's being shaken, like that. And it's totally well mixed. So the antibiotic's everywhere in the medium, yeah? So I mean, you're thinking maybe there's some inhomogeneity in the system or something like that, yeah? It's a good hypothesis, but that's not what's happening, because it's a totally well mixed system, yeah? But you see the problem. The problem is that if there were 1,024 bacteria and then there are 512, it means that 512 of them died and 512 didn't, right? Now, which subset was dying? Is it a random subset?
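The radioactivity analogy can be made concrete: if each cell independently dies with a constant probability per unit time, the population halves step after step, 1024 to roughly 512 to roughly 256, with no inhomogeneity needed. A minimal stochastic sketch, where the per-step death probability of 0.5 is chosen (an assumption) so that one time step is one half-life:

```python
import random

def decay_step(n_alive: int, p_death: float, rng: random.Random) -> int:
    """One time step: each living cell independently survives with
    probability 1 - p_death, like a radioactive nucleus."""
    return sum(1 for _ in range(n_alive) if rng.random() > p_death)

rng = random.Random(0)  # fixed seed for reproducibility
n = 1024
history = [n]
for _ in range(4):
    n = decay_step(n, 0.5, rng)  # p = 0.5 per step => one half-life per step
    history.append(n)

print(history)  # roughly [1024, 512, 256, 128, 64], up to binomial fluctuations
```

Note that nothing distinguishes the cells that die from the ones that survive; the halving is purely statistical, which is exactly the puzzle being posed.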
Is it somehow that they were weak bacteria to begin with, and the weak ones died first, yeah? Now, if you do this experiment not with bacteria but with a complicated organism, right? If you actually do this experiment with, say, worms, then it turns out what happens is you add the antibiotic, all the worms survive for a little while, right? And then pretty much they all die over a very short time span, because that's when the antibiotic just gets them. So even if there's some variation between the weaker worms that die first and the ones that don't, usually these so-called killing curves are fairly flat and then fairly sharply die off, right? You can imagine just adding poison to any living system. This doesn't behave like that. You get this exponential decay. So the question in my mind is, and I'm going to show you some data that shows how close to exponential it is, if it's pure exponential decay, the answer cannot depend on details like the shape of the flask, or some region around it having less antibiotic, because then the shape and size of the flask would change the shape of the curve, right? Instead you get this single-parameter exponential decay. So what is the parameter that's controlling that rate, and where is the randomness coming from? These are the questions, right? And it turns out I've given you enough mathematical equipment to try and address this question, which is why I'm giving this talk. So is the question clear? Exponential growth of cells: totally easy, right? Exponential decay of cells, which has been known for decades, right? Nobody talks about it, because it requires a completely different style of explanation. So the answer must start off with some sort of thinking like this, right? So let's just unpack it, right? So I've plotted CFU per mL as a function of time. You have these growth curves. If this curve is going up, it's called cell growth. If this curve is going down, it's called cell decay. Fine.
Now of course, microscopically, growth is happening because of cell division. And microscopically, decay is happening because of cell death. So what we need to do is come up with a way to connect the microscopics to the macroscopically observed phenomena, something that all you stat-phys people are very well used to, okay? It's in these microscopics that the randomness must somehow lie. Now, the first change in one's thinking is to realize that when you see a growth curve, it's actually a difference between division and death. It's the division rate minus the death rate that somehow sets this thing up. So if this curve is going up, it must be that there's a high division rate and a small death rate. The death rate may be zero, but whatever it is, division is bigger than death. If the curve is going down, there must be a larger death rate than division rate. This is just a mathematical tautology, okay? So what we need to do is the experiment that takes these growth curves, which are macroscopically observable, and somehow splits them into these different microscopic contributions. Only then can we start to make any progress with this. And I've already told you there's a very easy way to do it. Anybody can do this. So here's what you do. You take the beaker in which the cells are growing. You shine light through it at 600 nanometers. It turns out that bacterial cells will scatter this light. And then you can make a calibration curve that says how many cells per unit volume there are, based on how much light has been attenuated through the tube. So that's called optical density 600. It's just a number. By the calibration curve, you can convert OD 600 into cells per mL. And to do that, you multiply it by this factor. That's an empirical factor. It just converts this measurement apparatus number into the number I'm interested in, which is the number of cells per mL. It's just a calibration factor. You can do the same experiment.
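The tautology above, that the observed curve only fixes division rate minus death rate, can be written as dC_L/dt = (k_div - k_death) C_L. A small sketch, with hypothetical rate values, showing that two very different microscopic pictures produce exactly the same macroscopic curve, which is why the curve alone cannot separate them:

```python
import math

def living_cells(c0: float, k_div: float, k_death: float, t: float) -> float:
    """Deterministic living-cell count under constant per-cell
    division and death rates: C_L(t) = C_L(0) * exp((k_div - k_death) * t)."""
    return c0 * math.exp((k_div - k_death) * t)

# Two different microscopic pictures with the same net rate 0.02/min
# (rates are assumed values, for illustration):
slow_turnover = living_cells(1000, 0.03, 0.01, 60)  # modest division, rare death
fast_turnover = living_cells(1000, 0.52, 0.50, 60)  # furious division AND death

print(slow_turnover)  # same curve...
print(fast_turnover)  # ...indistinguishable from the growth curve alone
```

This is exactly why the talk needs two independent assays: one observable (the growth curve) cannot pin down two microscopic rates.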
As I mentioned earlier, you can also calculate the colony-forming units per mL. What's the difference between these two experiments? The difference is: if a cell is dead, it still scatters light. If a cell is dead, it still scatters light. Until it explodes, until it lyses, it's still over there, and it still makes the beaker turbid. So what I've counted here is actually the number of living cells, C sub L, plus the number of dead cells, C sub D. What I'm counting here is just the number of living cells per mL. And by taking the difference of the two, you can calculate the number of dead cells. It's that easy, yeah? In fact, this experiment is very easy; you just have a machine that does it. It can monitor this stuff as a function of time. This experiment is hard, because it requires a human to pipette it out, put it in the incubator, come back the next day, count very carefully, do some statistics, get some averages, okay? So it's a painful experiment, but very easy to do in principle. These are two very easy, classical experimental assays. Okay, now, so all I want to say from here is: I have a way to measure C L plus C D, living plus dead cells per mL. And I have a way to measure C L separately. So here's what the experiments look like. These experiments, by the way, were done by a master's student. She came to the lab and said, I want something interesting to do. She had the idea to do the experiments. I don't generally run a lab that does experiments, so we did this in the lab of a colleague. And I'll show you who all those people are at the end of the talk. Okay, so let's start. So you have C L plus C D, and C L. Here's what the data look like. This is the log of the number of cells, log base 10 always, versus time in minutes, okay? And here, I've plotted the total number and the number of living cells in the absence of antibiotic. I'm using an antibiotic called kanamycin, which binds to something called the ribosome and prevents proteins from being made efficiently.
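The subtraction described above, C_D = (C_L + C_D) - C_L, is trivial arithmetic but worth writing down; the OD600-to-cells calibration factor here is a hypothetical placeholder, not the empirical value from the experiment:

```python
# OD600 counts living + dead cells (dead cells still scatter light);
# CFU counts only living cells. Their difference is the dead-cell count.
OD_TO_CELLS = 8e8  # hypothetical calibration: cells/mL per unit OD600

def total_cells(od600: float) -> float:
    """C_L + C_D, inferred from turbidity via the calibration factor."""
    return od600 * OD_TO_CELLS

def dead_cells(od600: float, living_cfu_per_ml: float) -> float:
    """C_D = (C_L + C_D) - C_L."""
    return total_cells(od600) - living_cfu_per_ml

# Toy example: OD600 of 0.5 and 3e8 CFU/mL of living cells.
print(dead_cells(0.5, 3e8))  # ~1e8 dead cells/mL in this toy example
```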
So here, these two lines fall right on top of each other. In fact, that's how I get that calibration constant, so it's not a coincidence, but okay. For a long time the two curves agree, and it's not just the scale factor: the slope of this curve, which goes as e to the lambda t where lambda is the growth rate of the cells, is the same for both. The total number of cells is the number of living cells. There are no dead cells. Maybe there are a few, right? But roughly the picture you should have in mind is one cell becomes two, two cells become four, four cells become eight. Maybe once in a while a cell dies, which I show as these sort of hollow pictures, right? But mostly it's exponential growth. This will actually continue until the food runs out, which is what's going on over there, because I don't add new food. This is a single so-called batch experiment. Okay, look down here. I've added a relatively large amount of antibiotic, at t equals zero. What happens then? The total number of cells increases, and then flattens out and stays flat forever. This is easy to understand, because once all the cells are dead, the total number of cells can never change, right? So eventually this has to become flat. What about the number of living cells? It turns out that as soon as I add the antibiotic, the living cells do keep growing for a while, and we'll try and understand why that is. But eventually the total number of living cells starts to drop exponentially, right? Down to the limit of what we can measure, up to error bars. Once the number of cells becomes very small, it becomes very difficult to measure them. But nevertheless, up to our limits of measurement, they're dropping exponentially. So what's the picture here? Here, well, there is a living cell over there. Maybe even when you add the antibiotic, with one last gasp it sort of becomes two cells.
But then after that, both the daughters of that cell die, or one of the daughters dies, and eventually everybody dies. So even though the number of cells may go up temporarily, eventually it goes constant, right? And there are no living cells, they're all dead. Any questions? [A question about resistance.] This is in the presence of a high amount of antibiotic, and resistance is not going to arise over the hours-long time scale of the experiment, right? There are no mutations. This is the same question as I answered earlier. Good question, but over these time scales, you can verify that resistance is not the case, right? Okay, so obviously if here there's no antibiotic and everybody's growing rapidly, and here there's a lot of antibiotic and everybody's dying rapidly, then by doing a very careful titration, by changing the antibiotic concentration bit by bit, you can balance the system, and you can find one antibiotic concentration, for example this one over here, okay, which is called the minimal inhibitory concentration. It just takes a lot of experiments, very carefully done, to find that midpoint. How is the midpoint defined? At this midpoint, the number of living cells doesn't go up, the number of living cells doesn't go down, the number of living cells is actually flat over time, right? And it's flat to the precision with which you can tune the antibiotic. So it's difficult to actually achieve this point, right? It takes a lot of experiments to find it. But once you do it, you can reproducibly get these flat curves, okay? Now, here's the interesting thing. Once the number of living cells is constant with time, you can start to count the number of dead cells, right? The number of dead cells is the total number of cells minus the number of living cells. And here's that plot. Now this is linear, not logarithmic. The total number of cells increases linearly with time. The number of living cells is flat, therefore the number of dead cells is also increasing linearly with time.
Why is the number of dead cells increasing linearly with time? Here's the picture. Every living cell has two daughters. On average, one of those daughters survives and one of those daughters dies. And this is why the number of living cells is constant: the division rate and the death rate match. If that's the case, then the source of dead cells is just the constant pool of living cells, and so the number of dead cells increases linearly with time, which is what you see over there, right? So indeed, we're able to use these measurements to accurately capture the distinction between division and death. And we're actually able to test a hypothesis. The hypothesis being: at this antibiotic concentration, if the number of living cells is flat, is it flat because nothing is going on? They're not dividing, they're not dying, they're just waiting. Or is it flat because they're dividing and dying at the same rate? The former picture is a purely deterministic picture, right? The latter picture is a stochastic picture, because individual cells randomly go either to the death fate or to the division fate. And that's where the randomness of the system is exemplified. So these are all real experiments and not so hard to do. Any questions about the data? So now you want to try and explain this. And as I said, the idea is to try and explain the macroscopically observable quantities from some sort of microscopic model, the standard trick of statistical physics. So here's one such attempt, right? Now for all the physicists out there, this is not a claim that this is how cells work. But it is a model that captures the phenomenological data. And you all know what the explanatory status of such a model is. It's not a proof, but it does show the kinds of underlying processes that are needed to explain the data.
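The balanced-rates picture above can be checked with a toy discrete-time simulation (not yet the model discussed below, and all numbers here are made up): when per-cell division and death probabilities match, the living count random-walks around its starting value while the dead count climbs linearly.

```python
import random

def simulate_balanced(c_l0=1000, rate=0.02, steps=500, seed=1):
    """Each time step, every living cell divides with probability `rate`
    and dies with probability `rate`. With the two rates equal, C_L
    performs an unbiased random walk around c_l0 while C_D grows
    linearly at about rate * c_l0 per step."""
    rng = random.Random(seed)
    c_l, c_d = c_l0, 0
    trace = []
    for _ in range(steps):
        births = sum(rng.random() < rate for _ in range(c_l))
        deaths = sum(rng.random() < rate for _ in range(c_l))
        c_l += births - deaths   # flat on average
        c_d += deaths            # monotone, roughly linear
        trace.append((c_l, c_d))
    return trace
```

Plotting the second component of the trace reproduces the straight line seen in the linear plot of dead cells.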
So with that caveat in mind, let me ask you a question. I've already talked about self-replicators and so on. What is, in your opinion, the minimal unit of biological self-replication? Is it DNA? It's not DNA, right? Because DNA on its own doesn't self-replicate. If I just have DNA, it's not going to make more DNA, yeah? So what is the minimal biological unit of self-replication? mRNA? mRNA on its own can't replicate. Yes, what is the minimal unit of self-replication? The ribosome on its own can't self-replicate, right? On its own it needs, of course, ribosomal genes, it needs amino acids, it needs many other things. So clearly the ribosome on its own is not enough. The ribosome is part of a unit, but on its own it is not enough. Anything else? Cell? Okay, that's a good hypothesis. This is a testable hypothesis. Is the current cell the minimal unit of self-replication? We certainly know it is a unit of self-replication. The only question that remains is, is it minimal? So somewhere between pure DNA and the cell, you have the minimal unit, okay? So I'm going to put out a hypothesis. It's just a hypothesis, and I'm not going to put in any molecular details. Just like Mendel claimed that there were discrete particles of inheritance, I'm going to claim that the cell contains discrete units of self-replication that are smaller than a cell, okay? This is my claim. And since I don't know what they are, I'm going to call them widgets, okay? Widget is sort of American slang for a little doodad that you don't understand, yeah? So what are the properties of widgets? For this model, the properties of widgets are the following. The assumption is that since the widget is the minimal unit of self-replication, it has no internal structure, right? So its replication and its removal are fundamental stochastic chemical processes of the type we've been discussing in class, right?
So one widget will make two widgets with probability per unit time alpha, and the number of widgets goes up by one, right? And one widget will make zero widgets, it'll get removed, with probability per unit time gamma, right? So this is very similar to the mRNA model that we wrote down yesterday. Slightly different though, and you'll see the equations being slightly different. So by the way, gamma, which is the rate at which widgets are removed, we are going to assume is somehow proportional to the antibiotic concentration, right? That's just a way to make connections with experiments. So the more antibiotic there is, the more these guys get removed. Which sort of makes sense, because our antibiotic is attacking the ribosome, and the ribosome must be a part of any minimal self-replicating unit. So that's the widget, right? Now this is a very, very simple model. So dW/dt, what's the equation? dW/dt is alpha W minus gamma W. The difference from the mRNA model yesterday: there, dM/dt was alpha minus gamma M, because the mRNA was being made from DNA at a constant rate. But in our model, widgets are making widgets. So dW/dt is alpha W minus gamma W, the solution to which is very simple: it's e to the (alpha minus gamma) t. So the growth exponent is alpha minus gamma. You know, this is totally simple stuff. By now, you guys are all experts on this. But I want to know the macroscopic behavior, right? So I'm trying to understand a very different property. I'm trying to understand what is the apparent rate at which one cell makes two cells, and what is the apparent rate at which one cell dies? And I'm going to call those phi plus and phi minus. These are phenomenological rates. In a sense, I've measured them. They are the slopes of all these curves that I showed you, yeah? So when one cell makes two, the number of living cells goes up by one. When one cell dies, the number of living cells goes down by one, and the number of dead cells goes up by one.
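The mean-field contrast between the two models is one line each; a sketch:

```python
import math

def mean_widgets(w0, alpha, gamma, t):
    """Solution of dW/dt = (alpha - gamma) * W: widgets make widgets,
    so you get exponential growth or decay with exponent alpha - gamma."""
    return w0 * math.exp((alpha - gamma) * t)

def mean_mrna(m0, alpha, gamma, t):
    """Contrast: yesterday's mRNA model, dM/dt = alpha - gamma * M, where
    production from DNA is constant; it relaxes to alpha/gamma instead
    of growing without bound."""
    return alpha / gamma + (m0 - alpha / gamma) * math.exp(-gamma * t)
```

With alpha above gamma the widget count explodes; with alpha below gamma it decays to zero; the mRNA count always settles at alpha/gamma.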
These are the underlying properties of this phenomenological description of how cells behave. What is the trick? The trick is to use the powers of the mathematical framework that I've taught you to go from the microscopic parameters alpha and gamma to the phenomenological macroscopic parameters phi plus and phi minus, yeah? So that you can take this equation. These are the kinds of equations we can use to fit the data, right? dC_L/dt, the time derivative of the number of living cells, is (phi plus minus phi minus) times C_L, because of these two equations. And dC_D/dt, the rate of increase of dead cells, is proportional to the number of living cells, because living cells are the things that make dead cells, right? So you get phi minus times C_L. These are the equations that the cells obey. Now I'm going to add one more ingredient to the model that allows you to go from the microscopic to the macroscopic, right? Here's the detailed picture. These gray boxes are meant to be cells. Time is moving downwards, okay? Cells are sort of growing. And I'm just going to count the size of a cell as the number of widgets it contains. So this has three widgets. This has four widgets. This has three widgets. This has one widget, and so on, yeah? Widgets make more widgets, right? So one widget can become two, or a widget can be removed. That's how the number of widgets in a cell changes. But we now have to put into the model some rule about when the cell chooses to divide, right? And for the moment, I'm going to use the simplest possible assumption. I'm going to assume that when the number of widgets, W, hits some threshold value omega, then the cell immediately divides, right? It's the simplest possible ingredient I can add to this model to allow cell division to occur from the microscopics, okay? Are there any questions? So in this particular case, for example, omega is equal to 5.
As soon as the number of widgets reaches 5, that big cell partitions and becomes two little cells, okay? And how do the widgets partition between the two cells? Well, if they happen to be on the left side of the cell, they go to the left cell. If they happen to be on the right side, they go to the right cell. And the chance of each being on the left or the right is a coin flip, so you get a binomial distribution with probability one half, right? So there's an injection of randomness here also. The replication and removal processes are stochastic, and the partitioning of the widgets at division is also stochastic. So now I've explained how the widget model gives cell division. How does the widget model give cell death? Very simple. If you have only one widget and you lose it, right? You have no more self-replicating units. So by definition, that's it, you're dead. So the idea is to then implement this detailed model and use alpha and gamma to work out what phi plus and phi minus ought to be, right? It sounds quite hard to do, but actually it's fairly straightforward given the mathematical equipment we discussed yesterday. Are there any questions about this cartoon? Because I'm not going to show this again, but it's important for the next slide, yeah. A widget is my abstract, assumed, totally fake, minimal reproduction unit of a cell. In this model, I'm just going to assume this and see how well I can fit the phenomenology, okay? So here's how you go from alpha and gamma, which are microscopic parameters, to phi plus and phi minus, which are macroscopic phenomenological parameters. This equation should look familiar. It's not a master equation, it's not stochastic, but it looks like a master equation, okay? And I'm going to walk you through it. I have a histogram as usual. On the x-axis of the histogram is the number of widgets, right? Zero widgets, one widget, two widgets, and so on. The height of each bin tells you the total number of cells that have exactly that many widgets.
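The cartoon rules just described (widget replication at rate alpha, removal at rate gamma, instant division at the threshold omega with binomial partitioning, death at zero widgets) can also be simulated directly, Gillespie style. This is a sketch with illustrative parameter values, and rare edge cases at the threshold are handled only approximately:

```python
import random

def simulate_widget_cells(alpha=1.0, gamma=0.2, omega=10, t_max=5.0, seed=0):
    """Gillespie-style simulation of the widget cartoon.
    Returns (number of living cells, number of dead cells)."""
    rng = random.Random(seed)
    cells = [omega // 2]            # widget counts of the living cells
    dead, t = 0, 0.0
    while cells and t < t_max and len(cells) < 10_000:
        total_w = sum(cells)
        t += rng.expovariate((alpha + gamma) * total_w)
        # choose a widget uniformly, i.e. a cell weighted by widget count
        i = rng.choices(range(len(cells)), weights=cells)[0]
        if rng.random() < alpha / (alpha + gamma):
            cells[i] += 1                        # widget replication
            if cells[i] >= omega:                # threshold reached: divide
                w_tot = cells.pop(i)
                left = sum(rng.random() < 0.5 for _ in range(w_tot))
                for w in (left, w_tot - left):   # binomial partitioning
                    if w > 0:
                        cells.append(w)          # living daughter
                    else:
                        dead += 1                # daughter born with 0 widgets
        else:
            cells[i] -= 1                        # widget removal
            if cells[i] == 0:                    # last widget lost: cell dies
                cells.pop(i)
                dead += 1
    return len(cells), dead
```

With alpha well above gamma the population explodes; with gamma well above alpha it goes extinct, mirroring the two experimental regimes.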
So this histogram is a distribution of many, many cells over widget number. The integral of that histogram, the total area under it, is the total number of cells, okay? Now what happens to individual cells? An individual cell can move from one bin to the next. It can go from having w widgets to w plus one widgets. And we discussed yesterday what the rate of that happening is, yeah? So the rate of going to the right is alpha times w, the total propensity, because widgets make more widgets, times the number of cells in this bin, c_w. This is just how we developed the master equation yesterday. It's the same kind of thinking. So with these rates, alpha w c_w, alpha (w plus one) c_(w plus one), you move to the right, okay? You can also move to the left. How does that happen? A cell with w plus one widgets can lose a widget and come down to w. A cell with w widgets can lose a widget and come down to w minus one. This happens with propensity gamma w times c_w, the number of cells over there. Very simple, okay? And so all these dynamics of cells moving to the left and the right are encoded in this part of the equation, which in fact looks exactly like the master equation we wrote down yesterday. So d/dt of the number of cells that have exactly w widgets has two loss terms, because you can leave both to the right and to the left. And it has two gain terms, because you can come from the right or come from the left. Okay? Any questions about this? It's just what we wrote down yesterday. Two additional ingredients. If you leave from the left side, then you go to a special bin called zero. And that special bin called zero is actually a sink. You can never get back from it, right? So there's no right arrow coming from that side. So the total height of the zero bin, c_0, is the number of dead cells. That's what I've written down here: C_D is c_0, and C_L is the sum of all the other bins.
C_L is the number of living cells, summed over bins from w equals one to omega minus one. So we now have a model that allows us to go from widgets to the number of living and dead cells. One final important ingredient. When a cell goes from omega minus one widgets to omega widgets, it immediately divides. And when it divides, what happens? It makes two cells. And what do we know about these two cells? We know that the total number of widgets they have must be omega, but that omega is partitioned between the two cells binomially. That's the binomial coefficient, and that binomial coefficient drops the two cells into two separate bins. One cell may have got a lot, one cell may have got a little. And as soon as you leave from the right side, the total area of the histogram increases by one, yeah, because one cell left over here and two cells came in here. And that's why this is not a master equation. It's not an equation for a conditional probability distribution which is normalized. It's an equation for the total number of cells. And over time, the total number of cells can increase, and the total number of living cells can certainly decrease. Any questions about this? Yes. [A question about the partitioning.] It is homogeneous, that's why it's binomial. So you're saying there may be even more sources of noise. I'm saying I'm using the lowest amount of noise I can. This is the simplest model. I can certainly add more complexity to the division rule. Maybe it doesn't divide exactly at omega, maybe it divides at omega plus 5. Maybe the partition is not in the middle. As it turns out, by the way, bacterial cells very, very precisely put their septum right in the middle, just as a matter of observation. And because the positions of the individual widgets are random, separating exactly halfway gives you exactly a binomial distribution. Okay, so everybody happy with this equation? So you can solve this equation.
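Because the equation is linear in the bin counts c_w, it can be written as d c/dt = M c for a single matrix M. Here is a sketch of building M under the stated rules (bin 0 is the dead-cell sink; as a simplification, the rare daughter born with all omega widgets is folded into the top living bin):

```python
from math import comb

def widget_matrix(alpha, gamma, omega):
    """Rate matrix M with d c/dt = M c over bins w = 0 .. omega-1.
    Bin 0 is the absorbing dead-cell sink (no arrows out of it)."""
    n = omega
    M = [[0.0] * n for _ in range(n)]
    for w in range(1, n):
        M[w][w] -= (alpha + gamma) * w     # loss: leave bin w to either side
        if w + 1 < n:
            M[w + 1][w] += alpha * w       # widget replication: w -> w+1
        M[w - 1][w] += gamma * w           # widget removal: w -> w-1 (1 -> sink)
    # a cell leaving bin omega-1 "to the right" (rate alpha*(omega-1))
    # divides, splitting its omega widgets binomially over two daughters
    flux = alpha * (omega - 1)
    for k in range(omega + 1):
        p = comb(omega, k) / 2.0 ** omega  # P(first daughter gets k widgets)
        for d in (k, omega - k):           # both daughters of this split
            M[min(d, n - 1)][n - 1] += flux * p
    return M
```

The column sums check the cell-number bookkeeping: interior moves conserve cells (columns sum to zero), while the dividing column sums to alpha*(omega-1), one net cell gained per division.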
And in fact, you can solve this equation more simply than the equations I showed you earlier, because this entire equation is just a matrix, and a finite matrix at that, because you only have to track the bins from 0 to omega minus 1. So you can literally write down by hand what this matrix looks like and do the matrix exponentiation, okay? So let me just show you what the results of that model are. So these are the experiments, right? These are the same experiments I showed you earlier, time versus the number of cells, but now with all the antibiotic concentrations on the same plot. Here's the total number of cells. Here's the number of living cells. It's the same data I showed you earlier, just plotted in a different way. The total number of cells increases exponentially with zero antibiotic. It flattens if there is a lot of antibiotic. And the total number of cells has this sort of intermediate asymptote at that precise minimal inhibitory concentration of antibiotic, yeah? The number of living cells grows exponentially, or it decays exponentially, or it's exactly flat at the minimal inhibitory concentration. These are the experimental data. How many parameters are in the widget model? There's alpha, there's gamma, there's omega, right? But one of those just sets the unit of time, which I can use to calibrate, right? So it's not an essential parameter. The other one, gamma, can be used to represent adding more antibiotic. So let's say I keep alpha fixed by setting the units of time, and I vary gamma as a way to capture the idea of adding more antibiotic. So those two parameters are not really free. The only parameter left is omega. This is an example for omega equal to 10. It's not a fit, it's just a demonstration, right? This is what the model predicts, and these are the data. And that's pretty good, right? We're not looking for a fit.
We're looking for an explanation for the various intricate parts of the dynamics, right? Among other things: at very low antibiotic, the number of cells grows exponentially, so in this logarithmic plot it's linear. At very high antibiotic, the number of living cells decays exponentially. At the intermediate antibiotic concentration, the number of living cells is flat. Similarly here, when the number of living cells decays, the total number of cells flattens out, which is what happens there, right? And when the number of living cells is flat, the total number of cells increases linearly, but on this plot with a logarithmic axis, it looks like a logarithm, right? So you agree, we're capturing a lot of the phenomenology with very, very few assumptions, which is the whole thing we want to do. Incidentally, we also capture this little bump, right? Which is kind of cool. That little bump is what happens when you have lots of widgets: when the antibiotic is added, it takes time to destroy all of them, so in the meantime cells can still manage to divide. So you get that little bump as well. Okay, so this is actually very nice. And as it happens, I was sitting on these plots for a couple of years before Somya came to the lab and said she'd like to do the experiment, right? So this is one of those cases, which is very, very rare, where it worked the way science is usually portrayed: you make a prediction, then do the experiment. Often science works in the other direction. Truly, I was just sitting on this, and eventually the data came out, and the first day it came out we were both completely thrilled, right? So just to say that this is not a fit; the model was not designed to explain these data. It was done the other way around. Yes, the shaded region is where the food runs out. The food runs out and therefore, even in the absence of antibiotic, the growth rate slows down.
And if you wait longer, that will just flatten, yeah? Sorry, I should have mentioned that. [A question about the value of omega.] Yes, it is important and I'll get to it in a second. In this plot, I've used a value of omega equal to 10, right? But your question is, what is the actual value of omega? For the real cell, that is absolutely relevant, right? It's coming in exactly the next slide, okay? So thanks for the question. Other questions? Okay. So this is just to show you how I use this model: I take alpha and gamma, and for each value of alpha and gamma, I get the entire dynamics of how many dead cells and how many living cells there are, and I plot them here. I just go through the motions, just like we did yesterday. And these are the data. So now that we have a model that sort of explains the data, we're going to go into the model and see why it's doing that. The reason we want a model is not to make this prediction. The reason we want a model is to get some understanding. So here's what's happening under the hood of the model, okay? I'm going to spend some time on this slide because this is really the heart of this course. So here are two values of omega: omega equal to 10 and omega equal to 50. Omega, remember, is the number of widgets that you have at the time of cell division. These histograms are bins, right? There are bins 0 to 9 and 0 to 49 on these two x-axes. The y-axis is just the frequency. I've plotted the histograms so that the height of the maximum bin is the same; they're not normalized histograms. This is just for ease of visual representation, okay? And what I'm plotting is the distribution.
So what happens is, when you take any initial distribution and fix the values of alpha and gamma and omega, eventually, because this is just matrix exponentiation, the shape of that histogram approaches the shape of the eigenvector of that matrix with the maximum eigenvalue, okay? And so these histograms are just that eigenvector. Unlike for a Markov process, the eigenvalue is not one. The eigenvalue is actually either greater than one or less than one, because the number of cells is actually growing or decaying, okay? So when you look at this particular distribution of widgets, there are hardly any cells near zero widgets, and so there's hardly any death rate; in fact, there's zero death rate. There are lots of cells with omega minus one widgets, so the division rate phi plus is quite high. Same over here. Now as I add antibiotic, I'm making gamma higher and higher, right? It goes from 0 to 0.5 to 2 to 2.5. And what that does is shift the distribution of widget numbers closer to 0. You see this histogram being bunched towards the left, towards the left, towards the left. And therefore the division rate is dropping and the death rate is increasing, okay? So having fixed omega, for every value of gamma over alpha, I can separately plot the division rate and the death rate. And that's what these curves are. It looks like a complicated figure, but it's just a lot of overlays of some very simple figures. Let me walk you through it. For example, omega equal to 10, which is the plot I was using earlier, is this sort of dark brown one, this guy. The division rate starts off high, then it drops, it drops, it drops, and eventually it goes to zero as antibiotic is increased. At the same time, the death rate starts off at zero, it increases, increases, increases, and then keeps on increasing with more antibiotic.
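One way to see the division and death rates emerge is to power-iterate the living-bin dynamics until the histogram converges to that leading eigenvector, then read off the per-cell rates. A pure-Python sketch (forward-Euler power iteration; the step size, iteration count, and edge handling at the threshold are all ad hoc choices of mine):

```python
from math import comb

def division_death_rates(alpha, gamma, omega, iters=20000, dt=1e-3):
    """Evolve d c/dt = M c on the living bins w = 1 .. omega-1, renormalizing
    each step, until c reaches the leading eigenvector shape; then
    phi_plus  = alpha * (omega-1) * c[omega-1] / C_L  (divisions per cell),
    phi_minus = gamma * 1 * c[1] / C_L                (deaths per cell)."""
    n = omega - 1                              # living bins, index i = w - 1
    p = [comb(omega, k) / 2.0 ** omega for k in range(omega + 1)]
    c = [1.0] * n
    for _ in range(iters):
        dc = [0.0] * n
        for i in range(n):
            w = i + 1
            dc[i] -= (alpha + gamma) * w * c[i]
            if i + 1 < n:
                dc[i + 1] += alpha * w * c[i]      # gain a widget
            if i - 1 >= 0:
                dc[i - 1] += gamma * w * c[i]      # lose one (w=1 -> sink)
        flux = alpha * (omega - 1) * c[n - 1]      # division events per time
        for k in range(omega + 1):
            for d in (k, omega - k):               # the two daughters
                if d > 0:                          # d = 0 is born dead
                    dc[min(d, omega - 1) - 1] += flux * p[k]
        c = [max(ci + dt * dci, 0.0) for ci, dci in zip(c, dc)]
        s = sum(c)
        c = [ci / s for ci in c]                   # keep only the shape
    phi_plus = alpha * (omega - 1) * c[n - 1]      # c is normalized: C_L = 1
    phi_minus = gamma * 1.0 * c[0]
    return phi_plus, phi_minus
```

At gamma equal to alpha the two extracted rates balance, which is the model's version of the minimal inhibitory concentration; below it, division wins.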
So what is this flat line where the number of living cells is constant? That's obviously where the division and death rates are equal. It's that point over there, right? And by the normalization of parameters, it happens when gamma over alpha is one. Very good, so we're looking at that particular case. Now what happens if I make omega very large? Something interesting happens. At very high values of omega, in the limit that omega goes to infinity, the system becomes nearly deterministic. What happens? The division rate starts high and goes down to zero as you add antibiotic, but there's never any death. And then the death rate becomes higher, but there's never any division. And at this very fine-tuned point where division and death are equal, division and death are both zero, and that's the only sense in which they're equal. Now in the biological literature, you'll find a lot of people using the phrase bacteriostatic antibiotics. They say that when you add this very precise antibiotic concentration, bacteria enter something called stasis. They don't do anything, right? Now we've shown that's not true. Even at that antibiotic concentration, they're dividing and dying. You only get stasis in the large omega limit. So what is omega? Omega is a measure of the amount of noise or fluctuations in the system. In other words, in a very noise-free system, the cells will neither be dividing nor dying; they'll just be doing nothing. In a very noisy system, cells will be dividing and dying at a very high rate. Both those rates are equal, but they're both very high. So I can use that balanced division and death rate, the common value of the two, as a measure of the amount of noise in the system. And if I plot that as a function of 1 over omega, you see it's a nearly linear relationship.
As omega increases, so you're moving towards the left in this curve, you eventually reach the point where you have zero division and death, right? So what is omega? Omega is just a measure of the noise in the system. In this particular model, because it is so simple, the only way I can tune the noise is by changing omega: the larger omega is, the smaller the noise. In practice, there are other sources of noise. There is noise because the septum may not be exactly in the middle. There is noise because the omega threshold may not be read out exactly. And there is noise because cells may have other sources of heterogeneity, right? So in practice, there'll be even more noise than this model predicts, yeah? Okay, fine. And now we can go in and ask what value of omega actually explains the observed levels of division and death. You shouldn't take it too seriously, since this model is so sparse; you shouldn't fit the data. But roughly, the order of magnitude is just a few, somewhere from 2 to 10, okay? So omega is somewhere between 2 and 10. And I have some confidence in saying that because there's a completely separate way to measure minimal cellular units. So Suckjoon's paper, which came out just last year, does very, very sophisticated measurements of growth rates, not death rates, but growth rates at different food concentrations, and uses a sort of collapse of the curves onto a standard form. They find that for different values of food, the number of so-called unit cells, which is what we're calling widgets, varies from about 2 to 8, but not much more than that. So whatever the minimal unit of replication is in a real cell, it's not much smaller than a cell. It really takes most of a cell in order to replicate. That's roughly the idea, coming back to the question.
So remember, for the Poisson distribution that we derived yesterday, the number of mRNA was described by a Poisson distribution, and the higher the mean number of mRNA, the lower the relative variance in the system. It's a general feature of stochastic chemistry that as the number of molecules in the system increases, the relative size of fluctuations goes down. So that's what they call noise in that case, sort of relative fluctuations. Here what I'm calling noise is a slightly different thing. I'm saying that when you fine-tune that antibiotic level to 4.21 or whatever it was, then you measure that the number of living cells is flat, macroscopically. But microscopically, individual cells are stochastically dividing and dying, right? So if there were no separate division and death rates, if both of those rates were zero, I take that to be the noise-free limit, right? The cells are just waiting. And in fact, the number of widgets is then a very sharp histogram, and it doesn't fluctuate. But as omega becomes smaller, the histogram width increases and you get division and death happening at higher rates. And that's what I'm calling noise. And by plotting that balanced division and death rate, which is the height of the intersection point there, as a function of 1 over omega, you see that as omega increases, those spontaneous rates of division and death decrease. So this is just proof that omega contains information about the noise in the model. So very good, we actually got through. We have half an hour left and I'm going to do a few things with that half hour. First, I'm going to do a little shameless plug for the place I work at, just to show you where all these experiments were done. Secondly, I'm going to go over a paper that has to do with a stochastic bistable switch and show you experimental data.
And thirdly, in the last 10 minutes, I'm going to go over how you guys are going to do the homework problem, so that I can ruin your weekend and you're going to have to do homework. So the point I want to make here is that this experiment was a huge collaboration, both formal and informal. Sanjay is a physicist, Matt is an ex-physicist, Suckjoon is a physicist, so there's a lot of physics involved over here. Savita, Somya, and Ashwin are biologists. Ashwin is a colleague of mine at NCBS and Savita is a postdoc in his lab. Sumandas is a postdoc at the Simons Centre, and he's a physicist. So the idea of even doing this experiment would not have occurred to us had we not had a lot of interest locally from people who were excited about testing predictions. It's very nice to be in an environment where biologists are happy to test your predictions. So this is the shameless plug: here's where I work. NCBS is in Bangalore, which is in South India. If you've not been there, please visit. The weather is great and mango season is April-May. The Simons Centre at NCBS is a subset of NCBS. We have partner groups around the world who have similar kinds of programs: in Singapore, a RIKEN group in Japan, IBS in South Korea, the Max Planck Institute in Dresden, and institutes at ESPCI. Among other things, we run joint meetings and postdoc programs. This is where we work; it's a nice place. Here's a really fantastic cover design that we did for a recent report, and I think it captures the idea of randomness in biology. So on the left, how many of you have ever programmed in Logo? You know the programming language Logo? Logo was a programming language that was used to teach children. You had a little triangle, and by saying forward 30, left 15, it would move forward 30 steps and turn left by 15 degrees, and you could do for loops and if-then statements. So that's a little Logo turtle, and here's a real turtle.
Here's a picture of a real turtle, and the idea is that using fundamentally random processes at microscopic scales, you can somehow make something macroscopic like this. These are the kinds of subjects we spend time on at NCBS, and the work is actually quite fascinating. I work on this kind of cell biology, but Madan Rao is a physicist; he spends a lot of time thinking about hydrodynamic models of active systems, membrane dynamics, and so on. Shashi is a biophysicist; he makes literal physical realizations of biological systems, like lipid vesicles and metronomes that synchronize, using simulations and experiments. Sandeep is interested in how populations of cells make decisions, a bit like that game we discussed yesterday, and I study cell biology and evolution. The poster went away, but we do have postdoc programs, so you're invited to apply, and that's the website where you can go look for the information. So there's the poster; you can go take a look at it. Having done that, let me very quickly move into experimental data. For those of you who weren't there after the tutorial yesterday, some of us were in here and I taught everybody a card game; if you weren't there, just ask somebody who was and you can pick it up from them. So here's a paper from the lab of Alexander van Oudenaarden at MIT, a long time ago now, when I was doing my PhD. A postdoc did the bulk of the experiments, though I did try my hand at them. So here's a system. This is one of those stochastic chemical kinetic gene regulation systems that we've been trying to model, but here's an actual one. In this particular system, remember that the model you're going to solve for your homework is a feedback model.
It's one where the more protein X you have, the higher the rate of protein X synthesis. In our model, that happened because protein X literally bound to its own promoter site and caused an increase in expression of gene X: a very direct positive feedback loop. Here it happens more distantly; let me just walk you through it. You have a so-called operon, a collection of genes in a bacterial cell with a single promoter, and this particular gene is the important one. It's called lacY. It makes the protein LacY, which is a pump, and this pump pumps sugar in from outside the cell. In this case the sugar is called TMG. When TMG enters the cell, it binds to another protein, present at fixed amounts, called LacI, and inhibits it. That's what this blunt arrow means: a sharp arrow means activates, a blunt arrow means inhibits. And LacI itself is an inhibitor of the promoter. So you have a double-negative feedback, which is an overall positive feedback for the system. This is a real system, a natural one that occurs in E. coli cells, and the whole point of it is that the cell will not eat sugars like TMG or lactose when glucose is present, because glucose cuts the positive feedback loop. We measured what was going on in the system by adding another copy of the same promoter in the genome, driving the so-called green fluorescent protein from jellyfish, and that was fun. So here's the data, and I think it's really quite amazing how nicely it worked out. Remember what I drew yesterday: all those double-well potentials with a control parameter, where the control parameter determined whether there would be a high state, both a high and a low state, or a low state. In this case we have two parameters that we can vary.
One parameter is the extracellular glucose concentration, and the other is the extracellular concentration of the sugar called TMG. Just focus on this picture for a second. If I pick some quantity of TMG, say that one, about 15 micromolar, and I grow the bacterial cells at that level in a tube for a while and then dump them on a microscope slide: this is a micron scale bar, so these cells are a couple of microns in size. This is a pseudo-colored but quantitative image; the amount of green is proportional to the amount of green fluorescent protein in the cell, and what you see, very strikingly, is that there are some white cells and some green cells. If I plot that quantitatively, on the y-axis I have the average greenness of every cell on a log scale, and each dot here is a single cell in a single experiment. At that TMG concentration there are two populations of cells, one with a green level of about 100 arbitrary units and another with a green level of about 1: a factor-of-100 difference in brightness between the two populations. The neat thing we can do is tune the extracellular sugar concentration. Why are there two populations of cells at all? Because you have this double-well potential, which means that even cells that started off with very little green can transition into the high-green state. If you increase the TMG level very high, all the cells are in the high state. Now as you decrease it (each of these experiments is done separately, by taking cells from this state, dumping them at this TMG level, and waiting several hours for them to reach steady state), the population stays in the high state and then very quickly falls down. But if you take cells in the low state and do the reverse experiment, these guys stay in the low state and very quickly jump up.
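As an aside, that up-and-down sweep protocol is easy to sketch numerically. The model below is not the lac circuit; the Hill-type production term and every parameter value are made-up choices, tuned only so the system is bistable over a window of the removal rate k, which plays the role of the control parameter:

```python
import numpy as np

def dxdt(x, k, a=0.05):
    # Toy self-activating gene: basal production a, Hill-type positive
    # feedback, and first-order removal at rate k (the control parameter).
    return a + x**2 / (1 + x**2) - k * x

def settle(x0, k, dt=0.01, steps=20_000):
    # Crude Euler integration until the system reaches steady state.
    x = x0
    for _ in range(steps):
        x += dxdt(x, k) * dt
    return x

ks = np.linspace(0.3, 0.7, 41)
up, down = [], []
x = 3.0                      # begin on the high branch...
for k in ks:                 # ...and sweep the control parameter up
    x = settle(x, k)
    up.append(x)
x = 0.05                     # begin on the low branch...
for k in ks[::-1]:           # ...and sweep back down
    x = settle(x, k)
    down.append(x)
down = down[::-1]            # align with ks

# Inside the bistable window the two sweeps disagree.
print("k = 0.5:", up[20], "vs", down[20])
```

Inside the bistable window, the steady state you land on depends on the direction of the sweep, which is exactly what the cells are doing in the experiment.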
And putting these two together gives precisely the so-called hysteresis curve you would be familiar with, where the external control parameter is the magnetic field and the internal variable is the magnetization. The mathematics is exactly the same; the bifurcation class is basically a first-order phase transition. But in this case you shouldn't think of it as a first-order phase transition; you should think of it as a stochastic chemical kinetic system in a double-well potential, and this is the homework problem you are going to solve: how long does it take to go from the low to the high state stochastically, and how long from the high to the low state. One interesting point to note: a lot of cells that used to be in the low state have already started jumping to the high state even before the deterministic low fixed point vanishes. That doesn't happen in the reverse direction; in the reverse direction you basically have to wait until the boundary before you come down. The reason is that here you have low molecule numbers, and with low molecule numbers you get large relative noise. That's why these guys transition more readily upwards while those guys transition less readily downwards. This is just a nice little experiment showing that the kinds of mathematical models we are working on are relevant. Let me then go to the homework. How much time do we have? Half an hour? Very good; I'm actually going to sit down here. What I want to do is go over one homework problem, a very important one, which will be a template for you to solve the big homework problem we've given you for the weekend. How many of you have attempted the homework, or at least got started? I won't deal with the first two problems; the first is just to get you used to the idea of using a random number generator, and I'm not going to talk about problem 1b.
Again, it's the same thing, just using the central limit theorem. The third problem is Monopoly. If you haven't seen it, this is the Monopoly board. The way the game works is that the players start here, roll two dice, and move around the board like so; the number of steps you move is the sum of the dice, anything from 2 to 12, and I want you to work out the distribution of those numbers. The only trick in this whole simulation for the homework is that when you land on this little square, the 30th square, called Go To Jail, you don't spend any time there; you immediately go to jail. You can write down a little Markov matrix for that entire process, and then use that matrix to find the eigenvector with eigenvalue 1, which is the steady state you will reach if you keep running the system from the starting condition, and the steady state should look like so. So I'd like you guys to generate this Markov matrix, a matrix over 39 positions by 39 positions. Why 39?
Because that Go To Jail position isn't really there; you never spend time on it. You're meant to fill in these numbers, implement the Go To Jail condition somehow, and then plot, or give me the numbers for, the chance of spending time in steady state at each of the 39 positions around the board. (That's not a matrix, that's actually a picture of the board.) Fine, so that's that. The thing I really want to spend time on now is question 3, because it'll give you a template for question 4, which is the one that's going to be graded for your homework. Let me see, I need some chalk here. In class, we already discussed various types of equations. If you have a standard chemical equation like this, we discussed that it converts to a master equation, and I'm not going to write the master equation here because we spent quite a lot of time on it. Approximately, you move the dynamical variable from x itself to the probability of getting various x's; that becomes the dynamical variable, and you get a linear equation of this type, the Fokker-Planck equation, which says dp(x,t)/dt = -(d/dx)[(f(x) - g(x)) p] + (1/2)(d²/dx²)[(f(x) + g(x)) p]. I also gave you another recipe to do the same thing. The Fokker-Planck equation is a recipe for solving the conditional probability distributions using any PDE-solving system you want, but there's another way to write a recipe, for the stochastic trajectories, and we did it like so: delta x = (f - g) delta t + sqrt((f + g) delta t) times r, where r is a Gaussian random variable with mean 0 and variance 1. We did this yesterday. This turns out to be a more general thing. In
general, if you have a Fokker-Planck equation of this type, where the drift term is some function a and the diffusion term is some function b, so dp/dt = -(d/dx)[a p] + (1/2)(d²/dx²)[b p], then it is equivalent to the recipe delta x = a delta t + sqrt(b delta t) times r. The only point of confusion is that people sometimes lose track of this factor of one half; the half came from the Taylor expansion, and the b here is the same as the b there. I realize I went over this rather fast, and my suggestion is that you go look at the notes: at least this part was derived in the notes I put online for you to read. For the rest, it's fine at this stage to memorize the idea: this partial differential equation will give you stochastic trajectories according to the recipe I gave you, and those will be the same kinds of stochastic trajectories as you would get from a completely different method of simulation, one that uses random numbers at every little time step, with some delta t as a discretization choice. There are many, many equations in physics that have this particular form. Now, the difference for stochastic chemical kinetics, and it's a very important distinction, is that a and b, the drift term and the diffusion term, are not independently specifiable. You can't independently say that the drift is some rate of synthesis minus decay and then independently pick some noise term: for stochastic chemical kinetics, if a is f - g, then b is f + g. There's no choice about it. This is a slightly unusual idea if you're used to just injecting noise on the right side of your equation. This doesn't correspond to an injection of noise; it corresponds to modeling noise that's already there. That's the
difference. So here's what the homework equation is going to do. Here's a Fokker-Planck equation; take a look at it. This is called the Ornstein-Uhlenbeck process, and this is the equation that corresponds to it; don't worry about the words if you haven't seen this before. It has the same form. Whenever I look at something like this, I write it in a form I'm comfortable with, and then I just look up the coefficients and see what they mean. An equation like this can initially seem quite intimidating, but don't be intimidated: it has a very standard form, with this term, the d/dx term, and this term, the d²/dx² term. That's the starting point. Well, there's no minus sign there, but there should be a minus sign here, so you put an implicit minus sign in your head. Second, there's no factor of two there, so that factor of two you also have to put in in your head. Once you do that, you realize this equation is equivalent to that one. Sorry, can you guys see if I point on this side? Maybe I'll stand here for the rest of the class. So: there's no minus sign there, and therefore there has to be a minus sign in the a term; there's no factor of two there, so you've got to put a factor of two in the b term. Then you just look at it and see that that's the recipe it's talking about. That's how you deal with a differential equation of the Fokker-Planck form: you use the dictionary and put it in a form you're comfortable with. Second thing: now that you look at this equation, it is intuitive. What is it for? It's the equation for the velocity of a particle, and many of you, in fact, in the talks that we
heard during the student presentations, have used equations of this form to model diffusion processes; that's how many of you are familiar with it. But for the purposes of the homework, if you're not familiar, I'm going to walk you through it. What does this equation say? It looks like a differential equation for the velocity of a particle. If you just look at the left term, it says dv/dt = -(gamma/m) v, or in other words, m dv/dt = -gamma v. That's all these two terms are. And what is that? That's just the behavior of a particle in a viscous medium; this is F = ma, with the drag as the force. Then there's this all-important term on the other side. That term I got by sticking that coefficient, with a factor of 2, under a square root sign, putting dt under the square root sign, and multiplying by a random number alpha, which I call r over here: a normally distributed random number with mean 0 and variance 1. That is somehow capturing the idea of diffusion. What I don't want you to worry about is how we chose the form of that crazy coefficient. It is what it is; that's what the process of Brownian velocity is. Just like in the case of chemical kinetics, the diffusion term is already given once the drift term is understood; in particular, it depends on the viscosity and the temperature, and we sort of derived this on day 1. If this had been a physics class, I would have spent many lectures going over the derivation and implications of this equation; since it's not, just take my word for it that that's the right quantity. That's the equation for the velocity. What's the equation for the position? The position is just the integral of the velocity: dx/dt = v, so delta x = v delta t. Very simple. So that's the setup. Are there any questions about the setup? No? Okay. Now, why
is it called the Ornstein-Uhlenbeck process? This is a stochastic process where, if you start off with some value of v as an initial condition, that value of v is going to be brought to 0; that's what this term does. So the velocity of a Brownian particle is different from the position of a Brownian particle. For the position, if you start off somewhere, you're not going to relax to some other position; you're going to fluctuate around where you started, with fluctuations that get bigger and bigger. Here, if you start off with a high velocity, the velocity is going to relax to 0. That's the difference. The little diffusion problem I solved on day 1 was just like a random walk, where the position gets kicked left and right; this is in some sense a more sophisticated account of diffusion, where first I calculate the velocity and then I integrate it to get the position, and this allows me to do calculations for particles that had a velocity to begin with. In the little Einstein solution I showed you on day 1, the particles had no velocity to begin with, on average. That's the history of this equation. For the purposes of this class, however, note there's a typo here: I guess when I wrote the angle bracket for the expectation value and then put equals, it came out as greater-than-or-equal-to. It's meant to be: the expectation value equals the thing on the right side. Sorry about the error. So what am I going to do? I'm suggesting that when you do this, you count the arbitrary units: there's a unit of mass, whether you measure it in grams or kilograms or whatever; there's a unit of time; and there's also a unit of length. I'm suggesting that you choose your units of length, time, and mass so that m over gamma is numerically equal to
1, which you're always allowed to do, and so that kT over m is equal to 1, which you're also always allowed to do; this is without loss of generality. Once you finish those substitutions, the equation becomes very simple: delta v = -v delta t + sqrt(2 delta t) times r. And now it says: simulate. That's what I'm going to do now. (Answering a question:) Yes, in this case it is differentiable; in this case the velocity is defined at all times. This is the more sophisticated treatment of diffusion, not the one we did on day one, where the position is not differentiable. Here the velocity is what we calculate; the velocity is not differentiable, but it exists at all times. Okay, fine. What programming environment are people familiar with? Python? Everybody? I'm old school, I never moved to Python, so I'm still stuck with MATLAB. Anyway, here goes. What we're going to do is just do the simulation, and we're going to define some parameters. The parameters are actually zero, but let's define them anyway; say dt is one parameter. What other parameter does it want me to keep track of? Let me move this to another screen. Okay, there we go. It's asking me to do this experiment where I take various initial values of the velocity and various values of dt; those are the only two things I want to change. So the initial velocity and dt are given, and now I just want to set up the difference equation and integrate it. It wants to go to a maximum time of 100; it's given me the maximum time. For your homework problem, I haven't given you the maximum time; you're going to figure out how much time it takes for the system to reach some sort of approximate steady state, and I will leave you to explore what the
answer is to that. Okay, so let's say tmax is 100, and t = 0. The implicit increment is that you increment time, and the explicit increments are given. We'll say v = vinit, just to make it symmetric. So what we're going to do is increment time and increment v: v becomes v plus this thing over here, so v = v - v*dt + sqrt(2*dt) times a random number; the first term is v times dt, with the minus sign. The random number has mean zero and variance one; in MATLAB that's called randn. So we did that, and now we want to keep track of v, so let's make a vector. How many steps do I want? I'll do nsteps, where nsteps = ceil(tmax/dt), and instead of a while statement I'll just say for step = 1 to nsteps. The reason I'm preallocating, vvec = zeros(nsteps, 1), is memory allocation, and I'm going to load v into vvec. Everybody see what's going on here? And tvec starts at zero; let's load that in here. That should work. Does anybody spot any errors in the code? It's that simple. Any errors? Let's run it. So I'm plotting time versus velocity, and there's the first example in this course of a squiggly jiggly curve. Of course you're getting a squiggly jiggly curve, because you're adding random numbers; what is reasonably heartening is that this curve has not run away with itself but stays reasonably flat, because we expect the velocity to stay near zero. Let's try starting with a high initial velocity, as the problem set has asked. Oh, I closed it. So there: you can't quite see it, but there's the initial velocity, and there's the velocity relaxing. I don't know if this is what it
should look like, but if there's no mistake, that's what it's going to look like. And here it is with a different initial velocity; let's see that. So here's a couple of curves, and the nice thing is that they have both relaxed to the same distribution eventually. Now the real question is what happens when you change your time step dt; that's the trick. It says time steps of 0.1 and 0.01, so I'm going to do a time step of 0.01 and see what happens. To the naked eye it looks sort of similar. How do you convince yourself? You're going to have to look at the distribution at the end and see whether it really looks similar for different values of the time step. I can make the time step even smaller and see what happens; it's looking pretty good. Now let me show you what happens if you make a mistake. What happens if, in this recipe, you had never taken a course on stochastic processes and you assumed the square root was a typo? If you assume that, the equation ends up looking like the following: delta x = a delta t + sqrt(b) times delta t times a random number, which is basically (a + sqrt(b) r) delta t. We can clearly do that and see what happens: take the dt out of the square root sign. Now what do people think is going to happen? Let me start off with my original time step and just run it. You see, for a given choice of dt, as long as you rescale b and all that, these two recipes are equivalent; it's only when dt changes that you run into problems. Okay, here's the system for a time step dt of 0.1. Now I'm going to go to a time step dt of 0.01. Any guesses what's going to happen? Nothing should happen, because the time step, in principle,
it's just going to get you closer and closer to the solution as it becomes smaller and smaller; that's the standard story of numerical integration. So what do people think is going to happen now that I've made the time step smaller? Any guesses? I'm going to hide the plot behind the screen here and reveal it. Hold on. Okay: it's not good. In fact, my guess was wrong; we were discussing yesterday which way it would go. That's not good. What happened here is that the simulation is not converging for different values of dt. In fact, if I take dt of 1, you can sort of guess what's going to happen. Whoa. So that's the very, very important numerical reason why the dt is under the square root sign: if you wanted to write it as a traditional Euler-type integration, then it's equivalent to saying that this coefficient has to be rescaled, so the diffusion coefficient essentially becomes scale-dependent; it depends on the step size of your numerical integration. So there you go; that was my little attempt at this homework. Any questions about this? In general, the lesson for getting these things working, whether it's a diffusion process or a Levy flight or other kinds of processes, is that the noise term needs to be scaled appropriately with your step of integration, and it's not always the same scaling. By the way, people often think about injecting noise; even if it's external noise, you're not allowed to just inject it and multiply by dt. It's simply not allowed unless you scale it properly. So let's look at the homework again. It says: do the velocities converge to a reproducible distribution? To answer that, you have to run the same thing a thousand times and see what the final state is. What are the RMS values? You have to do that too. Finally, as an optional question, I ask you to integrate the velocities to get the position. So let me put the dt back under the square root sign
and let me take v initial to be 0, and dt of 0.01, for example, and plot. Run it, and now I want to plot the position, which is just the cumulative sum of velocity times dt; that's just numerical integration. So what have I done here? I've taken the velocity curve I had earlier and integrated it to get the position curve. This position curve is diffusion: it's x as a function of time, and you see that it doesn't stick close to 0. You could do an interesting thing and run this again and again. Let's just plot the thing here. Oh, hold on, I didn't clear it; I didn't clear it. Okay. Isn't that fun? So this is diffusion, the position integrated using the second equation in the homework. And now if you were to take the histogram of the final positions, what would you get? A Gaussian, with some variance, and you can check how fast that variance increases. You can sort of see by eye that the spread is not increasing linearly; it grows quickly and then flattens out, increasing like the square root of time. You can use this to numerically work out what the diffusion coefficient is in your numerical experiments, and the answer will be D given by the standard Einstein expression in terms of kT and gamma. So why did we put that crazy term under the square root sign? We put it there because it has to give standard diffusion once you do the integration; that's how it was chosen. If you want to do that last bit, you're free to. Let me wrap up now. This is the homework you're going to have to do. We derived this equation yesterday; if you don't remember the derivation, I've put up new reading material where I derive this expression again; it's a paper I published some time ago. Here are the
parameters. Depending on the parameter values, this system has either one or two stable steady states. First, all I'm asking you to do is work out what those steady states are; that's where f equals g. We derived this equation yesterday; you're going to use it to plot what you predict to be the stochastic steady-state distribution. I want you to do the Langevin simulation, which is what we did just now, starting from various states until you reach steady state. I want you to do a Gillespie simulation; if you don't remember how to do that, ask somebody, I'm sure some people here have done it, and it's also in the reading material. And finally, this part is optional: it's about calculating the switching times between the two fixed points. Very good. One little piece of advice: the Langevin simulation is not exact, it's approximate, and so when it gets close to zero it's wrong; in particular, when it goes negative, it is absolutely wrong. Your simulation is going to take you to negative numbers, and you're going to have to put some sort of failsafe in your code so that when you go negative, you come back to zero. These are the kinds of little tricks you're going to have to use. Any questions? That's the homework, and it's 50% of the grade. If you don't manage by the end of the weekend, then we'll certainly sit with you and help you work out whatever difficulties there are. Questions? Zero and ten? Sorry, there's an error here: v at t equals zero is meant to be zero and ten; there's a little typo, so there are sort of six combinations. Okay, thank you. Yes?
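Since that failsafe is easy to get wrong, here's a minimal sketch of the Langevin recipe on a toy bistable birth-death system. The f, g, and every parameter value below are my own illustrative choices, not the actual homework system:

```python
import numpy as np

# Toy bistable birth-death system (illustrative parameters only):
#   synthesis   f(x) = a + b*x^2/(K^2 + x^2)
#   degradation g(x) = x
a, b, K = 5.0, 80.0, 40.0
def f(x): return a + b * x**2 / (K**2 + x**2)
def g(x): return x

# Deterministic steady states sit where f(x) = g(x): find sign changes.
xs = np.linspace(0.0, 100.0, 100_001)
h = f(xs) - g(xs)
crossings = xs[:-1][np.sign(h[:-1]) != np.sign(h[1:])]
print("fixed points near:", crossings)          # low, unstable, high

def langevin(x0, T=50.0, dt=0.01, seed=0):
    # Chemical Langevin recipe: dx = (f-g) dt + sqrt((f+g) dt) * r,
    # with a failsafe clamping negative copy numbers back to zero.
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    traj = np.empty(n + 1)
    traj[0] = x0
    for i in range(n):
        x = traj[i]
        step = (f(x) - g(x)) * dt \
             + np.sqrt((f(x) + g(x)) * dt) * rng.standard_normal()
        traj[i + 1] = max(x + step, 0.0)        # the failsafe
    return traj

traj_high = langevin(60.0)   # rattles around the high fixed point
traj_low = langevin(8.0)     # started low; may stochastically switch up
```

The `max(..., 0)` clamp is the failsafe mentioned above, and a trajectory started near the low fixed point can stochastically hop to the high state, which is exactly the switching-time question in the optional part.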