Apologies for that. Today, tomorrow, and on Friday we are going to leave the book completely, because everything I'm going to cover now is material that has mostly been discovered in the last 10 or 15 years, well after the book was written. Even the second edition of the book doesn't contain any of this, in part because I'm gradually going to move more into applications, things that are closer to actual drug discovery in the lab, and that deviates a bit from the physical aspects of the book. But before we do that, let's head into some of the discussion points. What's the difference between transition states and folding intermediates? I have a couple of slides about that, which I added this morning, but let's do the quick version here first. Sorry? And why? If you think about this in our good old friends, the free energy landscapes, where are they located? Yes, and which one? So the folding intermediate really is part of the landscape; the transition states, on the other hand, you could almost start to think of as hardly even being part of the phase space. They are, but the point is that you're going to spend so little time there that they're not really going to influence averages much. Remember those discussions we had, that you can calculate anything as the average over all possible states if we could just sample all of phase space, every single conformation the system has. Well, the easy way out is that there is no way we can do that in practice, so we tend not to bother too much about it. On the other hand, simulations today are so good that although you can't really sample all of phase space, you can sample the local part that you would sample under equilibrium conditions. And if you sample 99% of it, that's going to be a pretty good approximation. But those transition states, their energy is so high that in practice we're not going to spend a significant amount of time there. 
So when it comes to observables, they're not really going to matter. Even though, if I create some fancy experiment, there will of course at some point in time be a transition state: if I'm actually observing folding, the system must have gone through the transition state. But because I spend such an incredibly small fraction of the time in the transition state, it's not really going to influence any averages I see. So what were these Arrhenius and chevron plots that we talked about? They're similar and yet not. Right. And these apparent rates are what? The sum of the rates. And I know, I'm well aware that that sounds really stupid. But the point is, don't think of this as the sum of the rates; think of it as the effective rate. Because in some regime where we have little denaturant, we're going to be entirely dominated by the folding process. But there will, of course, be a small contribution from things unfolding too. Up here it might be one in a million, and you can ignore it. Here it might be 10%, and then we can't ignore it anymore. And then we're going to have some other regime where we're completely dominated by unfolding. And again, here less than 1% actually folds back, but there it might be 10%. So in both of these legs, the chevron plot really measures the apparent, effective rate of folding when we take into account that some of the reaction goes backwards. And then we have this strange regime in the middle where we have some sort of balance, right? Here you would have 50-50: stuff going just as much forward as backward. But there are, of course, things going in both directions all the time. And these are the rates we measure. So I would argue this middle part is hard to understand conceptually. 
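To make the "sum of the rates" point concrete, here is a minimal Python sketch of a two-state chevron. The slopes and intercepts are invented purely for illustration, not taken from any real protein; the only real content is that the observed relaxation rate is the sum of the folding and unfolding rates, so each arm dominates far from the midpoint.

```python
import math

# Hypothetical two-state chevron: ln(kf) falls and ln(ku) rises
# linearly with denaturant concentration c (made-up parameters).
def ln_kf(c):
    """Log folding rate arm (illustrative slope/intercept)."""
    return 5.0 - 1.5 * c

def ln_ku(c):
    """Log unfolding rate arm (illustrative slope/intercept)."""
    return -6.0 + 1.0 * c

def ln_k_obs(c):
    """Observed relaxation rate is the SUM of the two rates,
    so far from the midpoint one arm completely dominates."""
    return math.log(math.exp(ln_kf(c)) + math.exp(ln_ku(c)))

# At low denaturant folding dominates; at high denaturant unfolding does.
assert abs(ln_k_obs(0.0) - ln_kf(0.0)) < 0.01
assert abs(ln_k_obs(10.0) - ln_ku(10.0)) < 0.01
```

Plotting `ln_k_obs` against `c` gives the characteristic V shape: the minimum sits near where the two arms cross, which is roughly (but, as discussed below, not exactly) the midpoint of the folding transition.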
These parts, where it's just the sum of the rates, you should think of as the effective folding and unfolding rates. And as you say, the point is that this is so much easier to measure experimentally. No, no. Actually, that's a good question. So what happens here when you're exactly halfway? What does equilibrium mean? So you can certainly have an equilibrium. If you have an energy landscape that looks something like this, you're going to have, for instance, 99% of your population here and 1% there. Equilibrium doesn't mean that things don't happen anymore. It's very easy to think that equilibrium means that all processes have ceased. But equilibrium just means that the flow in one direction is the same as the flow in the other direction, so that on average we keep having roughly 1% here and roughly 99% here. So equilibrium doesn't mean that we have just as much folded as unfolded protein, but that the rates are the same, so that we no longer change the distribution. You can have equilibrium at any single point here. Under certain conditions, say 2 molar guanidinium hydrochloride, if I run this experiment at 2 molar guanidinium hydrochloride, I'm going to have an equilibrium. At 2 molar of guanidinium, actually, that should probably be 0.5 or so. Somewhere here, I'm going to have most of the protein folded and very little unfolded, but that's going to be an equilibrium. Under those conditions, I will reach that equilibrium after a while. I can move to, say, 10 molar guanidinium hydrochloride. That's going to be another equilibrium. Here I will have most of the protein unfolded and very little folded. So equilibrium means that if I add 0.5 molar guanidinium hydrochloride and then wait a while, in this case just a few minutes. 
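The point that equilibrium means equal flows, not frozen molecules, can be checked with a toy two-state system. The 99:1 rate constants below are made up to match the sketch on the board; the invariant is that at equilibrium the forward and backward fluxes are identical even though the populations are very lopsided.

```python
# Equilibrium does NOT mean nothing happens: it means the forward and
# backward fluxes balance, so the populations stop changing.
# Hypothetical rate constants under some fixed guanidinium concentration:
kf, ku = 99.0, 1.0             # folding and unfolding rates, 1/s

# Stationary two-state populations follow from kf * p_U = ku * p_N.
p_folded = kf / (kf + ku)      # 0.99
p_unfolded = ku / (kf + ku)    # 0.01

flux_fold = kf * p_unfolded    # molecules folding per unit time
flux_unfold = ku * p_folded    # molecules unfolding per unit time

assert abs(flux_fold - flux_unfold) < 1e-12   # equal flows...
assert abs(p_folded - 0.99) < 1e-12           # ...with 99:1 populations
```

Changing the denaturant concentration changes `kf` and `ku`, and with them the stationary ratio: that is the "new equilibrium" reached after waiting a while.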
After a while, you will be at the point where, say, 10% of your protein is unfolded, and then you have a new equilibrium. Exactly. So what happens here? This part I think is easier to interpret than an Arrhenius plot, because in an Arrhenius plot you have these two separate curves. At that crossover point in an Arrhenius plot, you can specifically say that the reaction rates, the number of molecules per unit time that fold versus the number that unfold, are exactly the same. That you can't quite say here, because this is the sum of them. It's going to be close to this point, but it might not be exactly that point. So what did you use that for? Sorry, we'll come back to that later. This has come up twice, how the enthalpy and entropy vary during folding. I won't bother you with it again, but make sure that you understand it. Read the book or the last lectures, or ask me otherwise. So what's the use of these apparent folding rates, the chevron plots in particular? Let me erase most of the stuff here. So how do you understand the transition from that? The point is that a single chevron plot doesn't really tell you anything at all. It just tells you that at very high concentrations of guanidinium hydrochloride, things will unfold. We don't need a chevron plot to tell us that. The interesting thing is when you start seeing these patterns: you have lots of curves, for some mutations here, and for some other mutations, I have no idea where they would go, maybe in that direction. By looking at the relative change of these curves, how they have moved, you basically extrapolate these lines. You see first how much the left line has moved and then how much the right line has moved. You can use the logarithm of k to extract the differences in reaction rate, which you can translate into free energies. And there were two free energies we were interested in. 
One of them: how much did we change the free energy of the transition state? But the problem is that I can't directly say that the transition state became five kilocalories per mole lower. I can only say how much the transition state changed relative to the unfolded state. So I can say how much that barrier changed, but that barrier might have changed either because the transition state became lower or because the unfolded state became higher. That's why I need to measure two things. First, how much did the barrier to the transition state change? And second, how much did the folded versus the unfolded state change? Then I can say: did I influence the transition state, or did I change the stability of the protein as a whole? And when I do that, I can determine a phi (Φ) value for each mutation. So each of these curves would be one point mutation, one amino acid. And in general, you might want to try five or ten different amino acids, so this can easily be a lot of curves. You might even want to try several different mutations for some of the amino acids. So for each such mutation, say at residue 47, you would get one phi value. That would tell you: if we start to change things at residue 47, do I really influence the transition state? If that value is close to one, then that residue participates completely in the transition state; then it's part of the folding core of the protein. And conversely, if the phi value is close to zero, it's not really going to participate at all. Phi values can be negative too, but I won't bother you with that. So how do you use those phi values? You can certainly use them to understand what the transition state looks like. Could you do something more practical with them? If you would like to change how fast a protein folds, which residues should you go after? The residues that participate in the transition state, right? Otherwise it's not going to help you. Well, it depends. You might only be interested in stabilizing your protein. 
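A hedged sketch of how a phi value would come out of such measurements. The rate constants below are hypothetical, and a real analysis fits whole chevron arms rather than single rates; the sketch only shows the ratio of the two free energy changes just described: the change in the unfolded-to-transition-state barrier over the change in overall stability.

```python
import math

R = 1.987e-3   # gas constant, kcal/(mol K)
T = 298.0      # temperature, K

def phi_value(kf_wt, kf_mut, ddG_eq):
    """Phi-value from (hypothetical) folding rates and stability change.

    kf_wt, kf_mut: folding rate constants of wild type and mutant.
    ddG_eq: change in folded-vs-unfolded stability on mutation (kcal/mol).
    Phi ~ 1: residue fully structured in the transition state (folding core);
    Phi ~ 0: residue not involved in the transition state at all."""
    ddG_ts = R * T * math.log(kf_wt / kf_mut)  # change in U -> TS barrier
    return ddG_ts / ddG_eq

# A mutation that slows folding 100-fold and destabilizes the protein by
# exactly the same free energy gives phi = 1 (residue in the folding core).
ddG = R * T * math.log(100.0)
assert abs(phi_value(1000.0, 10.0, ddG) - 1.0) < 1e-9

# A mutation that destabilizes without slowing folding gives phi = 0.
assert phi_value(1000.0, 1000.0, 1.0) == 0.0
```

The design choice here is the one made in the lecture: barriers are only ever measured relative to the unfolded state, which is why both the kinetic and the equilibrium measurement are needed before phi can be assigned.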
And then you can go after all residues. But in particular, the residues that participate in your transition state are going to be important for folding. And if you want to speed up the folding of the protein, or speed up, actually, there is nothing here that's specific to folding. You could do this for any process, say a binding process, where you would like to find out which residues really participate first in the process. Those are the residues to go after. So we used that, and in particular this nucleation-condensation model, to talk about how enthalpy and entropy balance; let's start with number nine here. What do we mean by this concept of enthalpy-entropy balance? Why is that important? And why do they have to be balanced? So why would folding not take place otherwise? This is a good example: whenever you get a question like this, and I could continue for half an hour, the key thing, which is usually very efficient at least in your mind, is to reverse the question. We're saying that we need enthalpy and entropy to be balanced. Well, if they're not balanced, there are two things that can happen: either the change in enthalpy is much larger than the change in entropy, or conversely, the entropy change is much larger than the enthalpy change. And it's very useful to do a Gedanken experiment on what would happen if one of these were the case. So what would happen if we had a much larger change in enthalpy than in entropy during folding? It would fold to what? At some point, we're going to need the entropy to go down, right? If there was just an enthalpy drop, it would be straight downhill, so you would very quickly get to some sort of collapsed state, but it would not necessarily be uniquely defined, because that's where you need the packing and the entropy. 
And if we get all the enthalpy, if that drops very quickly, the problem is that we just used up the entire gain. After that, we would only pay, and that won't happen, because we won't go uphill at the end. And conversely, if we just get this enormous drop in entropy very early, then the problem is that we would have a gigantic search problem, and that would lead to an astronomically high barrier in the free energy. So the problem is that you can't have all the gain initially and then go uphill, but you can't just go uphill initially either and then hope to go downhill after that, because then the free energy barrier would be astronomically high. So we need them to be balanced. And that is part of this nucleation-condensation model, and also of Levinthal's paradox. Do any of you want to have a go at trying to explain it? Why is Levinthal's paradox not a paradox? Well, it is a paradox; how do we solve it? Assume that enthalpy and entropy are balanced, and the dependence on the number of residues gets cancelled. Right. So I kind of fooled you early on when I showed those plots about Levinthal's paradox, because I only considered the entropy part. I only said that the searching is going to be so amazingly difficult. But we're not just going to have the entropy; we're going to have the enthalpy driving things in the right direction. So the effective part is going to be the second-order terms, which are much smaller. This is, in general, a very useful way to solve problems too. Cherish paradoxes, because they're great. Once you have a paradox, once you have something that you can't explain, then you've identified that there is a problem with your current models: there's something here that doesn't work. That's usually not impossible to solve. But you can't really begin to solve a problem until you've identified the problem. 
So finding a paradox in something is usually a great way to write a good paper, or a PhD thesis, later. And then I spoke a little bit about these network models for folding. What is different about the network models for folding compared to the stuff that both the book and I have talked about before? Number 10 again. Okay, let's say Levinthal's paradox. Explain it. Don't worry, we'll take it step by step. So what are the parts? Let us go through it. First, what was Levinthal's paradox? What's the paradox part? It's important to remember that a paradox is not necessarily a problem; a paradox is an apparent problem. Of course, Levinthal himself knew that this can't be true, and that's why it's a paradox. The simple model, that proteins would somehow need to go through every single possible conformation, leads to absurd consequences, and we know that those consequences can't be true. We can't explain it initially, but that's the paradox we have identified: there are two irreconcilable parts here. So what did this nucleation-condensation model say? Sorry. And number nine here has kind of helped us identify part of it, because in the original slides with Levinthal's paradox I said that there were a handful, say three to five, states for each residue. Then the number of states we would need to go through would be three to the power of n, and anything that grows exponentially will grow super fast. You might remember that table I had earlier in the course. Once n here gets to 100 or 200, this is going to be more than the number of stars in the universe, probably more than the number of atoms in the universe. But this is just the entropy. Entropy has to do with the searching, right? So this is just the entropy part saying that the search space here is tremendous. 
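The Levinthal-style counting is easy to reproduce. The numbers below (150 residues, three states per residue, one conformation sampled per picosecond) are the usual illustrative back-of-the-envelope choices, not measurements:

```python
# Levinthal-style counting: ~3 states per residue gives 3**n conformations.
# Even sampling one conformation per picosecond, a 150-residue chain
# could never search them all exhaustively.
n = 150
states = 3 ** n              # ~1e71 conformations

sample_rate = 1e12           # conformations per second (very generous)
seconds_per_year = 3.15e7
years = states / sample_rate / seconds_per_year

# Absurdly longer than the age of the universe (~1.4e10 years):
assert years > 1e40
```

That absurdity is exactly the paradox part; the resolution, as argued next, is that this counting ignores the enthalpy that guides the search.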
So the amount of freedom we would lose, or the amount of searching we would need to do to find the native state, is so astronomically high that it's absurd. But that only accounts for the entropy. And the rescue, the enthalpy, contains many parts. Say that you start to fold a helix. Once you've formed a couple of turns, the number of ways the next residue can sit is going to be highly limited, right? In practice, the hydrogen bonds: if you add one more residue here, the existing hydrogen bonds and everything are going to guide this residue a lot. But all those terms are what? They're mostly enthalpy. The enthalpy will drop, and the enthalpy will guide, or drive, this amino acid to find a good position. So we're not at all going to have every new amino acid in the helix explore all of conformational space by itself. It will be guided by its neighbors and their interactions. So in practice, this view of thinking only about the entropy is of course flawed. It's not just entropy; we have the enthalpy going down, which means that we won't have to search that space at all. And you also said before, and this is kind of questions number four and five: we know that as you're folding, from unfolded to native, the enthalpy goes down and the entropy goes down too. So both of them go down. Of course, the entropy will be multiplied by temperature, but whatever; first approximation. So to first approximation, these two large contributions will cancel each other. And I showed that the enthalpy is roughly proportional to the number of residues. And you can show, it's not too hard, but we didn't do it, that the entropy is also proportional to the number of residues, to first approximation. But that means that the first-order terms will cancel each other. And it doesn't matter how large they are if they roughly cancel each other, right? 
So in the black line, one of these would be enthalpy and the other one entropy. What happens during folding is that both of them go down. And that means that the large contribution in Levinthal's paradox will be cancelled by favorable enthalpy. But that would effectively say that the folding barrier is zero, and that's of course also not the case. So what this nucleation-condensation model did was two things. First, it showed that in this model we have an enthalpy and an entropy that are roughly proportional to the number of residues. That's the first part: we show that the first-order terms cancel. And then, with a little bit of hand waving, I argued that the number of residues is proportional to the volume. But very early on, the number of interactions in the native state will be more proportional to the area, when you just have a very small core, and that goes roughly as the number of residues raised to two-thirds. Area goes as volume to the power two-thirds: r squared instead of r cubed. So the nucleation-condensation model does two things: it shows that the first-order terms roughly cancel, and it shows that there is a second-order term that's a bit smaller. And although it appears to be just a bit smaller, because these terms show up in the exponent, an exponential of the number of residues raised to the power two-thirds will grow much, much slower than an exponential of the number of residues. Plot e raised to the power of n versus e raised to the power of n to the two-thirds in a program or something; it's kind of amazing how different they are. The book goes through that in quite a lot of detail. We can talk more about it later in the Q&A session. But then I brought up these network models mostly to show that science makes progress. 
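Taking up the suggestion to plot this in a program, here is a tiny sketch comparing the two exponents. Nothing here is protein-specific; it only illustrates how much slower exp(n^(2/3)) grows than exp(n):

```python
import math

# Compare the growth of exp(n) with exp(n**(2/3)): if the first-order
# (extensive) terms cancel, only a surface-like n**(2/3) term remains
# in the exponent, and the barrier grows enormously more slowly.
for n in (50, 100, 200):
    full = n                    # exponent if nothing cancelled
    surface = n ** (2 / 3)      # exponent of the residual barrier term
    gap_log10 = (full - surface) / math.log(10)
    print(f"n={n:3d}: exp(n) exceeds exp(n^(2/3)) by ~{gap_log10:.0f} orders of magnitude")

# Even at n = 200 the residual exponent stays modest (~34),
# while the naive exponent is 200.
assert 200 ** (2 / 3) < 35
```

So the "second-order" barrier still grows with chain length, but nowhere near the astronomical rate implied by the naive Levinthal counting.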
Some of the very simple things we've shown in the book aren't quite true, or at least they are true but they are simplifications. So what's the difference with these modern network models for folding? Right, we have multiple folding paths. So think of this as a spider's web, or as traffic: it's not that all traffic in Stockholm would stop if you closed one street; it's always possible to get around it. I will have some of that in the next two slides. So when are proteins thermodynamically stable versus kinetically stable? That was one of the conclusions we reached. I'm well aware that that's deliberately a bit of a fuzzy question, but let me lead you toward the answer I'm looking for. First, are proteins thermodynamically stable? Oh, different answers. Good. So why would they be thermodynamically stable? Who says they're thermodynamically stable? Anfinsen. Was he right or wrong? I would say he's mostly right, still today. There are exceptions, there are always exceptions in biology, but from everything we can see, for small proteins he was mostly right. So are proteins then just thermodynamically stable and not kinetically stable? First, what would we mean by a protein being kinetically stable? That's a good first step. So that's one aspect: it has to fold fast enough. Can you think of some other aspect? Remember when we spoke about membrane proteins? So I'll draw you two proteins. Here's an unfolded state, and then we have some sort of barrier. Is that protein stable, if that is the native state and that is the unfolded state? That's stable, right? Good. Let me draw a second protein here. Is that protein stable? But what if this free energy barrier is 100 kcal/mol? Well, no, the problem is that I'm fooling you. What type of stability am I talking about? That's where the apparent paradox lies. 
So the first one, what type of stability is that? Thermodynamic stability. And this second one is definitely not thermodynamically stable. You could argue that it's kinetically stable over the relevant time scales. On the other hand, you could also argue that that's kind of irrelevant, because it's also 100 kcal/mol here, so this protein will never fold in the first place. The problem is that if these energy barriers are absurdly high, you could argue, yes, they're going to be kinetically stable once they're folded, but you can't fold them. So in practice, when we say kinetic stability for normal globular proteins, we want them to be both thermodynamically stable and kinetically accessible, at least in the sense that they have to be able to fold in realistic times. So is the green plot here completely irrelevant? Could you imagine any protein having that type of stability, like the green plot? Maybe 100 kcal/mol is a bit excessive; let's say 50. Could you ever imagine that to be true for any protein? Sorry. So what happens with membrane proteins? First, this is research; I'm not saying that we know this. I and some other people believe this. What if you had a catalyst? Imagine that you had a catalyst. What can catalysts, or enzymes, do? They can lower a barrier, right? So what if I had a catalyst that takes the unfolded state and reduces my barrier, so I can fold here and reach the native state? And then the catalyst lets go of it again. Then I have somehow cheated. And I'm allowed to cheat, because the path can't matter. I can get to the state, and when the catalyst lets go of the protein again, the protein will not be able to unfold, because it would need to go over this very high barrier. This is basically what the translocon does. 
Because if you take a helix and try to push it straight into a membrane, you would have a gigantic barrier, because you would need to expose the unpaired hydrogen bonds and charges at the ends of the helix to the hydrophobic interior. But because we're inserting this through the translocon and then pushing it out to the side, the translocon effectively helps you go along the dashed black path here. Once you're inside the membrane, you can actually survive there, and we know that there are some helices that are marginally stable, that even appear to be a bit hydrophilic. But to actually get such a helix to slide out of the membrane, you would need to go straight through the membrane, and that would be a very high barrier. So nature is pretty good at thermodynamics. Sorry, are you talking about transition states, or about here? Actually, this is a very important question. This would actually appear to violate the Boltzmann distribution. Because if you have your membrane proteins here, if I start with 100, based on the Boltzmann distribution, where would you expect to have the highest population, there or there in the green curve? You would expect that population to be higher. But if you look inside yourselves, you're not going to have most membrane proteins outside the membrane; you're going to have most of your membrane proteins there. So we violate the Boltzmann distribution. How can you violate the Boltzmann distribution? That shouldn't be allowed. Well, yes, at some point we're adding energy when we're inserting things, but then we leave them. Once the membrane protein is inserted, it doesn't matter that the way you got here historically was that the cell used energy with the ribosome and the translocon. Once you're here, it shouldn't matter how you got there, right? And it doesn't. So what would happen if you waited a very long time? Exactly, right? We are just not yet at timescales where we have reached equilibrium. 
So you can't violate the Boltzmann distribution. If we wait an infinite amount of time, they will eventually cross this barrier. But an infinite amount of time is very long, far longer than you live. So thermodynamics is only strictly true if you wait an infinite amount of time. In biology, that's kind of irrelevant, because we have an absolute timescale that is roughly the lifetime of an organism. And if something is stable for a sizable part of the lifetime of an organism, that's the only stability that matters. Yes, we do spend energy when we're pushing things out through the ribosome, but that's not really energy we need to spend here. In general, you don't necessarily need to spend energy with a catalyst, because what the catalyst allows you to do is find a different path through here. The way this works in your body is that we use some energy to bind the ribosome to the translocon and everything. But in pure physics, a catalyst doesn't necessarily need to use energy; a catalyst opens a different pathway for you to get there. So you could, yes. And I'm not sure which way you think about it, but one way to think about it is that normally, during the normal lifetime of your body, some fraction of proteins will misfold, right? But this fraction is going to be so small that it's not really going to influence the way your brain works and everything, at least not until you're old enough to have had time to reproduce. I'm sorry, but that's the way nature cares about it. Offspring is important. Once you've had offspring, diseases that happen later won't really influence your ability to propagate the species. 
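The "stable for the lifetime of an organism" argument can be made quantitative with a rough Arrhenius estimate. The attempt prefactor below is an assumed order of magnitude, not a measured value; the point is only how dramatically the escape time grows with barrier height.

```python
import math

R, T = 1.987e-3, 298.0   # gas constant (kcal/(mol K)) and temperature (K)
k0 = 1e6                 # assumed attempt prefactor, 1/s (order of magnitude only)

def escape_time_years(barrier_kcal):
    """Rough Arrhenius estimate of how long a state survives behind a barrier."""
    k = k0 * math.exp(-barrier_kcal / (R * T))  # escape rate, 1/s
    return 1.0 / k / 3.15e7                     # mean lifetime in years

# ~20 kcal/mol: kinetically stable for years -- biologically meaningful.
assert 1.0 < escape_time_years(20.0) < 1e3

# ~50 kcal/mol: effectively stable forever -- but, as argued above,
# such a state is also unreachable without a catalyst like the translocon.
assert escape_time_years(50.0) > 1e20
```

So "waiting an infinite amount of time" for equilibrium is formally correct but irrelevant on these timescales, which is exactly the loophole membrane proteins exploit.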
But what likely happens is that, for instance in bovine spongiform encephalopathy and related diseases, or if you're eating brains from other animals, these plaques that you grow create a breeding ground, where you now get a lower free energy path to grow the plaques more and more. So normally the plaques would grow very slowly, but under some conditions the plaques start to grow faster. So effectively it's the same thing. Normally we think of this lowering of a barrier as good, but in the case of plaques, it is a bad pathway, because it happens sooner than we would like it to happen. Was that roughly what your thought was? Yeah, but here it's the opposite. For protein misfolding, you typically have something like this: that's the unfolded state, that's the native state (and sorry, I should probably have a much higher barrier there), and this is the misfolded state. The native state is normally kinetically stable for decades. But under some conditions it becomes easier to cross that barrier, and then you end up in the misfolded state. So for these protein misfolding diseases, you actually have a misfolded state, the prions for instance, that has even lower free energy than the native state. That's great; wait two slides for that. I updated some slides this morning. The last question is: what's the role of transition states in folding? They determine the folding rates, yeah. So I had some yellow dashed curves there. My idea was to use them for you to think about some different pathways, but the problem with those curves is that they are one-dimensional, and that's just stupid. 
So I've drawn this two-dimensionally here instead. Sorry, I thought about that after printing the slides yesterday evening. So these are some folding pathways, and I'm going to have three slides of them, I think. For all of these, the zero here, the unfolded state or the coil, is all the way on the left, and this is the folded state. For all of them the unfolded state has the same energy, say 0 kcal/mol or whatever, and the folded state has the same energy, but the pathways are different. The arrows here mean these are the ways the system can go, and my idea is that we should start to think a little about what these pathways mean. For all these intermediate states, in this case you would need to go through the yellow path here, and that would be some sort of off-pathway intermediate, and that's another off-pathway intermediate, and the numbers here are the rough free energies of those states. So compare the yellow curve on the top with the blue one on the bottom. Which one would fold faster, and why? How many of you think yellow? Raise your hand. How many of you think blue? That's good. We have a little bit of time; let's do a peer exercise on this. Turn to the person next to you, ideally somebody who doesn't think the same as you do. Spend one minute and try to convince the person next to you that you're right. Sorry? Which one of these folds faster, from left to right? I'll try it again. So how many of you think that the yellow one folds faster? And how many of you think that the blue one folds faster? That's kind of amazing. So now we almost have a consensus that, sorry, the yellow one is faster. So what's the difference between them? It's this state, right? So what would happen in one of these states? Anybody want to have a go at explaining it? So what's going to happen here, right? 
You start there, and then you're going to have some sort of barrier. This would be 0.9, and then we need to continue up to 1.2, but these barriers are not astronomically high. Then we start to go downhill to 0.5, that's uphill again, and then you drop down to minus 7.2. What's going to happen here is that things proceed roughly the same way until you get here, right? But suddenly it's going to be really good to go there instead. And then you're stuck here. And now you're going to have a barrier of 4.6 to go back up; you're going to be paying a lot to cross that barrier. So now you have to wait until you can cross this almost 5 kcal barrier until you're back. You're going to have much, much slower folding because you have a stable intermediate state. Stable states instinctively feel good, but here they're really destroying things for us. Sorry, when we talk about folding, we mean reaching the native state, right? All these others are intermediates; we need to get to the goal here. Look at the yellow one again; I actually made a copy of the yellow one on each slide. And then we'll change that a bit and assume that we have to go through that intermediate state. Now we have two things that are a bit different again. So which one of these will fold faster? And why will the second one fold faster? Because this one keeps going downhill, right? You hardly ever have any uphill barriers. So again, this would fold almost instantaneously. No, actually, you do have a small barrier; the largest barrier here is 0.9 kcal. So you have a very, very small barrier, and then it's just downhill. Go ahead. We can take both of you. In this case, I've more or less drawn everything, right? So that's minus 3.4, and then you go uphill to 1.2. The high points here would be transition states. Here, these would be all the states. 
Yeah, no, think of this as having all the states; it's just a handful of them, and that includes the transition states. You had a question too? Yeah, but then again, then you're somehow assuming that there should be some sort of barrier here or something in between, right? Here, even if you go backward, you would instantly start to go forward again. So again, this is highly simplified, but there is nothing in between here that I'm hiding from you. So yes, from minus 4.1 you can certainly, this is not a gigantic barrier, so you might go up a bit here. You're not going to go up there, because that's a very large difference, and the Boltzmann distribution will make sure that we have equilibrium. So most things will keep flowing forward. Right, yeah. I could have added some more small barriers here. But the point is, if you have lots of small barriers, each barrier will slow you down a bit. So these are two more. In this case, we're not comparing them. So what would happen in the top case here? Well, all the low states here have negative free energy, right? So all of these are good, but you're at minus 7.1 already here. Right. So the problem is, you're not really going to gain that free energy. Both these states are going to be really well populated, and you also have a reasonably high barrier between them. So you're not really going to gain a whole lot by going over that barrier. So here, I would suspect you would likely have a sizable fraction of your population stuck here. If this barrier then started to be, say, plus 2.0 or something, the barrier would become so large that this might even become your native state. And that could then be a misfolded state. And then the final example is that you could end up with something like this. What would happen here? Yes, it would fold reasonably fast, but could you say something else?
So what's different? I would argue this is a much more realistic example than the others. But what's the difference here? You have multiple pathways, right? So which pathway will you take? We start at 0. And where should I go? 0.7. I agree. And where should we go after 0.7? Minus 1.2. And then minus 0.9. That's not going to be the only pathway, but it will be the dominant one, so most of the flow is going to be along that. But all of the pathways will participate, and I'm not going to try to draw all the other ones. This is not just important for protein folding; it's going to be important for drug design and binding and everything we will talk about today. Don't assume that there is just one pathway. Anytime you're designing a drug, it can enter or exit the binding site in multiple ways. It can possibly bind in multiple ways. But the point is, even if you know the binding pose exactly, the way that you enter or exit the site can be important. Yep. For example, you'd say most of the time, I mean, not most of the time, but you would be found in the minus 7.1, because it's too high a barrier to cross to get to the minus 7.2. I know. Sorry, there are two parts here. Remember, thermodynamics versus kinetics. What does thermodynamics tell you? Well, no. Thermodynamics is the Boltzmann distribution, right? They're very similar in energy, and both of them are significantly lower than all the other states. The Boltzmann distribution doesn't say that 100% will be here and 0% will be there. The Boltzmann probabilities are maybe 55-45. Because the energy difference is very small, you're going to have a sizable population here too. The other part is kinetics. And the kinetics is related to what you just said, the barrier between them, right? And if the barrier between them starts to be too high, what happens? Exactly. If the barrier starts to be too high, you will have a sizable population here.
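That 55-45 estimate can be checked directly with the Boltzmann distribution. A minimal sketch I'm adding here, assuming room temperature (RT of about 0.593 kcal/mol) and the two free energies from the slide:

```python
import math

RT = 0.593  # kcal/mol at ~298 K (assumed room temperature)

def boltzmann_populations(energies_kcal):
    """Equilibrium populations from free energies via the Boltzmann distribution."""
    weights = [math.exp(-e / RT) for e in energies_kcal]
    z = sum(weights)  # partition function over these states
    return [w / z for w in weights]

# The two deep minima from the slide: minus 7.2 (native) and minus 7.1 (the trap)
p_native, p_trap = boltzmann_populations([-7.2, -7.1])
print(f"native: {p_native:.0%}, trap: {p_trap:.0%}")
```

A 0.1 kcal/mol difference gives roughly 54% versus 46%, so both minima really are substantially populated at equilibrium, just as argued above.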
And under some conditions, what evolution might very well do is simply accept this and gradually optimize this to be the native state. Because if this is a protein that always folds here and stays here for an average of 100 years, can't we just use this as a protein? And your body has done that occasionally. But unfortunately, under some conditions, you can actually cross that barrier and get here. And then your protein (a) stops working, and (b) some bad things might happen due to this state. And that's where you get prions and things like that. How do you mean, do you mean if both of these would be transition states? Sure, in general you can have many transition states; they are just high-energy states between other states, right? So if you're moving from that state to that state, that is a transition state. And then things start to get a bit blurry, because what do we mean by folding? Well, normally, for the stuff we've done this far in the course, we said that folding is the way you get to the native state. But what is the native state? Well, the biologically active one, right? If we think that this is the biologically active state, then I would say yes, that's the transition state. But you certainly have another transition state in this case, from the biologically active state to the really lowest one. So there's nothing fundamentally or specially biological about a transition state. A transition state is just a high-energy state between two low-energy states. No, you're right. Sorry, my bad, it's lower than the 3.4. That's a transition state. Yep. So this is the problematic part. If we took all of these and added 50 kcal per mole, it wouldn't work, right? Because then you would need to spend all this energy to fold them. So we also need the thermodynamic stability. Your body can help things a little bit, and then we can rely on things not crossing barriers.
But it would take way too much energy if the body actively folded each protein into some sort of merely kinetically stable state; it would be way too costly. So your body relies on the fact that most proteins tend to fold entirely by themselves. And then in some specific cases, we might let them get a little bit of extra help from kinetic stability. And of course, proteins that end up with too-high barriers, so that they would have very high-energy transition states, natural selection would select against them, because those organisms wouldn't survive. So the point is you need both thermodynamic and kinetic stability for most proteins. Yep. So then it gets complicated, but you can solve this mathematically. It's essentially like when you're connecting resistors in parallel versus in series. So two small barriers of 5 kcal each will take a bit longer to cross than one 5 kcal barrier, but it's not going to take as long as a 10 kcal barrier. In the chapter where the book talks about this, it actually goes through all the math in detail; you can solve this exactly. The reason we don't do that is that you don't really have just two barriers either, right? You probably have 50 in a cell or something, so it gets too complicated. If I had to calculate it today, I would use a computer. So that depends on the exact energies, of course, but in general it can't be a whole lot higher. Think of it this way: crossing two barriers of 5 kcal would probably be in the ballpark of crossing one barrier of maybe 7 or so. Well, remember, you can always approximate things, right? Remember the simulations I talked about when we had folded the small BBA5 in a computer. Why did we end up with those lag times? That's because our model sucked. Our model tried to approximate something as a single-step process.
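The resistors-in-series picture can be sketched numerically: for barriers crossed one after another, the mean waiting times add, so the effective rate obeys 1/k_eff = sum of 1/k_i. This is my own crude illustration, assuming identical, unknown prefactors for all barriers (which drop out of the ratios):

```python
import math

RT = 0.593  # kcal/mol at ~298 K (assumed room temperature)

def rate(barrier_kcal):
    """Arrhenius-style rate, common prefactor dropped."""
    return math.exp(-barrier_kcal / RT)

def effective_rate(barriers):
    """Barriers in series: waiting times add, like resistors in series."""
    return 1.0 / sum(1.0 / rate(b) for b in barriers)

t_two_small = 1.0 / effective_rate([5.0, 5.0])  # two 5 kcal barriers in a row
t_one_small = 1.0 / rate(5.0)                   # a single 5 kcal barrier
t_one_big = 1.0 / rate(10.0)                    # a single 10 kcal barrier

print(t_two_small / t_one_small)  # twice as slow as one 5 kcal barrier
print(t_one_big / t_two_small)    # one 10 kcal barrier is far slower still
```

In this simplest model, two 5 kcal barriers are only twice as slow as one, while a single 10 kcal barrier is thousands of times slower, which is the qualitative point above: many small barriers slow you down a bit, one tall barrier slows you down enormously.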
While it was really a multi-step process: we had multiple intermediate states that we didn't account for. In many cases, it still works to approximate something with a much simpler model. And typically, when you're applying models, the reason you use models is not to have an exact description of the world. If you want an exact description of the world, just go into the lab and measure anything you want. The idea of having models is to make them super simple so that they can aid your understanding. But you should be aware that there could be, say, two pathways. If you're talking about drug binding, you will certainly have more than two pathways, right? But it's only going to be the two or three lowest free energy ones that are important. I think it's going to be exceptionally rare that you have more than two important pathways, but it can be more than one. That took a bit, but I think that's great; spending time answering your questions is about the most important thing I can do. So that's why I wanted to take some time. Today, I'm going to talk about drug design and a little bit about how you use both computational and experimental efforts in modern drug design. For better or worse, this is all changing right now in industry. We're undergoing a revolution, and we so need that revolution. So, hopefully, based on the bioinformatics course, you're pretty good at understanding that we can predict structure from sequence using homology. And you can use a force-field-based method to, say, build side chains or something. You might even be able to simulate a protein folding. We can do that because we're interested in it. In general, we don't want to simulate the folding of a protein in drug design, because you're not going to trust that structure. Rather, we would typically start from an experimental structure.
We can energy-minimize that, and we can simulate models of the structure to understand how it behaves. This is super important if you're into biophysics or biochemistry, if you love proteins, if you want to understand how proteins work. But if you're in industry, the one thing you care more about than anything else is understanding how small molecules bind to your proteins to change how they work. And the reason for this is that this is pretty much how your body is using proteins. Most signaling in your body happens by something binding to a protein and causing the protein to do something. And of course, if you want to either intercept or amplify the signaling, change it in any way possible, we somehow need to act with the tools that the body has given us. So this is, of course, a much smaller problem than understanding all the motions and all the folding of a protein in general. If you're in a pharmaceutical company, you don't necessarily care about protein folding. We just want to understand, maybe there is some important structural transition when you're binding the ligand, and that we might have to understand. But we want to understand as little as possible, and we want to do as little computation as possible to reach our goal. So what's your goal? Wrong answer. Any other choice? What's your goal in pharma? Not academia. Yes. Your goal is to make money. And this sounds horrible, but the point is this is where all your retirement funds are going. So if you at some point would like to retire, you want those companies to make money. You can certainly say that in academia, shouldn't we aspire to higher values? You certainly can, right? But that's the question. Do you want to starve when you're 65? So this sounds horrible, but again, what's better, making money from designing drugs and curing cancer, or making money from designing weapons? I prefer the people who make money by curing diseases.
The reason I say that is that it's going to reflect on some of the things we'll go through; it's not horrible or anything, but the choices we make in pharma, we make them because companies need to make money. So before we get there, we have to understand a little bit of what molecules do. The general nomenclature is that you have some sort of drug, a small molecule, and you have a protein. It can occasionally be DNA, but it's almost always proteins. And this is what you call a target. It's a target for the drug, but it's just a receptor or something. And then, if you're lucky, your drug should bind your target. If it doesn't, it's not really working at all, right? So somehow they need to bind each other. And then this should elicit a biological response. Something should happen. And it's not really more complicated than that. Everything in pharma works this way. The problem is, of course, you want the right thing to happen and not the wrong thing. They should bind reasonably efficiently and everything. So this gets more complicated. The way this works has been a revolution. Remember the very first part of the course? It's so easy, I think, to take this for granted. Remember, I think it was lecture two when I showed you Cyrus Levinthal's first attempt at visualizing molecules on a computer screen? That was 40 years ago. 40 years might seem like a very long time. I know that it's longer than you have lived, but in the history of science, it's nothing. And by now, you think it's completely obvious that we can visualize molecules, we can draw ligands in them, right? This is relatively new in the history of science. The book you have was written roughly 20 years ago. So this is roughly twice as old as that book is.
So today, we have a ton of structures in the Protein Data Bank, and you can even simulate them. If you put a ligand close to the binding site, you will actually see how the ligand gradually binds in this sort of deep pocket we visualized in the protein. There are lots of other examples where we even have protein structures with ligands bound, and then you can see, here there is probably a small helical part or something, and then a small molecule inside there that binds in some deep pocket. So most of these sites tend to be some sort of deep pockets in the protein; they're well defined, and they're typically hydrophobic. In a couple of cases, where we have antibodies and antigens, you can have very large structures on the surfaces of cells. But again, when you drill down to the details here, somewhere deeper you're going to have a pocket that somehow needs to fit the ligand. So all of this has to do with complementarity: we're going to need a ligand, something we would like to bind, that complements the existing binding pocket in the protein. And the complicated part is that there might also be something natural in the body that should bind there. Do we want to turn that off? Do we want to enhance the function? At some point, I might be competing with molecules occurring naturally in the body. I can actually tell you already now: finding something that binds, that's trivial. Anybody can do that. You can do that. Seriously, with the knowledge you have now, with a couple of days of experience and somebody teaching you, you could design things that bind. The problem is that it's very easy to end up with things that bind in a hundred other places too, and that's going to cause side effects. So the problem is you only want things to bind in the right place, not in the wrong place. That's super hard.
So how many proteins do you have? We've talked about how many genes you have in your body, roughly. Sorry, did you say 19,000? Yes, good. Ballpark of 20k. And you might think that all of these are targets, but in the pharmaceutical industry, all of us, we're fairly narrow-minded. You might have seen me and other people talking about how important membrane proteins are in general, that a very large fraction of drugs hit membrane proteins. That's formally true, but it's even more extreme than that. More than a quarter of all drugs in existence target this one class of G-protein-coupled receptors, GPCRs: 27%. And that includes most of the modern drugs, so it's probably more than 50% of the revenue in the pharmaceutical industry, hitting just one class of proteins that are important for signaling. You could say that that's good or bad. I would say that it's amazing, because there is a huge amount of hidden targets here that we could use. So what has started to happen in the last few years is that we're getting better at targeting both nuclear receptors and, in particular, ion channels. And ion channels, they're my love in life. But the point is that these are still just the three or four largest categories here, right? There is an amazing amount of hidden space that you could turn into new drugs. So why haven't we targeted these before? I'll get back to that. It's actually fairly new; it's only about 10 years that we've known most of the genes in the body and what they do in the first place, right? You're spoiled. You have no idea. When I was your age, we didn't even have a full human genome; the first full human genome was determined in 2001. Before we get into that, we're going to need to classify a little bit what drugs do. And this is also pharmaceutical nomenclature. I want you to know these terms because, as I said, this has to do with employability.
At some point, if you ever want to apply for a job in pharma, they expect you to know these things. So there are a bunch of different drugs you might design. Say we have some sort of axis here where, to the right, we have more drug, and the y-axis is just how much biological response we have. If you have a normal receptor, for instance the ligand-gated ion channels I showed you before, it will normally work in your body, right? And there will be some molecule in your body that activates it; say, the glutamate receptor binds glutamate. There will be some normal molecule in your body that activates this receptor. And then there are a couple of things you can do. In this case, we're not going to look at the natural molecules, we're only going to look at drugs. So you can have a drug that gives you exactly the same response as the natural molecule, but maybe much stronger. That would be what you call an agonist. Agonists are molecules that create the normal biological response, but they might create it in a much stronger fashion. A full agonist would be one that really creates a very strong response. A partial agonist could be a molecule that creates a bit of the response; maybe you don't want to run the car at full throttle. So agonists activate the receptor in the normal way. Another thing is that it can happen, for whatever reason, that this receptor is activated too much in your body under natural conditions, so we want to turn it off a bit. And if we want to turn it off a bit, we want to inhibit it. And that you do with a molecule called an antagonist: "anti", the opposite. This is a bit misleading because it doesn't really have the opposite effect, but you can imagine this could be a molecule that binds in the same place but doesn't create the response.
But because my molecule is now bound there, the natural molecule can't bind there, because mine binds stronger. So then I can turn off the response. I would actually argue most drugs tend to be inhibitors, because it's easier; it's easier to turn off a process in your body. In a few cases, you actually literally want to create the opposite effect. Imagine that you have a very complicated response from something, I can't even think of what the response would be right now, but instead of turning something on or neutralizing it, we might actually want to create the opposite effect. For ion channels this would be difficult. But an inverse agonist creates an effect that is the opposite effect. You see that suddenly, rather than the normal activity, we have an activity that goes in the other direction. So the important thing is that we have agonists, inhibitors or antagonists, and inverse agonists. Sorry, these antagonists, because they inhibit, are frequently called inhibitors. So that would be the normal biological response in a healthy cell. The problem is that if all you needed to do was to create agonists, antagonists, or inverse agonists, it would be a simple world. The second you start taking drugs, obviously you need your drug to bind to the target, or you don't have a drug in the first place, and then it's going to be one of the three I mentioned. The problem is that it must not bind to other stuff. It's super easy to get something that inhibits ion channels; there are whole classes of toxins that turn off ion channels. But if you give these to a patient, you will inhibit all the ion channels in the patient, and you're going to inhibit lots of ion channels that you didn't want to inhibit. So the problem is you need this to be specific and not have side effects. That's the first problem.
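These four classes can be summarized with a toy dose-response model. Everything in this sketch (the baseline constitutive activity, the EC50, the efficacy numbers) is made up for illustration, not taken from the lecture:

```python
def response(dose, efficacy, ec50=1.0, baseline=0.2):
    """Schematic dose-response: baseline (constitutive) receptor activity
    plus a saturating drug effect whose sign and size depend on the class."""
    return baseline + efficacy * dose / (dose + ec50)

# Illustrative efficacies for each drug class (hypothetical numbers)
drugs = {
    "full agonist":    0.8,   # drives activity up toward the maximum
    "partial agonist": 0.3,   # some response, never full throttle
    "antagonist":      0.0,   # occupies the site, no response of its own
    "inverse agonist": -0.2,  # pushes activity below the baseline
}

for name, eff in drugs.items():
    print(f"{name:16s} at high dose -> activity {response(100.0, eff):.2f}")
```

Note that the antagonist curve alone is flat at the baseline; its real effect only shows up in competition, when it keeps the natural ligand out of the site, while the inverse agonist actively drives the activity below the constitutive level.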
The other problem is that you need your compound to survive from the time you take it until it has an effect. And there are two problems here. The first problem concerns all these biologicals I talked about earlier in the course. What's a biological? Yes, it's a peptide- or protein-based drug. You will see other types of drugs in a second. So what's the problem with that? Well, it will be degraded in your stomach, right? We have enzymes for that. The other part is that you have something called the blood-brain barrier, which is mostly meant to protect your brain. There are very few molecules that can cross into your brain. So even if you can get something into the blood by injecting it, most things can't get across the membranes into your brain. So if you want something to enter your brain, it has to be a very small and specific molecule, or you need it to tag along with something; there are some receptors, like transferrin, you can use. So even getting things into the brain isn't entirely trivial. Sometimes you actually need to put an infusion directly into the brain if you want to get things there. You need compounds that are easy to get into the body. Again, those biologicals, if you need to inject them, you're never going to make a billion dollars. And again, I'm well aware how harsh that sounds. But if a company doesn't make money, they're not going to keep developing drugs, right? So what's better, a company that develops drugs or a company that goes out of the market? They need to stay in the market. And ideally you would like a slow, steady, nice release of the drug. You don't want a gigantic dose every four or five hours. In particular, if you need to inject the drug: how many of you would be happy to inject something with a needle every six hours? Well, how many of you would like to do it if your survival depended on it? So of course occasionally you might have to do this, right? But you're not going to sell a billion copies of that drug.
Nobody would like to take, say, some cholesterol-reducing drug if you had to have six injections per day. This entire concept is called ADME/Tox, which stands for absorption, distribution, metabolism, excretion, and toxicity. This is the largest problem in drug design in industry; you need to have the ADME/Tox working. And if you're operating a pharmaceutical company, you're going to have an entire division focused on ADME/Tox properties. The good thing is that there are some very simple rules that will help you achieve this, something called Lipinski's Rule of Five. These are purely empirical parameters. Based on all the drugs that people have developed successfully, we know that drugs need to be small. They need to be small enough to be transported. There is another reason why they have to be small, and it's the same problem as for a protein: if a drug is very large and very flexible, it's going to have a lot of entropy, right? And the second you bind it, it's going to lose that entropy, so it's not going to want to bind. So drugs have to be very small, preferably well below 500 dalton molecular weight. They need to be somewhat polar, because if they're not polar, what's going to happen when you inject them into the blood, right? They will just clump together. You need to have them soluble in the blood; if the drugs are not soluble in the blood, they won't be transported in your body. We want a few hydrogen bond donors and a few hydrogen bond acceptors. But in general we don't want them to be too polar either; if they're somewhat non-polar, that's good, because then they can actually cross membranes. So most drugs tend to be mostly hydrophobic with a few polar components. This is a great idea if you look at history. The problem is that it hasn't led to a new drug in 20 years, for a couple of reasons. One of them is that we keep increasing the requirements to put a new drug on the market.
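The Rule of Five is simple enough to write down as a checker. A sketch, with two caveats: the logP cutoff is part of the standard formulation of the rule even though it wasn't spelled out above, and the aspirin-like example numbers are approximate.

```python
def lipinski_violations(mol_weight, h_donors, h_acceptors, logp):
    """Lipinski's Rule of Five, as usually stated. Violations flag likely
    poor oral absorption; it's an empirical guideline, not a hard law."""
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if h_donors > 5:
        violations.append("more than 5 H-bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    if logp > 5:
        violations.append("logP > 5 (too hydrophobic)")
    return violations

# Aspirin-like numbers (approximate): MW ~180 Da, 1 donor, 4 acceptors, logP ~1.2
print(lipinski_violations(180, 1, 4, 1.2))   # no violations
# A protein-sized molecule fails on every count, which is part of why
# biologicals can't simply be swallowed as pills
print(lipinski_violations(900, 7, 12, 6.0))
```

In real pipelines these descriptors are computed from the molecular structure (toolkits like RDKit provide them), but the thresholds themselves are exactly this simple.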
There is no way aspirin, one of the most sold drugs in the world, would have been approved today. Too many side effects, too risky. It thins your blood; you can end up with bleeding. It would probably not even have passed the first clinical studies, where you have to prove that it's not harmful. But of course, because it's already on the market, we allow it. But this means that we have fewer and fewer new drugs. The other problem is that the traditional way we approach drug design, as I'm going to show you, is not really working anymore. So this is a great idea; the only problem is that it doesn't work. And that's what has happened in the pharmaceutical industry: we've gotten stuck, or at least we were stuck some 10 years ago. And this is one of the reasons why there is so much interest in biologicals nowadays; we need completely new ways of finding drugs. Here is a bunch of examples of typical drugs. You might very well have taken some of these, the nasal decongestant in particular; you can get it without a prescription. You won't recognize these names, because these are the names that the pharmaceutical companies use internally. They typically come up with a separate marketing name for every single market all over the world, and that has more to do with patents and everything. So do you see what the common features of these drugs are? They typically have some small hydrophobic parts, maybe some six- or five-membered carbon rings, right? And then a few groups around them. And all these groups are then optimized to bind different things. So this is, again, typically how drugs work. How do you think we've... Yes? That's a group. So why do you think they have lots of rings? There are two reasons. First, I said that they should be relatively hydrophobic, right? And these rings are hydrophobic.
The second thing has to do with the properties of proteins. If you had something that looks like this, long aliphatic chains, that's going to have a lot of freedom, high entropy. When this binds, it's going to lose all that entropy, and that means that you're not really going to gain a lot of free energy when you bind. Because of all these rings, how much freedom does this molecule have? You can rotate around that bond and rotate around that bond, but this entire part is pretty much rigid. So this molecule will not have a whole lot more entropy when it's in water than when it's bound. This rigidity of the molecule is what makes sure that it doesn't have too much entropy when it's not bound. And this is not just a coincidence: when you try to design these groups, you deliberately work with these rings so you have a few rigid components. How do you think most of these have been found? In nature, yes. And this is actually not a joke. Historically, the way you find things is that you go into the Amazon or something, and you find something in nature that has a natural effect. It could be something with a slight effect on blood pressure; digitalis, for instance, was originally found in plants. Or you might find that certain parts of the population that tend to eat a certain diet have fewer problems with, whatever, blood pressure or something. The unsaturated fatty acids, for instance, were discovered that way, from the Mediterranean diet. But the problem is that at this point, you only know that there is something in that plant that has a very weak effect. Then you need to purify that. You spend huge amounts of effort in the laboratory trying to isolate this and eventually find the compound. The problem is that we're gradually running out of those ideas, and there are relatively few amazing compounds in nature.
A large part of these compounds have to be improved, because the natural one, as I will come back to in a second, is going to have very low efficiency. So you can then use either computational chemistry or experiments to try to make a similar drug with a better effect. What's very popular today is that, rather than risking failure, I can copy your drug. So assuming that you started a pharmaceutical company and spent 10 billion dollars, I can create a drug that looks almost exactly the same. That's great for me, because I only have to spend 100,000 dollars developing this, and then I can sell my drug cheaper than yours. Awesome for me. Unfortunately, you might have some patents, so I need to work around the patents and make sure that I don't tread on them, which can be difficult. But if you can copy a drug, that's fantastic. You could argue that it doesn't push science forward, but again, the role of these companies is to make money, not necessarily to push science forward. We're the ones interested in pushing science forward. What has happened more and more in the last few years, though, is that we really do drug design. You find a receptor, you determine the structure of the receptor, and I know that this receptor could be important, for instance, for blood pressure or something. And then I specifically design a brand new organic compound that has never existed in nature to specifically bind to this receptor and create some response. That is something that has happened over the last 20, even 25 years. And I would say it's in the last decade that we've started to get lots of drugs on the market that have been developed this way, because it takes a very long time for them to reach the market. Today, this is the dominant way of designing drugs, together with biologicals. And I think you will see more and more biologicals on the market. Give me another five minutes and then I'll give you a break.
So the way drug discovery works in practice is that the first part nowadays happens in academia, because the companies are not interested in it; it's too risky. At this point, this is something you might do as part of a thesis project: there might be some protein that a group at University X has been working on, and they've been able to determine a structure. Actually, they probably didn't discover the protein; it's a protein that we know is important, say, for blood pressure. What this group managed to do is determine a structure of this protein, this receptor. Having a structure is the first critical step, right? Because we can't really start to design something unless we know what structure we're designing for. In some cases, you might be able to do it with a biological, but having a structure will usually be important. And at some point, you're going to need some sort of clue here. We're going to need to find some small molecule, and this could really be divine inspiration, but you're going to need to find some molecule that you think will have some sort of effect on your receptor. For now, let's call it divine inspiration. If you're lucky, this small molecule binds; that's good, but does it have any biological effect whatsoever? If it doesn't have any biological effect, it's probably just a sidetrack. But hopefully, it has a small biological effect. And then we can try again and, if we're lucky, maybe we can optimize it, so we start making this molecule work better. And somewhere here, you will go to your colleagues in industry and say: look, we've discovered this really cool protein, receptor XYZ, and we have our small molecule ABC. And I've shown in my lab that small molecule ABC works really great; it can inhibit this receptor. Could you possibly fund us to do some animal studies?
Or you could ask a research council for some money to do animal studies. At this point, you would likely create a startup company too, so that you can place all the patents in the startup company, because if this is successful, it might be something that you can commercialize. You might even try to get a patent on this small drug molecule to protect your intellectual property. And at some point, you're going to need to start doing proper clinical testing, and this is where things start to get a bit expensive. That's why you need the commercial route: until this point you can do it at academic facilities, although SciLifeLab won't help you do animal tests. But once you get into the actual patient tests, it's going to cost a lot of money. There is no way the hospitals will do this for free; they're going to ask you to write them very big checks. So here you need somebody with financial muscle, and that will likely be a venture-capital-funded smaller pharma company. They might now pay you to do a phase one study. A phase one study is the thing I talked about: you pay, say, students 50 dollars to take this brand new drug that you've just tested on animals in the lab. Eat this and see if you die; that's basically what we do. Of course, we hope they won't die, right? But eat this and see if it's safe. Hopefully it is safe. The reason we have phase one studies is that it is not obvious that it is safe; some things still fail here, although once you've shown something is safe in animals, most things pass. We have this stage because there have been a few examples, like the thalidomide scandal, where things were not safe. You could also argue that it's very unfair that most drugs today are initially only tested on males. Why?
Partly because males don't get pregnant. It's not done out of malice; there have been some effects that are really difficult to detect, and even if you ask a 20- or 25-year-old woman whether she's pregnant, she might not know, and that could have absolutely devastating results. So initially, we tend to test on healthy males, and then you gradually expand to more and more of the population. At this first stage, we just want to know that it doesn't kill you; we could not care less whether it actually has any effect whatsoever. If that goes well and it doesn't kill you, the next question is: does it have any effect whatsoever on a human? That's what you call phase two: does it actually reduce your blood pressure, if we're trying to design a new blood pressure drug? Here you might be unlucky and find that apparently your mouse model didn't really work; again, a mouse is not a human, so things might fail here. If you're lucky and the effect was sufficient, well, here things start to become really interesting. Now it's going to get very expensive, because as you move down this ladder, you need to test on more and more patients. Now you might also want to test it on women, on pregnant women, possibly on kids. You might want to do a very large phase three study. This is where a larger pharmaceutical company will often step in and buy the smaller pharmaceutical company, and now you're talking about hundreds of millions or billions of dollars. The poor professor might at this point own 0.001% of the company, but will probably get rich anyway. In phase three, to get approval for the market, the regulatory agencies are going to point out that there are already 300 blood pressure drugs on the market.
The only condition under which they will approve your new drug is that you can show it's better than the existing alternatives: that your effect is better, or that you have fewer side effects, or that it's somehow better for the patients for some other reason. You can't just put anything you want on the market. Most things tend to fail in phases two and three. So the problem is, as you go through this pipeline, the red parts here are the fractions that fail: roughly 70% of things fail in preclinical animal tests, then roughly 40% fail in phase one, 60% in phase two, and 40% in phase three. At some point the Food and Drug Administration might not approve your drug for whatever reason; you might think you've done your homework, but they might not agree with you. They might think that your results from phase three are not good enough, so they won't approve your drug. Occasionally there are even battles here between, say, European and American regulatory agencies; the European regulatory agency recently rejected an important drug from an American company, and you can guess what happens when a European company then comes to the US. This is not how it should work, right? But there is politics in this too, because these are multi-billion-dollar industries now. The important thing here is that failing early is cheap. What do you think happens when you fail late? Yes, this is where companies go under; even multi-billion-dollar companies can go under here. Do you know the only thing that's worse than failing here? Making it to market, starting to administer the drug to patients, and then realizing that there are side effects and you have to pull the drug from the market. That has happened for several drugs in the last 10 years, and that's when even large companies can go under. So I'll show you one more slide, and then let's break. These are just estimated numbers.
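The attrition rates above compound multiplicatively. A minimal sketch, using the approximate failure fractions quoted in the lecture (the numbers are the rough estimates from the slide, nothing more precise):

```python
# Rough cumulative survival through the pipeline, using the approximate
# failure rates quoted above (assumed round numbers).
fail = {"preclinical": 0.70, "phase 1": 0.40, "phase 2": 0.60, "phase 3": 0.40}

p = 1.0
for stage, f in fail.items():
    p *= 1.0 - f
    print(f"after {stage}: {p:.1%} of candidates survive")
# Only about 4% of candidates that enter preclinical testing make it through.
```

This is why failure is the norm: even before any regulatory rejection, fewer than one in twenty candidates survives the four stages.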
It takes, on average, a little over a decade to get a new drug to market. And that doesn't count the research group that has probably been focused on this type of receptor for a decade prior to that; the actual drug development is a little over a decade. At best you're talking about half a billion dollars or so to develop a new drug. A lot of money. And maybe a few hundred scientists involved. And most of it fails; failure is the norm in this industry. After the break, I will go a little more into how we actually do this in the lab, the different stages we go through, the different computational methods we're working on, and where I think things are heading in the future. But it's getting late, so let's meet at a quarter to eleven; that gives you almost 20 minutes. So, I will continue from where we were on developing drugs. This is expensive. And if you're running a company, you can imagine that if you could reduce the time to market, reduce the cost, and ideally reduce the number of scientists, you would make more money. Do you know how long patent protection is valid? Normally it's 20 years. But the problem is that the clock starts from the time you file your patent, and you're going to need to file the patent at the very early part of this process, right? So by the time you actually have something on the market, half the time has already expired, and you're only going to have a handful of years to make your money back. In some cases, specifically for pharmaceuticals, you can actually extend the patent by five years nowadays. But again, you don't have a whole lot of years to make all that money back. You can certainly make more than 300 million euros from something that's successful, but for every successful drug you might have had five failures, and then it starts to add up.
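The exclusivity squeeze can be put in one line of arithmetic. A sketch with assumed round numbers matching the discussion (20-year patent filed early, roughly a decade of development, a possible 5-year extension):

```python
# Back-of-the-envelope market-exclusivity window (assumed round numbers).
patent_term = 20    # years from filing
development = 12    # years from filing until the drug reaches the market
extension = 5       # supplementary protection for pharmaceuticals, where granted

exclusive_years = patent_term - development + extension
print(exclusive_years)  # → 13 years to earn the investment back
```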
So we're going to look a little at all these stages, at least the early research part of this pipeline, and what we use both in terms of bioinformatics and so-called chemoinformatics, which is much like bioinformatics except that we're working on small molecules instead of genes. I won't go into details quite yet; I'll take you through those slides. So remember this slide I had about development phases? For all that early preclinical part, at some point you're going to need to decide what you want to target. The computers won't do that for you quite yet, though I'm not kidding when I say that might actually happen soon. But in the preclinical research that you do in a chemistry or medicine lab, computers have entered and pretty much taken over much of what we do. We occasionally need a protein structure, but not always. I got a question in the break about finding a hit: once you have a hit, things sound easy, but how on earth do you find these hits in the first place? Yep. So, how many small molecules have been characterized so far? Yeah. And what fraction of chemistry space do you think that covers? As a rough first approximation: 0%. Chemistry space is insanely large. One of the largest available databases is called ZINC, where you can order compounds; it has 100 million compounds or so. That's still nothing. Chemistry space is effectively infinitely large, which is the problem. At some point you're going to need a small molecule. You can get this from a bunch of different sources, and as I mentioned before, finding something that binds is usually not that difficult; assume that you have some sort of database we can screen through, or that we can find something in nature that has an effect. Do you know what this molecule is? Obviously not; I would be a bit worried if you did. This molecule is called omeprazole, and it was shown to have an effect as a proton pump inhibitor.
Do you know what a proton pump does? Take a wild guess. It pumps protons into your stomach. What do protons do? For a physicist, the proton carries charge, but for a chemist, protons do what? They create an acidic pH, which is exactly what you want in your stomach. But what happens if your stomach has too much acid? You get heartburn and acid reflux and those things, right? So if you could create drugs that inhibit proton pumps, you could create drugs against this: just turn them off a bit. In this case, there were lots of similar molecules that they found, some of them good. Eventually you realize that this molecule exists in two forms, left- versus right-handed ones; you call them S and R when they're not proteins. Eventually, people realized, using computational tools, that of the two enantiomers in this so-called racemate, the S one was much better at inhibiting the pump. So people used computational tools to predict this and eventually realized you should just produce esomeprazole, the pure S version. Omeprazole is what became Losec, AstraZeneca's blockbuster drug, and the S version later became its successor, Nexium. So that's an example: you find something, and then you need to optimize the molecule to make it behave better, ideally be more efficient and have fewer side effects. The way you actually do that in practice: at some point, we're going to identify a hit molecule that has some sort of activity. You can do this with so-called high-throughput screening, and this is not necessarily done in a computer. The traditional way to do high-throughput screening is in the lab. There are machines that can test in the ballpark of 100,000 or 200,000 small compounds per day; very high throughput. They are not cheap, but on the other hand, if you're running a major pharmaceutical company, money is not necessarily your problem. If this could turn into a blockbuster drug, you will screen as much as you can.
The hardest part is usually finding more things to screen. Even for a small initial test, you might start by screening up to 1 million compounds. Yes, they might cost a couple of dollars each, but so what? It's just 10 million dollars. If you're lucky, you might get 100 hits slash leads, things that might potentially have some sort of efficacy. There is no way you would expect these to turn into drugs directly; you're completely blind when you start, right? You just want to find: is there something in chemical space that might bind to my receptor? I wrote expensive here, and in one way it is expensive, but a worse problem can actually be that you need to order these compounds physically and put them in the machine, so you're limited to the compounds that are available in the chemistry databases. There are a bunch of different estimates of the size of chemistry space, but they're usually in the ballpark of 10 to the power of 50 or 60. If you thought protein space was large, this is insane, right? If you just start to randomly test things, the probability of finding anything should be zero. Now, of course, the compounds in those databases are not random; they're small compounds that we know have good binding properties. They tend to bind in more than one place, unfortunately. Here are a couple of examples of screening tests: in one of them, which tested 300,000 compounds, we didn't find anything. In the other case, they found ballpark 150 small hits when they tested 200,000 compounds. That's not that bad. That's maybe a quarter of a million dollars or something, and you now have 150 small molecules to work with. Sure, those molecules might bind in a truckload of places, but at least you have something on the table to start working with.
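The orders of magnitude here are worth making explicit. A sketch using the rough figures from the discussion (a million compounds, about 100 hits, chemistry space around 10^60); the per-compound price is an assumed round number consistent with the quoted total:

```python
# Orders of magnitude for a physical high-throughput screen (rough figures
# from the discussion; the per-compound cost is an assumed round number).
compounds = 1_000_000
cost_per_compound = 10.0   # dollars, hypothetical
hits = 100
chemistry_space = 1e60     # ballpark estimate of all possible small molecules

print(f"screen cost ~${compounds * cost_per_compound:,.0f}")
print(f"hit rate   ~{hits / compounds:.0e}")
print(f"fraction of chemistry space covered ~{compounds / chemistry_space:.0e}")
```

Even a million-compound screen samples about 10^-54 of chemistry space, which is why "random" screening only works because the libraries are heavily pre-curated.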
And then you can try to optimize these compounds so that they only bind where they should, and don't bind where they shouldn't. I know that sounds obvious, but it's harder than you think. The problem is that you might have very few hits, and these hits might not be very good. The other problem is that at some point you're going to want something that's patentable. At this point you're going to need to bring in computers, and this is where QSAR and pharmacophore modeling come in; I'll get to those in a second. Ideally, you would like to do all of this computationally, right? Or at least do as little as you possibly can in the lab. QSAR is just a fancy acronym for quantitative structure-activity relationship; I'll show you a slide in a second. The idea is: if I know that small molecules with two hydrophobic rings and one hydrogen bond donor tend to bind well to this type of target, and you now need to design a new drug for this target, what would you look for? Things that have two hydrophobic rings and one or two hydrogen bond donors, right? So this is just a way of roughly describing how things should look and trying to find things that look similar. And a pharmacophore is just a fancier name for such a model: one hydrophobic ring here, one hydrophobic ring here, and then a hydrogen bond donor there; please find all molecules in chemistry space that look roughly like that. That's something computers can do, and if there's something computers are good at, it's searching, right? What's happened in the last 10 years is that we're increasingly doing computational, or virtual, high-throughput screening, abbreviated vHTS; the virtual means that we do it in a computer. The two approaches are very different. The experimental one is expensive and slow, but it's also correct in a sense, right?
If I find something that has an activity in the lab, I definitely know that I had that activity. The largest public screening centers can do something in the ballpark of a few hundred thousand molecules; a pharmaceutical company can do more. There are no public facilities that even come close to the best pharmaceutical ones, because the companies spend more money on it. This particular step is not really fundamental research; we know the principles, but this is about doing it at commercial scale, so you're going to need a partner in industry. If you do it computationally, it's certainly cheap and fast, but I would say it's not really accurate; you're going to have errors. On the other hand, instead of testing a couple of hundred thousand molecules, you can easily test a billion molecules. But is that any good? What does it help you to test a billion molecules if you're not as accurate? Can you really trade accuracy for volume? Think about that for a second. Maybe we can. We would like accuracy, right? But what does accuracy buy you? If I do the accurate experiment and get zero hits, well, it's very accurate that I had zero hits, but that's not going to help me get to the next step. On the other hand, there is no way you could ever test a billion things in the lab. If I can test a billion things in the computer, maybe I can give you, say, the 10,000 best ones, and then you take those 10,000 and go into the lab and test those. Remember the previous slide where I said you might spend a quarter of a million dollars? Testing 10,000 will only cost you 15,000 or maybe 20,000 dollars. Much cheaper. And this is something the computer can run overnight. So yes, this is inaccurate, but it's a really good way of narrowing down one billion molecules to the 10,000 best ones. But what says that I actually found the best 10,000?
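The volume-for-accuracy trade can be demonstrated with a toy funnel: score a large synthetic library with a deliberately noisy score, keep the top fraction, and check that the selection is still enriched in good binders. Everything here (library size, "true" affinities, the noise model) is invented for illustration:

```python
import heapq
import random

# Toy virtual-screening funnel: a cheap, noisy score over a large library,
# then keep only the top-k for lab testing. All numbers are synthetic.
random.seed(0)

library_size = 1_000_000   # stand-in for "a billion"; same idea, faster demo
k = 10_000

# hypothetical "true" affinities (unknown to the screen) in arbitrary units
true_affinity = [random.gauss(0.0, 1.0) for _ in range(library_size)]

def noisy_score(affinity):
    # virtual screening is fast but sloppy: add a large random error
    return affinity + random.gauss(0.0, 2.0)

scored = ((noisy_score(a), i) for i, a in enumerate(true_affinity))
top_k = heapq.nlargest(k, scored)   # the 10,000 "best" by the cheap score

# Even a sloppy score enriches: the selected set is far better than random.
selected_mean = sum(true_affinity[i] for _, i in top_k) / k
print(f"mean true affinity of selection: {selected_mean:.2f} (library mean ~0)")
```

The selection misses many good molecules, but as the lecture argues, that doesn't matter: the 10,000 you send to the lab are far better than a random draw, at a tiny fraction of the cost.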
Nothing, maybe, because I will have sorted away close to a billion molecules here. So what says that I didn't sort away something really good? Absolutely nothing. But here's the trick. I talked about omeprazole before; what says that omeprazole is the absolute best proton pump inhibitor possible? Nothing. And if your goal is to make money, your goal is just to find one efficient drug. Yes, it's great if you find a better one, but as long as you find something good, you're happy. It's not that we like missing something good, but since screening through all of these experimentally is not an alternative, we have to accept those as the rules of the game. QSAR in general can get pretty complicated, but fundamentally it's super simple: we somehow correlate a biological activity with a simple chemical property. By far the easiest example, which I like to use because I work on ligand-gated ion channels, is the efficacy of anesthetics versus how hydrophobic they are. There is an almost perfect correlation between how good a molecule is as an anesthetic and how hydrophobic it is: the more hydrophobic, the better the anesthetic. So if you're now going to design an anesthetic, would you pick a molecule at the hydrophilic end of the scale or at the hydrophobic end? If you're in the business of designing anesthetics, you'd better pick one at the hydrophobic end, or your boss is going to fire you. It's not really more complicated than that. In this case it's simple because it's just one parameter, but in general you might take 10 parameters; remember Lipinski's rule of five, right?
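In its simplest one-descriptor form, QSAR is just fitting a line between a property and an activity. A minimal sketch of the hydrophobicity example; the (logP, activity) data points are invented purely to illustrate the correlation described above:

```python
# Minimal one-descriptor QSAR: fit activity against hydrophobicity (logP).
# The data points below are invented for illustration only.
log_p    = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]   # hydrophobicity descriptor
activity = [2.1, 3.0, 4.2, 4.9, 6.1, 6.8]   # e.g. log(1/EC50), made up

# ordinary least-squares line through the points
n = len(log_p)
mx, my = sum(log_p) / n, sum(activity) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(log_p, activity))
         / sum((x - mx) ** 2 for x in log_p))
intercept = my - slope * mx

print(f"activity ≈ {slope:.2f} * logP + {intercept:.2f}")
# a more hydrophobic candidate is predicted to be a better anesthetic:
print(f"predicted activity at logP 4.0: {slope * 4.0 + intercept:.2f}")
```

The positive slope is the whole design rule: to improve the anesthetic, move toward the hydrophobic end of the scale.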
If you're smart, you apply those rules: maybe a molecule can't fulfill all of them, but make sure it fulfills at least four. With three or four such parameters, you don't even have to worry about predicting interactions; starting from 10 to the power of nine molecules, you can probably screen away 99% of your database just by correlating simple properties with the activity you would like. You can start from a billion molecules, and this type of screening is something the computer will do overnight. I'm not sure about you, but to me, saving a quarter of a million dollars overnight is pretty nice. The type of properties you might use here are molecular weight, charge, dipole moment, the surface area of the molecule, how hydrophobic it is, etc. It's not rocket science; very simple stuff. What people do today is use machine learning for this. This is not a course in machine learning and we won't have time to go through it, but what's happened in the last few years is that there's been a revolution in techniques for recognizing images in particular. Arne might have told you a little about machine learning in the bioinformatics course; in general, this uses a concept called deep neural networks, and deep neural networks have been a revolution for machine learning, in particular in combination with graphics processors. The challenge is that they only work well if you have tons of training material, but this is an area where you typically do have tons of training material: all the experimental evidence in the literature and so on. So using QSAR in combination with machine learning to try to find similar molecules can usually work really well.
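The "screen away 99% overnight" step is just a property filter. A sketch of a Lipinski-style rule-of-five filter; the compounds and their descriptor values are illustrative, and following the lecture's advice we allow at most one violated rule:

```python
# Cheap descriptor filter in the spirit of Lipinski's rule of five: discard
# most of the library before any interaction prediction. Compounds invented.
compounds = [
    {"name": "A", "mw": 320, "logp": 2.1, "hbd": 1, "hba": 4},
    {"name": "B", "mw": 690, "logp": 6.3, "hbd": 5, "hba": 12},  # too big/greasy
    {"name": "C", "mw": 410, "logp": 4.2, "hbd": 2, "hba": 7},
]

def passes_rule_of_five(c, max_violations=1):
    violations = sum([
        c["mw"]   > 500,   # molecular weight
        c["logp"] > 5,     # hydrophobicity
        c["hbd"]  > 5,     # hydrogen-bond donors
        c["hba"]  > 10,    # hydrogen-bond acceptors
    ])
    return violations <= max_violations

survivors = [c["name"] for c in compounds if passes_rule_of_five(c)]
print(survivors)  # → ['A', 'C']
```

Each check is a table lookup and a comparison, which is why this scales to a billion-compound library overnight.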
What you might do is look at some molecules that you know bind and say that there should be, say, some hydrogen bond donors in the blue parts and some acceptors in the green parts, whatever; I don't remember exactly what these parts are. So instead of describing all the atoms in the small molecule, you try to classify the molecule by roughly how it would behave as a binding partner. The advantage of this is that it's super fast, and it actually finds ligands in many cases. The problem is that you will only really discover things we already have in the databases, right? Unless we have something to bootstrap from, we're not going to be able to design things out of thin air, and that's a problem as we increasingly want to target receptors that are new. If you have very flexible molecules, we first have the problem with entropy; and if the molecule is very flexible, it can bind in possibly 100 different ways depending on how you bend it, and you're going to need to find the correct conformation that's actually used in the binding. That can be a problem. This also assumes that we know the binding site; if we don't know where the molecule binds, we're somewhat lost. And it's important that we include some sort of non-binding molecules, because otherwise we can easily end up in a situation where the model says to pick everything; something is going to bind. A model that just says small hydrophobic molecules will in general bind is not useful: yes, I know that's going to bind, but it's also going to bind in 500 other places. You want some sort of selectivity. So if you want to do slightly better, assume that you now have your very specific protein. QSAR is also limited in the sense that it's just numbers in a table. Very simple numbers: say 14.5 for hydrophobicity, 0.5 or 1.0 for charge, 16 for surface area, whatever.
So it's just a table of numbers, and then you use a computer to recognize patterns in those numbers. But we want to make sure that we have some sort of selectivity, right? We want to make sure that we don't get everything to bind here. If I build this model, I want it to be selective. First, I want to make sure that in the site I'm targeting, only my molecule binds really well. Second, I want to make sure that my molecule does not bind in 500 other places; otherwise I'm just selecting for things that are small and hydrophobic in general. If you want to do this, you can do better, because maybe you know a structure of the protein, right? Then you might be able to say, it's not just that you want two hydrogen bond donors; you can say that you would really like a hydrophobic ring down here, another hydrophobic ring up here, maybe a hydrogen bond donor here and an acceptor there and then a charge here, or whatever. We don't specify a molecular structure; I don't care what the structure of the molecule is, but I say that these groups should be, say, two or 2.5 nanometers away from each other, so that you have a small, super simple model in space with just three or four dots, three or four active groups. This is called a pharmacophore. All these companies will have pharmacophore models of the important sites for their targets in their databases. And then you can go to these super large databases with a billion compounds or so, screen through them, and ask which compounds fit my pharmacophore model. The pharmacophore is typically something you will design manually, and you might very well do that if you work in the pharma industry. But of course, once you've designed it manually, the computer can screen through a billion molecules for you overnight.
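A pharmacophore really is just a few typed points with required pairwise distances, and a match is just checking those distances within a tolerance. A minimal sketch; the feature types, distances, tolerance, and candidate coordinates are all invented for illustration:

```python
import itertools
import math

# Toy pharmacophore: required distances (in nm, as in the discussion) between
# typed feature points. Numbers are invented.
pharmacophore = {
    ("ring", "ring"):  2.0,   # two hydrophobic rings, ~2 nm apart
    ("ring", "donor"): 2.5,   # ring to hydrogen-bond donor
}
tolerance = 0.3  # nm

# candidate molecule reduced to typed feature points (hypothetical geometry)
features = [("ring",  (0.0, 0.0, 0.0)),
            ("ring",  (2.1, 0.0, 0.0)),
            ("donor", (1.0, 2.2, 0.0))]

def matches(features, pharmacophore, tol):
    # every feature pair with a distance constraint must satisfy it
    for (t1, p1), (t2, p2) in itertools.combinations(features, 2):
        key = (t1, t2) if (t1, t2) in pharmacophore else (t2, t1)
        if key not in pharmacophore:
            continue  # no constraint between these feature types
        if abs(math.dist(p1, p2) - pharmacophore[key]) > tol:
            return False
    return True

print(matches(features, pharmacophore, tolerance))  # → True
```

Real tools also enumerate conformers and feature assignments, but the core test per candidate is this cheap, which is what makes screening a billion compounds overnight plausible.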
And then it might say: well, this particular molecule fits; you had some sort of polar group there, some hydrophobic rings there, and a polar group there. So you find patterns, you refine them, and then you hopefully go back to the lab and see if it worked. Do you think this works? It does. It's primitive, but there have been some really important drugs developed this way. Here is an example of a bunch of full agonists. Do you see any patterns? These are different drugs that target the same receptor, and all these full agonists have two or three of these rings, and they might have some OH groups in various positions. They're not identical, but they're not completely unrelated molecules either; there are some common patterns in the series. I forget which one is best; it doesn't really matter. This is just another way of visualizing a pharmacophore. Most molecular viewers have some sort of support for visualizing pharmacophores, to show what you have in the different parts. At some point, you also want your molecule to actually fit the pocket, so that, in particular, if I know that all the previous molecules that bind here were roughly, say, 5,000 cubic ångströms, you should probably look for molecules of roughly the same size so that they fill the pocket. But the one thing we haven't used so far is the protein structure. I know that I've hinted at it, but in principle, everything so far can be based on looking at other drugs that bind your receptor and trying to find patterns, things that are similar. That's not going to work perfectly. But the point is that you just need some sort of lead, and once you have that hit or lead, you can start pulling on the thread and try to systematically improve it.
But we are leaving out a huge amount of information, because with everything we know about proteins, we should be able to use the protein structure in a much smarter way to refine things further. You had a question? So, it's important to remember that most things you find, whether by chance or by divine inspiration, are going to be bad binders. The way the pharmaceutical industry works is that you iterate: you do something, you look at the result, you try to improve it, and then you do it again, again and again. At the very first step, you might just assume that we have a brand new receptor that I don't know anything about, and I start screening these databases computationally or in the lab. Let's say you find 50 small compounds that bind. To tell the truth, their binding is likely going to be pretty poor. But at least these first molecules were actually tested in the lab, so I'm sure that they really are binders; not good binders, but binders. Now I have some 50 or 100 molecules that I can work with, right? Then I try to use, for instance, a QSAR or a pharmacophore model to ask: rather than worrying about all the details and all the atoms in these individual molecules, what are the unifying features of these molecules? That might be that they have two hydrophobic rings, a hydrogen bond donor on the left, a hydrogen bond acceptor on the right. It's not going to be that beautiful, of course; it's going to be a bit fuzzy, and they will have some differences. But now I can describe a model and say: you know what, in general, it appears to be this type of molecule that has some sort of affinity for this receptor. Now I can go to an even larger database, the largest computational databases I can find. So now I go to a database that has 100 million compounds.
And then I try to screen through all of these 100 million compounds: which ones appear similar to the pharmacophore model I had? I might say, okay, it's 10,000. Then I go into the lab and test those 10,000, and hopefully I'm going to find a couple of compounds that are slightly better; at least I will hopefully find more binders. That might show that, of the bunch of different pharmacophores that could bind here, numbers three and 14 appear to be the most promising. Now you would like to go to an even larger database, right? But there are no larger databases. So what do you do then? Then you synthesize. Again, there are 10 to the power of 60 molecules you could create, so now you sit down and ask the computer to suggest new compounds: come up with molecules that match this pharmacophore. Then it starts to get expensive, because it can cost thousands of dollars per compound to synthesize. But again, if this is a promising molecule and you're starting to have some good leads, you will do that. So the point is that you iterate over and over and over again, and in each step you use these models to understand, not all those individual hydrogen bonds, but what the important features are: why do these classes of molecules appear to be good? I'll come back to that later. Sorry, can you say that again? Yes, they do. But again, there are lots of receptors out there, and this takes lots of time and effort. The early work at the research level is certainly something you might be working on as a PhD student in a few years. But say that you would like to do a study that's going to cost five million dollars, because you need to screen a large library and everything.
If you then go to the pharmaceutical companies and say that you have absolutely no idea what this protein is doing, what do you think their answer is going to be? No, thanks. Because if you have no idea what it's doing, it's too risky. As a company, you don't work on random receptors, because it's too difficult; you tend to have your expertise, for instance working on diseases related to acid reflux or to the nervous system, and most companies are only interested in those receptors. But there's a lot of work on this in academia. Now let me bring in the last piece of the puzzle. Today, because we have more and more protein structures, we want to use the protein itself, with docking techniques. Docking is in a way much simpler than the simulations you have already been working with, because in principle you just want to use the tools we already had, simulations and force fields, to predict which molecules should bind and how they bind. That sounds simple, right? The only problem is that you have run some simulations yourself by this point: running one simulation takes time, and here we would need to do it for maybe a billion molecules. There is no way you could do that with simulations; it would take too much time. So, looking at this naively: docking means we want to predict the best possible way to put two molecules together, and then try this for lots of molecules so we can find the best ones. I'm going to need some way of sorting: if you give me 10 molecules, I need to rank them and say which one seems to bind best. This could use the force fields and interactions you've looked at, but as you're going to see in a second, we have much simpler ways of doing it. In general, I don't know where the molecule is going to bind, in particular if it's an orphan receptor, so I'm going to need to try lots of different placements. It's basically Lego building.
I need to test every single possible way they can bind to each other, and that's going to be costly too. So I'm going to need a way to do this super quickly on computers. And you could argue that in some cases — say we have a dopamine receptor, a GPCR — I might have a super good pharmacophore, and I know that it's exactly in this region that things are going to bind. So if I take the pharmacophore and pick a thousand good molecules that look roughly like it, then instead of testing these in the lab, I could try to put them in the binding site and let the computer decide which one of these molecules is likely to bind best in this particular site. This is in principle very easy. You're basically doing this. The only problem is that you have a billion holes and a billion pieces. But otherwise, it's exactly the same thing. The point here is that you don't get any Nobel Prize for the most beautiful, fantastic interaction functions. This has to be fast, fast, fast, fast, fast. If you have a billion pieces, you can only afford to spend something like a hundredth of a second per piece, right? So the key property of docking is that it's fast and sloppy, but we might find something. The difference here compared to everything else is that for this to work, we're going to need the box: we need a structure of the protein, so we know what we're trying to fit things to. If you wanted to do a full-blown simulation here, it would take forever; we can't afford that. So with docking, we're going to need to cheat. We somehow need to sample — test lots of different possible ways they might or might not fit. And the way we do that is that we assume the receptor is an ice cube: we can't move it at all. Then we just take the small molecule and try to place it in lots of different positions and orientations. Computers are super fast at this.
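Since the receptor is frozen, sampling poses just means applying rigid-body transforms to the ligand's coordinates. Here is a minimal sketch of that idea; the function name, box size, and details are my own illustration, not taken from any real docking program:

```python
import numpy as np

def random_pose(ligand_xyz, box_size=10.0, rng=None):
    """One random rigid-body pose: rotate an N x 3 array of ligand
    coordinates about its centroid, then translate it inside the
    search box. The receptor itself is never touched (the 'ice cube')."""
    rng = np.random.default_rng() if rng is None else rng
    # Random rotation axis and angle -> rotation matrix (Rodrigues formula)
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = rng.uniform(0.0, 2.0 * np.pi)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    R = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)
    # Random translation inside the search box
    t = rng.uniform(-box_size / 2.0, box_size / 2.0, size=3)
    center = ligand_xyz.mean(axis=0)
    return (ligand_xyz - center) @ R.T + center + t
```

Because the transform is rigid, all internal distances of the ligand are preserved; generating millions of such poses per second is exactly the kind of thing computers are fast at.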
And then we have to accept that we can't have water; we can't have all those beautiful details that you get in a full simulation. We just very quickly determine: do you have a paired hydrogen bond? That's good, plus one. Do you have two charges of the same sign right next to each other? That's bad, minus 10. So you use a very sloppy, simple scoring function. The reason we call it scoring rather than an energy is that you can throw in anything: for long, flexible molecules without benzene rings, for example, you can add some rough estimate of what the entropy loss should be based on the shape of the molecule. Your guess is as good as mine here. The only rule of the game is that it has to be super fast. So you try to find things that are good, find things that are bad, and assign some relatively arbitrary scores to them. But the problem is that even this very quickly becomes expensive. Take a small ligand and allow it its six rigid-body degrees of freedom — rotations and translations — plus a number of bonds inside the ligand that might be allowed to rotate. Then say I want to test this within a box 10 ångström on each side in three dimensions, and sample the rotations in 10-degree intervals. This would take hundreds of years on a computer. These are just estimates, but the point is that I can't even do this. So here too, you're going to need to be super sloppy: start placing the molecule somewhere; if it looks bad, stop immediately and go to the next molecule; if it looks promising, optimize it a little. We will never be able to test every single possible conformation. The way you do this sounds fancy: genetic algorithms. They have very little to do with genes, but you optimize roughly the same way the genome works, in that we create a bunch of random conformations of the small molecule.
And then we try to evaluate which ones look good. We take, say, the 10% best ones, and then we mutate them, in the sense that we create more poses very, very close to the good ones. We accept that there is no way we can ever test everything, so we pick the best ones and test more things close to them. And then you just repeat this a couple of hundred times. I'm well aware of how sloppy this sounds. The point is that it is sloppy. There is another way you could do this, of course — again, you can cheat in physics. I can say that if the blue part here is my receptor, maybe I should start by building one atom there. If there's a hydrogen bond donor, I put a hydrogen bond acceptor right next to it. And then I say, oh, you needed a hydrophobic ring — okay, I can put a hydrophobic ring right next to this too. So you gradually build the ligand inside the receptor. This works surprisingly well too. And the point here is that you can do absolutely anything you want. This is kind of divine inspiration: any way you can come up with a good molecule counts, and the proof of the pudding is in the eating. If your divine-inspiration method always results in good binders, you're going to make lots of money. The way you evaluate whether things are good is in principle a force field. We had these physical terms for charges and the other interactions. It would be too expensive to evaluate them in a full simulation, and in particular, you can't have all the water. So we give special points, for instance, when we have hydrogen bonds — because all those hydrophobic interactions would really require us to have all the water to treat them exactly, and we can't afford that. Instead, if you expose a hydrophobic surface to water, I just calculate: okay, that's 10 ångström of exposed hydrophobic surface, that's bad, and I assign some score to that.
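The genetic-algorithm loop above can be sketched in a few lines. This is a toy version under my own assumptions: a "pose" is reduced to six rigid-body parameters, the population sizes are arbitrary, and the scoring function is a stand-in rather than a real docking score:

```python
import random

def genetic_pose_search(score, n_poses=200, n_keep=20, n_rounds=100,
                        step=0.5, rng=None):
    """Toy genetic-algorithm pose search. A 'pose' is reduced to six
    rigid-body parameters (x, y, z and three rotation angles); 'score'
    is any function where lower means a better docking score."""
    rng = rng or random.Random(0)
    # Start from a bunch of random poses
    poses = [[rng.uniform(-5.0, 5.0) for _ in range(6)] for _ in range(n_poses)]
    for _ in range(n_rounds):
        # Keep roughly the best 10 percent...
        poses.sort(key=score)
        survivors = poses[:n_keep]
        # ...and "mutate": create new poses very close to the good ones
        poses = [[x + rng.gauss(0.0, step) for x in rng.choice(survivors)]
                 for _ in range(n_poses)]
        poses[:n_keep] = survivors   # never lose the best found so far
    return min(poses, key=score)

# Stand-in score with its minimum at the origin, instead of a real
# docking score; the search homes in on it without testing everything.
best = genetic_pose_search(lambda p: sum(x * x for x in p))
```

The design choice is exactly the lecture's point: we never enumerate all conformations; we spend our fixed budget of evaluations near the poses that already look good.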
And you can even use some sort of statistical potentials: some atom contacts are likely, so they should score well; some are unlikely, so they should score badly. Your guess is as good as mine. This just has to be fast, fast, fast. And you typically score this on a grid or something. What I can show you here is that you take your small probe atom and put it in every single grid position. Now, I'm not going to try every position between the grid points. So if there is a really good binding spot in between, I will hopefully still see some good binding energy nearby — but I might not. I might throw away lots of good stuff. So the rule of the game here is that I accept that I will throw away lots of really good stuff. But that's not really how I judge myself. The only thing that matters is: at the end of the day, will I have done slightly better than random? As long as you're doing better than random, you're good. And that sounds like a horribly low standard, I know, but I'll show you why this works in a second. The point is that we don't really care about accuracy. Let's say that one in a million compounds will actually bind to your receptor. If you now pick 100,000 compounds at random, the likelihood that you will have a good binder among them is just 10%. So on average, you're not going to find anything, right? But instead of starting from one million, let's say you start from one billion compounds. There is no way you could test all of those in the lab, but maybe I can test them with docking. By default, out of one million compounds, I would have one good one. And even with docking — I'm sorry to say — most of the stuff I pick is going to be crap.
I will make so many mistakes that most of the things I select will be mistakes. But say that out of that one billion compounds I take it down to one million, so I select one in every 1,000 molecules, and say that 99.99% of what I select is wrong — out of that one million molecules, maybe I still end up with, say, 100 good ones. The fraction of that one million that is good is still almost nothing, right? But it's now a factor of 100 more than I had before. So I have enriched the fraction of good things by a factor of 100, even though the vast majority is still bad. If I now take this new set that I selected based on docking and go into the lab and screen 100,000 of them, I expect to find 10 good ones instead of 0.1. And that's the difference between failure and success. So docking is horrible, but as long as you're doing better than random, you can compensate by going to insanely large databases. And the cool thing is that I wouldn't show you this unless it worked. So for those two receptors I showed you before: the problem with the second one — look at the second line first — is that there we already had about 150 experimental hits, and when I ran docking, I only got five. That's not going to help you. But do you see the first case? For a receptor where we did not find anything experimentally out of 300, when we do docking on a much larger database, suddenly we have two hits to work with. And that's the difference between zero and something, which is the difference between not making money and possibly making money. Yep. No, sorry — when I say hits here, it means that we got them from docking and then confirmed them in the lab. Docking doesn't give you a yes-or-no answer; docking gives you a general score, right?
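The enrichment arithmetic above is worth writing out once. Using the lecture's illustrative numbers (one true binder per million compounds, docking that keeps one in a thousand and is still almost always wrong), not real campaign data:

```python
# Illustrative back-of-the-envelope numbers from the lecture, not real data.
library = 1_000_000_000          # start from a billion compounds
base_rate = 1e-6                 # one in a million truly binds

selected = library // 1_000      # docking keeps 1 in 1000 -> one million left
good_in_selection = 100          # ~99.99% of the selection is still junk...
enriched_rate = good_in_selection / selected   # ...but now 1 in 10,000 is good

enrichment = enriched_rate / base_rate   # factor 100 better than random

# Screening 100,000 of the docking picks in the lab afterwards:
hits_after_docking = 100_000 * enriched_rate   # ~10 expected real hits
hits_at_random = 100_000 * base_rate           # ~0.1 expected hits
print(enrichment, hits_after_docking, hits_at_random)
```

Ten expected hits versus 0.1 is exactly the "difference between failure and success" in the text, even though the selected set is still 99.99% junk.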
But this means that after docking we were able to identify tons of candidates, and when we tested them in the lab, we found two that worked here and five that worked there. Five more on top of 146 will likely not help you, but having two instead of nothing is a very important difference. Sorry — probe atom? Yeah. So at some point, this is my large receptor, right? And I have no idea where things are going to bind. They could bind anywhere in the box. And if you look at something large like a ribosome, it's an insanely large area; I can't afford to test every single position. So let me explain this the thorough way rather than the quick way. If I have my entire large molecule — the light gray one here — I can say that maybe I have a positive charge there, another positive charge there, a negative charge there, something that can form a hydrogen bond up there, and maybe something hydrophobic; I have some sort of cavity here. I can trace out the rough shape of this, because I know where all the atoms in my large box are. And this I only have to determine once. Rather than testing every single position directly, I can take my box — sorry, in this case it's going to be 100 by 100 by 100 ångström — and divide it into grid cells. Each grid cell gets some properties: in that position, the property is going to be plus charge; in that position, the property is going to be hydrogen bond partner. Then I take my small molecule — the black one here that I might want to try — and rather than randomly testing every single possible conformation, I go through all these grid cells, say one ångström — sorry, 0.1 ångström — apart.
And when I place my molecule, rather than calculating every interaction between each atom and all the 100,000 atoms in my protein, I just say: okay, I had a carbon here, and carbon next to a plus charge, that's bad, because carbon is usually hydrophobic, right? And a hydrogen bond — well, I might be able to form a hydrogen bond, so that's good. So this is just a very quick way of approximately scoring things: you assign properties to every single position in space, and then you try to guess roughly how compatible your molecule would be with that. In principle, we could of course do this much better, right? If I had an infinite amount of computing time, I would just simulate this the way you've been doing it and calculate exactly how a specific molecule should bind. Physics-wise, that's definitely the right thing to do, but I can't afford it — at least not historically. Sometime in the future, we might be able to. So the point with docking is: don't worry about the fact that it's sloppy. The fact that it's fast and sloppy is the whole idea, because you can do things a million times faster than you could ever do them in the lab. But then there are some things that are hard. We frequently end up with docking results that are not that beautiful. And then you look into them, you determine the structure, and you realize that what would really happen is that some of the side chains in the protein move a bit. If I could just have taken into account that when you bind a small molecule, the side chains in the protein move a bit to accommodate it better, I could have had a much better docking pose. There are some programs that include flexibility in the protein, but they do that at the cost of being much slower. And here's where your expertise comes in if you're ever sitting and doing this, right?
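A minimal sketch of this grid trick follows; the grid contents, atom types, and pair scores are made-up placeholders, and the point is only the mechanism of precompute-once, then look up:

```python
import numpy as np

# Precomputed once per receptor: a coarse 3-D grid where each cell
# stores a property of the nearby receptor. These contents are invented:
# 0 = empty, 1 = plus charge, 2 = hydrogen bond partner
grid = np.zeros((100, 100, 100), dtype=np.int8)
grid[50, 50, 50] = 1          # a plus charge somewhere in the pocket
grid[50, 50, 51] = 2          # a hydrogen bond partner next to it

# Arbitrary pair scores: (ligand atom type, grid property) -> score
PAIR_SCORE = {("C", 1): -1.0,   # hydrophobic carbon next to a charge: bad
              ("O", 1): +2.0,   # oxygen next to a plus charge: good
              ("O", 2): +1.0}   # oxygen near a hydrogen bond partner: good

def score_pose(atoms, spacing=1.0):
    """Score a ligand pose by grid lookup instead of summing over all
    ~100,000 receptor atoms. 'atoms' is a list of (type, x, y, z)."""
    total = 0.0
    for atom_type, x, y, z in atoms:
        i, j, k = (int(round(c / spacing)) for c in (x, y, z))
        if 0 <= i < 100 and 0 <= j < 100 and 0 <= k < 100:
            total += PAIR_SCORE.get((atom_type, int(grid[i, j, k])), 0.0)
    return total
```

Scoring a pose is now a handful of array lookups per ligand atom, independent of how many atoms the receptor has — which is what makes screening a billion pieces thinkable at all.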
Don't think that you should necessarily throw the kitchen sink at the problem. Anybody can come up with a more advanced method than docking — you could all come up with 10 ways to do things better than docking. But if you start adding all those 10 improvements, you do that at the cost of not having time to screen through a very large database, and that was the whole reason docking worked. So again, anybody can say: do this with a molecular simulation instead. I'm going to show you some ways of doing that, but that complements docking once you are very sure that you have something really good and you would like to understand it better. The point of docking is that it's fast, not that it's accurate. So don't throw out the fast part if you're doing this. Of course, any time you look at comparisons of programs, all the authors will claim that their program is better because it's more accurate. Accuracy is not the goal of docking. Speed is. And at this point, you're going to have a drug — well, you will have something that binds. Unfortunately, you will need to feed the patients five kilos a day of your drug. Why? This has to do with the equilibrium, right? If you have A plus B in a reaction going to the bound complex AB, the balance will depend on the equilibrium constant. And if something is not a great binder, well, just add more B and you force the reaction more to that side. Now, the first problem is that if you start eating five kilos of anything, you're going to have some pretty substantial side effects. So when you talk to people in the lab about this, essentially you want something that has a better free energy of binding. You might start out with something that's maybe just five kilocalories per mole. If you could improve that — every time you increase this by kT, what's going to happen? Or, sorry, decrease the free energy by kT.
So every time you improve the binding by kT — and again, about two kT is roughly a factor of 10 — so for every two kT you improve the binding, you move the equilibrium one order of magnitude more to the right-hand side. That's important. You are by now comfortable talking about kilojoules and kilocalories, because we went through this from a physics point of view. The way experimentalists typically describe this is in concentrations: the concentration where the reaction is roughly balanced. So if you have a binder that's a millimolar binder, that would mean you need roughly a millimolar concentration to have 50% of it bound. A micromolar binder already binds efficiently at a factor of a thousand lower concentration. The lower this concentration is, the better; ideally you would like a picomolar binder. Then you take a milligram a day and get the same efficiency. So you would like to improve the binding affinity, and that's what you do with computers. And here's where proper computational chemistry starts — simulations where you properly calculate binding affinities — because now you have your hit or your lead. This is the compound you're working with, and your boss comes in, or you go to your staff and say: this is molecule X that we're working with. You might have 10 molecules you're working with; you have exhausted all the databases, and now you need to make them better. This is what AstraZeneca did with omeprazole: they realized that they only needed one of the two enantiomers, one half of the mixture, and that was going to be better. And now you're going to need to understand binding properly. You're going to need to understand free energy, entropy and selectivity — and then you need to be smart.
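The bridge between our free energies and the experimentalists' concentrations is just ΔG° = RT ln Kd (with Kd relative to a 1 M standard state). A quick sketch, where the specific ΔG values are my own illustrative picks:

```python
import math

RT = 0.593  # kcal/mol at ~298 K

def kd_from_dg(dg_kcal_per_mol):
    """Dissociation constant (in molar, 1 M standard state) from a
    binding free energy via dG = RT ln Kd."""
    return math.exp(dg_kcal_per_mol / RT)

weak = kd_from_dg(-5.0)     # a weak starting hit: sub-millimolar
strong = kd_from_dg(-12.0)  # an optimized lead: low nanomolar

# Each RT*ln(10) ~ 1.4 kcal/mol (about 2.3 kT) of extra binding free
# energy shifts Kd by one order of magnitude:
ratio = kd_from_dg(-5.0) / kd_from_dg(-5.0 - RT * math.log(10.0))
```

So improving a hit from 5 to 12 kcal/mol of binding free energy moves it five orders of magnitude in concentration — the difference between five kilos a day and a milligram a day.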
So this is one example — actually the first drug on the market that was designed this way. Do you know what HIV-1 protease is? Never mind — you can probably guess what type of disease it's related to, right? AIDS. So this was one of the first efficient drugs against AIDS. What they started with was a hit, basically a diol compound with two alcohol groups, a symmetric one. In step two, they created a pharmacophore that said: here you need hydrogen bond donors and acceptors, with certain distances between them, and then two phosphate groups, I think. Then they found a database hit based on that pharmacophore and made an initial design in the lab — small is beautiful. Then they extended this design a little bit to add the two alcohol groups, and they added a urea group. Then they optimized the stereochemistry for binding in the specific binding site of HIV-1 protease. And this was the drug selected for phase one studies. And it didn't just go through phase one — it went through phase two and phase three, went to market, and has been used efficiently to treat AIDS. Pretty cool, right? And again, this was the first example of a drug where computers were used the entire way. Of course, these things are not selected by divine inspiration. Between each step, there were probably 10,000 experiments, right? They probably had 500 candidates here, or even 10,000. And then you make 10,000 experiments, pick the best one, and continue with it to the next step. So you can think of this as a very large sieve where you gradually pick the best ones and then try 10,000 more crazy ideas. At this stage, the crazy ideas were mostly human-selected, but what we're increasingly doing now is telling the computers: hey, have some crazy ideas.
Try to find 10,000 variations of this compound that might or might not bind better, and then we'll check them in the lab. Sorry, somebody had a question? Yeah? No, sorry, those probably weren't phosphate groups — I forget what those P1 and P1-prime were; they might very well have been these benzene rings. I can look it up if you want to, but it's basically properties, right? And even here, you had some sort of early hit that probably had super low affinity. Somewhere along the way, they probably went from a millimolar binder to a picomolar binder. There is no way you could have a drug that's a millimolar binder, because it would never pass the Food and Drug Administration. Do you know what this is? Remember that I showed you some biologicals earlier, right? This is the small-molecule counterpart of a biological: one of the most important blockbuster small-molecule drugs. Do you see that it consists of a bunch of benzene rings and sulfur atoms? It's not really super complicated. Again, it's not something I would come up with, but it's based on a bunch of iterations people have gone through. This is atorvastatin, which you've probably never heard of, because that's the chemical name for it. Have you heard of Lipitor? That was the drug. In the early 2000s, the most important drug on the market was Losec, which has been one of the biggest export successes of Sweden ever. Lipitor was the drug that took over as the world's best-selling drug after Losec. And Lipitor is basically a drug you use to reduce cholesterol. If you're a pharmaceutical company, this is what you dream about: a drug that's tailor-made for rich western people with lots of money who have midlife diseases — diseases we get basically because we're overweight — and a drug that you should take for the rest of your life. This is where you start having wet dreams if you're a pharma executive.
And again, it's so easy to make jokes about this. We complain about the companies, but we could of course do all this drug development in academia if we wanted to spend tax money on it — and we don't. And there are a lot of people in the world who die from heart failure and everything related to this; these are some pretty horrible diseases it's curing. So rather than blaming them — hey, we could have done better ourselves — well, we didn't. This drug at its peak pulled in something like $14 billion a year. Up until 2015, the total revenue for this drug was $140 billion. That's like the gross domestic product of a small country. But then something happened. So if you're here, you're very happy as the CEO, right? What happened here? The patent expired. And then people made me-too drugs. They probably still had something left here, because they probably had some other extra patent that made it hard to completely get around it. The other reason is that once there are copies that sell at one fifth of the price, what do you have to do? You have to reduce your price, right? So suddenly you're no longer as happy if you're the CEO. And in this particular case, they need to find new drugs. And those are not something you just pull out of thin air. Drugs like this tend to come along maybe once every 20 or 30 years for a company — and then it's some other company's turn. So this is the problem: once you get the drug on the market, you only have maybe one, two, three... eight, maybe nine years to make all the money back, and then you've lost the patent protection. Sorry? The patent is for 20 years, but that clock started counting somewhere around 1990, when they first discovered that there was a molecule with affinity for that receptor. Then it probably took them 12 or 13 years before the drug was approved on the market in the first country. And then you need to get it approved all over the world.
And every single regulatory agency will want new tests. So the point is that drugs are expensive because the risks are so insanely high. And the companies certainly make money — I am so not making excuses for them; they make tons of money if you look at some of their stocks. But it's not obvious that they are the bad guys. Many of these effects we have, of course, created as a society. And here's actually the smart thing with patents. Why do we have patents? To encourage people — to encourage people to do what? So a patent is a balance. What would happen if you didn't have patents? One of two things, right? Either companies wouldn't publicize anything at all — you might be told that if you have a problem with cholesterol, you can come to our special facility and get treatment with some secret drug that we're not going to tell you anything about. That's bad for society. The other thing that could happen is that companies decide: if I spend half a billion dollars developing this drug and the next day there are copies on the market, there is no way I'm going to do that — I would just lose money; let's not do this research. So patents are a double-edged sword, and we need them so that people actually invest in this. What companies get from the patent is roughly 10 years to make a shitload of money. But after those 10 years, this is available to the public. Long term, the idea of these molecules that bind and reduce cholesterol belongs to everybody; in return, the company gets 10 years when it's protected, but long term it belongs to the community. Yes, that's what we're getting to. The problem is that this keeps taking longer every year, because there are more and more regulatory requirements. For every small scandal, politicians come up with another 50 things that you have to prove. At some point, it's going to take 20 years to prove that a drug is safe.
And then it would be pretty stupid to develop drugs, because your patent would have expired before you could sell anything. So everybody would like to speed up this pipeline, and it hasn't really happened yet. But one of the reasons for all the computational stuff is that we ran out of ideas with the old ways of getting drugs, like natural products from the Amazon. So this has started to pick up the pace quite a lot again. Biologicals are one example. Another is that you can occasionally use molecular simulations directly for drug discovery. This is an example of brute-force binding of a drug: a large protein that they simulated for a millisecond or so, while this small molecule explores the whole surface of the protein. And after half a millisecond or so, you see that it binds in here. And this has later been shown to be the actual drug-binding pocket, because an X-ray structure has been determined. That requires an insanely large computer. I'm not sure if I have some pictures of it later on, but David Shaw is a very special guy... Did I tell you about David? Okay. So David was a young student at Stanford, did a PhD in computer science, and was one of the world's foremost experts on parallel computers, which, in the early 1990s, was a big thing. Then he became an assistant professor at Columbia University in New York, and he was interested in building very large parallel computers. He tried to get money for this from public funding first and didn't get it. Then he went to a bunch of Wall Street companies — Morgan Stanley, et cetera — and tried to get them to fund it. That didn't work either; basically the opposite happened: they recruited David instead to work on computational aspects of stock markets.
So he left academia, and — I think it was at Morgan Stanley he started, but I don't remember — David pioneered modern arbitrage trading, using computers to do trading on stock markets the way everybody does now. At one point in time, his hedge fund, D. E. Shaw, was the world's third-largest hedge fund. The guy is pretty rich, you might say. So when he turned 50 or so, he decided that he wanted to go back and, I guess, have a small hobby, and start solving some hard computational problems. So they started developing special hardware made just to do simulations thousands of times faster than you could with any normal program or machine. But of course, if you're worth a couple of billion dollars, you don't apply for assistant professorships. He basically hired 50 people and had them start working — this is his small hobby business on the side. They've designed a number of machines like this that can do simulations of proteins on the millisecond scale. So far, they haven't sold anything to the pharmaceutical industry, but make no mistake about it: the guy is good at making money, and that is their long-term goal. Within 10 years, I presume all the big pharma companies are going to have machines like this, using them for drug design. And when you simulate something completely, you can do a proper energy landscape analysis. You can show exactly what position of this ligand has the lowest free energy, and how the free energy changes in all other positions. How does the free energy change if you replace, say, a methyl group with an ethyl group? Or, if it's a biological, you could of course change the amino acids. That allows you to design drugs on a much, much more detailed level.
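That methyl-to-ethyl question is exactly what relative free energy calculations answer. Instead of simulating both binding events directly, one "alchemically" mutates the group in the bound and in the free state and closes a thermodynamic cycle — a standard construction, sketched here in my own notation:

```latex
\Delta\Delta G_{\text{bind}}
  \;=\; \Delta G_{\text{bind}}^{\text{ethyl}} - \Delta G_{\text{bind}}^{\text{methyl}}
  \;=\; \Delta G_{\text{CH}_3 \to \text{C}_2\text{H}_5}^{\text{bound}}
      - \Delta G_{\text{CH}_3 \to \text{C}_2\text{H}_5}^{\text{free}}
```

The two hard legs of the cycle (the actual binding events) are replaced by two unphysical but much easier-to-converge mutation legs, which is what makes this tractable in a simulation.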
In principle, you could do this with any computer, but any normal computer would be too slow. With this type of machine, because it's a thousand times faster, we can afford it. And the difference between MD and docking is that with molecular dynamics, we can calculate free energies really accurately — accurate to within roughly one kilocalorie per mole, much less than the strength of a hydrogen bond. And again, this is the future of pharmaceutical research. If you can take a specific molecule, or a series of them, and decide exactly how you should design your molecule to bind, the other thing you can start to think about is how to get across the kinetic barriers. It's not just about binding in a good place, right? At some point the molecule needs to move into that position too. How will it move in? Well, if you're simulating it, you're including all the motions of how it moves into the binding site. And this is where it starts being a bit of science fiction. There are plenty of groups that have shown this in academia for known targets, where we already know how things bind. The interesting thing is going to be whether this works in the pharmaceutical industry. And here's the problem: the pharmaceutical industry is a bit secretive about this. Why? So I say that there's lots of promise here. What academic groups tend to do — because we can't afford to spend a million dollars on all these tests — is pick a protein with a known binder. I know this drug binds because I've seen it in an X-ray structure. Then I do simulations to show that I can really predict how it moves in and how it binds, and I write the research paper showing that we really understand something. And you do understand something. But this is postdiction.
We already knew that this molecule bound. The really interesting thing: could you show this for a completely new, say, orphan receptor? And the people working on that are in the pharmaceutical industry. And why don't they share their results? Because to them, this is intellectual property. If my company can come up with a smart algorithm for doing this that you don't know about, that translates to money for me, because it means I can go after receptors in a way that's better than yours. So I think it will likely take 10 years before we see whether these methods are successful in industry too. I will show you some more examples of that in a second. But to be a little bit more concrete, let me show this for GPCRs. These are the membrane proteins that I said were so important that most drugs target them, right? This is such an outdated slide — you do not have more genes than you had two years ago; forget about that number. It's probably 100 now. The other thing is that this is an example where pharmaceutical companies have determined structures that they have not deposited in public databases. Because if I have my fancy XYZ receptor, and I spend a billion dollars to determine its structure so I can make drugs, I would be completely insane if the first thing I did was share it with the world. So even some of the first structures here were initially funded by the pharmaceutical industry: some of them were public, other structures were kept secret for a year because the companies wanted a head start to design drugs. The way GPCRs work is that there are these seven transmembrane helices, and then some sort of neurotransmitter or similar molecule binds here on the outside. Then the entire protein goes through a protein earthquake — it changes its conformation. And it's a G-protein-coupled receptor.
The "coupled" part is the G protein that sits on the inside. When this molecule changes its conformation, it creates a signal on the inside, because these are signal transmitters, right? And that's why they're so remarkably nice to target. There's a super complicated network of these; there are probably more up-to-date versions you can find online. The structure of these you already know, because they're similar to the very first membrane proteins that were discovered. That sounds a bit stupid, right? How on earth can they be so important if the structure is similar to the most simple membrane proteins? Well, the devil is in the details. We have known for a very long time that they have seven transmembrane helices. These were so difficult to determine structures of that a lot of people, me included, thought we would possibly never have any structure of a G protein-coupled receptor. Had you asked me when I was a student, that would have been my prediction. Then, for various reasons, people finally managed to determine structures. But the problem is that everything that decides, first, how it binds, second, how it moves, and third, how it signals on the inside comes down to tiny details, single amino acid changes: which way is that loop pointing? What binding site do you have here? So what specific molecule will this receptor bind? And that's the hard part. At first sight they will appear almost identical, and once you know what the difference is, it will be obvious that that was the difference, but you can't predict it merely from sequence. It's also an example of how any bioinformatics model will give you something that looks roughly like this, but the bioinformatics model will not tell you the specific position of that loop or exactly what the binding site here looks like.
And if you want to develop a new drug here, you have to know exactly what the binding site looks like. That's why, if you're running a pharmaceutical company, you could spend billions of dollars determining the structures. So what happened in 2007 is that within roughly 10 days, two structures of the human beta-2 adrenergic receptor were published. Kind of a coincidence, right? Nature and Science are the two top journals in fundamental natural science, so can you imagine what happened here? These groups were competing fiercely. At one point in time they actually collaborated, and then they broke with each other. Brian Kobilka in particular is the guy who had historically been the expert on understanding GPCRs. Ray Stevens is an amazingly skilled crystallographer. So they were initially working together, but at some point the break happened, and there are lots of rumors about how, including that one group tried to cheat the other. I have to confess that I'm biased, because I've worked in the same department as Brian. I really like Brian; he's an amazingly nice guy. But on both sides here they have company ties and everything. Ray Stevens' group has done a tremendous amount of work mapping out the entire space of structures in the 10 years that followed. But there's a lot of money involved here too, possibly lawsuits. We also know that once you know the structure of this, you can determine very specifically how drugs bind up here. You see that a binding site has been identified, and that requires you to know exactly which residues point into it. And if you think this is dry and boring: if you're running a pharmaceutical company, this is where you start seeing dollar signs before your eyes, because you know the binding site of something.
And what's happened since then is an explosion of high-resolution structures of various receptors, in particular the G protein-coupled receptors; no, sorry, they're not sorted by G protein-coupled here. But basically from 2011 or so, the whole difference is that we have more and more G protein-coupled receptors. That means we today probably have close to 100 different structures. You start having a bunch of different human ones, and quite a few from mouse or rat. And if you go online and look for this, you can actually see how people have gradually been mapping out the entire tree of all G protein-coupled receptors. Because at some point you can start to use bioinformatics, right? If you have a known structure with 90 percent sequence identity, you can probably build a good homology model. But the cool thing happening now is that every new structure lands in a new branch of this tree. And David Shaw and Ron Dror have even simulated G protein-coupled receptors and drug binding here. For simulations like this, one microsecond is already fairly long, and I think this simulation took something like 10 or 20 microseconds. So here you see the small drug binding. When this drug bound, you actually saw some of these helices moving, so they could see the entire activation of the receptor: how the activation happens, what transition states it goes through, and how it finds the new structure in the activated state. And again, then you really start to be able to predict things. The first step is understanding that the drug binds and that it binds well, right? The second step: can you design a drug that induces a specific motion in your protein? And after those studies, people determined a co-crystal x-ray structure of the receptor together with the drug.
So the pink one here is the simulated structure; the gray one is the structure obtained in the lab later on. Pretty decent match, right? And again, it's very easy to say that computers are slow. You're going to think computers are slow when you sit in the lab and try this, but on this scale, you're happy to buy a supercomputer for a couple of million dollars, or tens of millions of dollars. The other thing you should remember about computers is that they get twice as fast roughly every 18 months, right? Few other experimental techniques improve like that. When I started out as a student, we were thrilled when we could run a 10 nanosecond simulation. Today it's a 10 microsecond simulation in any thesis. That's a 1000-fold improvement in less than two decades, and it's going to keep paying off. Wait 10 years, and simulations like this are going to be commonplace; you're going to see them everywhere. You're also starting to see more and more simulations where people actually study what happens when things bind. As you might have seen in the previous simulation, the molecule was up here and then it spent some time waiting there. We can show the movie again. Do you see, during the binding, that it's waiting up there? It spends quite a lot of time there, but it's not really happy. Now it's waiting, and then eventually it's going to drop down here. So what do you think that waiting state corresponds to if you compare it with protein folding? You're going to have some sort of intermediate here, right? A state with relatively low free energy, but not the lowest. And of course, if you want a good binder, you would like the transition states to be relatively low so that you can bind quickly and reach the bound state quickly. And that bound state should have the lowest free energy.
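The 18-month doubling claim and the quoted 1000-fold jump from 10 nanoseconds to 10 microseconds are easy to sanity-check; a quick sketch (pure arithmetic, no data beyond the figures quoted above):

```python
def speedup(years, doubling_months=18):
    """Expected speedup after `years` of doubling every `doubling_months` months."""
    return 2 ** (years * 12 / doubling_months)

# The quoted 1000x improvement (10 ns -> 10 us) matches about 15 years of
# straight 18-month doubling: 2^(15 * 12 / 18) = 2^10 = 1024.
print(speedup(15))  # -> 1024.0

# Another decade of the same trend would buy roughly another 100x,
# which is why millisecond-scale simulations seem plausible soon.
print(round(speedup(10)))
```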
So this is really all the same thing as protein folding. It's just something we historically haven't thought of as protein folding, because it has been too expensive to treat it that way. But again, wait 10 years; this is how you're going to design drugs in the future. And they can even show that by changing things they can influence these barriers, making binding faster or slower, but I won't go into details there. This is another example; oh, this is just another drug where, in this case, I think they actually showed the full activation, and that they could achieve it with another drug. But in the interest of time, I will skip that. Brian Kobilka in particular has continued this work on the receptors, and they've been able to show not just what the structure of the receptor is, but how these entire interaction networks work: how the receptors bind the proteins on the inside, how the structural change in these receptors induces structural changes in the yellow and blue parts here, and really how the entire activation network on the inside of the cell operates. I'd say this is still very much work in progress, but they've been able to start deciphering the first part of these signaling pathways in the cell. I think I've touched upon this a couple of times, but do you see the point that all these things we think of as biology, cell signaling, cell division, are all caused by proteins interacting and binding and forcing something to happen? And of course, this happens in a millisecond or so, and then the cell has quite a lot of time to repeat it with trial and error.
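The point about tuning barriers to make binding faster or slower follows from transition-state-theory-style kinetics, where the rate scales as exp(-ΔG‡/RT). A hedged sketch (the barrier heights are made-up illustrative values, not from the studies mentioned):

```python
import math

R = 0.0019872  # gas constant in kcal/(mol*K)
T = 298.15     # room temperature in K

def relative_rate(barrier_a, barrier_b):
    """Rate ratio k_a / k_b for two barrier heights (kcal/mol),
    assuming the Arrhenius-like form k ~ exp(-dG_barrier / RT)."""
    return math.exp(-(barrier_a - barrier_b) / (R * T))

# Hypothetical example: lowering a binding barrier from 10 to 8 kcal/mol
# speeds up binding by roughly a factor of 30 at room temperature.
print(relative_rate(8.0, 10.0))
```

The exponential dependence is the whole story: small, chemically achievable changes in barrier height translate into order-of-magnitude changes in on-rate, which is what makes kinetics a design target at all.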
In particular for the G protein-coupled receptors, people have been able to show that it's really one particular helix that's involved here: how this helix moves and changes position as a consequence of binding a ligand in the outer part, and how this causes the receptor to activate. And I think they had some 20 to 30 microsecond simulations showing how the helix really changes place during activation, moving between active, intermediate, and inactive states. I'm not going to spend too much time going through that in detail. Yep. That's a super good question; let's make it more general. The question is: once a receptor binds a ligand, why doesn't it stay bound forever, right? Because that's a low free energy state. Let's think about what that would mean biologically. I'll use an ion channel because I think it's easier to understand. So I have my ion channel, and by default cells should not be leaky, right? So by default, most ion channels are closed. Now, for whatever reason, I have something that binds up here that causes my ion channel to open, and now ions are flowing through here. And the ions keep flowing. So all we managed to create is a cell that keeps leaking. That's pretty bad; that would be horrible in biology. So virtually all these receptors have evolved the property that the open state is not the lowest free energy state; it's an intermediate state. And again, we only know this because we see it in nature. So what happens is that you have some state that looks like that: the open state. And this is some kind of desensitized or inactivated state, so you stay here for a little while, and then you continue to some other state where the channel is closed again. Now remember that we have an equilibrium here, right?
In particular, if the protein changes its conformation in this state, it might no longer be as advantageous for the ligand to bind. And at some point, by the law of mass action, some of these ligands will bind and eventually unbind. It's not until the ligand unbinds that the receptor will recycle and go back here. That could take a second, or two seconds, or ten seconds. But in the meantime I've disqualified this channel from being open, because it has gone to this deactivated state. And eventually, when the ligand releases, because it might have a lower affinity in this state, the receptor will go back here. So the point is that this landscape is what you have when the ligand is bound, right? If the ligand is not present, the free energy landscape would look something like this instead: without the ligand bound, the closed state should be the lowest free energy state. The reason this works is that you don't have a uniform concentration of ligands in your cell. Take the synapse: say you have one nerve cell here, and I'm not going to win any prizes for my drawings, and the nerve signal comes in here and the neurotransmitters are released, so the channels gradually open up. Initially, you're going to have a very high local concentration of neurotransmitters here, right? That means lots of them might bind to my small ion channel on the receiving side here. But what happens with those neurotransmitters about a second later? They diffuse all over the place, right? They diffuse out here and out here. So the local concentration of this small molecule will very quickly decrease. Well, "very quickly"...
That might take a second. In the meantime, we have put the receptor in a deactivated state so that it's not open. And eventually, once the neurotransmitters have diffused away, we're going to need to use energy, because at some point the cell needs to reload this whole thing and fill the vesicles with new neurotransmitters. To do that, we use energy; this cycle couldn't work entirely at equilibrium. The very fast opening or closing of a channel happens because when you bind a drug, we are altering which state has the lowest free energy. When the drug is bound, it's better for the channel to be open. When the drug eventually releases, and remember, the drug was bound because we had lots of it present, right, it's this equation again: A + B ⇌ AB. If I significantly reduce the concentration of B, the equilibrium shifts back toward the unbound side. So as B diffuses away, the receptors gradually reset. And when B is no longer bound, the receptor will say: I no longer have that ligand bound; it's now better for me to be in the normal resting state again. And it will recycle. I just find it amazing that this works, but trust me, it does work on the molecular level, which we have found out basically over the last 10 to 15 years. Until 10 years ago, there was not a single protein that we knew in both open and closed states. The reason we know this today is that we have structures where we can see that a protein is either open or closed, and how it changes. I just have two more slides to show you that this is work in progress. As a consequence of this work, Brian Kobilka got the Nobel Prize in Chemistry in 2012. Whom did he share it with?
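The A + B ⇌ AB argument above can be made concrete with the law of mass action: at equilibrium, the fraction of receptors with ligand bound is [B] / (Kd + [B]). A small sketch (the 1 µM Kd and the concentrations are hypothetical illustration values, not measurements):

```python
def fraction_bound(ligand_conc, kd):
    """Equilibrium fraction of occupied receptors for A + B <=> AB,
    from the law of mass action: fb = [B] / (Kd + [B])."""
    return ligand_conc / (kd + ligand_conc)

kd = 1e-6  # hypothetical 1 uM dissociation constant (molar)

# Right after neurotransmitter release, the local concentration in the
# synapse is far above Kd, so nearly every receptor is occupied:
print(fraction_bound(1e-4, kd))  # ~0.99

# A second later, after diffusion has diluted the transmitter well below
# Kd, almost everything has unbound and the receptors can reset:
print(fraction_bound(1e-8, kd))  # ~0.01
```

So no active "eject" mechanism is needed for unbinding itself: simply letting the local concentration collapse drives the equilibrium back toward the unbound state.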
No, he got it together with Robert Lefkowitz, his old advisor, for the study of GPCR biology. And this is both the fun and the challenging thing with Nobel Prizes: the deliberations are secret for 50 years. You don't know how they reasoned, but I would guess their reasoning was very much focused on who had actually done all the important biology, the overexpression and everything. That was Brian Kobilka and Robert Lefkowitz. And while Ray Stevens' work is second to none, it's amazing work, they probably mostly saw it as work in crystallography, not the most important biological contribution. So we won't know exactly, right? But the prize can be split over up to three people. So on the occasions when they decide to award it to only two people, although there was a third obvious candidate, that sends a very strong signal that, for whatever reason, they didn't think the third person was as worthy. It's actually quite fun: if you're interested, some of the prize motivations are public after 50 years, and that's the 1960s we're up to. So you can actually find the prize motivation, for instance, for the DNA structure. I might have told you this earlier in the course. When did Watson and Crick discover the DNA structure? When did they get the Nobel Prize? 1962, nine years later. Why on earth did it take nine years before they were awarded the Nobel Prize for this? They were not even nominated until 1960, and this is, you could argue, the most important discovery in molecular biology in all of the 20th century, right? We know that. Oh, sorry: to get the Nobel Prize, you have to be nominated. The prize committee can also nominate you, but nobody even thought this was a Nobel Prize-worthy discovery until seven years later. The important science is obvious in the rearview mirror. In 1959, it was not obvious that the DNA structure was important.
It was, of course, important, but it was not obvious that it was a Nobel Prize-worthy discovery. Think about that: the things you think are obvious are the things you see in the rearview mirror. It's hard to do science looking forward. And looking forward, there are things happening all the time here. In Nature, less than one month ago, on April 5, Ray Stevens' lab published a new structure of the angiotensin-2 receptor, which is related to the angiotensin-1 receptor and to blood pressure. It's a new GPCR structure determined with free-electron lasers, a super cool new technique, and the first GPCR structure I'm aware of that is a free-electron laser structure. In this case, they actually show that one of the helices behaves in a different way, and this could very well lead to important blood pressure medication in the future. Related to that, also in April: a small European GPCR drug firm was bought for 800 million euros. Just to give you an idea, this is a small startup with basically one or two promising drugs in the pipeline, and then you have a big pharma firm going in and spending almost a billion dollars to get them, because they're so concerned about making money in the future. That's all I have for today. What I would like you to do until tomorrow, and the challenging thing is that this is not mentioned in the book, is look through these slides, search on Google if you want to, and think: first, what's the problem with the old way of designing drugs? How are we designing drugs today? And if you were forced to sit down and design drugs, what tools could we use? Think a little bit about that, because you might be there sooner than you think. It's 12:03; let's meet at one PM at SciLifeLab.