What I'm primarily going to talk about today is actually drug discovery and docking. I know that I said yesterday that we'll talk a little bit about molecular simulation and free energies. I will just touch upon that because I realize this is a much broader and larger concept and you're going to do a lab on this on Thursday. So this is likely more important to you, and then Bjorn and Dari will do at least the easy part of free energy simulations this afternoon. I'll come back to the free energy part and simulations in a second. So let's start with this discussion about all these questions, and when that's done I can go up and get the lecture notes. We'll take as long as we need here because this is not easy stuff. Let's do the same thing we did yesterday. It's not because I want to force all of you to answer, but when all of you only answer the questions you know, I get this great, ignorant view that, oh, they all know it. So it's much better when I see that there are some things you don't understand or that are less clear to you. So let's start. Number one. Right. So a transition state is something you need to cross, and do you ever observe any molecule in a transition state? You don't see it? No. Yes, but even then you can never observe it. You can indirectly get information about the transition state with smart tricks, but you can never directly observe a transition state, or the probability is epsilon. And in contrast, what's a folding intermediate or an intermediate state? But if you compare, it's easier than that. What's a folding intermediate compared to a transition state? That's the first step. But again, a transition state is also something in the middle between folded and unfolded. How is a folding intermediate different? Right. And why can you see it? Yes. By more stable we mean that this is a local minimum in the free energy. It's not the global minimum, but it's a local minimum, and that's the definition of a metastable or intermediate state.
There was a paper jam in the copier this morning, so let's go through this first; hopefully it's printing now, and then I can get your lecture notes for today. Good. So, transition state versus folding intermediate. Let's just continue around the table. Arrhenius plots. I think this is not necessarily just folding rates; it's the rate of any chemical reaction, and normally in simple chemistry you would only have one curve. But when you do this in folding you typically have what? So normally this is something you can do for any chemical reaction, and people used it long before protein folding. In a simple chemical reaction, as I said, you would have one line, so what's the complication with protein folding? Yes. So you have both folding and unfolding processes, right? Most simple, normal chemical reactions are so biased that they will only go in one direction. Say you burn hydrogen, right, then you're going to get H2O. You will never have H2O unburning to form oxygen and hydrogen. Technically you could, but the differences in free energy are so gigantic that it's a pretty good approximation that 100.0% goes in one direction. Protein folding is complicated because the differences in free energy are so small that it's a balanced reaction. And that leads us to question three. Chevron plots. Yes. So it doesn't necessarily have to be a denaturant, but the x-axis has to do with how folded versus unfolded you are. That's typically denaturant. In theory it could be temperature too. But same thing on a Chevron plot, you also have the logarithm of the folding rate. In one way it's a horrible plot in that, at least to me, it's less obvious what it actually means, but it turns out that it's much easier to measure, and that's why everybody uses it. You will hardly ever see any Arrhenius plots in protein folding.
So Chevron plots are better for this type of reaction that can go both forwards and backwards. Sadly, the Arrhenius plot would be easy to understand, but proteins aren't easy to understand. The key thing is that a Chevron plot measures the total rate of the reaction, in the sense of how fast you go to equilibrium. So that's the net effect of both folding and unfolding. And depending on where you are in this plot, you're going to head more towards folding or more towards unfolding. But the whole point is that this measures an effective rate, and that is much easier to measure in the lab. How does enthalpy vary during folding? Does it ever go up? So it's something like a spring? At some point, yes. In principle it doesn't go up during folding, but of course if you're plotting against the density or something, at some point you're going to start pushing atoms into each other and then it will go up. But during normal folding it only goes down. I think this is a great example if you ever get asked a question like this. There is no right or wrong answer if you actually explain it properly. You could have said that yes, it would eventually go up, but then you'll have to say why it goes up. How does, in contrast, the entropy vary during folding? Yes. So there's absolutely no contrast whatsoever. I was trying to fool you there. And that means what, when both of them go down? You can ask that too. The balance between the two. And sadly, that is what comes back here and everywhere. That's why protein folding is so complicated. That's why we have low energy barriers. That's why there is flux backward and forward; it is simply a very complicated reaction. Much more complicated than simple chemistry. What is the use of apparent folding rates with the Chevron plots? The use of it is that it's like the sum of the rate in one direction and the rate in the other direction.
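To make the Chevron plot a bit more concrete, here is a minimal sketch, assuming the simplest two-state picture in which the logarithms of the folding and unfolding rates each change linearly with denaturant concentration and the observed rate is their sum. The function name and all numerical values (rates in water, slopes) are made up for illustration and are not from the lecture.

```python
import math

def chevron_ln_kobs(denaturant, ln_kf_water=5.0, ln_ku_water=-5.0, m_f=2.0, m_u=1.0):
    """Return ln(k_obs) at a given denaturant concentration (M).

    Assumed two-state model with hypothetical parameters:
      ln k_f = ln_kf_water - m_f * [D]   (folding slows down with denaturant)
      ln k_u = ln_ku_water + m_u * [D]   (unfolding speeds up with denaturant)
      k_obs  = k_f + k_u                 (rate of relaxation to equilibrium)
    """
    ln_kf = ln_kf_water - m_f * denaturant
    ln_ku = ln_ku_water + m_u * denaturant
    return math.log(math.exp(ln_kf) + math.exp(ln_ku))

# Scan 0 to 8 M in steps of 0.5 M and locate the bottom of the "V".
concentrations = [0.5 * i for i in range(17)]
curve = [chevron_ln_kobs(c) for c in concentrations]
lowest = min(range(len(curve)), key=curve.__getitem__)

for c, y in zip(concentrations, curve):
    print(f"{c:4.1f} M  ln k_obs = {y:7.3f}")
print("bottom of the chevron near", concentrations[lowest], "M")
```

At low denaturant the curve follows the folding arm, at high denaturant the unfolding arm, and the two arms meet in the characteristic V shape.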
Yes. And in particular, as we're going to see from the next question, they're easy to measure and they make it possible to study stability in practice. What are phi values then, Sarah? What is a phi-f value for a residue? Yes, but you don't strictly compare the folding rates; you're getting something from the apparent folding rates that you compare. What is it that you're getting? The first step is, what do you get from the Chevron plot? What are we measuring, do you think? Yes. So the Chevron plot measures the apparent folding rate, as the logarithm of k. But normally we're not particularly interested in whether a folding rate is 14.5 per second. So what is it that we first try to read from the Chevron plot? The free energy. Good. Yes. So again, don't be afraid of guessing. The rate k is related to folding through the free energy, right? So k equals some sort of constant, and we never care about those pre-factors, multiplied by an exponential related to the free energy. And that means that when we take the logarithm, the logarithm of k is related to free energies. I can never get the absolute free energies, but I can get free energy differences from this. So that is the free energy of the transition state relative to the unfolded state. But that is not enough, because for the phi-f value I need to compare two things here. And what are the two things I'm comparing? I can only calculate free energy differences here, so there must be two free energy differences I'm comparing. What are those two free energy differences? No, almost. So I compare how much I change the stability of the transition state with how much I change the stability of the folded state.
And that's kind of the definition, so we'll stay there for a second. The phi-f value describes, for each residue, how much the residue stabilizes the transition state relative to how much it stabilizes the folded state. No, I can't. I can't observe the transition state, but this makes it possible because I'm indirectly observing it. I can't even calculate the absolute barrier, but when I move this curve around, when I mutate this residue, I can calculate how much that changes the transition state barrier. And exactly, this was the graph with the two Chevron plots where I was looking at the dotted differences between them, and I'm well aware that that was a bit complicated. But the really cool thing is how this leads to question eight, and that's not me, Dorinez, that's you. What would you then use that phi-f value for? So this was the stuff Alan Fersht developed. So again, when you don't know what you're looking at, take a step back: why were we originally looking at all these free energies, right? Because we wanted to understand folding models, and particularly we were interested in understanding whether this nucleation condensation model made any sense. So the phi value was a smart way of looking at whether residues were part of this transition state, right? And we can't observe it, but the key thing with the phi values is that we can see, if I change residue 14, whether the stability contributed by this residue affects things already in the transition state. If it affects the transition state just as much as the final state, if phi is one, then this residue was part of the transition state. If it did not affect the transition state, but only the final state, well, it was a good residue, but it's not a residue that's part of the transition state. So this helps us identify this folding nucleus, sorry, the transition state.
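The phi-value definition above can be sketched in a few lines. This is a minimal illustration, assuming the pre-factor of the rate is the same for wild type and mutant so that it cancels in the ratio; the function name, the sign convention (positive means destabilized) and the numbers are assumptions for the example, not measured data.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def phi_value(kf_wildtype, kf_mutant, ddg_folded, temperature=298.15):
    """Phi = (change in transition state stability) / (change in folded state stability).

    Both changes are taken relative to the unfolded state, in kJ/mol.
    Assumes k = A * exp(-dG_barrier / RT) with the same pre-factor A for
    wild type and mutant, so only the ratio of folding rates is needed.
    """
    if abs(ddg_folded) < 1e-9:
        raise ValueError("phi is undefined when the mutation does not change stability")
    ddg_transition_state = R * temperature * math.log(kf_wildtype / kf_mutant)
    return ddg_transition_state / ddg_folded

# Hypothetical mutation that destabilizes the folded state by 5 kJ/mol.
rt = R * 298.15
print(phi_value(100.0, 100.0, 5.0))                        # folding rate unchanged: phi = 0
print(phi_value(100.0, 100.0 * math.exp(-5.0 / rt), 5.0))  # barrier raised by 5 kJ/mol: phi close to 1
```

A value near one says the residue already makes its contacts in the transition state; a value near zero says it only matters in the final folded state.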
It's not trivial, but I think it's an awesome way. That's why I think this would be Nobel Prize-worthy. You're using super simple measurements. Well, you might not think super simple when you interpolate things in these Chevron plots, but come on, you're just measuring fluorescence. It's kind of the cheapest technique you can imagine in any lab. And it enables you to do things that are completely impossible to do even with X-ray crystallography or anything. It's a beautiful experiment. But I'm not sure how beautiful it is if you're one of those 25 students who had to add another 100 points to the plot, because it requires an insane number of experiments. So is it one of those things that, once it's done, you can use ubiquitously, or will you always have to do that kind of experiment? You will have to keep doing experiments to arrive at that. But the idea here is that what this allows you to do, for instance in a protein like the ones I showed you yesterday, is that once you've identified this folding core, you know which 5 or 10 residues form the folding core. And then, depending on what you want to do, sometimes you actually want to go after the folding core and start influencing it. Let's say that you made a new drug that's going to be an HIV drug or something. And this is a biological drug. You only have one problem. This drug either unfolds or misfolds or something. There's something wrong with the stability of the protein. If you want to change the stability or make it fold faster, you're going to need to modify the transition state. Now you know which 5 residues you should go after. And then you can, of course, start to do this type of experiment for each residue to see whether it stabilizes or destabilizes it. Yes, that will be a lot of experiments. But as you will see later today, that's nothing compared to the overall cost of drug discovery.
On the other hand, conversely, you might be designing a larger receptor or an antibody. You have a beautiful antibody already. This already folds. It's great. You don't want to destroy it. And in that case, you probably want to stay away from the transition state if you're going to do mutations, right? Whatever you do, you don't want to interfere with how the protein folds. So in many cases, you don't have to redo the experiment. It depends on whether you actually want to start fiddling with the transition state. So we already spoke about the enthalpy-entropy balance. Explain this nucleation condensation thing and how it explains Levinthal's paradox. So let's start with what the nucleation condensation model is. What does it say? It's the same thing here. It's not the case that you don't know. There are two words. Just looking at those names, you can say something. Right. So the whole idea is that you first have some sort of hydrophobic collapse. And once you collapse, you're going to start having some residues, which can be very far from each other in the sequence, forming some favorable interactions. That's this folding core. And eventually, you're going to have more and more residues joining this part, which grows just like a crystal in water or something. So it's a model very much taken from physics. We also know that we have this enthalpy-entropy balance, right? And the problem is that both the enthalpy and the entropy go down roughly with the number of residues, to first approximation, because that's the size of the system. And that means that you're not really going to get any free energy barrier at all, because if they go down by the same amount, they cancel each other.
What I then argued is that if you look at the very early part, when you don't really have much of a core yet, the interactions are going to be proportional to the area, r squared, rather than the volume, r cubed. And that means that the energy will not go down quite as quickly in the beginning, but the entropy will go down slightly quicker. And that means that there's going to be a small term that is not proportional to the number of residues, but to the number of residues raised roughly to the power of two-thirds. And that means that effectively, rather than the Levinthal time being related to something to the power of N, it's going to be an exponential of N to the power of two-thirds. It looks like a small difference, but if you keep these laws of very large numbers in mind, that's going to lead to a tremendous reduction of folding time. So the problem with diffusion collision is that we don't really have any model for how these interactions or how the entropy should go down, right? You could try to do it. It's just that neither the book nor I have thought about doing it. For a small protein, diffusion collision definitely works. As you get up to very large proteins, I would find it difficult to explain how the energy or entropy would vary in diffusion collision. So one problem with diffusion collision is that if you have a protein with n secondary structure elements, the number of interactions and the number of ways you can combine them go up exponentially, like something to the power of n. So in particular for small proteins you can of course explain it that way too, but as you get to a very large number of residues, I think that's mainly why diffusion collision fails. But remember, this is basically up to you. I'm not going to ask this on the exam.
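To get a feeling for how much the exponent of two-thirds matters, here is a small sketch that compares a search time growing like exp(N) with one growing like exp(N^(2/3)). All pre-factors and constants inside the exponent are set to one, which is an assumption made purely for illustration; only the scaling is meaningful.

```python
import math

def log10_search_size(n_residues, exponent=1.0):
    """Return log10 of exp(n_residues ** exponent).

    exponent=1.0 stands in for the Levinthal-style estimate, exponent=2/3
    for the surface-term estimate. Pre-factors are deliberately ignored
    (hypothetical unit constants), and we work with logarithms so large N
    does not overflow.
    """
    return (n_residues ** exponent) * math.log10(math.e)

for n in (50, 100, 200, 500):
    full = log10_search_size(n)
    reduced = log10_search_size(n, 2.0 / 3.0)
    print(f"N = {n:3d}   exp(N) ~ 10^{full:6.1f}   exp(N^(2/3)) ~ 10^{reduced:5.1f}")
```

For 100 residues the exponent drops from about 43 orders of magnitude to about 9, which is the tremendous reduction referred to above.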
It's happened to me. When you study physics, occasionally you have these home exams in theory, which are very fun, because you get an exam and you have one week. You can use absolutely any aid you want, apart from humans. But occasionally it happens that the professor retracts question four because they realize it's not possible to solve question four. And this is a problem. The only thing that we can do with the theory is show that there is a possible pathway that would make it possible to fold with a low free energy barrier here. And if that's the case, nature might use that. You can never prove that there isn't a pathway. So in theory, there might be some large proteins that could fold by diffusion collision. Sorry, say that again. In the same way that you say there are probably some key contacts that define the nucleus and then the folded structure expands from there, couldn't you also picture this with part of the secondary structure already formed? No, of course, you can imagine lots of intermediate models. I would argue that the key difference is that diffusion collision is really hierarchical, in the sense that you first form the short-range interactions. The helices in particular are only short range, right? So all the helical elements would form a helix first. Similarly with a beta sheet; it's slightly longer range than helices, but in particular anti-parallel beta sheets that go up, down, up, down, up, down, you could imagine forming first. The key difference with nucleation condensation is that residues that are just in close spatial proximity to each other start forming the core. And there is absolutely no reason whatsoever why they should be close in sequence. And then I started to talk about something that was not mentioned in the book because it wasn't clear when the book was written: these network models of folding. What does that mean? Yes, and not just that.
The key thing that we've been able to show with simulation is that there are many folding pathways. We can suddenly put flags on them and identify different clusters, no longer hand-waving, even for fairly large proteins of 50 to 60 residues; well, small by biological standards, large by computer standards. We see that they can go through half a dozen different pathways. And that means that rather than talking about transition states or one intermediate state, you can start thinking of these intermediate states as hubs in this network, and that proteins that fold well might be proteins that are very well connected, where it's easy to move between these hubs. But now we're getting very close to modern research. Yes, you can certainly imagine having different nuclei. I would argue that for most common proteins, and again, this is limited to the proteins people have been able to fold in computers, there appears to be one very dominant pathway in many ways, or at least the rate-critical step is frequently dominant. So you might have lots of different ways to get to this folding nucleus, but the actual folding nucleus is relatively identical in many cases. Then you might have the protein folding through some other pathway, yes. So when are proteins thermodynamically versus kinetically stable? And what did we mean by those two concepts again? Yes, and it actually turns out that in many cases these two appear to go hand in hand, but that's the whole point. These are the peaks in the free energy landscape. These are the valleys. So what is then the role of these transition states in folding? Yes, both. So they are rate-determining in the sense that they can't be too high, or we would never fold. But if they were too low, we would never stabilize the protein either, right? You don't want a protein that's fuzzy and that can go back and forth. I'm going to go up and get those; hopefully the printer should work now.
So I'll go up and get today's lecture notes for you in one minute. But there is one thing I want you to think about while I do that. Here, I show you two curves. They're not the most beautifully drawn curves, but these are two proteins that fold, we can assume. This is the denatured, non-folded state, and that is the folded state. So which one of these folds faster, and why? Think about that for a minute, and I'll go up and get the papers. Which one's folding faster? It's a little faster. Yeah. But it doesn't seem too obvious. Maybe only the rate-determining step matters and the rest doesn't. And it looks like the first peak is similar in both. So then that's the rate-determining step. The first one is? And the starting point and the end point are roughly the same, I think. The starting point is a little lower on that one. But I guess the second peak is a bit bigger, relatively. Yeah. I think one's going to be very fast. I mean, the rate is set by the second peak. The first peak in the second one looks similar to the second peak. Yeah. But didn't you say that from the denatured state it would go directly to the nearest local minimum? And since that local minimum is lower, both would go to that local minimum first. Yeah. OK. We couldn't agree on that. You couldn't agree? We battled it out. OK. That's pretty much how most science gets settled.
It's just that you have a violent fight at a conference instead. So let's see. What side won? OK. That's one answer. Do we have any other answers? I don't see any other answers. Sorry? The first one is faster. Why is the first one faster? Let's take votes. And you have to vote for one alternative. So first, who believes that they fold at the same speed? OK. Who believes that the top one folds faster? And who thinks that the bottom one folds faster? OK. Good. Equally divided. So remember those energy gaps I talked about, right? So here's where we start. This is the unfolded state and this is the zero energy level. And so we start here and fold here. Now, this is some intermediate state. But remember, folding is not necessarily one-dimensional. So this is just some state that we might or might not get stuck in. And of course, in this case, I might have been a bit bad because I showed that you can go in this direction too. So in this case, to first approximation, you might actually argue that they are the same speed. I should have drawn that differently. The other way of thinking is that if you start here, the barrier here might, to first approximation, be roughly the same in both. This one can certainly go back. It's a low barrier to go back, and then you start gaining energy here when you go back, right. And then you have to go up here again. On the other hand, the barrier here to go forward is also lower, and then you fold down here. So the way I've drawn them here, they probably are roughly the same speed. But let's think about something else. Say this is a real network model, so that this is not just a two-dimensional plot, but this is the native state, and this is one state of a thousand, and that is one state of a thousand. What would happen then? What I know is that this is, of course, a horrible approximation, right?
In particular, if you think about those network models in simulation, folding does not progress one way along a single reaction coordinate. A reaction coordinate is what you get if you want to describe something as a function of one variable. In practice, we don't have it that way; you're going to have a mesh here. You're going to have billions of states. And if you start here, you might go to this state, but it's not at all obvious that you have to go through this state to get here. So you might have a billion different states like this, with different energies. In principle, yes. But that's one of the last slides I brought up last lecture, and then you get into this problem. If all these states are connected, what would happen in a case like this is that you start high, then go lower, and then you gradually go lower still, and in principle that's good. But the only problem is that when you've gone lower, you have to rely on the fact that you can go from this lower state directly to your next better state, right? Otherwise, you would need to go back, and going back is now going to be fairly expensive because you need to move uphill. And in that case, if this had not been one-dimensional, if these had just been three different states, then although it's bad to go uphill, the probability of getting here would be better, because all these higher-energy states would be unattractive. You would only have one state that's significantly more stable than the unfolded state, namely the native state. But as I realized, I probably shouldn't have drawn these one-dimensionally. When they are drawn one-dimensionally, and assuming that the barriers are roughly the same height, they would be roughly the same speed.
So if you didn't follow my ramblings here, the key thing is that this also has to do with this distribution of energy gaps. For the fastest-folding protein, you want one state that's more stable than the unfolded state, while all the bad or misfolded states are higher in energy than the unfolded state. If some of the misfolded states start to be reasonably good, not quite as good as the folded state, but reasonably good, then we will like to fold into them, but we might have to unfold them to get back to the real folded state. And that has to do with this: if I pack my fingers the wrong way, that's different from the right way. So that depends. If they actually are connected, then they are the same speed. If these were just different discrete energy levels, the second one would be faster. The way I've drawn them, they would be roughly the same speed. I didn't think about that last night when I drew them. This relates to something else. Many of you have taken the bioinformatics course. And one of the things that surprised us all early on concerns something that seems obvious: once you have a simple bioinformatics model, you realize there might be some errors in it, but then can't you just throw this into a computer, minimize it and simulate it a bit, and fix up the model? That seems obvious, right? We should be able to improve the models. It doesn't work. People have tried it for years, and there are even stories that "if you touch the protein, you die", in the sense that if you start to simulate it, the model gets worse. That is primarily because you always have packing errors. You have side chains packed the wrong way. They're packed this way but they should be packed that way. You're not going to fix that with a short energy minimization or simulation. The only way to fix that in a simulation is to unfold the protein and then refold it again.
So while it might sound great to have a bioinformatics or homology model that's almost correct, almost correct does not cut it if there are packing problems. So you have to credit the Baker lab, because finally they were able to derive some really good homology models where you actually had the packing right. The key difference here is that once the overall topology of your side chain packing is right, you are in this final downhill part of the energy landscape, and then it works beautifully to use molecular simulations or refinement to get the last pieces in place. But that will only work in the final end game. It won't pack side chains in a different way for you. So how should you pack side chains? Nope. Well, the Anton simulations in the US would do it. You should use some sort of Monte Carlo scheme, right? You've done that in the lab. So try it this way and then try it that way. There are only those two possibilities. There's no reason to also try 300 billion different possibilities in the unfolded state. If you're not sure whether the alanine or the leucine goes on top, try those two possibilities. Well, slightly more than that if there are combinations. But the key thing to remember is that brute force and using the largest computer in the world is not always the solution to the problem. And that is a beautiful connection to what I'm going to talk about today. How are we going to use modeling in general in modern drug design? And this is used in industry. This is used on a very large scale. Not necessarily the methods you think. I would argue that there is very little MD used in this field. And in principle, you have already mastered quite a few of these steps. In the bioinformatics course, you probably got to the point where you can take a sequence that you find in a database and try to predict the fold by homology modeling or something, right? And you can frequently use a web server for this today.
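The "try it this way, then try it that way" idea for side chain packing can be sketched as a tiny Metropolis Monte Carlo search over discrete packing choices. Everything here is a toy assumption: each residue has just two alternative packings (0 or 1), and the hypothetical energy simply counts clashes between sequence neighbours that chose the same packing. A real packing program would use a rotamer library and a physical energy function.

```python
import math
import random

def clash_energy(packing):
    """Toy energy: one unit for every pair of neighbours with the same packing choice."""
    return sum(1 for a, b in zip(packing, packing[1:]) if a == b)

def monte_carlo_repack(packing, steps=2000, kT=0.5, seed=0):
    """Metropolis search: flip one residue's packing at a time.

    A flip is always accepted if it does not raise the energy, and accepted
    with probability exp(-dE/kT) otherwise. Returns the best packing seen
    and its energy.
    """
    rng = random.Random(seed)
    current = list(packing)
    energy = clash_energy(current)
    best, best_energy = list(current), energy
    for _ in range(steps):
        i = rng.randrange(len(current))
        current[i] ^= 1                      # try the other packing for residue i
        new_energy = clash_energy(current)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            energy = new_energy              # accept the move
            if energy < best_energy:
                best, best_energy = list(current), energy
        else:
            current[i] ^= 1                  # reject: restore the old packing
    return best, best_energy

start = [0, 0, 0, 0, 0, 0]                   # every neighbour pair clashes
best, best_energy = monte_carlo_repack(start)
print("start energy:", clash_energy(start))
print("best packing found:", best, "energy:", best_energy)
```

The point of the sketch is the search strategy: only a handful of discrete alternatives per residue are sampled, instead of unfolding and refolding the whole chain.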
These servers will usually build the side chains automatically for you, because if the template has an alanine but your sequence has a tryptophan, you're going to need to build in those atoms. Building side chains is a surprisingly easy problem. Getting the final parts in place, yes, that might not be trivial, but it's something you can do on a web server in a couple of hours; not very hard. And then you could hopefully minimize the structure. You could also simulate the model of this structure, assuming that it's really good. And as I told you, if this side chain packing is almost perfect, you can get a beautiful low-energy model here. Another alternative could of course be to just fold your protein in a computer. But why don't you try to fold your protein in a computer? Yes, the typical target in drug research is not the 50-residue toy protein. It's more like a 500-residue receptor. There is no way we can fold proteins like that. Second, think about what this takes. Well, I haven't done this in a long time myself now; my students are much better at it. It will probably take me several hours in an afternoon to dig up the best web servers. Several hours on my laptop, or three months on a supercomputer even for a small protein. And this is going to be higher quality. So this is how you get a model of a protein structure. Forget about MD or ab initio. There used to be this joke on the computational chemistry mailing list a few years ago. Have you gone through what ab initio prediction is in the bioinformatics course? So ab initio is basically what you would do in MD, right? Try to predict the fold from the laws of physics. And there was some Chinese student, I think, who asked about this. Somebody said as a joke that ab initio is Latin for "doesn't work".
But you know what, as much as we love protein structures, if you're running a large pharmaceutical company, you're completely uninterested in protein structure. The only thing that you're interested in is, can you design drugs? What you really want to be able to predict is what drugs bind and what effect they have. And we're going to need to take a step back there, because it's not always that simple. This is a horrible oversimplification of drugs, but this is the classical way of how a drug works. You have some sort of blob, a target, your protein. This is by no means obvious. You're going to think that this is obvious because I will show you these targets. But Sarah, what's the target for Alzheimer's disease? I have no idea. And that tells us that we don't know. It's not that I don't know; nobody knows. So for most diseases you don't have an obvious target. And it's not a coincidence that some companies specialize in a class of diseases. The company specializes in a class of diseases because they know these targets very well. So they do lots of fundamental research, for instance in obesity or something. They have an idea of what receptors there might be, or some proteins that we could target. And then you're going to need to find something that binds to this. If you don't have anything that binds to this, it's going to be very difficult to influence it. And when this drug, whatever it is, binds, hopefully you're going to get the biological response that you like. You're also going to get a lot of biological responses that you do not like. What we have seen, based on very early x-ray studies and more modern studies, is that at least for the simple stuff there's typically a very clear shape complementarity between the receptor here and the drug in the middle. Small drug, large receptor. It's one of these things.
There's no time to look at it, but when people have worked a few years, they start to say, oh, that pocket, that small cavity, it's going to be a binding site. And these binding sites are typically also small hydrophobic patches on the surface of a protein. So that's not trivial to identify, but with a bit of training, it's not very hard, neither for a human nor for a computer. What types of drug targets like that do we have? Well, it turns out roughly one quarter are G-protein-coupled receptors. Remember, I might have told you at some point, in the biology section, that 50% of all drugs target membrane proteins; that's our usual motivation why membrane proteins are so important. Technically, that's true, but it needs a bit of modification. The modification is that the vast majority of all those are G-protein-coupled receptors. Now, G-protein-coupled receptors happen to be membrane proteins, and they're a very general type of receptor. There are tons of different types. Neurotransmitter recognition and everything. The dopamine receptor is one example. Heavily used; the only problem is that until about 10 years ago there were no structures whatsoever of these. And for a very long time we thought that we would never be able to determine the structures of them, because they're extremely complicated to crystallize. There are some nuclear receptors, and then there's the third category here, ion channels. Ion channels are the fastest growing target class. Because this is also the problem: if you're going to start a new pharmaceutical company, do you want to run in the same direction as everybody else? Or do you want to find some sort of new target that nobody else is working on? And then there are lots and lots and lots of successively less important targets. So most pharmaceutical companies tend to focus on this roughly third of it. They might be, but it's also in many cases that they know more about these proteins in general. So you can do it for ion channels.
I know that AstraZeneca has a long history of working with pain, understanding pain. And it turns out that one of the most important receptor classes for pain is sodium ion channels. Voltage-gated sodium channels. Now that is one type of problem. We don't know the structure of those yet, but we know lots of structures of other ion channels, and companies have overall built up a whole lot of competence in ion channels. Also ligand-gated channels. So of course, if you are a company and you have lots of expertise here, you're not suddenly going to go there instead, right? Because what is your unique selling proposition? What is your advantage compared to all your competitors? Remember, these companies are sitting on tons of patents and everything. They frequently have patents for something that will bind to a receptor, although they're not quite sure how to use this in a drug yet. I'm sure that there are a couple of drugs worth billions of dollars hiding up here. The only problem is that we don't know what they are yet. There are a couple of different things a drug can do. So normally the black line here would be that nothing really happens; the receptor would be kind of sleeping or something. And normally if the right thing, the biologically active molecule, binds to a receptor, you say that you activate it and you get some sort of response. And then there are different things you can do. The easiest thing is usually: can we find a small drug molecule that binds exactly the same way as the neurotransmitter or something? It just activates the receptor, and by adding more of this molecule, we're going to activate the receptor more than the body would do naturally. This is called an agonist. And if it's a molecule that does this completely, it's a full agonist. If it's weaker than what the body would do, it's a partial agonist. In many cases, the opposite is true, though: your body might be activating your receptor a bit too much.
And if your body activates your receptor too much, you might want to shut off this response. So then you can have what you call an inhibitor. An inhibitor might be something that binds in roughly the same site, but it just turns off the function. It doesn't really get the channel open or something. It just blocks the real drug from binding there. And for some receptors you have inverse agonists: you get some other molecule binding, possibly elsewhere, and that has the opposite response. Say, instead of increasing a current, it decreases the current. All these types of drugs exist in nature. The problem here, though, is this entire picture. Sadly, this is the picture that we usually work with. It's simple, right? I have one receptor. I'm focusing on, say, my ligand-gated ion channel. I have one, maybe two binding sites, maybe an allosteric site. This is already starting to get too complicated. That is one molecule binding one site. Does it increase or decrease the response? Biology is way more complicated than that. So the second you eat some pill or something, you might hope that this would go into your mouth. It should be digested. Well, if it's a protein, bad things would happen here. But if it's a small compound, it might survive and go up through the blood-brain barrier and everything and have some effect here. But the same drug would likely have lots of side effects, like coins or something. Again, something that's small and hydrophobic is going to bind in more than one place. That's the curse of traditional drugs. So obviously, they must bind to the real target. They should bind to as few other targets as possible. Typically, it's going to be impossible that it never ever binds to anything else. This might sound harsh: why are there so many side effects with new drugs? I would argue there are way more side effects with old drugs, but old drugs are already on the market, so we usually don't pull them off the market.
Aspirin, it's an absolutely horrible drug. There is no way aspirin would be approved today. Too many side effects. But since it's been on the market for 100 years, we think that it's harmless. It's super dangerous. The compound must survive from administration to the target, but it must be small enough to get to the brain. If this is a protein, it would be digested in your stomach, right? Then you would need to inject it. If you're a company, you hate things that have to be injected, because at some point you want to sell this at 7-11. That's where you're going to make the billions of dollars. Ideally, you should have a slow and steady release of drugs. Patients are bad at taking their pills on time and everything. The problem is, the second you take a pill you're going to get a gigantic initial dose, right? Then the dose goes down exponentially as this is broken down. That's why I would argue the best possible way of administering a drug is when you can have a patch on your skin. You can really have it diffusing through your skin. I'm going to talk more about that when I talk about our research next Monday. There is a larger name that you frequently talk about. What I haven't even talked about here: at some point you're also going to excrete the drug, right? This is going to interfere with your metabolism and even excretion. It's also good if the drug does not kill you too much. Seriously, everything is toxic. Water is toxic in too large amounts. Toxicity is always a question of amount and how toxic it is. This concept is typically called ADME-Tox. ADME-Tox at most pharmaceutical companies is far more of a challenge than getting a compound that binds efficiently to a protein. But I'm not really going to talk about ADME-Tox. In particular, I'm not a toxicologist. Remember this when people start talking about docking and screening and everything: ADME-Tox is where most drugs fail. Getting something to bind is easy.
Getting something that doesn't have all the bad things, that's what's hard. So it turns out for a very long time drug development was quite simple. I think I have a slide or two on that, so I'll come back to it. There's a classical rule called Lipinski's Rule of Five. It's somewhat related to that but not quite. And again, this is not a law or anything. This is just based on 100 years of trial and error. What compounds make efficient drugs? They noted that pretty much every single drug known had a low molecular weight, so that it's small enough to be transported in, for instance, your blood. Or get into the brain. It should be polar enough to get into the blood stream. So this has to do with the partition coefficient. There should be no more than five hydrogen bond donors and no more than 10 hydrogen bond acceptors. And that means that it should be reasonably non-polar so that it can also cross membranes. And this is complicated. So it should be a bit polar and a bit non-polar. It can't be too polar and it can't be too non-polar. There were lots of successful drugs this way, but pretty much none in the last 20 years. We've exhausted the classical way of doing drug discovery. And one of the reasons is that drugs have side effects, and these side effects are now no longer acceptable to the general public. Which is a bit of a problem, right? Because on the one hand, and I'm by no means going to stand here and protect and defend Big Pharma. Big Pharma can in some cases be some of the nastiest companies and people. They're interested in making money. Now, on the other hand, you can argue, given the choice between making money by selling weapons and making money by curing people, which is better? So from that point of view, these companies need to make money. They're not into welfare. If it were welfare, we could fund it with your tax dollars in the long run, right? So if this is going to be a successful business model, at some point they have to make a profit.
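To make the Rule of Five from this passage concrete, here is a minimal Python sketch that counts violations for a compound. The thresholds are the commonly quoted ones (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors, at most 10 acceptors). The `Compound` class, the function name, and the two example compounds are assumptions made purely for illustration: the aspirin numbers are approximate, and `big_polar_thing` is made up.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float   # daltons
    log_p: float        # log10 of the octanol/water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors


def rule_of_five_violations(c: Compound) -> list:
    """Return the list of Rule of Five criteria that the compound breaks."""
    violations = []
    if c.mol_weight > 500:
        violations.append("molecular weight above 500")
    if c.log_p > 5:
        violations.append("logP above 5")
    if c.h_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if c.h_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    return violations


# Approximate, illustrative property values (assumptions, not measured here).
aspirin = Compound("aspirin", mol_weight=180.2, log_p=1.2, h_donors=1, h_acceptors=4)
# A made-up compound that is too large and too polar on every criterion.
big_polar_thing = Compound("hypothetical", mol_weight=900.0, log_p=6.5,
                           h_donors=7, h_acceptors=12)

for compound in (aspirin, big_polar_thing):
    broken = rule_of_five_violations(compound)
    print(compound.name, "->", broken if broken else "no violations")
```

This is a filter, not a predictor: passing it says nothing about whether a compound binds or is safe, only that it looks like the small, partly polar molecules that historically worked as oral drugs.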
And the problem is that we are increasingly imposing higher and higher and higher demands. There can't be any side effects. This has to be tested for 20 years before you put it on the market. This drives up costs. It's going to be more and more expensive to develop drugs. They will fail more and more. And eventually, we're not going to see a whole lot of new drugs. It's easy to blame the companies for it, but we're also certainly part of that equation. Traditional drugs would look like this. And of course, I expect you to know all these drugs. Nasal decongestants. I bet you've bought this drug. You've never heard of these names? Because these are the chemical... well, in some cases, chemical names. In some cases, they are business names. The way that the actual drugs are named has to do with marketing. So most of these drugs have a chemical name. Losec, which is the name of the AstraZeneca blockbuster pill in Sweden, was Prilosec in other markets. Different markets have different other drugs and you want your drug to be unique. So in the U.S. it's called Prilosec, and the follow-up drug is Nexium. The chemical is called omeprazole. But it's not called omeprazole in any market. The common factor of all these drugs is that you have a bunch of fairly rigid, frequently aromatic rings, right? But then you also have a bit of polar things. So small, relatively rigid molecules, organic, and they're nowhere near the size of a protein. How do you come up with these molecules? Well, that's maybe how we do it today. The traditional way is, and I'm not joking here, it's pretty much: go to the Amazon and look for things. So capsaicin, for instance, targets TRP receptors, TRPV receptors. That's the burning component in peppers. Or chilies. Small compound like this. Oh, I think I showed you that on a previous slide in the course. The only problem is that we've gradually exhausted those drugs, partly because they have side effects and everything.
So we're not finding a whole lot of those drugs anymore. Even omeprazole was a drug kind of like that: the first trace we found was a naturally occurring drug, but it was poisonous. You're not going to make a whole lot of money from a drug that's toxic, right? If you're really smart you might synthesize something from scratch, but typically you end up having a trick and an idea and a molecule you have identified, and you try to improve it. It's super common to do what you call a me-too drug nowadays. Do you know what this is? So assume here that Sarah's pharmaceutical company has developed an amazing drug, well, something like this. And you've spent 100 billion on this, by the way. Can I just add a methyl group here and then try selling my drug too? She's going to sue me for everything I have, right? And she's going to win. Because you're going to have a bunch of patents on the entire composition of the drug and everything. But what if I could create a different drug? Let's just assume that that's Sarah's drug, and I just happen to have a computer program that predicts, you know, another drug that will bind to exactly the same receptor. It has nothing whatsoever to do with your drug. Profit for me. And I might spend 10 million on this. And then you're going to cry because your company just lost all your sales while I walk off with the profit. So me-too drugs are great, but the patents are hard to get around. On a typical modern drug you don't take out one patent. You put out a carpet bombing of 1,000 patents on everything from the composition, to the ways of manufacturing the drug, to the way of processing and targeting this receptor. So the whole idea is that I should be terrified of competing with Sarah, because I know that the likelihood of me avoiding every single one of those 1,000 patents is zero. And if I happen to intrude on just one of them, I'm dead.
You can certainly try to design organic compounds, but what we are more and more getting into is active design. We already talked a little bit about this with peptides and proteins. So I'm going to focus on the simpler stuff here. So modern drug discovery consists roughly of these steps. We need a target, and I'll just assume that you've been so smart that you've already identified your target here. This is based on old traditional research. This is why all pharmaceutical companies run very large biophysics and biochemistry departments. They crystallize proteins themselves. Because if you know the receptor, then you can start designing something to fit it. Expensive but necessary. And then there are a couple of different phases. By far the longest phase is what you call preclinical, and that's what we do in this building. So first you need to find something to start with, a hit. In principle, divine inspiration works there. But that's in principle. In practice it doesn't work, because the probability of finding something with divine inspiration is zero. So you're going to need to find something to start with, which was traditionally the Amazon. Then you need to ask whether this has any effect whatsoever. If it doesn't have any effect, it's kind of pointless to continue. This effect is typically going to be low. So you're going to need to improve the affinity or the efficacy. The affinity is how hard you bind. The efficacy is how efficient you are, the effect it has biologically. So this is called lead optimization or something. And then you need to start testing this, first on cells in test tubes. Eventually you're going to start needing an animal facility, which we don't have in this building. We have cell testing facilities here but not animal testing. That's over at KI. And you can recognize animal testing facilities, but I'm not going to say how because this is recorded. It's a bit sensitive, you know why.
Once you get here your drug is worth nothing. Nobody's interested. Sorry. You might have some companies that start talking to you. But they're not going to. It's not worth anything in money. There are 1,300,000 of drugs that get here. So at some point you're going to need to go through phase one studies. Is this safe in humans? This is where, for example, the TeGenero trial and other things failed. When things seem to work in some test and it's safe in humans, that's when a pharmaceutical company will start talking to you. That's when they might be interested in buying your small startup or something. And then you're going to need to see: does this work in humans? Which is not as obvious, right? And eventually you're going to get to what's called phase three. Does it work better than the previous alternative? In principle, you're not allowed to put something on the market unless it's better than the previous alternative. And you're surely not going to make a billion on a drug that is not quite as efficient as aspirin at curing headaches. So then you probably have something that's interesting, right? Because then you could argue that it works better in the sense that you have fewer side effects. And for most simple drugs we will hardly accept any side effects. That might sound strange, because you might think that most drugs have side effects. Well, yeah, but most drugs have side effects in one patient in 10,000 or 100,000. We are extremely diverse. And these side effects might typically be that you have a slight increase in blood pressure or something, tiny things compared to the original disease. There are a couple of rare exceptions. That might be, for instance, drugs used in cancer therapy, where you're going to die unless you get the drug. And then we tend to accept way worse side effects. The problem here is that you typically fail. In the preclinical stages 70% of projects fail.
And the sad thing is that this is the failure rate at each stage. In phase 1, 40% of them fail. There are dangerous side effects in humans. In phase 2 it's only 40% of the drugs that appear to have any effect whatsoever once you get here. And this might sound horrible, right? Because you already tested that it had an effect here in a mouse or a monkey or a horse. But 40% of them don't have any good effect in humans. In phase 3 it's usually better, because by here we have usually measured so much that the drug we've designed actually has an effect in humans, but at least 40% of them are not better than the previous alternatives. And then roughly 25% fail when the Food and Drug Administration, which is called different things in different countries, is going to approve it. In Sweden, for instance, it could fail either because it's not considered harmless enough, or because it's slightly better, but this is just 1% better. What can happen to many companies is that, again, you might have developed a beautiful cancer drug, but this is going to cost one million dollars per month for treatment. And then the Swedish agency, which is our equivalent of the FDA, says: you know what, this is an awesome drug, but it's not worth the cost. How many patients are going to have 12 million dollars per year to pay out of their pockets? So unless you have somebody who actually can fund this and pay for it, eventually you're not going to have any market. So this is hard. And if you're going to fail, you really want to fail early in drug discovery, because the earlier you fail the cheaper it is to fail. Nobody cares if you fail here. Actually the pharmaceutical companies don't care at all, because nowadays they let the universities handle all this.
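To see what these per-stage failure rates add up to, here is a small Python sketch that multiplies the survival fractions together. The numbers are the ones quoted in the lecture; the phase 2 figure is stated ambiguously (40% fail versus only 40% succeed), so the 0.40 failure rate used here is an assumption, and the function and variable names are just illustrative.

```python
# Fraction of projects lost at each stage, as quoted in the lecture.
# Assumption: phase 2 is read as "40% fail"; the lecture is ambiguous here.
FAILURE_RATES = [
    ("preclinical", 0.70),
    ("phase 1", 0.40),
    ("phase 2", 0.40),
    ("phase 3", 0.40),
    ("approval", 0.25),
]


def survival_by_stage(failure_rates):
    """Return [(stage, fraction of the starting projects still alive after it)]."""
    remaining = 1.0
    result = []
    for stage, fail in failure_rates:
        remaining *= 1.0 - fail  # only the survivors move on to the next stage
        result.append((stage, remaining))
    return result


survival = survival_by_stage(FAILURE_RATES)
for stage, fraction in survival:
    print(f"after {stage:12s}: {fraction:.1%} of projects remain")
```

With these numbers a bit under 5% of the projects that enter preclinical work make it to the market. If phase 2 is instead read as only 40% surviving, the end result drops to roughly 3%. Either way the point stands: failures late in the chain are the expensive ones.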
So this beautiful thing, that they love to collaborate with universities: what they are actually saying is, you know what, they much prefer if the university and the taxpayers pay for their preclinical research and then they buy the successes. The one in a hundred company that's successful, they buy it roughly here. But forget about the economy for a while. We need to develop drugs, and it's much better to find that there is a mistake and there's something horribly bad in the drug here rather than discovering it here. When I told you about anesthetics there was a... no, I probably didn't. I can tell a little bit about it. But there have been examples where drugs have gone all the way to the market and then they've been pulled after a couple of months on the market. That's the case where CEOs are fired, because you just lost the company billions of dollars. So the development might take at least 12 years, it's frequently been 15 or 20 years, and do you know how long patent protection is in most of the world? 20 years. This is probably longer today. If you patented it and then it takes you 15 years before you can even put it on the market, you're going to have 5 years to make money from it. They have even made exceptions to this now. Pharmaceutical companies are allowed 5 extra years, so you can apply for an extension and get 25 years. But this is why drugs are expensive. You still only have 10 years to make back all that money that you spent on the development. You might have spent 300 million euro; this is probably higher today, it could be up to a billion or so. So there's a substantial amount of money there. Do you think the companies typically make that type of money off drugs? They do. We've got to show it, not always of course. There are some things that fail, but with large teams and everything, the successes are awesome.
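The patent arithmetic above can be written out as a tiny Python sketch. It uses only the numbers from the lecture (20-year patent, 15 years of development, a 5-year extension, 300 million to about a billion euro in development cost). The function names are made up for illustration, and the break-even figure is a deliberately naive assumption: it ignores manufacturing, marketing, interest, and the cost of all the failed projects.

```python
def years_on_market(patent_term, development_years, extension=0):
    """Years of patent-protected sales left once the drug reaches the market."""
    return max(0, patent_term + extension - development_years)


def break_even_revenue_per_year(development_cost, market_years):
    """Average yearly revenue needed just to recover the development cost."""
    if market_years <= 0:
        raise ValueError("no patent-protected time left on the market")
    return development_cost / market_years


base = years_on_market(20, 15)                   # plain 20-year patent
extended = years_on_market(20, 15, extension=5)  # with the 5-year extension

low = break_even_revenue_per_year(300e6, extended)   # 300 million euro
high = break_even_revenue_per_year(1e9, extended)    # about a billion euro

print(f"protected years on the market: {base} (no extension), {extended} (extended)")
print(f"naive break-even revenue: {low / 1e6:.0f} to {high / 1e6:.0f} million euro per year")
```

So even under this naive model a single drug has to bring in tens of millions of euro every year of its protected lifetime before the company earns anything on it.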
Let's talk a little bit about this, from the discovery and research on some sort of target validation, and then what we call high throughput screening, and then a little bit of the preclinical stuff. Assuming that we have the target, let's assume that we're preclinical scientists. We now need to find this, and at some point we're going to try to optimize it. This part is nowadays 50% computer based. The way a typical pharmaceutical company works, and this varies a bit from the old days (by the old days I mean when I was your age): what you do now is that you have an iteration time that might be 4 or 6 weeks. So every 4 weeks the team sits down around a table, like you, and says: so what are the experimental results, what are the computational results we have? These are the 10 interesting compounds we found for our receptor and everything. And then you sit down and discuss this and decide which ones to synthesize, and then you send this off to, usually, China and have them synthesized, for $50,000 per compound or so, because it's a serious amount of complicated organic chemistry to synthesize one arbitrary molecule that should look the way you like it. And then you wait roughly 2 weeks or so, and then you get these compounds back, and then you run the tests. And then based on these tests... well, in the worst case none of this had any effect whatsoever, and then you're going to need to start over again, and you just spent half a million dollars in these 4 weeks just on these 10 compounds. You can now try 10 new compounds based on those results. The usual outcome is of course that you might have found that, you know what, there were one or two of these compounds that appeared to have a very tiny effect, and that of course means that you should probably search more in that direction, right? And you can keep doing this week after week, and every month you spend another half million dollars. At some point the CEO is probably going to start to ask you to see some results from
this. This costs the company a serious amount of money in addition to your salaries, and this is why you typically use computers for this. So can you come up with smarter ways to test? If we can make smarter predictions and still only synthesize 10 molecules, but get better results for those 10 molecules we synthesize, we can significantly reduce the number of iterations we need and hopefully get better results. So this is why I mean that you kind of ping-pong back and forth: the computers predict which 10 molecules to synthesize, and based on those results you feed it back into the computers, and you do this over and over and over again. So what you typically need to start with is some sort of hit molecule. Today this typically comes from high throughput screening or something; I'm going to show you some other things. In principle, in the old days this was something you found in the Amazon. This is one of Sweden's largest export successes ever. This is omeprazole, which is what you can take against heartburn, and it also cures ulcers in combination with antibiotics. Originally this was a completely different molecule that was toxic, and you had to... omeprazole existed as a racemate, two versions of the stereochemistry, and then you eventually isolated that and optimized the molecule. But at some point there was an early lead that people decided was worth going after. And ideally you should not just have one molecule, you should have a bunch of them, from some sort of series: molecules that look similar, because it means that this appears to be a generally interesting direction to go in. It might be possible to optimize this. One way of doing this is what you call HTS, high throughput screening, and high throughput screening is not as fancy as you might think. High throughput screening is basically the equivalent of a chemist doing lots of tests, but nowadays this is typically not a chemist but a robot. You might be able to do a thousand simple tests a day. Now this assumes
that you have some sort of simple assay: that if I have my receptor X, there's a simple way to test whether something binds to my receptor X. If your receptor X is an ion channel... did you see our patch clamp lab here on floor 4? I can show it to you some other time. In principle you can measure this by, well, expressing something in a cell and then putting small glass pipettes on it, which takes 30 minutes or so to do very carefully, and then measuring the currents as you're adding different chemicals. Today you have patch clamp robots that can do this with thousands of cells and automatically test: a new cell, a new chemical, measure the currents, see whether there is any ion channel current, throw this away, test a new one. It's relatively expensive compared to what we do at a small lab scale, but compared to the cost of synthesizing chemicals it's nothing. It's less than 1 euro per cell. So there are all types of different robots optimized for different things that screen through this very quickly, and if you're a pharmaceutical company you might be screening hundreds of thousands or millions of compounds. If you're lucky you might get 10 or 100 leads. And, oh sorry, the cost might be in the ballpark of 1 dollar per well. The point is, if this was 100 dollars per well you couldn't screen a million, and if this was 0.01 dollars per well we wouldn't bother with computers; then it would be more efficient to do it chemically. This still ends up being fairly expensive. You need lots of chemicals. Every single chemical that you need to test here has to be synthesized, so you can't sit down and draw random organic compounds, because they're going to cost 50,000 dollars each to synthesize and then 1 dollar to test. So here you can only work with a large library. We have that up at SciLifeLab here actually, so we have a library of a couple of 100,000 compounds. Some of them we know might be active, but other ones we just have a stock of. If you have a receptor that you're
interested in, testing all these 100,000 compounds is reasonably cheap because we already have them manufactured. There is also a large database called ZINC, which stands for "ZINC is not commercial", and that's, I think, a million compounds or so that you can order; they're available commercially. The only problem is that this doesn't work, or at least it shouldn't work. The real space of chemistry, of molecules, is something like 10 to the power of 60 if you look just at small molecules. And the probability of finding something, even when testing only things that are reasonable, is one in a million or much less, one in a billion probably. You're not going to find anything by searching randomly, unless you have a receptor that binds virtually everything, but then virtually everything is going to bind to it too. There are a couple of cases where you have been able to find molecules, and I would guess, I don't know these receptors specifically, but it might be a receptor where you have an idea of what the natural chemical, the natural agonist, looks like, and you try to find something similar. So in this case it was possible to find experimental hits. In the first case not: we haven't been able to find anything with lactamase. You tried a third of a million compounds, nothing. And this is what's so frustrating with this business. Most things fail. You're not going to see anything. So there are very few hits in traditional HTS, or high throughput screening. One more question? So that's a good question. So what people do, and this is a rough estimate of course: you think of applying this Lipinski's rule of five, you can't go above 500 in molecular weight, and then you also need a reasonable number of rings, so this should be reasonably rigid. Roughly how many ways are there to combine small chemistry groups? Whether this is 10 to the power of 30, 40, 50 or 60 or 10 to the power of 100 is really irrelevant. It's a very large number, but it's not infinity. Another thing we could do is
what you call pharmacophore modeling or QSAR. I'll show you what that is in a second. Oh sorry, I missed something up there. I'll come back to that in a second. So QSAR: do you remember this plot I showed you about the Meyer-Overton correlation of anesthetics? If you have all these compounds except methoxyflurane, and then you just do a measurement for methoxyflurane and you realize methoxyflurane has a partition coefficient that would be close to 1000, you could probably say without seeing this plot that methoxyflurane is likely going to be a very efficient anesthetic, right? Because it appears to have similar properties as the others. You can think of this as a series: there's a series of molecules with similar properties, in this case hydrophobicity. If you're now going to pick new anesthetics, will you pick ones with a low partition coefficient or ones with a high one? You're going to try to go after the ones with a high one, right? Even though you don't know anything about the molecule. So QSAR stands for quantitative structure-activity relationship; I even have that there. This is much simpler than it sounds. So just take all your compounds and make a long table of all the properties: how heavy is it, what is the charge, what is the dipole moment, what is the surface area, what might the partition coefficient be, how many hydrogen bonds can it form? And then use this in a database to try to find other molecules that are similar. So this is kind of bioinformatics but for chemical molecules, right? Can you imagine what you call that? Chemoinformatics. It's a very broad field. You can even do this in 3D. So this is a molecule where you have, in this case, serines and cysteines in blue (they're somewhat similar), glutamines, and they're in the rear. So if you know what the receptor looks like, you can even map out the three-dimensional space and see what small compounds would roughly fit into this shape and pattern. The advantage is that now you're not
you're not doing anything structurally here. Forget about the coordinates. You're just looking at a table with five numbers. You can screen through 10 to the power of 60 molecules in no time. Well, not no time, and maybe not 10 to the power of 60, but 10 to the power of something that's fairly large, because it's fast. So there are some really cool advantages. You can screen through an insanely large chemical database even if these chemicals haven't been synthesized yet. You can even let a computer program generate new compounds that look like your previous compounds; nobody has even thought of this compound yet. And it does find ligands. The problem is that for this to work you're going to need something that already binds fairly well, right? It's kind of like a homology model. You can find molecules that look like the molecule you already had, but you can't find something if you don't know any molecule that binds. If the molecules are large and flexible, well, if there's a molecule with three or four rotatable bonds, I'm going to need to know what this molecule looks like when it's actually binding. It's probably easier to show you this on the slide. So if this is my large molecule: where is there a polar group here, where are the potential hydrogen bond partners, where are there aromatic rings? And then we typically use this to try to describe a very simple pattern, where you might say aromatic ring and hydrogen bond donors, or whether there's a dipole. So rather than describing atoms you try to describe these properties in three dimensions. Say that there has to be a hydrogen bond donor exactly, well, roughly 2.5 nanometers from a hydrogen bond acceptor, there have to be two aromatic rings, the molecular weight should be roughly 200. Why would the molecular weight matter? Well, both because of these Lipinski's rules, right, it can't be too heavy, but if this
is going to bind as a key in a lock, it's probably a good idea to have the key fit roughly in the lock. If it's too small, it's not going to bind well enough in that part. So this is very boring in a way. It kind of works; I say that it works, but it's rarely perfect. You can certainly find lots of similar full agonists or something for an arbitrary receptor. In this case you might see that most of these, I think, have two aromatic rings in common, right? So apparently these two rings are one of the common features here, and that's what you can usually find with pharmacophores. So pharmacophore modeling is a great way to find more things like that. Remember that you're sitting in this meeting and you need to find more things to test: one out of your 100 compounds showed a little bit of effect. This is an awesome way to find more things that might look like that. The only problem is that it gets complicated. You can start mapping out aromatics and hydrophobicity, you can start mapping out the volume and everything. In principle it works, but it's not going to be a revolution. The reason why people use this is that it's fast, and because it doesn't use the protein structure. This far we actually haven't really used protein structure; we're just looking at the structure of the other small drugs that bind, right? So in theory you could do this even if you don't have the receptor structure.
But given that this entire course has been about proteins, it seems a bit stupid not to use the structure of the protein. And this is of course the reason why pharmaceutical companies are very interested now in, for instance, cryo-EM: all these pharmaceutical companies want to start new cryo-EM departments and start doing cryo-EM themselves. So today, if you're actually going after something and are serious about it, you
usually want to start with the structure of the target. And that opens another possibility: molecular docking, or screening by docking. Docking is pretty much exactly the same thing as high-throughput screening, but you do it in computers instead, and that's why you actually have a different name for it: vHTS, virtual high-throughput screening. In principle this is very easy. In theory you could even do this in a simulation, right? Just put your receptor in a box, then put a small compound in the box, and then you simulate this for like 10 years and you're going to see whether it binds. And then you simulate one more molecule for 10 years and see whether it binds. And the great thing: now you only have 10 to the power of 60 minus 2 molecules remaining. So it's not quite going to work to do this in a simulation.
But if we forget about the simulations for a second, in general this is fairly easy. We want to dock: we want to put two molecules together, and you could say we even want the best way to put two molecules together. So in some way we're going to need to quantify what the best way to put two molecules together is, but we also need to know what the possible ways of putting two molecules together are, right? This is somehow going to be related to energy functions and everything, but we're not really going to use proper data; this is just going to be some sort of arbitrary score. I need to be able to say that this is plus 100, which is good, and this is plus 5, which is not as good. And I'm also going to need to find some way of doing this faster than 100 years, otherwise this company is not going to survive very long. In some cases it might be possible to do this through pharmacophores and everything, but the problem with pharmacophores is that they will only find what we already know. In principle, if we know where the docking site is and what the properties around the docking site are, we would like a computer program to try to place things in this docking site and find something,
let's see, if there's a hydrogen bond acceptor there and a donor there, we would like to match those and score those well, and here we want something hydrophobic to interact with this. So can we just get a computer to recognize that? We might only need to test a fairly small number of conformations of this drug just around this site, and then we're in business, because then we can start doing lots and lots and lots of these in parallel.
Jens Carlsson is a young researcher we recruited here, with docking as his specialization. He has this really fun slide: he had a two-year-old daughter at the time, and he said his two-year-old daughter is at daycare sitting and working on that, and he's sitting and working on this; the difference is that she's good at it and he isn't. But this is a fairly good summary of docking. This is really what you're doing. Well, the difference is that two-year-olds might succeed one time out of four, and here you might succeed one time out of four million or something. You're trying to find something that fits. So in one way you're focusing more on the structure of the receptor: which parts of this molecule could fit where on the receptor? So you're going to need a crystal structure, or potentially a homology model, to do this. The crystal structure is going to be better because it's higher resolution, but in many cases you might not have an alternative; you might have to make do with a homology model.
I'm just going to see how we're doing on time. Yes. To formulate this slightly more scientifically than I did in the hand-waving: for the first part, we're going to need to test lots of conformations, and this is somehow related to sampling, sampling in the same way that we talked about entropy and everything. And the second part is this way of deciding which is best: the scoring function. So what is a good scoring function? That's completely wrong: we know exactly what a good scoring function would be. We might not
be able to determine it, but what would the best scoring function be? Yes, or no: one that finds one good compound. Because again, if you're running a pharmaceutical company, what is your goal? Do you want to find the best theoretical anti-HIV drug that nature could make, or are you happy to just find a really efficient HIV drug? You don't need to find the best; you just need to find something that works. Ideally, well, it would be great if that was number one, but in practice it doesn't have to be number one. If you can test a billion molecules, could you rank the good one among the top 1,000 of your one billion molecules? It doesn't matter if it's placed at 999, that's fine, because the top 1,000 we can often synthesize, and then the stockholders are going to be really happy because you now have a new blockbuster drug. The fact that you also synthesized 999 bad drugs at a cost of roughly $50 million, who cares? That's nothing.
The only problem is that this is hard; this is almost as hard as MD. Even if you take a small molecule here: we can rotate it around the x, y and z axes, we can translate the drug, and then there are just four rotatable bonds in the molecule, because these molecules are mostly rigid, but there are four bonds we want to rotate. If you can sample, say, 100 conformations per second, even with a fairly coarse sampling this would take hundreds of years to finish. So this is not going to work. You can't test a huge number of drugs and also test every single possibility, and in this case it's just a 10 ångström region we're searching, right? The problem with this, of course, is that it would be like sampling all of phase space in a simulation. For most of these, for instance if I start by placing this compound in a place where two atoms are overlapping, I should stop testing. Two atoms overlapping will never be good; it's just stupid to keep testing
for 10 years there. So what you use in docking is not typically simulation. Well, it's kind of related to Monte Carlo, but you typically use what you call a genetic algorithm. This sounds really fancy, but it's actually very simple: genetic algorithms just work by mimicking the way natural selection works. You start by just throwing out, say, a thousand random populations of molecules: throw the molecule in 100 different places on the surface of your protein. Then we see what the scoring function gave, and I see which were the best conformations of these 1,000 and maybe pick the 10 best ones. Then I do a mutation, in the sense that I'm changing something: I rotate it a bit here, or try to place it slightly differently, and then I score this again. So genetic algorithm is really just a fancy name for trial and error, but the difference is that we try to learn from our errors. We do a trial, then we look at the errors, and then we continue sampling in the direction where we appear to be doing well.
Yes? So, let's see, I think I have a slide on that, but for now this is a black box. Here we're talking about the sampling, right, where we're placing things. Assume that you had a perfect scoring function, so that if you could just find the best drug and place it in the right place, I would score it perfectly. That's not going to be the case, because we're going to have approximate functions, but even assuming that you had a perfect function, there would still be the question of how we search. So this is primarily about how we search and how we test new positions; exactly how we score them, that's not specified yet. There are other algorithms: you can basically try to put a fragment of the molecule in one position and then try to add more parts to the molecule to build it up. There are probably another half dozen, if not two dozen, algorithms.
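The trial-and-error loop described here (throw out random poses, keep the 10 best, mutate them, score again) can be sketched in a few lines of Python. This is only a toy illustration, not how any particular docking program works: the pose is reduced to four numbers, and `toy_score`, `random_pose`, `mutate` and `genetic_search` are hypothetical names made up for this example, with a made-up score that simply peaks at an arbitrary "ideal" pose.

```python
import random


def toy_score(pose):
    """Hypothetical stand-in for a docking score (higher is better).

    It peaks at an arbitrary 'ideal' pose; a real program would score
    contacts between the ligand and the receptor instead.
    """
    target = (1.0, -2.0, 0.5, 3.0)
    return -sum((p - t) ** 2 for p, t in zip(pose, target))


def random_pose(rng, n_dof=4, span=10.0):
    """Throw the molecule somewhere at random (n_dof is an assumption)."""
    return tuple(rng.uniform(-span, span) for _ in range(n_dof))


def mutate(pose, rng, step=0.5):
    """Small random change: rotate a bit, or place it slightly differently."""
    return tuple(p + rng.gauss(0.0, step) for p in pose)


def genetic_search(score, n_pop=1000, n_keep=10, n_gen=50, seed=0):
    rng = random.Random(seed)
    population = [random_pose(rng) for _ in range(n_pop)]
    history = []  # best score seen in each generation
    for _ in range(n_gen):
        # Trial: score everything and keep only the best few.
        population.sort(key=score, reverse=True)
        parents = population[:n_keep]
        history.append(score(parents[0]))
        # Learn from the errors: new trials are mutations of the survivors.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(n_pop - n_keep)]
        population = parents + children
    return max(population, key=score), history


if __name__ == "__main__":
    best, history = genetic_search(toy_score)
    print("first generation best:", history[0])
    print("final best score:", toy_score(best))
```

Because the survivors are carried over unchanged, the best score can never get worse from one generation to the next; everything that matters in practice is hidden in the scoring function, which is treated as a black box here.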
But the point is that this is not extremely scientific, in the sense that you're not talking about physics or distributions or anything. Just test lots of things. If you test smart things, they're going to score very well at the end, and if they don't score well, you're not particularly interested in them. It's very much about brute force: testing as many things as you can. But it should be fast, fast, fast. The faster you are in scoring these, the more things you can test, and the more things you can test, the more things you have a chance of seeing.
And that brings us to the scoring function. In theory you could use an MD simulation to score this, right? But the problem is that you want a free energy, you don't just want an enthalpy. In an MD simulation, for every single compound you would need to put it in water to calculate the free energy of binding, which I will explain more later. This would take a week, if not more, so you can't do it. Forget about free energies: even a simple MD simulation just relaxing a small protein, you did that in the labs, right? That will take an hour. You can't do that, not if you want to test a thousand per second. The problem with MD is that there are too many atoms in the system; there are thousands of particles.
So there are a couple of different kinds of scoring functions. In principle you could use something like MD; the scoring function in MD is what we call the force field, right? So there are docking programs that try to use some sort of physics, with terms for Coulomb and van der Waals interactions. You typically ignore the water, because there would be too many atoms in the water. Or you can have some sort of empirical function where you just give plus one, for instance, if you can form a hydrogen bond. In an MD simulation you would calculate exactly what the electrostatic interactions are between all the atoms close to this, right? But here you can just say: if these two atoms are close enough that they might form a hydrogen bond, plus one.
You can say: if two hydrophobic atoms are close to each other, plus one; if a hydrophobic and a hydrophilic atom are close to each other, minus one. So just something very simple. Now I'm starting to throw out a whole lot of the baby with the bathwater, but I gain speed, and the more speed I have the better, since here I'm just screening things. And of course this is not completely random. You can tune this so that you reproduce experimental protein-ligand complexes. In some cases we know how molecules are going to bind, so you can train this with a computer to make sure you give good scores to things that should be good and bad scores to things that should be bad.
You could also use something called knowledge-based potentials. Did you talk about this in bioinformatics? This might sound really strange, but you make statistics about favorable and unfavorable interactions: what is the probability of two oxygens being close to each other in the Protein Data Bank, or oxygen and nitrogen, or carbon and carbon? Why on earth would that work? It's related to something you know really well, a certain distribution we've talked a lot about in this course, right? From the Boltzmann distribution, the probability of seeing something is proportional to exp(-ΔF/kT), right? So if you just observe the probability of seeing something and then take the logarithm of that, minus RT ln of the probability is related to a free energy. That's the minus sign Christian forgot on Friday. There's a name for that: you typically call this Boltzmann inversion. You have some sort of distribution, a set of probabilities, and you can invert that to get the free energies, provided you have enough statistics. You're going to need exhaustive statistics about every single particle, and that usually doesn't work. It's also going to be insanely noisy. So these knowledge-based potentials occasionally work.
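As a rough sketch of the Boltzmann inversion just described, the following Python turns a histogram of observed distances into relative free energies via F = -kT ln(p / p_ref). Everything specific here is an assumption for illustration: the function name `boltzmann_inversion`, the uniform reference distribution, the pseudocount used to tame empty or noisy bins, and the example counts, which are invented and not taken from the Protein Data Bank.

```python
import math

KT_KCAL_PER_MOL = 0.592  # approximate kT (RT per mole) near 298 K


def boltzmann_inversion(counts, pseudocount=1.0, kt=KT_KCAL_PER_MOL):
    """Convert observed counts per distance bin into relative free energies.

    F_i = -kT * ln(p_i / p_ref), where p_ref = 1 / n_bins is a uniform
    reference (an assumption; real knowledge-based potentials choose the
    reference state much more carefully).
    """
    n_bins = len(counts)
    # The pseudocount avoids log(0) in empty bins and smooths noisy ones.
    smoothed = [c + pseudocount for c in counts]
    total = sum(smoothed)
    p_ref = 1.0 / n_bins
    return [-kt * math.log((c / total) / p_ref) for c in smoothed]


if __name__ == "__main__":
    # Invented counts of some atom pair seen in six distance bins.
    observed = [2, 40, 120, 60, 30, 28]
    for i, f in enumerate(boltzmann_inversion(observed)):
        print(f"bin {i}: F = {f:+.2f} kcal/mol")
```

Bins that are seen more often than the reference come out with negative (favorable) free energy and rare bins with positive free energy, which is all a docking score needs; with sparse statistics the values are dominated by noise, which is why the smoothing matters.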
You have to make them very, very smooth. And of course a knowledge-based potential could, for instance, model that hydrophobic things will be close to each other and hydrophilic things will be close to each other, but hydrophobic and hydrophilic things will not be close. In many docking programs you use a combination of all these three, and that might seem even more horrible. In docking there is no credit for being proper. All these things we talk about, properly reproducing physics, having a correct distribution or something: forget about it. The only thing that matters in docking is what? Yes: good things should score well and bad things should score lower, on average. You don't care about individuals. It's horrible from a physics point of view, but it's very nice from a practical point of view.
We should take a break fairly soon, but I'll finish the parts on docking here. There are a bunch of ways you can use grids, for instance, to score. This is probably easier if I do it. Rather than calculating interactions, I can start by taking my protein and saying: around this grid point we have lots of hydrophobic things, so I'll call this grid point hydrophobic; these two grid points I can call hydrophilic, because there are more hydrophilic residues close to them; and here is another grid point where I'm not sure, hydrogen bond acceptor or whatever. Then when I have my small molecule, I just run it through my entire grid, test it on every single grid point, and probe roughly how well it scores. So if you're into horrible approximations, then docking is for you. And again, it's by no means that I dislike docking. The advantage is that it's fast, right? You take a compound, place it, and start mapping out what all the interactions around it are and what the best way to put a molecule here is, and that might take you one second. And that's the beauty of it, because if this takes
you one second, you can try a thousand different poses and a million different molecules, and this might take you a week on a supercomputer at most. What's happened now is that computers are so fast that this works much better. There are lots of errors in this, but you have to compare them to the amount of errors and mistakes you make when you synthesize molecules chemically, right? The whole idea is that when you can try this, you can frequently find something that kind of works. Remember those β-lactamase and cruzain examples, where you actually find docking hits. In the second case, where we already knew something about the molecule, we're not going to find quite as many docking hits as experimental hits, but in the first case docking helped us find some things even where we did not find anything experimentally. Now, two hits might not seem like a miracle, right? But the point is that we have two hits to work with. Now you can start developing pharmacophores; you can see whether we can improve these hits. If you start with zero hits and then you become 10% better, you're still at zero. As long as it's zero you do not have anything to work with. If you have just one hit from docking, you have something that you can start improving. It doesn't matter, because at this stage a hit here...
This is a good question; I'll tell you why we don't care. These hits are lousy, but they're not zero. There's something to work with. It's like, I have no idea what you're... it's a very small only branch, just a rough...
You're sitting at the exam and you're filling in the last question. You have absolutely no idea what you're going to answer, and suddenly: could it possibly be related to the Boltzmann distribution? It likely isn't, but at least you're trying: let's see, can I think about this? And that's what docking is. These are not good, but it's something to start working with. You might still fail. Forget about 5%: you could still fail 99% of the time, but if you're better than chance we have something to work with. And the reason for that is that the price of computers goes down all the time. Compared to what you could do 10 years ago, we can do at least a thousandfold more; every 10 years docking is going to improve by a factor of 1,000 because of the speed of the computers. The lab cost might go down by a factor of 2 in 10 years. And that's why we're gradually seeing this shift, that we're moving more and more of early drug discovery away from the lab and into the computers. It's cheaper to do it in the computers, and if it's cheap, we fail early and we fail cheap.
Surprisingly little. This is one of the fields that 10 years ago was a gigantic consumer of computing power. I would say it might be in the ballpark of 5% or 10% of supercomputing time that's used for docking, less than MD. So you still use supercomputers, but Jens Carlsson, the group up here, they might be using 50 nodes, 1,000 cores or so. It's a substantial amount of computing time, but you have to compare that to $50,000 for synthesizing one chemical. It's not cheap, but it's cheaper.
I have no idea. Probably not, because these are likely based on chemicals that were similar to some compound that was already binding, while this probably came from the ZINC database or some gigantic thing. So they might be similar, but they're likely not going to be identical. We don't know.
There are going to be two more slides and then we'll have a break. There are ways to try to make this slightly better,
because to make docking really quick, what you typically do early on is assume that the protein is entirely rigid and that your molecule is entirely rigid. Do you remember the concepts I talked about, lock and key, induced fit, and selected fit? These are really concepts that originated in docking. So in the simple early stages it's lock and key: just find something that fits perfectly, that's perfectly complementary. Well, it's fast, but it's not very accurate. At some point you want to start allowing these molecules to be a bit flexible, but the second you allow these molecules to be a bit more flexible, the number of degrees of freedom in the molecule explodes, so it becomes way more expensive. This is something that's happening more and more now, and that's related to your question: as we now suddenly have a factor of 1,000 more computing power, we frequently use that factor of 1,000 to allow at least the small molecule to be flexible, because then we might be able to make a better prediction. But here you start having a balance. What's more important: to screen a factor of 1,000 more molecules, or to try to do a better job of scoring the molecules we do screen? I don't have the answer; it depends.
And the awesome thing is that once you've done this, you might have a drug, if you're fine with eating 5 kilos of medicine per day. This has to do with the affinity, right? These are not very good, and if you have something with a very low binding affinity, most of the molecules are not going to bind to your receptor. The way to fix that is by adding more: the more of this compound you add, the more compound you're going to have bound to the receptor. So you could just increase the concentration enough. Whether this is 1 kilo or 5 kilos I don't know, but the point is that these concentrations are probably a million times too low; the binding affinity is a million times too
low. So to get anything to happen here you would need to eat 5 kilos. I haven't seen a whole lot of these drugs in the pharmacies, for some reason. Because if you pick any small, semi-toxic organic compound and eat 5 kilos of it per day, it's not going to be good. You can probably go down to any pharmacy and buy some cleaning product or detergent or whatever and start drinking 5 bottles per day. No, don't do that; it's not going to work. So I think that's a great place to... We have a lead, we have a good idea here, but it's not going to work in practice; it's way too bad.
It does both, particularly nowadays. For a long time people didn't do anything like it. You can use small fragments, and literally as you're sitting in the binding site you can actually build the molecule in place: it would be great to have a methyl group here; it would be great to have a hydrogen bond donor, so I'll add a hydrogen bond donor, and that will likely be better from a docking perspective. The only problem is that each of these compounds is going to cost you $50,000 to test. The other alternative is to work with... but the advantage is that you have all of chemistry space. You can do absolutely anything you want, right? And that's why you can hopefully find something with very good affinity.
What people tend to do today is use either an existing public database or, if you run your own pharmaceutical company, internal libraries. Again, if you're working with ion channels, your company will have a very large database focused entirely on ion channel drugs, and you're going to have drugs that you've tested before that seemed interesting but didn't work for whatever reason. Those are going to be the first drugs you test again. The advantage with those libraries... there's one called the ZINC library, and it's nice because it's not commercial; anybody can order from it. We've done it tons of times. You pay between $10 and $50 per compound, because somebody has already synthesized it.
And then of course I can test 100 compounds, and it costs me a thousand dollars. That's fine; it's nothing compared to a student's salary. So usually it's better to start, at least, with smaller libraries where it's cheap to get the compounds. Again, I wouldn't want a student project where we need 100 compounds and it's going to be $50,000 each, because in all likelihood it's going to fail. It fails for the pharmaceutical company too, right? So we want to fail cheap; let's start with the cheap stuff. At some point we're going to need to start optimizing and testing, and that leads into what we're going to talk about after the break. Let's see, the time is now 10:40. Should we meet here at 10 minutes past 11?
Right, so before the break we got to this point that in theory we have a drug, but it's going to be so inefficient that we would have to eat absurd amounts of it, which would lead to some other side effects. You had a hit, one of these single things you find, and typically a collection of hits, some sort of series, is what we move over into what we call a lead. A hit is just one thing, but a lead is some general direction: this class of molecules actually looks like it's worth pursuing. The next step is then what you call lead optimization. Now we would like to improve the affinity. This affinity is typically measured as a concentration, which is really the concentration you need to get, say, 50% effect or something in a binding assay. Initially this might be millimolar, well, millimolar is probably what you're going to see: not particularly efficient. A really outstanding drug would have picomolar affinity, so I'm talking about a billion times better. The way you're going to do this today is with a whole lot of computational chemistry. You might need to determine an X-ray structure of your receptor, or rather of your receptor in complex with your drug, to see exactly
how it binds, to see whether I can improve the drug based on this structure. It would be awesome if I also had a hydrogen bond donor here, right next to this ring; then you try to add one and see if it improves. So you need to understand binding really well, and then you need to gradually refine this. This is typically where you have this four-to-six-week iteration. You have something that works, but you're trying to get it better. Even if you're the lead of such a team, you're going to have some sort of boss above you, right? And he likely expects to see that week by week the affinity improves, in the sense that the concentration you need goes down. At some point, if this concentration no longer goes down, what's going to happen? Well, they get tired of spending half a million per month on your chemicals, so they're going to close the project, and if you're lucky you can move over to another project. That's what eventually happened at AstraZeneca in Södertälje, south of Stockholm. They had focused on a very broad area of research, and eventually the company started to feel: you know what, we've been investing for 15 years and there haven't really been any new blockbuster drugs in this area. Eventually the company decided to pull out of this particular area, and then you close an entire facility and a thousand people become unemployed. That happens. Of course, they open lots of new facilities too, right?
Ideally you also want molecules that are easier to synthesize and hopefully not too poisonous, and strangely enough we're frequently able to fix this. This is what happened with omeprazole: you eventually found a way to cut the molecule in two halves, and when you only had half the molecule it was no longer toxic.
This is a great example. This was actually one of the first computer-designed, or computer-refined, drugs: an HIV-1 protease inhibitor, one of the first anti-HIV drugs. Let's see if I remember this correctly. This is a
diol. You started out with this symmetric drug that had some sort of activity, and from that drug you created a very simple pharmacophore: you needed a hydrogen bond donor/acceptor here in the middle, you have some distances, and then you had these two phosphate groups at a distance of roughly 8 to 12 ångström. A super simple pharmacophore, right? Then you started screening databases and found lots of things like it, and one particular hit that you then tested, and you realized that it appears to work. Eventually you end up with a slightly different design. This is not that much smaller; you have phosphates here too, so it's a slightly larger drug. Then you extended this drug to make it a diol, slightly larger, with two alcohol groups. You added a urea part to it here; don't ask me why, it was likely good in some testing. Eventually they started to optimize the stereochemistry so that it would bind more efficiently. This is an extreme summary; they probably went through a hundred iterations, right? And this is the drug they eventually selected for phase 1 studies, and it is used clinically today.
Good question. Probably 10 or 15 years ago; 15 years. This was big when it first appeared. Today this is how you develop all drugs. There is a big group in the US, at Yale University, led by Bill Jorgensen, and they're really the world experts at optimizing these drugs. He has spent his entire career on this: give them a millimolar drug and they will get a picomolar version of it. Frequently they do this with MD, but I will have to come back to that on Monday, because in principle, once you have something that's fairly good, we might be able to start calculating very specifically, and in particular improving, the binding energy or something. This is a drug developed that way. Do you know what this is?
Atorvastatin. It's a statin that you use against high cholesterol, basically. You have probably never heard of that name, right? No. Atorvastatin is also known as Lipitor. This is the best-selling drug in the history of mankind. Remember how much I said it costs to develop a drug? Yes. Ballpark, how much has this sold for? $150 billion. So there is a reason why they're spending $500 million. But of course, for every drug like this there are going to be a thousand that failed; hopefully not all failures cost $500 million. So the whole idea is that running a pharmaceutical company is almost like playing the lottery. Once you get here, you want these drugs, of course, right? But as you well know, you're not going to win the lottery every time. So the whole thing is: can you reduce the price of the tickets on which you don't win? Therefore you want to fail early. You will fail at least 999 times out of a thousand, but the earlier you fail, the lower the price of the non-winning tickets. It's not a problem that this cost a billion to develop, because they made 149 billion. But of course now this revenue is gradually going down. Why is the revenue going down?
It's going down quite rapidly, because the patent has expired. So suddenly anybody can make it and sell it for a tenth of the cost, and of course then the original manufacturer has to reduce the price to the same amount, otherwise nobody would buy the original drug. It is, and again, that's the way patents work, right? The reason why we award patents to companies is not that we care deeply about companies. You get a patent, but what is it you have to do in return for getting a patent? No: it becomes public, right? When you get a patent, you have to describe exactly what you did, and 20 years later that is publicly available; anybody can take your patent and copy it. So it's not a balance of terror, but it's a balance with the interest of the public: this has to be made publicly available, but in return for making it publicly available you get a unique right to it for 20 years. In theory you could have decided to keep it a secret instead, although then somebody would try to synthesize the pill.
But based on what you know, it seems obvious: you just use molecular simulations, much better force fields, and discover drugs, right? To be politically correct, it's not the world's greatest idea, if I may say so. You're going to see a bunch of simulations here, just because people have done it. You have David Shaw and his Anton machines in New York, right? This is one example: here is a protein, and here you have a small molecule just out in the water, and they run this for several microseconds. They've averaged the shape of the protein. The drug starts binding there; I think the actual binding pocket is going to be in here. The drug didn't like that, and then it starts to find its way somewhere there. Now it has actually found the correct binding pose. The cost of this is astronomical, because you can only do this for one drug, and it probably took several weeks on one of those Anton machines. Long term, on the other hand, there
was a day when people ran simulations like the ones you did in the labs, and it took months, and it was completely pointless. As absurd as I might think this is now, there might be a future, probably in your lifetimes, where this becomes so cheap that it's just as cheap as docking, and then there's no point in using docking anymore. It's always dangerous to mock things, and I do mock this; I said this is not how you should develop a drug. But keep in mind what happens if computers are suddenly a million times faster, because, again, I might not know exactly where the binding site is, and the simulation finds that for me.
Now, the beautiful thing with this is that the second you have a simulation, you can start plotting out all the interaction energies, and if we have enough of these interaction energies we should even be able to derive some sort of free energy. Because that's the problem here. Getting the interaction energies, Coulomb, Lennard-Jones, electrostatics, the enthalpy, that's trivial. We calculate them in the simulation, so I just need to store them and see what they are. What is it that I don't get in the simulation? Entropy, right.
There are a bunch of different ways around that. I could of course start talking about probabilities: what is the probability of seeing the molecule there? If I just collect enough probabilities, I can invert the Boltzmann distribution and say what the free energy is. That would work. I could also take this molecule, if I know where it's binding, and try to gradually pull it out. As long as I measure the force I'm pulling with, I'm literally measuring the exact work I have to do on the system to get the molecule to unbind. I would need to do this very slowly, or there would be noise and hysteresis effects, but in theory, as long as I know how much work I had to put into the system, I know how much free energy it took to change it, because work
corresponds to free energy. In some cases it turns out that you can do really smart things alchemically, so I can add or remove an ethyl or methyl group; in principle that's just related to differences in free energy. But there's going to be one problem here, which Bjorn and Dari will show you in the lab today. What is that?
This is not entirely easy, but assume that you have a mountain and that you're scaling it. When I start at the foot of the mountain, we start measuring how much I go up. Say I just went up one meter; that's fairly easy to estimate, right? If this mountain now corresponds to a free energy, then as long as I'm in the valley it's easy. I'm going to explore the valley really well, and I will map out the relative heights within the valley beautifully. It's also very easy to move to the next valley, by car or whatever magic, and I can explore the next valley too. But I will never explore the peak. So what is the height of valley A versus the height of valley B? I know all the relative heights in valley A and I know all the relative heights in valley B, but how are these valleys related to each other? I will not be able to say that unless I also have the relative heights at the peak of the mountain, and it's very expensive to be at the peak of the mountain. So to actually calculate free energy efficiently in a simulation, I need to find a way to also study all the bad conformations between the good conformations. Bjorn and Dari will show you that in a simple example today, and then I'm going to talk a little bit more about how we do that in practice in the simulations.
The good thing is that, in theory, when this works well, molecular dynamics simulations can be awesome. They are expensive, mind you. So this is an experiment. This is a protein called FK506 binding protein; it doesn't matter what it does. These are experimental inhibition constants and these are calculated ones, and the error here is less than half a kilocalorie per mole. You can predict binding. Each of
these simulations might take you a day or something. There are some very large pharmaceutical companies in the world, and I can't tell you which ones they are, that are using some of the largest supercomputers in the world to try to explore this. This is still on the research level. So why do they do this, when it's a thousand times more expensive than docking? Well, you might be able to get things you can't do in docking. Say that you want to specifically design an antibody or something, doing things that are really complicated, with much larger molecules that are too complicated to handle with docking.
And again, they're not stupid. They're well aware that these computers are too expensive to use in production. But 10 years from now, that computer which is currently the largest one in the world, you're going to have the equivalent of it in flops in your pocket. Actually, no, it's going to be slightly more than 10 years. In 1996 the largest computers in the world had the power of an iPhone today. 1996 was the year I started graduate school, so that's where you are now. So the very largest machines in the world today, you're eventually going to have in your pocket, and then of course this will likely suddenly be a very attractive alternative. So to be able to get your patents, to be able to go there, to be able to have the expertise, the research has to precede the actual practice by one or two decades.
There are lots of things you could do there. You could for instance figure out what happens if you have a molecule here that's going to rotate in two different ways, or if the entire molecule is rotated. Some of these things can be really hard to get in docking. If you're smart you can do it, but even today there are some things that simulations simply do better. They're still the exception rather than the rule, though. Had you asked me 10 years ago, I would have said you're completely crazy if
you try to apply simulations in drug discovery. Today it's starting to happen, and in 10 years I would guess that it's going to be common.
I think that's all I'm going to say about docking. I'm going to move over to the part we actually used to talk about, which is GPCRs. So why did I include GPCRs in this lecture rather than the membrane protein lecture? They're by far the biggest docking target; when we think GPCRs, we think docking. They are seven-transmembrane-segment proteins, which is kind of fun because that's similar to the very first membrane protein structure that was determined. Do you remember which one that was? Rhodopsin, bacteriorhodopsin. The same class of proteins, and the scary thing is that if you just look at them superficially, you couldn't tell this from rhodopsin. So that seems stupid: why didn't we just make a homology model on rhodopsin and solve the entire problem? We had a structure for rhodopsin, right? So why didn't we just make homology models of these? They are homologous.
The problem is that they're relatively distant in sequence. The other problem is that you're going to have things binding up here, among all the loops, and these loops are completely different. So while they look the same seen from a distance, the binding sites are completely different. The sad thing is that you're going to make something that superficially looks the same, but the geometry of the binding site will be completely different, so you're not going to be able to predict any binding there.
I don't know this, but there were rumors that before that first structure became available, pharmaceutical companies spent on the order of two billion dollars on G protein-coupled receptors. And I do know that when the first groups started determining structures, only some of them were public, because they were supported by the pharmaceutical companies. I think they released four structures
publicly, but one structure was withheld for a year due to a collaboration with a pharmaceutical company. So that company had a one-year head start on the structure, which they probably paid a lot of money for. This number is about a year or two old, so it's probably 30 by now; this has grown tremendously. The reason why they're important is that they're involved in all these fun signaling pathways, a myriad of things. What happens is that you have something binding on the outside, magic happens, some of these helices move a bit, and this results in a signal here on the inside, where something is released.
This structure, the first real x-ray structure, was published in November 2007. Do you see something with those two dates? What was it? There was a fierce battle here, and there have even been some major conflicts between these groups. Now, in the interest of full disclosure, I did my postdoc in the department at Stanford where Brian works, so I'm somewhat biased here. Brian is a super nice guy and I'm very happy that he got the Nobel Prize. Ray Stevens is an amazing scientist, and they have a huge operation at Scripps where they have mapped out many of these structures. The final thing is that Brian still argues that he doesn't work with GPCRs; he just works with the human beta2 adrenergic receptor, which is one specific GPCR.
For some of these structures we nowadays actually have co-crystals where we see exactly where carazolol, which is one small drug candidate, binds to this receptor, and at least experimentally we have a rough idea of how it might work: the receptor would somehow relax due to this binding and change the position of one helix. There has been a complete explosion in the number of structures available, so nowadays we have structures of both the relaxed and the active states, we have receptor structures in complex with the G protein, and we even have an NMR structure, and I think there is a cryo-EM structure in the pipeline too. And somehow, 2007 is not that long ago,
right? There was the bovine rhodopsin structure, so that wasn't bacteriorhodopsin, but it wasn't really until 2007 that this field started, nine years ago. They didn't go from bovine rhodopsin; it was a completely different receptor. The problem, as I said about membrane proteins, is that eukaryotic membrane proteins in general are floppy, hard to stabilize and very flexible, and they don't overexpress well. So the crystallography part is actually trivial, but getting the protein overexpressed, being able to purify it, and then making sure that it actually crystallizes was a tremendous effort. Had you asked me ten years ago, I would have said this is one of those things I likely won't see in my lifetime. There were several groups ready to give up, saying that it won't ever happen; people had talked about it for twenty years. And then suddenly, almost from one year to another, they cracked it, although of course Brian had worked on this for way more than a year. Sadly he got the Nobel Prize way too soon. I love Brian, he's a nice guy, but he kept coming here and giving lectures and everything, and we would have loved to keep him on the hook for ten more years.
We also have all these different structures of different activated and intermediate states, which means that we can now actually start to say something about what happens when something binds. Ron Dror in particular, with David Shaw, has done simulations of that too, also long simulations. Here you have the small molecule, I think it's carazolol here too, binding. It's going to bind up here, and you can hardly see it, but it actually pushes the helix out a bit. Did you see that it pushed the helix out? The molecule gets further down, and this is going to lead to a structural reorientation that leads to a change here on the inside. I'll show you more about that in a second. We know a lot about this binding site now, and Jens Carlsson, in the department up here in particular: they have been
one of the leading groups using these structures to try to dock things. I think this is an example from a project they did together with Ron Dror. The grey part here is what you had in the x-ray structure. If you use not that x-ray structure but an x-ray structure of an unbound receptor, and then you try to do this in a simulation, you end up with the pink one. Pretty decent fit, right? So while there are certainly things that can go wrong, if we can just sample this enough we can get some awesome binding predictions. This is of course just a test, in that in this case we know how the molecule binds, but if this were fast enough you could imagine doing it for 100,000 molecules.
The problem is that these things take huge amounts of time in a simulation. If you just look here at the number of waters around the ligand or something, you start in the unbound state here, then eventually you go down, and eventually you get to the blue part here, which is bound. You probably can't see the time scale here, but this runs out to 4 microseconds. Compare this with the length of your simulations. And this was even slightly faster than I thought it would have been, but eventually you get it binding all the way down. This is not something you would do today if you need to design a drug.
But since we have gotten to the point where we really understand the properties of the binding site, Jens in particular has been extremely successful at running high-throughput virtual screening with docking and predicting how new molecules bind. There are a number of competitions, GPCR Dock for instance. These are academic competitions, but the idea is that when a new group has co-crystallized something, they give people like Jens a chance: you know, in three months we will publish what the structure is, but I can tell you already now that the compound we have is, for instance, alprenolol, and we've bound it to the beta-2 adrenergic receptor. Have a go at predicting it. And the whole point is that you predict it
before you know what the result is, and then three months later you will see what the results were. There is this funny thing in science: for whatever reason it's much easier to postdict than to predict. Once we know what the result is, it's much easier than if we actually do a real blind prediction. That comes back to the whole protein folding problem too. There were a bunch of people who claimed that they had solved protein folding, until they were actually asked to predict proteins for which we didn't know the structure, and then they failed for some reason. It's not because scientists cheat, but we fool ourselves into thinking that we understand the problem. It's harder in practice. Yes, that's CASP, and there is a very large competition called CAPRI too. CASP has been so successful that I think it set the model for all these others. They're not really competitions, we call them assessments, but of course everybody knows that it's a competition. Particularly if you win, then it's a competition; if you lose, it was an experiment.
And I think this is just a second barrier, where they're showing, in this case for alprenolol, that it has to do with phenylalanine side chains flipping around to eventually make it possible for the molecule to move down. I won't go into too much detail about that.
The neat thing that you can't do with simulations, though, but that you can do with docking, is that suddenly we can start to correlate things across lots of different types of compounds. We can in turn use docking results to look at different compounds in complex with different receptors: what residues do you typically interact with? Are there patterns here? In this case, yes, there appears to be some pattern: one class of aminergic compounds here tends to form very nice complexes that involve certain residues. So there's a lot of statistics you can do here too, and, again, this is
sort of a mix between chemoinformatics and bioinformatics that you use a lot in docking too. The point is that the goal in all these things is to find things that bind. If you can find things that bind, it's good; it doesn't really matter how you do it. And because it's a fairly pragmatic problem, there are lots of openings for just being smart in general. By the time you have 30 different GPCRs and 30 different docking results, you can start doing statistics on it: in general, what type of compounds usually bind well to GPCRs?
There are also different binding sites. This is a bit related to the allosteric modulation I talked about. Here you're going to see a larger molecule binding to the site, and it will eventually squeeze its way down too, so here there are two different binding sites.
What Ron Dror et al. did, and this was a fairly short simulation, 10 microseconds, I think they went up to 20, and what they've been able to show in simulations that you cannot show in docking, is how the activation works. The way a G protein-coupled receptor activates is that you have an agonist that binds. What did an agonist do again? Right, it creates the same type of signal; an agonist creates the same type of signal as the natural ligand would. You said amplify; that would rather be something else. If we say amplify, that would be an allosteric modulator: you would still have the original molecule bind here, but the allosteric molecule binds elsewhere and just amplifies the effect. So the agonist is a different molecule that has the same effect, and this causes some magic to happen in here. Something changes in the helices that causes this to release a molecule or something on the inside, and then this molecule diffuses on the inside and tells the cell to do something. There's a whole range of different signals here. What Brian Kobilka in particular managed to do: they managed to determine the complex of this
molecule a few years after the first one, so you have both the G protein-coupled receptor and the G protein itself, and then we know the complex between them. The only question then is how you create this initial conformational change, and for a long time we didn't really know that. There were some guesses, but what people were eventually able to do, what Ron Dror and others were able to show with these long simulations on Anton (I'm not sure whether I have a movie of that), is that this has to do with one of these long helices in the receptor. This helix actually relaxes a bit and changes its conformation, and when it changes conformation, that leads to a change on the inside here that causes this to release the signal. And this in turn is caused by something binding up here.
No, apparently I don't have a movie. But what they showed is that, starting from the active state up here, when you just remove the ligand you move over to the intermediate state and eventually all the way down to the inactive state. This is the distance between two of the helices, helix 6 and helix 3 to be exact, and this is the RMSD to the inactive state. You can actually show how you move: you start from the active state, you go close to the intermediate state, and you end up in the inactive state, although the only thing you actually use as input in the simulation is the active state. And then we can show that we end up pretty much overlapping the inactive crystal structure exactly. I'm not sure you can see this, but the blue part here is the inactive crystal structure and the red is the active one. We're starting from that helix in a completely different orientation, right?
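The RMSD trace described above can be sketched in a few lines. This is only a minimal illustration, not the analysis used in the actual study: the function names and the toy coordinates are made up, and it assumes every frame has already been superimposed on the reference, so no fitting is done.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) positions.

    Assumes both structures are already superimposed; this sketch
    does not perform any rotational or translational fitting.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate sets of equal size")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def rmsd_trace(trajectory, reference):
    """RMSD of every frame in a trajectory to one reference structure."""
    return [rmsd(frame, reference) for frame in trajectory]

# Hypothetical toy data: a two-atom "helix" drifting toward the reference.
inactive_reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)],  # start: far from the reference
    [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],  # intermediate
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # end: on top of the reference
]
print(rmsd_trace(trajectory, inactive_reference))
```

A trace that falls toward zero over time is what "ending up on the inactive structure" looks like in such a plot; the reference itself is only used for the comparison afterwards, never as input to the simulation.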
The computer does not see the blue structure; the computer doesn't know anything about the blue structure. All I do is remove that compound, and then we let this simulate for roughly 20 microseconds, and eventually this helix will fall in, and when you stop at the end you're overlapping the inactive structure. This is not something you would do in drug development. The only reason for doing this is if you want to understand how the receptor activates. Why would you like to do that?
That depends a bit on what you want to do, right? So here's the thing: what is it that we've been doing here all day? What is it that we've been optimizing? What is it that we've been developing? What are the drugs? What is it that I try to do in docking? Yes, but simpler than that. What is the first part when two molecules interact? What is it that the small molecule does to the large molecule? It binds, and that's a simple problem, right? We're optimizing binding, we're optimizing affinity. But was that really your goal? Right, you want to fix something. I could not care less whether it's... well, if I have a choice, a picomolar drug is better than a nanomolar drug, but that just tells you how hard it binds. If it doesn't have any effect, it's not even a drug. So maybe you could have a drug that's not quite as good at binding, nanomolar, but much better at creating the effect, right? In most cases, in particular with the classical drugs where you have an agonist that pretty much looks exactly like the original molecule and activates the protein in the same way, those two will usually correlate extremely well. Because we're binding in the same place in roughly the same way, of course we're going to have roughly the same effect per binding, right?
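To put numbers on the picomolar versus nanomolar comparison, the standard relation between a dissociation constant and the binding free energy, ΔG° = RT ln(Kd / 1 M), can be evaluated directly. This is a minimal sketch: the function name and the choice of 298.15 K are assumptions for illustration, and the number says only how hard something binds, nothing about what effect it has.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy in kcal/mol from a dissociation constant.

    Uses dG = R*T*ln(Kd / 1 M); more negative means tighter binding.
    The 1 M standard state and the default temperature are assumptions.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature * math.log(kd_molar)

nanomolar = binding_free_energy(1e-9)
picomolar = binding_free_energy(1e-12)
print(f"1 nM binder: {nanomolar:.2f} kcal/mol")
print(f"1 pM binder: {picomolar:.2f} kcal/mol")
print(f"difference:  {nanomolar - picomolar:.2f} kcal/mol")
```

A factor of a thousand in affinity corresponds to roughly 4 kcal/mol at room temperature, which is worth keeping in mind next to prediction errors of around half a kilocalorie per mole.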
But nothing says that binding and effect have to correlate. What if, in this case, you wanted to stabilize this receptor in either the open or the closed state? Maybe I could create something that binds in a completely different site. If I wanted to stabilize that red helix in the active state, maybe I could have a compound that binds between those two helices. You're not going to find that out by just docking to the active state or the inactive state. You're going to need a rough idea that, well, this helix 6 will move, and therefore I should try to dock something between helix 6 and helix 7, for instance. So the reason for studying mechanisms is that you're not quite sure how you want to achieve something, and to achieve it you need to understand how it works first.
On Monday... so this week, you're going to have labs today? No, on Thursday, not today, right? No, today, because you didn't have a lab yesterday. So you're going to have labs today and on Thursday, and then we don't have a lecture on Friday, because some of you are heading off to the awesome day at SU related to something, I forgot what. Oh sorry, it's here, there's an open house day here or something. I already have time reserved on Monday, so let's do the lecture on Monday instead, because there's no point in me lecturing here while half of you are not here and the other half are just missing things.
What I'm going to talk a little bit about there is how we try to do this on ion channels in particular, because we know that they can be either open or closed, or there are some strange intermediates, the desensitized states, but we don't really know how they move between them. The second you start to understand how they move between them, suddenly you realize that there are multiple binding sites that we can target. Can you try to stabilize one or the other? The second you stabilize something, that's going to be more favorable. Is there something else you can do
rather than stabilize a favorable state? Destabilize the unfavorable state, right. So you can bind something or create something that suddenly makes it bad to be in the inactive state; that would have the same effect. So even before you start thinking about what you're going to bind where, there are lots of things you can do with these molecules. You can have either an agonist or an allosteric modulator, and in some cases it might actually work to combine an inverse agonist and a modulator, so that you turn off one type of behavior that you don't like but still keep some other behavior that you do like.
The final part of the question is that even when you have one of these beautiful drugs, it will never ever hit just one receptor. Sorry, it's not that easy. Think about the GPCRs: there are at least thirty, and often they're very similar. Say that you find a drug that binds very well to the beta2 adrenergic receptor. Can you think of something like 24 other molecules that it will also bind very well to? Because they're so similar, there's a very big risk that this would bind to other GPCRs. This is even worse for ion channels, because most ion channels are even more similar than the GPCRs, so drugs that bind to one ion channel will almost certainly bind to another. Spider toxins work that way, for instance; a lot of these simple toxins will bind to a bunch of channels. So what do you do then? Yes, but how do you make something more specific?
The problem is that this is a complicated mathematical optimization problem. You probably haven't studied optimization theory, but you've all studied analysis, I think. Say you want to find the minimum, the local minimum, of a function. That could be anything. Say: which is the cheapest way to get to Arlanda airport? But you typically have either a boundary or a side
condition: what is the cheapest way to get to Arlanda airport in an environmentally friendly way? That would exclude a taxi or your own car. It's the same thing here. You want a binding affinity that's as good as possible, but you'll probably also have Lipinski's rule of five: it can't be too heavy, blah blah blah, it can't be too hydrophobic, it can't be too hydrophilic. Another side condition is simply that this cannot bind well to a bunch of other receptors. And there are some receptors that are quite promiscuous, in the sense that they tend to bind lots of stuff. One of them is called hERG, from the human Ether-à-go-go-Related Gene, a common receptor that very easily leads to side effects, real problems with arrhythmia or something.
So you typically do anti-screening too. You screen for things that hit the target you want, but you also screen another 20 receptors that you want to avoid hitting. In that case you don't want to pick the molecule that binds best; you want to pick the molecules that have the largest difference in binding. You should bind the one you want significantly better than the ones you don't want, and then try to optimize that difference in binding.
That goes down to this ADME-Tox, right? Toxicity, side effects and administration are frequently way more complicated, and I would argue that a large part of drug design today is really this two-dimensional problem: not just optimizing the binding affinity, but optimizing the binding affinity without running into bad things in ADME-Tox. That is much harder than just the first part. You do more calculations, you do more screens, and the ADME-Tox I would argue is still mostly experimental. On the screening side, if you have crystal structures of 20 receptors and you want to bind to one of them but not the other 19, you can just do docking to all of them. Pick out, say, the 1,000 compounds that get the best scores for the one you want, and then among those 1,000 check what are the
scores to your other 19 receptors, and then rank them so that, rather than just ranking for the best binding affinity, you rank for the largest difference in binding. Ultimately you have to go into animal models and show that it's not too toxic.
It actually turns out that animals do this. There are a bunch of toxins, usually based on proteins actually: say spider, scorpion or cobra toxin. What do these toxins do? They usually bind ion channels, yes, and they usually inhibit your ion channels. Do you see any problem with that? Right, this is not good if you're a cobra. Or there are a bunch of these frogs in the Amazon that are very, very poisonous; you just touch them and you can even die. If you're one of those frogs, that would basically mean that you wouldn't have any nerve signals. So it turns out that these frogs actually have small mutations in their channels. They basically accomplish the same thing by mutating their genes, so that the frog itself is probably a little bit sensitive to the toxin, but not very sensitive. It's very toxic to other species, but it's not toxic to the animal itself.
Oh no, I think it's awesome, because you get to travel to all these fun places in the world and work with fun animals and everything. There are far worse things that you work with in any normal hospital. But if you work with Ebola research: there are biosafety level 4 labs in Sweden. My colleagues in Linköping... actually, there is only a biosafety level 3 lab here in Stockholm; in Linköping we have a level 4 lab. Insanely expensive to run. There are safety procedures, so you can handle this. It's just that as a researcher it's not the safety that concerns you. It's not that you accept that, well, 50% of my students die, but that's fine, I still get two degrees per year. The hard part is first how much it's going to cost, with the auditing, and second that everything is so much slower, because any experiment
you have to do in a biosafety level lab is going to take 100 times longer than just doing it at the lab bench. So that's why it might seem stupid that researchers work on bacteria, for instance. Why on earth do we study binding in bacterial channels or something, when we could do it on the fancy stuff? That's because it's at least a factor of 10 cheaper and 100 times faster to do the bacterial stuff rather than having a biosafety lab or toxins or anything. So let's start with the simple stuff, and then go to the complicated thing when we need it, not sooner.
I think that's all I'm going to have to say, so we might finish 10 minutes earlier today. I'm going to be a bit nasty: I'm just going to have one study question for Monday. You're going to take me through how modern drug development works, and think in particular about some of these things. Don't just repeat how a drug is developed, but consider how we have really improved over the last 10 years, what the challenges are today, and where we might be heading in the future. Because whether you work at SciLifeLab or at the University Hospital or at a pharmaceutical company, it's very likely that you're going to be stumbling into this. At SciLifeLab we have a facility for drug development, and they're going to ask you about this even if you might not be interested at all. Say you're a cancer researcher. If you're a cancer researcher, what's your goal?
Well, you want to find a cure or a drug. What's your strategy going to be to find a drug? You would need to find a receptor; you would need to find a target first. To actually do anything with that target, you would somehow need to find something that binds to it. Even if you're not going to do any of the docking yourself, those fundamental concepts you have to understand, otherwise you're never going to be able to do anything useful. You're also going to need to realize that you can't have everything you would like. Unless you have an infinite amount of money, you're likely going to have to limit yourself to some library of pre-existing things. And if they now give you 10 good things back, what could you do to improve this? So think a bit. Even if you're not going to be writing the docking codes yourself, as a user of this you need to have a rough idea of how this works.
Let me see if I can find a good overview or a review article. One problem is of course that this is a field that changes very rapidly, and one important reason why it is changing is that the old way of doing drugs did not simply die out from becoming unfashionable. Why did the old way of developing drugs become unpopular? Yes: we're just spending money and we're not getting drugs. It doesn't work. And again, these companies are not into welfare. You can see this from two points of view. All the large companies spend more on marketing than they spend on actual research, so I can't say that I feel too sorry for them for being hung out to dry occasionally. But on the other hand you can look at this from a different point of view: antibiotics. What's the problem with antibiotics?
Yes. Well, I know that's not quite true; we do develop new ones occasionally. But the problem with antibiotics is that bacterial strains become resistant to them, and they become resistant very quickly. So assume that you come up with a really smart new antibiotic tomorrow, because they're all encouraging us to do this; we need more antibiotics, right? Your pharmaceutical company comes out with an amazing new antibiotic. It took you 10 years and you spent half a billion dollars on it. Well, one of two things will happen. Either everybody starts prescribing it right away, and within 12 months the bacteria will be resistant to your drug too. Sorry, you just lost half a billion dollars. Yes, but that's a separate issue. From your point of view this might sound harsh, but ultimately, if you want to work with companies, you have to understand why a company exists. A company exists to make money for its stockholders. We might not like that, but that money is ultimately your retirement accounts, and you probably want some retirement benefits eventually. The other alternative is that we are super restrictive: we're going to make sure that the bacterial strains don't become resistant to this drug. Then the government will say, sorry, nobody can prescribe it; it's only in the 1% worst cases that you can prescribe this antibiotic, and you sell it for 10 dollars per bottle. You just lost half a billion there too. So if we want private companies to invest their money in making antibiotics, there has to be some sort of prospect that they can make money from it. If we don't accept that, we're going to need to fund it with taxpayers' money. And I would argue that's partly the problem with antibiotics: there is no business model for them. Now, on the other hand, making, say, Lipitor for somewhat overweight westerners, that's like 150 billion dollars. So while we might certainly fault the companies for doing this, remember who set these rules. We did. It's not a coincidence that they go after
Lipitor rather than antibiotics. Same thing with malaria. Malaria is a horrible disease, one of the worst diseases in the world, but the average person suffering from malaria has nothing. It's not an important disease in New York or Stockholm or Berlin or anywhere. And again, as bad as the companies are, and as much as we fault the companies for not investing in it, we're not investing either. Well, there's a tiny amount of taxpayers' money going to it, particularly at Karolinska, which I actually like, but overall in the world we're not investing any money in this. So remember that we're part of the equation. It's not as easy as it just being the bad pharmaceutical companies. Good. Do you have any questions?