Today, or actually not just today but for the three remaining lectures, we're going to deviate from the book. The reason for that is twofold. First, there are some things that we've realized are much more modern than the book, and they're also a bit more applied, but we don't teach them in any other course. And this is important for you even if you're not necessarily going to be working with theoretical research yourself. This is an integral part of a lot of the things we do to eventually get translational research into patients, biotech applications, or anything. Whether you're going to do it yourself or not, it's important for you to be aware of (a) the possibilities and (b) the limitations, because even if you're not going to do this yourself, in all likelihood there's going to be somebody, either in your team or in a team that you work with, who's going to be working on techniques like this.

I'm also going to speak a little bit in particular about G protein-coupled receptors. We occasionally lie: we say that membrane proteins are the most important drug targets. That's technically correct, but that's just because GPCRs are the really important drug targets, and they just happen to be membrane proteins. Then tomorrow I'm going to be speaking a little bit about free energy techniques. They are used in industry, but that's pretty much the up-and-coming technique, the more advanced version of docking. And then on Thursday we're going to be speaking about both nucleic acids and a bit about how to handle errors in real data: standard error, standard deviations, and all these things. But before we go there, we're going to speak about what we covered yesterday: 13 questions.
I had this great idea, but I'm not going to be able to do it until Thursday. Once a year we have a meeting at NVIDIA, and they have this great thing at the podium: a big bowl full of candy, and the whole idea is that any time somebody in the audience asks a question, they throw them a piece of candy. I should have thought of that at the beginning of the course. But no candy today.

Let's go through the questions. What's the difference between transition states and folding intermediates? Fun way to wake up in the morning. So if you draw a plot like this: is that a transition state, and is that a folding intermediate? No, that's not enough. Exactly, while the transition states are the barriers. And why is that difference important? Not just that we can't get over the transitions, right? You can never observe a transition state directly. It's impossible, because you're not going to spend any time there. You can indirectly derive information about it from how long it takes to get over these barriers, while folding intermediates we can observe. I'm not saying that it's easy, but it's technically possible to observe them. So then we had these van 't Hoff, or Arrhenius, plots. Do we plot versus the temperature or something else? Versus the inverse temperature, right?
And this is a difference between the Arrhenius and the chevron plots that I didn't specifically mention. The Arrhenius plots always try to get at this property that you have some sort of rate constant that is proportional to an exponential, e raised to minus ΔF divided by kT. You always try to get at that ΔF expression by taking the logarithm of both sides, and then you have ΔF divided by k, but also minus one over T there. So we plot them versus either one over T or minus one over T, and then they become straight lines. What's the problem with that, in particular for protein folding? Well, it's doing both, right? We have some molecules folding and other molecules unfolding, so in general protein folding at the temperatures we study is a balance. That's a bit unlike most other chemical processes. Most other chemical processes, dissolving a salt or something, are so displaced towards either the left or the right side of the equation that it's a pretty good approximation to argue that the reaction is only going one way. But that doesn't hold for protein folding, in particular at these intermediate temperatures where both things are happening. And that in turn (I didn't ask that question specifically) is related to this thing that at high temperature the unfolding is faster, but at low temperature the folding goes faster, right? So by definition there will be a crossover temperature where suddenly folding is faster than unfolding, or vice versa. Most normal chemical reactions all speed up with temperature, so the only question then is whether it's lower free energy to be in the salt or in solution. So how do we solve that with chevron plots?
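Written out, the Arrhenius relation described above looks roughly as follows. This is a minimal sketch that assumes ΔF does not itself depend on temperature; the prefactor A and the subscript on Boltzmann's constant are notation added here for illustration, not something from the lecture.

```latex
k = A\, e^{-\Delta F / k_B T}
\quad\Longrightarrow\quad
\ln k = \ln A - \frac{\Delta F}{k_B}\cdot\frac{1}{T}
```

Under that assumption, plotting ln k against 1/T gives a straight line with slope −ΔF/k_B, which is the straight-line behaviour referred to above.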
So, as we derived, it turns out that the effective folding rate, that is, the folding rate that corresponds to the balance between the left and right sides, the one that also includes the reaction going the wrong way, will actually correspond to the sum of the rates. And with the chevron plot there was another thing we generalized. So this is the effective rate, but the other thing we generalized is that rather than just putting one over temperature on the axis, it can be anything; well, not quite anything, but the concentration of a denaturant, anything that influences the reaction going back or forth. So here we're not only trying to get the free energy directly as a function of one over temperature, but as a function of anything that has to do with folding or unfolding, and that's when we get these curves.

Based a bit on these plots, last week we also spoke about how the enthalpy and entropy vary during folding. Sorry, I can see I already answered number six here: enthalpy and entropy during folding. Which is surprising, because then I would assume eight people here would immediately say what the answer was. Yes, and the entropy. So both of them go down, right? And they go down at roughly the same rate. If they didn't go down at roughly the same rate, you would end up either getting stuck in the unfolded state or having a very large barrier. So the small remaining peak has to do with the differences, the deviation from, say, the linear drop. We spoke about this phi value. What is it?
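The effective rate described above, the sum of the folding and unfolding rates plotted against denaturant concentration, can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes the logarithm of each rate changes linearly with denaturant concentration, and the function name, parameter names, and all numerical values are hypothetical rather than taken from the lecture or the book.

```python
import math

def observed_rate(c, kf_water, m_f, ku_water, m_u):
    """Effective rate k_obs = k_fold + k_unfold at denaturant concentration c (molar).

    Assumption (not from the lecture): ln(k_fold) falls and ln(k_unfold) rises
    linearly with c, with hypothetical slopes m_f and m_u.
    """
    k_fold = kf_water * math.exp(-m_f * c)    # folding slows down with denaturant
    k_unfold = ku_water * math.exp(m_u * c)   # unfolding speeds up with denaturant
    return k_fold + k_unfold

# Hypothetical parameters: 50 per second folding in water, very slow unfolding in water.
for c in range(0, 9, 2):
    k = observed_rate(c, kf_water=50.0, m_f=1.5, ku_water=1e-3, m_u=1.0)
    print(f"{c} M denaturant: ln(k_obs) = {math.log(k):.2f}")
```

Plotting ln(k_obs) against c gives the V shape of the chevron: the left arm is dominated by folding, the right arm by unfolding, and near the bottom both contribute.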
So the point of the apparent folding rate is to get away from using these Arrhenius plots, right? If we wanted to use an Arrhenius plot, we would somehow need to find a way to do it, in particular in this crossover region. The reason why the crossover region is important is that if I introduce a mutation here, we're going to have things that are quite close to each other. So interesting things happen in the crossover region, but in the crossover region it's virtually impossible in the Arrhenius plots to measure one curve without the other influencing it, unless you had some super complicated piece of equipment. So the reason why we want the chevron plots is that I would like to measure the total rate, which also accounts for the things that go in the wrong direction. With the Arrhenius plot, in theory, if we could measure how much protein has folded without having to account for the fact that some of what has folded will go back in the other direction, so if I could somehow remove a protein from the test tube the second it has folded, then I could measure things with just an Arrhenius plot, say by measuring the fluorescence or something. The problem is that this doesn't work in practice. For most other chemical reactions, 99.99999% of everything will just go one way; we don't have any reaction going the wrong way. But for proteins, some of the proteins that have folded will also unfold, in particular when you're at the crossover point, where it's 50/50. And that screws up the Arrhenius plots. When I drew the plot, you might remember that there was one black and one red curve, right?
But that assumes that you could measure the red curve without the black one interfering. In practice, you're going to end up with horrible things when the black curve starts to interfere with the red, or vice versa. So what we do with chevron plots is that we kind of cheat. All the chevron plot measures is how fast things are going over the barrier, either to the left or to the right. We don't care. That sounds really stupid, but that's why you end up with these strange things: if you're very much to the left here, we are folding and the rate is high, but if you're far to the right, we have the same value. This might be 50 per second or whatever. It's 50 per second unfolding here, it's 50 per second folding there, and if you hadn't seen the plot, that would seem insane. What do you mean, 50 per second? Yeah, 50 molecules per second cross the barrier in any direction, either folding or unfolding. We don't care. But the reason why that works is again that you have this clear separation. When we are out here, all of you know that at 10 molar guanidinium hydrochloride, all those 50 molecules per second crossing the barrier are going to be unfolding, I promise. And same thing here at zero molar, right? All those 50 molecules per second crossing the barrier are going to be folding. And right here, it's 50/50; it's kind of 25 folding and 25 unfolding. This sounds very strange until we start to realize what I'm after. I'm not really interested in whether there are 29 molecules per second going over the barrier; I could not care less. What I can use these plots for is to understand the differences. And in theory, in particular here, I can extrapolate, because again, up here it's only folding, right?
So if I extrapolate that curve, it should correspond to the rate if I'm only folding, while this curve should correspond to the rate when I'm only unfolding. So the point is that this is something that's easy to measure. I measure both rate constants, they just sum up, and I can still get all the important data just by extracting it from the plot. The exact way you extract this from the plot was a bit of detail; we went through it yesterday in the slides, and it's well described in the book too. But the important part here is to understand what the phi value, Φ_F, is. So what did I use the plots for? That's the answer to number eight, right? We want to understand whether a particular residue is part of the transition state. Say we want to know whether residue 49 is part of the transition state. The way we get that for residue 49 is that I take residue 49 and I mutate it to some other residue, anything. What then will happen is that in general there will be things changing, right? But there are three things that can change: the unfolded state, the transition state barrier, or the folded state. If I'm only changing the stability of the folded state relative to the unfolded state, I don't care, because that should not influence the barrier. That can again happen either by making the folded state better or the unfolded state worse. So what I'm interested in is what fraction of this change is present already in the transition state. Because if zero percent of it is present in the transition state, this might be a great or a horrible change in the residue, I don't care about the sign, but it only influences the ultimate stability of the protein.
It doesn't influence the speed with which the reaction happens. Whereas if I mutate residue 49 and that change is already there in the transition state, then residue 49 was obviously part of this transition state. So why do we need to determine that separately? If I try to measure only, say, the height of the transition state, that might seem obvious, right? But what might have happened there is that I just changed the unfolded state. Of course, if I change the unfolded state, I will effectively have changed the barrier to the transition state, but it's not really that I changed the transition state; it's just that I changed the unfolded state. That's why we need to take this difference, so that I check how much this residue changes the transition state versus how much it changes the stability of the folded state. The easiest way to think about this is that it leads to a result between zero and a hundred percent. That's not strictly true, because you can have differences in sign.
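The ratio described above, how much a mutation changes the transition state versus how much it changes the stability of the folded state, can be written as a one-line calculation. This is a minimal sketch: the function and argument names are hypothetical, and both inputs are assumed to be free energy changes (mutant minus wild type) measured relative to the unfolded state, in the same units.

```python
def phi_value(ddg_transition, ddg_folded):
    """Fraction of a mutation's effect on stability that is already present
    in the transition state.

    ddg_transition: change in transition-state free energy relative to the
                    unfolded state (mutant minus wild type).
    ddg_folded:     change in folded-state free energy relative to the
                    unfolded state (mutant minus wild type).
    Both names are illustrative assumptions, not notation from the lecture.
    """
    if ddg_folded == 0:
        # If the mutation does not change the stability at all, the ratio is undefined.
        raise ValueError("mutation does not change stability; phi is undefined")
    return ddg_transition / ddg_folded

# Hypothetical numbers: half of the destabilization is already felt at the barrier.
print(phi_value(1.0, 2.0))
```

A value near 1 means the residue's interactions are essentially formed in the transition state, and a value near 0 means they are not; values outside that range can occur, as noted in the lecture.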
You can have more than a hundred percent, et cetera. But think of it as between zero and a hundred percent. If it's a hundred percent, this was a residue that was part of the core of the transition state, pretty much among the first few residues forming it. If it was fifty percent, it was still a residue involved in the transition state, but maybe in the outer parts. And if it's zero percent, this residue is not part of the transition state. This way we can take a structure. We don't know the structure of the transition state, but I can take the structure of the known protein, the final folded protein, and map out, by color, which residues in this protein were the nucleus, the first ones to start forming. For instance, for the beta sheet, we could show that the beta sheets that insert in a membrane, they too have this small band of residues around the beta sheet, and that is where the structure starts forming. I can't determine the structure, but I can determine that that is the first part of the structure that must have started to form. And if you ever wanted to change, say, the speed with which a protein folds, or influence a process or something, these are of course the residues that you want to go after. Basically, if the residue has a phi value of zero, forget about it; it's not going to be important for the rate of folding of that protein.

Number nine: the enthalpy-entropy balance. We touched upon it a little bit, but it doesn't hurt to repeat. Exactly, and we call this concept a couple of different things in protein folding: funnels, or guided pathways. Basically, we're gradually going down in the energy landscape, and the point again is that if the energy drops too quickly,
we will get stuck in some relatively low energy state. But if the entropy drops too quickly, the free energy barrier will be too high. So both of them need to go down at roughly the same pace. If they went down at exactly the same pace, though, the effective free energy barrier would be zero, and then you would not have stability. So the reason why proteins are stable, and the way we get a barrier, is that the entropy tends to drop over a fairly narrow region, when the parts of the chain start to interact with each other and bump into each other. And because the free energy is the enthalpy minus (temperature times) the entropy, that effectively creates this barrier, and that barrier, we argued, can explain Levinthal's paradox, at least in one of these models, the nucleation-condensation model.

So you don't necessarily need to use the equations here, but how could the nucleation-condensation model explain Levinthal's paradox? We can use a little bit of equations, and I think this is a good example where equations actually make it easier. If you just had a volume being folded, the energy would be roughly proportional to the number of residues, right? The volume in space would be proportional to the radius cubed, which is proportional to the number of residues. What the nucleation-condensation model argued is that when you're forming this gradual core, initially the number of interactions you have is going to be proportional to the area rather than the volume. That means you also have some sort of second term here that is roughly proportional to the number of residues raised to the power of two-thirds, because that's the volume and that's the area of the region. And then I also made a hand-waving argument that roughly the same thing holds for entropy: the number of residues you have locked in, if you have a large volume, is going to be proportional to the number of residues. When you're just
starting this, the number of residues that we are effectively locking in is more proportional to the area. One other way you can think about that is that if there are only a handful of residues formed there and I add one more residue, that residue is not going to be entirely buried, right? If you have a large volume, most residues that are in this core will be entirely buried, and then it's proportional to the volume. When you're forming this, all the residues on the surface will not be buried. So at least with a bit of hand-waving, we can argue that it is more proportional to the area.

And this comes back to Levinthal's paradox. Levinthal's paradox really had to do with the number of states that we needed to test, which was some sort of number, say the number of different Ramachandran torsion states or something, raised to the number of residues. Then we argued that when you take the energy minus the entropy, the terms proportional to the number of residues roughly cancel, so the remaining terms are going to be proportional to the number of residues raised to two-thirds. If you then put this into Levinthal's paradox, what we get would be roughly proportional to a number raised to something that is not N, but N to the power of two-thirds; and sorry, that number is not necessarily e, but some x, whether that is the number of Ramachandran torsion states or so. So the point is that we will not have N in the exponent. Whether it's N raised to the power of two-thirds or something else, the exact number here is of course an approximation, right? But the point is that it will not grow as N, and that's effectively what cracks Levinthal's paradox. Remember when we drew the Ramachandran torsions: if I had a large chain, say 100 residues, we asked what the number of different states is that we need to test here, right?
And then I argued that if you want to make this really simple, there could be two states, or maybe three. So if you have 100 residues, the number of different things you needed to test would be, say, two raised to the power of N, or maybe three raised to the power of N. Whether it's exactly two or three doesn't really matter. The argument I made at the beginning of the course was that the important part here is the exponent, because it's this number that causes the complexity to explode. There is going to be some small number in the base, and exactly what it is doesn't matter; say it's between one and ten. It's not going to be five hundred, and even if it was, that number is not going to change the shape of the curve, how quickly it grows. It will make it larger, but it's the exponent that decides how quickly it grows. So what we effectively did now is, again, I haven't changed the proportionality here or anything, but it's the exponent that is no longer N, but rather N raised to the power of two-thirds. And then yesterday we formulated this a bit more accurately. In particular, we formulated it in terms of ΔG, or rather ΔF, and there we are actually talking about reaction rates, right?
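To get a feeling for how much the exponent matters in the state-counting argument above, here is a small numerical sketch. It simply compares a base raised to N with the same base raised to N to the power of two-thirds; the function name is hypothetical, and three states per residue for a 100-residue chain are the illustrative numbers mentioned in the lecture, not measured values.

```python
import math

def log10_states(n_residues, states_per_residue=3, exponent=1.0):
    """log10 of states_per_residue ** (n_residues ** exponent).

    exponent=1.0 is the naive Levinthal count; exponent=2/3 is the scaling
    argued for in the nucleation-condensation picture. Working with log10
    avoids printing astronomically large numbers.
    """
    return (n_residues ** exponent) * math.log10(states_per_residue)

n = 100
print(f"naive count:     10^{log10_states(n):.1f} states")
print(f"N^(2/3) scaling: 10^{log10_states(n, exponent=2/3):.1f} states")
```

Changing the base from 3 to 2 or 10 shifts both numbers, but the gap between the two lines comes from the exponent, which is the point made above.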
Sorry: the reaction rate k, we argued, is going to be proportional to e raised to minus ΔF divided by kT. So if you formulate it in terms of reaction rates, then we really have e. And then, if that free energy was now proportional to the number of residues raised to the power of two-thirds, we would have another constant raised to N to the power of two-thirds. Sorry, I might have been a bit unclear there. If you talk about the number of states, it's x, some sort of low number; if you're talking about reaction rates, then it's definitely e. Does that make sense, for once? Good.

We spoke a little bit about these network models for folding. I have a couple of extra slides that I included on that, so let's wait with that one. When are proteins thermodynamically versus kinetically stable? Actually, you know what, I think it's going to be easier to handle all of these if I show you the slides. I drew some networks here. The numbers here are different states, and I've written the free energy on a sort of arbitrary scale for them. And this is still highly simplified. Understatement of the year. So let's try to understand what's going to happen here and what the different states are. In a real network model we would have maybe 500 circles like this, and that would take a while to go through, so I think you'll agree that it's easier to do it this way. What would this mean? What are the different things: the initial state, side states, the native state? Are there any intermediate states or transition states here? So this is a sort of reference state that we start from, and I've deliberately called that zero. Yep? Well... mmm. So are these really intermediate states? They're higher in free energy, right? So here you're just going up the barrier. All these states are bad; if you are there, the best thing would actually be to fall back. So here we're actually monotonically climbing up.
This would have been an even larger peak that we're never going to visit, and then here, this is an even larger off-path state where, if you go there, you will pretty much immediately go back, right? And then we're going to be happier. So this is essentially a fairly simple transition. In theory you can take two molecules and put them on top of each other and get to a very bad state, and if that ever happens, we're instantly going to go back on path. So while there is a barrier here, it's a fairly simple barrier. Theoretically there are places you could go out to the side, but you're not really going to do that; if you go out to the side, you will immediately come back. So the circles here, just think of them as snapshots. I've taken a movie, captured your molecule in one state, and said what the free energy is there. If you can go between 0.5 and 7.2, and 1.2 and 0.9, both of these states will immediately fall down to the 0.5, which would immediately fall down to that one, right? So you're never going to be stuck there. So 1.2 would be the highest transition state; that's what we need to pass, right? Because there is no way to get from start to end without going through that state. So that is going to be the worst state here that we need to go through. That one is an even worse state, but we don't have to go through it, so we're not going to spend any time there. But this one we do have to go through.

So let's look at the second example here. What happens now? The difference is up here, right? What will that be? Is it an intermediate? Yep? Well, an intermediate state would be a state that you had to visit, so I would say this is probably a misfolded state rather, right?
It's not good to be here. It would be better if you could go here. It's good free-energy-wise, but it's bad for your path. So that is basically a hole you fall into that we would prefer not to fall into. Do you see how things get more complicated when there are multiple connections between the states? In one dimension it's very simple, because you have to go through all states, but in reality there are many forks in the road. So if you start out here, you're going uphill, you're going uphill, and then you could go a little bit downhill and fall there. But it's also possible that you take a detour here and end up in this state, which is bad. And if you're now here, you'll have to wait until you spontaneously start going back, and when you're here, you might go back either here or here. In theory, this is something you could simulate with those simple models that you played around with at the beginning of the course. So which one of these is going to fold fastest? This will be a much slower process, right? Because you can end up with lots of molecules going the wrong way here. Which corresponds to what I showed you yesterday: if there are bad states that have too low energy, I'm not really changing the energy of the transition state per se, and I'm not really changing the energy of the folded state, but because there are misfolded states, it's going to be slower for me to reach the good one.

Let's see if we have a couple more. So what happens here? Now I deliberately changed the path. Here it's a clear intermediate state, right? Because I deliberately cut off this connection, I have to go through that state, and now the barrier here goes from minus 3.4 to plus 1.2. So that's 4.6.
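The barrier arithmetic above (from minus 3.4 up to plus 1.2) can be generalized to any sequence of states along one path: the barrier that matters is the largest climb from the lowest point reached so far to a later state. This is a simplified sketch, not a full kinetic model; the function name is hypothetical, and apart from the values quoted in the lecture (0, 0.9, 1.2, -3.4), the numbers in the example paths, in particular the native-state value of -5.0, are made up for illustration.

```python
def effective_barrier(path_free_energies):
    """Largest uphill climb along a path, measured from the lowest free energy
    visited before each state (arbitrary units, same scale as the input)."""
    lowest_so_far = path_free_energies[0]
    barrier = 0.0
    for f in path_free_energies:
        lowest_so_far = min(lowest_so_far, f)      # deepest state reached so far
        barrier = max(barrier, f - lowest_so_far)  # climb needed to reach this state
    return barrier

# Path with a deep obligatory intermediate at -3.4 before the 1.2 transition state.
print(effective_barrier([0.0, -3.4, 1.2, -5.0]))
# Path that climbs directly from the reference state to a 0.9 transition state.
print(effective_barrier([0.0, 0.5, 0.9, -5.0]))
```

The first path has to climb out of the intermediate, so its barrier is much larger than on the second path, even though both start and end in the same states.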
It's going to be a very high barrier, and it's this barrier that's going to determine how long it takes to fold. Or you can look at this one: here you have something that goes from 0 to 0.9. So what type of process is this going to be? It's going to be much faster, right? The only transition state we have here is the 0.9. But we still visit virtually all of these states. So while the starting state and the ending state are the same for all of these four plots, they have very different properties in terms of folding. And if you think about it, this is not just theoretical. Well, sorry, I have two more, my bad. What can I say? It's fun when you get started.

What happens here now? And of course these are completely fake examples, right? But that's a really good state; it has a really low free energy, right? So this could almost be something like a prion. And again, I should probably have made that barrier much higher, but the point is that we're going to be pretty happy there. You might even be happy here for so long that you could imagine that this might very well be a biologically active state. And then, after a very long time here, you would eventually go over this barrier (I should have made this a bit higher). And of course, yes, once you are there, you're going to be happier there, but it might take a long time for you to make that final transition.

And this is the last one, I promise. What will happen here? This is of course much more realistic, right? Because in theory, why should I not be allowed to move directly between these states? In particular when you have a hundred thousand variables or something. That's why it's the last one. So this is the network we spoke about, right, where there is more than one way; all those roads lead to Rome. So if we start at zero, which path am I mostly going to take?
So that's right, and then I might have a little bit of flow there too. And all the ones that are at 0.7 are going to go to the 1.2. Now, if I am at the 1.5 here, I might go down there, or there might be a little bit of flow here. And when I am there, I will mostly go there; I might go a little bit there too. And again, when I'm here, most of them will go there and a little bit will go there. So the point is that you're going to have arrows everywhere, but the relative rate here is going to be larger, so most molecules will go this way. So is this path better than the top path?

Let's make a theoretical argument: this is your molecule that you start out with, and then Dr. Jekyll and Mr. Lindahl go into the lab, and I kill that reaction and I kill that reaction. What's going to happen? Will the protein fold faster or slower? Exactly, and that has to do with something the book describes, and we spoke a little bit about it when I introduced kinetics, right? If there are multiple paths: if they are serial, I have to go over all the barriers, but if they are parallel, I can choose any road. So imagine you have a traffic jam going out of Stockholm and then you open one more small road. Sure, it's not going to help a lot, but the total flow will be a bit better. So even if this is off the main pathway, a little bit, say 10% of the flow, will actually go that way. The reason for mentioning that is not necessarily that this path is super important. For two paths it might not matter, but a real protein might not have two paths; there might be 20 paths with small differences. And when we say one path, same thing here: you can think of the paths in a sort of macroscopic way. That doesn't mean that along this path every single atom will have to move in exactly the same fashion every single time the protein folds, right?
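The traffic argument above, that parallel routes add their flow, can be sketched numerically. This is only an illustration under stated assumptions: each path is reduced to a single barrier, the rate over a barrier is taken as proportional to e raised to minus the barrier, and the barriers are assumed to be expressed in units of kT. The function names and the barrier values 1.2 and 3.5 are hypothetical choices, not values read off the lecture slides.

```python
import math

def total_rate(barriers_in_kT, prefactor=1.0):
    """Sum of the rates over independent parallel paths, each with one barrier
    given in units of kT (an assumption made for this sketch)."""
    return sum(prefactor * math.exp(-b) for b in barriers_in_kT)

def flow_fractions(barriers_in_kT):
    """Fraction of the total flow carried by each parallel path."""
    total = total_rate(barriers_in_kT)
    return [math.exp(-b) / total for b in barriers_in_kT]

main_and_side = [1.2, 3.5]   # a main path and a higher-barrier side path
print(total_rate(main_and_side), total_rate([1.2]))
print(flow_fractions(main_and_side))
```

With these made-up numbers the side path carries a small share of the flow, and removing it lowers the total rate slightly, which mirrors the point that closing a minor road makes the overall traffic a bit worse rather than stopping it.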
Think about this as the general features in the landscape. Most proteins will go along the freeway, and then there are some proteins that will take other paths. So in general it's hard to start changing this. For protein folding, it can help us understand why proteins fold the way they do, but when it comes to engineering or anything, understanding exactly what paths we would take is difficult, and it's not necessarily obvious that you're going to completely change the molecule even if I cut out that path, right? Because even if I cut here, the protein will still be able to fold, and it's not going to be that much worse, because that barrier is not incredibly much higher than that one. So this is of course not natural selection or anything. We have the same protein, but if there are multiple reaction pathways possible for it, we will predominantly follow the one with the lowest transition state barrier.

Let's see if there were some other questions we didn't cover there. I think that explains 11. Thermodynamic versus kinetic stability: I know I have asked it before, but these are important questions, and I also realize you've had a lot of information by now in the course, and that's why it's worth rehashing. So what is this thermodynamic versus kinetic stability? When is it important? And kinetic stability is... So do you now see, based on this, why we thought these results from membrane proteins were so fascinating? We start to see that in some cases it appears that the helices that insert in membrane proteins might not actually be thermodynamically stable in the membrane. It might rather be that, at least under some circumstances, the translocon helps us achieve kinetic stability, and then we don't need the thermodynamic stability. So I'll repeat that. Remember that we had the membrane, right? And the translocon can't change the free energy of the helix in the membrane.
That's impossible, because free energy is a state variable. But this helix that we had here, the top and bottom of my helix have exposed peptide bonds. So if by some, let's call it magic, way I could place my protein here (and now I should have made my membrane thinner, sorry, I didn't normalize it by pen size; that's my membrane), if I have my two helices here, once they are inserted they're not going to like being pushed up, because then I would expose the lower hydrogen bonds, the peptide bond groups here, to the membrane. And if I push them down, I would also expose polar things to the inside of the membrane. So effectively, if I could choose between being here and out in the water, I might very well be out in the water, but the free energy barriers to get there are so high that in practice I'm going to stay in here over very long time scales. And again, this is not true for membrane proteins in general; most membrane proteins are hydrophobic. But the magic would then be what the translocon helps us achieve: if I could insert in this way, where I would not need to expose them, and then somehow gradually diffuse out into the membrane, the translocon helps us avoid that very high free energy barrier. But it can't change the minima. And again, full disclosure: this is "widely believed", meaning that I and some other people think that this is the case for some helices in membrane proteins. It's certainly not generally true.

It can't change the minima. Why can't it change the minima? Yes. I ask why it can't change the minima, but it's even broader than that. What was free energy? I had a name for that general property of physics. It's a state variable, right? So it only depends on the state, not the path by which you got to the state. But doesn't that contradict what I said?
Because we just said that it could change the free energy barrier, and then I'm changing the free energy. Exactly, right? The free energy of the barrier is a property of that state. The translocon is not changing that state. It can't change that state, because this is also a well-defined state. It's a transition state, where we would not like to spend time, but I can't change the state. It's impossible. What the translocon can do, though, is help me so I don't have to go through that path. Think about the process we just showed you: the translocon is helping us open up a different path, which has a lower free energy barrier. The end state is the same, so the only thing the translocon does is help me get to the end state without going through that very, very bad transition state. It can't change an individual state. It can just change the path.
Yes? A catalyst does exactly the same thing, right? A catalyst cannot change the free energy of a state, but it can help you find a different transition state, so that you don't have to go through the old, bad transition state. None of this can change the free energy of a specific state. Occasionally we are a little bit sloppy about this, and I say that it changes the free energy barrier. What that really means is that it changes the barrier I have to go over, by finding a different, lower barrier for me. But no specific state can have its free energy changed, because it's a state variable. Sorry, that was a bit of a detour, but this explains why we were so fascinated by it.
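As a rough numerical illustration of the argument above, here is a minimal sketch. It assumes that the rate of crossing a barrier scales with the Boltzmann factor exp(-ΔG‡/RT) and that the equilibrium between two states depends only on their free energy difference. All free energy values, and the function names, are made up for illustration; they are not measurements for any real helix or translocon.

```python
import math

R = 8.314e-3   # gas constant in kJ/(mol*K)
T = 300.0      # assumed temperature in K


def relative_rate(barrier_kj_mol: float, temperature: float = T) -> float:
    """Boltzmann factor exp(-dG_barrier / RT), proportional to the crossing rate."""
    return math.exp(-barrier_kj_mol / (R * temperature))


def equilibrium_constant(delta_g_kj_mol: float, temperature: float = T) -> float:
    """K = exp(-dG / RT) for dG = G(end state) - G(start state)."""
    return math.exp(-delta_g_kj_mol / (R * temperature))


# Hypothetical free energies (kJ/mol), relative to the helix in water.
g_water = 0.0
g_membrane = 10.0           # the inserted state is slightly LESS favorable
barrier_direct = 80.0       # transition state on the direct path
barrier_translocon = 40.0   # transition state on the translocon-assisted path

# The path changes the rate ...
speedup = relative_rate(barrier_translocon) / relative_rate(barrier_direct)

# ... but the equilibrium only depends on the two end states, not on the path.
K = equilibrium_constant(g_membrane - g_water)

print(f"rate speed-up from the lower barrier: {speedup:.2e}")
print(f"equilibrium constant membrane/water:  {K:.3f}")
```

With these made-up numbers the lower barrier speeds up insertion by many orders of magnitude, while the equilibrium constant stays below one either way: the helix is kinetically trapped, not thermodynamically stable.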
Yep. So in general we don't know, and that's why I have all these caveats. Most membrane proteins are stable there, right? And we know that because most of them are very hydrophobic, and by most we are talking about 99%. But as always in science, it's the exceptions that are very interesting, because the exceptions tell us something important. The exceptions we have found are particularly the S4 helix in voltage-gated channels, where we have four or five charges in the helix. So there are a few examples of these outlier proteins, and their exact stability is, again, still active research. I wish I could give you the answer here. This is still on the hunch level. We believe this is true.
But it's also the individual helix that has lots of charges, right? So we need to be able to insert the individual helix and have that stable. What was the case with this helix? Red... well, I need the blue one, I don't have blue pens, and blue is usually positive. So this helix with positive charges: the second it has been inserted in the membrane, we frequently had this other helix with negative charges being paired up with it. So it might very well be that we only need this transient stability to be long-lived enough until it finds its partner, and that we don't know. It's still a super interesting research area, and there is room for tons more PhD theses in it. Some of the hardest things are actually doing it experimentally, right? We have a hunch, we know roughly how it behaves, and I would argue that there are even simulation models and everything. But I think what's needed here is one or two key experiments to try to prove this: are they kinetically or thermodynamically stable? It would likely be a Nature or Science paper if you come up with a good way to do it. So what is then the role of these transition states in protein folding?
They exert kinetic control over the process. They determine how fast things will fold, and conversely they also exert kinetic control in the other direction, such as in the membrane: they determine how fast you can unfold.
So that brings us to today's topic, drug design. You know most of these things already, but now we're going to need to apply them a bit more. Likely some of these things existed already when the book was written, but this is now more modern computational chemistry rather than the physical approach to it. You can start from a sequence and predict the fold by either bioinformatics or some advanced simulation methods. Whether it's going to work and how accurate it is are separate issues, but you can at least try to do it. You could, say, build in the side chains and even try to energy minimize the structure, using things that you either have learned or will learn in the labs. You could run a small simulation of this protein. In principle you could work with the protein and predict the energy and entropy and everything.
There are two problems with this. First, this can teach you a lot about one specific protein molecule, but you're still going to be limited to relatively short time scales. You might understand a bit about how the protein moves, and that is certainly important in many cases, but if you're going to design drugs you might want to learn about processes that take a second or so, processes that are far beyond what we can simulate. Second, you might need to do this for 500 different mutants. Or worse, you might be searching for a new ligand, the drug, and you have no idea what the drug is. That's more common than you think. Most diseases that we want to treat end up with finding a protein that is involved in the process. That's not that hard. There are probably a hundred diseases where we know roughly what receptor protein is causing the disease, but that doesn't mean you can treat it, and now it's up to you to find something to combat this disease. For example, the example I mentioned with this poor child at Karolinska, right? We know the protein. We know the mutant. We know why it happens. But we have no idea how to treat it, and that's kind of the topic today.
So drugs are superficially fairly easy. You have a target that is virtually always a protein. Nature, and your body, does this all the time: you have some small ligand, the ligand-gated ion channels are one example, that will bind to the target, and once this binding happens you elicit some sort of biological response. That could be growth if it's a growth hormone, it could be the opening of an ion channel, or a hundred other processes. The point of drug design is kind of to mimic this. It's rare that we bind to something completely new that has never been found in your body before. It does happen: you could imagine getting an antibody to bind to something to kill certain cancer cells or so. But mostly it's much more efficient to find something that's a normally occurring target in your body and see if we can bind to the same thing and create the biological response. The problem here is that we don't know what the yellow part is. You're going to need to design the yellow part.
The way this virtually always happens is that you start from a genome study. I'm sorry, I should have had a sequence here. I would say not 20 years ago, but today you start from a genome study and find out that there are certain mutations or something that are related to the disease. You might even build some biological networks and realize that this is a very complicated, say, tumor or something. You're going to study this in the next course, I think, on comparative genomics.
So we've talked about proteins as sequences in the bioinformatics course, and we talked about them as structures here, right? But let's draw a protein as a small dot. This is the protein. This protein interacts with this other protein, and we can determine that either functionally or because they are expressed at the same levels or something. There are experiments to find out. But then there are more proteins in your body, like 20,000, right? And you can imagine starting to draw networks of proteins too: which proteins tend to co-vary, or something. Then you might realize that in this particular cancer, say, all these five proteins are involved, but that one is really the hub. That is the most important protein, the one that appears to always be required for the process. And then that might be the protein we would like to go after and change how it interacts.
The first thing we need to do then is what? Usually you need a structure. This might change in the future, but today we still need a structure. I'm going to show you in the second half today that there are examples where pharmaceutical companies have paid a billion dollars to get a structure, because it's so important. Once you have a structure, if you're lucky you might actually already see something bound, or we will see these pockets, and somehow we need to design a molecule that binds in this particular pocket of the structure that we now know, and creates the effect. Hopefully at this point you have enough biological knowledge that we know where things should bind.
There are hundreds of examples here, but if you classify the targets in the body... I think this is from 2003, you know, it's ten years old by now, sorry. All this part is G protein-coupled receptors. 27% of all drugs on the market hit G protein-coupled receptors. That's why we're going to talk about them later today. Then nuclear receptors and things, transcription factors. The one part that is growing here is actually ion channels. It's probably at 20% or so now, because we're learning more and more about the structure of ion channels. If you just look at this in the course, it might sound horrible: why should we just think about the receptors, GPCRs and ion channels? Well, if you do that, you cover pretty much half the spectrum of all drug design in modern pharmaceutical sciences. So that other part is not unimportant, and that might be your new Nobel Prize here. But if you have to take a pick as a company, target that or target that, it's not a very hard choice. You're going to target that.
Drugs can do a few different things. We kind of talked about this when we spoke about the ligand-gated ion channels, but we're using different terminology here. Normally, if this is a receptor and this is some sort of biological activity, you could argue that the normal thing would be that the receptor is activated. It starts cell division or whatever it is, right? So you have some sort of normal process here that I don't show you. By far the most common drugs are inhibitors, and they do exactly what the name says: they turn off the receptor, so that you get down to the baseline again, the black part. That's an inhibitor drug. Then there are examples that you call agonists. Agonists are drugs that create the same response: they turn on the receptor. A full agonist would be one that turns it on to 100%. You can imagine a partial agonist that only turns it on to, say, 75 or 50%. And there are also examples that you call inverse agonists. An inverse agonist is something that causes a response, but it's the opposite of the normal biological response. So there are agonists, inhibitors and inverse agonists, and all three of them are important. It depends on what you want to achieve, right?
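The four drug classes just described can be summarized in a toy sketch. It assumes the simplified picture from the slide: activity is measured on a scale where 0 is the baseline (receptor off) and 1 is the full natural response. The function name, the scale and the tolerance are illustrative assumptions, not a standard pharmacological definition.

```python
def classify_ligand(response: float, baseline: float = 0.0,
                    full: float = 1.0, tol: float = 0.05) -> str:
    """Classify a drug by the receptor activity it produces.

    response: measured activity with the drug bound (hypothetical scale)
    baseline: activity with the receptor turned off
    full:     activity produced by the natural, fully activating ligand
    tol:      tolerance for treating two activity levels as equal
    """
    if response < baseline - tol:
        return "inverse agonist"   # opposite of the normal response
    if response <= baseline + tol:
        return "inhibitor"         # back down to the baseline
    if response < full - tol:
        return "partial agonist"   # e.g. 50% or 75% of the full response
    return "full agonist"          # turns the receptor on to about 100%


for activity in (1.0, 0.75, 0.0, -0.5):
    print(activity, "->", classify_ligand(activity))
```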
Say you're developing a blood pressure drug, and we know that this receptor is one that increases your blood pressure. If your blood pressure is too high, you would like to try to develop an inhibitor or an inverse agonist, right, to reduce the effect. Or it's the opposite. Blood pressure is a bad example, but assume it's some sort of receptor that you would like to kick-start and get to work better. Then you would like a drug that is an agonist. How difficult do you think it is to develop drugs like that? It's not that hard, and we're going to see it soon, but the problem is that biology is way more complex than that. Finding something that binds to the target is frequently the easiest problem. The reason why pharmaceutical companies are so large, why they have entire teams, is that first you need to make sure that your compound does not bind elsewhere. Getting something to bind to a target is easy. Just make it small and hydrophobic. It's going to bind to a ton of places. Your place is going to be one out of 1,000, so sadly it's also going to bind to 999 other places. Side effects. That's the bad part. So it's not enough to design for something. This is frequently called counter-design or anti-design: we need to make sure that it does not bind in the wrong places.
There are numerous examples in the medical literature of side effects of drugs, right? And the problem is that the side effect might be rare. It might be a very rare genotype or something, so that it takes time to show up, as happened with these anesthetics. A few years ago there was a drug that even made it all the way to market, and then they realized there are cases where patients get an anaphylactic shock and die on the operating table, and then they had to pull the entire drug from the market. If you eat the drug, the drug might also need to get to your brain, if it's a neuropharmacological drug. Why is that a problem?
You have a blood-brain barrier, and the whole point of the blood-brain barrier is to prevent things from getting into the brain. So most drugs will not get to the brain. That's still an unsolved problem. What we typically do now for very rare diseases is that you pretty much have to inject things directly into the brain, which you will do if it's a matter of saving your life, right? But it's not something you're going to do on a weekly basis going to the doctor. You also need something that's easy to get into the body. Again, you don't want to have to inject it. This might sound harsh, but if this requires you to go to the doctor and get two injections per week, unless it's going to save your life you're not going to do it, and then this will never fund all the cost of developing it. Ideally you would like a very slow and steady release of the drug, because otherwise you get this very strong effect initially and then it wears off. So by far the best way of delivering a drug is the case where you can put a patch on your skin, but that usually doesn't work, because most drugs will not go through your skin.
This entire process is called ADME-Tox: absorption, distribution, metabolism, excretion and toxicity. This is where most drugs fail, and it is by far the most difficult part of drug design. Nowadays we're increasingly using computers to predict the ADME and toxicity of drugs. There are a handful of rules of thumb here, called Lipinski's rule of five, and this is very much hand-waving. This is mostly based on trial and error. Historically, all drugs have fulfilled this: that they are small.
They have a molecular weight below 500, which means that they are small enough that they can be transported in the blood and everything. They can't be too hydrophobic, because then they would never get into the blood stream. You should have a few hydrogen bond donors, but not too many, because if there are too many hydrogen bond donors and acceptors they would stick too much in other places, and they're not hydrophobic enough to bind to the hydrophobic patches. Do you see that these are kind of contradictory? Because we also need the drugs to get into your cells, and if this is a brand new drug, there will likely not be any obvious transporter that's going to transport your drug into the cell.
This has not led to a new drug in 20 years. It used to be that you went the traditional way: find a new drug in the rainforest or something, purify the molecule, and there you had it. It's too hard. Oh, we even have a bunch of them here. These are real drugs on the market, typical small molecules with a bunch of hydrophobic rings, and these rings are actually not a coincidence. I don't remember whether I have a slide about it, but if a drug was very long and large, with a free floppy chain, it would have a very high entropy when it is not bound, and when you then bind it, it would reduce its entropy too much. So for things to bind well, they have to be fairly rigid so that they don't lose too much entropy upon binding. All of these are things that you can go out and buy, at least with a prescription.
Historically, the way we developed all of these things is that you find something, not necessarily in the Amazon, but... you laugh, but this was very common. 50 years ago there was an entire department of pharmacognosy in Uppsala, and they used to go on expeditions and try to find things, because this is a hidden gem of undiscovered plants and everything, right?
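Going back to the rule-of-five criteria from a moment ago, here is a minimal sketch of how such a filter could look. Only the molecular weight limit of 500 is stated in the lecture. The other thresholds (logP at most 5, at most 5 hydrogen bond donors, at most 10 acceptors) are the commonly quoted form of Lipinski's rule. The `Compound` class, the function name and the descriptor values are assumptions for illustration: the aspirin numbers are approximate, and the second compound is invented. In practice the descriptors would come from a cheminformatics toolkit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Compound:
    name: str
    mol_weight: float        # g/mol
    log_p: float             # octanol/water partition coefficient (hydrophobicity)
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(c: Compound) -> List[str]:
    """Return the rule-of-five criteria that the compound breaks."""
    violations = []
    if c.mol_weight > 500:
        violations.append("molecular weight > 500")
    if c.log_p > 5:
        violations.append("logP > 5 (too hydrophobic)")
    if c.h_bond_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if c.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    return violations


# Approximate descriptor values for aspirin; the second compound is hypothetical.
aspirin = Compound("aspirin", 180.2, 1.2, 1, 4)
floppy = Compound("hypothetical large compound", 820.0, 6.3, 7, 12)

for compound in (aspirin, floppy):
    broken = lipinski_violations(compound)
    print(compound.name, "->", broken if broken else "no violations")
```

This is a hand-waving filter in the same spirit as the rule itself: it flags compounds that are unlikely to be absorbed well, and says nothing about whether a compound binds the target.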
And then you know, not just in the Amazon but even in Europe too, that if you eat this plant you get rid of your headache, or you get less tired, or something. That is of course because there is some sort of molecule in this plant that does something. Then we would try to isolate this in a laboratory and hopefully turn it into a blockbuster drug. This can be harder than you think, because it might actually turn out that the drug is technically poisonous, toxic. The problem then is that you need to find out: can you find a way that retains the good properties while making the drug less toxic? That's usually a matter of taking these drugs, adding a methyl group, trying to remove the oxygen there. Organic chemists are outstanding at this, and they're certainly far more skilled than I am. So that's where you have entire teams. Can you imagine what the universe of drug space is?
It's like hundreds of billions of small compounds you could create, right? What this frequently leads to today, though, since it's very rare that this happens, is that there are quite a few me-too drugs. And "me too" here has nothing to do with the recent debate. Me-too drugs means basically that your company has discovered a new drug that treats high blood pressure, and you've done that by targeting receptor X. My company would prefer not to spend five billion dollars on this, and, oh shit, going after that receptor is really efficient and has very few side effects, so maybe we should do it too. Now the problem is that you have a patent on your drug, so I can't copy your drug. But I can choose to design another drug that binds in the same place where your drug binds. You had to take the entire development cost of proving that this receptor was a good target and everything, so I know that if I target the receptor you're targeting, my drug will likely make it to market too. So this is a way to do it way cheaper.
Alternatively, I might know that there is a particular receptor that I would like to target, but there is no known drug whatsoever that hits it, and this is increasingly the most common scenario. Then I'm going to need to go down into the organic chemistry lab, and we will try to design a brand new compound that has never existed in nature, that will bind to the receptor in question and elicit the response I want, whether it's an agonist, inverse agonist or inhibitor. This is hard because of all the reasons we mentioned on the Lipinski slide, right? And as I've kind of hinted before in the course, this is not just the future.
This is increasingly happening, but it's still very much at the research stage: we are increasingly designing protein drugs, small proteins or parts of proteins that will bind to other proteins. This gets us specificity, because of all the things that we've said in the course: proteins are way more specific than these small molecules. But then you also have all these problems with, for instance, administering the drug. You're going to need to inject them, because if you eat them you're going to digest the protein, and all your enzymes will destroy it. I wouldn't say that the pharmaceutical world is in a crisis, but it's far more difficult than it has been, and one of the reasons for that is that we don't accept side effects anymore. Take one of the simplest, aspirin. There is no way aspirin would have been approved for the market today. It's too dangerous: bleeding problems, things related to the heart, overdoses and everything. But because it has been on the market for 80 years or something, we accept it.
So modern drug design is a bit different, actually. We first need this target. If this costs us a billion dollars, we will go after it. Most pharmaceutical companies don't randomly start to work with drugs for all different types of diseases. As a pharmaceutical company you tend to develop a profile, so that, say, my company is mostly interested in neuropharmacology, or blood pressure drugs, or something related to acid reflux. That means that in this particular area that my company is working on I might have three or four protein structures. Do you think I'm going to share those protein structures with the rest of the world? I just paid a billion dollars to get them.
There is no way we're depositing them in the Protein Data Bank, because this is, again, a trade secret. Because nobody else has this structure, I will be able to design drugs that you can't design. Then we're going to need to find some small molecule to start with, and that could pretty much be divine inspiration. Divine inspiration is usually not a very efficient way of designing drugs, and if you write that on the exam, you're not getting it right. Of course we use a lot of computers here to find something, and I'll explain to you soon how we find it. Hopefully this has some sort of effect and it's not extremely toxic. The problem is that it is going to be inefficient. It's always going to be inefficient, because the likelihood of finding something efficient randomly is nil. Then we need to optimize the drug. I'll also cover what this inefficiency is soon.
And this entire process of optimizing is fast: most pharmaceutical companies have a cycle of four to eight weeks. So in eight weeks I need to know what the result of your previous testing was, what looks most promising now, and what I am going to do in the next cycle to create the next round of drugs. Gradually, if this looks really promising, you would go first into the lab and start doing in vitro tests, and eventually you would start doing animal tests. If things look really promising, then you start with, say, phase one studies: is it safe in humans, completely healthy young humans? Phase two: is it efficient in humans? By efficient here we mean, again, that if I developed a new drug for high blood pressure, in phase two you start administering this drug to patients that actually have high blood pressure. That is, does it even reduce the blood pressure in humans?
Sure, it did in mice, but a mouse is not a human. And then phase three. The problem is, yes, this might reduce blood pressure, but the question is whether it is better than any drug you already have on the market. If this drug is not better than what's already on the market, I will likely not get it approved. There has to be some advantage over drugs that are already on the market, or the Food and Drug Administration or the social services won't approve the drug. And forget about whether they approve it or not: a brand new drug is going to be very expensive, right? If it's a brand new, very expensive drug, and it has no advantage whatsoever over an old drug that is cheap, who's going to buy my drug? Nobody.
This process might take 10 years. You don't want to fail here. This is where companies go bankrupt. But the problem is, fail you do. In preclinical testing, 70% of projects fail. Out of the 30% that make it here, something like 40% fail in phase one, meaning that it's dangerous: there is something that happens in humans. Out of the roughly two-thirds that make it here, another 60% fail in that they don't really work in humans. It worked in mice, but for whatever reason humans work differently, so it doesn't have the effect we hoped for. You see this in the stock market now and then: drug discovery company X reported the results from their clinical studies, and then the stock dropped by 40% because they ended up in the red part, the study failed. In other cases the stock will go up by 50% because it worked. When things fail in phase three, this is the part where you've done studies on, say, 10,000 patients or something, and you can imagine what happens to the stock if you fail here. You've invested a billion dollars all the way, and it turns out that it doesn't work. We're going to need to cancel the entire project. And then there are some drugs where I as a company think that this is really worth it.
It's really worth taking to the market, but unfortunately the authorities don't agree. They don't believe my tests, or they think there are not enough studies. And here there is unfortunately a bit of competition between, say, Europe and the US. If you look at European newspapers, they are always upset that the US Food and Drug Administration tends to reject European drugs more than US drugs. Basically, they're using this as a trade negotiation tool to create an advantage for US companies. Really unfair, until you look at what the European authorities are doing. They're doing exactly the same thing: they frequently reject US drugs more than European ones. And it's not just because of trade tactics. A European company will likely have interacted mostly with the European authorities when they developed the drug, and the US authorities or the Japanese authorities might have slightly different requirements. What then might happen is that the company needs to go back, take two more years and do another large phase three study. While they're doing this, the patent time is ticking, and there is going to be a smaller and smaller period in which you will be able to make money on your drug. As for the red part here: this is not bad.
The red part here is great, because when you fail here, this might be three people in the lab or something, and you pay their salary for one year. It's nothing. You want to fail here, because failing here means you have found the problem here instead of finding it here. So failing early is failing cheap. There was an example, I don't remember the company, and I probably shouldn't mention the company name anyway, this is recorded. A few years ago there were some new drugs, the anesthetics I mentioned, and they failed because there were rare genotypes in patients such that under some very rare conditions you could get side effects that could almost kill the patient, as I mentioned a minute ago. Since these were so rare, they were not discovered until post-FDA. It was discovered when the product was in clinical use. Can you imagine what happened? I think it was one of the largest pharma companies. They didn't go bankrupt, but they had to pull the entire drug. 10 years of development, just thrown in the dust. That's why new drugs are so expensive, right? Because for every drug that makes it to market, there are going to be a hundred that failed.
Yes? They couldn't find a way to fix the drug. There is no way at that point. It's too late. Not with current testing, but you could imagine that this is kind of going to be your job in the future, right? Today we know much more about the genotypes and we have many more sequences. Would it be possible to predict this already at this stage, based on the genetic variation we see in patients? Now, this might have been 20 years ago, right? At that point there was no way we could have known. But today we have the genetic information. I bet that the specific mutations that caused the problem are already in the sequence databases, and we are gradually learning what this genetic variation means. Could you already see some of it here?
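To get a feel for why failing early matters, here is a minimal sketch that chains the failure rates quoted earlier for the slide: 70% in preclinical testing, about 40% in phase one and about 60% in phase two. No failure rates were quoted for phase three or for approval, so they are left out. The variable and function names are illustrative.

```python
from typing import List, Tuple

# Failure rates as quoted in the lecture (fraction of projects entering each stage).
FAILURE_RATES: List[Tuple[str, float]] = [
    ("preclinical", 0.70),
    ("phase 1", 0.40),
    ("phase 2", 0.60),
]


def surviving_fraction(stages: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Fraction of all started projects still alive after each stage."""
    remaining = 1.0
    survivors = []
    for name, failure_rate in stages:
        remaining *= 1.0 - failure_rate
        survivors.append((name, remaining))
    return survivors


for stage, fraction in surviving_fraction(FAILURE_RATES):
    print(f"after {stage}: {fraction:.1%} of projects remain")
```

With these numbers only about 7% of the projects that enter preclinical testing are still alive after phase two, before the most expensive stage has even started.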
Say there is a risk that this particular drug would interact with something else. If that risk is 50%, we might try to modify the drug very early on and prevent it. I'm not saying it's easy, but it's no longer completely impossible.
So if you summarize this, the time to market, sorry, the time to patent, is something like 10 years, and this is for a successful drug. It might take another two years for this to actually get approved by everybody. How long is a patent valid? Typically 20 years, and in some cases in pharmaceutical research you have a chance to extend this by five years. But again, this is not just the first patent. Many of your earlier patents on the small molecules and everything will have been filed much earlier, so you might have something in the ballpark of 10 years when you can sell and make money off your drug. That's why drugs are priced the way they are. Costs might be something like 300 to 400 million euros, and this is probably low. This keeps going up all the time, because there are more and more requirements for drugs. Maybe 100 to 200 scientists are involved.
Of the phases you go through, the first part here is what you call discovery, or discovery research. This is really when you're looking at genomes and looking at your targets. This is stuff that people at SciLifeLab do a whole lot: finding new targets. Divine inspiration is good if it's successful, but this is very much becoming computerized. Today I would even say the majority of this happens in computers. At some point we're going to need to start testing things: high-throughput screening. This is also something we're doing at SciLifeLab, not so much electrophysiology maybe, but at very large scale. If you have 10,000 compounds or something, could you test them? Say I have my small pet receptor here.
Can you find me something that might possibly be possible to turn into a drug and Then there are these libraries. I mean, I think at silo head. We have a library of roughly a million molecules or so and These are molecules that we have trace amounts of and then you can either well in theory You could do it manually. We typically have machines do it and you just gonna you're gonna need some sort of assay And the assay is just the name. There's some experiment by which I can test if I have my receptor here And if I add this drug does it change the function of the receptor? Now this if it's an ion channel I can try to measure the currents and the ion channels as you're gonna see on Wednesday, but I just I need some way to measure Does that's my receptor or whatever it is work? And then you test this with 1,000 different chemicals Or a million and If you're gonna do it for a million you probably don't want that as a PhD project, right? And that's why we use robots. So there's a lot of this is becoming robotized There is an increasing Amount of companies where you can even do this remotely by writing Python scripts So just the way you can rent just the way they can rent computer space on Amazon There are increasingly labs in the world where you can rent lab space and control it with computers remotely completely pretty cool And at some point though here you're increasingly go more and more into the lab We have medicinal chemistry Combinatorial chemistry still computers and then when you're somewhere here You're having this entire team of people and then you sit down and meet every week and say you have now created a new bunch of Experimental tests here based on those we're gonna decide how are we gonna optimize the care the molecules in the computers the next Two three weeks then we need to order a new batch of 1,000 new compounds that some scientists say in China Asia anywhere, we don't do it in Sweden that much that try to design for us Or if it's a very rare molecule 
I might even have a lab of organic chemists downstairs that can design a custom molecule for me. But then I can't order 1,000; then it's one or two. And then four weeks later we test those again, and we look: the hypothesis we had four weeks ago, did it work better or worse? If it's better, we continue in that direction. If it's worse, we need to choose another direction. Then gradually you become more and more pre-clinical here, and after that we would have all the tests. So by the time you get to the tests, the development is really over. You can't change the drug once we get to the tests. We have a drug, and the only question we then ask is: is this drug good, yes or no? We don't change it anymore in the tests, so by the time you're done with pre-clinical, everything is settled. So all the actual research happens here, but all the costs are related to the testing. So let me do one more, two more slides. There are going to be a couple of steps that we need to go through, in particular in this pre-clinical part: finding a hit, seeing whether it has any effect, and optimizing it. And this is where all the computations come in. We typically need the protein structure. That might change, depending on how good you are, in the next 10 years, but for now we still need the structure, and what we do is that we need to find something, for instance in a database, with some sort of activity. Do you know what that is? It's roughly ten billion dollars.
I think more, probably a hundred billion dollars. Sorry? No, it's omeprazole, which is in Losec, for acid reflux. It's one of Sweden's biggest export successes ever, designed by Astra. This was a drug that was actually found but turned out to be toxic, and they eventually realized that by modifying this drug a bit they could remove the toxicity and get it to inhibit the acid reflux. Omeprazole is the real drug name; what all these companies then do is come up with a brand name they can protect. So this was originally called Losec, and then Prilosec and then Nexium in the US. I think those are just marketing names. But the initial version of this drug was lousy. First it was toxic, and you probably had to eat 10 kilos a day for it to have any effect, right? So that's where you need to go through a bunch of steps: try adding a methyl group there, see what happens if you change that chain a bit, gradually change the molecule a little bit, optimize it and get it to be better and better and better. And that is the second part, what you call the optimization or lead optimization. At this point you would try to isolate which part of this drug is really responsible for the binding, right? Because if you have a large molecule, it might very well be that it's that particular part that creates the good effect, and then we keep that part and try to redesign all the other ones to improve solubility and increase the efficiency. And what you do very much there is high-throughput screening. High-throughput screening sounds advanced, but it's really the equivalent of a student pipetting 1,000 things, and instead of having students do that we have robots do it. There are a few of these at SciLifeLab, so you can maybe do 150,000 tests a day, and for each of these tests the chemicals might cost like 10 cents per chemical or so. So substantial amounts of money, but again, compared to the clinical tests it's nothing. So here you want to find out, across this whole library:
What types of drugs might hit your receptor? If you're really lucky, out of those hundred fifty thousand there are 100 interesting hits, leads. But of course the likelihood that those hundred can lead to a drug is at this point still virtually zero. The point is that for any random receptor there are quite a few things that will bind to it. I wrote one dollar here; it's probably closer to 10 cents today, but it's still expensive. Not for one day, but if you operate this 365 days a year it adds up. Now of course, if those 100 are really good drugs, it's fine, but in general you're not gonna find anything there. And this really shouldn't work. Chemical space is something like 10 to the power of 60 drug-like molecules, if they're gonna be small and everything. If you randomly screen 10 to the power of six of those, the likelihood that you're gonna find the best molecule doesn't even show on the chart, right? And these are actually real examples from recent studies. They tested 300,000 compounds: zero hits. Let's assume it was $1 each, this is the old days, so that cost you $300,000 for that experiment. Can you imagine how happy your boss is when you come back and say that it's zero hits? But the point is, it happens. And then cruzain, another example: 200,000 compounds tested, 146 hits. Well, which one of these do you think is better? Maybe, but there are also dangers. Either this is a target that lots of things bind to, or, and I have no idea what database this was, it could also have been a different database, and this database contained lots of small molecules that will bind in many different places, right? At this point, for all we know, these could be bad binders that will cause side effects.
None of them are gonna be efficient. So what we're gonna need to do is optimize these and get them to be better and better. We still have four minutes; I'll do a few more slides. The other thing we could do is somehow try to do this computationally. I'll explain what QSAR is in a second. So instead of doing this in the lab you could use a computer, a large computer. But you can't do this with a simulation or anything; it's too slow. The point of a computer is not to do it 300,000 times; the point of a computer is to be able to do it 300 million times, right? So we need to be very fast at predicting whether this is going to be efficient or not. What you typically say is that maybe we could do some sort of classification. Remember Lipinski's rules? So find me all molecules that have at most five hydrogen bond donors and at most ten hydrogen bond acceptors and that are small. And if I know that for this particular receptor it helps to have negatively charged molecules, find me everything that's negatively charged. So that is some sort of correlation between the structure of the drug and how good we think it's going to be at the receptor, and this has a name: QSAR, quantitative structure-activity relationship. This sounds very fancy, but all it means is that I look at some rough correlation between the properties of molecules and what in general binds to this receptor. I'll come back to this slide in a second. This is a very simple example for anesthetics that I already mentioned, right? We know that the more hydrophobic things are, the better they're going to be as anesthetics. It's a very simple structure-activity relationship; there are more advanced ones in the literature. And what this enables you to do: you can screen something like a million compounds in the lab, while in the computer you can increase this by at least a thousandfold. There is no
way, no matter how many robots you have; there is no company in the world that could screen a billion molecules in the lab. So the point here is not necessarily that you need to be better than the lab, right? What you possibly lose in efficiency or accuracy you can compensate by volume, because if I test 1,000 times more drugs, I'm more likely to find the really good one. So what QSAR typically does is that we look at things like how large the molecule is, its charge, the dipole moment, the surface (is it a polar or a hydrophobic surface?), how many hydrogen bonds you have, and whether it's soluble in water versus soluble in octanol. And this is just as horrible as it sounds. But at this point I don't need a perfect drug. I just need something to start from; if you give me something to start from, we can then try to optimize it and refine it, and we will look at that after the break. At this point, just find me something that might possibly be able to bind. Let's say the blue ones are hydrogen bond donors, the cyan ones are hydrogen bond acceptors, and then there are some charges or something. Just find something that is roughly the right shape and that might work. This is the last slide before the break, I promise. There are some advantages here. This is super fast. It might take a millisecond per compound or something, so you can screen an insanely large database.
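To make that kind of rule-based filtering a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Compound` fields, the toy library and the helper names are made up, and the thresholds are the usual rule-of-five style limits rather than anything taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record holding a few precomputed descriptors."""
    name: str
    mol_weight: float       # daltons
    hbond_donors: int
    hbond_acceptors: int
    log_p: float            # log10 of the octanol/water partition coefficient
    net_charge: int


def passes_filter(c, max_weight=500.0, max_donors=5, max_acceptors=10,
                  max_log_p=5.0, required_charge_sign=None):
    """Return True if the compound satisfies every descriptor rule."""
    if c.mol_weight > max_weight:
        return False
    if c.hbond_donors > max_donors or c.hbond_acceptors > max_acceptors:
        return False
    if c.log_p > max_log_p:
        return False
    # Optional receptor-specific rule, e.g. "only negatively charged molecules".
    if required_charge_sign == -1 and c.net_charge >= 0:
        return False
    if required_charge_sign == 1 and c.net_charge <= 0:
        return False
    return True


def screen(library, **rules):
    """Names of all compounds in the library that pass the filter."""
    return [c.name for c in library if passes_filter(c, **rules)]


# Toy library with invented numbers.
LIBRARY = [
    Compound("cmpd_A", 320.0, 2, 5, 2.1, -1),
    Compound("cmpd_B", 650.0, 3, 8, 3.0, -1),   # too heavy
    Compound("cmpd_C", 410.0, 6, 4, 1.5, 0),    # too many donors
    Compound("cmpd_D", 280.0, 1, 3, 4.2, 0),
]

print(screen(LIBRARY))
print(screen(LIBRARY, required_charge_sign=-1))
```

Each check is a handful of comparisons on numbers you already have, which is why this kind of screen can run at something like a millisecond per compound. It also shows the weakness: anything outside the rules you typed in is silently thrown away.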
That database might very well be larger than a billion molecules today. And we do find some ligands. The problem is that you're only going to be able to find the things you already know, because I told you to look for things that weigh 400 and that have two hydrogen bond donors and one hydrogen bond acceptor. Well, I just told you what to look for; you're not going to find anything that doesn't fulfill that. And what if there was an amazing molecule that was hydrophobic, without hydrogen bonds? You will never find it, because you already chose to discard all those molecules. This is gradually giving way to machine learning methods based on deep learning, and it's some of the hottest things happening in the industry right now. Here we're very much trying to describe the physics, right? You're counting the hydrogen bonds. But can't we let a computer train on this: show the computer all the drugs that have been successful, show it all the things that bind to, say, ligand-gated ion channels, and tell the computer to find more things like this? Don't let me say how many hydrogen bonds there should be. And this seems to work. Again, deep learning is so new that it's only been around four or five years, but there are a whole lot of amazing papers being published now on how good it is at finding drugs. So I think that in a few years you're gonna forget about QSAR and you're gonna say deep learning. It's 10:30 exactly; let's meet here at 11, and I'm gonna talk about pharmacophores and a bit about how we do the optimization.

So we mentioned QSAR, right? There are slightly more advanced ways you can do this, because at some point we need to describe to the computer what we're looking for, and there is a concept related to QSAR that's called a pharmacophore, which I guess you've at least heard of. So what does a pharmacophore do at this point?
I have my molecule, say my blood-pressure-related receptor or something, and I know I've found some drugs here that have a little bit of effect, and I would like to find more, because I don't have any blockbuster amazing thing yet. What you can then start to do: if you look at this molecule, and maybe I have two or three hits here, I might notice that for all of these hits there appears to be an oxygen there and a nitrogen there, and then I have four hydrophobic rings. That doesn't really say that much, but maybe I know there has to be a nitrogen there, an oxygen there, and a hydrophobic ring between them. So maybe you can start to describe all these pairwise distances, roughly, in space; forget about all the details here. So maybe I say there's a hydrogen bond donor, then something polar, a ring here and a ring here: a super simple summary of the molecule. What are just the overall properties of the molecule? And this is called a pharmacophore. Just as we have had these very simple models of proteins, a pharmacophore is a very simple model of the overall properties of a molecule that binds in this pocket. The reason for doing this is that rather than trying to search for any molecule, I can say: okay, if these are the properties that the molecules have to have, I can go into the database and search for molecules that fulfill that pattern, and there are large pharmacophore databases. So then I can find other molecules that at least have a high likelihood of fulfilling this pattern. This is also something that you do every week in this four-to-five-week cycle of the pharmaceutical development project: you need to find more and better drugs. There are a bunch of common elements here. Virtually all drugs have these two, three, four aromatic rings (sorry, not hydrogen bonds), rarely more, because then they become too large. You might have a couple of hydroxyl groups; again, they can't just be hydrophobic. So the space is fairly large, but we also tend to reuse
the space quite a lot. So these are a bunch of examples of ligands against a particular target. The point is: can you then try to somehow describe what the common features are here, right? It's not entirely easy, but there you have the hydroxyl, hydroxyl, hydroxyl. So there are some patterns here that are a bit common, and there might not even be one pharmacophore that can describe all of these. But maybe we can define some sort of average properties of all these molecules: one aromatic part here, a hydrophobic part there, a hydrogen bond donor over there, and then, say, two hydrogen bond acceptors. It's not going to be stellar, it can actually be pretty bad, but it's better than nothing. We can also say... I hate Microsoft, sorry. That's why I don't even have it installed on my computer. You can also somewhat describe how large it is and the amount of volume it excludes: what is the volume in this part of the molecule, to say something about the overall shape of the molecule. And I'm well aware how fuzzy this sounds. But it is fuzzy. There is no simple answer to just finding your molecule, and the reason why this is fuzzy is that up to this stage we haven't used anything about the protein structure. That's why we're blind; you don't know what receptor you're trying to target. I know that other molecules that tend to bind to this receptor tend to have roughly this shape: find more things like it. That's why we're fumbling in the dark. So the obvious way to stop fumbling in the dark is to have some information about the structure, so structure-based drug design, and that is very much where we've been heading the last decade or two. Most modern drug design tends to be structure-based already in the hit phase, so that I want to use some sort of computational prediction to find out what are good hits. If I know my receptor structure, can't I just try to calculate molecules that would be good hits there?
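Going back to the pharmacophore as a handful of pairwise distances between features, here is a small sketch of what searching for a pattern could look like. The feature names, the target distances and the one-point-per-feature representation are all assumptions made up for illustration; real pharmacophore tools handle several features of each type and many conformations.

```python
import math

# Hypothetical pharmacophore: target distances in angstrom between feature types.
PHARMACOPHORE = {
    ("donor", "acceptor"): 5.0,
    ("donor", "ring"): 4.0,
    ("acceptor", "ring"): 3.0,
}


def matches(features, pharmacophore=PHARMACOPHORE, tolerance=1.0):
    """Check whether a candidate fulfills every pairwise-distance constraint.

    `features` maps a feature name to one (x, y, z) position in angstrom.
    This is a simplification: one representative point per feature type.
    """
    for (a, b), target in pharmacophore.items():
        if a not in features or b not in features:
            return False
        distance = math.dist(features[a], features[b])
        if abs(distance - target) > tolerance:
            return False
    return True


# Invented candidates.
good = {"donor": (0.0, 0.0, 0.0), "acceptor": (5.0, 0.0, 0.0), "ring": (3.2, 2.4, 0.0)}
stretched = {"donor": (0.0, 0.0, 0.0), "acceptor": (9.0, 0.0, 0.0), "ring": (3.0, 0.0, 0.0)}
no_ring = {"donor": (0.0, 0.0, 0.0), "acceptor": (5.0, 0.0, 0.0)}

for name, candidate in [("good", good), ("stretched", stretched), ("no_ring", no_ring)]:
    print(name, matches(candidate))
```

Note that nothing about the protein enters here. The pattern only encodes what earlier hits had in common, which is exactly why this stage is fuzzy.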
Yes? So in theory, maybe. The problem is that, first, getting antibodies to bind to these very small molecules is hard, right? The other problem is that if you want to develop a new antibody, it's expensive. It's several thousand dollars per antibody if you're lucky, and I want to test a billion molecules. So we'll get there later, but we're still at the stage where I just need to cast my net as wide as possible. I don't care about the accuracy or precision yet. I just need to find anything. I'm desperate. So let's stay desperate a while before we do the expensive stuff. So, docking. If you look at this superficially: given that structure, I want to find the best ways to put two molecules together. "Best" somehow means I need to rank them; I need to find which ones are good or bad, and if I'm gonna do this for a billion molecules I don't have time to do it accurately. I need to find a super fast method to do it, and I also need to somehow find the best ways to put the molecules together. There is more than one way to put molecules together, right? In theory you could do this in a long molecular simulation, but that way I might be able to test one molecule. I'm gonna need to test a billion, so I'm gonna need something that is way faster and super fast at searching. So if you look at a receptor here, here's a small molecule bound, dopamine. Maybe you start by understanding the pharmacophore, and then we can select from the database where the pharmacophore looks roughly like this. And then you find 10,000 molecules like that, and for each of those 10,000 molecules... Well, small as this is, it's nothing like a protein, right?
But you can rotate around that bond, you can rotate around that bond, and that bond, that bond, that bond, that bond and that bond, and you can also rotate the entire molecule. So we're already talking about something like 10 to 15 degrees of freedom here for a tiny molecule. And then you can rotate it in space and translate it and bind it in different pockets. You have 0.1 seconds for that, because again, I'm gonna need to do this for a billion of them. And if you compare this, it's basically this. It's really not more fancy than this. We need to do this super quick. Round peg, square hole: doesn't work, throw it away. Triangular peg, square hole: doesn't work, throw it away. Eventually, square peg, square hole: okay, let's keep that, and then we try more and more. There are slightly more degrees of freedom here, but you're not really more fancy than a two-year-old just trying it. The point here is not that the two-year-old knows a whole lot about physics, right? You can test this anyway, and we're gonna try pretty much the same approach. We typically take one ligand, and we're not gonna move the ligand too much. You might have to change at least the side chains in your protein a little bit to allow binding. But at this point I don't care about doing this accurately, with Boltzmann weights. I just ask: is it completely impossible to bind this molecule? Then throw it away. If there is a small chance that it might work, let's save it for later. But in this case you're gonna need a structure, either a crystal structure if you're rich, or you can build a homology model. Ten years ago I would have been exceptionally skeptical about trying to do docking based on a homology model, because the homology models were too low-quality. That is no longer true. There are a bunch of really successful studies where people have been able to dock and find things that actually work based on a model. But for this to work, there are gonna be two things we need. You're gonna need to sample
things. We're gonna need to test tons of things here, but it has to be very quick. And for each of these tests I also need to assign some sort of score, so that I can say which ones are good versus bad later on. And the point is that your guess is as good as mine. We need to reduce the amount of sampling as much as we can here. So what you typically do is that we keep all these aromatic rings fixed, and we only rotate around a handful of bonds, maybe four or five bonds. The fewer degrees of freedom you have the better. And then we try a few different ones, and we don't allow this to sample everywhere. We might have a small box, say one nanometer by one nanometer by one nanometer, because I know this is the likely binding site; I'm only gonna test inside this specific binding site. And I can't afford to sample every single angle, so let's just pick 10-degree variations. This is sloppy. Sloppy is fine here, but I have to be fast, because I only have 0.1 seconds. Maybe we have 100 conformations per second or so, and then we use 10 computers, so that would be 100 conformations in 0.1 seconds. If you did this exhaustively it would take 200 years.
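A back-of-the-envelope sketch of why the sampling has to be cut down so hard. It only counts torsion-angle combinations for the rotatable bonds and ignores rigid-body rotation and translation inside the box, so it is not meant to reproduce the 200-year figure, just the scaling. The function names and the rate of 1,000 conformations per second (100 per computer on 10 computers) are assumptions for illustration.

```python
def count_torsion_combinations(rotatable_bonds, step_degrees):
    """Number of ligand conformations if every bond is sampled on a regular grid."""
    settings_per_bond = 360 // step_degrees
    return settings_per_bond ** rotatable_bonds


def seconds_needed(n_conformations, conformations_per_second):
    """Wall-clock time to score all conformations at a fixed rate."""
    return n_conformations / conformations_per_second


RATE = 1000  # assumed: 100 conformations/s per computer, 10 computers

coarse = count_torsion_combinations(5, 10)   # five bonds, 10-degree steps
fine = count_torsion_combinations(5, 1)      # five bonds, 1-degree steps

print(coarse, "combinations,", seconds_needed(coarse, RATE) / 3600, "hours")
print(fine, "combinations,", seconds_needed(fine, RATE) / (3600 * 24 * 365), "years")
```

Every extra rotatable bond multiplies the count by another factor of 36 at 10-degree resolution, and going from 10-degree to 1-degree steps multiplies it by 10 per bond. With a budget of 0.1 seconds per molecule, an exhaustive search is out of the question even for five bonds.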
So you simplify, simplify, simplify. One way of simplifying is that you find some things, either molecules or samples, that work fine, and you see what they score. You test 10 of them, or 100, and then we get some scores. Some of these look really promising, and in that case we extract those conformations and then we mutate them, either by adding a few other groups or by refining: say this particular molecule that we tested in this particular pose looked really good, but I only tested it in 10-degree increments. So let's throw away the 999 that did not work, keep the one that worked, and then test this again, but now in one-degree increments. So try to only spend time on the things that look promising. And that's the reason for keeping one in a thousand: if we start with a billion, first take it down to a million, and then maybe to one thousand, then somewhere along the road we're hopefully spending most of the computational resources on the few molecules that work. Here too, we don't necessarily have to obey the laws of physics. That's an advantage of computers. So here I said that we could take one molecule and try to move it, right? But you could also take a large molecule.
Let's break this molecule into pieces, and then we bind the different pieces in different parts here. Then we take the pieces that looked good and try to more or less randomly grow these into larger molecules. And when we do so, can we grow something that fills the entire cavity and that looks good? Hopefully we're gonna find some molecules that are better than others, while the molecules that end up bumping into the protein or themselves we discard, and then we repeat, repeat, repeat. And the reason why we can do this fairly quickly is that we use fairly horrible scores. All the stuff we learned at the beginning of this course is valid, but we don't necessarily have time for it. We could use an entire force field, but that gets expensive, and I can't have all that water; that's too expensive. So maybe we can be even more empirical. We say: if there is a chance that you would form a hydrogen bond here, let's call that plus 5 if high is good, or minus 5 if you think low is good. If things might bump into each other, we say that's bad, and I assign a completely arbitrary score to it. And we kind of calibrate the score as we go, so that if experiments confirm my scores, we keep them. You can even do this with statistics: if we know that it's very common that an oxygen is close to a nitrogen, because they would form a hydrogen bond, we can say that if you have an oxygen and a nitrogen within 3.5 angstrom, the typical distance of a hydrogen bond, that's good; let's call it minus 5. While if you have two oxygens facing each other, we know that they would likely repel, so let's say plus 10; that's bad. And then you somehow have to add all these things together. Sorry, one more thing: you're also gonna need to use a sort of grid for this.
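The empirical score just described can be sketched in a few lines. The minus 5 for an oxygen and a nitrogen within 3.5 angstrom and the plus 10 for two oxygens come from the description above; the clash distance of 2.0 angstrom, the clash penalty of 20 and the atom representation are assumptions added for illustration, and lower totals are taken to be better.

```python
import math

HBOND_DISTANCE = 3.5   # angstrom, typical hydrogen bond distance
CLASH_DISTANCE = 2.0   # angstrom, assumed cutoff for "bumping into each other"
CLASH_PENALTY = 20.0   # arbitrary, assumed


def pair_score(elem_a, elem_b, distance):
    """Arbitrary empirical score for one ligand/protein atom pair."""
    if distance < CLASH_DISTANCE:
        return CLASH_PENALTY
    if distance <= HBOND_DISTANCE:
        pair = {elem_a, elem_b}
        if pair == {"O", "N"}:
            return -5.0        # likely hydrogen bond: good
        if pair == {"O"}:
            return 10.0        # two oxygens facing each other: bad
    return 0.0


def docking_score(ligand, protein):
    """Sum the pair scores over all ligand/protein atom pairs.

    Atoms are (element, (x, y, z)) tuples with coordinates in angstrom.
    """
    total = 0.0
    for lig_elem, lig_pos in ligand:
        for prot_elem, prot_pos in protein:
            total += pair_score(lig_elem, prot_elem, math.dist(lig_pos, prot_pos))
    return total


# Invented two-atom ligand next to two protein atoms.
ligand = [("O", (0.0, 0.0, 0.0)), ("C", (1.5, 0.0, 0.0))]
protein = [("N", (0.0, 3.0, 0.0)), ("O", (10.0, 10.0, 10.0))]
print(docking_score(ligand, protein))
```

There is no water, no electrostatics and no entropy in this, which is the point: it is a cheap ranking function that you calibrate against experiments, not a physical energy.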
The grid is easy to show here. I take my molecule, but instead of simulating this gradually through space, I just place it at different grid points or something, and then I evaluate for each atom here how likely it is to bump into something. There are a couple of different programs that do this in different ways, but the point is that they do it fast, even sloppily. In contrast to MD simulations, we can do this at super high throughput. That's the billion molecules. This is probably the only course where you're gonna see that we don't care about accuracy. Does this work? Well, otherwise I wouldn't be telling you about it. But rather: why does it work? I would even go as far as to say that on average it kind of doesn't work, right? If you test one molecule here and it has a low score, would you be willing to take that as a drug and eat it? Good. So on average it has very low correlation, but is the correlation zero? There is a slight correlation, right? If you score well here, you can probably accept that it's slightly more likely than random that you're gonna be a good drug. And if you now test a billion drugs... Let's be extreme here. Let's say that only one molecule out of a thousand that we test would be good. So let's say I scan through a billion molecules. Out of those one billion molecules,
I let the docking go down first to one million, and then in the second stage of docking or something I say: you know what, out of these one million molecules, rank them and then select the one thousand best molecules. The one thousand best molecules I could actually test in the lab. I can't test a million and I can't test the billion, but I can test one thousand molecules in the lab. So as long as there is one out of your one million molecules that you actually identified correctly and put among your top one thousand, I will find it, because I'm gonna test one thousand molecules. So the point is that I don't need to find the best molecule. I just need something to start working on, right? As long as the probability is not quite zero, there is some likelihood that you found something. And again, if you give me 500 great molecules out of the thousand I'm gonna be ecstatic, but I only need one that's really good. So docking is very much about improving the odds. It's not about predicting things in the computer and never testing them in the lab. If you only have a chance to test one thousand things in the lab, docking helps you test the one thousand molecules that are most likely to be useful. For that to work we need to be able to scan through a million of them in a day, so that maybe we can do a billion in total. And effectively you are predicting a binding free energy. We're gonna talk more about free energies tomorrow. It's just that we typically don't call it a free energy.
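The argument that a weakly correlated score still improves the odds can be illustrated with a toy simulation. Everything here is invented: the "true affinity" is a random number, the docking score is that number scaled down and buried in noise, the top 1% by true affinity are called binders, and higher is taken as better for both quantities (the sign convention is arbitrary). The library size is far smaller than a billion so that it runs in a moment.

```python
import random


def simulate_screen(n_molecules=50_000, n_selected=500, signal=0.3, seed=0):
    """Pick the top-scoring molecules with a noisy score and count real binders.

    Returns (binders found in the selection, binders expected from a random pick).
    """
    rng = random.Random(seed)
    true_affinity = [rng.gauss(0.0, 1.0) for _ in range(n_molecules)]
    # Sloppy score: a weak signal from the true affinity plus a lot of noise.
    score = [signal * a + rng.gauss(0.0, 1.0) for a in true_affinity]

    n_binders = n_molecules // 100
    by_truth = sorted(range(n_molecules), key=lambda i: true_affinity[i], reverse=True)
    binders = set(by_truth[:n_binders])

    by_score = sorted(range(n_molecules), key=lambda i: score[i], reverse=True)
    selected = by_score[:n_selected]

    found = sum(1 for i in selected if i in binders)
    expected_random = n_selected * n_binders / n_molecules
    return found, expected_random


found, expected_random = simulate_screen()
print("binders among the selected molecules:", found)
print("binders expected from a random selection:", expected_random)
```

In this toy setup most real binders are still thrown away, and most selected molecules are not binders, yet the selection should contain several times more binders than a random pick of the same size. That is all docking needs to deliver for the lab test of the top molecules to be worthwhile.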
We say that it's a docking score. And I think these are a couple of examples of molecules that you've gradually grown in and gotten to work better, and red and blue here would mean favorable electrostatic interactions. So in many cases it actually works, not in the sense that the predicted binding is proportional to the experimental binding, but out of the things that you select we tend to have some molecules that are pretty good. And we can use the example here, these two targets I showed before. For the β-lactamase, out of three hundred thousand compounds tested experimentally they found zero. Then they did a large docking run, and I think they did this against ZINC, which is a database. It's a recursive abbreviation: ZINC stands for "ZINC is not commercial". It's a large research database with roughly a hundred million molecules, and these are molecules that you can actually order fairly cheaply and test. And when they docked against ZINC they found two hits. It's pretty good, right? By docking first, they managed to find two hits that actually worked experimentally, while when they just tested three hundred thousand things in the experiments they found zero. Suddenly that docking solution doesn't seem so bad anymore. Or cruzain, this other target. I don't remember how many docking hits they tested experimentally, but again, when they tested two hundred thousand compounds experimentally they got a hundred forty-six hits, which is kind of nice. But if you go and do a thesis project, what is it, next spring, at SciLifeLab, and you tell your advisor, okay, I would like to do a screen here now, and that's going to cost two hundred thousand dollars, they would just look at you. Or you can do it with docking and get one thousand or five hundred compounds. Typically in the lab we might prefer to only test a hundred or so, and out of those hundred maybe you find five. The cost to the lab is now one hundred dollars.
I bet your PI is gonna say go right ahead and order. So the point is not to find more hits, but that this is orders of magnitude cheaper. And at this point, even the hundred forty-six: I don't need a hundred forty-six drugs. I only need one drug, and at this point I just need something to start optimizing. Five? It's awesome. So docking doesn't find all hits; you even throw away a lot of good hits. But that's sad, right? What if the best one is something we threw away? It might be that besides those five there was something even better, a sixth molecule that was by far the best binder, and we missed it. Well, such is life. If you look at Nexium, that's again the molecule that AstraZeneca probably sold a hundred billion dollars' worth of over ten years. What says that Nexium is the best proton pump inhibitor theoretically possible? Nothing. And it's even true that it wasn't: for the very first generation of omeprazole, they improved the drug after five, six years, tweaked it a bit and actually made it more efficient. And then they even got a new patent, so they were able to extend it a while. So you don't need the best possible drug; you just need a good drug. So good is good enough.
You don't need to be best. You can make this as complicated as you want, because computers are getting faster and faster. Today we might try to allow for the receptor, say both the backbone and the side chains in your receptor, moving a bit as you're trying to bind the drug. Again, this might not sound so extreme, but in addition to the five degrees of freedom in your ligand we now might have another 25 in the protein, and the complexity here will grow exponentially. So this is a very common use of supercomputers all over Europe, where you would like to run tens of thousands of tests day and night. And as expensive as those computers are, they are super cheap compared to doing this experimentally. And at this point we're happy: we do have a drug, and you have a drug that works, with some minor limitations: lots of side effects, and eating five kilos of medicine per day. And this has to do with the efficiency of binding. Again, you have a lot of protein in your body, and this drug might not bind super hard. To get this drug to bind to all of my receptors in my nervous system I would need a lot of the drug, because this just has to do with the equilibrium constant between the unbound and the bound state. If I want to force this so much that every receptor has bound a copy of my drug,
I will need a very high concentration of my drug. In general, unless my drug is an insanely good binder, and on average molecules are not, this is not going to work, of course. So to improve this we now need to optimize the drug, and optimization is really about improving the binding constants. You should probably have heard about this at least: when we measure binding, you typically express it as a concentration, and that's roughly going to be the concentration where we have an equilibrium, so that a millimolar binder is a very lousy binder. You're going to need a millimolar concentration of this in the blood for your receptors to be 50-50 bound. There is no way you would take that as a drug. A micromolar binder: you're not going to make millions off that. And then you go down to, say, a nanomolar binder or even a picomolar binder. Here you start to be happy. So we're going to need to improve the binding affinity between the drug and receptor by something like three to four orders of magnitude here. Oh, sorry, there's a factor of a thousand between each of these, so from millimolar down to nanomolar or picomolar it's six to nine orders of magnitude. So you're going to need something that's astronomically good at binding. This phase is called lead optimization, and there's a lot of computers here. You can keep doing more docking rounds and everything at this point. Say you started out from a homology model, you start having a hit, and this looks promising and everything. In virtually all cases,
This is when you're going to go down to the biophysics department at the pharmaceutical company and say: look, we need an X-ray structure of the Lindahl receptor here. It doesn't matter if it's going to cost 10 million, because it's worth it. There is no way we're going to get a picomolar binder unless we understand that. At this point you need to really understand the binding in detail. You need to understand all the biology of the binding. You start doing animal tests, not animal tests of the drug, right, but animal tests to understand the biology, maybe in primates or so. And then you will start doing more and more advanced calculations and design molecules. You literally have people sitting down and tweaking this, at first manually; today we have more and more computers helping us. Yes, I think I have an example here: HIV-1 protease. This was an inhibitor, so a molecule that turns off the effect of its target. This is a drug where you started out from this hit. It's a diol, it has two alcohols, and it's symmetric, and it had some activity that was likely lousy. So the first thing they did was that they went from this molecule and created a pharmacophore, and then you have three pairwise distances here: you had two phenyl groups and then a hydrogen bond donor or acceptor in the middle. Then you use this to search a database, and one of the hits they found in the database was this one, which roughly fulfilled this pharmacophore. Then you took that molecule and tried to make it even simpler, so then you have two phenyl groups here, a hydroxyl group there and an oxygen there. So this was the first design that they synthesized and then tested. I don't know what affinity it had actually, but hopefully better than the initial one. Then we wanted to get these two alcohols back, so we extended it and made it a diol again.
We also added urea. And at some point they realized that this binding was not that great in the receptor: you wanted to optimize the stereochemistry a bit to get it to bind with very high affinity. So then you keep adding more groups here to get the stereochemistry to be a bit better, and then eventually, probably after up to another 50 rounds, you ended up with this drug. This was the drug that was selected for phase one studies. But the point is that all the important steps here were done in computers. And then of course you need to test that the predictions we made in computers make sense. So periodically, every four to six weeks, you synthesize the three or four best predictions you have had this far, and of course occasionally there were mistakes and setbacks. Obviously, they didn't show that in the presentations. But in general you made progress over a few years: you gradually become better and better, and you bind better and better. So a few years later this became the first ever drug that you could use to treat HIV. It was a computationally designed drug, the first drug on the market where the entire, I wouldn't say the entire, but the preclinical phase was done this way. This was not something you found in a plant or something in nature. It was a completely artificial design made in computers. This is how all modern drug design happens. I think we might even have a picture of it. Yes, this is what it looks like. It's still being used. Today we probably have 10 more HIV-fighting drugs, and I would guess the vast majority of them are computationally designed.
No, sorry, my bad. This is not that drug. Occasionally it helps to read my notes. This is another blockbuster drug. You can even see that it's not identical to the first one. What is this?
Obviously, you know, this is Lipitor, another computationally designed drug. You know what Lipitor is? It's a dream drug if you're a pharmaceutical company. Atorvastatin; that doesn't tell you or me any more. This is the drug that helps against high levels of cholesterol in your blood. And again, middle-aged to elderly rich Westerners who weigh a few kilos too much, they buy this drug. The other thing, and this might sound horrible, is that with this drug you don't really cure the disease, right, because people keep eating too much fat. So they're going to take it for the rest of their lives, and this is how you make an insane amount of money. I think when it was at its peak, 10 years or so ago, it sold for $15 billion a year, $150 billion total. And then the patent expired, so nowadays there are generics that are just copies of this drug. So why does it take a while for the sales to go down? Is it that people are stupid and don't realize that the patent has expired? First, it takes time to create new drugs, but you can imagine all these other companies: they're sitting there, prepared for the day it expires. But on any modern drug you don't just have one patent. You probably have a hundred or 500 patents. You have a patent on the molecule, but I also make sure that I have a patent on the way of making my molecule, and if halfway through this process there's a smarter way to make my molecule, I'll patent that too. I make sure that I patent the way we prepare the pills or whatever. I patent everything. So I literally create a bomb carpet of patents, and if you as much as tread on one of my patents here, I'm going to sue you to death. Because this might not sound like a big deal, right, but three, four, five years at ten billion per year, that's 50 billion dollars.
It pays a few lawyer salaries. So you do everything you can to try to extend the patent protection.
The other question: in theory, with all these fancy things we have been through in this course, can we use things like molecular simulations for drug design? Historically the answer to this has been no, because it's been too expensive. But this is changing. I'm going to show you. Yes, let's see if it starts. This is an example of a very long simulation using these machines that I told you about before. You see that this drug here is binding, and I think this is something that covers like hundreds of microseconds. And now this drug found its binding site, deeply buried there, and this corresponds to a binding site that has later been confirmed experimentally. So while it's not their primary goal, and you know David Shaw, this guy, the investment banker from New York who started this, it is of course partly their goal to be able to sell this type of machine to the pharmaceutical industry. Because instead of having all these scientists employed, if you could basically let the computer figure it out and press a button, there are companies that would be willing to pay a lot, compared to determining all the structures. I would still say this is on the level where there are a handful of cases in the literature where this has worked. On the other hand, when I was your age this was science fiction. We couldn't even dream of it.
So give this another ten years and this will actually be the norm in many cases. We also know that, despite all these shortcomings in simulations and everything... this is not it. This is the entire rest of the structure, and the reason why this is fluctuating a bit is that the simulation and the x-ray are not completely identical. The gray or black part here is the structure, both of the protein and the drug, in the x-ray structure, while the orange one is the simulation. And the simulation did not use any knowledge of the drug from the x-ray. So it's not that we placed it there: we placed the drug in the solvent and then we let it try to find the binding spot. So the simulation here was pretty darn good at predicting exactly where the molecule should bind. The same thing here: the orange one is the simulation. So we're talking about differences of maybe 0.1 angstrom in the placement of one atom there, and I would say this deviation is likely smaller than the resolution of the x-ray structure. So today we're at the point where this is starting to be useful. Give it another 10 years and it's going to be more efficient to do this in simulations than with x-ray crystallography.
But in many cases we might already know that this is the binding pocket. And while you can certainly argue that it's useful to search for it, if we know where the binding pocket is, it's a waste of time trying to find it, right? What you would rather like to do is maybe test 1000 different molecules. So what we then need to do is: for all these 1000 different molecules, I would like to know how good it is for each of them to bind here, and rank them. What determines that? Binding affinity. What is that? We need to calculate a free energy, right? What is the free energy of the molecule being bound compared to the molecule not being bound?
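To connect this free energy to the millimolar-to-picomolar scale from before, here is a minimal sketch. It assumes the textbook relation ΔG = RT ln(Kd / 1 M) and simple one-to-one binding with the ligand in excess; the function names, the 1 M standard state and the default temperature are illustrative assumptions, not something taken from the slides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy in kcal/mol from a dissociation constant.

    Assumes dG = RT ln(Kd / 1 M), i.e. a 1 M standard state (an assumption).
    """
    return R_KCAL * temperature * math.log(kd_molar)


def fraction_bound(ligand_molar, kd_molar):
    """Fraction of receptors occupied, assuming 1:1 binding and excess ligand."""
    return ligand_molar / (ligand_molar + kd_molar)


if __name__ == "__main__":
    binders = [("millimolar", 1e-3), ("micromolar", 1e-6),
               ("nanomolar", 1e-9), ("picomolar", 1e-12)]
    for label, kd in binders:
        print(f"{label:>10}: dG = {binding_free_energy(kd):7.2f} kcal/mol")
    # A ligand concentration equal to Kd gives half the receptors occupied.
    print(fraction_bound(1e-9, 1e-9))
```

Under these assumptions each factor of a thousand in binding strength is worth roughly 4 kcal/mol at room temperature, so going from a millimolar to a picomolar binder means gaining roughly 12 kcal/mol of binding free energy.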
There's a difference in free energy, Delta G. That is all binding affinities are. But this was not entirely easy to do, because I can certainly calculate all the interactions, but what the interactions do not measure is the entropy. To get that entropy right we're going to need to do a proper molecular simulation, so that you actually sample the different states. This too used to be science fiction, but it works quite well. So this is a small toy protein called FK506 binding protein, and they've calculated this for a bunch of different ligands, and they had also taken these ligands and determined it experimentally. You see that the relative error here might be something like half a kilocalorie per mole in binding, over a fairly large range of molecules. So while it's not super cheap yet, with molecular simulations you can calculate the free energy in a computer without ever taking this into the lab. Why would you do that if you could just order 1,000 compounds? This is stupid; just because you can do it in a computer doesn't mean we should. So here's an example: by spending a week in a computer, you can do something that completely replaces spending 30 minutes in the lab. No? Well, the running cost of the computer is probably more expensive than the cost of doing that experiment in the lab. Well, but it took 30 minutes in the lab and it took me one week in the computer. Actually, you're on the right track here. The point is not what it costs to have a person standing 30 minutes in the lab. That's not the costly part.
These were toy molecules. But imagine, again, these molecules: I told you that there are these 100 million molecules or so in this ZINC database, which is not a commercial database. But think about the likelihood of your best molecule being one of those hundred million. Again, hundred million sounds like a lot, but hundred million is just 10 to the power of 8, and ZINC is probably smaller, probably 10 to the power of 6. The actual space of all potential molecules that we would like to test is 10 to the power of 60. There is no database with all these molecules in it. So in general what's going to happen is that you sit at your computer, and you now have a super advanced program, and you come up with the idea that you're going to need this molecule. I'm not going to draw everything here, and then something there, something there, a methyl group there, let's add, whatever, some five-membered ring here; your guess is as good as mine. I think that's going to be a great molecule, and then I have a hundred of those. And then I call up my supplier: look, I need you to synthesize these molecules. They say, okay, we'll do that, we'll outsource that to our factory in Asia; it's going to be five thousand dollars per molecule. And that's cheap, because you're a common, returning customer. So for this round that you need to do in the next three weeks, just do the math: five hundred thousand dollars, just to produce those molecules, because they don't exist yet. Now once you've spent those five hundred thousand dollars, the cost of actually doing the experiment might be one thousand. But producing new molecules is exceptionally difficult, because you basically need to design a new reaction to produce a molecule, and that can take four weeks if you're lucky. Doing the experiments is quick.
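The arithmetic above is easy to check with a small sketch. The numbers are the ones quoted in the lecture (100 molecules at 5,000 dollars each, a database of 10 to the power of 8 against a chemical space of 10 to the power of 60); the function names are made up for illustration.

```python
def synthesis_cost(n_molecules, cost_per_molecule_usd):
    """Total cost in dollars of having a batch of new molecules synthesized."""
    return n_molecules * cost_per_molecule_usd


def database_coverage(database_size, chemical_space_size):
    """Fraction of the possible molecules that a database actually contains."""
    return database_size / chemical_space_size


if __name__ == "__main__":
    # One design round: 100 new molecules at 5,000 dollars each.
    print(synthesis_cost(100, 5_000))
    # Even a 10^8 database covers a vanishing fraction of 10^60 molecules.
    print(database_coverage(10**8, 10**60))
```

So the synthesis bill for one round is 500,000 dollars against roughly a thousand for the experiment itself, and even the largest database contains essentially none of the molecules you could in principle design.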
So doing this in the computer allows us to do things cheaper, in particular if you want to do this with brand new molecules that nobody has yet designed. The other thing that we're increasingly learning to do is testing these small things I mentioned: what if I take this molecule and test it with or without a methyl group? Would it help to have a methyl group there? That is something where, in the computer, we're now down to doing this in hours. And you don't even need to own the computer: what companies do is that they buy computer time on Amazon, and then you say, run this overnight, I would like the answers by tomorrow, 8 a.m. So this is becoming way more common, and it allows us to go after molecules that nobody would dare to go after a decade ago because they were too expensive to produce. It's also helping us understand that, just as we talked about for protein folding, it's not just a matter of thermodynamics, right? It's also the kinetics. In quite a few cases what determines whether the drugs are going to be efficient or not has to do with kinetic barriers. So this orange molecule is moving in here, and you should have the native pose there in gray. But the problem is that there are a bunch of waters that are bound there in the crystal structure, so for my molecule to get in there I need to push those waters out. What's then going to determine how efficient this is, of course, is how quickly I will be able to get over that energy barrier. And can I then design a molecule that helps me, just like the catalysts in protein folding that we spoke about? Can you design a molecule where this energy barrier is lower, so that it will bind faster?
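To get a feeling for how much a lower barrier buys you, here is a rough sketch. It assumes an Arrhenius-type rate, proportional to exp(-barrier / RT), with everything else about the binding process unchanged; that is a simplification chosen for illustration, and the function name and default temperature are assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def rate_speedup(barrier_reduction_kcal, temperature=298.15):
    """Factor by which the rate increases when the barrier drops.

    Assumes rate ~ exp(-barrier / RT) with an unchanged prefactor (an assumption).
    """
    return math.exp(barrier_reduction_kcal / (R_KCAL * temperature))


if __name__ == "__main__":
    for reduction in (0.5, 1.0, 1.4, 2.8):
        print(f"barrier lowered by {reduction:.1f} kcal/mol: "
              f"{rate_speedup(reduction):.1f}x faster")
```

Under this assumption, lowering the barrier by about 1.4 kcal/mol at room temperature makes binding roughly ten times faster, which is why displacing a few bound waters more easily can matter so much.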
And this is not quite science fiction, but this is pretty much where the research front is when it comes to computational drug design: improving the kinetics of binding.
So with that I'm going to spend the last few minutes talking about the curious case of GPCRs, where we're going to use this. The reason for doing GPCRs is, first, that you should have heard a bit about them because they're important, and I can come back and show you some examples where we have used this successfully for GPCRs. GPCR stands for G protein-coupled receptor. So you have this receptor, which is a transmembrane domain here with seven helices, and it's coupled to this G protein on the inside of the cell. What you have is a neurotransmitter binding on the outside, in the loops here, and then magic happens: when this binds, the protein undergoes a conformational change that signals to this G protein, which then propagates the signal on the inside of the cell. There are something like 900 genes in humans for this, so they do a vast variety of different functions, not just one. And remember the thing I said about membrane proteins being the doors and windows of our cells? That's not just a figure of speech. This is why they are important: if you want to start regulating cells, you need to target them. There are probably more than 25 structures; it's probably 50 today or so. This might not sound like a lot, but when I was your age the prevailing idea was that we would likely never ever find a structure of a G protein-coupled receptor, because people had spent decades trying to get it.
Nobody had managed to do it, so people were generally accepting that we might be able to do some homology models or something, but that they were going to be too difficult to crystallize. So the point is that in those days it was not just that membrane proteins were difficult to crystallize; there were entire classes where nobody had ever been able to determine a single receptor. But even before we had a structure, we knew quite well that there are one, two, three, four, five, six, seven transmembrane helices. So it's the poster child of simple membrane proteins, and this is the case where all this transmembrane helix prediction worked really well. We could also predict the locations of the loops based on the charges. So we knew quite a lot about these proteins from bioinformatics. It's even more extreme than that. Bacteriorhodopsin, which I showed you earlier, is also a seven-transmembrane-helix protein, the first membrane protein structure we really got. It's homologous; well, it's rather the same overall fold at least.
It's not a GPCR, but they are evolutionarily related. So we kind of knew everything: we knew that there are seven transmembrane helices, we knew the overall fold, and we knew roughly what it looks like on the other side. And still we couldn't determine the structure. And just knowing the rough structure doesn't help us, because the devil is in the details. We need to know exactly what the binding sites are, and bacteriorhodopsin doesn't have these binding sites; it is not coupled to the G protein. So again, superficially they look the same, but unless you know exactly what the sites are, it's useless.
But the reason why I can show you this is that some 10 years ago there were two structures published within one week of each other, the first structures of G protein-coupled receptors. There was a bit of a row about this: Brian Kobilka and Ray Stevens were competing here too. They were first collaborating, and then somehow this collaboration broke down, and they published things independently. I'm partial, because I worked in the same department where Brian was, so I should not say who's right or wrong here. But you all know what happened a few years later, right? Brian won a Nobel Prize together with his old mentor, and I think that is how the Nobel Foundation got around it. Brian has been a powerhouse, and in particular Brian frequently says that he doesn't work with GPCRs, he works with the beta-2 adrenergic receptor, which is one specific GPCR. So Brian and his advisor won the Nobel Prize for their studies of G protein-coupled receptors, and Ray Stevens was actually left out. Ray Stevens runs a very large crystallography group, and since then they have been able to determine entire trees, tons of different structures. And you can imagine how important this is for the pharmaceutical companies, right?
This is one quarter of all drug targets. So in this case there have been several companies that have paid these groups to determine structures. The agreement then is that the structure will eventually become public, but because these companies pay so much money they get one or two years of head start. So you have two years when your company will have the structure and I promise not to show it to anybody else. In return for us doing this we will take 10 million dollars from you, and after up to two years we will be able to release this to the public. You can certainly argue about that, but the alternative would likely be that they would do it themselves, and then the structure would never be available. So I think that's a reasonable way to fund the research. But the point is that the second we have structures, we can also start to determine exactly where things bind. We have the loops, we have the geometry of the binding site, we know what happens up here. There are even co-crystals, that is, x-ray structures with drugs bound. Carazolol was one of them, and I think there are two or three others now. So now we have exactly the right protein, we have exactly the right binding pocket, and we even know exactly how some of these drugs bind in the receptor. And this is like night and day: suddenly there's been an explosion in what we can do and how we can start to design drugs. So you go from the first projection maps to the first high-resolution structures, and, sorry, that's not the term, the first complex structures, that is, receptors with ligands bound. It confused me a bit. The first active-state structures were in 2008, and nowadays we even have NMR structures. And at some point here we no longer talk about individual structures, right?
You see we have classes of structures: human, mouse, rat, active structures, intermediate active structures. Because just as with the ion channels, these structures will go through different states, and to really be able to stabilize the active state or the inactive state, you're going to need to understand many of these. And a few years ago, instead of just having x-ray structures, Ron Dror et al., who was part of this David Shaw group, were even able to do the first simulations. Here's the ligand; you're going to see that it binds, and then I think you will also see the molecule going through a state change. So there it binds, and when this binds, and I know we speed this up a bit, I eventually see these helices moving out and everything. Just looking at this as a movie, it's not so fancy. But the point comes when comparing this to x-ray structures: you see this actually caused the molecule to move from one conformation that we've seen in x-ray to another conformation that we've also seen in x-ray. So suddenly you can see how the ligand binding induced the conformational shift. Here we're just showing the transmembrane part, but the point is that this then leads to this earthquake that will cause the interactions with the G protein to change, and then cause the signal on the inside of the cell. Here too we're pretty darn good: pink is the simulation, gray is the crystal structure. When you're sitting with a single simulation and trying to predict, you can easily tear your hair out because it's so slow and everything, but I think it's pretty scary how accurate some of these things are. It's not going to replace all experimental work, but I think chemistry, like so many other research fields, is increasingly becoming a computational field, and people who would never see themselves as computer people are increasingly doing computation. Sorry, this is another way of looking at it.
So there's a time scale here from zero to five microseconds. Red is the part that we explored initially or early on, and then the color goes through purple and eventually blue here; this tells you how the molecule was bound after roughly five microseconds. Five microseconds, in around 2011 or so, was a very long simulation. Today this is something you can do during a thesis project, at least if you have access to a bit of supercomputing time. And you can actually find out through the simulation that all these states we go through correspond to this: you have an unbound state, you go through a first barrier, and then you're bound in that extracellular vestibule. Do you remember? You saw that the molecule first spent a bit of time in the upper part of the protein, right? And then after a while it made the transition to a lower part, so that's going to be a second barrier, and then we end up in the bound state. So everything you learned about protein folding applies here too. And if you then want a drug that's small, efficient and everything, you're going to need to think about this. You need a low free energy here, but you also need barriers to stabilize it, and you need a molecule that's efficient enough that it can get over the barrier and bind. There are a lot of statistics we can do here too, in particular if you start comparing different proteins: of all these different proteins, what are the interactions you have between different helices and everything? I'm not going to go through this in detail, but the point is that out of these 900 genes or so, maybe we have, say, 200 significantly different GPCRs.
I'm just guessing there. And each of these GPCRs is going to have a slightly different binding site with slightly different properties that binds slightly different neurotransmitters. Of course the neurotransmitter that binds to the first one here is not going to bind to the other 899. So again, the devil is in the details. It's not enough to have found something that binds to one GPCR. If you want to develop a drug, you need to make sure that your drug only binds to the one GPCR you want to target, not the other 899. Because if it binds to all of them, you're going to wreak havoc on the entire cellular signaling. So you only want to control one out of 900 different genes. But we're pretty good at that. There are even many other sites here that you will... ah, there's another example of a molecule first binding in the upper part here. And depending on the conformation of the molecule here, you can have slightly different sites occupied. What I haven't shown you, though, is that here we only looked at the G protein-coupled receptor, not the entire signaling in the G protein itself. That's something that Brian Kobilka in particular continued to work on for several years, and in 2011 they determined the structure of the entire complex, including the G protein on the inside. You can imagine, when you have this, the questions we can start to ask. How is the ligand binding changing the structure of the G protein in green? How does the G protein undergo a conformational change upon ligand binding? That somehow changes the interaction with the yellow protein; what will that lead to in the yellow protein? How will that change the interaction with the beta-sheet domain there?
It's pretty complicated. But what we know, based on both their experiments and simulations actually, is that we have identified the helix there, how this helix moves when the ligand binding occurs, and how the entire helix here is undergoing this conformational change. That is one of the reasons why this simulation I showed you before was published in a quite high-impact journal: they were able to show that as you move from the active state and bind the ligand, you will first spend quite a lot of time in intermediate states, just as in protein folding, and then eventually you move over to an inactive state here. I think that black dot corresponds to where the simulation ended up, and formally the x-ray structure is somewhere down here; that is the x-ray structure of the active state. They can trace every single helix here and really show how the activation happened. And if you think that that was the end of it: as recently as last year, yes, I see it was last year, there were new structures of angiotensin II receptors, receptors very much related to heart function and everything, and we were able to explain the selectivity and diversity of angiotensin II receptors. The cool thing with this structure is that it was neither an x-ray structure nor a cryo-EM structure, but one based on free-electron lasers, which is a very fancy new technique to determine structures. They're building one at Stanford and one down in Europe, and I think we're going to see more and more of this. What the free-electron laser does is pretty much evaporate the protein. Not just pretty much: it completely kills the protein, because you're shining an exceptionally strong laser at it. But before the protein has time to break into pieces, you manage to capture a snapshot of it. So then you can take structures that correspond to a millisecond time resolution or something, the type of intermediate states we
spoke about. The other thing that happened last year is that there was a major acquisition: a small drug firm called Ojeda that specialized in GPCRs was bought by one of the major pharmaceutical companies. Do you see what type of sums we're talking about here? Yeah, it's like one. This is probably a 150th of Spotify or something. I think Spotify is quite a lot of hype; I think these companies have more substance in the long term. In the long term I think we're all going to be more willing to pay for blood pressure drugs than for music online. Actually, I think Spotify's valuation is probably reasonable if they have 15 billion customers or so, but there aren't that many humans. My point here is that this is hot. The other part that's super hot is the peptide drugs I spoke about. This is slightly older, but almost 10 years ago AstraZeneca bought a British company called Netedium, and I think the sum there was, I don't remember, probably one billion dollars or something. That is a small company specializing in protein drugs, biologicals as you called it. Sorry, I didn't have a slide about that. That is the other really hot thing. Because the problem with everything I showed you here is that you're somewhat limited to these small hydrophobic compounds, right? And they will have side effects. With biologicals you can suddenly use the entire toolbox, everything you learned about protein structure. You only have 20 amino acids to build things from, but what if you can predict structures, if you can design antibodies or something? In particular, can you take existing antibodies, exactly what you asked about before? Well, antibodies are large and inefficient.
So can we somehow try to extract just the small binding domains, and express just the small binding domain, which is much smaller, more efficient, more stable? Can we then redesign the binding domain, so that if I now have a cancer target, and I know that this particular cancer cell tends to express a lot of protein ABC, can I then redesign my binding domain from the antibody to specifically bind to these ABC proteins? And can I somehow then use this antibody to kick-start my immune defense and kill just the cancer cells? There are already drugs in various phases of clinical trials on this, and there are a few Japanese companies that are doing super cool things; they're basically using computational design for this. Just a few months ago we applied for a research project together with three or four Europe-based companies, so we might start to do this quite a lot too. This is growing exponentially; very hot. We're not going to make a billion dollars, though; we're on the research side. I think that's pretty much it. We're going to be able to finish early today. There's only one question for tomorrow: I'm going to try something different. So far I've taken you through these things and asked you questions. Tomorrow I'm going to be quiet.
I'm going to be quite So I'm going to ask you to lead the discussion and talk about this Partly what well do talk about the things that we've covered today I'm going to be able to answer questions But I want you will have to ask me questions and then I will respond or you can talk between you But I won't explicitly ask you the questions And then on Friday I'll bring some candy to get you to ask questions The point here is that there are certainly problems with drug design in particular That we have it doesn't work to find trace molecules anymore The take-home message for you is that it's the optimistic one There are a ton of new very efficient methods we can use but we need to get better at using them officially The other part that I haven't discussed about here is that you should also get better at introducing transcriptomics and genomics and this But that was more the bioinformatics course So with that it's 10 to noon. Let's finish here and then we'll meet again tomorrow 9 a.m Then we're going to talk about free energy