Good morning. We're gonna talk more about kinetics today, but before we do so, let's jump into the questions from yesterday. What did we talk about yesterday? Folds and fold classifications. So what I spoke about yesterday, and what you hopefully got a chance to study a bit, is the reason why we observe the folds we observe. My main reason for going through that is that it's the preamble to understanding why proteins fold in the first place and what determines their stabilization. That is later what you will use, for instance, if you want to design mutants. If you're working with bioinformatics and you're going to design, say, new biologicals that bind to an arbitrary protein, what this gives you is basically a set of constraints that you have to follow. If you're gonna design proteins, you need to think about these things: you need to think about the stabilization, you need to think about the hydrogen bonding and everything, so that you don't just create random polypeptides, because those are not gonna work.

So given that, let's run through the questions. I think there might be one or two here that I didn't explicitly mention, but that's a great check, first for me in case it is something I should have mentioned, and second, it might be some things that you should have been able to look up on the web otherwise.

What's the main reason proteins are similar? What is your argument why that is true? Because there were two other alternatives. Yeah, but they might all be related. Yeah. Yeah, but mind you, I'm a bit nastier. You're more than welcome to pick any of them, but no matter which one you pick, I want a strong argument why that is true. So yes, you could certainly say that there is only a limited number of folds, but what is your argument why that is true, or why that should be better than the others?
So one possible explanation is of course that there are plenty of examples in the literature and the databases where proteins obviously share the same fold, but there is no functional relation whatsoever and no evolutionary relationship whatsoever. There is also the fact that we can create brand new sequences, completely artificial even, and get them to fit one of the existing folds, and those were obviously not derived by evolution, right? So the concept that the fold itself can stabilize the sequence, and that sequences somewhat target the fold: there are definitely good arguments why that is true.

Does anybody else want to have a go at the other two? So what were the other two alternatives? Let's pick one of them, and you picked divergence first. What type of divergence? You said divergence or convergence, but when we say diverging we mean evolutionary divergence. So why is evolutionary divergence arguably the main reason why proteins are similar? So that's correct in principle, but same thing here: find the evidence. If I disagree with you, what is your evidence that you're right? Not that it might be right, but that you are right. That's an example of what? So we know in general from databases that protein domains that are evolutionarily related always have the same fold. There are artificial proteins, but there are pretty much no examples where two domains that are homologous have different folds.
It does not happen. So in the 1960s my counterargument might have been good, but 40 years later we know it does not happen: if they are related, they do share the same fold, period. Which was not obvious, right? In a way it's actually fun that we started by determining the myoglobin and hemoglobin structures, because when people started working on them it was not obvious that they would have the same fold. It's just a freak of nature.

The third argument was what? Structural convergence. And what does that mean, and why is that true? Can you give an example of this? Yes, Rossmann. No, actually, the Rossmann fold is probably a good one, in particular if you want a beta sheet, right, and you want it to be protected from water, and then have two binding sites or something. Functional convergence is difficult, because by definition, if it was a family of proteins, they would be evolutionarily related. So I think you need to put it on the fold level somehow and argue that it's a very common structure that we see again and again. The Greek key is another example. So it's the same supersecondary structure at least, but there is no evolutionary relationship.

And the reason we're bringing up all three, right, is that the devil is in the details. It's up to you: what aspect do you think is important? What way are you looking at this? This comes back to something that I touched upon very briefly in the beginning of the course. If you compare, in particular, the argument that fold space is limited versus evolutionary divergence, that corresponds to two completely different ways of looking at protein structure. Do you remember which ones? I'm well aware that I only touched upon it briefly.
I'm gonna come back to this in the course. So the fact that only a limited number of folds can be stabilized, that's very much a bottom-up view. It is based on physics, it is based on the stability of proteins, and it's a structural view. And that's perfectly correct. We need that view. The evolutionary divergence, on the other hand, is entirely based on bioinformatics. Or actually, I would say it's not even based on bioinformatics; bioinformatics is the set of tools that we use to study it, but the keyword is evolution. It is not based on structure. It is not based on the interactions of proteins. We could even say that we know that it's too complicated to understand, but nature has had four billion years to figure it out. And of course they are not orthogonal, because the selection, the survival-of-the-fittest part of evolution, of course has to do with stability. But when it comes to our studies of proteins, we can choose to approach this either structure-wise or evolution-wise, and which one you choose is going to be up to you. One method is not always better than the other.

We spoke a little bit about structural evolution yesterday too. Can you give one example of this, or two or three? That's actually a good one. I didn't specifically mention that, but I think it's an awesome example. So then you took up a completely new domain, right, to get the more advanced function, and then you could say you even have a functional evolution. The simplest case of structural evolution would probably be something like llama hemoglobin or fetal hemoglobin: exactly the same protein, but with a small change. But the channels, I didn't even think of the channels. Awesome example.

So what's the relation between structural and sequence evolution? So we didn't necessarily define it; that's also a good question. First,
I would say that we haven't made a strict definition of it. You could of course argue that any time you have an amino acid change or something, it's a slightly different structure, right? That would not qualify as structural evolution in my view. Structural evolution would mean that the structure has changed so that the function changes, because the function is given by the structure, right? So that somehow it has evolved. This was related to stuff I drew on the whiteboard yesterday, right, and I might even have had some slide that almost looks incorrect by now. So you have the sequence that leads to the structure that leads to the function, right, and occasionally we even draw some sort of arrows back here and strike them out, because the information doesn't go that way. But there is of course this path back through selection, not for an individual molecule, but for the species.

There are some examples. I don't think I have a slide about this, so I'll just mention it briefly. Normally, for most changes, if they're good you survive, and if they're bad you die. There are a few examples in the literature where there have been freak changes that have survived anyway, and a great example of this is sickle cell anemia. Have you heard of it? It's a small mutation, a single-site mutation in hemoglobin. If you get it in both alleles you die, but if you only have it in a single allele, what happens is that the hemoglobin starts to aggregate, the entire molecule, which is really bad. It's not at all as good at taking up oxygen, and what happens is that your entire red blood cells will collapse. So the blood cells, mind you, when you look at them in the microscope, take on this almost sickle-like shape. So why on earth would such a mutation survive?
Yes. Right, so it doesn't create a hundred percent resistance to malaria, but by some freak of nature this means that you're slightly more resistant to malaria. Sickle cell anemia is not something that evolved in Sweden or the US, but in certain parts of Africa it has become an evolutionary advantage, because yes, you can't run that much, but you will survive, and in contrast to a lot of your peers, you don't die from malaria.

So why do protein domains have the sizes they do? That's quite right. And I would say it's not just a matter of it taking too long to fold, but also the entropic loss of packing. We will come back to that today, but a chain that is free has very high entropy, right? And if you have a very large chain that's free, that's even better entropy-wise. If you take a very large chain and need to pack this entire chain in one specific way, that's an even larger entropy loss. So they would not even be that stable. Sorry, they wouldn't be stable at all. So protein domains: you might have talked about protein domains from the bioinformatics view, I hope so, but there is also this concept that you need some sort of basic size for the fundamental building blocks of proteins. Even functionally, they're not gonna fold if they're too small or too large.

Sequence-fold fitting. I didn't use that term yesterday, I'm well aware of that, but it's related to something. Can you imagine what it was? I talked about this multitude principle also. Right, so common folds that don't have any defects will be able to accommodate lots of sequences.
They're liberal, while very specific folds that have very awkward features will require very special amino acids in specific positions, and that means that there are gonna be much fewer sequences that fit them. The reason for bringing this up, and I will come back to that towards the end of the lecture today, is that in general the likelihood that a random sequence will fold into a protein is nil. It won't happen. So if you're ever gonna be tasked with designing, say, a new biological, some sort of molecule that should bind to an arbitrary receptor, you can't start from sequence. I'm sorry, but you're gonna fail if you try to take 100 amino acids and create some sort of function from them. Maybe in a hundred years, but today it's not gonna happen. You can't design a function directly from amino acid sequence. But what you can say is: I would like to create a binding site that's gonna bind to some receptor, like the ribosome, in this particular location, and I would like to do this with a small protein. Then you need to start thinking: can you do this with three or four alpha helices, or a beta sheet? Look for a suitable fold in the literature, borrow that fold, and then create a sequence that will (a) adopt that fold and (b) have amino acids in the right places to create a binding site where you want it. But do not try to start directly from 100 amino acids and create function directly from that. We can't do it, and neither can nature.

So that depends. It's a bit related to this thing that I said, right: the accommodating folds usually have an easier time accepting lots of amino acids, because they're not that specific. Assuming that you have an alpha helix, roughly 50% of the residues are happy to be alpha helical. So if you have a simple, say, four-helix bundle, and I read it as: well, I need something that's hydrophobic here, then as long as it's hydrophobic and likes to be in an alpha
helical shape, you will likely be fine. On the other hand, if you have the same type of structure but you need a very tight beta turn there, then you basically need glycine. You can't pick anything that's large and hydrophobic; tryptophan would not fit there. And this is actually what's used in particular in Rosetta, but also in most programs that you use in practice today to design proteins. And do be aware: we are doing protein design nowadays, in particular for biologicals in pharmaceutical companies, but that always happens by picking a target fold first and then trying to design a new function into that structure.

So what's a typical stabilization free energy of a protein? Exactly. And I used that as an example; for a large one it might be ever so slightly larger, but the point is that it's a handful of hydrogen bonds, maybe three or four, not fifty hydrogen bonds. That, too, is something that I mentioned much earlier in the course. Why? This is stupid, because it means that they're going to be fragile, right? Why don't you think nature has created proteins with a stabilization of a hundred hydrogen bonds?
That would make them much less fragile. Right, and every single protein that your body produces, you will have to break down a month later, or a few months later. So that would be inefficient. You want something that's a bit stable, but that you can still break down a month later. In some cases proteins misfold, and if proteins misfold you're gonna need to pay to take care of them. Right, so this too is something that has been balanced over billions of years: proteins should only be stable enough, not too stable. And the other hunch I have about this, as I mentioned yesterday, is that we're more and more looking at proteins moving between different conformations. We're learning much more about the dynamics of proteins, and there was a great paper in Biophysical Journal yesterday, I think, where they are using cryo-EM to try to study how proteins move. But if proteins were too stable they wouldn't move, right? They would just be hard bricks that fly through the cell, and they couldn't really do a whole lot of their function.

However, these stabilizations of proteins, while they looked like a Boltzmann distribution, we arrived at the conclusion that they are not primarily governed by the Boltzmann distribution. Why? Yes, that's the basis. This is not something at equilibrium.
We don't have an equilibrium between different sequences in the protein. You pick your sequence, and then you have to live with it. Now that also means that, in particular, we don't really have the kT term. We can't really change the stability of a protein that much by adjusting kT. Different sequences might have different folding temperatures or something, but it's not something that changes for a certain protein.

We also spoke about the typical sizes of helices and sheets. In this case, do not try to answer this in one sentence. Break it apart, and don't assume that the answer is only what we talked about yesterday. So what determines the size of helices and sheets? So membrane proteins are one example. You know, let's wait a second with the membrane proteins. It's a great point actually, but let's start with the globular ones, where we don't have the membrane protein complication. So what determines the size of, say, an alpha helix? Sorry? And what does that determine? That's it? Well, I'm not quite happy with the answer that it determines the length. Does anyone have another suggestion? So now we're getting there: this provides an upper limit to the length of the helix. So break it apart. But that is only half of the story, right? Because it's not that helices can have any length below that. Yes. The layers, that would rather be how many helices you have.
Yep. So that you need to cross... and that's what I meant: some of these questions are not quite trick questions, but they don't necessarily correspond only to what I mentioned yesterday. If you want to determine the size of the helix, there is both a lower size and an upper size. The lower size is determined exactly by the thing you mentioned here: if the helix is too short, we would only be paying the free energy cost. You need to get across this free energy barrier, and then you need to extend the helix far enough that the total free energy is below zero. Otherwise it would be a net loss to form an alpha helix. So there is some sort of shortest length, which might be a turn or two, a few turns actually. That's the argument we had that you need roughly three or four hydrogen bonds to get over this barrier, right? And then, say you gain one or two kcal per residue you put there: let's say five to ten residues or something. You're hardly ever gonna see an alpha helix that is just one turn. So there is a lower limit to the size, because you're not gonna be stable if you're too short.

The other argument is that there is also an upper limit to the size, and that has nothing to do with the free energy. If you have a hundred residues that prefer to be alpha helical, they will be alpha helical. And there are examples like this; I actually sat all of yesterday working on a protein like that in bacteria. But they're gonna be rare, and the reason why they are rare has to do with this probability. It's simply not that likely to randomly select 100 residues next to each other that are all alpha helical. But if that's not that likely, why do they occur?
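The two limits on helix length described above can be put into a small numerical sketch. This is only an illustration under stated assumptions, not the derivation from the lecture: the linear free energy model, the initiation cost of 4.0 kcal/mol and the gain of 0.5 kcal/mol per residue are hypothetical placeholder values, and the 50% helix propensity is the rough figure quoted earlier for residues that are happy to be alpha helical.

```python
def helix_free_energy(n_residues, init_cost=4.0, gain_per_residue=0.5):
    """Total free energy (kcal/mol) of an n-residue helix in a toy linear model.

    init_cost and gain_per_residue are hypothetical illustration values.
    """
    return init_cost - gain_per_residue * n_residues


def min_stable_length(init_cost=4.0, gain_per_residue=0.5):
    """Smallest number of residues for which the toy free energy is negative."""
    if gain_per_residue <= 0:
        raise ValueError("gain_per_residue must be positive")
    n = 1
    while helix_free_energy(n, init_cost, gain_per_residue) >= 0:
        n += 1
    return n


def prob_all_helical(n_residues, p_helical=0.5):
    """Chance that n consecutive, independently chosen residues all favor the helix."""
    return p_helical ** n_residues


if __name__ == "__main__":
    # Lower limit: the barrier must be paid back before the helix is stable.
    print("shortest stable helix:", min_stable_length(), "residues")
    # Upper limit: long all-helical stretches are improbable in a random sequence.
    for n in (10, 20, 100):
        print(n, "residues in a row:", prob_all_helical(n))
```

With these placeholder numbers the shortest stable helix comes out at nine residues, in the same ballpark as the five to ten residues mentioned above, while a random stretch of 100 helix-favoring residues is vanishingly unlikely. Changing the two energy parameters shifts the lower limit but not the qualitative picture.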
Exactly. Sometimes in evolution, for instance hair, the fibrous proteins, right, you just have a small repeating gene segment. And in other cases there is a very special function where you really need a long alpha helix. And this argument is not really different between helices and sheets. So what is GFP? It's a beta sheet protein, largely, and what do you have inside the beta sheet? So in terms of this question, it's exactly the same, right, because I didn't go into the details. There is a lower limit to the length of the structures, and it has to do with the fact that you need to get across this barrier. The barrier is slightly different: in alpha helices it had to do with getting the hydrogen bonds from at least one turn, maybe a little bit more; in beta sheets it usually has to do with needing at least two strands to form the first hairpin, right? Exactly whether it's one strand or you might need 1.5 strands is going to depend on the specific properties of the amino acids, but you need to have enough residues involved that you have gotten to the point where it's actually better to be in the secondary structure than not.

The other argument, which we made more of yesterday, is that to be a beta sheet you then need to put together a certain number of units. The point of the derivation I made yesterday was not that it should be exact, right? The point was just to make this probability argument, and the probability argument works. You are right within an order of magnitude; we were even better than that. We had to somehow say, well, what is the unit, and then I hand-waved that the unit was roughly three or four residues in a helix and maybe two in a sheet. Full disclosure.
It's hand waving. The point is that it's not 500 units. Based on that probability reasoning, we got these arguments that you might need in the ballpark of 10 to 15 residues in a helix, and in the ballpark of 10 residues per strand for them to be stable as a beta sheet. That was a little bit low, but it corresponds within a factor of two at least to the secondary structure elements we see. But fundamentally there is no difference between helices and sheets. It's just the repeat unit that differs, and you could certainly argue that my argument about three versus two there was not quite correct.

That brings us to GFP. So GFP was this small protein where you had sheets around it and a helix in the middle. But what is it useful for? There are lots of fluorescent markers, right, and fancy chromophores and things. The problem with most of these things is that the second you have something in a test tube, it's easy to do anything. But the hard part if you're doing biotech is that at some point you need to get something into the bio part. You need to get it into a living organism. And this is not unique to fluorescence.
There are tons of places where this happens. I would like to get something into a mouse, or in the long term a human or something. But if I want this to end up in your cancer cells, sorry, I can't inject it into every cancer cell in your body. If I could do that, I would know which cells are the cancer cells and I could just cut them out. So the problem is usually that I would need the body itself to create something, and GFP is a way that we can genetically cause, well, a model organism such as a mouse to express a protein so that we can get things to fluoresce. Or we can tag green fluorescent protein, which is a fluorescent marker, with a custom antibody or something, and then we can get this antibody to bind to a specific target cell.

There are other similar proteins, not related to GFP, but one thing that has become exceptionally hot the last few years, and that I think is going to be a future Nobel Prize, is optogenetics. Do you know what it is? Optogenetics. So the point is that we would like to control... so this actually came from a bunch of research groups that were working with ion channels and wanting to understand ion channels first, and then they came up with a way to add domains to ion channels so that when we shine strong light on them, the channel will open. This has turned into an exceptionally expensive research field. What you can then do is take a rat brain or something, cut it open, and if you shine a laser on it, you can alter the cellular signaling in the brain of the rat while it's alive. So you can basically steer the nervous system with a laser. This is super cool, and it's an amazing tool, just like GFP, and that's why I think it's going to get a Nobel Prize in a few years. Yeah. [inaudible] There's one problem with this, though: you need to cut the rat's brain open, because you need to shine a laser on it. And I'm not sure about you, but if
you ever want to do this for a patient... So the holy grail here would be: could you do this type of study non-invasively, so that you would not have to cut it open? So Anna Moroni and several other groups are working on doing this with magnetogenetics: can you create protein domains that are sensitive to magnetic fields, so that you could steer them with a very strong external magnet or something? And there are similar cases. There are small domains called ferritin domains, proteins that bind iron, and same thing there: if you can express this from the genome, we have a vehicle by which to get it into cells. When this is in the cells, the protein domain will bind iron, and if it's binding iron, it's going to be magnetic, and then we can control it. So it's a bit of a parenthesis, but the reason I mention this is that this is kind of the most normal strategy if you want to get advanced things into cells nowadays.

Allosteric modulation is what? Yes, and you can make this even more generic: it's when there is some sort of secondary mechanism that alters the efficiency of the primary mechanism. It's probably easiest to think about a ligand binding: the allosteric modulator molecule on its own can't cause, for instance, the ion channel to open. So it's like an amplifier. Just having an amplifier will not make music come out of the speakers, right? But the point is, if you just have, say, the record, or whatever, the Spotify app.
I guess nowadays it's not a record anymore. But the record itself, the volume from the record is going to be too low, so you need the amplifier to create a high volume in the speakers. But you're not going to be happy without the source. So the original record or CD or Spotify app, that is the primary process, and the amplifier is the allosteric modulator. Although allosteric modulation in some cases can also dampen the effect, which happens in ion channels. This too is a central concept of not all, but a very large part of, modern drug design, because we usually try to modulate processes. The reason for that is that in many cases, shutting something off is easy. Just plug your pore or kill a protein, whatever; bind something to the protein so that it can't do its function. Biology is fragile; it's fairly easy to destroy biology. But the point is that we usually don't want to destroy a mechanism completely. You might want to dampen it just a bit or enhance it just a bit, and the enhancement part is even more difficult, right? If for whatever reason some process in your cell is not active enough, we would like to stimulate it, and you can't stimulate it by shutting off a binding site. So there are more and more modern pharmaceuticals that are really based on allostery, and I think it should have gotten a Nobel Prize, but it never got one.

Then we spoke a little bit about the folding units of a protein. So what were the folding units? Well, yes. Yes, that's right. It was not the answer I was looking for, but it's quite right, of course. I was thinking more of the concept of a folding unit, and the concept of the folding unit is: how much stuff do you need for the protein to start folding?
And you could imagine... we've actually seen a counterexample of this: the helix-coil equilibrium in a long, long, long coil. If we don't worry about real proteins, only about whether the helices form, in that case you can certainly add one residue at a time to a helix. So then you can continuously start folding, and there is no cooperativity whatsoever. Proteins in general are cooperative, though. You can't just fold one residue at a time. It's a beautiful model: you could think that as a protein comes out of the ribosome, it would fold one residue at a time. It doesn't happen that way, sadly. So for proteins in general you need larger parts of the protein to fold, and we're going to talk about that on Monday, because that's part of Levinthal's paradox. At the opposite end, you could imagine that there are some structures, not necessarily individual proteins, but like these plaques in the amyloid diseases or something, that are not going to form unless you have thousands of proteins, and in that case the folding unit would be much larger than a protein. But what we ended up arguing is that in most cases the folding unit has to do with these domains, which again is not a freak of nature: they correspond exactly to the domains you see in bioinformatics. Why? Why is a more important question than how do we know; we'll get to how do we know in a second. They function independently, and this relates so closely to bioinformatics. And when we think bioinformatics, what should we think of? Evolution, right?
Evolution evolves in domains, not in helices or anything. Then if you start to think about the random mixing that happens in your DNA and everything: there are of course examples where single-site mutations happen, things like if you expose an organism to lots of radioactivity you're gonna have problems with single nucleotides being replaced. But in general that's not how DNA works. The reason why two siblings, for instance, are not identical copies of each other is not that some individual amino acids were swapped. It is that you have enzymes that cut out parts of your genome and randomly mix entire domains.

Yes, I would say yes, because this is also very much part of evolution, right? So this has to do with the physical stability of the proteins and the entropy, but here too, if you had some domains that were too large, evolution would say that those are unfavorable, they're gonna cost too much energy, they're not good, and evolution would try to get rid of them. Same thing here with the folding units: if the proteins had to fold residue by residue, it would not be efficient. Or conversely, if evolution happened by randomly knocking out individual amino acids, evolution would be inefficient. And I'm well aware that I haven't told you yet, but again, trust me, I will tell you at the end of this lecture: the likelihood of a random sequence forming a protein is zero. If evolution happened by randomly changing amino acids in your genome, you would never succeed. Like 99%, well, not 99%, any change you could imagine in the genome would lead to a fetus that couldn't survive. We wouldn't exist. So evolution only works by the fact that most evolutionary changes are okay. They might not lead to any advantage, and then they might not survive over dozens
of generations. But if things were so fragile that an individual amino acid change... and remember the defects I mentioned here, right: if you're unlucky and make one bad amino acid change, you would have destabilized the entire protein. So evolution would not work unless you try to reuse some sort of complete building blocks.

So how do we know that? I went through that a little bit quickly at the end. So that's quite right. The fundamental concept here is that you need to find two different ways to describe something, and if we know that they describe the same thing, we can say they must be equal, and then we can connect these two ways together. One obvious thing is that we know how much energy it takes to melt the protein, because that's the amount of energy I need to add, and I also know how much protein I have in the test tube, right, so I can calculate how much energy I need to add per molecule to melt it. And then, we hand-waved... no, I actually did prove it, although it was fairly quick: from the shape of these curves, what happens when we denature a protein and the temperature interval over which it happens, I could also make some estimates from these curves. The reason why this works is that these curves describe how cooperative the process is, that it happens over a very narrow temperature range. And then it turned out that here too I could get some estimates from the probability of folding, and when I equated these and did a bit of math, I could end up with roughly how much energy I am spending per folding unit.
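The comparison described above, energy per molecule from calorimetry versus energy per cooperative unit from the width of the melting curve, can be sketched as follows. This is a minimal illustration, not necessarily the exact equation derived in the lecture: it assumes a two-state transition, for which the transition width (the inverse of the maximum slope of the fraction-unfolded curve) is 4RT_m²/ΔH, and the melting temperature, width and calorimetric heat are hypothetical numbers chosen only for the example.

```python
R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def vant_hoff_enthalpy(t_melt_k, width_k):
    """Energy per cooperative unit (kcal/mol) from the melting-curve width.

    Assumes a two-state transition, where width = 4 R Tm^2 / dH.
    """
    return 4.0 * R_KCAL * t_melt_k ** 2 / width_k


def cooperative_units_per_molecule(calorimetric_enthalpy, vant_hoff):
    """Ratio of heat absorbed per molecule to heat per cooperative unit."""
    return calorimetric_enthalpy / vant_hoff


if __name__ == "__main__":
    # Hypothetical measurements for a small single-domain protein.
    t_melt = 333.0          # melting temperature in K
    width = 10.0            # width of the transition in K
    heat_per_mole = 90.0    # calorimetric heat of melting in kcal/mol

    per_unit = vant_hoff_enthalpy(t_melt, width)
    ratio = cooperative_units_per_molecule(heat_per_mole, per_unit)
    print(f"energy per folding unit: {per_unit:.1f} kcal/mol")
    print(f"folding units per molecule: {ratio:.2f}")
```

A ratio close to one means the whole molecule melts as a single cooperative unit; a ratio close to two would suggest two units that melt independently, for instance two domains.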
But since I also know how much energy I'm spending per protein, you can say that the folding unit corresponds roughly to the domains. So the point here is not so much deriving that particular equation, but to find two things that describe the same thing, one from a fundamental physical point of view and the other from the lab point of view. Then you can connect the two together. How do enthalpy and entropy change upon folding? I didn't really go through that in detail. I'm gonna spend a little bit of time talking about it, and you know what, let's wait with that until today. What is cold denaturation? So here too I would approach this in a couple of steps. If you look at the shape of these free energy curves, and you balance the energy with the entropy, you end up with something harmonic, right? Sorry, they're not going to be straight lines; it's basically like a second-order curve, so it has a slope that's changing. And while there is a higher temperature above which we are no longer stable, that's the normal denaturation when you boil things, in general there's also going to be a lower temperature (sorry, I'm drawing this at the wrong point) below which we are no longer stable either. In general we won't see that, because the water freezes and things stop moving before we hit that temperature. But the concept, again, is that all processes have some sort of relatively narrow, very narrow, temperature range where they are favorable or stable. Then it just so happens that cold denaturation, for normal proteins in your body, doesn't happen that much. But in general this is important, because anytime you're going to try to design things, you need to think of a target temperature. And if you're going to design things for the body, your target temperature should likely be around 37 degrees centigrade, right?
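As a rough numerical sketch of the cold-denaturation argument above (this is not something shown in the lecture): if one assumes a constant heat-capacity change on unfolding, the stability curve is the standard Gibbs-Helmholtz form, which is curved rather than a straight line and crosses zero twice. The parameter values below (`T_M`, `DH_M`, `DCP`) are hypothetical, picked only to give a plausible-looking curve.

```python
import math

# Hypothetical parameters, chosen only for illustration (not from the lecture).
T_M = 333.15   # heat-denaturation midpoint in K (60 degrees centigrade)
DH_M = 400.0   # unfolding enthalpy at T_M in kJ/mol
DCP = 8.0      # heat-capacity change on unfolding in kJ/(mol K), assumed constant


def delta_g_unfold(t):
    """Free energy of unfolding in kJ/mol at temperature t (K).

    Positive values mean the folded state is the stable one.
    """
    return DH_M * (1.0 - t / T_M) + DCP * ((t - T_M) - t * math.log(t / T_M))


def bisect(f, lo, hi, tol=1e-9):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Lower temperature at which stability is lost (cold denaturation).
T_COLD = bisect(delta_g_unfold, 200.0, 280.0)

if __name__ == "__main__":
    print(f"heat denaturation at {T_M:.1f} K, cold denaturation near {T_COLD:.1f} K")
    print(f"stability at 37 C: {delta_g_unfold(310.15):.1f} kJ/mol")
```

With these made-up numbers the lower crossing falls below the freezing point of water, which is the reason given above for why cold denaturation is rarely seen in practice.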
You don't want to design a protein where you have the maximum efficiency at 20 degrees centigrade, and even that 10 percent difference in temperature can matter. For now, this might seem like a freak thing that's only important if you're studying fish in the Antarctic or something, but it's going to be important later today. Was there anything else you wanted to talk about, or should we get started with today's stuff? All right, let's head on. So today I'm gonna speak a little bit about the kinetics: what happens when proteins fold and unfold. We're going to get back to look at this molten globule state, but now we're actually going to define the molten globule in much more detail. Side-chain packing: we're going to talk about what it really is that stabilizes the energy at the end. And then you're going to be looking a bit at energy gaps. So the lab yesterday, that was... sorry? Yeah, the handouts. Oh, I think we have them here. Pass those down. The lab yesterday was on what? Folding, okay. That might actually be related to the energy gap stabilization; otherwise, you're going to do that as the next lab. We are gradually going to speak more about folding kinetics, and that's not just because it's going to be a bit more equations. The point is that this far we talked a lot about the Boltzmann distribution, and the energy is important, but for real proteins the speed with which things happen is frequently a much more important factor. We will start relating this much more to Levinthal's paradox, and argue when proteins are stable and why the kinetics is the answer to whether we can fold proteins in the first place. We're going to speak a little bit about chaperones and transition states; folding rates is something that's going to come up, and I have two slides on chevron plots, but otherwise that's going to be the main topic for Monday. So we talk about a bunch of different states here.
I might even draw them here. There are some common symbols that I should explain, since I likely won't explain them in all cases. There is one state that we typically use N for. What type of state is that? That's the native state of a protein, and I think that's a good answer to the question. The native state is well-defined: the native state is when we have the biological function. And then you have some sort of state that we frequently call D or U, completely unfolded. So D would be denatured and U would be unfolded. You can use any term you want for it, but the point is that it's not really that well-defined. What you will frequently see in books is some sort of chain that looks that way, and as you know now, that's wrong. In general a chain won't be stretched out; it's going to be a sort of random coil, so there is no structure whatsoever in it anymore. And then somewhere we have this state that we don't really have a letter for, but I will use the letter M, for molten globule. That's some sort of in-between state: it's not quite up there, but it's not really native either. And I can only apologize for a generation of researchers here, that we haven't been better at defining these. When I was your age, we thought that the molten globule was almost up there, but not quite. But we're gonna spend a little bit of time looking particularly at that state, because the problem is that if you're now gonna fold a protein, at some point you start out here, right? And there is some sort of transition: do we go there first, and do we then go there?
That's actually gonna turn out to be true, but I haven't proven that this far. So this first step, moving from denatured to molten globule, is the part where we argued it was kind of just a hydrophobic collapse: turn all the hydrophobic stuff to the inside and keep the charges on the outside. Or at least that's what my generation thought. It's gonna turn out that it's not quite true, but the real protein folding is gonna be the second step. To understand this, we're gonna need to look a little bit more at the molten globule and what it is. So if you want to try to reproduce this in the lab: you are at a good low temperature, with no yucky stuff such as guanidinium hydrochloride. Guanidinium ions, that's the group on the side chain of arginine, and you're basically adding two or three molar of a very, very strong denaturing salt here. So that's a classical denaturant if you want to make sure that the protein is completely stretched out, or at least not stretched out, but there is no secondary structure or anything. So at native conditions, if you don't have any denaturant like that and if you have low temperatures, there is some sort of domain or range here where we are native. If you just increase the temperature a bit, we will gradually move over to some sort of molten globule state.
This still looks quite a lot like a protein. And if you really want to destroy things and start adding guanidinium hydrochloride, eventually you end up with something that's coil-like and completely unfolded. But there are these three different regions. The problem with the molten globule is that you can't determine its structure, because it's not periodic and it's not well defined. But we can use things like NMR, and what we know now is that the main chain is quite ordered. Here is where we went wrong in my generation, sorry: it's not just a random hydrophobic collapse. You do have the rough shape of the main chain, and that will form quite quickly. We definitely have the hydrophobic core, though, and it's dense in the sense that there is not a whole lot of water in it. There's a little bit of water in it, but the protein is roughly like a sphere and it has pushed out most of the water. So the volume is also roughly what it should be for the final protein. However, it's not yet in the native state, so there is going to be some sort of transition to the native state, the second part, and we don't know yet what it is. So it's not like it's exactly the native state.
It's clearly not native, and it's clearly not the denatured state. The transition, if you move from this molten globule up to this coil, is much less well defined: if you just start adding things here, you can go almost continuously there, but here there is going to be some sort of energy barrier in between. We can define a bunch of things: it's compact, and it has a well-formed hydrophobic core, just as a native protein. On the other hand, it's not a rigid structure. We know that partly because we can't crystallize it, but you can also see in an NMR experiment that it's very flexible and mobile, and today you can do it with a simulation, too. It is like the native protein in the sense that we usually have secondary structure. The helices have started to form, the beta sheets have started to form. They might not be exactly like they're going to be in the final structure, but it's not just a random mix of hydrophobic things on the inside. On the other hand, compared to the native state, a transition has happened. There isn't really any second sort of phase transition up here or anything, so the magic that happens here is gone. We might start to see that the side chains have become ordered, and they might have started to try packing, and we'll come back to the side chains. However, there is not a unique packing yet. It's not well-defined, and that has to do with the lack of rigid structure: it's still floppy and flexible. The disulfide bridges might at least have started to form, but this is not yet a working protein. You might not be able to guess it here, but what I will argue is that for any sequence you make, say a random sequence of a hundred amino acids, it's gonna form something like a molten globule. So there will be a spherical shape, but there is no magic. There's no protein yet.
We'll come back to the magic in a few slides. There are a bunch of methods to determine this, and NMR spectroscopy is one of them. You can also do this type of laser spectroscopy, where we capture time-dependent structures at very high resolution. That's probably not a good example, but the time scale here goes from zero to roughly three milliseconds, and you gradually see things sharpen up a bit. You can also see that there are lots of alpha helices here even early on, so while the structure is forming, we already have a whole lot of secondary structure. This is another example, a snapshot from something that looks molten-globule-like in a simulation. If you think about this in terms of packing: if this is a native state, where we have beautiful, well-defined, packed things in the interior, the molten globule corresponds to stirring this up a bit, by heating for instance. So you take your beautiful protein and heat it to 50 degrees centigrade. It's not going to be shattered to pieces, but suddenly there is a bit of water, and the side chains have started to move. Again, the magic is gone, but it still looks like a protein, roughly like a protein at least. What I'm gonna argue, and we will likely come back to that after the break, is that this part is really what's gonna correspond to these transitions. When you get the final packing in order, when you push out the water and everything, that's what's gonna be the protein folding, while almost any protein can form something like that. Having said that, what we're gonna start to look at is this transition, if we start in the denatured state and then go down to a molten globule. So what is required to move from an unfolded or denatured, or you can even say non-native, state to a molten globule? There's partly this hydrophobic collapse, right? Can you imagine anything else that's important to get there?
I said that almost all proteins can do this, so rather: what's specific to proteins? Why doesn't any type of small molecule do this, if you have a mix of hydrophobic and hydrophilic stuff? After all, amino acids are not the only molecules that can be either hydrophilic or hydrophobic and have a bit of different properties. So what's special with proteins? It's always hard to do the math first, but I'll do the math first, because otherwise it won't make sense. Proteins are polymers. I said homopolymers on the slide, and I'm sorry, I was wrong: proteins in general are of course not homopolymers, they're heteropolymers. But let's forget about the hetero for a second and look at the simplest possible case. The polymer part here means that they consist of several residues, or monomers, that are connected together. There's a famous term in physics, a Gedankenexperiment. This is an experiment that we can't do, but we can think about it. So imagine the case where proteins were not in a chain. They were just small random molecules, because most molecules in the world are not connected. What I'm gonna argue is that the second you start to pack molecules, unless there are really bad clashes, the more densely you pack things, until you get to a density of roughly one, the better it's gonna be for them, because in a condensed phase atoms like to interact. That doesn't really depend on whether you're connected or not. What is gonna depend on whether you're connected or not is the entropy. So for the energy: we don't know the volume or the number of residues, but we can think of this in terms of density on the x-axis. When the density is low, you're stretched out, and as you're folding up, the density starts to approach something like one, and I would argue that the energy just goes down. Let's have a
look at what happens to the entropy. One way is that we can think of something that's cloud-like, where the amino acids can be anywhere they want in space. Then you have the accessible volume for each monomer, and that's gonna correspond to the volume that has not yet been taken by something else, right? If you are in a cloud, the volume that I can move in as a monomer is gonna be the total volume available, V, minus the number of monomers I have, N, multiplied by the volume per such monomer, omega. And since I wanted the volume per monomer, I should divide by the number of units N. This is a bit of an ugly expression, because I need to know how many monomers I have and I need to know what the total volume is, and I don't really know either of those. So what I then try to do, since I wanted this x-axis where I just spoke about the density from zero to one: I know that the density, rho, is the number of monomers multiplied by the volume per monomer, divided by the total volume, if I want something that goes between zero and one. And if I just use that equation, I can rewrite this a bit. The first term there would be V divided by N, minus the second term, which is N times omega divided by N, so just omega. Then I use that equation down there a second time to get rid of that N too. You can do the math if you want to, but basically you end up with an equation that's a function of the volume per monomer, which is constant, and the density. But it's not just proportional to the density: it's the constant divided by the density, multiplied by one minus the density, so omega times (1 - rho) / rho. The exact shape is not important, but it's a more complex function of the density. While if you're a chain, well, a chain can't be spread out all over space or anything, right?
So if you're looking locally at the chain, there's going to be some sort of constant volume, which is the volume around my segment. And then I guess the neighbor before me is busy and the neighbor after me is busy, so there's going to be some sort of constant shape around me that I can't access. At this point you probably start to wonder why on earth he is going through all this math again. The point is that the dependence of the available volume per monomer on the density is different in a chain from what it would be in a cloud, if the monomers were completely disconnected. But we also know that we can calculate the entropy, right? That's of course proportional to the logarithm of this volume, or the number of states available. And if you just do that, take the logarithm of either one minus rho for the chain, or one over rho multiplied by one minus rho for the cloud, we end up with very different shapes. If you're fully packed, in both cases we have very low entropy; in the limit where we are perfectly packed, the entropy is zero. But as you start to unfold, we will get more and more entropy. The entropy goes up the most when we give a little bit of freedom to the protein, because there are many more states that become available, and as you gradually unfold the protein it goes up more and more. But remember, in the cloud case there was no chain. So what will happen? If things were not connected, what would be the way they had the most entropy and freedom? Spread out all over the universe, right? So as the density here goes to zero, the entropy would rise infinitely. If this was a protein, what this would mean is that up in this state you would go to infinitely high entropy. But then you also know your classical equation, right?
F equals E minus TS. If S starts to approach plus infinity here, F would approach minus infinity. So if things were not in a chain, it would always be best to be unfolded. And the reason for the equation on the last slide is that when you only have this one minus rho, as in the chain, the entropy will still go up, but then we will approach a constant. Because you are in a chain, once you reach the state where the chain is completely stretched out, you can't get higher entropy. That means that the entropy up here would be finite, and because the entropy here is finite, the free energy is also going to be finite. And that means that it might actually be possible to create something down here that's better. So it's not just a coincidence: if proteins were not chains of molecules, there is no way they could fold, because for any other non-connected molecules it would be better to be infinitely far apart. That's actually not limited to proteins. That has to do with plastics or anything, and that's why anything that's plastic, bags or something, is a polymer: you would not get the molecules to be stable in a sort of folded-up state unless they're connected as a chain. So it's important: if proteins were not polymers, they could not exist. They would not even be able to form a molten globule. It would be happier to be spread apart. Let's see. Oh, sorry. You can even do this math: we take the curves on the last slide. That was just the entropy, and then I also said that the energy goes down pretty much linearly, right?
So if we just take the energy minus T times that entropy, you're gonna end up with free energies that look like that in a cloud, and like this in a chain. In a chain you're gonna have some sort of minimum in free energy fairly close to, not necessarily exactly where rho is equal to one, but at a fairly high density. While here you might have a local minimum, but it would be even better to be completely unfolded. Rho equal to one would just mean that the protein is perfectly packed. That would mean that there is no water around it or anything: we've taken all the residues and packed them perfectly in space. No, but it's squished together. We haven't started to talk about what fold means yet. All I have said is that if you take any type of molecule, and again, we're still talking about homopolymers here, and if they are connected in a chain, there will be some sort of states where we have squished them together, so that, depending on the temperature, we will end up with some sort of local minimum in free energy here. You can probably guess what that will correspond to later on, but we're not talking about proteins here, we've just talked about polymers. If things are not polymers, we're gonna end up with different things here. So polymers don't really have, at least not plain, first-order phase transitions. If things were not polymers, you would end up with something much more similar to water, right? With water or anything else, at some point you're gonna boil it, and then it's much better to be out here. The first-order phase transition here would correspond to having a clear barrier, right? When you get across that barrier you move over to some different state. In this particular case, again, it's a simple polymer, and there is no obvious, simple barrier here. We will have barriers later on, trust me.
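The cloud-versus-chain comparison above can be checked numerically. The sketch below is only an illustration under simple assumptions: the energy falls linearly with density, the entropy per monomer is the logarithm of the accessible volume (with all constants dropped and k_B set to 1), and the values of `EPS` and `T` are hypothetical.

```python
import math

EPS = 6.0  # hypothetical attraction strength: E(rho) = -EPS * rho
T = 1.0    # temperature in the same units, with k_B = 1


def entropy_cloud(rho):
    """Disconnected monomers: accessible volume per monomer ~ (1 - rho) / rho."""
    return math.log((1.0 - rho) / rho)


def entropy_chain(rho):
    """Monomers in a chain: accessible volume per monomer ~ (1 - rho)."""
    return math.log(1.0 - rho)


def free_energy(rho, entropy):
    """F = E - T * S per monomer, with energy linear in the density."""
    return -EPS * rho - T * entropy(rho)


RHOS = [i / 1000 for i in range(1, 1000)]  # skip rho = 0 and rho = 1
F_CHAIN = [free_energy(r, entropy_chain) for r in RHOS]
F_CLOUD = [free_energy(r, entropy_cloud) for r in RHOS]

# Density at which the chain has its free energy minimum.
RHO_MIN_CHAIN = RHOS[F_CHAIN.index(min(F_CHAIN))]

if __name__ == "__main__":
    print("chain minimum at rho =", RHO_MIN_CHAIN)
    for rho in (1e-2, 1e-4, 1e-6):
        print(rho, free_energy(rho, entropy_cloud), free_energy(rho, entropy_chain))
```

For the chain the minimum sits at a high density (analytically at 1 - T/EPS with these assumptions), while for the cloud the free energy keeps dropping as the density goes to zero, which is the "spread out all over the universe" result.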
This was just polymers. We're gonna ditch the homo in two slides here. On the other hand, we also need these free energy barriers. And the reason, sorry, let me go back to the last slide again: what would happen here if we didn't have a free energy barrier? Well, this might be the best state, but we could gradually go up here, right? If you add epsilon energy, I'm gonna move epsilon away from that. And this would be bad, because you would not be stable. You would not be well-defined. All your hemoglobins, if you go into the sauna: your proteins would gradually start to unfold within minutes and everything, so you don't have the stability. If I poke you, you would start to die, and you might be unstable. So obviously that doesn't correspond strictly to proteins. We're gonna need some sort of barriers to create the stability of things. The barrier is bad when we want to get there, but once you are there the barrier is good, because it helps us not to destroy the protein. So there will be some sort of barrier for a real protein structure; otherwise, it would not have this stability. So the question is where that comes from, and I think I already drew some of these states, right?
There will be some sort of complicated free energy landscape here, with barriers and valleys. We need to get to that lowest-level state, and the question is how we get across those barriers. Those barriers don't primarily come from the hydrophobicity or anything; they come from the side-chain packing. So what happens in the second transition we spoke about here, when we move from the molten globule? This kind of looked like a protein, but it didn't have the magic yet. The magic is when all the side chains start to find the right position. Think about it like fitting your fingers into something, when everything just fits into place and it's perfectly packed. The positively charged residue finds a negatively charged partner. There are no holes or anything. The hydrophobic residue finds another hydrophobic partner, and it's well defined. It's unique, like all the protein structures we see from the Protein Data Bank and everything. Now, this is not perfect. It's not that there is no empty space at all, but something like 80% of the volume here is going to be filled. It's much more like a solid than a liquid in here. It's not like the amino acids will fold and unfold and move. They will find the right states and then pretty much stay in that state. You might have some methyl groups rotating and everything, but the side-chain packing is unique for a protein. That is the magic. In the molten globule, on the other hand, the side chains will move around a bit, and they might change place. They can do that because the molten globule is also a bit swollen, right?
There is a bit of water in the molten globule. But what's happened here, on the way, is that we've pushed out all that water, almost all the water at least, and the side chains have suddenly found all their partners. That leads to a couple of things. If you just look at these states, there is some sort of native protein state, there is a barrier, and here we call it denatured, which is a bit unfortunate, but think of this as a molten globule state or something. If you start from the native protein and add, let's say, heat: initially, that's always bad for the protein. The reason why it's bad is that you have this beautiful packing, and now you're tearing your side chains apart. Again, the protein is a beautiful stable thing, and you're starting to push this uphill. Water can't enter yet, and that's the reason: when you're initially pulling this apart, you're almost creating a vacuum on the inside, right? You take this beautiful protein and you're just destroying interactions. I'm not gaining anything. But what eventually happens is that the side chains will literally be torn apart, and the side chains will find new hydrogen bonding partners. Water is going to fill the volume inside the protein. What then happens is that you literally cross the barrier, and then you find another state here, which in general is going to be slightly higher in free energy than your native one, but it's better than the barrier. And again, I'm going to argue that this is very much the molten globule. But based on what you know now, we can start to say what the different processes here are responsible for. So what is entropy and what is energy here, of what I just said?
Well, you're not quite wrong, but you're not quite right either. I don't expect you to know this, but the point is that there is something I didn't give you. The reason for thinking about this, right, is this equation. This equation that looked so simple and innocent early on, and that determines everything, is going to determine things here too. The answer is going to depend on the direction we're going. I hinted about this earlier in the course when it comes to two types of processes. If a process is limited by energy, then the speed is regulated just by whether we're adding or removing heat. Whereas for processes where the limiting barrier is entropy, that's much more about searching: you have to wait until you find the right thing. This is hand-waving; I'm going to show you with the equations in a few slides. So what I'm going to argue is this. If this is again the density, the left side of this plot would correspond to something that's largely unspecific, maybe molten globule or something, and in the extreme case you would be a completely stretched-out chain. The right-hand side here would correspond to the entire molecule being very dense. And I would argue that in general the energy starts out high and then goes down, because as molecules get closer to each other they will interact more, and eventually you're going to start to go up. Why will you go up in the end?
Exactly, you will bump into things. So at some point there is going to be a point where the energy is fairly low here, but this does not create a protein. If you look at entropy, on the other hand, entropy starts out high, which is good, and then you're going to drop at some point. I'm going to argue that this happens over a fairly narrow regime, because that's really when you start to pack up. Then entropy will go down, go down, go down, and at some point you're almost perfectly packed. Maybe it will even start to go down quicker at the end, or not. But do you agree with the rough shape of those curves? That's a good question. Let's see, a good analogy here: if you come into a room with lots of people in it. Actually, the classroom is a good example. When you need to find a place to sit here in the morning, you just grab a chair, right? It doesn't matter that much if there are already some students in the room; there are free chairs. Until you get to the point where there are only two or three free chairs. Then you have to start looking, and you might have to squeeze in behind somebody. So normally, while you are here, we're still fairly stretched out, right?
So there is not really a whole lot that's limiting the motion of the chain. Yes, we will go down a little bit, because occasionally I will bump into another part of the chain, but there isn't really anything that's limiting me a whole lot. But at some point here, when we start to get to relatively high densities, suddenly I start to bump into things, and I start to bump into them fairly frequently, and then my entropy will go down fairly quickly. This is the point where we would actually have to start looking for a chair in the classroom. You could argue about exactly how sharp this is, but my only argument here is that there is going to be some point where we lose most of the entropy, when we're getting to the point where the protein is starting to be packed. And here we are so low in entropy that we have very little freedom, and we can't really move a whole lot. What would eventually happen here would probably be that individual atoms stop moving or so, but forget about that last part, it's not important. If you agree with those two curves, then we just apply F equals E minus TS. So we take the E minus that curve, so we need to turn that one around, and if you then take that first curve minus the second one, you're gonna end up with something that looks like that, by definition. Now, exactly how high the first state, the barrier, and the second state are will depend on the details of the curves. But because of these shapes, because the energy goes down and because the entropy drops over a relatively narrow range, you will end up with a barrier, based on the packing. This barrier is going to explain a whole lot of things about protein folding, but we don't really know what type of barrier it is yet. Or actually whether this is true, because you could certainly argue that I was wrong. Maybe you don't buy the argument that the entropy drops over a narrow regime. You would like some experimental proof of that.
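To see that these two curve shapes really do produce a barrier, here is a small numerical sketch. It is only an illustration of the argument, not the lecturer's own model: the energy is taken to fall linearly with density, the entropy is a sigmoid that drops over a narrow density range, and every parameter value (`EPS`, `S0`, `RHO_C`, `WIDTH`, `T`) is hypothetical.

```python
import math

# Hypothetical parameters, with k_B = 1 and arbitrary energy units.
EPS = 10.0    # energy gain on full compaction: E(rho) = -EPS * rho
S0 = 8.0      # entropy of the loose state
RHO_C = 0.8   # density around which most of the entropy is lost
WIDTH = 0.02  # how narrow that entropy drop is
T = 0.3       # temperature


def energy(rho):
    return -EPS * rho


def entropy(rho):
    """Stays near S0 at low density, then drops sharply around RHO_C."""
    return S0 / (1.0 + math.exp((rho - RHO_C) / WIDTH))


def free_energy(rho):
    return energy(rho) - T * entropy(rho)


RHOS = [i / 1000 for i in range(1001)]
F = [free_energy(r) for r in RHOS]

# Interior local maxima (barriers) and minima (loose, molten-like state).
BARRIERS = [RHOS[i] for i in range(1, 1000) if F[i] > F[i - 1] and F[i] > F[i + 1]]
MINIMA = [RHOS[i] for i in range(1, 1000) if F[i] < F[i - 1] and F[i] < F[i + 1]]

if __name__ == "__main__":
    print("loose minimum at rho =", MINIMA, "barrier at rho =", BARRIERS)
    print("F(loose) =", free_energy(MINIMA[0]), "F(packed) =", free_energy(1.0))
```

With these numbers there is one loose minimum below the entropy drop, one barrier just above it, and the fully packed state at rho = 1 ends up lowest. Making `WIDTH` large enough (a gradual entropy loss) removes the barrier, which is why the narrowness of the drop matters for the argument.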
We can give you that proof later. But if this is true, let's say that I start out here and I'm gonna fold my protein. Or actually, no, let's do the other direction first. I start out here in the end state, I am a protein, and then I want to get over this barrier from the right side to the left side. What is it I'm paying in when I want to go from the native state to the denatured state? Look at those two curves. What type of barrier is it? It's energy, right? Because I'm paying in energy. The entropy doesn't really change a whole lot there; eventually I'm gonna gain entropy, but initially I pay in energy. So to unfold the protein, we pay in energy. If you are in the denatured state and try to fold the protein, on the other hand, what type of barrier is that? Again, if these curves are right: entropy. So you need to search to find the packing. You need to test all these different packings. It's unlikely that you would find the right one randomly, and you need to test all of them until you find the one that's really good. That's part of the reason why protein folding is fairly slow: it's a searching process. So the cool thing here is that this equation, which looked so innocent in the beginning, is super complicated. You can have a barrier that has a different character depending on which way we cross it. We'll come back to that. Let's see. Yes, I'll take five or ten more minutes; that's gonna be a good break point. You might not buy that for now, which is perfectly fine. I'm gonna show you some equations after the break where we can actually derive from experiments that this type of barrier is real. But we spent a little time talking about the denatured states and we talked about the molten states, so let's have a look at the final one, the native state. The native state, I would argue, is unique, and it's unique in two ways. We have the biological function there; that's how we can measure and test that it's unique. And structure-wise:
It's a very close-packed state. It also has very low energy. It has both low energy and low free energy. Sorry, the low free energy is mainly caused by the low energy; the entropy here is lousy, lousy in the sense that it is low. If it's unique, we hate that entropy-wise: we lost all that beautiful freedom. So the only reason we want to be there is that you must have some very nicely packed state where you have really good energy. Now of course, to find that state we might need to get over lots of entropy barriers, but we will only be stable in that state at the end if it has lower energy. And this is one of the things I love about simulations: this is right. We know from all the predictions we do in bioinformatics and everything that those native states really correspond to having all the hydrogen bonds in place. Computers have solved that for us. It's no longer a hypothesis; we know that it's true. So although we can certainly talk about free energy and everything, somewhere here in the end game, when we determine why you get here, we should somehow look at energy levels, almost corresponding to those very simple exercises you did early on in the course. But then I also said that not all sequences are going to fold into stable proteins. So maybe we should look at those energy levels. We already had the energy, the entropy and the free energy, and then we can do the thing that we did a couple of times before: you can plot how much entropy, how much freedom, we have as we are changing the energy. Why did we use that type of plot? That type of plot was very, I was about to say uncomfortable. Oh yes, they are uncomfortable too. They're not obvious. I don't have a gut feeling for what entropy relative to energy means. But it's these plots that determine what regions are stable, right?
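One way to see why these plots decide stability, as a sketch that uses only F = E - TS and treats S as a smooth function of E (an idealization, not something taken from the slides):

```latex
\begin{align}
F(E) &= E - T\,S(E) \\
\frac{dF}{dE} &= 1 - T\,\frac{dS}{dE} = 0
  \quad\Longrightarrow\quad \frac{dS}{dE} = \frac{1}{T} \\
\frac{d^{2}F}{dE^{2}} &= -T\,\frac{d^{2}S}{dE^{2}} > 0
  \quad\Longleftrightarrow\quad \frac{d^{2}S}{dE^{2}} < 0
\end{align}
```

So at a given temperature the system sits where the slope of S(E) equals 1/T, and that point is a free-energy minimum only where the curve bends downward. Where the curve bends the other way, the same condition picks out a free-energy maximum instead.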
When do phase transitions happen? The slope here had to do with this one-over-the-temperature part, so that being up somewhere here was good, being down there was... sorry, being up there was bad, being down there was good. But we also know now that what determines whether something is a stable fold is going to be low energy. So somehow we need to get to the left part of these diagrams; otherwise we don't have a low energy. There's going to be some sort of curve here that could vary in any possible way, but if we want to get to the left part of this, that also means that in general the entropy will go down. So going down in energy is good, going down in entropy is bad. And then you can imagine a couple of different things that can happen. One of them is that you have a beautiful, nice curve that gradually goes down here, and eventually you're going to reach some sort of point where the slope corresponds to one over the temperature, and then we're going to be stable. That happens for lots of molecules, but not proteins. What this corresponds to is that you basically freeze in, almost like freezing water. Or no, actually water is not even a good example. Freezing a polymer is a great one. Oh, it even says heteropolymer there. Or a random chain of amino acids: if you just take 100 amino acids and synthesize them in a chain, and you have them in a test tube and you're happy, it's a hundred degrees centigrade, and then you gradually drop the temperature, they might find some sort of random collapsed state, but there is no magic. It's just a molten globule. Yes, all the hydrophobic parts will be turned to the inside and the hydrophilic parts to the outside, but it's not going to do anything.
It's not particularly well defined, which also means that if you're adding epsilon heat, if I'm heating your protein by one degree centigrade, it's going to move up to a slightly higher energy. You will have moved one torsion in the chain or something. So yes, it will move a bit, and you could probably determine some sort of average content of alpha helices and sheets, but there is no magic, no protein. But in a few other cases you will still go down here, but there are some very low energy states where you almost need to take a jump down in energy. And that would correspond to this: what if, for your polypeptide, there is one state where all the hydrogen bonds are paired up into a beautiful beta sheet or alpha helix or something, so you have lots of favorable interactions? Now of course, if you take the two strands of your beta sheet and tear them apart, you just lost four hydrogen bonds. Or more, maybe 14. So you're going to take a fairly large jump in energy, right? For now, I'm not saying which one of these is true, but you could probably imagine that you could have both of those things. What's going to happen in the first case that I showed you, on the right, is a fairly boring continuous thing. But what will happen here is that, as the temperature goes down, under some conditions you can jump to that lowest energy state. And if we jump there, we're going to be stable; then we don't want to move away from that. But it's important that we don't get stuck here. If you first get stuck, like in the one on the right, you're never going to be able to find that state.
So what we need here is the following. The distribution over all these states is going to be determined by the Boltzmann distribution. As you're dropping the temperature here, the temperature becomes lower and lower and lower, and then we favor the lower-lying energy states. But there will also be barriers between these states, right? What if the barrier is so high that, as the temperature goes down, you should favor the lower states, but as the temperature goes down you will not be able to cross barriers either? So your barriers need to be so low that you can cross them, but this state has to be so low that it's significantly more stable than the others. This is going to be a balance between how high the barriers are and how low this energy is. You would like a very well-defined low energy state that doesn't really have a gigantic barrier to it. These energy gaps are pretty much what explains folding. Very few random structures have low energy, or they have low energy but they don't have this super low energy. I like the word magic, because you have this amazing, beautiful, paired-up secondary structure. That's what creates the pairing of all those hydrogen bonds, and that's what in turn gives you the perfect packing, with the Lennard-Jones interactions and everything. You push out the water, and then you make this gigantic drop in energy. Now, the stabilization in free energy will still be fairly low, because while we get the gain in energy, we will pay for that in blood, or entropy. So we need some sort of well-defined lowest energy structure that's clearly separate from the others. That's going to be the reason why this is stable. And if this gap between the lowest and the second lowest one is much larger than kT, well, there should be some sort of barrier here. It's hard to get over that gap, because this is literally a gap.
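To get a feeling for what "much larger than kT" buys you, here is a minimal Boltzmann-population sketch. The spectrum is hypothetical (1000 states spread evenly over 0 to 10 kcal/mol above a single low-lying state), and kT = 0.6 kcal/mol is a round-number assumption for roughly room temperature. It covers only the equilibrium populations, not whether the barriers can actually be crossed.

```python
import math


def ground_state_population(gap, excited_energies, kT):
    """Boltzmann probability of one ground state lying `gap` below energy zero."""
    ground_weight = math.exp(gap / kT)
    excited_weight = sum(math.exp(-energy / kT) for energy in excited_energies)
    return ground_weight / (ground_weight + excited_weight)


# Hypothetical spectrum: 1000 competing states between 0 and 10 kcal/mol.
EXCITED_ENERGIES = [0.01 * i for i in range(1000)]
KT = 0.6  # kcal/mol, assumed value for roughly room temperature

for gap in (0.6, 3.0, 6.0):
    population = ground_state_population(gap, EXCITED_ENERGIES, KT)
    print(f"gap = {gap / KT:4.1f} kT -> ground-state population {population:.3f}")
```

In this toy spectrum a gap of about one kT leaves the lowest state as just one state among many, while a gap of about ten kT makes it dominate, even though a thousand other states compete with it.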
There has to be a gap in the energy. On the other hand, if the gap is much smaller than kT, we will be able to gradually move over it just by kT, and that's bad, because then it's not going to be well defined. Remember what I said: when you want to get there, the gap is bad; once you are there, the gap is what saves you. So you somehow need to be able to get over it. The gap can't be too large, because then you can't get there, but it can't be so small that we would slide out of it when we're there. Yes, that's literally the energy barrier. But of course, in terms of free energy, there is also the part that we have to find it, right? So there's also an entropy part to it. The reason why you get this for proteins is the packing, and it's not just random packing of hydrophobic amino acids. What determines the packing in proteins, and whether the packing is good? The side chains. And not just the side chains, the specific side chains, right? It's going to be super important to have, say, a tryptophan in this specific position, and it's going to be super important to have an arginine in this other position. You can't just randomly replace them. And this is the heteropolymer part. Proteins are not just a general polymer.
They're heteropolymers. And this is the reason why I brought up the strange thing about the density and the polymer part. Proteins need to be polymers, because if they were not polymers, it would never be favorable to have these compact states. On the other hand, the compact state doesn't give you anything more than a random, fairly complex structure. To get from this structure to a transition where you have something that's highly specific, you need heteropolymers with very, very specific, different amino acids, and it's important to have the right one in the right place. That's something that you don't have in plastic, but we do have it in a protein. So the point is that you can't create a protein if you just randomly create a sequence of amino acids. That's effectively the same thing as a bit of plastic. You're not going to get the specificity by randomly assembling amino acids. And this is the problem if you're going to create a protein with 100 residues. Well, if you can do it with divine inspiration, more power to you. But the problem is that 100 amino acids is a tiny protein, right? It's going to be super hard to design a protein and somehow get this magic. And that's why we use computers and bioinformatics to try to identify what the uniqueness is, so that we get the really beautiful packing. One great way of doing this is cheating and seeing what nature has done. You're not allowed to cheat on the test, but in your work you're allowed to cheat by looking at what nature has done. Actually, looking at what nature has done is okay on the test. And that's why random polypeptides will not form proteins. You can guess what the probabilities are going to be: if you talk about 100 residues and 20 amino acids, it's 20 to the power of 100, that ballpark.
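To put a number on that ballpark, the size of the sequence space for a 100-residue chain built from 20 amino acids can be computed directly. This is plain integer arithmetic, and the variable names are just illustrative.

```python
import math

N_AMINO_ACIDS = 20
CHAIN_LENGTH = 100

# Python integers have arbitrary precision, so this is exact.
n_sequences = N_AMINO_ACIDS ** CHAIN_LENGTH

print(f"number of digits in 20^100: {len(str(n_sequences))}")
print(f"order of magnitude: about 10^{math.log10(n_sequences):.0f}")
```

For scale, that is far more sequences than could ever be synthesized and tested one by one, which is why random trial and error in the lab is not a realistic design strategy.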
I will show you one more slide and then it's time for a break. Remember that we spoke about fold stability, and that I introduced something you could think of as... it's not really a temperature, but you could think of that denominator as a sort of characteristic temperature of the transition. Remember that from yesterday. That slope in these diagrams also corresponds to some sort of one-over-temperature thing, and these characteristic temperatures really are going to correspond to the temperature at which we have the protein folding happening. So if you're running this at a temperature that's significantly higher than what that slope would correspond to, you can easily get to the gap here, but you will also be able to get away from it. When the temperature is significantly lower, you're not really going to be able to get to the gap. So these temperatures, which might be in the ballpark of 350 kelvin or something, are very much going to correspond to the protein folding temperatures. And if you know the specific chain, and if you know the free energy of the entire chain, we could in theory calculate this. Theory is the keyword here, because for a real protein you would have at least 10,000 atoms or something. Well, even a thousand atoms is 3,000 degrees of freedom. It's an insanely large landscape, so we can't calculate anything like this for a real protein. If you didn't do it yet, you are going to do that in one of the labs. Frequently, to understand this, it's great to work with a toy model, and that's why we like to work with these simple energy models. You're going to work with beads on a string or something, because these systems are so simple that we can try to enumerate all these energy levels to capture the ideas here. But for any real protein you can't do this. You can't even do it in MD.
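As an illustration of what "enumerate all these energy levels" can look like for beads on a string, here is a small sketch using the two-dimensional HP lattice model, a standard toy model with hydrophobic (H) and polar (P) beads. The model choice, the eight-bead sequence, and the rule of minus one energy unit per non-bonded H-H contact are assumptions made for this example, and need not match what the course labs use.

```python
from collections import Counter

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def enumerate_energy_levels(sequence):
    """Count conformations per energy for an HP chain on a 2D square lattice.

    Energy is -1 for every pair of H beads that are lattice neighbours
    without being adjacent along the chain. The first bond is fixed along
    +x to remove the trivial four-fold rotational copies.
    Requires a sequence of at least two beads.
    """
    n_beads = len(sequence)
    levels = Counter()

    def energy(path):
        index_at = {site: i for i, site in enumerate(path)}
        contacts = 0
        for (x, y), i in index_at.items():
            if sequence[i] != "H":
                continue
            # Looking only right and up counts each neighbouring pair once.
            for dx, dy in ((1, 0), (0, 1)):
                j = index_at.get((x + dx, y + dy))
                if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                    contacts += 1
        return -contacts

    def grow(path, occupied):
        if len(path) == n_beads:
            levels[energy(path)] += 1
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            site = (x + dx, y + dy)
            if site not in occupied:  # self-avoidance: one bead per site
                path.append(site)
                occupied.add(site)
                grow(path, occupied)
                path.pop()
                occupied.remove(site)

    start = [(0, 0), (1, 0)]
    grow(start, set(start))
    return levels


levels = enumerate_energy_levels("HPPHHPPH")  # hypothetical 8-bead sequence
for e in sorted(levels):
    print(f"E = {e:3d}: {levels[e]} conformations")
```

The logarithm of each count plays the role of the entropy at that energy, so the printed table is a tiny version of the entropy versus energy diagrams. A count of one (or a handful of symmetry-related copies) at the lowest energy, well separated from the rest, is the toy analogue of a unique native state with a gap.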
Well, maybe in an MD simulation for a small protein. So we use this to understand the concepts, but this is not the way you could design proteins in general. What we do know is that the folding process in general has to do with the transition between some sort of molten globule and the native state. With vitrification, in general all types of random polymers will move down, but they might get stuck; they will simply stop moving. The ones that form proteins form proteins because, under the right conditions, they can jump to the lowest energy state, and it's well defined, and then they will stay there. Let's skip this part about the vitrification temperatures, because it's not that important. If you want to go into details, the book spends quite a bit of time on it. It's 10:30 now. I would suggest we reconvene here at 11 a.m., and then we're going to speak about fold uniqueness, a little bit more about the kinetics, and I will come back and show you why those entropy versus energy barriers really are real. I think I changed my mind ever so slightly here. I will tell you about these melting versus vitrification temperatures, partly because it couples back to some things we covered earlier on in the course. Remember when I first introduced these strange entropy-as-a-function-of-energy diagrams? We did that in the context of understanding continuous gradual transitions versus phase transitions. I drew curves like this, right? We have energy here and S here, and I said that the slope here corresponds to one over the temperature. At any point along this curve, I had an energy and an entropy. So if I'm above this curve, up in the top left here, I can be at the same energy but with higher entropy; that would be good. Or I could be at the same entropy but with lower energy.
That would also be good. So this is the good part and this is the bad part, say lower F and higher F. Now of course, I can't just move up here, because I need to stay on the curve, but in theory it would be better if I could be up here. The concept of the gradual transition is that any point along this curve would be stable. I was about to say what determines where I am, but it doesn't really determine where I am: at any given point on this curve there is a well-defined slope, and that slope defines the temperature. So the temperature is really a function of entropy and energy rather than what causes them. And it's boring. This corresponds to heating water: at any point here I'm stable and happy, and at a specific temperature there is going to be one unique, defined point. But then I also said that there were these other regions. If things look like that, right, that would mean that if I'm now at this point, it would be like balancing on the edge of a knife. That is the worst possible free energy I could have along that curve. Here it was the opposite: here was the best free energy I could have along that curve, the point that's closest to the good part. Here I am at the point that's closest to the bad part. So this would be like balancing on the edge of a knife, which you typically don't want to do. In general this would mean that if you look over broader ranges of temperature, you would have some sort of region like this, and then maybe a region that looks like that, and then a region like that. What I then argued is that if we draw some sort of limit here, that would mean that here I would be gradually heating the ice, and I would gradually heat the ice, and then I'm going to get to this no man's land here. I so don't want to be here. And what that would mean is that I would effectively
jump here, and then I would gradually be heating water. So these regions of depressions corresponded to phase transitions, right? Let's have a look at what we have here. You could have a transition that is gradual, gradual, gradual; this corresponds just to heating or cooling water. And at some point we're going to get to the point here where we're stuck, at the fairly high derivative, with the lowest temperature. But we might also have a shape of this landscape where, instead of this being sharp... do you see the dashed line here, right? You could imagine that instead of getting stuck here we could make a jump from that point all the way down to that point. And I got a great question at the break: wouldn't that correspond to zero derivative, which would be infinite temperature? Well, yes, in this simple model it probably would, but you could imagine that this somehow continues down in a very weak way here. It's a model. So you might be able to jump from that point to the upper point here and completely skip the strange part, if we can do that at a temperature that's sufficiently high. What determines this slope is going to be the temperature of the move. And we already spoke about when proteins melt and everything, right, that it depends on how much energy we need to add here. So if the melting or folding temperature of a protein is higher than the temperature at which we would get stuck, then we are able to make this jump and have a real phase transition into a well-defined, unique state. But if this is a fairly low temperature, if I would only be able to get there at, say, minus 50 degrees centigrade, what would first happen is that I would get stuck here and things would stop moving. And because things have stopped moving, I no longer have enough energy to actually make the jump to the well-defined state. So yes, there could in theory be something better, but I can't really make the jump there anymore. So it actually does correspond very closely to phase
transitions. Of course, this is a very simple model, and for a real protein it's not that the folded state would be a single-entropy state. You can still have motions of bonds, torsions can move a little bit, so this is highly simplified. But the vitrification temperature would be the temperature at which things stop moving, and it's important that we can fold or melt at a temperature that's above that. So it's going to depend on exactly what these temperatures are. What's going to determine that? What determines those properties? The side chains, the specificity, the packing. So what we said yesterday and repeated this morning is that sequences that fold into stable proteins do so because their native state is unique and well defined. They're good, and they have a unique structure that is able to accommodate lots of these sequences. And what we showed in the previous slides is that this uniqueness corresponds to having a well-defined energy gap. It's exactly the same thing: if there was not a clear energy gap, you wouldn't be that unique, because then you could gradually move away, and if you can do it gradually, there are going to be lots of states just like it, and then you're not really unique anymore. So the uniqueness corresponds to the gap, that you're different, and what determines this is very much natural selection. If you were not unique, you would not have a very stable protein, and you could easily get over the barrier. And as bad as the barrier is when we want to fold, the second you are folded, the barrier is what helps us stay folded. We can actually hand-wave about this now with the Boltzmann distribution, because what I've been telling you a bunch of times is that it's going to be very rare that you have this uniqueness. Already yesterday we spoke about this in terms of this sort of vitrification temperature, and that's the temperature we got from the cost of these
defects, right, a few kilocalories. So the likelihood of having one such energy state that is significantly lower than the other ones is going to correspond to taking that delta E and comparing it to the vitrification temperature. And if we want the energy gap to be something like 20 times the vitrification temperature, so large that we're not going to get over it by mistake, that's going to correspond to a probability of maybe one in a million or a billion or so. The exact number doesn't matter, but it's a fairly small number. So if you randomly create a sequence of amino acids, the likelihood that it is going to have one unique state that's significantly better than the others, and again, you can say 10x there, that's not important, is going to be a very, very low probability. Yes? Yeah, vitrification. Vitrification really just means that you stop moving. So you freeze in, but you're freezing gradually; you can think of slime or something. It's not really a phase transition. I'm going to show you that in cryo-EM later on, but for now think of it as some sort of glass-like state where you just stop moving. It's not like you make a phase transition. Think of water molecules that just stopped moving instead of making a transition to ice. And I would not introduce the concept if it wasn't for the fact that the book loves to use it. And of course this is not exactly ten to the minus eight. It might be ten to the minus six or it might be ten to the minus nine, but it's a very low number. So unless you feel really lucky, and remember it's Friday the 13th today, don't go into the lab and hope to strike that jackpot, because you're going to waste your entire PhD testing things. None of us can do it. Without massive help of
computers you can't form a random protein. So you can't form a random sequence that will fold into a protein without massive amounts of help. And this also starts to hint at the question: why are proteins unique? Do proteins have to be unique? Couldn't you imagine having two such states? So what would the probability be of having two such states? Yeah, lower, but how much lower, roughly? If the probability of one state having that low energy is that low, the probability of having another state with that low energy is roughly the same, and if they are independent, we should just multiply the probabilities. So maybe ten to the minus sixteen. If there are three such states, ten to the minus twenty-four. We start to be fairly low now, right? It's not forbidden, you can have such states, but it's going to be exceptionally rare. It's already rare that you can form a stable protein; the likelihood that you would have a molecule with two states is so astronomically low that you can almost forget about it. Almost. So there are these exceptionally rare examples that we spoke about, proteins where the native state is not necessarily the lowest free energy state. These are so few that we can't even calculate these probabilities correctly; these are just order-of-magnitude examples. But given this, and given the number of sequences we have, and again, I might be off by five orders of magnitude here, there are lots of proteins in the world, and there might be one protein in a hundred thousand or a million or something that can have a second state that's stable. And with rough, hand-waving, order-of-magnitude estimates that seems to hold: most proteins only have one state, and there are a handful of them that might have two, and unfortunately that usually causes problems. So what these correspond to is just that you happen to have two low energy states where the
structure is stable, but only one of them is going to be native. Having said that, I'm going to spend a few slides talking a little bit about how protein folding happens in vivo, because that's something that the book doesn't bring up, and it's something I want you to know about for the rest of your careers. I'm going to come back a little bit to the kinetics, but I figured that you've had enough equations that I should mix in something that's not equations today. So the danger here is that I only talked about the physics, and we've not considered the process at all. Where do proteins come from in your body? Well, I mean, the amino acids come from stuff you eat, but where does the genetic information come from? It's not a trick question. DNA, good. I was a bit worried there for a second. This information is first read by an enzyme called DNA polymerase. Why is that an enzyme? What was the definition of an enzyme? It catalyzes the process. In theory, this is one state, and if you made copies of the DNA, that's also a stable state. Exactly how it happens, again, the free energy can't depend on the path we take, but what the enzyme does is help it happen faster, because otherwise it would take forever. DNA polymerase makes sure that it happens faster than you would think, and it's pretty darn good at not making errors. Well, it happens now and then, like once in a billion or so. So what this one does is help untie the DNA and break the base pairs, so that you can then form two stable molecules of DNA. The replication here, can you use that for something? We need to use that in a couple of cases. There's a very special Nobel Prize related to this. If you want to sequence things, or if you want to copy or express DNA, you need to create more copies of the DNA. The only problem is that your body is not that good at doing it. How frequently do you need to copy DNA in your body? When cells divide, right?
And then we need to have one copy per cell, so these enzymes don't need to be that efficient. But it would be great if you could somehow use these enzymes, have lots of them or something, to speed up the process. If there is a chemical process and you would like it to happen faster, what could you do, typically? Raising the temperature is a great idea. What's going to happen? It's a protein. What's going to happen when you raise the temperature? You denature the protein. Too bad, that's not going to work. You can raise the temperature maybe five or ten degrees, and at some point you're going to start to unfold the entire protein. That's bad, and for a long time that's where we were: we couldn't replicate DNA very efficiently. So what you really would like is this enzyme, and to be able to run the process at a much higher temperature, right? Could you come up with any way to do that? Remember that yesterday I said there were organisms that do completely crazy things with cold shock proteins. There are other organisms that don't live under very cold conditions but under very hot conditions, in particular organisms that live in geysers or so. Their machinery has to work at 70 or 80 degrees centigrade. So let's go to one of those organisms and steal its DNA polymerase. This particular organism is called Thermus aquaticus, which you might have heard about. This is PCR, which Kary Mullis, you've had it in lectures, got the Nobel Prize for. It's a super simple discovery: he stole an enzyme from a heat-proof bacterium, and then you can heat-cycle it and create more copies. This used to be a super advanced technique, and now you do it in every single lab; today you have PCR machines. That's what we use to replicate DNA.
It's as simple as that. But for the fundamental process you needed a protein that was thermostable, and it was actually discovered in Yellowstone. He didn't discover it, he stole the protein. But that just gives you DNA; you're going to need to read this recipe, too. So there's another gigantic molecule, discovered by Roger Kornberg in structural biology at Stanford, where I worked together with Mike Levitt in his lab. Here you have the DNA bound to a gigantic machinery, which is another polymerase, which is also an enzyme, that takes the DNA information and copies it into RNA, messenger RNA even. And Roger spent the good part of a few decades trying to determine the structure of these molecules. The second you have this information in RNA instead, it's fed into the ribosome. The ribosome is the third Nobel Prize of these slides. The ribosome is a gigantic machine. There are roughly 50, I think it's 58, protein chains, and there's a ton of RNA bound in it too. And then you have the messenger RNA come in here and pair up with transfer RNA, which carries the small amino acid building blocks, right?
And then you polymerize the chain, so you just stitch the peptide bonds together and effectively have the protein at the end. You have the nascent chain coming out of this protein exit tunnel, and in practice this takes a few seconds to minutes. You're gradually moving along the chain and creating more protein. Today there are a ton of these structures determined, but it's less than 10... well, no, it's not ten, it's 15 years ago. Today virtually everybody in this field has switched to cryo-electron microscopy, but the first structure is just a bit over 15 years old. Can you imagine the work of determining a structure with 100,000 atoms in it, and understanding where every single chain and everything else is? The reason we know this is that we can actually test the synthesis quite well. There are some proteins, like luciferase for instance, that emit light when the protein is folded. If we then start the synthesis process here, there is no light. There's no light, there's no light, there's no light, and then after a quarter of an hour or something you're going to start to see the light increasing. That way, we know it took roughly 15 minutes for the first protein to be completely synthesized and folded. So in general the slowest process here is usually how quickly we can build the chain and add to it. The protein folding is usually, but not always, faster than the whole translation in the ribosome. And the way we know that is that I let this process continue, and then I use a way to just stop adding more residues, and then boom, it stops instantly. Because when I'm no longer feeding in more residues through the ribosome, I instantly won't get any more protein.
So the folding is much, much faster than creating the chain in the machinery, at least for small proteins. What this means is that folding, at least for small proteins, is what you call co-translational: as you are translating it, the protein is folding right away, because the folding process is faster than the translation process. Which should make us happy, because that's kind of what Anfinsen said, right? If you just create the chain, whether it's in a ribosome or not we don't really care, the chain itself will fold. It's not that we have to produce it in a specific order or something. The bad part of this is that it's not really true. There are lots of enzymes that improve folding rates and everything, and there are even some large proteins that won't fold unless you have very special conditions. When I was about your age, there were some new results: we had started to discover the first molecules that actually help other proteins fold, and at that point we had very little idea about what they actually did. It turns out that one problem in particular is that you have some of these large proteins with lots of hydrophobic parts in them, and they will just stick together like a hydrophobic glue, in ways they should not stick together, because you might have formed one beautiful beta sheet, but we haven't formed the entire protein, right? Remember what I said, that folding is frequently faster than the production of the chain. So what if I now had a beta sheet where the N-terminus should go there, and then there should be some sort of long chain going out here forming another domain, and then I should continue the beta sheet there, right? You could imagine that. The only problem is that once I've produced, translated, the first three strands here, what if there is some hydrophobic part in this long sequence that shouldn't be there?
Yes, it would be better to have the fourth strand there, but the fourth strand has not yet been produced, and unfortunately I might have a strand down here that would now bind there by mistake. It's not better, but I don't have the right strand yet, so while I am folding the protein it would be better to bind that one, and now you've already formed an incorrect beta sheet. It would be a very high free energy barrier to start breaking that up. That's a bummer. So you're now going to get the protein collapsing into something before you've folded it, and you're going to be paying a lot to unfold it. It turns out that there are proteins that can help with this, amazingly beautiful molecules: chaperones, or chaperonins. These are very large molecules that effectively have a hydrophobic interior, and you see that they consist of a ton of different subunits. The most famous one is GroEL and GroES. It's a molecule that consists of two parts, so that you effectively have almost a lid to it, and this molecule can use ATP to go through a cycle where it opens up and binds these large hydrophobic aggregates. Because this protein is hydrophobic on the inside, you have a large hydrophobic cavity here. Here it's hydrophobic all over, so you can allow these proteins to unfold and expose their hydrophobic parts to other hydrophobic parts, and as you now have the entire protein, the protein can slowly, inside here, find the right fourth beta strand and fold correctly again. So this creates an environment. Exactly, it does unfold the protein, and that's why we need the ATP. But it creates an environment, like a kindergarten or whatever you call it, where the protein can gradually find its right shape, and then it will release it again. So why doesn't nature use this for all proteins, then? You would have figured that could be a great idea, right? You could have much larger, much more complicated folds and everything. Seems like a great idea. This will of course vary from protein to protein, right?
What this protein mostly does: remember what I said here, the problem is that you're gonna have proteins where there is a very large barrier, and unfortunately we fall down on the wrong side of that barrier. We end up somewhere we don't want to be, because it happens while we're folding. This creates an environment where I try to take away a bit of that barrier, and I probably take it away by making the folded states slightly less favorable. So what I'm doing is pretty much creating an environment where I make it a bit easier for you to search around again. Of course that will likely mean I also destabilize the final folded state a bit, but if I didn't do that I would forever be stuck in a horrible non-working state. But why don't we use this for all proteins? Yes, it says there: ATP. ATP is, well, it's good in a way, but it's bad: we don't want to spend it, right? If you had to spend ATP to fix up every protein, that would be horrible. We would need even more energy. So why do we have this in the first place? There could of course be cases where you need to create large proteins, right? In particular in higher organisms: the more advanced the organism is, the more advanced, large protein structures you need to create, and there are simply structures that we can't create with tiny building blocks of four to six alpha helices. If you don't have any choice, then, again depending on the organism, the functionality might be so important that in a few cases it is worth spending ATP energy in order to be able to create protein structures that we otherwise could not. But this is going to be a balance that is determined by natural selection. It's much less common in bacteria, because in bacteria you want simple structures, and what you're going to be paying here is that the folding rate is going to be much lower if you need to involve chaperonins, right?
And bacteria need to be optimized for speed. Yes. I wouldn't so much argue; I would say that it's mainly an observation. Why? Well, we do know that we are slightly more complicated than bacteria, right? We have vertebrae. It's more of an observation than an argument. Obviously we have more advanced functions: there are lots of functions in humans that you don't have in bacteria. Why we have them, good question. Oh sure, there's no question that we have them, but the question is why. Why are we not bacteria? That goes back to the fact that we like to see ourselves as the pinnacle of evolution. I'm not quite sure whether that's true; in many ways, as I said, bacteria are probably more efficient. So it might be that, give it a few billion years, we're gonna die out and the world will be ruled by bacteria. And you might not think it, but this connects back to where we all started: Levinthal's paradox. You know Levinthal's paradox, and the argument is that the search problem here is so insane that it would take forever. But when I introduced that we said "search problem". What would you say now? In other words, what type of barrier did we have to folding? Entropy. So what Levinthal's paradox really describes is the entropic searching, right? How do you get over the entropy barrier when we're folding things? There are a few ways, in that all we need to do is find a way to make the search process more efficient, and there was actually a very nice suggestion by Phillips: what if folding just starts around the N-terminus of the chain? That would work awesomely together with the chain gradually being produced. It's a fantastic model. It's a beautiful model. It solves Levinthal's paradox, and there's only one problem: it's not true, which we can test experimentally. It doesn't work.
You can remove the N-terminus of most proteins and they will still fold. Too bad, and that means that we need to throw it out. But what Levinthal is effectively saying is that it's not just a matter of finding the lowest minimum of free energy: proteins are effectively under kinetic control. What that means is that it doesn't matter if in theory there could be an even better free energy state. If it would take a year for you to find it, it's completely irrelevant, because we're never going to find it in the first place. So forget about the lowest free energy in the free energy landscape. All that matters is which states are accessible in finite time. If you can't find it within a few minutes, forget about it. Which, if you think about it, is huge, because now we're throwing out almost all of thermodynamics. We're gonna get it back in a second, but it's all about the barriers: what can you find, and how can we get over these free energy barriers? If you can't get over a barrier, that state has nothing to do with protein folding. So the argument that came up quite soon is that there have to be some sort of pathways. Imagine this is some sort of path through the forest here or something: there has to be a guide, the protein will have to implicitly be pushed in the right direction. There is no way we can try everything. Now, how it is pushed in the right direction will of course depend on the landscape, and that's where the total free energy comes in, so it's not completely independent. What people did fairly early was to ask: can you come up with different models for how, as this chain is collapsing, the chain should be guided?
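To put rough numbers on why "there is no way we can try everything", here is a minimal sketch of the Levinthal-style estimate. The three conformations per residue and the sampling rate of 10^13 conformations per second are illustrative assumptions, not values taken from this lecture.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365.25


def levinthal_search_time_years(n_residues,
                                states_per_residue=3,      # assumed, illustrative
                                samples_per_second=1e13):  # assumed, illustrative
    """Years needed to visit every conformation once in a purely random search."""
    n_conformations = states_per_residue ** n_residues
    return n_conformations / samples_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for n in (10, 50, 100):
        years = levinthal_search_time_years(n)
        print(f"{n:3d} residues: {years:.3g} years for an exhaustive search")
```

Under these assumptions a ten-residue peptide is searched in far less than a second, while a 100-residue chain would need vastly longer than the age of the universe, which is the sense in which only states reachable in finite time matter.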
It's not a completely random search. There are three traditional models that are used, and all of them have a bit of truth. I would argue that one of them is significantly better than the others, but here too you could argue that there are points to each of them. The first one is the oldest, which is diffusion-collision, and it's hierarchical. Let's start with the model first: if you have this long chain, the argument is that we would very quickly form the secondary structure, in particular the helices, and we've already seen in some simulations that helices can form very quickly. The second you form this helix, instead of having 25 amino acids here that are all independent, you now just have one element, one big helix diffusing around. Then you have another 25 residues here, but that's also now just one helix. So suddenly we just have two helices instead of 50 residues. That's a dramatic reduction of our search space, right? And let's just assume that the beta sheet was equally fast, which we sadly know that it isn't always. But if that beta sheet could form, if we could form those three or four elements, suddenly it's just three or four large building blocks diffusing, and then they need to bump into each other and search a bit, and then we would form the protein. That might take a second or so, and we can pay that second. We can't pay a billion years, but we can wait a second. Some people call this the framework model, but I think diffusion-collision really describes it: secondary structure elements that diffuse around and collide with each other, and when they find the good states they bind. This could definitely explain the protein folding. It would definitely explain how we get past Levinthal's paradox. The only thing that we are not sure of, of course, is whether it's true or not. If you have lots of small alpha helices it might very well be true; for beta sheets it is more
questionable. But there are proteins where it will work. You could also argue that what really happens is almost what I drew on the blackboard before: everything is focused on the molten globule. We have this chain that's collapsing into the molten globule, and then we have a complete mess here, but the mess is roughly in the right place (and this is just as much hand-waving as it sounds), and then we would gradually have the secondary structure elements form. This was popular decades ago. The only problem here is that as we've got more and more advanced experimental techniques, we've learned more about the molten globule. When I was a student we kind of liked to think of the molten globule that way. Compared to what you saw early on in this lecture, what do you think about this model? The sad point is that there is increasing experimental evidence that that's not what the molten globule looks like. We have the secondary structures already in the molten globule. So sadly, I would say that out of the three models this is likely the worst one. For whatever reason, it likely doesn't happen this way. But the point is that this doesn't necessarily mean that it's a bad model, because to solve Levinthal's paradox I only need to show that there are ways for the proteins to find the native state without going through very high barriers. There might be something that's even better than what I proposed, and then it can happen even faster, but to understand why proteins can fold it's enough to find one possible way. There might be some even better way, but I only need to prove that it won't take an infinite amount of time. So this is not an entirely stupid model. And the third one, which is the newest, sounds a bit complicated, but we're gonna come back to it: nucleation-condensation. The argument here is that you have this gigantic messy protein chain, whether you call this a molten globule or something.
Let's skip that for a second. What happens then is that there are a few very important contacts here that start to form. Whether you call these key residues or VIP residues or something, some residues are more equal than the others. They will start to form some sort of contact, a core; think of it as a droplet. Nucleation is something you frequently use in physics, say when ice starts to form. It's not forming all over the liquid: there is a core somewhere where the ice starts to form, and then you have more molecules condensing on that. So the idea here is that there is some sort of core, a droplet in the middle, where you start to form things, and then you're growing things out and out and out. If this was true, we should be able to somehow identify that as a transition state. It might be super expensive, but if we know that that is the state you need to get to, once we are at that state it will be downhill on the other side. So why don't we just determine that state and check whether it looks that way or not? What do you know about transition states? It's the highest possible free energy you can imagine. There is no way we could determine that. By definition, you're trying to determine a state that is balancing on the edge of a knife. So while it's a great model, you can't crystallize that state. It's completely impossible. But bear with me, we're gonna come back to that.
I think next week. But the idea here is that this initial nucleus would somehow lock things in, and then it's not a matter of every part of the protein trying to interact with every other part of the protein. The only residues that could bind to it are the ones that are already relatively close in the sequence. So this would very much correspond to ice gradually forming in water. If you haven't understood it by now, this is likely the model that is the best description of real protein folding. The only problem is that, this far, it's only a model, and we have no idea whether it actually solves Levinthal's paradox or not. Yes? You could argue that we of course haven't defined exactly where this is happening. So my counter-argument here will be: well, maybe this happens not out in water, but inside the exit tunnel of the ribosome, and for some proteins that is actually where it happens. Alpha helices in particular tend to form already in the exit tunnel. But the other thing is: don't add too much detail, because, trust me, there is enough detail in biology that it will last a lifetime, right?
And if you start to build up too many complications, you will never understand this directly. A good argument: suppose you ask the question, is it possible to get from here to the parking lot out there? In theory there are lots of ways to do this, right? You could try to map out every single possible path in the house and through the Stockholm University campus, or I can just walk out that door and exit, and I have found one way. The question was whether it is possible, and I just showed that it was possible. You might know an even better way, but that doesn't change the argument: all we wanted to show was that it was possible. So at this point I don't want to worry too much about the details. We just want to show that there are models where we can solve the searching problem and crack Levinthal's paradox. I'm not saying that this is the best possible case for every single protein. So this would probably be something like the molten globule, where you've started to form structure, something that is semi-stable. This would be the molten globule, this would be the transition barrier that we need to get across to really fold the protein, and that would be the folded state. That's actually a good point, and I would not say that it's very rapid. There's a long search process, so this will test lots of different states. Most of these states are not going to be right, but at some point you find the right pair here. It's not that it's going to take a long time for the pair to form or something, but there's going to be lots of trial and error here, so you will constantly go back and forth until you have found the right transition state. Once we have found the right transition state you will gradually condense on it. But you're actually quite right: in this case it might not be
good to say rapid and slow there. Hmm, so that seems to be a good idea. It's not entirely sure, because we're just saying that there are two points here: are these residues important for the transition state? So one question is what the role of the residue is here, and a second question is what the role of the residue is here. The question is, if you change the residue, how much do you stabilize or destabilize the transition state, versus how much do you stabilize or destabilize the final state? You're of course exactly right, that's where we want to get, but there's a complication: it's not just one state. So I need to try to separate how much I stabilize the transition state versus how much I stabilize the end state. I'm going to hint at that in a couple of minutes here, but we won't crack it until next week. But that's exactly where we want to get. So there are two types of states that I've shown. One of them is folding intermediates, and you can determine folding intermediates; the molten globules are folding intermediates. We can find the state. It might not be super well defined, but it's something that seems stable. It's not the best state; you call these metastable states. It is a local minimum in the free energy, so I can observe it. The transition state, on the other hand, by definition I can't. I can't determine a crystal structure of the transition state: it's balancing on the edge of a knife. It's a local maximum in the free energy. But the transition states will of course determine how fast things happen. So the problem with your residue is that, in the extreme case, this might not affect the transition state at all. But what if I just destabilize the molten globule, right?
Effectively the barrier is lower, but that doesn't necessarily change the transition state itself. There's something hard here: I need to check both the folding process and the unfolding process, and then I might be able to separate that. So the way to try to determine this is to look experimentally at what happens in folding and what happens in unfolding. But the problem is that now it's not structure, it's kinetics we want to look at, and there are a bunch of more or less advanced experimental methods to try to do kinetics. The simple ones would be stopped-flow kinetics or continuous flow, in which you essentially have chemical A here and chemical B here. It's two small syringes, and as you push here, they mix. If you then have a very small and very narrow test tube here, like 0.1 millimeter or something, then as the liquid is being pushed through this test tube, well, here it has only had time to be mixed for maybe one millisecond, two milliseconds, three milliseconds, four milliseconds. So by choosing where to observe, say, fluorescence, I can observe it as a function of the time they have had to be mixed. The only problem is that it could be expensive, because I have to keep adding more and more liquid here all the time as I do the experiment. But that makes it possible to get properties, for instance fluorescence, as a function of 100 milliseconds, 200 milliseconds, 300 milliseconds. I'm not just depending on one snapshot; I can determine it continuously as a function of time. When you do that you can then try to mutate residues and see which residues influence, for instance, the fluorescence, and see which ones were early formers, at least. But that just tells me whether they were part of some sort of intermediate state, or whether they helped to form the protein.
It doesn't tell you whether they were part of the transition state, because by definition I can't observe the transition state. These would only be local intermediates. So to understand this transition state we need to go back and look at these barriers. We already know that the speed with which things happen has to do with an exponential raised to minus the free energy barrier divided by RT, right? So the higher the energy barrier is, the slower it happens. And just as we had on the slide before, although we might not have emphasized it, there are two energy barriers we have to think about: either we go from unfolded to folded, and then it's this barrier, or we go from folded to unfolded, and then it's the other barrier. Both of them are going to be important. So how do you determine these barriers? This is where I hate having given you slide copies, because you can see I can never surprise you. Anyway, the building we are in is named after the Nobel laureate who worked out how to determine these barriers: Svante Arrhenius, the Swede. He came up with the idea that, sorry, let me go back. Do you see this whole temperature dependence here? This whole term is related to the temperature, right? So we can find the slope of how a reaction changes with temperature. This is really the slope.
This is the coefficient that determines how fast it changes with temperature. So what Svante Arrhenius stated is that if you plot how fast the reaction happens versus one over the temperature, the slope of the red or the blue line here is going to be for the reaction going either from native to unfolded or from unfolded to native, and from these you could in theory get your free energy barriers. I'm sorry, we've spent two hours together, but give me two more minutes of attention. This is not entirely easy to measure independently, because if I have my native protein here and then I manage to unfold some of it, the only problem is that some of those darn molecules that I just unfolded are going to fold back. So the problem is that it goes both ways. For now you'll have to bear with me: let's pretend that the second a molecule has unfolded I could remove it from the sample and I wouldn't have to care about it. Maybe it would become part of the precipitate or something. That is unfortunately the problem, and hardly anybody uses this plot for protein folding. But if I could do it, if I could measure that curve independently from that curve, I would get the slopes. And if you could do this: unfolding makes a lot of sense. The more energy we add, the more heat we add, the higher the temperature is, the faster it goes, like any other transition you would expect in a chemistry lab. It makes 100% sense. Folding, on the other hand: the more energy you add, the slower it goes. It behaves exactly the opposite way of any normal transition you would imagine. If you want it to happen faster, cool it down. Already here you start to see that they are not the same type of barrier and process, and of course by now you kind of already have a hunch why it is that way.
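As a minimal sketch of the Arrhenius-plot idea described above: fit a straight line to ln k against 1/T and read the apparent activation enthalpy off the slope. The rate constants below are synthetic, generated from made-up prefactors and enthalpies purely for illustration; the helper names are hypothetical and nothing here is measured data.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_activation_enthalpy(temps_kelvin, rates):
    """Slope of ln k versus 1/T, multiplied by -R (in J/mol)."""
    inverse_temps = [1.0 / t for t in temps_kelvin]
    log_rates = [math.log(k) for k in rates]
    slope, _ = fit_line(inverse_temps, log_rates)
    return -R * slope


def synthetic_rates(temps_kelvin, prefactor, enthalpy):
    """Made-up Arrhenius-type rates, used only to exercise the fit."""
    return [prefactor * math.exp(-enthalpy / (R * t)) for t in temps_kelvin]


if __name__ == "__main__":
    temps = [280.0, 290.0, 300.0, 310.0, 320.0]
    unfolding = synthetic_rates(temps, 1e12, 80e3)   # speeds up when heated
    folding = synthetic_rates(temps, 1e-4, -40e3)    # slows down when heated
    print("unfolding:", apparent_activation_enthalpy(temps, unfolding), "J/mol")
    print("folding:  ", apparent_activation_enthalpy(temps, folding), "J/mol")
```

With these invented inputs the unfolding line gives a positive apparent enthalpy and the folding line a negative one, which mirrors the opposite temperature behaviour of the two processes described above. It assumes, as the lecture does for the sake of argument, that each direction could be measured on its own.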
So you can actually determine the temperature dependence of this, and I'm not going to take you through all the math. But again, we should just calculate what the derivative is here and how it depends on the temperature. Well, you're gonna end up with a prefactor that is roughly constant, so we can forget about it, and then we're gonna get the derivative of the free energy and the derivative of 1 over temperature. First we have to take the derivative of 1 over T, which is minus 1 over T squared, and then just the derivative with respect to temperature. With Svante Arrhenius we would just stop here and try to plot them as a function of each other, but this is now going to be a complicated expression with the derivative of the free energy with respect to temperature and 1 over T squared. This is not something that I would recognize immediately, but if you sit down and do the math, or look back in the book at what we did, there is a relation for the derivative of the free energy with respect to temperature, and it depends on the energy. If you evaluate this and put it in, it's gonna turn out that this temperature dependence actually depends on the energy difference between two states, so how large that energy difference is explains how the rate constant changes. Sorry, oops, my bad, too fast. What this means is that from the speed with which this happens we can determine what the energy difference is between these two different states. So the energy at the barrier, not the free energy now but the enthalpy, the enthalpy at the barrier must be higher than the enthalpy at the native state. Which is reasonable: the barrier is bad, you would prefer to be at the native state. But we also know, for the same reason, that the energy of the unfolded state must be even higher than that of the barrier. So energy-wise it's actually better to be at the barrier than in the completely unfolded state. And that corresponds to this curve.
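The derivative sketched verbally above can be written out as follows. This is a reconstruction under standard assumptions rather than the slide itself: the prefactor A is taken as temperature independent, the double-dagger marks the barrier measured from the starting state, and the subscripts f and u (folding, unfolding) are labels introduced here.

```latex
\begin{align}
k &= A\,e^{-\Delta G^{\ddagger}/RT}
  \quad\Longrightarrow\quad
  \ln k = \ln A - \frac{\Delta G^{\ddagger}}{RT} \\
\frac{d\ln k}{dT}
  &= -\frac{1}{R}\left(\frac{1}{T}\frac{d\Delta G^{\ddagger}}{dT}
     - \frac{\Delta G^{\ddagger}}{T^{2}}\right) \\
  &= -\frac{1}{R}\left(-\frac{\Delta S^{\ddagger}}{T}
     - \frac{\Delta H^{\ddagger} - T\Delta S^{\ddagger}}{T^{2}}\right)
   = \frac{\Delta H^{\ddagger}}{RT^{2}},
  \qquad\text{using } \frac{d\Delta G^{\ddagger}}{dT} = -\Delta S^{\ddagger},\;
  \Delta G^{\ddagger} = \Delta H^{\ddagger} - T\Delta S^{\ddagger} \\
\frac{d\ln k}{d(1/T)} &= -\frac{\Delta H^{\ddagger}}{R}
\end{align}

% Unfolding speeds up on heating:  \Delta H^{\ddagger}_{u} = H_{barrier} - H_{native}   > 0
% Folding slows down on heating:   \Delta H^{\ddagger}_{f} = H_{barrier} - H_{unfolded} < 0
% Together:                        H_{native} < H_{barrier} < H_{unfolded}
```

Read this way, the sign of the temperature dependence of each rate fixes the ordering of the enthalpies of the three states, which is the conclusion stated above.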
I drew it that way, right: the energy drops all the way, continuously. You can do exactly the same argument with entropy. It's a slightly different equation, but based on what we know about the speed, the fact that protein folding slows down if you raise the temperature, we know that the entropy of the unfolded state is highest. The entropy of the barrier is, sorry, worse, and the entropy of the native state is by far the worst of them all. What that really gets to is this plot: the energy goes down monotonically, and the entropy actually also goes down monotonically, but it's going to be more pronounced in a certain area, and that is what creates this barrier. The whole argument is that unfolding is an energetic barrier. We know that because unfolding goes faster the more energy you add. And we know that folding is an entropy barrier because we have exactly the opposite: the more energy you add, the slower it goes, because we need to find those states. So this is not just hand-waving; we know from experiments that this must be true. Folding is about a searching process, an entropic process, and that was all Levinthal was about, while unfolding, the stability, has to do with energy, and the point is that you need them both. As much as we hate the barrier when we're trying to fold, we would be in very bad shape if we were not stable. The only problem is that this is completely useless. You can't measure it in the lab, because for a large, complicated protein there is no way to remove the molecules that have unfolded. I can't unfold it and just look at the unfolding; I always look at folding and unfolding mixed. There are ways to study this, and we're gonna study them on Monday. Let's see. Yes, two slides remain, so we'll have time to do this. The way you would like to study this is the equilibrium rates: in this process, how fast are we, in general, net moving to the right or moving to the left?
Now, the net process of moving more to the folded state is of course a sum: some molecules are folding and some of them are unfolding, or vice versa. If I am net moving to the unfolded state, that just means that there are more molecules unfolding than folding. What I really would like to do is sum this up, so that this equilibrium constant is really the ratio of how much is folding versus how much is unfolding, and those I can calculate. The first one had to do with going from unfolded over the barrier, and this one with going from folded over the barrier. So it actually turns out that the barrier doesn't enter there. It will only depend on the free energy difference between the two states, because, again, this is the equilibrium constant, not the folding rate. This is going to determine, if you wait an infinite amount of time, what fraction of the molecules are going to be folded, in state B, versus unfolded, in state A. And then we go through the math, and I'm not even going to try to go through this in detail with four minutes remaining; consult the book if you would like. What we would like to know is, as a function of time, how many molecules are still in each state. Let's just call them state A and B. How many of the molecules are in state A as a function of time?
Well, that's going to be a sum of these things, right? I remove the ones that are moving over to B, and I add the ones that are coming from B to A. Then I can use these two equations, because A and B of course depend on each other: the sum of A and B is constant, because it's the total number of molecules. Then it turns out that I can start to reformulate that in terms of the ratio of molecules I would have at infinite time, and if I plug that in, eventually I'm going to get an exponential. So of course the longer I wait, the more molecules have moved over from A. And at the very end, the constant you get up here in the exponential is actually going to be the sum of the rate constants for moving from left to right and from right to left. It's very non-intuitive, and I'll do you a favor: you can forget about that after this slide. So why do I even show it? Well, I want to show that there is a way, because the K here was just an equilibrium constant. I would like to know how fast the net process happens, and for the net process you can actually look at the sum of both folding and unfolding at the same time. So it does work, and rather than worrying about Svante Arrhenius and trying to remove the molecules that have already unfolded, let the protein molecules do what they want to do, and then we'll just add up the rates. This is what everybody does in a chemistry lab. And if you do this, you end up with a curve that goes down there, then changes shape and goes up again. In this case, on the x-axis here I have the denaturant concentration, say guanidinium hydrochloride, and this is super straight. The reason why this works: if I don't have any guanidinium hydrochloride, what will all the protein molecules do? They're gonna fold, right? They love to fold.
Up here, we're only folding. We don't care about the unfolding, because we don't have any denaturant, so up here it works really well: I can ignore the denaturation. If you have eight molar guanidinium hydrochloride you can pretty much forget about folding; anything is gonna unfold. So up in this region everything is gonna be unfolding. And you see that this is even a logarithmic diagram, so if I am here, the contribution of that curve is negligible. There's gonna be a bit of complication around this midpoint, but let's not worry too much about that for now. These diagrams are called chevron plots. The name just comes from these signs you have on uniforms; you recognize them because they always have this angle. This gives you two things for any protein. Give me one sequence, and then we test this as a function of unfolding conditions: it could be a function of temperature, a function of denaturant, a function of anything else. All I need to do is measure how fast it folds versus unfolds. What does that get me? Absolutely nothing, well, unless you'd like to know how fast your protein folds. But what if we now use your suggestion? Let's do one mutation here. What do you think is gonna happen to this curve? It will start to move, and exactly how it moves we will look at next week. But in general, when you do lots of experiments like this, you're gonna end up with a curve that has moved in some sort of way, and how much that angle is moving versus how much that angle is moving helps us know whether it was primarily the folding speed we were changing or the stability to unfolding. What that's gonna help us do is tell us how much we are stabilizing the transition state in particular, and whether the specific residue that you changed was part of the transition state or not.
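A minimal sketch of how such a chevron curve arises when the observed relaxation rate is the sum of the folding and unfolding rate constants, as described above. The assumption that ln k_f falls and ln k_u rises linearly with denaturant concentration is a common modelling choice added here, not something stated in the lecture, and every parameter value and function name is made up for illustration.

```python
import math


def ln_folding_rate(denaturant_molar, ln_kf_water, m_f):
    """Assumed linear decrease of ln k_f with denaturant concentration."""
    return ln_kf_water - m_f * denaturant_molar


def ln_unfolding_rate(denaturant_molar, ln_ku_water, m_u):
    """Assumed linear increase of ln k_u with denaturant concentration."""
    return ln_ku_water + m_u * denaturant_molar


def chevron_ln_kobs(denaturant_molar, ln_kf_water, m_f, ln_ku_water, m_u):
    """ln of the observed relaxation rate, k_obs = k_f + k_u."""
    k_f = math.exp(ln_folding_rate(denaturant_molar, ln_kf_water, m_f))
    k_u = math.exp(ln_unfolding_rate(denaturant_molar, ln_ku_water, m_u))
    return math.log(k_f + k_u)


def midpoint_concentration(ln_kf_water, m_f, ln_ku_water, m_u):
    """Denaturant concentration where k_f equals k_u (half folded)."""
    return (ln_kf_water - ln_ku_water) / (m_f + m_u)


if __name__ == "__main__":
    # Hypothetical parameters: k_f = 1000 /s and k_u = 0.001 /s in water.
    params = dict(ln_kf_water=math.log(1000.0), m_f=2.0,
                  ln_ku_water=math.log(0.001), m_u=1.0)
    for tenths in range(0, 81, 10):
        conc = tenths / 10.0
        print(f"{conc:4.1f} M  ln k_obs = {chevron_ln_kobs(conc, **params):7.3f}")
    print("midpoint (M):", midpoint_concentration(**params))
```

Printing the values shows the folding arm dominating at low denaturant and the unfolding arm at high denaturant, with the dip in between. Rerunning with a changed `ln_kf_water` or `ln_ku_water` is one way to picture a mutation moving one arm more than the other.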
Now, what this will help us to do is identify the transition state, and this is super cool. We can't determine the structure of it, but I can know that residue alanine 47 was definitely part of the transition state. I have no idea what it looks like, but I can say yes or no: was this residue part of the transition state? So indirectly we can map out transition states. But that's gonna be the topic for next week. Oh, 20 study questions; things are moving up here. I think we've actually covered all of these. Work your way through them, and if there is anything that you feel that I haven't covered, we'll talk about that on Monday morning. So what are we gonna do next week? The good thing is that we're gonna crack Levinthal's paradox early next week, and then we're gonna spend some time looking at real protein folding. We're gonna spend some time looking at nucleic acids in particular. I might spend a little bit of time talking about our research. And then, I forgot whether it's next week or the week after that, we are deliberately gonna take you all out on a study visit to SciLifeLab, if you want to. I feel that we kind of wave SciLifeLab as a flag for you all the time, but you haven't really spent that much time there. If you're interested in specific labs: I will show you our labs, cryo-EM and electrophysiology, but also bioinformatics and simulations. If there is anything specific you would like to know about, let me know and I can try to hook up another PI too. And then you're gonna be doing more labs. Lots of work, but enjoy the weekend.