Good morning. We have an hour and a half until lunch, since we had the David Roos presentation this morning, but that also means that I'm going to run this without a break, otherwise we won't get through these slides. Today I'm going to talk about the denatured state, and then I will have time to get roughly halfway through the kinetics of protein folding. That I'm going to continue on Monday next week. On Monday we're also going to finish most of the statistical analysis of proteins, and then we're going to start talking a little bit about free energy calculations and docking. For instance, when you start developing drugs in practice, somewhat related to the stuff David talked about: what do you do, and how can you start using simulations in particular to not just look at things, but derive things that are non-trivial? And then I'm also going to have a lecture both on docking and on actual research next week.
Let's get started with these discussion points we've had. 13 of them, unlucky. Pick one, and then we'll go through them.
9, GFP. So there's one caveat here that I actually didn't tell you about. What's so special with GFP? There are tons of fluorescent probes, right? No, there are literally tons of fluorescent probes. You can buy any color you want. So what's the difference between GFP and the typical fluorescent probe? A color, a dye? Yes, and why is it special that it is a protein? Exactly, so you can get a cell to express GFP naturally. And this actually turns out to be important for a ton of proteins, right? Because anything that you can express as a gene, you can put in a live cell and have the organism express it. Other types of dyes are something you typically have to get to bind to a protein, but this can be naturally expressed. Sure, there are others too, but if you look in general in, say, microscopy, most dyes are small molecules, right?
And you're going to need to find a way to get this molecule to bind to something. But the beautiful thing with GFP is that you can get it expressed from the genome. And GFP can also be tuned to fluoresce in tons of different colors. That was one out of 13. More?
Yes, so there are of course lots of reasons why you could argue that many proteins are similar. And this is certainly something we didn't prove. But I would argue the consensus today is that there is a surprisingly small number of, call them folds or topologies or something, simple physical or mathematical ways of packing the structure. That has nothing to do with bioinformatics in the sense of evolution. No matter how you get these sequences, divine inspiration or expressing them in the lab, most things will not be stable, and the few proteins that do fold, fold because they are stable in one of the few folds that are stable.
Other things? What is cold denaturation? Yep. So it's the phenomenon by which proteins can also unfold at lower temperatures, usually close to zero, if you think about it. Right. And cold denaturation is something that's very special to proteins, right? All other phase transitions you could imagine go like ice: it is stable at low temperature and then it melts. So this is an extremely strange feature, that we're only stable in a narrow regime. If we increase the temperature, something happens. But if we decrease the temperature, something also happens, and pretty much the same thing happens. And that's pretty much unique to this type of molecule. You would not see that in any simple compound. Yes. And this is primarily related to the hydrophobic effect, right? When you solvate oil in water.
Others? Protein domain sizes? Yes, and you can think of this as two levels.
If you look at secondary structure sizes, well, that kind of relates to eight, but we can take that one too. When it comes to simple secondary structures, it's simply very unlikely to have 100 residues after each other that are all stable in an alpha helical form. It could happen, but if the probability is 50% per residue, it's 0.5 raised to the power of 100, which is about 10^-30, which is nothing. When it comes to larger protein domains, this is, well, primarily because the larger something is, the more expensive it is to produce it. If you have an error anywhere, it's not going to be stable. Now, if this was evolutionarily very important, the body would still use it. But as we saw, was it on Monday, there are also relatively few cases where you need very large things. It's simply easier to use smaller building blocks. And when you need to do complicated things, you assemble them from multiple building blocks rather than having one gigantic fold or domain.
We have plenty more. Five, right. In one way, this is, of course, very much related to evolution, right? No matter what sequence, the only sequences that will ever form a protein are the ones that happen to be stable in one of the very few folds we have. And roughly how many folds were that? 1,500 to 2,000. The point is, it's not even 10,000. And this puts some extremely tough boundary conditions on what things can fold into proteins.
Let's see, so which one? Six was the stabilization free energy. Six, sure. Yeah, what is the typical value? Yes, and the point there is that this is roughly what? How large is that energy? Well, let's take a step back. The first thing is that it is not proportional to something, and what is it not proportional to? So that means that the stabilization energy of a protein is not a constant, right? But it's also not 10 times larger for a protein that has 10 times more residues. And what ballpark of energies are you talking about here?
Of course, it's not a single energy for all proteins, but yes. And if you hate numbers, how should you remember this? I would even say that 20 is a bit on the high side, but it doesn't matter. That's one way, but the length of an alpha helix is not an energy, so then it's going to depend a lot on the units. So there are a couple of things there. I would argue there are two important energy units you should think of in biology. Can you imagine what they are? And 0.6 kcal per mole is what? What? Good try, next try. So 0.6 kcal per mole is what? kT. So that's the thermal energy at room temperature. If you have a higher temperature, it's going to be different, but at biological temperatures, kT is the fundamental energy scale. Things smaller than kT are not relevant. Barriers significantly larger than kT, we're going to feel that they're barriers. Then there is a second energy, which is the one you mentioned, the hydrogen bond. And roughly how large is a hydrogen bond? Yes. And as I say, they're 5 to 10 kT. You can also say ballpark, not more than 10 kT at least. The important thing is that hydrogen bonds are relatively stable, but a protein, in turn, is stabilized by the equivalent of just a handful of hydrogen bonds: two, maybe three, maybe four, but not more. If you think in terms of those units, you do not have to remember a ton of different numbers. And the fewer specific numbers you have to remember, the fewer mistakes you're going to make with units, because suddenly you're talking about kilojoules per mole or something. If you know that a protein is stabilized by a handful of hydrogen bonds, and you know roughly how expensive a hydrogen bond is, then it's going to be easy for you to translate that to kT, to kilojoules, or to kcal.
More things. We had some questions about structural and sequence evolution here. I would argue there are two types of structural evolution. Well, no, that's sequence evolution, right?
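As a side note on the two energy units above, here is a minimal sketch (not from the lecture) that converts kT into kJ/mol and kcal/mol. The temperature of 298.15 K and the function name thermal_energy are assumptions chosen for illustration; the 5 to 10 kT range for a hydrogen bond is simply the ballpark quoted in the lecture.

```python
# Molar gas constant in kJ/(mol K); on a per-mole basis kT is written RT.
R_KJ = 8.314462618e-3
KJ_PER_KCAL = 4.184


def thermal_energy(temperature_k=298.15):
    """Return kT per mole as (kJ/mol, kcal/mol) at the given temperature."""
    kj = R_KJ * temperature_k
    return kj, kj / KJ_PER_KCAL


kt_kj, kt_kcal = thermal_energy()
print(f"kT at 298.15 K: {kt_kj:.2f} kJ/mol = {kt_kcal:.2f} kcal/mol")

# Hydrogen bond ballpark from the lecture: 5 to 10 kT.
for n_kt in (5, 10):
    print(f"{n_kt} kT = {n_kt * kt_kj:.1f} kJ/mol = {n_kt * kt_kcal:.1f} kcal/mol")
```

At room temperature this gives roughly 2.5 kJ/mol, or roughly 0.6 kcal/mol, which is the number used in the discussion.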
That's how it happens on the sequence level. That's how the body does the trial and error part. Just try to move it: cut a large part of the sequence in a gene, insert that in another gene. But of course, that's just the cutting. At some point we're going to decide, is this stable or not? Is this advantageous enough that it's going to survive natural selection? So structural evolution in general is much more related to natural selection. In what sense? I would say that that's the second part. On the structural level, if you're less stable in a fold, that's bad, because you're not as stable, right? So the first part is simply that mutations that tend to stabilize the fold you're in, so that this protein folds better or easier or something, those are going to survive natural selection. It's good for the protein. They can't be too stable, mind you. But again, if it facilitates folding into a good structure, that's good. The second part is that it might not actually stabilize the structure, but it might enhance the function. And that's, of course, also something that's very good. So if you look at the second part, if it enhances the function, we talked about a couple of examples of that. Yes, so hemoglobin is simple, right? So you talk about tiny differences in the geometry of the protein. But again, if that tiny difference suddenly improves the binding free energy by, say, one kilojoule per mole or something, it's definitely going to be better. That organism does not have to express as many red blood cells.
Protein domain, sequence fold fitting. So why is this stabilization that I just spoke about, why is that not governed by the Boltzmann distribution? Because it definitely looks like a Boltzmann distribution, right? Right, so Boltzmann distributions are all about equilibrium. You need to move between states. And by definition, you already have one sequence. So a single sequence does not move between different sequences.
Then we did a lot of mathematics and proved that you actually end up with an exponential distribution that looks exactly the same, which is not entirely a coincidence because that has to do with the solubility of individual amino acids. But this is not an equilibrium between different sequences. We already talked about the sizes of helices and sheets.
What is allosteric modulation? I'm going to talk a lot about that next week. In an enzyme, for example, something binds at another site, but not the catalytic site. Right, so it's kind of like a transistor or something. You can have one small molecule or something binding that changes another process. It's not really an enzyme, but it's somewhat related to this process. It's not that the first binding is, for instance, what generates the oxygen binding, but the fact that hemoglobin, for instance, changes conformation a bit, that aids in oxygen binding. I'm going to talk a lot about that when we talk about ion channel research next week. Some of the stuff that David mentioned in his talk today is very much related to allosteric modulation. This is super important, in particular in eukaryotes. Yes, and there are tons of them. The problem is, of course, they're very hard to study in many cases. And there are proteins where it has taken decades for us to realize that there is actually allosteric modulation in them.
So what is the folding unit of a protein? Well, it was related to the last stuff we spoke about on Monday. It's somewhat related to the melting temperature. So you just look at the words folding unit. Do you mean the part that folds? Yes, the part that folds. So in general, the answer is, of course, that it depends, right, but this is intimately related to the concept of domains.
So on the sequence level, a domain is usually the parts that we exchange between genes, the evolutionarily conserved large parts. In terms of biophysics, a domain is rather something that folds independently. And roughly how large is that? And how do we determine that? Yes. So I would argue that this corresponds closely to what we spoke about, was it two lectures ago, that you typically have, like, two or three layers of sheets or helices. If it gets significantly larger than that, it's going to be expensive and difficult to fold them. If they're significantly smaller, once you get down to one layer, you don't really stabilize anything. And the way we know that is what? No. So this is how we could compare the probability of an individual folding unit being either folded or unfolded. We could use all these statistics. And then we compare this. So that's where we could calculate how much heat we need to use per folding unit to get it to melt, for instance. And then I can compare that to the heat that I use in total to melt, say, a kilo or a gram of protein. So this is actually possible to measure experimentally. But we need quite a bit of mathematics to convert the experimental numbers to see how much energy we use per folding unit to melt it. So that's, well, primarily because that's not what we see in the measurements, right? Occasionally, you might see a small piece of the helix form. But the important thing is, if you take a protein and start to melt it, you're not going to get one helix melting while the rest of the protein is preserved. As you melt, you're going to see nothing happens, nothing happens, nothing happens. And suddenly, you have destroyed all your helices and most of your beta sheets. You might still have a couple of helical residues. But this is based on observation. We see that. We destroy all of them.
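The all-or-none melting described above is often summarized with a two-state model. The sketch below is an illustration, not the lecture's derivation: it assumes only a folded and an unfolded state, a temperature-independent unfolding enthalpy dh (so no heat capacity change), and made-up values for the melting temperature tm and for dh. The function name fraction_unfolded is hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def fraction_unfolded(temperature_k, tm, dh):
    """Two-state estimate of the unfolded fraction.

    Assumes the unfolding free energy is dG(T) = dh * (1 - T / tm),
    i.e. a constant enthalpy dh (kJ/mol) and dG = 0 at the midpoint tm.
    """
    dg = dh * (1.0 - temperature_k / tm)
    k_unfold = math.exp(-dg / (R_KJ * temperature_k))
    return k_unfold / (1.0 + k_unfold)


# Illustrative numbers only: same midpoint, two different enthalpies per unit.
for dh in (100.0, 400.0):
    curve = [fraction_unfolded(t, 333.0, dh) for t in (313.0, 323.0, 333.0, 343.0, 353.0)]
    print(dh, [round(f, 3) for f in curve])
```

With the larger enthalpy per cooperative unit, the curve stays near zero and then jumps to near one within a few degrees. This is roughly why the sharpness of the measured transition, compared with the total heat absorbed, carries information about the size of the folding unit.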
So that's why I say that it comes down to definition: you can define a domain as the folding part, which is quite common in biophysics. You can also define a domain from bioinformatics and sequence. You can do whichever you want. But when we say domain in terms of geometry and everything, we typically think of this as the folding unit. Do domains in bioinformatics overlap with domains in biophysics? Oh, sure. They overlap a lot, right? I would even say that it's 99% overlapping. But 99 is not 100. And this ultimately has to do with how you define it. They are virtually always identical, which is, of course, no coincidence. Because the reason nature tends to cut these domains is likely that they are independently stable, right? If this domain is stable when it folds, I can likely take this domain out of protein A and move it into protein B. Because if it could fold independently as part of protein A, it can likely fold independently as part of protein B, right? If this was only stable as part of a larger protein, the likelihood that it would still be stable when I cut this and put it in another protein would be zero.
How do enthalpy and entropy change on folding? You know what? That's something I'm going to talk about today. So I'll skip that and get started.
What I will talk about is this: we're going to start looking a lot at this folding and unfolding transition. And we're also going to look a lot at the unfolded state, because the unfolded or denatured state is actually more complicated than we think. And then we're going to look a little bit at exactly what it is that stabilizes us when we fold. And we're going to start to look a little bit at how quickly we fold and why we fold. Chaperones and folding rates, yes, there are a bunch of fun things we'll start on here. I will actually get all the way to introducing some experimental plots.
But exactly what we do with these plots, I'm going to tell you next Monday, I think.
In an experiment, it's very easy to say that something is denatured. What does denatured mean experimentally? Yeah, that we destroy the protein. Destroying proteins is easy. I just boil it. Add 10 molar guanidinium hydrochloride or something. Increase it. Well, in many cases, it's harder to avoid denaturing the protein. The point is that we don't really know what that state looks like, right? And it's not even obvious that that is one state. In some cases, if you boil a protein, well, you can't unboil an egg, right? But you can certainly remove a high concentration of salts like guanidinium hydrochloride. So there might actually be different types of unfolded states. Just because it's not folded doesn't mean that these are all the same state.
One possibility you could imagine is that you just take your entire chain and it's stretched out. We already talked before about why that's extremely unlikely. Why? No, forget about hydrophobicity. It's a much simpler concept that we started to talk about early on in this course. Exactly, so to get this chain to be extended, you have to put every single torsion in the chain in its trans state. How many microstates is that? That's one microstate, right? In comparison, the number of microstates that would correspond to something that looks kind of curled up is gonna be billions and trillions and quadrillions. So that is simply based on entropy. There are only a handful of completely extended coils. So when we say extended coils. Yes. Not just like that. But even if it's almost extended, right? You can, of course, allow some variation, but you're gonna have very few states when things are far apart. You're gonna have tons of states otherwise. If you just take a piece of boiled spaghetti and throw it out on the floor, what is the likelihood that it's gonna be straight?
They're not gonna be straight on average. That doesn't mean that any particular state you pick is likely. I don't have a pen here. Any particular state you pick, if you just pick one form, that form is gonna be extremely unlikely. But there are lots of forms that look like that one. There are lots of forms that look kind of curled up. There are very few microstates that are completely extended. And that is the reason for this observation we have here. If you look at this experimentally, unfolded proteins are really coiled up. They're frequently even collapsed. They kind of look almost like a folded protein. Look is the key word here. They're not gonna work like a folded protein, and they're not quite as compact, but compared to the extended chains, they're more similar to the folded protein. We still have some secondary structure when we're unfolded, and that's kind of fun. I'm well aware that this kind of contradicts what I said before about what the folding unit is. But the secondary structure is not necessarily the folding unit. The secondary structure is primarily a local property of amino acids. Even if you have a helix, if I destroy this helix, yes, I might break half the hydrogen bonds, but there will still be lots of individual residues that are a bit helical. And if we really, really, really want to completely destroy the protein, I have to add extremely large concentrations of salt, like, well, several molar, not millimolar or micromolar, but molar, and also frequently increase the temperature. So it turns out that you can actually show that there are kind of three states. At low temperature, and when we do not have lots of salt, we definitely have our native state. And if we start increasing the temperature, but also add lots of salt, at some point you're gonna get a true random coil.
But in most cases, you actually end up somewhere here, in what we call a molten state. It's the protein, but it's a bit floppy. It's not really well packed anymore, but it looks kind of like your protein. You just screwed it up a bit. This is a way more common state in your body than that one, because this might come as a surprise to you, but you do not have four molar guanidinium hydrochloride in your cells. So the point is that it's probably this strange state that is more relevant biologically than that one. That one you're never gonna see in a cell. No, well, in practice it's gonna look very similar to that one. Cold denaturation is not really that common in your cells, because again, if you start having temperatures approaching zero degrees centigrade in your cells, we have a few issues. So for now, we'll forget about that; it's gonna be like a molten globule too. No, we'll get to that in a second. Once you get to 45, 50, 60 degrees, the majority of the protein is gonna be in the molten state. At some point the protein will have to fold. At some point you produce the protein. So what we're gonna get to shortly is what happens when the protein comes out of the ribosome, when you start producing it. And you produce one amino acid after one amino acid after one amino acid. It even turns out you have a tiny bit of coil, but bear with me for a while, and I think it will become more clear. So it turns out that this molten globule is, I would at least argue, way more common under normal experimental conditions. It's certainly not in your cells, and that's kind of a good thing, because if your proteins started to assume molten globule conformations in your cells, you would be dead; they would no longer work. It's possible to study this in particular with NMR. So we can see that the main chain is roughly the same. We don't really have the secondary structure elements.
They're a bit looser and everything, but overall the chain is sorted in more or less the same order. We have secondary structure. We definitely have a hydrophobic core. The volume is roughly the same. But on the other hand, it's not really a native state. So you're either native or molten globule. It's not smooth. There is some sort of transition between these. If you drop the temperature, it folds into the native state. If you increase the temperature, it goes back to the molten globule. But the whole transition between the molten globule and that coil state, that's smooth, fuzzy. That's a very good question, and we're gonna see that in a couple of slides.
So if you consider the molten globule, there are a bunch of properties that are just like a native protein, but there are also a bunch of properties that are like an unfolded one. So there's definitely a compact hydrophobic core. And you know what, this is kind of reasonable. If you have a protein that consists of a chain of 150 residues, and half of them are hydrophobic, remember how strong these hydrophobic effects were. The second you throw this into a cellular environment, it will instantly collapse so that the hydrophobic residues turn inwards. In contrast, folding a protein can take a second or something. You're not gonna have those hydrophobic side chains facing water for a second. So the reason why this collapses instantly is primarily the hydrophobic effect. Do you think this will happen just for proteins that fold, or does this happen for pretty much any hydrophobic sequence? Yes. So that's not really part of being a protein with a capital P. Any polypeptide that has lots of hydrophobic amino acids in it is gonna form some sort of compact, hydrophobic core thing in water, because otherwise it would be too expensive. And this is where it's really not folded. There is not really any rigid structure. I think of this as a piece of slime or something, right?
It's not stretched out. It's collapsed, but it's not well-defined. On the other hand, there is at least some sort of secondary structure. And the reason for that is that alpha helices in particular are a very local secondary structure. They will form instantly. It's kind of the same thing there. If two or three residues really like to be alpha helical, they will form a local alpha helix very quickly. That doesn't mean that it's entirely well-defined. It will kind of breathe a little bit. In the helix, the fourth and the fifth residue might come and go, and there might be a break in the helix. But it will at least start to form some secondary structure. How do you think we know that? How do you determine a structure or something that's not really well-defined? No, because it's not a rigid structure. You can only crystallize it if it's a rigid structure, right? No, a much simpler experiment. Sorry? CD spectroscopy. CD spectroscopy. It's just $100, well, $100 for the machine. Now, it's probably $1000 for the machine. We can see that there is alpha helix, period. We don't know where it is, but we see the alpha helical content.
It's unfolded in the sense that when you go from your stable protein to the molten globule, that appears to be when you really melt. If you just keep increasing the temperature, there's really nothing special that happens when you go all the way to the coil. Related to the first one, the big hydrophobic side chains are definitely buried, for instance. In that way, it's ordered. But there is really no unique side chain packing, and that's the way it has kind of been destroyed. If you have a binding site or something, by this time it's literally molten. You don't have a well-defined binding site anymore. It's not really gonna work. It's literally not functional.
You might start to see that there are a bunch of things that are on the way to forming a protein, but we're not quite there yet. There are a bunch of examples in the lecture, and you can search Google for that too, where you can see different experiments. This is a collection of conformations of the molten globule. You probably can't see the difference, and that's the whole point. At first sight, this looks almost like a protein, right? But it's not a protein. It doesn't work. This might also be something that you see. You have helices, but there are a bunch of residues here that are a bit unordered. It's a bit flexible here, and whatever you have there on the inside is not quite bound yet. So if you think of a beautiful, perfectly well-packed protein, a molten globule is kind of the first state when you start to destroy it a bit. The side chain packing is not great. These blue dots correspond to water. So we start to get a bit of water on the inside. So why don't we unfold completely here? Yeah, although binding, I would say, comes down to specific side chain packing, right? And that's what we have started to destroy. So if you start to destroy that specific, beautiful side chain packing, why don't you unfold completely? Exactly. That's mainly the hydrophobic effect. It would be too expensive to start exposing all these to water. Yes, because there might be some hydrophilic residues there and everything; it's just that on average it's better to turn at least the hydrophobic parts to each other.
I mentioned a little bit about this before, and there are some equations here that I won't spend that much time on, but there are two things that are key features of proteins that you should remember. You don't need to be able to derive these equations. Just, as a thought experiment, take your protein of 100 residues and cut that up into individual amino acids. Of course, you can't do that experimentally, but assume you could.
The energy there is roughly gonna be the same because they interact with roughly the same neighbors. The problem, though, is that if you have disconnected amino acids, they can suddenly leave each other, right? They can get infinitely far apart from each other. When things are infinitely far apart from each other, the energy is gonna be zero, but I would argue that for a disconnected particle that would be an awesome state. That is the best state they can be in. Why? Yes, and not just high. As they move further and further apart, the entropy would go to what? Infinity, because you have an infinite number of microstates, right? The second you're a polymer, there's a limit to this. The entropy cannot get infinitely high. So here we plot density. Density zero here would correspond to that. That's the lowest density we could get. Why is there a limit to the entropy in a polymer? They're stuck together, right? They can't leave each other. So why is that a good thing? Think about the transition. So this is the unfolded state, right? So why is it good that we limit the entropy in the unfolded state, if you look at the unfolded compared to the folded? Why? Right, because think of this infinite entropy when you choose between folded and unfolded. The unfolded state would be infinitely good. And if the unfolded state is infinitely good, you would never ever fold, no matter what energy you have in anything. So if proteins were not polymers, they would never ever fold, no matter what interactions you talk about. You can actually go through this, and the book goes to some effort to show this: in a cloud, you calculate what is the available volume per monomer. And that's very simple. Take the total volume per monomer and then you remove the volume that each monomer occupies, a small omega. And then you need only a very small amount of simple mathematics.
You can show that that's an expression that's related to the density. And for a chain, well, then you end up with roughly a constant, which is the volume available to each monomer, multiplied with something that corresponds to the part of space we have not yet consumed. And then you can start drawing a bunch of curves, et cetera. And depending on the temperature and everything, you can show that you end up with different transitions. The key thing is that for a real polymer, you end up with some very complicated transitions here, so that depending on the temperature, you can get things to be stable in a state with higher density. And this polymer behaves in a way more complicated manner than a plain, simple, non-polymeric gas.
But there is also a prefix to polymer there. What does homopolymer mean? So can you think of a good example of a homopolymer? Plastic bags. So what are plastic bags usually made of? Modern plastic bags are usually made of propene or ethene, because they're environmentally friendly: they are pure hydrocarbons, so if you burn them you just get carbon dioxide and water. Simple, beautiful, important, but also incredibly boring because they just look the same, right? Proteins are not homopolymers, but heteropolymers. So hetero here refers to what? Yes, they have different amino acids, right? And it is the differences in amino acids that give you the very, very special packing properties. That's why a subunit in hemoglobin is different from a subunit in myoglobin. Even though the topology in this case is the same, right? But the specific amino acids will give you just slightly different properties. All the specificities in proteins come from the fact that they're heteropolymers and not homopolymers. And that's also where you get these beautiful phase transitions and everything.
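Going back to the cloud-versus-chain entropy argument a little earlier, here is a minimal sketch of why connectivity caps the entropy. It is an illustration under stated assumptions, not the book's exact derivation: the cloud entropy per particle is taken as the logarithm of the free volume per particle, omega * (1/rho - 1), and the chain entropy per monomer as the logarithm of a fixed volume v_link around the neighboring monomer times the unoccupied fraction (1 - rho). The names entropy_cloud, entropy_chain, and v_link are hypothetical, and entropies are in units of k.

```python
import math


def entropy_cloud(rho, omega=1.0):
    """Entropy per free particle (units of k) at packing density rho.

    Free volume per particle is V/N - omega = omega * (1/rho - 1),
    which grows without bound as rho goes to zero.
    """
    return math.log(omega * (1.0 / rho - 1.0))


def entropy_chain(rho, v_link=1.0):
    """Entropy per monomer (units of k) when monomers are linked in a chain.

    Each monomer can only explore a fixed volume v_link around its
    neighbor, reduced by the fraction of space already occupied.
    """
    return math.log(v_link * (1.0 - rho))


for rho in (0.5, 0.1, 1e-3, 1e-6):
    print(f"rho={rho:g}  cloud={entropy_cloud(rho):7.2f}  chain={entropy_chain(rho):7.2f}")
```

As the density goes to zero, the cloud entropy keeps growing while the chain entropy levels off at log(v_link), which is the sense in which disconnected monomers would never condense but a polymer can.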
And as you probably know by now, we talked a lot about kinetics and free energies and phase transitions, and if we're gonna have some sort of transition, we're gonna need a barrier here. So I'm arguing here that there are gonna be some free energy barriers involved in protein folding. Why? Why is that reasonable to assume? You can think of this as a postulate. I'm just throwing it out there, saying that during protein folding there are gonna be some free energy barriers involved. Yes. So this gets you two things. First, the logic here is quite right. If you're asked to prove something, one of the best ways to prove it is to assume the opposite and show that that leads to unreasonable conclusions. So there are two types of barriers. Let's think about protein folding first. If we did not have a barrier when we fold a protein, what would happen? Anything would fold instantly, right? And we know that that's not the case. It takes time. But the second part: if we unfold something, I would argue we also have a barrier when we unfold it. What would happen if we didn't have a barrier when we unfold things? Well, yes and no. We wouldn't really have a well-defined folded state, right? Because if this is smoothly uphill, you could always add epsilon more energy. And if we add epsilon more energy, we would become epsilon more unfolded. It would be a completely smooth transition. So that's like, well, if I was standing here and the temperature in the room increased, I would gradually melt more and more and more until I was a blob on the floor. And that's not what happens with the protein, right? If you boil an egg, there's something that happens suddenly. Nothing happens, nothing happens, nothing happens and boom, suddenly it's unfolded. So there is some sort of barrier; that we know. We don't know the exact nature of the barriers, but we can prove that there are barriers. And that's related to these free energy landscapes.
There are gonna be some things we need to get over. And I haven't talked about this in a couple of lectures, but these barriers are gonna be intimately related to Levinthal's paradox, right? How come there must be so many barriers in these proteins and yet the proteins appear to be able to find these complicated paths in a split second? Split millisecond even. I'm gonna argue that one of the most important free energy barriers is that in this native state, side chains are really well packed. So once you really are stable, you're gonna have, first, there's no vacuum, there's no void in a real protein. Even if it's just hydrophobic side chains, they're gonna fill every inch of space there. That you can actually show in modern PDB structures and everything, there's no vacuum in proteins. The fraction of space that is filled, well, I think the book says 80%. Even that is probably a bit low. I would probably say closer to 90. But the point is that there is no room for water. Putting in water here would require a gigantic space. And what's that gonna lead to? As you start to heat the protein, what will happen? Yeah, so they're very happy in some sort of local perfectly packed state, right? And as you heat it now, this is gonna cost energy. You have to add energy, and at least initially, you're not gonna gain anything because when you move these epsilon away from each other, you're not suddenly gonna get a water there or anything. And you're not really gonna get a huge gain in entropy. So when you unfold something, you're gonna pay and you pay in energy. So the initial denaturation process is always, always, always unfavorable. You have to expand, you lose your interactions, you ruin your beautiful packing, and there's absolutely no benefit whatsoever. This is good. If this was not the case, you would not be alive because your proteins would always be a little bit denatured, right? This is what gets you stability.
Eventually though, as you keep heating here, water will start to get in and you cross this barrier and then the bad things happen. Suddenly, the protein is no longer functional. You've broken it. And we don't know exactly what the barrier is. We know that it's related to that tight side-chain packing, or at least I'm hand waving that it is. And there is part of this process that's gradual. You just pay and eventually you get over it and the water gets in and then something happens there. Now, the second water gets in, at this point you might start to have a structure that actually can access more microstates. And here you probably start to benefit a little bit in terms of entropy. Initially, you don't really gain more entropy. You just pay with energy. And I'm gonna throw this out and just argue that it's true for the energy, if we plot this as a function of density. So low density here means that the chain is stretched out and high density here means that it's packed. I don't care about water or anything else. If you go beyond one, well, at some point you're just trying to push the atoms into each other. So the energy: when things are far away from each other, they don't interact. Hopefully, if the side chains pack, this energy will at some point become lower, right? Because it's good when you're interacting. Eventually, when you start with your nuclear experiments here and push atoms into each other, they're gonna start to dislike that and the energy will go up. You can probably also accept my hand-waving argument that at very low density, you have high entropy and then somehow the entropy drops during folding. But remember the free energy. The free energy is this balance between energy and entropy. So E minus TS, right? So depending on whether the energy curve drops faster or the entropy curve drops faster, you're gonna get something that looks roughly like this. And that is the free energy barrier we're after. This is just hand waving for now.
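The hand-waving picture above (energy and entropy both plotted against density, and a free energy F = E - TS built from them) can be tried out numerically. What follows is only a minimal sketch under stated assumptions: the parabolic energy, the S-shaped entropy and every parameter value (`E0`, `S0`, `RHO_C`, `WIDTH`) are invented for illustration and are not taken from the lecture slides or from any measurement.

```python
import math

# Toy parameters, all assumed for illustration (energies in units of kT at T = 1).
E0 = 20.0      # depth of the packing energy well at density 1
S0 = 12.0      # entropy gained on going from packed to fully expanded
RHO_C = 0.5    # density at which water gets in and the chain frees up
WIDTH = 0.04   # how sharply the entropy switches around RHO_C


def energy(rho):
    """Energy is lowest at full packing (rho = 1) and rises on either side."""
    return E0 * (rho - 1.0) ** 2 - E0


def entropy(rho):
    """S-shaped entropy: flat near the packed state, then a sudden rise."""
    return S0 / (1.0 + math.exp((rho - RHO_C) / WIDTH))


def free_energy(rho, temperature):
    """F = E - T*S for the toy model."""
    return energy(rho) - temperature * entropy(rho)


def local_extrema(temperature, n=600, rho_max=1.2):
    """Scan a density grid and return (minima, maxima) as lists of (rho, F)."""
    rhos = [i * rho_max / n for i in range(n + 1)]
    f = [free_energy(r, temperature) for r in rhos]
    minima, maxima = [], []
    for i in range(1, n):
        if f[i] < f[i - 1] and f[i] < f[i + 1]:
            minima.append((rhos[i], f[i]))
        elif f[i] > f[i - 1] and f[i] > f[i + 1]:
            maxima.append((rhos[i], f[i]))
    return minima, maxima


minima, maxima = local_extrema(temperature=1.0)
for rho, f in minima:
    print(f"minimum at density {rho:.2f}, F = {f:.2f}")
for rho, f in maxima:
    print(f"barrier top at density {rho:.2f}, F = {f:.2f}")
```

With these assumed numbers the scan finds two minima (an expanded state and a packed state near density 1) separated by a maximum, which is the kind of barrier the argument asks for. Making `WIDTH` much larger smooths the entropy curve and the barrier disappears, so the barrier in this toy model comes entirely from the sudden entropy gain, not from the energy term.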
I'm gonna show in 20 minutes or so that there's really good experimental evidence for this. This gives you a free energy barrier and it also gives you, well, it's usually, it could be an all or none transition, but the point is that it's not necessarily a first order phase transition, but it's some sort of very clear transition. You're either native or denatured. You're virtually never 50%. That would be a transition state. So the entropy does a bit of an S curve? So that's a good question. Why do you think there's an S curve? Remember what I said on the previous slide, but it's probably easier for you to think of it if you start in the folded state and you start heating it. Yeah, that's when water comes in. No, well, what happens before the water comes in? You're just expanding the protein a little bit, right? Like you're taking your beautiful side chains, well, look at my fingers here. If I just expand this a little bit, there isn't really a whole lot more freedom for my fingers. There still isn't a whole lot more freedom. And boom, suddenly you get a lot of freedom. So that's what happens here. You're just paying, you don't really gain a whole lot. And somewhere here, the water starts coming in and then things really become flexible. So I think that at least as a hand waving thing, it's probably a reasonable argument that it should look something like this. Now, let's play this Gedanken experiment. What if that was not the case? If you had a protein that just consists of, say, small glycine side chains or something, they never interact. What would that curve look like? That would be pretty much completely smooth, right? And what would happen here? You would not have any clear energy barrier, right? So have you seen any proteins that consist entirely of glycine? That's probably why. A protein will not be stable unless it can have a well-defined state.
So the key thing is that such sequences are definitely possible, but they will never form a protein. And now we keep getting back to this, right? Random sequences will not form proteins. Being a protein is an extremely rare property of an amino acid sequence. So I'll come back to that thought, but we talked a little bit about the denatured state. If you compare this with the native state, the two things that define the native state are that we have this really close packing and we typically have a very low energy. These are related because this packing is what gives us all the interactions, both electrostatics and the van der Waals interactions. You can actually show this if you, say, count the number of native contacts, count the number of hydrogen bonds or something. And even in a small simulation or something, you can really show that these very low energies correspond to the native states. So native states usually have both a low free energy and a low energy. Lots of good interactions. Even if you increase the temperature, the native state will likely still have the lowest energy because that's not very dependent on temperature. But of course, if you increase the temperature enough, the entropy effect will be stronger and then it might still unfold. I might skip this part. And now I'm gonna tell you a little bit about that. I will actually show in a couple of slides that this is true and that you can measure this experimentally. But just believe for a second that the free energy looks something like this. So we have one state here, the native state, and then we have other states that are denatured. And well, here we call this one macrostate, but there are gonna be tons of ways to organize this chain, right? So there are gonna be billions of microscopic states that correspond to the denatured state, but probably very few that correspond to the native one. That's what you see in the entropy plot.
Low entropy, few states; high entropy, lots of states. You can plot this in these plots that might appear a bit corny, where you plot the entropy as a function of the energy. At very low energy, you're hardly gonna have any entropy, very few states. And as you increase the energy, eventually the entropy starts to increase. You might not think of this, but you actually did a lab on this, and we kind of intentionally tricked you a bit because we hadn't gone through all the math. Remember these two-dimensional folding simulations you did? And you might also remember this reference to a paper in the literature where they did it in 3D. So what it turns out is that if you look at both the native states and all these other states, you can plot what all the possible energies are. So you're gonna have lots of energies that are relatively high. And then as you go to lower and lower and lower energies, you're gonna have fewer and fewer states. And eventually there are so few that they kind of become discrete, right? You're gonna have very few states with super low energy. But this depends a little bit on the type of amino acids you have or, in a simple lab, what type of energy function you have. Sometimes there's gonna be one or two states that have much lower energy than the others with a big gap here. And in other cases, it's gonna be very smooth and it's almost continuous all the way down. What you can actually show is that as you start out with high temperature, you're gonna be at very high energy, somewhere out here on the curve, where the slope is gonna be very low, and a very low slope means high temperature. As you start to reduce the temperature, you move to lower and lower and lower energy states here according to the Boltzmann distribution. So we occupy more and more of the lower energy states.
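The statement that a low slope of the entropy-versus-energy curve means high temperature is the thermodynamic relation 1/T = dS/dE. Here is a small sketch of that relation on a deliberately simple counting model. The model is an assumption chosen because its states can be counted exactly (100 independent units, each either in its ground state or excited by one energy unit); it is not the lattice model from the lab and not a protein.

```python
import math


def entropy(n_units, n_excited):
    """S/k = ln(number of ways to choose which units are excited)."""
    return math.log(math.comb(n_units, n_excited))


def temperature(n_units, n_excited, unit_energy=1.0):
    """kT from 1/T = dS/dE, using a central finite difference in energy."""
    d_entropy = entropy(n_units, n_excited + 1) - entropy(n_units, n_excited - 1)
    d_energy = 2.0 * unit_energy
    return d_energy / d_entropy


N = 100
for m in (2, 5, 20, 40):
    print(f"E = {m:3d}  S/k = {entropy(N, m):6.2f}  kT = {temperature(N, m):5.2f}")
```

At low energy there are few states and the entropy climbs steeply, which corresponds to a low temperature; further up the curve flattens and the temperature rises. The finite difference is only meaningful here for energies between 1 and just below N/2, where the entropy is still increasing.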
And then one out of two things will happen. As the temperature decreases, eventually you will freeze in here. You will stop in a state because to move from each state to the next state, you will need some energy to get over a barrier too, right? But eventually you're gonna be happy that you're so low. You might only be in the second or third lowest state, but that's pretty much fine because there is still a substantial barrier and there's very little energy to gain from going to the next one. While in other cases, you start going down, down, down, but as the temperature gets lower and lower, suddenly you're able to jump to one really low well-defined state that's defined by a fairly large energy gap to the next one. And this is a bit of hand-waving, but if you look at the previous slide, what do you think characterizes those well-defined proteins? The native states there. And that's usually that you have a clear energy gap. You have one state that's much better than the other ones. When you jump to that state, you have folded and then you stay in that state because there would be a large barrier to go to the denatured state, and, well, you would need a lot of energy to cross the barrier but you would also lose a lot of free energy by going to the denatured state. While conversely in this case, well, here it would be pretty smooth, right? You could move back and forth, but the lowest state here is not really particularly unique compared to the others. So this is the type of energy spectrum and behavior you would get in a random sequence. If you just drop the temperature far enough, and far enough might be zero Kelvin, any protein sequence will stop moving, but it's not gonna have a unique state. It will just be folded into some sort of globule and eventually it will not move anymore.
Exactly which state it is doesn't really matter, but a few sequences have these large energy gaps so that they have one state that's much more stable than all the other ones, and they are the ones that form proteins. You know what it is that defines these things. So what is it that defines them? Yes. This is a plain Boltzmann distribution because here we have equilibrium. They move between these alternative conformations. So there's gonna be some energy here divided by kT, and this is actually not the normal temperature either, but this is gonna be some sort of energy scale that describes how large the gaps are in this energy diagram. In a real protein, what does that depend on? What is that determined by in a real protein? Simpler than you think. Sorry? Well, yes, you could argue that. Take a step back. What is that determined by? How good your hydrogen bonds are? Yes, it's the amino acid sequence, right? Some amino acid sequences will have good properties, other amino acid sequences will have bad properties, so how large this energy gap is is somehow gonna depend on your amino acid sequence. For now you have to take my word for it that very few structures have these super low energies, and in good proteins, in real proteins, the lowest energy structure seems to be very clearly separated from the others by an energy gap, and if that energy gap is significantly larger than kT there will be a clear energy barrier. Why? What would happen if it was roughly the same size as kT or smaller? It's the opposite. If the energy gap, if the barrier is significantly smaller than kT, right? It wouldn't really be a barrier. You could go back and forth over it and then we would be back in this thing where you would gradually unfold when, well, if you walked past a window and it was 25 degrees in the sun you would start to unfold a bit, and that would be a bit bad.
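To see why the gap has to be compared with kT, it helps to put numbers into the plain Boltzmann distribution for just two states. This is a minimal sketch that assumes exactly two non-degenerate states in equilibrium, with the gap measured in units of kT; it says nothing about kinetics or about how many denatured microstates there are.

```python
import math


def lower_state_population(gap_in_kT):
    """Equilibrium population of the lower of two states separated by a gap."""
    return 1.0 / (1.0 + math.exp(-gap_in_kT))


for gap in (0.5, 1.0, 5.0, 10.0, 20.0):
    p = lower_state_population(gap)
    print(f"gap = {gap:4.1f} kT  lower state: {p:.6f}  upper state: {1.0 - p:.2e}")
```

For a gap around kT the two states are both heavily populated, so the system wanders back and forth; once the gap is ten or twenty kT the upper state is essentially never visited at equilibrium.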
So we know that, because there actually is a barrier experimentally, it has to be significantly larger than kT. This is not valid for polymers in general, and that's why, if you heat plastic, it will gradually melt. So somehow the size of this energy barrier is gonna be related to the protein, and it actually turns out that's not valid for random polypeptides either. It's only real proteins that have this property that there's one clear state that's much better than all the others and there's an energy gap. The cool thing is that we can measure these energies for proteins, for real proteins. And it's a bit complicated, but remember you have plots like these and you're basically measuring the temperature as you're going down here. At what temperature does the protein fold? When does a typical protein fold? Think of this in another way. When does a typical protein unfold? Yes. Well, if it was 300 Kelvin you would have problems on warm summer days, right? How much higher? Yeah, so it's not 400 Kelvin, it's fairly close. So it's a bit higher, and you can actually show that these energies are a bit higher than kT but not astronomically much higher. And you typically talk about these temperatures as vitrification or folding temperatures. I'm not going to go into details there, but the point here is that we can measure what these energies are for real proteins and suddenly we can start to estimate roughly how large these energy gaps must be. And if you know how large this energy gap is, that gets you something else. What? No, because we don't know. We don't have the prefactor in the kinetics. That's another three, four slides, so we'll get to that, but if we know what the energy gap is we can start to at least guess what the energy barrier has to be, right?
And if we know what the energy barrier is, we can start to relate that to how probable it is that a random sequence will become a protein. Yes. So the probability of something folding is determined by roughly the delta, the energy gap size, divided by this vitrification temperature. And if this is in the ballpark of 10 or 20 times larger, it's going to turn out that the probability for a random sequence of folding is roughly 10 to the minus eight. This is a bit of hand waving, I'm well aware of that, but the point is it's not going to be a billion times larger than that and it's not going to be two times higher. Is that reasonable? Well, I'm not going to actually get to prove it, but you can think of this another way. Here we're saying that this would correspond to a unique state, so if we assume the opposite, what is the probability of having, say, two unique states? We would have to square that; for three unique states we would have to cube it. The probability for a random sequence to have one state is small; the probability for a sequence to have multiple stable states, now we're getting into the ridiculous part here, right? But the funny thing is that we actually see this now and then, for prions. And now again we talk very much hand waving, but I would argue that the fraction of proteins where you actually do see that probably corresponds to, say, one in a million or something. And those proteins are exceptions, but our idea that a protein should always adopt one native state, and that this is always the lowest free energy state, is just based on probability. By far the most common thing is that you don't have any stable state at all. Once in 10 to the power of eight you actually do have a neatly defined stable state. Those are the sequences that will evolve in your bodies.
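The numbers quoted above can be reproduced with a few lines of arithmetic. The sketch below rests on an assumption that is not spelled out in the spoken text: it reads the estimate as an exponential, p ≈ exp(-ΔE / kT_c), where ΔE is the energy gap and T_c the vitrification temperature. Under that reading a ratio between 10 and 20 gives probabilities between roughly 10^-4 and 10^-9, and about 18 gives 10^-8. The squaring and cubing for two or three unique states simply treats them as independent, as in the argument above.

```python
import math


def fraction_with_gap(gap_over_kTc):
    """Assumed form p ~ exp(-gap / (k * T_c)); not derived here."""
    return math.exp(-gap_over_kTc)


def fraction_with_n_states(p_single, n_states):
    """Independent-events estimate for a sequence with n unique stable states."""
    return p_single ** n_states


for ratio in (10.0, 18.4, 20.0):
    print(f"gap / kT_c = {ratio:4.1f}  ->  p ~ {fraction_with_gap(ratio):.1e}")

p = 1e-8  # ballpark value used in the lecture
for n in (1, 2, 3):
    print(f"{n} unique state(s): p ~ {fraction_with_n_states(p, n):.0e}")
```

The independent-events estimate for two states (10^-16 of random sequences) is a statement about random sequences, so it is not directly comparable to the rough one-in-a-million figure, which refers to the fraction of real proteins.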
Any other sequence, well, those mutations happen of course, but they're never ever gonna result in any viable organisms. They will not live if those proteins are important. And then just now and then we actually have proteins that have two relatively stable states. There is nothing wrong with that. It's not a contradiction of physics or anything. It leads to some problems biologically, but it fulfills all the relations we've seen. It doesn't contradict Anfinsen or anything. It's just that at the time we hadn't really observed proteins like that, so we didn't think of it. It could, and that might very well be what we have occasionally seen, right? There are certainly strong hereditary parts to lots of amyloid diseases and things like that. We know very little about it, but this definitely depends on the sequence, so sure, it definitely happens that mutations will make this worse, but on the other hand you should also be aware that we're talking about probabilities here. If you make a random mutation in a protein, the likelihood of stabilizing a second native state is ridiculously small compared to the likelihood of destabilizing the one native state you have. So in terms of mutations, the main thing to worry about is that most mutations will destroy your protein because it's not gonna be good. It won't survive through natural selection. This is really the fun but extremely rare situation, fun in a scientific way; they're horrible diseases. Trial and error? So there are two parts to that. The first part is, of course, that nature has had 4.3 billion years. The second part is, how does evolution happen? How does sequence mutation happen? Because that's one of the conclusions here, that if this really happened randomly, life would be pretty much impossible. 99.99% of all babies would be stillborn or die very young. Because we always have mutations. Sorry, you're all mutated.
So why are most of you alive? No? Well, I'm not sure about your family, but I was not the six billionth child born in my family. Remember what I said? How does most of this gene mixing happen? Yes, nature does do trial and error, but nature does not do trial and error one amino acid at a time, usually. There are different processes. Some of these processes are random on the single amino acid level, but most of the trial and error in mixing happens one domain at a time. And this is the beautiful thing. If the domain is stable in protein A, it's almost certainly gonna be stable as part of protein B too. So nature kind of cheats, which is good. And that, I think, brings us very naturally to the next part, real protein synthesis, how this happens in vivo. I would argue that most of this is true. There are some interesting things we can actually measure in the cell and see that it's a bit more complicated, but not really a whole lot more. Protein synthesis really starts with DNA. And the first thing that happens with DNA, well, most of these structures were not solved when they first printed the book and that's why I've included them. This beautiful protein here is DNA polymerase. What type of protein is this? Yes, why is it an enzyme? Exactly, right? So that you copy DNA to RNA at the start of this, you have DNA. At the end of it, you still have your DNA but you have an RNA. Well, the RNA molecule existed prior to this too. But suddenly the RNA is now polymerized in a way so that it contains a copy of the DNA. The entire protein here, we don't change the protein. The protein is reused. So the protein itself is not consumed, but this protein helps speed up a reaction that would otherwise take forever, because it takes a long time to unbind two strands and make sure that this happens, et cetera. Insanely beautiful molecule. Makes less than one error in a billion bases.
And it's something like, if you're hearing about nanotechnology, nanotechnology is very popular nowadays. So when you hear people talk about nano machines or something, how large are they? Sorry? Well, be very simple. When you hear about nanotechnology or somebody talks about a nano robot or something, roughly how large are those things? Yes, you might think that they're nanometers. I would argue that the typical threshold where people start talking about nanotechnology is when things are 0.999 micrometers. They're just smaller than one micrometer. These are nano machines that really are 10 nanometers. So they're about a hundred times smaller than what people mean when they talk about nanotechnology. This is true nanotechnology. Insanely accurate. And again, remember your simulations? The type of noise you get in the simulation on the molecular level? It's kind of insane to be able to use things that noisy and only get an error once in a billion. But that's kind of sloppy. Why do you get an error once in a billion? Can't nature do better? Why would it be advantageous to make errors? Well, no, I wouldn't say that; the main reason why you evolve is the gene mixing during mating and everything. For a second, let's assume that this really is an error. Errors are not good by definition. The problem is, of course, that error control costs not money, but energy, right? I bet nature could do this better, but it would cost more energy. So at this point, there is a balance, right? It would be trivial to get this to one in a quadrillion bases, but suddenly you might spend 10 times more energy on this proofreading. On the other hand, you don't want too many errors either, because when you make an error, you've produced an entire protein that you need to degrade and that also costs energy. And Måns Ehrenberg and others in Uppsala have actually shown that nature, the body, your cells have pretty much optimized this.
So this is optimized for your entire cell's life to consume as little energy as possible. You make the errors you have to make because it would be too expensive to be perfect. Well, so that's an entirely different process, right? That has to do with the degradation of DNA. So this is just one small process. So this just has to do with the transcription when creating proteins, but in general, again, there are errors in nature, right? We get cancer and everything. It would be the same there, I would argue. I'm not sure whether people have shown that, but there too, I would argue it's too expensive to be perfect, which might sound horrible, but the alternative would be that you would have to eat 10 times as much per day and your lifespan would likely be a tenth of what it is because your cells would wear out and everything, right? I kind of prefer to live to 100 but have some errors, rather than die at eight to 10. There is a very cool thing with this molecule. Could you use this for something? Yes, what's PCR? You talked about it in the course? Well, in previous courses in the program. So what's PCR? What do you do with it? You create copies of genes, essentially, right? So how would you create a copy of a gene? I have to confess, that's a really good answer. But how would you do this? Now, of course, you would like this process to be as fast as possible and everything, right? And this, of course, is not an instantaneous process. So if you would like this to be a fast and efficient process and everything, how would you speed it up? So let's take this one step at a time. So you start, let's say, at room temperature or something, and then you do what? Heat it up, good, and what happens now? Well, that's part of it; most chemical processes go faster at higher temperature, right? So you want higher temperature for the process to go faster. And what do you then do? Okay, and what do you then do?
Yes, you have to keep adding enzymes, right? Why do you have to keep adding enzymes? You just denatured the DNA polymerase, too. And this is the problem with this; it didn't work. But yes, so this is the key thing. You then find DNA polymerases that are thermostable, from exactly the type of bacteria we spoke about last lecture, that live in, say, geysers or volcanoes or something and are stable at 80 degrees centigrade. So the cool thing, well, sorry, you don't get a Nobel Prize for the discovery that a chemical process goes faster at higher temperature. The beautiful thing that Kary Mullis showed was that they found this thermostable DNA polymerase. So you just add the enzyme once and then just keep cycling, and it's kind of automatic. And that's the awesome part. So where did they get the Taq from? Yes, where did they get that bacterium from? Yes, wasn't it even in Yellowstone? Do you know how much the National Park Service got for that? Zip, zero, nada, nil. And they were actually seriously pissed off. These patents have generated hundreds of billions of dollars of revenue to the universities of California, in particular, and a bunch of companies. So now, I think there are new rules in the National Park Service that people are not allowed to use material, and if they use it, the Park Service wants part of the revenue. So what usually happens with discoveries like this is that there are research collaborations, there are a bunch of groups involved. They have one great publication together, a Nobel Prize, and then the universities keep suing each other for the next 30 years because there is so much money involved. It's an insane amount of money involved in that. The next part is that you need to copy the DNA into RNA, messenger RNA actually, but right now we don't care about that. This is an even more amazing molecule. This is also a Nobel Prize, in this case to Roger Kornberg.
He probably got the prize for one paper describing the structure of the RNA polymerase. RNA polymerase II even. I was a student, well, a postdoc actually, in that department in 2001, and that was about the time when they published the papers. Of course, everybody knew that Roger was a great scientist, right? What you should be aware of with all these Nobel Prizes is that, first, the Nobel Prizes have a tendency to rewrite history, for better or worse. Don't get me wrong, I think these are extremely well deserved prizes, but it's in hindsight that it's very obvious what an amazing invention they were. Had you asked people around the turn of the century, they would have said, yeah, that was one great paper out of many, and it's very hard to spot these things going forward. But in hindsight, it's an amazing part of the replication machinery. What happens after the messenger RNA? Well, you get to the ribosome. And you have some transfer RNA here too that I'm skipping, sorry. And the ribosome takes the RNA, and the transfer RNA that now has amino acids bound to it, assembles these amino acids into a long chain, and eventually the protein will exit through the exit tunnel. This too is a protein where we knew roughly what the shape was. Remember those membrane protein plots I showed you, where there was a small and a large subunit? We knew that there was something. The exact structure of this was determined, well, there were a couple of, oh, 2001 it even says, Nobel 2009. The reason for many of these prizes is that it's insanely difficult to determine atomic resolution structures of these proteins. You can just see how large they are, right? And you need to track pretty much every single atom, where they are. Today a large part of this, in particular for the ribosomes, is done with cryo-EM. And that's the facility you're gonna visit at 2 p.m. this afternoon.
And Alexey Amunts, whom we have recruited here, has been a postdoc with Venki Ramakrishnan, who was actually also involved and got a Nobel Prize for his ribosome work. The scary or fun thing is that in vivo this folding takes seconds to minutes, because you need to gradually produce the protein and this chain is coming out, residue by residue by residue, in the exit tunnel. With modern cryo-EM experiments we can actually see that the protein starts to fold inside the exit tunnel already. And exactly how this happens, we don't know yet. There's an amazing amount of really fun science here. Partly. So it's some sort of in-between environment, right? Because a membrane protein also has to fold this way, and then it's attached to the translocon, and if this sounds fuzzy it's because it is fuzzy; we don't quite know yet. I would argue that people do try to do it. It's easy: there's a structure of this, and you can of course put the structure in a computer. You can add a membrane too, you can add the DNA. Then you push the start button and then you come back a week later and it's gonna look exactly like when you started. We can run the simulation. It's not a problem to simulate a larger system, but the time scale is still gonna be like a nanosecond or 10 nanoseconds. There's nothing that happens on that scale in, say, an entire cell. So the problem is not that you can't do it, but that you can't really reach the time scales that you would need to do everything. And I would also argue, if you're gonna study something that takes a minute you probably don't want the atomic detail; it's irrelevant. So choose a different level for your model.
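The time-scale gap can be made concrete with one line of arithmetic. The sketch below assumes a 2 femtosecond integration step, which is a common choice in atomistic molecular dynamics but is not stated in the lecture; the function name and the list of processes are made up for illustration.

```python
def md_steps_needed(process_time_s, timestep_s=2e-15):
    """Number of integration steps needed to cover a process of a given duration."""
    return process_time_s / timestep_s


processes = [
    ("10 ns of simulation", 1e-8),
    ("1 ms folding event", 1e-3),
    ("1 minute of co-translational folding", 60.0),
]
for label, seconds in processes:
    print(f"{label:40s} {md_steps_needed(seconds):.1e} steps")
```

Under this assumption, covering a minute takes several billion times more steps than a 10 nanosecond run, which is the reason for choosing a coarser model over brute force.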
The reason we can show that is that there are a bunch of fun experiments. There are some proteins, for example luciferase, a protein that is autofluorescent so that you see light emission from it. And when you start this process you can see the amount of fluorescence: nothing happens, nothing happens, and then it starts to increase, because this is after a couple of minutes when the protein has started to form, right? And as you allow this to go on, you will get more and more and more fluorescence because we get more protein. The second I stop the synthesis, well, there's still gonna be a bunch of protein in the pipeline going through the ribosome and everything, but the second I stop it, I also stop the fluorescence from increasing. That's how we can see what the time scales of this are. So this is a very slow protein, 25 minutes? No, that's not it; remember, this curve goes up as we increase the total amount of protein, it's not the time for an individual protein. This is probably a protein that might take a minute or something to fold. So the problem with real folding in vivo is that you don't have this simple picture where first you produce the entire chain, and then the chain is in water, and then the chain decides to fold or not to fold. You have so-called co-translational folding, so as you're translating this into protein, the first part of the protein is starting to fold as it's coming out. And that's kind of in contrast to Anfinsen a bit. These are things that Anfinsen didn't really study. Anfinsen just studied globular proteins when they are denatured or renatured without the ribosome.
There are a bunch of proteins, in particular the really large ones, where this won't work, because they're large and hydrophobic. Sadly enough, they're gonna be so hydrophobic that while we are translating them, they're gonna collapse and stick together before the entire protein has formed and before we've had the chance to fold it. When I was about your age, that was pretty much where we were in research. People didn't really know; there were some ideas about how this folding happened. The only thing they had realized was that there are some types of proteins that only manage to fold in a cell. And what people discovered a couple of years later, well, roughly at the time I was a student, is that there are these big chaperone molecules. These are insanely large protein molecules. Do you see there? But you see that they also consist of a number of different subunits, right? Small domains here. These are literally cages that can open and bind a protein on the inside, in a large cavity that is relatively hydrophobic. So they take these misfolded proteins; it's not that the sequence is incorrect. But these fairly large hydrophobic sequences then get a chance to unfold here, and then they can gradually fold in a slightly hydrophobic environment. So these are also enzymes, or catalysts, that help the protein fold. In this case, some proteins get stuck on the way unless you have this environment to help them, because there is some sort of intermediate state that is bad, or rather too good: one with low free energy, or one surrounded by very large barriers. These use ATP, because at some point you need to open some sort of lid or passage here. You need to change the conformation to help things bind, and then eventually you need to release them again. And this is of course the reason why nature, the body, your cells try to avoid this. It costs more energy.
For some very large and complicated proteins we might need this for them to fold, and then we don't have a choice; then you need machinery to help them fold. But if we can avoid it, it's much better. It's gonna be way more efficient to fold small things without it. But this is why real protein folding occasionally won't work outside of the cell. And there are many more examples like that. So that kind of brings us back a little bit to Levinthal's paradox, right? How can we decide when a protein folds in realistic times? We had these hundred residues we spoke about. With two or three conformations per residue, that would be at least two to the power of 100 conformations, and trying them all would take more than a billion years. Not gonna happen. So we kind of have one possibility, and you probably already have it on the handout, so I can show it. Could it be that this co-translational folding solves it? Because rather than having the entire chain try everything, this is beautiful: the chain is synthesized starting with the N-terminus, which comes out first, and then you fold one residue after another. You don't have to try all possible combinations. That's a beautiful idea. There's only one problem with it, and that's that it's not true. But apart from that, it's a really beautiful idea. Proteins don't fold starting from the N-terminus. Sorry. So we can kind of forget about that, but I still want to mention it, because I think it's an obvious good idea, and I think one should not be afraid of those. And of course, people have tried this. People have done experiments: what happens if you remove the N-terminus, does the protein no longer fold, or does anything else happen? There is experimental evidence to the contrary. It kind of makes sense when we come to this. It would make a lot of sense. With sheets? No, it wouldn't. I'm just saying that it doesn't happen. Yes. You can't pull the sheets apart. But it's still a good idea. It would solve Levinthal's paradox, right?
The only problem is that it doesn't. So I think the less we say about it, the better. But when we first introduced Levinthal, we spoke about thermodynamics. We always talked about the native state. We didn't talk about kinetics at all. Based on what you know now, Levinthal's ideas are much more profound than we thought. Because from one point of view, you could argue that Levinthal said that Anfinsen was wrong. I wouldn't quite formulate it that way, but what Levinthal says is that what determines whether a polypeptide is a real protein is not just free energy, right? It's whether we can find the native state. Can we find it fast enough? If it would take a billion years to find the native state, and that's kind of related to all these protein sequences that would never fold, right, then it's not going to be a protein. So this paradox is not just a paradox. It might actually contain a very important truth: there are some classes of molecules that apparently manage to find a smooth and smart way to reach a folded state. That happens once in a billion. Those are the polypeptides we call proteins. The other ones are just polypeptides, which is of course very different from how we first introduced Levinthal's paradox. So from that point of view, the native state would not necessarily be the lowest free energy minimum; the primary thing is that it's an easily accessible free energy minimum, one we can get to without gigantic barriers. Do you think that's true or not? Or maybe we can separate it into two things. We can't say whether it's the lowest or not, but can you say something about these energy barriers? How large are the energy barriers in protein folding? Yes, but they can't be gigantic, right? Because if the energy barriers were gigantic, we would have the Levinthal's paradox problem that folding would take the age of the universe. So energy barriers cannot be gigantic, period. Then we would not have proteins.
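The two arguments above, that an exhaustive search takes far too long and that barriers cannot be gigantic, can be made concrete with a small numerical sketch. The sampling rate of 1e13 conformations per second and the rate prefactor of 1e6 per second are illustrative assumptions of mine, not values from the lecture, and the simple exponential rate expression k = k0 exp(-barrier/RT) is used only to show the trend.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9   # approximate value, for comparison only
R_KJ_PER_MOL_K = 8.314e-3        # gas constant in kJ/(mol K)


def levinthal_search_years(n_residues, conformations_per_residue=2,
                           samples_per_second=1e13):
    """Years needed to visit every chain conformation exactly once.

    ASSUMPTION: samples_per_second is an illustrative sampling rate.
    """
    total_conformations = conformations_per_residue ** n_residues
    return total_conformations / samples_per_second / SECONDS_PER_YEAR


def barrier_crossing_seconds(barrier_kj_per_mol, temperature_k=298.0,
                             attempts_per_second=1e6):
    """Mean waiting time 1/k for a rate k = k0 * exp(-barrier / RT).

    ASSUMPTION: attempts_per_second (k0) is an illustrative prefactor.
    """
    rt = R_KJ_PER_MOL_K * temperature_k
    rate = attempts_per_second * math.exp(-barrier_kj_per_mol / rt)
    return 1.0 / rate


print(f"Exhaustive search, 100 residues, 2 states each: "
      f"{levinthal_search_years(100):.1e} years")
for barrier in (20, 60, 100, 150):
    years = barrier_crossing_seconds(barrier) / SECONDS_PER_YEAR
    print(f"Barrier of {barrier:>3} kJ/mol: {years:.1e} years")
```

With these assumed numbers, the exhaustive search for 100 residues comes out at billions of years, a modest barrier is crossed in well under a second, and a barrier of 150 kJ/mol would take longer than the age of the universe. The exact values depend entirely on the assumed prefactors; the exponential dependence on barrier height is the point.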
So the question is not whether we have energy barriers or not. We know we can't have gigantic energy barriers. The only question is: why are they so low? And there are of course tons of different reasons why they could be low. You can come up with a bunch of different models. Saying that folding starts at the N-terminus is one possibility; that's a model that's incorrect. It's not a bad model, it's just one that doesn't work. I'm gonna show you three models that are slightly more in tune with modern folding. By far the simplest one is called diffusion collision, or the framework model. I think the book likes to call it framework. We always call it diffusion collision. And that is somewhat related to what you just said, at least for alpha helices: we first form simple secondary structures, and then these secondary structures pack together and form larger things. Super simple, beautiful model. We have even shown in Folding@home, and I'll try to show you some of these results, this was 15 years ago, that for some very small proteins this is true in simulations. You first form the helix and then the sheets. Oh, I might even have shown you a simulation like that a few lectures ago. For some very small proteins, this is the case. And the way to show that this is the case is really that the probability that this helix is formed is independent of the probability that that helix is formed, and that's independent of the probability that the sheet is formed. They form independently and then aggregate. Beautiful model, simple. It's not generally true for larger proteins, but occasionally for small proteins. You could also imagine having some sort of hydrophobic collapse, right? I already hinted at that earlier today: all the hydrophobic stuff mixes into something completely molten, and then the secondary structure and everything else form. Here too, I bet there are some proteins that undergo that.
I would argue that of the three models I'm gonna present to you, this is the least likely one. Why? There is a problem with this model. Well, in part that's correct, but I would even say experimentally: what is it that you don't have here in this model? There is no secondary structure whatsoever here, right? And if you look at this in CD spectroscopy and NMR experiments, there appears to be a bit of secondary structure in the molten globule. So it's not a bad model, but it's, well, so-so. And the third model is actually the one that's the best. That's called nucleation condensation. It's kind of like this hydrophobic collapse in that the entire chain starts by forming something molten, but then, just like when you form crystals or ice, when something starts to freeze, you start to have a few interactions here that are good. You might also start to have some secondary structure, and this is the nucleus. You have some sort of nucleus, just like you have a nucleation process when you form ice. And then things condense, in the same way they would condense in physics. So then you add more residues to your helices. The beta sheets become longer. You simply gradually grow the size of this core region, and eventually it forms a large protein. This is all based on hand-waving for now. So how do I know that this is kind of a good model? No, you can't see this. All we can see in CD spectroscopy is that there is some sort of secondary structure. This is much harder. Yes, and now you went way too far. It's completely correct, but if we take this one step at a time, we're gonna need to understand something about the states, right? We're gonna need to get some information about the intermediate folding states, the transition states or something on the way. So that's great. Can we just crystallize them? Why?
They're not just unstable. These are the states that balance on the edge of a knife, right? It's not just that it's less likely to see them. You will never see a transition state. It's the most unstable state you can imagine. So to understand this, we need some information about the transition states, but you can never ever see the transition states. So what we're gonna do next Monday is try to do this indirectly: can you somehow mutate a residue and understand whether you stabilize or destabilize these states? And there are some pretty cool experiments you can do that way. Then you can actually show that very small proteins usually fold by diffusion collision. Larger proteins always fold by nucleation condensation. And this actually explains Levinthal's paradox too, but that's for next week. One question, please. In this pipeline, where did you say that this...? So that's a very good question. I would just say that the rapid step, the hydrophobic collapse, is always rapid, right? Because it's gigantically downhill and you gain so much entropy. Right now, we're still gonna need to see why this is so. But you're right, I certainly haven't proven that it has to be slow. [Student question, partly inaudible, about which native contacts get locked in and what it takes to continue from there.] Now, it's a good question. Remember what I said about the molten globule, though: what we realized experimentally was that there was no real transition between these two states. That was a very gradual transition. What we have here, some sort of intermediate state, is actually not a molten globule but a transition state. There is something that starts. There's always a first step to something. So the question is, forget about the speeds here.
At some point we're gonna need to start, and then we're gonna need to grow. So the question is: what is fast, what is slow? Is it the starting that is slow, or is it the growing that is slow? For now, forget about the speeds. We'll talk about that on Monday. We have no idea for now. This is just a model, and you don't even know whether it's true yet. So sorry, say that again. So are you talking about prions now? What do you mean? Well, there's a second, even lower state, yes. So right now, remember that's an exception. That's one in a billion. When I talk about folding here, the first thing that Levinthal said was really that for folding to happen, it must be fast enough that we actually observe it. You could always imagine that from whatever state we have here, there could be an even higher barrier to something that would take almost forever to form. Kinetically, I would say that's irrelevant to a first approximation, because if this takes so long to happen that you never see it, well, then by definition, we never see it. Even if this happens once in 10 to the power of eight, we still don't really care to a first approximation. Now, the problem in nature is of course that if you wanna go slightly further than the first approximation, we do have these prions, and occasionally they do appear to cause disease. That is not the central topic of protein folding. It's an extremely rare side condition. What can happen then is that when you have these plaques growing, under some conditions it can become slightly more common. But for normal protein folding, the big question is how we get our protein to fold at all. Whether there happens to be an additional state that we can reach once in a blue moon is not really relevant when we study normal protein folding. For those folding models, we have absolutely no idea. It's a super good question. Jesus, it's a super good question. I don't know, I don't know.
I don't think anybody knows. Because at some point, you're gonna have a stable protein, right? If that fold has to change into another fold, the first fold somehow has to be destabilized, and then you have to stabilize the other fold. Remember the prion structure: we saw that you had an alpha helix that interconverted into a beta sheet, or at least a coil that interconverted into a beta sheet. So there too, you're gonna need to at least locally unfold the protein and unpack it, which is gonna be a barrier. That's bad. And then you're gonna need to get those beta sheets to form. My hunch would be that it's something like this, but I would bet nobody knows. People haven't answered this yet. That's why it's called research. It might change a bit, but I'm gonna argue, and again, this is now hand-waving, I can prove this on Monday: I would argue that this is a very well-defined nucleus. You know it's the same residues. This is not just a mixed bag of things that happened to be around there. You know that this is residue 14, 29, 72, and 68. It's exactly the same residues that always initiate this nucleus. It's not random. And how do you imagine we can show that? There are gonna be some really cool ways, and that's these plots; I wanted to go into them today. However, the last few slides I'll leave until Monday. There are some really smart ways we can invent so that we can experimentally measure: if I mutate a specific residue, does this stabilize or destabilize the transition state? Not the folded state, but the transition state. That way you can show which specific residues are involved in this transition state, and it's usually just a handful. That's from here. Mm-hmm. The definition of nucleation: the first step in the formation of either a new phase or a new structure. Yes. Because I was talking about the structure.
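The lecture defers the details of this mutation experiment to Monday, so the following is only a sketch of one common way to turn such measurements into a number, usually called a phi value: the change in transition-state stability divided by the change in folded-state stability. The mutant names, rate constants, and stability changes below are entirely hypothetical, and estimating the transition-state change from the ratio of folding rates is an assumption of this sketch, not something stated in the lecture.

```python
import math

R_KJ_PER_MOL_K = 8.314e-3  # gas constant in kJ/(mol K)


def phi_value(k_fold_wt, k_fold_mut, ddg_native_kj_per_mol,
              temperature_k=298.0):
    """Ratio of transition-state to native-state destabilization.

    The transition-state change is estimated from folding rates as
    RT * ln(k_wt / k_mut). ddg_native_kj_per_mol is how much the mutation
    destabilizes the folded state relative to the unfolded state.
    """
    if ddg_native_kj_per_mol == 0:
        raise ValueError("the mutation must change native-state stability")
    rt = R_KJ_PER_MOL_K * temperature_k
    ddg_transition_state = rt * math.log(k_fold_wt / k_fold_mut)
    return ddg_transition_state / ddg_native_kj_per_mol


# HYPOTHETICAL data: (wild-type rate, mutant rate, native destabilization).
mutants = {
    "mutant_A": (100.0, 98.0, 6.0),   # folding rate barely changes
    "mutant_B": (100.0, 10.0, 6.0),   # folding becomes ten times slower
}
for name, (k_wt, k_mut, ddg) in mutants.items():
    print(f"{name}: phi = {phi_value(k_wt, k_mut, ddg):.2f}")
```

A value near 1 would suggest that the mutated residue already makes its native-like contacts in the transition state, so it is a candidate for the nucleus; a value near 0 would suggest it only packs in afterwards.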
So nucleation in general is just a physical term we use. Say, for example, when water freezes into ice. First you have all these waters. If you have water and the temperature instantly drops to minus 10 degrees centigrade, what's gonna happen? If you're gonna run an MD simulation, try that. Set the temperature to 260 Kelvin or something. Nothing is gonna happen immediately. The water is gonna move slightly less, because there's less kinetic energy, but nothing happens. It's a highly disordered phase. It takes time. And you can even show this experimentally; it's called supercooled water. It can survive for seconds if you don't touch the system, if it's not disturbed. So it's really still kind of balancing on the edge of a knife. And there's a beautiful example of this. Have you ever seen these heat pads? The ones you can reheat in the microwave oven: you just click a small piece of metal in them, and then they generate heat. You can buy them in a shop for $5 or so. As they become warm, they also solidify, and you can put them in a microwave oven and reheat them, then they become liquid again, and you can reuse them a thousand times. So what happens here? Here you have a liquid that really wants to be frozen at room temperature. But this is a supercooled liquid, in the sense that it's gonna be a very, very slow process to actually initiate the freezing, because you need some sort of starting crystal. It's not gonna freeze. The second you have a crystal, it's gonna freeze fairly quickly. But since there is no place to start, it's not freezing. So what you do when you flip this metal thing in it is that you somehow initiate a process that will cause this thing to freeze.
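Why a supercooled liquid needs a starting crystal can be illustrated with the textbook picture from classical nucleation theory, which is my addition here and not part of the lecture: a small solid cluster pays a surface cost that grows with the square of its radius and gains a bulk free energy that grows with the cube. The parameter values below are arbitrary units chosen for illustration, and this describes freezing, not proteins.

```python
import math


def cluster_free_energy(radius, surface_tension, bulk_gain_per_volume):
    """Free energy of a spherical solid cluster in a supercooled liquid.

    The surface term costs free energy, the volume term gains it.
    """
    surface_cost = 4.0 * math.pi * radius**2 * surface_tension
    volume_gain = (4.0 / 3.0) * math.pi * radius**3 * bulk_gain_per_volume
    return surface_cost - volume_gain


def critical_radius(surface_tension, bulk_gain_per_volume):
    """Radius at which the cluster free energy has its maximum."""
    return 2.0 * surface_tension / bulk_gain_per_volume


def nucleation_barrier(surface_tension, bulk_gain_per_volume):
    """Height of the free energy maximum at the critical radius."""
    return (16.0 * math.pi * surface_tension**3
            / (3.0 * bulk_gain_per_volume**2))


# ASSUMPTION: arbitrary illustrative units, not measured values.
GAMMA, DELTA_G = 1.0, 2.0
r_star = critical_radius(GAMMA, DELTA_G)
for factor in (0.5, 1.0, 1.5, 2.0):
    r = factor * r_star
    print(f"r = {factor:.1f} r*: dG = "
          f"{cluster_free_energy(r, GAMMA, DELTA_G):+.2f}")
```

Clusters smaller than the critical radius tend to shrink again, while larger ones grow downhill in free energy, which matches the observation that the liquid freezes quickly once a seed is present.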
And this really sounds strange: why on earth does something become hot when it freezes? Well, when you freeze something, you release energy, right? And this is the energy that you feel in your hands. It becomes warm. And that's why it's solid at the end. It has frozen. This is pretty much the same thing: you have some sort of core that you start adding things to. Exactly what the core looks like and exactly how this process happens is different between the protein and the physics example. But nucleation just means that there has to be a core, a nucleus, somewhere where it starts. Lots of good questions. You know what, it's noon. I'll just show you one more slide and then we'll finish here. There are gonna be two things that we're gonna talk a lot about on Monday, and this is where we're gonna end all the thermodynamics and kinetics. We have folding intermediates and we have a transition state. Large proteins, particularly these really large proteins, will typically go through folding in several states. That means you can have state one, state two, state three, and they can stop in a few places on the way, in particular these gigantic things. It's not that you start with a thousand residues and boom, then you have a ribosome. So large proteins frequently need to fold in sequential steps. The thing we had here on the last slide was slightly different. This is really a transition state in the thermodynamic sense, this balancing on the edge of a knife. The difference is that these folding intermediates you can capture. We can determine structures of them. Even if you have things like an ion channel opening or closing, this is something we're very interested in: can I trap the system halfway, so I see what it looks like? That you can occasionally do with a disulfide bridge. Even David spoke about this today, right? You can occasionally trap it in one state or another.
That's not trivial, but possible to do with experimental methods. Getting a transition state is gonna be virtually hopeless. There is no way you can do this structurally. So here we're gonna need to be smart. Here we're gonna use mathematics to find ways to learn about the transition state without actually seeing it. Nobody has ever seen these states. You can't see them, by definition. I will skip the last slides on experimental rates and everything. Don't worry, we're gonna talk a lot about that on Monday. And there's actually a better definition too: forget about chapter 19 entirely. We just talked about chapters 17 and 18, and that also means that all these plots and everything at the end, forget about them. We're gonna talk about them on Monday. So the plans today are the following. One hour from now, Tom Cheatham is gonna give a seminar, and he's the opponent for the thesis tomorrow. If you wanna go to the thesis defense tomorrow, it's entirely voluntary, but if you wanna have an idea what these look like, I will send a mail on this biophysical chemistry list. Do all of you get the mails? That's a bad question, because if somebody didn't, they wouldn't know. Well, I do occasionally send mails. This is gonna be at KTH main campus, room F3. I will send a link to that. The one thing to remember: be on time, because for dissertations we typically lock the doors. In a few cases one person can slip in, but you really don't want 15 or 20 people showing up late. If you're one minute late or so, it's fine, but if you're really late, don't come in, because otherwise people are gonna start getting a bit irritated. So be on time if you wanna be there. And I would expect that this is gonna continue until roughly 12:30 or so. You can't say for sure, because the opponent can keep asking questions for as long as he wants. This is one thing. Actually, the only thing we can do is that we can take a break for dinner.
But then the opponent is allowed to continue after dinner. Don't worry, it's gonna be over by 1 p.m. It starts at 10 sharp tomorrow, so be there 10 minutes before 10. Shoot me, but I'll send you a link to the location. And then after Tom Cheatham's lecture, Björn and Dari will take you down to have a look at the cryo-EM facility together with Marta.