Today's topic continues the protein studies we did yesterday, but I'm going to focus more on membrane proteins. Before we dive into the membrane protein world, let's talk about the Wednesday high, it should be Tuesday. No, sorry. Yes, okay, it is. Conceptual differences between fibrous, globular and membrane proteins: what was that? And the membrane part we can leave out, because that's coming today. Well, yes and no. The shape is one important aspect, but think about the genes and the type of sequences that lead to globular proteins versus the type of sequences or chains that lead to fibrous proteins. Yes, so the actual sequences are frequently very short and frequently repetitive, but they gradually build hierarchies that lead to very, very long chains. So you have a pretty boring but highly hierarchical assembly of the fibrous proteins, so that they can even reach macroscopic dimensions. We haven't touched much upon this yet, but one thing you can think about has to do with folding. Remember when we spoke yesterday about, for instance, the number of layers in the beta sheets? Several of you realized that if you start making things too complicated, it's going to be entropically difficult to fold them. Can you imagine a fibrous protein that consisted of, well, not just one domain, but a bunch of specific domains that had to fold into something as large as a piece of bone? That would be an astronomically complicated structure, and it would never be able to form. But it works if you have a protein where the individual segments are fairly simple and easy to fold. Think nails, hair, teeth and everything: we need to produce a lot of this material.
So it actually makes sense for the body that it is relatively easy to produce the individual alpha helices in your hair, while the hair itself is just an assembly of more and more hierarchies of that structure. Globular proteins: it's not so much the shape. You can argue that they are globin-like or globes, but what is the most important property of globular proteins? Yeah, water soluble. In one way it's a pretty stupid definition, because globular has to do with the shape. It doesn't mean water soluble, but so be it. I can't change the definition. Number two had to do with the shape you mentioned. So can you name a couple of fibrous proteins and where they occur? Collagen is where? Some example. Yeah, so you wouldn't have it. Anything else? Myosin in the muscles. So you do see one difference: collagen is a fairly hard, almost brittle protein, right? Because it's a bone-like structure. But that doesn't mean all fibrous proteins have to be hard. Muscle is definitely not a hard tissue. Do we have a third example? Silk protein, which we don't really have in our bodies, right? But I really love the pair of silk protein on the one hand and alpha keratin on the other, because one is pure beta sheet and the other is pure alpha helix: silk protein is beta and alpha keratin is alpha helices. It's not pure protein, right, but it's kind of fun that you can have things that reach a macroscopic level that are pure sheets or pure helices. The Rossmann fold we will talk about in a few minutes. What is a Greek key? Any peptide chain, or some specific peptide chain? Beta sheets. What type of beta sheets? Antiparallel ones, right? And what is it specifically that we organize in the beta sheet so that they do not cross each other? It's not really the beta sheets themselves, right? That is just a large set of antiparallel beta strands. So what is the part that we organize in the Greek keys?
Yeah, but the beta sheet, what part of the sheet? The loops. And you are so lucky now that I don't have a marker, because at this point I usually scare somebody up to the board and ask them to draw a Greek key. You should be able to understand that pattern. It's not super complicated, but it's not something you will be able to improvise in three seconds in the exam if you've never done it before. So beta sheets can be very important for either dimerization or, in general, oligomerization, when you have many things paired together. Why? What would be the driving force for that in the first place? Isn't that stupid? Wouldn't it be much smarter for nature to create a larger protein in the first place? That has to do with these things, right? The larger things are, the harder it will be for them to fold. We haven't seen that yet, so for now that's hand-waving, but we're going to be able to prove it later on in the course. So for nature it's always better to have the fundamental units that fold be relatively small, and then if you need to assemble larger structures you pair multiple ones together. Well, you kind of touched upon that, but in terms of free energy, what is the specific driving force of this type of dimerization? Think of when we spoke about beta sheet stabilization and the transition states. So what was the argument? For residues that like to be in beta sheets, it's going to be better to be on the pure inside of the beta sheet rather than at the edge. Now that is a bit of a lie, or not so much a lie, but everything is more complicated in reality. If you just think about the hydrogen bonds it makes sense, but you don't have beta sheets existing in vacuum.
So even the residues in the strands at the edge of the sheet will of course form hydrogen bonds, but they're going to form hydrogen bonds to water instead. In general those hydrogen bonds are going to be more floppy and flexible, and it's usually better to form large rigid sheets, so that the strands that were previously at the edge now appear right in the middle of a beta sheet. By pairing up with another subunit, we take two edge strands and effectively make them internal strands instead. The fold we will talk about shortly. We spoke a little bit about different classes of proteins. Which class would you argue has the largest diversity, if we forget about the mixed classes for a second? Yes, it can, or even has to, right? So why is it that sheets have relatively little diversity? The beta sheets themselves have a larger extent, right? They are larger structures in space. The alpha helix is a fairly compact structure with all local interactions, but that also means that by the time you have folded one alpha helix, all the hydrogen bonds are already paired. There isn't really any strong driving force to extend the alpha helix: making it longer maybe, but not extending it to the left or right. That would have to be interactions between the side chains. With beta sheets, the second you start to form a sheet, as we already mentioned, there is going to be a strong driving force to extend it to the left and right with more strands. Once you've formed a number of strands, the typical beta sheet might have one hydrophobic and one hydrophilic side. The second you have that, it's going to make sense for two such sheets to turn their hydrophobic sides against each other. So beta sheets almost automatically end up pairing, say, two sheets. You rarely need more than two beta sheets.
With alpha helices we do not have that driving force, and that means we're going to need to pack them some other way. As you mentioned, there are many more ways we can pack them, but that also means we need to find different ways to pack them. We spoke a little bit about how we pack them and what that was based on. Right. And based on that, why do we end up with these strange crossing angles that were either roughly plus 20 degrees or minus 50 degrees? Why exactly these angles? Because of the ridges and because of the repeat of the helix, right? There are 3.6 residues per turn. There will be some variation with individual amino acids, of course, but in general it's a very predictable pattern in which direction the next residue is going to point. And because, again, 3.6 is not an even number... Yeah, there are amazing things you learn at the university today. So 3.6 is not an integer, right? And because it is not an integer, the residues are never going to be superimposed right on top of each other. If it were a 3-10 helix, with three residues per turn, they actually would be. But in general you're going to have some sort of spiral-staircase pattern in the ridges that correspond to the amino acid side chains. And then when you take two of these and put them next to each other, well, you have to cut one piece out of the paper, superimpose them, and then you do the math based on this. And depending on whether you're turning them, let's see, from your point of view, to the right or to the left, you're going to end up with a crossing angle that is either roughly plus 20 or minus 50 degrees. Then we spoke a little bit about these mixed classes. I'm going to come back to that a little bit today. So what was the difference between alpha slash beta and alpha plus beta? Most proteins tend to have a little bit of both. So which one are you talking about, alpha slash beta or alpha plus beta?
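As an aside on the helix repeat discussed above, here is a minimal Python sketch of why side chains never stack directly on top of each other when there are 3.6 residues per turn. The function name `angular_offset` is made up for this illustration, and the sketch assumes an idealized helix with a perfectly regular repeat. It only shows the rotation between residues as seen down the helix axis; it does not derive the plus 20 and minus 50 degree crossing angles, which need the paper-cutting geometry described in the lecture.

```python
def angular_offset(n, residues_per_turn=3.6):
    """Rotation in degrees between the side chains of residue i and residue i+n,
    seen down the axis of an idealized helix, folded into the range (-180, 180]."""
    step = 360.0 / residues_per_turn      # 100 degrees per residue for an alpha helix
    angle = (n * step) % 360.0
    if angle > 180.0:
        angle -= 360.0
    return angle


if __name__ == "__main__":
    # Alpha helix: neighbours three or four residues away are close, but never exactly aligned.
    for n in (1, 3, 4, 7):
        print(f"alpha helix, i -> i+{n}: {angular_offset(n):+.1f} degrees")
    # With exactly three residues per turn, residue i+3 sits straight above residue i.
    print(f"3 residues per turn, i -> i+3: {angular_offset(3, 3.0):+.1f} degrees")
```

The small but nonzero offsets for i+3 and i+4 are what make the side chains line up in slanted rows, the ridges, rather than in straight columns along the helix.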
Do you have any other suggestions? It's the opposite way around. My memory rule is that the plus means you actually need both of them, that the structure itself depends on having both alpha and beta. Alpha slash beta, I'm thinking, yes, you have one alpha part and one beta part, but the plus means that they belong closer together. You can use any memory rule you want. That's just how I try to remember it. Yes. Well, I'll repeat that for you because we went through it a bit quickly. The obvious case where you have alpha plus beta is if you have parallel beta sheets. Because you have one strand going up, right, but the next strand has to go up too. And you're not just going to have coils; that would make for a very unstable structure. So it makes a lot of sense to have a strand go up and then a helix go down, and then a strand go up and then a helix go down. But that means you're literally mixing them in the same chain all the time in the structure. Whereas you're going to see some proteins later today that are beautiful alpha slash beta ones, where you have a beta sheet structure outside the membrane and an alpha helical domain inside the membrane. Super secondary structure is related to the Rossmann fold, so we'll talk about that. So, Evan, why is silk protein expensive? Yeah, I wouldn't say that's wrong. It's not that I would say that's the scenario if you write that at the exam. We'll go through that a little bit more today. Same principle. I think even the book probably says that they should be different domains. What do you mean by different domains? The obvious thing is if you have two chains in the protein, right? Then you might think of one alpha helical chain and one beta sheet chain. That would be one way of doing it. I think the more common case is actually one long chain, where half the chain is the alpha helical part and the other half is the beta sheet part.
And for some of the membrane proteins you're going to see, it's even more complicated. They are pentamers, so they consist of five subunits that sit right next to each other. Each of these subunits has one part in the membrane that is alpha helical and one part outside the membrane that is beta sheet. But when you assemble all of these together, the way it's going to look is that one part of the entire protein is just beta sheet and one part of the protein is just alpha helix. So the most important point is that they're not really mixed in the structure. It's not that you have the alpha helices and sheets right next to each other all the time. So why is silk protein expensive? No, it's not. It's expensive because you're happy to pay a lot of money for it. It's called the market. There are absolutely no biophysical reasons whatsoever for it to be expensive. But hey, if people are willing to pay hundreds of dollars per kilo for that shampoo... that's so weird. What happens when we create these permanent waves? First, there are two steps you need to do. Yeah, and this is a concept that applies to a whole lot of treatments, any time you want something in biology to take a certain shape. I think I mentioned the non-iron shirts, for instance; they have similar types of treatment. Those shirts are obviously not proteins themselves, but you're basically adding chemicals and then using disulfide bonds to create structure. It's actually a surprisingly efficient way to do things. Yes, yes, I'll come back to that. There were slightly fewer questions here today. But before I jump into today's slides, let me go through the last few slides that I didn't have time for yesterday, and then we can keep talking about that. So, the alpha and beta structures. I know that I already showed this slide yesterday.
But since I didn't spend a whole lot of time on it, we'll do it today. So there were two types of mixed structure. The first one was alpha slash beta. And I'm not even sure whether this is a great example. Here you see one large sheet out here, and then you have some alpha helical structure in here, right? So they're kind of separate. No, sorry, my bad. You were of course right. I'm tired today. Alpha slash beta was the one. The TIM barrel definitely belongs there. 88. I'm not so sure. Here's one of these classical structures where you see that every second element in the chain is alpha and every second one is beta. I need to take this course myself by now. I promise to do the exam with you. The classical pattern is beta, alpha, beta, alpha, beta, alpha, beta, alpha. Every second secondary structure element has to be a beta strand and every second one has to be an alpha helix. There are some exceptions to that, but they're fairly rare. The point here is: forget that you know these 3D structures. The second you see this in a bioinformatics prediction, where you're predicting something in the ballpark of a 20-residue alpha helix, then a 15-residue beta strand, then a 20-residue alpha helix, at that point you can start to say something about the super secondary structure. And that's the concept we had in question 10 that I didn't have time to bring up yesterday. There isn't really anything fundamentally specific to secondary structure. So why do we use secondary structure, the concept itself? Yes, it can be organized, but it's not really a specific structure, right? That's my point. It's just amino acids. So why do we introduce secondary structure? It's a basic building block, but again, the building blocks are the amino acids. I'll even be a bit provocative and say that secondary structure doesn't exist. Do you have it? Yeah, but why is this model useful?
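To illustrate the point above about spotting the alternating beta, alpha, beta, alpha pattern in a prediction before you know any 3D structure, here is a small Python sketch. It assumes the prediction is a three-state string with one letter per residue, using H for helix, E for strand and C for coil, which is a common convention in secondary structure prediction output. The function names, the minimum lengths and the example strings are hypothetical and chosen only for illustration; this is not a real fold-recognition method.

```python
from itertools import groupby


def elements(ss, min_len=3):
    """Collapse a per-residue string into a list of (state, length) elements,
    skipping coil and runs shorter than min_len."""
    out = []
    for state, run in groupby(ss):
        n = len(list(run))
        if state in "HE" and n >= min_len:
            out.append((state, n))
    return out


def alternates_beta_alpha(ss, min_elements=4):
    """True if helices and strands strictly alternate along the chain,
    with at least min_elements secondary structure elements in total."""
    order = "".join(state for state, _ in elements(ss))
    if len(order) < min_elements:
        return False
    return all(a != b for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    # Hypothetical predictions, made up for this example.
    mixed = "CC" + "E" * 5 + "CCC" + "H" * 12 + "CC" + "E" * 5 + "CCC" + "H" * 12 + "CC" + "E" * 5
    all_beta = "EEEEECCEEEEECCEEEEECCEEEEE"
    print(elements(mixed))
    print(alternates_beta_alpha(mixed), alternates_beta_alpha(all_beta))
```

A strictly alternating pattern like this is only a hint that the strands may form a parallel sheet with helices packed against it; the actual strand order and the fold still have to come from a structure.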
Yeah, but why can't I just have alpha through iota structures or whatever? Because they're recurring patterns, right? They're very simple recurring patterns, we see them a lot, and then it makes sense to label them. Instead of having to go through the amino acids and describe the same thing in terms of Ramachandran torsions, if there are only two very common patterns, let's name them, so that when we say alpha we know what we mean. But that is not something fundamental to the amino acids. It's just that that particular pattern is common. And what did I just tell you right now? I told you that this is a very common pattern. Of course, we could call it gamma structure. On the other hand, that's not a good name either, because this so-called gamma structure consists of structures we know really well: we have the alpha helices and we have the beta sheets, right? So it's larger than the secondary structure. But it's not really the tertiary structure either, which, when we spoke about it, is the entire fold of, say, a domain. So it's not really that large. It's kind of like a level 2.5 structure, and that's why we typically call it super secondary structure. It's above the secondary structure level. These are very common patterns that get reused. But this pattern by itself, whether the strands are parallel, is not really enough to define the entire fold of a domain. Did you have a question? Yep, we will come back to that in three slides. That's a very good question. So normally, when we talk about structure, we talk about primary structure, which is the amino acid sequence, and secondary structure, which is the fundamental alpha helix and beta sheet elements. We now introduce super secondary structure, which is some sort of common pattern of secondary structure elements.
After that, we have what we call tertiary structure, which is the structure of an entire domain in a protein or an entire fold, one monomer of a protein really. And quaternary structure is when you have several chains together; hemoglobin is a great example, with four subunits assembled to form a larger protein. I haven't brought up the concept of fold yet. I will do that in two, three slides. That is one way of organizing it. So why do you think the TIM barrel is such a stable and common structure? Look at the beta sheet in particular. Can you say anything about the beta sheet there? Yes, but it's not the hole itself. That's good. What do you think is good about this beta sheet? There are no edges on the sheet. Every single strand here is paired up. And then you're quite right: you can definitely have a hydrophilic part there and a hydrophobic part here. This is just a class of proteins; there are several different types of TIM barrels that have slightly different functionality. So the HDA I spoke about, do you see that this is pretty much the same shape, right? You have a beta strand and then a helix, and then a beta strand and a helix. But in this case they don't really form a barrel. What confused me a bit before, apart from not having had enough coffee this morning: this part of the HDH I would definitely say is a classical super secondary structure, with strand, helix, strand, helix. Out here, on the other hand, we start to have a beta sheet that's kind of separate. That's hard. So what kind of mixed structure is this? Is this alpha slash beta or alpha plus beta? It's probably alpha plus beta. But my point is also that these are not unique definitions. And I would also say that's probably why it's actually a good question. I should read up on this.
In the 1950s, when Francis Crick, Linus Pauling and a few others introduced the secondary structure concept, remember that they did that before we actually had real protein structures. They couldn't even have dreamt of structures like this. The helices and sheets make sense because they are so remarkably stable. If you have something in an alpha helix, those Ramachandran torsions are going to change by only a few degrees, plus or minus two or three degrees. They're going to be rock solid. Same thing if you're in a beta strand that is part of a larger beta sheet: that is not going to fluctuate a whole lot. They're amazingly stable. So how frequently do we see a secondary structure element that is somewhat helical and somewhat sheet-like? We don't, right? Because they're completely different parts of the Ramachandran diagram. You won't have a helix that starts to become a little bit of sheet. First you have to completely unfold the helix and then form your sheet instead, or vice versa. The problem with the mixed-class definition is that it's less unique. How mixed do they have to be before we say that they are mixed? How separate do the domains have to be before we say that they are separate? Well, there is definitely a mixed one somewhere here. Maybe in this part of the sheets you might have one or two strands that are a bit mixed, and then a region here that is separate. So the point is that for proteins in general there are definitely some cases where they are clearly mixed. There are definitely other cases where they are clearly separate. And then I'm sorry to say that there will be a few protein structures that are kind of in between. They're not super well defined. Yes? Why would it want to form alpha, beta, alpha? Can't it just form antiparallel beta, beta, beta? We'll move to the next slide. I think that might help. Say that again. Your question was, so that goes back to your point.
There are some residues that prefer to be in a helix and other residues that prefer to be in a sheet, right? So in some cases you could probably imagine that, depending on the specific residues, it could be very fragile, and if you introduce one or two amino acid mutations you would force this chain over to become antiparallel instead. But what if you have a stretch of residues right in the middle of your sheet that really don't like to be sheet? They would prefer to be helix. Then you're not going to get an antiparallel sheet, because those residues prefer to be helical. And the reason why this happens is, of course, again, 4.3 billion years of trial and error, evolution. It might sound really stupid: why on earth would you have some residues in the middle of a beta sheet that don't like to be in a beta sheet? But that will lead to other stable structures that are good for you. So the classical example of a super secondary structure is what we call a Rossmann fold, and that is another one with beta, alpha, beta, alpha, beta. The left part of the HDH was roughly the same. So you have these beta strands here, and they're going to be parallel. You have a beta strand going in one direction, followed by a helix, then a beta strand going in the same direction as the first one. And then you have a helix, and a beta strand going in the same direction. In this particular case, do you see how you jump back and forth in the sheet here? It's not necessarily that you're going from left to right or right to left. First you have these three going from right to left, and then you make a jump, and then you have three more going from left to right. And this is the complication, and this is why you have a large class of these.
The only thing we say with the Rossmann fold is really that we have this pattern where you have some helices on one side, then a central sheet, and then more helices. But the exact order will vary from Rossmann fold to Rossmann fold. What you create here is a beta sheet right in the middle, shielded by two layers of helices. Can you guess anything about the hydrophobic properties of that beta sheet? Hydrophobic or hydrophilic? Likely hydrophobic on both sides, right? Why? What would happen if it was not hydrophobic on both sides? It would likely unfold, right, and expose that part to water. That's always a good way to prove something: prove that the opposite would lead to strange results. In this particular case, you actually almost have two domains here. And when you just see a structure like this, I'm well aware that it looks a bit strange. The reason why this is so common is that this slightly twisted shape binds amazingly well to nucleotides. And you're also going to have quite a few polar parts of the sheet here. So that's nucleotide binding, cofactors and everything. You see this in tons of places. But the point is that it's not just one protein. It's a common shape that's slightly smaller than the full protein but slightly larger than the individual secondary structure elements, and nature tends to reuse it. Oh, that's a good question. By far the most common ones have helices on both sides. I'm trying to think whether I can find a good example where you only have them on one. I would probably call it a Rossmann fold even if the helices are only on one side. But it's not a TIM barrel. So what would you say is the difference between a TIM barrel and a Rossmann fold? The barrel has to close on itself, right? And then you'd usually need more beta strands.
I think it would be unlikely to have a Rossmann fold that has 8 or 10 beta strands. So if we look at the typical structure interior, this actually applies to the Rossmann fold too. We typically have two hydrophobic cores. So it's hydrophobic here. And in the particular case of the TIM barrel, it's usually hydrophobic on the very inside there, too. So these structures are now more complicated. Remember we spoke yesterday about the simple beta sheet proteins where you just had two layers? Or even the alpha helical ones, say myoglobin, where you had one neat small cavity. When we start to mix secondary structures, we end up with more complicated structures, which is instinctively not good. They're going to be more complicated to fold. On the other hand, the reason why nature creates them is of course that we can build things with these more complicated elements that we could not build with just a simple alpha helix or a few beta strands. One common case: remember when we talked about the alpha helix before, and the polarity of all these peptide bonds? The edges of the beta sheets up here usually make for great binding sites. They're fairly accessible. So you have some sort of crevice or active site just between the beta sheet and the alpha helix. It might be easier to show here; I'm not sure how good this is. You might have a binding site not necessarily in the hole, but at the top of the hole, or somehow between the sheet and helix layers here. And I'm well aware that this sounds fuzzy, right? But I'm not talking about a specific protein. I'm talking about entire classes of proteins. Exactly where the binding site is and exactly what is appropriate will depend on the protein. But mixing these secondary structure elements usually creates very nice, small, efficient binding sites. And that's why we typically see them in factors that bind other molecules.
And it can be inside, sorry, when I say inside sheets here, I'm not saying that the binding happens between the individual strands, right? But if you have, say, a fold with two layers of beta sheets, the very top of the beta strands is usually a very efficient place to bind things. So why on earth could that be useful for you to know? That's a good exercise, because we don't just study structure in this course because structures are beautiful to look at. Why do you need to understand structure? Well, yes, that's certainly true. But think on a more fundamental level. In a few years you're out doing research, maybe in the pharmaceutical industry. Despite all the advances in high-throughput sequencing, it's still fairly rare that you just start to sequence randomly. If you're working on a disease, say a particular form of cancer, you typically have a target. You're not just studying cancers in general; you're interested in, say, thyroid cancer, one specific disease. And I'm sorry to say, but you're not going to be the first person in the world to study that disease either. So people know something about the disease, and they will actually know that if you're deficient in that gene, that makes the disease worse. Or they might even know that this particular mutant is what causes the disease. And that's it. Then we need to try to find something that can cure the disease. If you're lucky you have a structure; in many cases you won't even have a structure. But even if you have a structure, how would that help you understand the disease? Yeah, the mutation is there. Binds what, efficiently? What target? I'm kidding you. The point is, we don't know. You frequently have no idea.
It's just that we know that there's a particular mutation in this particular protein, and if that residue is an alanine instead of a tryptophan, you're susceptible to cancer. We have no idea what the binding partners are in the first place. If you're really lucky, this is a really well-known receptor or something, so that you can learn something from it. But frequently the studies start with looking at the structure: let's see, can we predict where the binding sites are? Or it might even be that we know that, say, a particular type of drug can treat this disease, but we have no idea why. It's just that if you give this particular drug, it tends to increase survival a bit. So on this level, it's very frequently a matter of trying to identify the binding site and understand the binding site. Can you now find an artificial drug that binds better, even to the mutant binding site? And it's pretty much there that you start to design. It's frequently not the proteins we're designing; we need to design a drug to target an existing protein. Because as much as I would like to, I can't replace all the proteins in the body of the patient. But I can create a small drug, or potentially another protein, that binds it. And the reason for bringing this up is that it's all too common that we just talk about the binding site with a capital B, right? When you start to study a protein, you don't know what the binding site is. And that's why we're so interested in guessing: that would never be the binding site of this molecule; seeing this molecule, we know that the binding site has to be here. If there is a mutation out here that makes your protein really sensitive, then it's likely something that alters the structure of the protein, not the binding itself. So knowing where the binding sites are is going to be one of the most important things when you look at proteins. And of course, I don't know.
I have to guess, right? But guessing goes a very long way in many cases. The other part is the alpha plus beta structures, and this is another example where things are not completely well defined. When you look at alpha plus beta, where the parts should be separate, the best way to define them, I would say, is the degree to which the beta strands are parallel or not. Because if you look at this protein, is this alpha slash beta or alpha plus beta? It looks almost like a Rossmann fold, right? The beta sheets are right next to the alpha helices, and you definitely have some mixing of the alpha helices with the beta sheets. And somewhere here you start to get as much of a headache as I have. On the other hand, look at the parallelism of the beta strands. If all of them are antiparallel, or 90% of them are antiparallel, they likely go mostly up, down, up, down, up, down, and then there might be an occasional break with a helix. The other alternative is that they are parallel. If the entire beta sheet is parallel, then you definitely have to have a beta strand and then a helix, and then a beta strand and a helix, and a beta strand and a helix. So if the strands are mostly antiparallel, I would in general be more inclined to say that these are separate structures. If on the other hand they are parallel, I would be more inclined to say that it's a really mixed structure. But this is not as well defined as alpha helix versus beta sheet. And as you see here, we're not necessarily talking about completely separate domains. You can have, say, two or four beta strands and then a few alpha helices, then two or four beta strands and a few helices. It's going to look nicer when we look at the membrane proteins, I promise you. Here, this is probably an easier case, right? Because here you definitely have a bunch of helices and then a sheet over there. These are also very common in DNA binding. So why do we have so many DNA binding proteins?
Transcription factors. What else could you imagine doing when you bind to DNA? So one alternative is that you want to transcribe things. Can you imagine something else? You might not want to transcribe things, right? It's regulation. You might want to silence some genes, and depending on the environment and everything, there will definitely be different genes that you want to express or not. Fetal hemoglobin is a great example. We'll come back to that when we talk about hemoglobin again. There is a special variety of hemoglobin that you only have as a fetus. And the second you're born, well, it's not the second, but around the time you're born, this gene is silenced and you start to express normal adult hemoglobin. You're not changing your genome, but you're changing which genes are expressed. One classical example of this is the zinc fingers, and the reason why they're called zinc fingers is that if you have a lot of imagination and look at your hand, you can imagine a zinc binding site held by your fingers. Don't worry, I'm not going to ask you to draw that in terms of a hand, but here you have the zinc ions. And see here, you see how it's binding in the major groove of the DNA. We'll come back and talk about DNA structure later. But again, it's sliding into the valley of the DNA here. TATA-binding protein is another example. You see that it's an entire long, long, long beta sheet here, and the entire beta sheet binds in the groove there. What this one does: the structure here recognizes the sequence T, A, T, A in the DNA. It's called the TATA box, and that's really the beginning of a new transcription site, where we start reading. How on earth can a beta sheet recognize TATA? The side chains, right? Can you imagine the amount of specific side chain packing you need here? Because it's important that you recognize TATA, but not a near miss such as TAAT.
So there is no way you can guess that just from looking at the beta sheet cartoon drawing here. You would need to look at the specific packing of the side chains against the DNA. So how frequently do you think this binding goes wrong? The efficiency here is insane. It's more than insane, and nature has optimized this even further. I wouldn't say it never goes wrong; it can go wrong, but it's very rare, because there is a balance here. Nature could of course create more and more machinery to make sure that these things never go wrong. But that machinery would cost energy, so if you wanted to guarantee that an incorrect protein is never ever formed, your body would be very inefficient. On the other hand, if you're too liberal and relaxed and sloppy here, so that things go wrong too frequently, you're going to produce lots of proteins that will have to be degraded because they were incorrect. And I know that Måns Ehrenberg and other people in Uppsala, decades ago, even studied this in terms of energy. I think the conclusion is that the body has pretty much optimized this to be at the level where we are most energy efficient. We do accept that things go wrong now and then, because having even fewer errors would cost too much energy. But the specificity here is amazing compared to any computer or anything else you can imagine. So what this does when it binds: this is the initiation part that then splits the DNA helix. We'll come back to that when we talk about nucleotide structure. I spoke a little bit about other toxins when we talked about the cysteine knot. But this is not the cysteine knot. So what type of structure would you say this is? Alpha slash beta, alpha plus beta? If I forced you to pick between alpha plus beta and alpha slash beta, what would you say here? And why would you say it? Because? Exactly. And again, this is a horrible example, right?
Because it's just one pair of strands and then one helix. But if you have to pick, go by the beta strands. What's that? Coils. So this would be very floppy. Well, it's the same argument as yesterday, right? If we put this in a box of water and tried to simulate it, yes, it probably would be very floppy. On the other hand, it's well defined, because we do see it in the X-ray structure. Can you imagine something that might or might not happen with this protein? And what would that do? Imagine you were a scorpion. So this is likely a protein that is a bit disordered. You can have disordered structures. But what do you think would happen if this starts to bind, say, to another beta sheet? Then I would guess that this part will likely keep extending the beta sheet, and this part will also keep extending the beta sheet, right? So you will get a larger beta sheet by binding this to another beta sheet. So this is likely a protein whose structure becomes more regular as it binds to something else. And since proteins like to form beta sheets, that will likely give a favorable free energy. So this is going to bind to other things, which is good if you're a scorpion. It's bad if you're the scorpion's target. The prey, yeah? It's going to need to bind to another protein. And since it's a neurotoxin, you will see something of the reason why this might bind. I'll show you when we talk about membrane proteins. It's going to like to bind to beta sheets and do something in your nervous system. We will talk about that when we talk about the nervous system later today. The last part from yesterday is that we haven't really talked that much about folds. You might have done that in the bioinformatics course. As I mentioned earlier in this course, we talk about concepts such as the secondary structure, the tertiary structure, the quaternary structure. I would say that the concept of a fold is the shape into which a protein folds.
And that's almost a tautology, but if you think of hemoglobin, a hemoglobin doesn't fold directly as all four subunits together, right? You will likely have each subunit of hemoglobin folding on its own. So a fold is some sort of larger three-dimensional part. I would say roughly tertiary structure, but it is really the part that is kind of reused. Myoglobin is a good example. You call that a globin fold. These helices are packed in a specific way. And hemoglobin consists of four such subunits. So in that particular case, and not just in that case, I would say in the vast majority of cases, a fold is what we mean by tertiary structure: one piece of protein, one chain that has folded. But what you see, now that I have shown you a bunch of examples both yesterday and this morning, is that nature tends to reuse folds. Based on the bioinformatics course this might sound completely obvious, but it actually isn't. On the one hand, hemoglobin and myoglobin are obviously related. The gene has been duplicated or something to create a related protein, and you spoke a lot about this in the bioinformatics course. But there will also be cases such as the Rossmann folds. Every Rossmann fold is not related to the other Rossmann folds. So some of these small fundamental building blocks nature tends to reuse, or the proteins find them spontaneously. There is no obvious evolutionary relationship between them. It's just that it's stable to fold in this pattern, so we frequently reuse the pattern. And of course, in the golden days of structural biology in the early seventies, as we started to discover more and more proteins, the number of structures in the Protein Data Bank was increasing rapidly, right? We kept finding more and more folds. And at some point, classifying them becomes a natural task.
Sorry, before that: what we then also saw was that we started to determine structures of new proteins, purely by X-ray and everything, as if each one was a brand new protein. And when you see the structure, you realize, ah, this looks just like something we already know the structure of. Completely different sequence, but the structure is the same. And at some point, Cyrus Chothia in particular started to argue that nature seems to reuse these basic structural building blocks. Not because of evolution, but because different sequences will spontaneously adopt these building blocks. Why, we don't know. And there's a famous article from 1992 called One Thousand Families for the Molecular Biologist. Cyrus argued there that there are only in the ballpark of 1000 folds in nature, which are then reused. Do you think that's a low or a high number? It's an insanely small number, right? How many protein sequences are there? How many genes? In GenBank or something. I would go up. In GenBank you have hundreds of billions or a hundred billion sequences or something. No, billions of sequences at least. There are in the ballpark of 130,000 structures just in the Protein Data Bank. Out of those 130,000, there are in the ballpark of 1000 different shapes they all fold into. So just as alpha helices and beta sheets were very common patterns, suddenly we seem to have similar patterns, not just on the super-secondary structure level, but even on the tertiary structure level, the fold level. It's not that every alpha helix is related to every other alpha helix; they're just common shapes. And nature appears to use that on the fold level too. One way to classify this, which you might have done in the bioinformatics course and which we will talk a little bit more about tomorrow, I think: families is what we call things when they're obviously evolutionarily related. They're closely related and we can definitely find the relationship.
Superfamilies are still related, but then you can start having relatively large changes in them. And on the fold level, it is just a pattern classification. There is no evolutionary relation that you know of. And if you look at how this has developed (this plot was almost 10 years old, well, it is 10 years old now): the number of families keeps going up, and it has accelerated the last 10 years too. But the growth in the number of folds is actually slowing down. So I'm sorry to say that Cyrus was wrong. This is a bit more than 1500 today, but okay, let's say 2000; he was not off by more than a factor of two. And that was almost 30 years ago, so it's pretty impressive. So there is a very small number of folds, and somehow, we don't know why, the entire fold space is quite limited. No matter what, if you create a random protein, it will, with a probability bordering on certainty, adopt a fold that we already know. And that also means, conversely, that if you want to design a new protein, you start with a fold that you would like to target. If I would like to bind a certain factor X, what fold would create a stable binding site for this factor X? Let's then design a protein that would adopt this fold and hopefully bind it. The other thing we could then say is that this seems to indicate that folds are super stable, right? If you look at bioinformatics, how much sequence identity do you need between two chains to say that they're evolutionarily related? Yeah, they say that by the time you have 25%: if you know a structure and then I give you a second sequence, and they have 25% identity, they're gonna have the same shape. Well, in general, that's true. But Lynne Regan showed about 20 years ago that they could take a protein, and it was a pure beta sheet protein, this part at least, well, with one small helix. They changed fewer than 50% of the residues, and with that they could change the entire fold of the protein.
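The sequence identity figures mentioned above (25%, 30%, fewer than 50% of residues changed) can be made concrete with a small calculation. This is only a minimal sketch: it assumes the two sequences have already been aligned and are given as equal-length strings with `-` for gaps, and the function name `percent_identity` and the two toy sequences are made up for illustration. Real tools differ in how they treat gaps in the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns in which the two residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0  # columns where neither sequence has a gap
    matches = 0   # compared columns with identical residues
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are skipped entirely
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical toy alignment, not real proteins.
seq_a = "MKT-AYIAKQR"
seq_b = "MKTLAYLAEQ-"
print(round(percent_identity(seq_a, seq_b), 1))  # 7 matches in 9 compared columns
```

A number like this is only a rule of thumb for predicting a shared fold, which is exactly the point of the fold-switching example above.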
So you are generally correct, and again, if I see two proteins with 30% identity, I would say it's certain that they have the same fold. But remember this thing about rules in biology: there is always an exception. Just because things have similar sequences, it's not guaranteed that they have the same structure, because structure is fragile. And since then, there are quite a few cases where both David Baker and other groups have been able to design completely new folds from scratch. They would like a fold to have a particular pattern that we have not seen in nature before, and then the question is: can we use computers to calculate and predict interactions, and then design an amino acid sequence that targets that fold? We're getting pretty good at this. That was the end of the super-secondary structure part. So then I would suggest that we continue with the things that I had originally planned for today. Membrane proteins. And I have to confess that I'm seriously biased here, because this is the air we breathe in this department. We love membrane proteins. The classical number you quote is that in the ballpark of 25 to 30% of all the proteins in your body are membrane proteins. And it's a bit of a sliding scale here, because there are some that are clearly membrane proteins, since they are embedded in your membrane. Some others are things that are bound on top of your membranes, but they're related to the membranes. Yet it's a tiny fraction of all available structures that are membrane proteins. Why? Why would they be difficult to crystallize? Membrane proteins, as we talked about, are gonna need to be hydrophobic. And I'm not sure about you, but how many oil crystals have you seen? Oil does not like to form crystals. So for a long time it was impossible to even imagine crystallizing these. And I would say all the early structures of membrane proteins have led to Nobel Prizes.
There are a bunch of different techniques, but one of the most popular techniques today is that you try to attach a large antibody. You create an antibody that will bind to your membrane protein. That antibody does not sit in the membrane. And then you effectively crystallize the antibodies, but since they're bound to the membrane protein, they will carry your membrane protein along as cargo. And then you can add a bit of, pretty much, detergent that will help solubilize the membrane protein. So today we're pretty good at it, and we have some outstanding research groups here that are specialists in overexpressing and purifying membrane proteins. The other alternative is to use cryo-electron microscopy, because in cryo-electron microscopy we do not need crystals. We're going to study this at SciLifeLab later in the course, and then we will show you all about it. The important thing, though, is that although membrane proteins are just 1% of the structures, they are something like 50% of drug targets. And at that point it might sound like I'm exaggerating. I'm not; I'm underestimating, because while it's 50% of the drug targets, in terms of revenue we're talking about something like 70% of the pharmaceutical industry. And this is because membrane proteins are the doors and windows of your cells. If you wanna get into your cells, if you wanna start interacting with your cells, a normal protein or drug won't miraculously shoot straight through the cell membrane. If you wanna start influencing the cells, start on the surface. And influencing the cells is kind of the whole point of the pharmaceutical industry, right? You wanna change things. You wanna change signaling, growth. And when we saw the first few membrane proteins, we were quite happy, because they looked beautiful and simple. This was one of the first ones, bacteriorhodopsin, and it looks super simple. Helices straight up and down.
So in principle, when you think about this, they look much simpler than globular proteins, which should make structure prediction much easier, and drug design in particular. G protein-coupled receptors are another very famous class of seven-transmembrane-helix proteins. And together with ion channels, this is by far the most common target for pharmaceuticals. We spoke a little bit about the environment, and this is the boring environment of membrane proteins. Instead of water, you have lipids. And the lipid consists of some sort of polar head group, which can even be charged, and then fatty acids. And they can be different fatty acids. And this particular one, let's see, I should know what this one is. It's POPC: palmitoyl-oleoyl-phosphatidylcholine. Palmitoyl is one chain. Oleoyl is the other chain; it has a double bond. And the phosphatidylcholine part tells me that this is a choline head group. I so don't expect you to know those names. The point is that I can change that chain. I can change the length of the chain. I can change the length of the second chain. I can change the degree to which it's saturated or not. You can change whether you have both a positive and a negative charge here, which means that you can have positively charged, negatively charged or dipolar head groups. So there's a lot of diversity in your lipids. I might even have a small movie. Yes, if you don't like the diagrams, can I even show it? Yes, I can. The point is that lipids are mobile, very mobile. Just like water. Well, waters are mobile too, but the water molecule itself is fairly small and rigid, right? Can you imagine that the number of bonds and torsions in the lipid means that the entire lipid molecule can move? If we start to look at the different parts of this molecule, which amino acids would be interacting with the head group part here? Give me some polar amino acids, then. Arginine. Arginine? Lysine.
Lysine, you picked the easy ones. Histidine. Aspartic acid, glutamic acid, serine, threonine. Do you see why I kept harassing you in the first or second lecture that you need to know your classification of amino acids? Nobody's talking about exactly where the carbons are. But you're gonna be looking at sequences, you're gonna be looking at structures. You don't have time to dig out the sheets and compare: let's see, is this a hydrophobic or hydrophilic amino acid? You need to know these things so well that they jump out at you. If you see an R in the middle of a sequence, there is no way that can be in the middle of a membrane. You don't have time to sit down and look it up. Similarly down here, what residues are gonna be happy interacting with this part? Such as phenylalanine, maybe alanine. Alanine can kind of go anywhere. Do we have any others? Isoleucine. Leucine. And same thing here, right? When you see L, V or I, this is what you should think. If you see R, K, D or E, you need to think that. And then, depending on how many double bonds you have here and everything, it's either gonna be more or less stiff. What determines what type of lipids you have? What mechanism in the cell? Because I said you can have different lipids, right? They can have different lengths, different charges. What determines the lipid composition in the cell? What process? Because you said RKDE, right? R and K are positively charged, and D and E are negatively charged. But this particular head group has a dipole, so it's negative there, positive there. But you can have lipids that only have a positive charge here. That's gonna be pretty important. If you're a K, with a positive charge, you're gonna hate that positive charge. On the other hand, if you're a D, you're gonna love the positive charge. So the specific lipid composition is gonna matter in the cell.
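The "see RKDE, think head groups; see LVI, think membrane interior" reflex described above can be written down as a tiny lookup. This is a rough sketch using only the residues named in the lecture; the grouping, the function names `classify` and `charged_positions`, and the example segment are illustrative assumptions, and real hydrophobicity scales are far more graded than these bins.

```python
# One-letter codes, grouped as in the discussion above (deliberately incomplete).
POSITIVE = set("RK")        # arginine, lysine
NEGATIVE = set("DE")        # aspartic acid, glutamic acid
POLAR = set("HST")          # histidine, serine, threonine
HYDROPHOBIC = set("LVIFA")  # leucine, valine, isoleucine, phenylalanine, alanine


def classify(residue: str) -> str:
    """Return a coarse class for a single one-letter amino acid code."""
    r = residue.upper()
    if r in POSITIVE:
        return "positive"
    if r in NEGATIVE:
        return "negative"
    if r in POLAR:
        return "polar"
    if r in HYDROPHOBIC:
        return "hydrophobic"
    return "other"  # residues not discussed here


def charged_positions(segment: str) -> list[int]:
    """0-based positions of R, K, D or E: residues unlikely in the membrane core."""
    return [i for i, r in enumerate(segment.upper()) if r in POSITIVE | NEGATIVE]


# Hypothetical 20-residue stretch with one arginine in the middle.
segment = "LLVIALFVLARLIVAFLLIV"
print([classify(r) for r in "RKDELVI"])
print(charged_positions(segment))
```

A single charged residue in an otherwise hydrophobic stretch, as in this made-up segment, is the kind of thing that should jump out when reading a sequence.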
So there is a special process that determines this. It's a process called lunch: it's what you eat. Because the lipids are built from fatty acids, and the fatty acids you get from your food. So if you eat lots of food with lots of saturated fats, you're gonna have lots of fatty acids without double bonds. So you literally are what you eat. It's so easy to think that your entire cellular composition, everything, is determined by your genome; this is determined by lunch. This is an old corny American movie, but I kind of like it. So we're zooming in on the cell membrane here. The point is that you might have seen some illustrations of this, and they, including this movie, are kind of wrong. The first thing you see is that the cellular membrane is fairly flexible; here they're drawing things as a completely irregular pattern of lipids and everything. And the point here is that even if we distort the cell membrane, it will self-organize very quickly. It kind of is a folding process, but you don't need a cellular environment for this. Just throw a bunch of lipids into water and they will naturally form a membrane. A real membrane, though, doesn't just contain these lipids. I think these small gray molecules here are cholesterol. Cholesterol will make your membrane stiffer. And what you have here is sugars, pretty much, that act as receptors on top of the cell that we can recognize. So while these pure membranes are important, real membranes are not really that pure. The other sad part is that every single illustration you see in the textbooks is wrong. This is not how cell membranes work. And we know that because today you can actually study this in a computer simulation. So out here you have water, here you have all these lipid head groups, and here you have the interior. Do you see how disordered the interior is?
There is some sort of average order here, in that yes, in general, the upper part of the tails here is more vertical, but as you go down here, the tails are pointing in all directions. Could you imagine what would happen if you had a system with lots of double bonds here? Then it would be slightly stiffer, right? What if you had shorter chains? What would that do to the membrane? Could you use that for something? We don't know, but we think so. Because we still know surprisingly little about membranes. There are some challenges with proteins in your body. At some point, and I'm gonna talk about that after the break, we create proteins and we insert them in your membrane. But the proteins in each cell of your body are not uniformly distributed. If you look at a nerve cell, there are certain proteins that sit at the beginning of the nerve cell, and there are others that sit at the end of the nerve cell. There are some proteins that should just be in, say, the inner nuclear membrane, and there are others that should be in the plasma membrane. So different proteins should target different locations in the cell, and in particular different parts of the cell membrane. And there appear to be some trends that different membranes have slightly different thicknesses. And when you have different thicknesses in your cell membranes, that tends to matter: if you have one protein that is slightly longer, right, that's gonna favor being in a thicker membrane. While if you have a protein that's slightly shorter in the Z direction here, that's gonna favor a thinner membrane. So it's part of how cells are regulated. The other reason why you haven't seen a lot of membrane structures is that we can't determine the structure of them. Because just as you haven't seen a crystal of oil (remember on the last slide, that random interior is oil-like), you can't determine the structure of lipids. Well, actually you can.
But then, in the extreme case, if you would force a lipid to be crystalline, that would have to be done at something like 150 Kelvin, and that has nothing to do with your cell membranes. Your cell membranes are pretty much a liquid. Formally, it's what you call a liquid crystalline phase. It's kind of similar to what you have in digital watches. However, that doesn't mean that they're completely random. We can use neutron scattering in particular to determine the average order. And what you can see is that the water is located very much on the outside. And then you have these choline and phosphate groups. Those would be the polar parts in the head groups. Then the carbonyls; the carbonyl is a CO bond, and it's really the link between the head group and the chains. And then after the carbonyls, we have this large part where you have methylene groups followed by the methyl groups, which are the ends of the chains. So we know quite well that in the very center of the membrane, there is not just pretty much no water, there is no water, period. And we also know that the double bonds are more common in here, but the exact orientation of all this is pretty unclear. There's average order, but you can't say anything about an individual lipid. But if you think about it, you've seen this before. Was it in lecture two or three? It's just that we didn't call it the membrane. This is the oil. Remember when we spoke about the hydrophobic effect and everything? We talked about oil in water, and an oil phase versus a water phase. The interior of the membrane is oil. And all the things I said there about the solubility of amino acids, that it costs 20 kilocalories per mole to insert a charged amino acid in oil: it's gonna cost 20 kilocalories per mole to insert it in the center of a membrane too.
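To get a feeling for what a cost of 20 kilocalories per mole means, one can convert it into a Boltzmann factor, which gives the relative population of the charged residue in the membrane core versus in water at equilibrium. This is a back-of-the-envelope sketch: the body temperature of 310 K and the function name `boltzmann_ratio` are assumptions for illustration, and it treats the 20 kcal/mol as a simple transfer free energy.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol K)


def boltzmann_ratio(delta_g_kcal_per_mol: float, temperature_k: float = 310.0) -> float:
    """Relative population exp(-dG / RT) for a transfer free energy dG."""
    rt = R_KCAL_PER_MOL_K * temperature_k  # about 0.62 kcal/mol at 310 K
    return math.exp(-delta_g_kcal_per_mol / rt)


# Cost quoted above for burying a charged side chain in oil or the membrane core.
ratio = boltzmann_ratio(20.0)
print(f"{ratio:.1e}")
```

The ratio comes out many orders of magnitude below one, which is why an R, K, D or E in the middle of a transmembrane stretch is such a strong warning sign.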
And this is the part that's now really complicated, because everything that we learned about the amino acids in the first three lectures assumed that the folding and solubility of a protein had to do with transferring amino acids from oil or vacuum, somehow, into water. And that's not true for membrane proteins. Membrane proteins stay in the membrane, without the water. So all these things I said about hydrogen bonds, for instance: that hydrogen bonding is really complicated because you can't just assume that hydrogen bonds form, and you have to compare with the reference state where they're making hydrogen bonds with the water instead. Not true for membrane proteins. If you are in the middle of a membrane, you have two choices. Either you make a hydrogen bond to a partner in your protein, or you suffer. And suffering is bad. So they're gonna do pretty much anything to create hydrogen bonds. We know quite a bit about membrane proteins, because as difficult as it has been to determine structures, people have been able to determine more and more of them. I'll spend 10 minutes on this and then I'll give you a break. One of the first proteins that we determined the structure of was bacteriorhodopsin. It's a very special protein that occurs in something called the purple membrane of certain bacteria. I don't know whether I have a bigger picture, but the reason why it's called the purple membrane is that there's so much protein in this membrane that the entire membrane is purple if you dry it. And it's the protein that's really coloring the membrane. This is the X-ray structure of it, and you can probably almost see it here. You just start to count these helices. There are one, two, three, four, five, six, seven helices per protein, and then there's a small beta sheet at, well, kind of the outside of the protein. So here's one protein. Here's a second protein and here's a third protein. All the gray stuff here is lipids, and then it keeps repeating this way.
So can you make any observations about this? The concentration of protein here, is it high? It's an insanely high protein concentration. This membrane, and we do call it a membrane, probably consists of more protein than lipids. So the lipids here are almost like small bound molecules, because here's gonna be the next protein. There are no open stretches of pure lipids like I showed you in that not quite correct movie. There's protein everywhere. So it's gonna be just as much the protein as the lipids that gives this membrane its properties. You have rhodopsin in another place too, although you're not a bacterium; rhodopsins are a fairly large class of molecules and you also have them in your eyes. So rhodopsin, not the bacterial one but the normal one, has an ability to turn light into electrons. Among these seven helices, right in the middle here, there's a so-called retinal group, which is not a side chain; just like the protoporphyrin, this is a small non-protein group that is bound to the protein. And this molecule is light sensitive, and you would not know that just from seeing it. In particular, in the middle of the molecule, there's one bond here, and this bond can be either cis or trans. And now I should remember this; I think that the trans state has lower energy and the cis state has slightly higher energy. That would at least be my bet. So what happens here is that normally the molecule would be resting in the trans state, but when a photon strikes this molecule, the photon deposits its energy into the molecule, and the molecule switches over from trans to the cis state, which is higher in energy. And then as this relaxes back down, it creates a wave of structural changes here that creates an electron transport, and this is gonna be the signal that eventually becomes a nerve signal. This leads to another question: intelligent design.
There's a traditional argument in intelligent design. Have you heard about intelligent design? The argument, among certain evangelicals for instance, is: how could nature have created something as complicated as an eye? Because you can't create an eye piece by piece. Or rather, you could create an eye piece by piece, but the argument is that there is absolutely no driving force in evolution until the entire eye is complete and you can see with it. So until the eye actually worked, it would be a waste, and evolution would rather get rid of it. And the consequence would then be that you need some sort of intelligent creator to have done this. I would argue that this is a perfect counterexample, because we see bacteriorhodopsin in tons of organisms that don't have eyes, such as bacteria. You see the entire structure, right? It's just that in bacteria, you don't have the retina and you don't have the cornea and everything. And then we just realize, you know what? In this particular structure, it makes a lot of sense. This is a perfect surrounding where we can actually bind this molecule. So the only thing you needed to actually get the light sensitivity was to take this existing fold and then bind this specific molecule. Now that, per se, doesn't create an eye. But now you have a molecule that is light sensitive, so that if you shine photons on it, it can create a signal. And initially, it was probably just a structural change. And then nature keeps refining this over billions, or millions of years at least, of evolution, right? And then eventually, you have these light-sensitive molecules. They would likely be expressed in parts of the cell that are more exposed to light. And then eventually, nature would start creating lenses and everything. So then we're gradually just making the process more efficient.
But the only part that created the light sensitivity was that we started to bind a molecule in a fold that already existed in a membrane. When people first determined the structure of this protein, it was not entirely easy to imagine how you get the light sensitivity, right? And I think this is a beautiful example: when you see the complete structure, and when you see me explaining this here, it might make sense, but we don't know that when we start to study things. And unfortunately, I would say this is where we've very much gone wrong. It's not so much gone wrong, but in many cases, science doesn't work the way we teach science. For a protein like this, we frequently talk about the structure of proteins. But remember the central dogma of molecular biology that we talked about in the beginning of the course: sequence leads to structure leads to function. But how does the structure lead to function? We can't gloss over that, right? In some cases, it's easy. If you have myoglobin, yes, it can bind oxygen here. But even for myoglobin: how does it bind oxygen? Proteins are not bricks. They will look that way in all textbooks. They will even look that way, sadly, if you download them and look at them in a molecular viewer; they will look like bricks. But this is the reason why I've included some movies. Real proteins are surprisingly floppy. They move, just like other small molecules. And just as this molecule can switch between cis and trans, this entire protein will undergo some motions. The helices will move relative to each other. They're not going to be gigantic motions. And if I were to give you two papers with the structures printed right there, you might not be able to see them. But if I show you how it's moving, you would definitely see that there is some shift in the helices.
And today, over the last few decades, we've been able to determine a whole range of different bacteriorhodopsin structures. I'm not going to go through specifically what these states are, but do you see that there is an entire cycle, so that we have a state here that gradually moves to a second state, to a third state, to a fourth state? And for many of these states, we now have several structures. You can probably also see that the overall shape is the same. You have seven helices everywhere, but here you see that this helix has moved out and then it's moving back in. So I'm not going to say the changes are tiny; there are some changes. It's moving a bit. What is happening throughout this is that the retinal undergoes a conformational change and then you relax back. So you're kind of charging and discharging the protein, just like a battery, all the time. How would you imagine determining structures like this? How do you do that? That would work if you knew what the states were. But you don't know what the states are. You might not even know how many states the protein has. And this is the part where I mean that we teach science differently from how we do science. I would say the answer to the question of how we determine them is: with great difficulty. In many cases, we don't know. This is why people spend decades on these targets. So you're quite right that we somehow need to capture either stable intermediate states or in some cases even transition states. Bacteriorhodopsin is light sensitive. So what people try to do is use femtosecond time-resolved X-ray crystallography: we use a laser to excite the protein, and then a split second, not even a split second, like a femtosecond later, we try to determine the structure of it. 
So can you somehow determine the structure when we've shone so much light on the protein that it's in the higher energy state? The first 50 times you try it, it's likely not gonna work. But then eventually you find a small trick that enables you to capture that state, and then you have one more state. And then there's gonna be a second state just a short time later, so can you find some other states? In this particular case, I think you use, let's see, this is probably more neutron detection or something, because here you have a wave number. So the wave number corresponds to frequency here, right? And at different frequencies, you can start to get information about different distances in your protein. So you constantly have to try to find either indirect methods or a way to capture the protein in some specific state. Here I'm showing this for bacteriorhodopsin, but this turns out to be the case for many proteins, say epidermal growth factors and everything, things related to tumors. Most processes in the body, whether it's cancer or anything else, will be related to proteins adopting different conformations. And here's the thing: you're not gonna see that by looking at one of them. How much of this will you understand if you do bioinformatics? Because that's kind of a problem, right? What is the sequence difference between these states? There is no sequence difference. I know, of course, bioinformatics is probably the most powerful tool we've introduced in the last quarter century, but there are limits even to what bioinformatics can do. You can't determine how the protein moves here with bioinformatics. 
Actually, we might be able to get some hints, because we might see some patterns between these interactions or something that makes us realize that they might be able to interact or that these two are related, but it's hard. For bacteriorhodopsin in particular, we've studied it so much that we know quite well both what the individual states are and how quickly it moves. Going from the resting conformation into the intermediate conformation here with the retinal, we're talking about microseconds. It takes a ballpark of 20 microseconds. From that, you can estimate what the free energy barriers are. And then when we're stuck in that high energy state, we gradually relax down to the late conformation, and that takes a ballpark of milliseconds. So which energy barrier is highest? That one, that one, or that one? Any other suggestions? The last one, right? Because this is the slowest process. Yeah. And this frequently happens. One of the reasons why this step goes fast too is that this is a higher energy state. So this would never happen spontaneously, but I'm adding energy with the photon here. While this process has to be spontaneous: it is gradually relaxing. And then we have to pump it through the cycle all the time with photons. Let's see, I think that would be the resting state. Then the intermediate state. Did you see the isomerization in the chain first? We can go through it once more. There is the start, and first we're gonna see the isomerization. Boom, there you had the photon strike it, so we got the change there. Then this creates a motion in the entire helix here, the intermediate conformation, and then in the late conformation we're gonna relax. And then we relax back there. And then we're ready to strike it with another photon. And this happened a couple of billion times in your eyes while I was speaking. We're not talking about large structural changes here, right? 
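To make the step from time scales to free energy barriers concrete, here is a minimal sketch. It assumes a simple Eyring-type relation, k = (k_B T / h) exp(-ΔG‡ / RT), with the rate taken as one over the characteristic time and a temperature of 300 K. The prefactor, the temperature, and the choice of 1 ms for the slow step are assumptions for illustration, not values from the lecture slides, and the photon-driven step is not really a thermal barrier crossing, so the comparison between the two numbers is more meaningful than either absolute value.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R = 8.314462618      # gas constant, J/(mol*K)


def barrier_kj_per_mol(time_s, temperature_k=300.0):
    """Eyring-type estimate of a barrier from a characteristic time.

    Assumes k = (k_B*T/h) * exp(-dG/(R*T)) with k ~ 1/time_s.
    """
    rate = 1.0 / time_s
    prefactor = K_B * temperature_k / H
    return R * temperature_k * math.log(prefactor / rate) / 1000.0


fast_step = barrier_kj_per_mol(20e-6)  # ~20 microseconds
slow_step = barrier_kj_per_mol(1e-3)   # ~1 millisecond (assumed ballpark)

print(f"fast step barrier: {fast_step:.1f} kJ/mol")
print(f"slow step barrier: {slow_step:.1f} kJ/mol")
print(f"difference:        {slow_step - fast_step:.1f} kJ/mol")
```

The slower step comes out with the higher barrier, and the difference depends only on the ratio of the two times (RT ln 50 here), not on the assumed prefactor.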
And in particular, if you're now gonna target this with a drug, what is the difference in binding between that state and that state? We're talking about fairly small changes, but this is so specific that you're gonna get drugs that bind to just one of these conformations, but not the other one. It's 10:26, it's a great time for a break. After the break, I'm gonna talk more about pumps and channels, which is even closer to what we're doing: your engines. What is it that keeps you going? How are you different from a bacterium? You're not quite as beautifully efficient as bacteria, but fortunately you have some other abilities. And I'm actually not kidding. You might think that you're a higher organism, and we tend to call you higher organisms, but a bacterium is a lean, mean living machine. They're amazingly efficient. From evolution's point of view, the only point of a human is really for your DNA to replicate itself. You're just a necessary step for your DNA to replicate itself. The bacterium accomplishes that in 10 minutes. Doesn't take 30 years. But on the other hand, evolution has created us too. Evolution has created different organisms that can accomplish different things. And humans in particular, eukaryotes, even mammals, vertebrates, we have a nervous system, which is kind of nice. It's the reason you can take this course, for instance. And the nervous system is very much dependent on membranes. So some of the most important membrane proteins are related to your nervous system, both driving the nervous system and executing its function. So in all cells in your body, and now we're not necessarily talking about bacteria but human cells, let's forget about these other pieces for a second. Across the membrane, all living cells have an excess of negatively charged ions on the inside at equilibrium. 
And that means that you can have a negative potential on the inside of the cell in the ballpark of 100 millivolts or so, minus 0.1 volts. There's minus 0.1 volts on the inside compared to the outside. Is that at equilibrium? Well, it's a stationary process, right? Because when you die, this potential will eventually disappear as the ions diffuse. So this is something the body has to spend energy to maintain. You can't have an equilibrium where we have sorted all the negative ions into one part of space. So it's a stationary state or something, and the only way we can uphold it is that we're constantly pumping ions. And the reason why we pump these ions is this: if you have an excess of negatively charged ions and potassium on the inside, and we then have different types of holes here that under some conditions we can open up, say in this case a potassium channel, a K+ ion channel, then since there is an excess of potassium ions on the inside, they will spontaneously move out the second we open the hole, right? And if you now have positive ions moving in one direction here, that's gonna change the potential on the inside. In the same way, we might have another ion channel here, right? There we have an excess of sodium on the other side. If we just open this one up, well, in general, the sodium is gonna flow to the inside. So if we have prepared a state where we have different concentrations of various ions on the outside versus the inside of the cell, I can very quickly control things here by just opening windows or doors, and that will equilibrate the concentrations of the ions. This is how your nervous system works. But before we get to those channels, we're gonna need some sort of process that makes sure that we have different concentrations of ions on the inside and the outside. And this is the purple thing we have in the middle here, which is a pump. And this pump has many names. You can call it NKA. 
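The link between concentration differences and a membrane potential of roughly minus 0.1 volts can be sketched with the Nernst equation, E = (RT / zF) ln([out]/[in]), which gives the potential at which a single ion species would stop flowing through an open channel. The concentrations below (in mM) are typical textbook ballparks for a mammalian cell and are assumptions for illustration, as is body temperature at 310 K; they are not numbers from the lecture.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_mv(conc_out_mm, conc_in_mm, charge=1, temperature_k=310.0):
    """Nernst potential (inside relative to outside) in millivolts."""
    volts = (R * temperature_k / (charge * F)) * math.log(conc_out_mm / conc_in_mm)
    return 1000.0 * volts


# Assumed ballpark concentrations in mM (outside, inside).
e_potassium = nernst_mv(5.0, 140.0)   # K+ is concentrated inside
e_sodium = nernst_mv(145.0, 12.0)     # Na+ is concentrated outside

print(f"K+  Nernst potential: {e_potassium:.0f} mV")
print(f"Na+ Nernst potential: {e_sodium:.0f} mV")
```

With these assumed concentrations the potassium value lands close to the minus 100 mV ballpark, while sodium comes out positive, which is why opening a sodium channel lets sodium flow in and shifts the potential in the opposite direction.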
To me it is the sodium-potassium pump, or Na+/K+-ATPase. And the reason why it's called an ATPase is that it uses energy from ATP. The difference is that the green ones here are channels. Channels are simple, but you could also argue that they're pretty stupid. You're literally just opening a door. It might be a very selective door that only lets through one type of ion, but it's completely passive. A pump can push things against the concentration gradient. So even though we already have too many potassium ions here on the inside, I can still take potassium ions and move them from the outside, where they would like to be, and force them to the inside. And for that to happen, we need to add energy. We occasionally also abbreviate this as NKA and everything, and you're gonna see lots of versions of this. When I was a bit older than you are, when I did my PhD, was roughly when we got the first structures of this pump. This is an amazing structure. Jens Skou in Denmark got a Nobel Prize for this in 1997, when I was halfway through my PhD. I'm not gonna go into detail about the different types of ATPases. What this one does is that it takes sodium ions on the inside and moves them to the outside. At the same time, it takes potassium ions from the outside and moves them to the inside. So it's moving ions in different directions at the same time, and it's moving three positively charged ions out and two positively charged ions in. So when I do this, I'm only moving positively charged ions, but since I'm moving more positively charged ions out than in, I'm effectively creating a negative potential on the inside. And this too, just like bacteriorhodopsin, goes through a whole sequence of events. What drives all these events is that you take ATP and cut off one of the phosphate units, turning it into ADP. So we're using energy to drive the pump through the whole series of states. 
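The bookkeeping of the pump can be written out as a toy tally: per cycle, three Na+ go out, two K+ come in, and one ATP is converted to ADP. The class and function names are hypothetical, and the sketch deliberately ignores everything else (leak channels, the different conformational states, kinetics); it only shows why moving positive ions in both directions still leaves a net negative charge on the inside.

```python
from dataclasses import dataclass


@dataclass
class PumpTally:
    """Running totals for an idealized Na+/K+ pump (hypothetical helper)."""
    na_out: int = 0
    k_in: int = 0
    atp_used: int = 0

    @property
    def net_positive_charges_out(self) -> int:
        # Each ion carries one positive charge, so the net export is the difference.
        return self.na_out - self.k_in


def run_cycles(n_cycles: int) -> PumpTally:
    """Apply the 3 Na+ out / 2 K+ in / 1 ATP stoichiometry n_cycles times."""
    tally = PumpTally()
    for _ in range(n_cycles):
        tally.na_out += 3
        tally.k_in += 2
        tally.atp_used += 1
    return tally


tally = run_cycles(1000)
print(tally, "net positive charges exported:", tally.net_positive_charges_out)
```

Every cycle exports one net positive charge at the cost of one ATP, which is the sense in which the pump creates a negative potential on the inside.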
Let's see, oh yeah, we even have a structure over there. This is starting to be a fairly complicated structure, right? This part would be the transmembrane part. And then we have an alpha and a beta domain here. So there are pretty much one, two, three separate big units outside of the membrane. Exactly what these do is complicated. And then it turns out, did I say that there was one state? There are one, two, three, four, five, six, seven, eight states at least. And then there are substates in some of them. There is a famous group in Aarhus led by Poul Nissen, and I think they've determined something like 11 or 12 different structures of these. And they keep finding new structures. This might sound excessive, but the whole point is that there are super tiny details in exactly how these domains move between different states. What determines whether you have the ions bound here or there? What are the specific properties of these binding sites? Why on earth would that be important? We spoke about this in one of the first lectures in the course. How much ATP do you use per day in your body? It's like 60, 70 kilos or so. So it's an insanely intense process. It goes on all the time, everywhere in your cells. What do you think would happen if you had a small mutation here that caused this process to be less efficient? Yes, if you're lucky. But I mean, you can imagine, if we start influencing this process, it's going to be very severe. And we actually have colleagues at SciLifeLab who are studying this, not with me, using molecular simulations. So now, let's see. We have a bunch of different structures here. Yes, and today, we even know lots of the specific binding sites. The other way we can study this is with microscopy, because these pumps are going to be present everywhere in your nerve cells, in particular close to the areas where we have lots of the channels expressed. 
We know that there are differences in where they are expressed. We still don't know why. And even quite recently, up at Karolinska here, there have been cases where we've found mutations in a child. It's actually a very sad story. The child didn't live in the end. It was a child that died when it was six to nine months old, and it was completely healthy the first few months. Then when the child was a few months old, they realized that it had some sort of disorder of the nervous system, and everything reacted in strange ways. And then it deteriorated and deteriorated. While the child was still alive, they actually did whole genome sequencing on the baby. And then they found out that there was one small mutation in the sodium-potassium pump. At that point, again, we still can't cure it. But we found that one mutation. And the point is not just that the child died. If the child had died as a fetus, then it would have been hard or easy, depending on how you see it, right? Because then there would not have been anything you could do. But the point is that the child was healthy when it was born. So if you could find a way to rectify these mutations or compensate for them, right? You could cure this type of disease. And there is an increasing number of these very severe diseases that we actually can identify with whole genome sequencing. This is one of the reasons why at SciLifeLab we even have a sub-facility for clinical genomics where, if it's urgent enough, you can do whole genome sequencing of a patient within days to find that error, the mutation, and see if we can treat it. Today, there are very few of these diseases we can treat, but give it 10 more years and I think we will be able to. Magnus and the team at our place, and several others, have been able to do lots of simulations of this. 
And in many cases, we can actually understand what happens with these pumps when they go from one of these states to the next state, and how things are being pushed. What is it in this binding site that causes this specific ion to bind in one state or the other? When it goes through this entire cycle, why will it bind the potassium in one state? Why will it bind the sodium in another state? When will it release the potassium? When will it release the sodium? So what Magnus is working on right now, for instance, is trying to understand these specific mutations. What happened? Why is it that this specific mutation caused the protein to malfunction? Because it didn't kill the protein. If it had killed the protein completely, then the child would not have lived. So what do you think could have happened? It is pure speculation. Yep. What likely happened here is this: you have these eight states, and remember that the free energies correspond to differences between states, right? This mutation likely took one of the states and made it either more favorable or less favorable. If you make something more favorable, you're gonna get stuck there. Or it influenced a transition pathway, so that it became more difficult for the protein to move to the next state. Exactly what happened, we don't know. And this kind of led to two things. One is that the entire transport of ions was not efficient; it likely wasn't efficient. And it's not so much that you need more ATP, I think, but as a child grows, in the first few months of your life, the entire nervous system undergoes an exceptionally rapid development, right? And if your nervous system doesn't really work while it develops, it's gonna lead to other disorders and everything. But just as these molecules are so fragile that individual mutations here can cause things to malfunction, you could probably also rectify that. 
Maybe by introducing something in the membrane: if a state has become unstable, can we stabilize that state by adding a drug? Or if a state has become too stable, can we destabilize it by adding a drug or something? So we see more and more highly targeted approaches, where we would like to change something by binding specifically. And there are quite a few binding sites here, right? And you have no idea where we should bind. Right now we don't even know what the mutation is doing. I'll come back to the ion channels shortly. So that leads to another question. If you believe that this is a membrane protein, how do we get it into the membrane? So you take that thing and just push it into the ER? It's not easy, right? Because this part would have to be transmembrane. So is this part gonna be stable in water? Okay, so you can't fold this in water. That would mean that we need to insert the entire thing and then everything would fold in the membrane. But this part is not gonna fold in the membrane. This part likes to be in water. It's not entirely easy to understand how membrane proteins form. Where do you even produce these proteins? Well, right now your guess is as good as mine. It won't be in two slides, because I actually know the answer. So let's start with the first step. The sequence, the chain of amino acids here, where is that produced? In the ribosome. All proteins are produced in ribosomes. So somehow we need to get from the ribosome into a state where we have this part in the membrane and some other parts in water. We have learned a lot about this in the last two decades, and this is part of why I'm deviating a lot from the book. There are gonna be a couple of fundamental models here. One of them I should have mentioned when I showed you this movie. Remember that movie where we saw that the entire membrane was really liquid? There is a famous model that we use to describe that, called the Singer-Nicolson model. 
Singer, S-I-N-G-E-R, and Nicolson, N-I-C-O-L-S-O-N. You're not gonna remember those names, so the better name is the fluid mosaic model. It's a mosaic in the sense that you have a patchwork of different pieces. You have protein domains, you have sugars, you have cholesterol and lipids, but it's also fluid. They will move around. It's a two-dimensional liquid, literally. But that describes how the membrane behaves when everything is already folded and you have the proteins diffusing around in it. Imagine having a liquid spilled out on the table, where the proteins move around in the membrane. The other part is gonna be: how do we insert this? And there was a famous experiment done some 20 to 30 years ago now by Jean-Luc Popot and Don Engelman. It's called the two-stage model, or the Popot-Engelman model. And now we're gonna do this dirt simple, and I will deliberately ignore the ribosome. Sorry. The two stages here are that they argue that proteins first insert helix by helix and then they aggregate. So the idea is that we start with a helix, and for now we're not gonna worry about exactly how it is inserting. That helix will go into the membrane or not. If it's a very hydrophobic helix, it should go into the membrane. If it's a hydrophilic helix, it should not go into the membrane. So the decision whether this helix goes into the membrane or not just depends on the hydrophobicity of the helix. And that's an individual helix. Then when we have many such helices, they will spontaneously find each other in the membrane and aggregate, and that's the second step. That's a very simplified model. The question is, is it true or not? Can you come up with any way of testing this? So they did a famous experiment. If this is correct, forget about the insertion here. 
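The first stage of the two-stage model, where insertion of an individual helix depends only on its hydrophobicity, can be sketched with a standard hydropathy scale. The sketch below uses the Kyte-Doolittle scale and averages it over a candidate segment. The cutoff of 1.6 is a commonly quoted rule of thumb for windows of about 19 residues and is an assumption here, as are the two example sequences, which are made up for illustration.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle hydropathy over a peptide segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def likely_inserts(segment, cutoff=1.6):
    """Crude stage-one decision: hydrophobic enough to stay in the membrane?

    The cutoff is a rule-of-thumb assumption, not a measured threshold.
    """
    return mean_hydropathy(segment) >= cutoff


# Hypothetical 19-residue segments, invented for this example.
hydrophobic_helix = "LLIVALLAVLLGIVLALLA"
polar_helix = "DKEESRKDNEQSKRDEEKS"

for name, seq in [("hydrophobic", hydrophobic_helix), ("polar", polar_helix)]:
    print(name, round(mean_hydropathy(seq), 2), likely_inserts(seq))
```

This captures only the simplified picture described above; it says nothing about the second stage, where inserted helices find each other and pack.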
This is a complicated part, but if this is true, I should be able to take an existing membrane protein, cut it into pieces, and then it should somehow be able to reassemble. So what they did is they took bacteriorhodopsin, which has seven transmembrane helices, and they cut it in two parts. They cut the gene in two parts. So you have one gene that now expresses three helices and another one that expresses four helices. These are completely separate genes. And then when they created a cell that does not express the original bacteriorhodopsin, but only expresses these two half genes, they could show that the protein still has activity. So somehow, even though this is no longer one chain but two separate parts, they manage to diffuse around and spontaneously find each other and aggregate in the membrane. I think Don is worth a Nobel Prize for it. Many of the Nobel Prize predictions I have made in this course have come true, but this one still hasn't, and it's getting a bit old, but I hope he gets one for it. The other question: fine, the previous picture was a horrible simplification. A real protein actually has loops here, right? Which way is it gonna insert? Should the N and C termini of these four helices be on the outside or the inside of the cell? Is it that way or that way? It's gonna depend, and that's another famous result, by my colleague Gunnar in this department actually. Already as a postdoc, Gunnar realized, just by sitting and looking at lots of sequence alignments, that there are charges in membrane proteins, but they're always in loops. 
And if you look at enough of these, funnily enough, the positive charges always seem to occur on the inside of the cell, while there are more negative charges on the outside of the cell, which incidentally also fits very well with the membrane potential, because the membrane potential is more negative on the inside of the cell. So this is called the positive-inside rule, which is Gunnar's very famous result. I think Gunnar is worth a Nobel Prize too, but he's not gonna get one because he is on the board of the Nobel Foundation. And I'm somewhat biased here because he's a close colleague. Can you imagine what Gunnar did to prove this? You can do something much simpler than that. You just swap the places of the charges, right? Put the positive ones where the negative ones used to be, and put the negative ones where the positive ones used to be. And what do you think happened with the protein? Suddenly it inserts the other way. And if this is a pump that used to pump things out, now it's gonna pump things in instead. It's a super simple experiment that tells you something fundamental about the biology of membrane proteins. We already touched a little bit on this, of course, but what we haven't really described is why the helices start to interact. And I think I mentioned the GxxxG motifs, didn't I? What Don did was to start looking at these helices, and in particular at the crossing angles that we covered yesterday. It turns out that some residues are much more common than others at the crossing points, in particular glycine. And the reason why glycine is important here is that glycine doesn't have a side chain. So if you're gonna put two things together, right, take that, and if you really want these helices to get close, well, removing some of those ridges is likely a good idea. And you remove the ridges here by having a glycine, because effectively it doesn't have a side chain. 
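The positive-inside rule lends itself to a very small sketch: given the loops of a membrane protein in order, with consecutive loops on alternating sides of the membrane, count the lysines and arginines on each side and predict that the side with more positive charges faces the inside of the cell. The function names and the loop sequences are hypothetical, invented only to mimic the charge-swap experiment; real topology predictors use considerably more information than this.

```python
def count_positive(loop):
    """Number of positively charged residues (Lys, Arg) in a loop."""
    return loop.count("K") + loop.count("R")


def predict_orientation(loops):
    """Apply the positive-inside rule to loops listed from N- to C-terminus.

    loops[0] is the N-terminal tail; consecutive loops lie on alternating
    sides of the membrane because each helix crosses it once.
    """
    n_side = sum(count_positive(loop) for loop in loops[0::2])
    other_side = sum(count_positive(loop) for loop in loops[1::2])
    if n_side == other_side:
        return "undetermined"
    return "N-terminus inside" if n_side > other_side else "N-terminus outside"


# Hypothetical loops for a three-helix protein.
original = ["MKRK", "DSE", "RRKG", "EDN"]
# Swap the charged loops between the two sides, as in the experiment described.
swapped = ["DSE", "MKRK", "EDN", "RRKG"]

print(predict_orientation(original))
print(predict_orientation(swapped))
```

Swapping which loops carry the positive charges flips the predicted orientation, which is the logic of the experiment described above.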
And if you now have a glycine here, and then four positions later you have another glycine, so you have glycine, then any three other residues, and then a glycine, these glycines will effectively be placed right next to each other on the helical surface. And that means that you're now gonna have a patch on this helix that is depressed here, and a patch on that helix that is a depression there, and they're gonna love to bind right next to each other. When people found this out, when I was a postdoc or a young aspiring assistant professor, we were so excited about this. Because finally we had started to find the first patterns that describe how membrane proteins aggregate and how membrane proteins pack. Yes. Sorry, why? So remember the crossing angles we spoke about, right? What determined how helices pack had to do with these ridges and depressions, which had to do with the side chains. So if I would like two helices to bind to each other, I can't have a large side chain here. If I have a large side chain, it's gonna bump into the other helix. Normally you can pack helices reasonably well, but if I would like two helices to pack even better, let's remove some of the ridges just in this position where they wanna cross over here, right? And the best way to remove something is by not having a side chain there. So having a glycine. But a single glycine is just one position. It's not really large enough. So I pretty much wanna take this glycine and then the next glycine too, so that I have the two positions in the ridge that are right next to each other. But since the chain winds around the helix, the position that's gonna be right next to this G is the residue four residues away. 
So if I take two positions in one helix that are right next to each other in the ridge, four residues apart, and I make both of those glycines, then I will effectively have a small depression, almost like a valley, in that helix. So this will bulge in a bit. And then I have a second helix that has the same valley, and they're gonna be able to get really nice and close to each other. They will pack really well there. When we first found this out, the entire community was so excited. Please remember that all the normal rules don't apply to membrane proteins. Membrane proteins don't fold because you're turning the hydrophobic parts away from water or anything. I'll come back to that in a second. And when we found this out, we thought that we had now started to find the first patterns that predict membrane protein packing, and that now we'd be able to predict membrane protein structure. It's even cooler, because Don also realized what happens if you add some polar residues here. Let's say that you can't take an arginine, but say you put a serine or threonine here; asparagine is probably better. So that's a polar residue. Will the polar residue like to be in a membrane? No. If it was charged, it would never even insert in a membrane, but a polar residue you can insert in a membrane. It's not gonna love it, but it can. So now you have this helix with a polar residue and you have that helix with a polar residue. And you might not know your asparagine, but asparagine can form hydrogen bonds. So when you have this helix in isolation and that helix in isolation, they're both miserable, because they can't form a hydrogen bond until you bring them together, right? Then they can form a hydrogen bond with each other. Normally you would have hydrogen bonds formed with water. In this case, the hydrogen bonds would form between the membrane protein helices. It's almost the opposite pattern. 
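The glycine pattern described above, a glycine, any three residues, then another glycine, is easy to search for in a sequence. The sketch below finds every such GxxxG motif (including overlapping ones) and also shows why a spacing of four puts the two glycines on the same face: with roughly 100 degrees of rotation per residue in an ideal alpha helix, four residues correspond to 400 degrees, which is only 40 degrees away from the starting direction. The 100 degree figure is the usual idealized value, and the example sequence is made up.

```python
import re


def find_gxxxg(sequence):
    """Start positions (0-based) of all GxxxG motifs, overlaps included."""
    # The lookahead lets motifs that share a glycine both be reported.
    return [match.start() for match in re.finditer(r"(?=G...G)", sequence)]


def angular_offset_deg(separation, degrees_per_residue=100.0):
    """Angle between two residues around an ideal helix axis, folded to 0-180."""
    angle = (separation * degrees_per_residue) % 360.0
    return min(angle, 360.0 - angle)


# Hypothetical transmembrane segment, invented for this example.
helix = "LIGAVLGVIGALLG"
motif_positions = find_gxxxg(helix)
print("GxxxG starts at:", motif_positions)

for sep in (1, 2, 3, 4):
    print(f"residues i and i+{sep}: {angular_offset_deg(sep):.0f} degrees apart")
```

Among spacings of one to four residues, four gives the smallest angular offset, so the two glycines end up almost directly above each other along the helix, forming the small valley described above.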
So when we first found this out, we were really certain that we had started to decipher things. How many more patterns like this do you think we've discovered since? No, exactly. And I have to admit, it's probably past the statute of limitations by now: I've even written research applications where I kind of argued that this is easy. We're gonna predict the packing of helices. It wasn't. It's still a beautiful result, but as a field, we went wrong. This does not explain how membrane proteins interact. These motifs are still super important. If you find that motif, there will be two helices packed, but it doesn't describe everything. And I think part of the reason why we went wrong is that we made the mistake of thinking that the things we see are everything that exists. A few decades ago, in the early 1990s or so, the proteins we knew of in membranes were the simple ones, like bacteriorhodopsin. Simple, normal helices that span the bilayer, right? And in that case, it's only gonna be a matter of understanding the packing of those helices and we're done. We will be able to predict membrane protein structure. And then there was the famous exception, aquaporin, 1997, Peter Agre, who got a Nobel Prize for it. Do you see here? Well, here you have one helix, the blue one, and then it turns back and goes out. So it's so-called re-entrant. And then you have a second helix here that comes in. I think I can kind of hand-wave my way through this: these two effectively create one long helix, because these residues here are gonna form, not peptide bonds, they're gonna form hydrogen bonds with the end of the red helix. So it is effectively one helix, it's just that it's a bit broken. But the problem is that the more membrane protein structures we determine, the worse it gets. The glutamate transporter: you see, you have lots of re-entrant regions, and even helices lying horizontally. 
The better we've gotten at determining complicated membrane protein structures, the worse the structures we see. And that also means that those simple rules that we thought could explain membrane proteins are no longer true. The packing of membrane proteins is probably even more complicated than for globular proteins, which is a challenge because these are by far the most important pharmaceutical targets. So remember that I said, for a globular water-soluble protein, what type of residues do you have on the outside? Sorry? Yes, polar. Sorry, I was thinking hydrophilic. Polar is fine. And what residues do you have on the inside? Hydrophobic, good. And on a membrane protein on the other hand, what residues do you have on the surface? Hydrophobic. So what residues do you have on the inside of a membrane protein? Shit. This usually works. You're supposed to say polar. That's what we all thought for so long, right? It makes sense. The only problem is that it's wrong. It's completely wrong. If you do the bioinformatics statistics, everything on the inside is hydrophobic. It doesn't matter whether residues are exposed to the membrane or buried, it's all hydrophobic. And that means that it's gonna be pretty difficult. You can't predict which ones are exposed and which ones are buried, right? You're not gonna be able to predict that from looking at those curves. They're all exactly as hydrophobic. They're almost always helices. There are a few proteins that form barrel-like pores, where you can have an entire beta sheet wrapping around, but for now you can forget about them. So it's virtually always helices in the middle of the protein. And the only weak signal we have is in terms of entropy or information content, sorry, this should be entropy, not information content. The residues exposed on the surface of a membrane protein tend to be slightly more varying. 
While the ones that are buried here in the center are more conserved. And we can see that just by looking at a multiple sequence alignment and seeing how much variation there is. And this makes sense, right? Because if you make a mutation out here on the surface, it's not gonna influence that much. So somehow it's all gonna be based on packing of hydrophobic residues. And this is pretty much Lennard-Jones interactions. Super weak, delicate interactions. So predicting this packing is pretty darn close to impossible. There are new methods, and I might have mentioned this, in particular ones based on correlated mutations, where we've finally gotten to the point where we can predict some interactions in membrane proteins, but anything you try to do for globular proteins is an order of magnitude more difficult to do for membrane proteins. And in bioinformatics in particular, there is one specific reason for that. We simply have less information. We have far fewer membrane protein structures. And because we have fewer membrane protein structures, there is less data to train our methods on. And if we have less data to train them on, we're not gonna do as well when we predict things. So, having said that, I'm gonna take you through the slightly more modern picture, because today we actually know how this happens for slightly more realistic proteins. We have the ribosome, and I love this. This is an old slide by now; nowadays we know what the ribosome structure is. But there's a point here, and you don't need to know the structure. The ribosome will attach to a large protein that sits in the membrane, which is really a protein channel. It's called a translocon. We have the structure of it there, from Van den Berg in 2004. The ribosome will bind to the top of this translocon. You see it from the top in the lower panel. And the translocon here pretty much has two gigantic arms.
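As a rough illustration of the conservation signal mentioned above (buried positions conserved, lipid-exposed positions more variable), here is a minimal Python sketch that computes the Shannon entropy of each column in a multiple sequence alignment. The toy alignment, the function names, and the choice to ignore gap characters are all assumptions made for illustration; none of them come from the lecture or from a particular bioinformatics tool.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of one alignment column, ignoring gaps ('-')."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(alignment: list[str]) -> list[float]:
    """Per-column entropy for a list of equal-length aligned sequences."""
    return [column_entropy("".join(col)) for col in zip(*alignment)]


# Hypothetical toy alignment: column 1 varies, the other columns are conserved.
toy_alignment = ["LAVG", "LIVG", "LSVG", "LTVG"]
entropies = alignment_entropies(toy_alignment)

for position, entropy in enumerate(entropies):
    label = "variable" if entropy > 1.0 else "conserved"  # arbitrary cutoff
    print(f"column {position}: {entropy:.2f} bits ({label})")
```

Low-entropy columns would be the candidates for buried positions and high-entropy columns the candidates for lipid-exposed ones. As the lecture stresses, this is only a weak signal, and the 1-bit cutoff here is arbitrary.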
And that means that between the blue and the green helix there, this translocon can actually open up and let helices out into the membrane. And it depends on how hydrophobic the helices are. If the helix is very polar, it's just gonna shoot straight through the translocon and go out into the protein, sorry, into the cell here. On the other hand, if it's a very hydrophobic helix, it's gonna get stuck in the translocon for a while. It hangs around here, and suddenly the translocon opens and then it's gonna move out and become a membrane protein helix. So it's the translocon that pretty much selects things and lets them out into the membrane. And one of Gunnar's students a few years ago, Tara Hessa, together with Gunnar, did a beautiful study where they tried to determine, depending on the sequences you introduce in these proteins, how likely it is for things to go out into the membrane. Another beautiful experiment. This is a super complicated process to measure, right? And as always, you need to find a simple model. So the simple model is that they took a small protein called leader peptidase. It doesn't matter what it does, it's just three helices. The first two helices are highly hydrophobic, so we definitely know that they will go into the membrane. And then let's study the third helix there: depending on what residues we put in that helix, we wanna say whether it will go into the membrane or not. And an easy way to measure that is to have glycosylation sites before and after the helix. Depending on whether these end up in the lumen or the cytoplasm, you're gonna get a signal from them or not. If they are in the lumen, you're gonna get the signal, and if they're in the cytoplasm, you're not. And if you just measure how much you see of glycosylation site one and how much you see of glycosylation site two, you can say whether this inserted to 1%, 50% or 99%.
And what you get is bands like this, and then you measure the intensity of the bands. And here is where your statistical mechanics knowledge comes in. If you know the frequency on the inside versus the frequency on the outside, you know your two probabilities in the Boltzmann distribution, right? And then you invert the Boltzmann distribution. The relative fraction is gonna be e to the power of minus delta G over RT, so delta G is minus RT, or kT if you like Boltzmann's constant, times the logarithm of that ratio. It's exactly the inversion of the Boltzmann distribution between the two states. And all you need is the fraction of site one versus the fraction of site two here. If you have that, you can get the apparent difference in free energy between insertion and no insertion. It's an amazingly beautiful experiment, because just by looking at bands on gels you can now get delta Gs in kcal per mole for the insertion. Do you see the point of understanding your statistical mechanics and then creating super simple models? So they then tested this with a bunch of different amino acids in the helices, adding pretty much one at a time. And then you get this hydrophobicity scale, but it's a biological hydrophobicity scale. It's not just the solubility in oil. It's a real cell inserting this with the real machinery in the cell. How much do you gain from inserting, say, isoleucine in a membrane, versus how much do you pay to insert lysine or glutamic acid? And then you end up with the cost for all of them. Superficially, the scale looks really beautiful. The hydrophobic ones are really good to insert. The hydrophilic ones are expensive to insert. But you see that there is something odd with the scale here. We're predicting that it would cost just three kilocalories per mole to insert a charged membrane, sorry, a charged residue in a membrane. And this is quite fun, because it has almost led to some fights at some Biophysical Society meetings.
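The Boltzmann inversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the temperature of 298.15 K, and the labelling of the two measured band fractions as "inserted" and "not inserted" are mine, not taken from the actual experiment. The relation itself is the standard two-state one, delta G = -RT ln(K) with K the ratio of the two populations.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def apparent_delta_g(frac_inserted: float,
                     frac_not_inserted: float,
                     temperature_k: float = 298.15) -> float:
    """Apparent insertion free energy (kcal/mol) from two band fractions.

    Inverts the two-state Boltzmann distribution:
        frac_inserted / frac_not_inserted = exp(-dG / RT)
    The temperature default is an assumption for illustration.
    """
    if frac_inserted <= 0 or frac_not_inserted <= 0:
        raise ValueError("both fractions must be positive")
    k_app = frac_inserted / frac_not_inserted
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(k_app)


# Hypothetical band fractions: 1%, 50% and 99% insertion.
for inserted in (0.01, 0.50, 0.99):
    dg = apparent_delta_g(inserted, 1.0 - inserted)
    print(f"{inserted:.0%} inserted -> dG_app = {dg:+.2f} kcal/mol")
```

With these numbers, going from 1% to 99% insertion only spans roughly plus or minus 2.7 kcal/mol. That shows how narrow the window is that a gel-band measurement like this can resolve, since anything far outside it looks like 0% or 100%.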
We know that it should cost, ballpark, almost 20 kilocalories per mole to insert a charged residue. That has to do with physics. And here we're saying that it's like three. And quite a few people, Steve White in particular, argued that the explanation for this is that we have a translocon, right? Nature is not stupid. Nature doesn't just insert it directly. We are inserting things through the translocon. So up at A here, we would have the ribosome. And depending on whether things go into water or not, the point is that we don't go directly from A to C. We are going from A through the translocon, B, and then out to the membrane. Can you spot any shortcomings with my argument there? So there is a quote here, and this is an actual Simpsons quote. So what's the problem with this argument? We talked about state variables before. Free energy is a state variable. And what do we mean by that? It means that a state variable can only depend on the state, not on how you got to the state. So the free energy of C relative to A can only depend on A and C. It can't depend on B. Because assume that it cost me 100 to go directly from A to C, but going from A to B to C would only cost me 50. Okay, then I would go from A to B to C, which costs me 50, and then I would move directly back from C to A, and I just gained 100 back. Good, plus 50 net, and I'm back in A. And then I can repeat this, and it would be a perpetuum mobile. I would gain 50 kilocalories per mole every time I made one loop. It's impossible. It's impossible to violate the laws of thermodynamics. So it can't depend on the translocon. The translocon can change the speed with which it happens, but the translocon cannot change how stable a membrane protein is. And, apart from jokes about the Simpsons, that brings us to a much, much, much more complicated question.
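The state-variable argument above can be written out as a tiny Python sketch. The free energy values assigned to A (ribosome), B (translocon) and C (membrane) are made-up numbers for illustration only; the 100 and 50 are the hypothetical costs used in the argument itself. The point is just the bookkeeping: if each state has a single free energy, every route from A to C costs the same and every closed loop sums to zero.

```python
def path_cost(free_energy: dict[str, float], path: list[str]) -> float:
    """Total free energy change along a path, summed step by step."""
    return sum(free_energy[b] - free_energy[a] for a, b in zip(path, path[1:]))


# Hypothetical free energies (kcal/mol) of the three states.
G = {"A": 0.0, "B": 30.0, "C": 20.0}

direct = path_cost(G, ["A", "C"])               # straight into the membrane
via_translocon = path_cost(G, ["A", "B", "C"])  # through the translocon
closed_loop = path_cost(G, ["A", "B", "C", "A"])

print(direct, via_translocon, closed_loop)  # same cost both ways, zero around the loop

# The path-dependent claim from the argument: 100 directly, 50 via B.
claimed_direct_cost = 100.0
claimed_translocon_cost = 50.0
# Go A -> B -> C (pay 50), then C -> A directly (get 100 back).
net_gain_per_loop = claimed_direct_cost - claimed_translocon_cost
print(net_gain_per_loop)  # free energy out of nothing on every loop: impossible
```

The intermediate state B can be as high or low as you like; it changes the barrier along the way, and therefore the rate, but it drops out of the A-to-C difference.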
Why are proteins even stable inside membranes? So the concept of saying that it's all hydrophobicity, well, based on this biological hydrophobicity scale, it might not strictly be true. For some of these, for the arginine or something, you could even argue that, okay, maybe the arginine slides halfway out, and that's why it's not so expensive to insert. But it's worse than that. For leucine, we don't gain as much from inserting the hydrophobic residues in the membrane as I would expect. And there's no way a leucine helix will slide halfway out, because it would prefer to be in the membrane, right? So there's something we don't understand about membranes. Yep, for whatever reason, it's almost as if a membrane is not as hydrophobic as oil. Right, the entire scale is compressed compared to what I would guess from inserting things in oil. And based on all this training, I'm now gonna give you a super advanced quiz. Here's one helix and here is another helix: which one of those is soluble in water and which one goes into the membrane? And just to make your life a little bit easier, we will add some charges there. So who believes that the left one is a membrane protein helix? Raise your hand. And who believes that the right one is a membrane protein helix? So the point is that both of them are in the membrane. And without this one, you would be dead, because this is the part that senses voltage in voltage-gated ion channels. And the reason why I'm bringing it up is that this was one of the outliers in the hydrophobicity scale, where we couldn't understand it. Gunnar's team tried to measure this, and these helices definitely do insert. This helix, let's see, there are four of them. And every time the voltage changes over the cell membrane in nerve cells, this helix will move up and down and force this hole to open or not.
You can open or close a channel depending on the voltage across it. Which is kind of important, because this is how all your nerve signals work. This is how every single heartbeat works and everything. If you want something to move in an electric field, it makes all the sense in the world to put lots of charges in it, because a charge in an electric field will experience a force. And I don't know any other way to get a force from an electric field. There is only one small complication. What on earth are they doing in membrane proteins? It contradicts everything we know about membrane proteins. I think, yes. And you don't even need something as large and complicated as this one. This is a human structure, but you see that this is one chain, and this is one chain. Here you see the coloring based on chains, right? To make life simpler, if this is the voltage-sensing part, can we just try to cut out that part? And we actually can. Four helices. In the blue helix there in the rear, you see that all these blue ones are positively charged residues and the red ones are negatively charged ones. That is a machine. This voltage sensor will itself change states when you change the voltage across the membrane. You've probably all heard of nanotechnology, right? And that's how amazing these miniature machines are. The reason why people call it nanotechnology is that it's smaller than a micrometer. So most nanotechnology machines might be 900 nanometers. This is a nanomachine that is five nanometers. It's pretty impressive to have something that can do that, and we're talking about just four helices. And it pretty much never goes wrong. You can test how good your arm is at reacting: I rarely move that arm by mistake. The other question, and this is an example of a question I got over the break, is about when you can trust membrane protein structures.
So this is an example, not this particular one actually, but the structure you had in the previous slide. In some of the very first structures we got of these voltage-sensing proteins, with this helix, the authors couldn't even believe that the helix was transmembrane. So they suggested that the entire pair, the third and the fourth helix here, these two helices, would actually somehow lie planar in the membrane, and then they would move from one side to the other. It's a certain Nobel laureate, whose name we're not gonna bring up in the video. Now, in their defense, they were also the first ones to realize that this was wrong. But this is such an extreme structure that even some of the most famous people in the world couldn't believe from the start that it was a transmembrane helix. And what this led to is a whole bunch of different models for how this would work. Do you have a sort of paddle moving up or down, or do you alternately open to the outside versus the inside, or do you literally have this entire helix slide as a helix up and down? And we kind of know the answers to this nowadays because, again, you see the spiral shape here, right? That has to do with the 3.6 residues per turn. So something is gonna need to happen as this one is moving up or down: we can't expose all those charged residues to the lipids. Or can we? You can't expose them to the hydrophobic parts of the lipids, right? But this is not the lipid. This is a negatively charged residue. The positively charged residue is gonna be fairly happy to partner with the negatively charged residue. That was the one exception I mentioned to you: you would not expect a positively charged residue on the inside of a protein unless there is a negatively charged partner. But as this helix moves up, you're gonna need to twist it and turn it. And then it looks really bad if you're looking at this from the top, because these would be exposed to the membrane here.
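To see where the spiral of charges comes from, here is a small Python sketch of ideal alpha-helix geometry. With 3.6 residues per turn, each residue is rotated 100 degrees relative to the previous one. The assumption that the charged residues sit at every third position (indices 0, 3, 6, 9) is mine, made for illustration, as are the function name and the textbook rise of 1.5 angstrom per residue.

```python
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn
RISE_PER_RESIDUE_A = 1.5     # rise along the helix axis in angstrom (ideal helix)


def helix_position(index: int) -> tuple[float, float]:
    """Return (angle around the axis in degrees, height along the axis in angstrom)."""
    angle = (index * DEGREES_PER_RESIDUE) % 360.0
    height = index * RISE_PER_RESIDUE_A
    return angle, height


# ASSUMPTION: one charged residue at every third position.
charged_positions = [0, 3, 6, 9]
wheel = [helix_position(i) for i in charged_positions]

for index, (angle, height) in zip(charged_positions, wheel):
    print(f"residue {index}: angle {angle:5.1f} deg, height {height:4.1f} A")
```

Under that assumption each charge ends up 60 degrees behind the previous one and 4.5 angstrom higher, so the charges trace a slow spiral along the helix rather than a straight stripe. That is why sliding the helix along its axis also requires rotating it if the charges are to keep facing their partners.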
But do you see how high up they're located? They're not down here. And what do we have up here? You have the head groups. So we're not interacting with the hydrophobic parts of the lipids. We're interacting with the hydrophilic parts of the lipids. And nowadays there have been a ton of computer models and simulations of it, and we pretty much know both the up and down states of these channels. They're also super cool because we can actually influence them with toxins, and you can bias them. And they're related to a whole range of disorders. Long QT syndrome, cardiac arrhythmia, anything like that: it's single mutations in these channels. The other cool thing is that we can show the entire opening of them. Most of the simulations I've shown you before have been run on normal clusters and everything. And it's esoteric physics. I did my PhD on simulations a long time ago, and started studying membranes. You know what the problem with that is? You can simulate maybe a microsecond, which was an eternity. In my PhD thesis I did 100 nanoseconds. In the labs you might be able to do 10 nanoseconds or so, but those are exceptionally short time scales, where you're just gonna see proteins vibrating a bit. To really understand biology, you would need to go to hundreds of microseconds or milliseconds. But if we could do that, we could actually see proteins move through these states that we talked about. Imagine those eight states of the, sorry, five states of bacteriorhodopsin, or the eight states of the sodium-potassium pump. For this example, imagine if you could see how it opens or closes. That's virtually impossible to determine with X-ray crystallography, actually. Why? It seems easy. We have an open state and a closed state. If your group goes after the open state, my group should go after the closed state so we don't compete. So why didn't we just get a bunch of crystallographers to try to determine the closed state?
We got the open state really quickly, because this is voltage dependent. If you take this protein and remove it from the membrane, so that there is no longer a voltage across it, it's gonna be open. The only condition under which the protein would be closed is if you have a membrane, the protein is sitting in the membrane, and there is a potential across the membrane. We can't do that in a crystal structure. So for almost five years, nobody had any structures of it. In a computer simulation we could, in theory, apply the potential, but we can't simulate long enough. Until almost 10 years ago, when David Shaw in the U.S. came in. He was actually a professor of computer science at Columbia, but he was interested in building very large parallel computing machines and eventually couldn't get funding for this. Then he tried to get funding from industry, but instead of him getting money from industry, industry ended up recruiting him. So David left academia and went to Morgan Stanley. He's been working in the financial industry for decades, and he's the founding father of modern computational arbitrage, so using computers to do stock trading. At one point in time, David owned the world's third largest hedge fund, and he has a skyscraper in New York. And then when he turned a bit over 50, he wanted to go back into computing and build fancy machines. But at that point you don't apply for professorships. You start your toy company and hire 50 people. So it's just his hobby. They have designed special machines that are tailor-made just to do simulations, and one of them is called Anton. What I'm gonna show you is a simulation of these channels on an Anton machine. And remember that I said the longest simulations we typically would ever make would be one microsecond. And here we go. Do you see how many microseconds we got? And we're still just seeing this protein move around a bit.
But you're gonna see these helices move up, and then you're gradually gonna see the hole here closing, in roughly a quarter of a millisecond or so. Do you see that? It's almost closed now. And you can repeat this a dozen times, so it's not just fluctuations. Let's look at it from the side instead. Here you're gonna see the protein undergoing a change of conformation. And then we're doing the slow-motion part here. Do you see the hydrogen bonds there? Sorry, the salt bridges are changing, and we just moved the entire helix down as we're closing the channel. So this whole principle, the model and everything, actually turns out to work in practice, and you can let a computer simulate it. At this stage, this is interesting because it tells us a whole lot about the fact that proteins are not just static structures. They are machines that move between different states. But I also mentioned that this is very much related to disease, right? For instance, one common thing that can happen is this: what if you take one of these four arginines and mutate it away to something that's not charged? It was the charges that created the effect that it would move in the potential, right? So if you don't have any charges, you would be dead. What do you think would happen if you had three charges instead of four? It would be somewhat harder to open or close the channel, right? There would not be enough charges to pull it up. So what if you then attach some molecule that somehow would bind on the outside of the protein and try to either pull the helix up or push it up? We actually have colleagues in Linköping who have been working on that, and it works. The traditional way to design drugs that target channels would be to create something that's pretty much a plug. You put the plug in the channel, and if you put the plug in the channel, the channel no longer works.
You blocked it: a channel blocker. It's really efficient. The only problem is that for most of these proteins, the issue is that they're not open enough. And you're not gonna fix that with a plug. So somehow you need to force the channel to be more open. Typically we need to pull it up. So what they then did is add, in particular, small fatty acids that bind to the outside of the helix and then help pull it up. Just ever so slightly. And then you can pretty much restore these deficiencies, by helping the protein be more open. And there's no way you could do this type of drug design unless you understood the different states the protein is moving between. That will likely move to some sort of trial soon. I will see how much time I have; I've promised to finish at noon sharp today, so I might not have time for the last thing. So the thing I just described is the full explanation for how nerve impulses are conducted within one nerve cell. When you have a difference in potential, a depolarization, across the membrane, that will lead to more channels opening, and then you get a chain effect that pushes the depolarization spike along the nerve cell. And that's actually this traditional peak you would see in an EKG or something. It's literally a difference in potential across the membrane. But at some point you get to the end of one nerve cell, and you can't conduct electricity outside a nerve cell. And there's probably like half a dozen Nobel Prizes in this picture. But when you get to the end of this nerve cell, this change in potential is gonna cause the release of neurotransmitters. So there are small vesicles on the inside of the cell that will bind to the surface of the cell, and they will open up.
And these small green molecules here, there are a bunch of different ones: they can be glycine or glutamic acid, which are regular amino acids, gamma-aminobutyric acid is another common one, and acetylcholine is a third. And then they diffuse over a cleft that is, ballpark, a few tens of nanometers wide. And then they bind to the blue things here on the next cell, and the blue things are of course another membrane protein. These are also ion channels, but ligand-gated ion channels. So instead of being steered by voltage, when these green ligands diffuse over and bind here, depending on what ligand you have and what channel it is, they're gonna bind to this channel and cause the channel to open up, and now we get a new flux of ions through this channel, and then we have a new nerve signal in the next nerve cell. And again, when the signal goes from my brain down to the finger, there are probably two or three junctions on the way. Nerve cells can be very long, but this has to happen in absolutely no time. The cool thing with these channels is that they're the Dr. Jekyll and Mr. Hyde of membrane proteins: they can bind almost anything. And they can conduct either positive ions or negative ions. It even turns out you can take one of these channels, swap two amino acids and delete a third, and you can make an anionic channel cationic instead. So you change the type of ions conducted. And that might sound like a minor detail, what type of ion is conducted. But remember, those ions control the sign of your membrane potential. If I'm moving negatively charged ions to the inside of the cell, I'm gonna create more of a potential, right? And if I'm conducting positive ones, I'm gonna depolarize and cancel it. And the depolarization is actually what creates the nerve potentials. I would say that this is the action potential and this is inhibition.
So depending on what I'm binding here, I can have completely opposite effects. And it turns out there are a bunch of different things here. The other thing that we realized is that these channels are quite sensitive to other things, such as anesthetics and ethanol, but those don't bind up here. They somehow influence the membrane or bind in other parts of the channel. An alcohol molecule or an anesthetic will never open the channel, but it kind of acts like an amplifier. If these molecules are present, the channels will be more sensitive to their ligands. And there are some pretty cool things here, in particular anesthesia. I think this is so underestimated. We all talk about cancer research and everything, and that is super important. But one of the most important advances of modern medicine is anesthetics. It used to be that a good surgeon was a fast surgeon, because the patient was lying on the table screaming. But today, the reason why you can do an advanced three-hour cancer operation is that you can sedate the patient, and you can actually revive the patient afterwards. There is a starting date for this: 1846, in the Morton auditorium at the Massachusetts General Hospital. You can still find this auditorium. If you are in Boston, go there. It's pretty cool to see it. And the first patient survived. This is actually the second patient. I don't think they had an image of the first patient. I think the first one was a tooth extraction, and this one, let's see, yes, Edward Abbott, had some sort of tumor in his neck. Both of them survived. And it turns out that there are a number of different molecules that you can use to sedate patients, and they were discovered over some 50 years. So almost 100 years ago, Meyer and Overton independently found that if you take different anesthetics, some of them are efficient at very low concentrations. So the further down you are here, the better it is as an anesthetic.
And on this axis we have the partition coefficient, so here means that you're very water soluble and here means that you're very oil soluble. So, a pretty decent line, right? So what would you say about anesthetics based on what we talked about today? Where do anesthetics end up in your cells? The membrane, yes. So for a very long time, we were all convinced that membranes act by, sorry, anesthetics act by going into your membrane and changing the properties of your cellular membrane, making it more fluid or something. Hand waving, something happens. For a long time we thought this, and meanwhile we gradually developed better and better anesthetics. From 1850 or so, the first ones were things like nitrous oxide and diethyl ether. As we've gone on, we've found better and better anesthetics with fewer side effects that we can use in lower doses. But some of these are a bit of exceptions. They're not entirely hydrophobic. Many of them can even be quite polar, having chlorine or fluorine and bromine atoms there and everything. So there are some exceptions on these scales that we don't quite understand. And sometime around the 1990s or so, we also started to find that there are some mutants we can make in channels, these ligand-gated channels. And if you introduce these mutations, you can create rats that are not sensitive to anesthesia. You can't sedate them anymore. And at that point we all started to realize that this is likely protein mediated, that there are binding sites for the anesthetics in the proteins. The cool thing is that what we didn't have in the 90s, but what we got in the early 2000s, is structures of these proteins, the ligand-gated channels. And I remember, because I was a postdoc at Stanford when the first structure appeared. Yes, it's not that you need new glasses. This is a cryo-EM image from the top. We were so ecstatic when we saw this.
Because what you can actually see here, if you believe everything, is that there are five subunits. And each of the subunits has one, two, three, four helices. Now, you need lots of these images, and you average them to get something that looks pretty good. And then you can take this into a computer and create a model that looks something like that. I'm not sure about you, but I don't quite see the hydrogens in this image. So beware of computer models, right? Computer models are good. These were remarkably good models, but there were some errors in them. But the point is that we suddenly have some pretty good ideas of what these channels look like. And here you see that you have a transmembrane region that is entirely alpha helical. And it might not be super obvious here, but we also have an extracellular part that is full of beta sheets. Remember what we just talked about, that scorpion toxin that liked to bind to beta sheets? Channels like that, sorry, proteins like that will likely bind to the beta sheet regions here, where we would also normally have the binding sites for the neurotransmitters and everything. And that's gonna influence how the ligand gates the channel open. That's gonna influence the extent to which nerve signals can cross your synapses. And if you start to shut off that signaling, right, the prey is no longer gonna be able to move. It's also how cobra toxin works. The cobra toxin blocks the nerve signal by binding to the extracellular domain here, and then you can't create a nerve signal. So the nerve signal just goes to the end of one nerve, and then it can't jump to the next nerve. Remarkably efficient. We're doing a lot of research on these channels. And right now there are like five different binding sites, for anesthetics, benzodiazepines, cholesterol, all sorts of molecules here.
There's so much you can know about this, but I'm not gonna spend the next five minutes talking about my research. There is another very simple example of disease-related membrane proteins. Tyrosine kinases are also a very large class of proteins. And there we know that normally these are related to growth in cells. And normally these are quite simple. They're not even particularly fancy membrane proteins. You have some sort of domain outside the cell, then you just have one single transmembrane helix, and then you have a domain on the inside. And what happens here is that when we have specific ligands that bind, that will cause two of these to dimerize. And when they dimerize, the parts on the inside of the cell here bind together and then effectively start sending signals to the cell to divide or grow. Normally, after a while this ligand would dissociate again and the proteins would move apart, and then we're no longer signaling growth. And then we get a new signal and we bind again. There are a few cases where you can have single-site mutations at the top of these helices. And what then happens is that you end up with these two blue helices binding so strongly that they will stick together. They won't dissociate. What do you think is gonna happen with the signaling on the inside? The cell goes berserk, right? It keeps dividing and growing. It never stops sending the signal. And I think it's an alanine mutated to a valine, but I can check it if you want to. So here is the idea: what if you could create some sort of super intercepting peptide, just a helix, a red helix, and the red helix should be very good at binding the incorrect blue helix but not the green helix? Then we could kind of stop this, because my red helix would not have the partner domain here, so we would not send a signal.
Then I could selectively disable just the sick helices, but potentially not the healthy ones. The problem is that of course it's not a mix, right? So what would likely happen, and I don't think anybody's realized this yet, is that all your helices are likely gonna be the blue form in your genome. So if we just did this, you would completely shut this off, and then you're gonna die for other reasons. But there are some very detailed things we would like to do. There have been a number of groups that have tried to do this, in particular Bill DeGrado's lab and Joanna Slusky, who used to be a postdoc in the department here. What they did is that they took exactly what I talked about, that we know the crossing angles for membrane proteins. Just look in the Protein Data Bank for how pairs of helices cross each other, and then, if you give me one helix, I will try to design a complementary sequence to bind to your helix. And this actually works. They can design helices in computers that will target other membrane protein helices. These used to be called CHAMPs, computed helical anti-membrane proteins. And at least a few years ago, they used to have drugs in clinical development. I think they've spun it out and have a company working on it now. But there's a lot of research going on here. The only reason why you have not yet seen these sorts of things on the market is, remember what I said, that it takes roughly a decade for things to actually appear. So there's an entire new generation of completely tailor-made drugs that you will see soon. I'm gonna have two more slides because of that. And as I mentioned yesterday, the theoretical advantage here is that they can be super potent. Precisely because membrane protein structure is so complicated, there are not gonna be any side effects. If you tailor-make a protein, it's only gonna bind to exactly the thing we want and nothing else. The problem is that you need to inject this. Why?
What would happen if you consumed it as a pill? Exactly, the stomach enzymes would degrade it. And this is bad, because then you need to see the doctor and everything else. You would like to have pills, so that rich Western people with lots of money can take three pills a day. They're not gonna go and have something injected at a doctor's every day. But for real diseases, cancers and everything, I would say this is probably where 90% of the effort in modern research goes. For many of these things, we can actually try to use, say, non-native amino acids: D-amino acids or things that are not among the 20 standard ones. We deliberately try to make things bio-incompatible so that your stomach enzymes won't degrade them. Having said that, there are some issues here. A few years ago, almost 10 years ago now... have you ever heard about TeGenero? This was an antibody, a so-called superMAB. The idea was that they wanted to target leukemia. I'm not gonna go into it; you probably studied part of this in the KI courses. You basically want to activate the CD28 receptors on these cells. To do that, we would create an antibody that binds specifically to these cells, and that would kickstart your immune defense. So we developed everything in mice, we got beautiful responses in mice, everything looked appropriate. And then we calculate, well, you weigh slightly more than a mouse, sorry, so we calculate exactly how much dose you would need for humans. Then you go through something called phase one trials. I will talk more about this in drug design. What you do in phase one trials is take healthy test subjects, and you don't care about whether it works for now; you just wanna make sure that there's nothing dangerous happening. And that usually means you ask students and pay them $15.
And it was a disaster. Yeah, 10 test subjects. After they came in, within 10 minutes basically, their hands and heads swelled to double size, and they were very close to dying. I think several of them had to have limbs or fingers amputated. I think all of them survived, but it was exceptionally close. What happened? It was a catastrophe. There's a reason why they didn't continue the study. So of course, this company went bankrupt. What likely happened, and this is the danger, because we're in a completely different part of drug design here, is the following. You are not quite a mouse. You're very similar to a mouse, and that's why we tend to design things in mice, but you're not exactly a mouse. So let's say that the entire recognition and everything was only 50% compatible. When we took this antibody and developed it in a mouse, the human versus mouse compatibility was only 50%, which means that we'll likely only get 1% of the activity or something, but we will get a little bit of the activity. Then we calculate the dose we needed to get this activity in a mouse. Let's say that is one milligram. Then we take this into you and we have a 100% match. Your immune system is now gonna be a hundred times more activated than it's supposed to be. It's gonna try to kill every single cell in your body right away. Nobody thought about the fact that a mouse is very much like a human, but it's not exactly like a human. There's a joke that people occasionally tell in cancer research: if you're a mouse and have cancer, we have very good treatments for you. Because the whole model of 50 years of research has assumed that we can use a mouse as a model of a human.
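To make that arithmetic concrete, here is a minimal sketch that uses only the illustrative numbers from the lecture (roughly 1% activity in the mouse, a 100% match in a human, a 1 mg dose). The function names and the assumption that the effect scales linearly with dose times activity are hypothetical simplifications for illustration, and body-weight scaling is left out entirely. This is not how real dose translation is done.

```python
def overactivation_factor(activity_in_model: float, activity_in_human: float) -> float:
    """Ratio of the per-dose effect in humans to the per-dose effect in the animal model.

    Both arguments are fractions of full activity in (0, 1].
    Assumes (hypothetically) that effect is proportional to dose * activity.
    """
    for value in (activity_in_model, activity_in_human):
        if not 0.0 < value <= 1.0:
            raise ValueError("activity fractions must be in (0, 1]")
    return activity_in_human / activity_in_model


def equivalent_human_dose_mg(model_dose_mg: float,
                             activity_in_model: float,
                             activity_in_human: float) -> float:
    """Dose that would give the same effect in humans under the linear assumption."""
    factor = overactivation_factor(activity_in_model, activity_in_human)
    return model_dose_mg / factor


# Illustrative lecture numbers: 1% activity in the mouse, full activity in humans.
factor = overactivation_factor(activity_in_model=0.01, activity_in_human=1.0)
dose = equivalent_human_dose_mg(1.0, activity_in_model=0.01, activity_in_human=1.0)
print(f"over-activation at the same dose: about {factor:.0f}x")
print(f"dose giving the intended effect: about {dose:.2f} mg")
```

The point of the sketch is only that a dose calibrated where the drug is weakly active becomes a massive overdose where the match is perfect; the specific percentages are the lecture's made-up example, not measured values.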
And it doesn't work here because, with the specificity in all these interactions that you don't have in traditional drugs, 50% versus 100% can make a tremendous difference in the effect. And with that, I'm gonna stop. There is one take-home message from this, and that is: don't participate in clinical trials just because somebody pays you $25. On average it will likely work out. The question is, is it worth your life if it doesn't? Study questions for you, not for tomorrow. Tomorrow I'm gonna be at a... we'll meet again on Thursday, right? And I think we went all the way through membrane proteins. Unfortunately, the book is not good reading here. I've jotted down most of the concepts I brought up today on the web pages. So consult Wikipedia and make sure that you're aware of the concepts I bring up there.