Yesterday we spoke a little bit about bits and pieces apart from introducing the course. First things first, can everybody access Canvas? Okay, if there's anybody who can't, or you have a friend who can't, or a friend who doesn't receive emails or whatever, let me know, because from now on I'm just gonna trust that anything I post through Canvas will reach all of you, and that any material I upload to Canvas, you will have access to. Good, sounds like Katie is good at something in the administration at least. On Tuesday I spoke quite a bit about protein structure and function, and we started talking a little bit about interactions. I'm gonna spend way more time talking about interactions today. We spoke about water and a little bit about hydrogen bonds, and that's gonna come back, and then we introduced DNA and RNA. After that we kind of touched upon proteins, and there's gonna be much more about proteins today, for a couple of different reasons. Biophysics is about lots of life's molecules, and DNA and RNA are certainly super important, but in terms of the molecule itself, DNA is a bit boring. DNA only has one structure, and the structure is the structure is the structure. The sequence of DNA is interesting, but not the structure, and there are a bunch of colleagues who would kill me for saying that, but that's fine. We spoke a little bit about ways to determine these structures, both X-ray crystallography and cryo-EM, and what the limitations of those might be. One challenge with this format is that I don't have time to go through all these study questions in detail, although it's great fun. Rather than me picking a few of them, are there one, two, or three questions here that we should talk about? Anything that wasn't obvious to you? And as I said yesterday, and this might sound like a threat, it's not. It's a carrot.
As I mentioned, to pass the exam, you need roughly 60% on those multiple choice questions. They will be taken from these study questions. If you know these study questions, you will pass the course, period. And that's why there's a good point in using this as a reality check. If you can answer these 20 questions, there is no point in spending more time on that lecture. One thing I might not have brought up is number 20: what are the limitations of the ways we determine structures compared to the cellular environment? So I touched a little bit upon how you grow crystals. We certainly need to destroy the cell, right? We pretty much run things in the equivalent of a food processor, but worse. So you don't have any cellular environment. So first, most of the things that were important to the cell are no longer there. If you had a membrane protein, you no longer have the real membrane. You just have some sort of detergent-like solution around it. The other challenge is that once you grow crystals, you have tons of your molecules with a little bit of water around them, but your cells are not crystals. So it's definitely not a native environment. The other problem might not be so obvious for X-ray crystallography, but for cryo-electron microscopy the first word there might give you a clue: at what temperature do you determine these things? Liquid nitrogen temperatures are roughly 100 Kelvin. And I'm not sure about you. It is a bit cold outside, but it's not quite 100 Kelvin outside. So does a molecule behave the same way at 100 Kelvin as it does at 300 Kelvin? It's not obvious. So there are a bunch of limitations that we pretty much have to live with, because there is no way we can determine the real structure inside the cell. You can determine structures at room temperature with NMR or neutron scattering today, but they're not as good as cryo-EM or X-ray crystallography in terms of resolution.
What I would suggest, since you might have had a busy week, and in the interest of time: rather than me going through some of these questions that I think you might not know the answer to, take the weekend, go through both the questions from Tuesday and today, and start posting in this forum. I will make sure that I answer. But it's not fine to just ask, what is the answer to question number 11 here? You have to explain what it is you don't understand, or you can say what you think the answer is, and then I'll be happy to step in and explain a little bit more in detail. But the first piece of effort you have to do yourself, and then I'll help you. What we are gonna spend some time on today is going back and talking about protein structure. And there is a pretty sad piece of news, but it's fascinating as always, as an introduction, from early this fall. I'm not sure if you heard; I saw this news on CNN, and it wasn't particularly high profile. It was one of these charity organizations that try to provide healthcare to sick kids in Africa and everything. And there was this young boy with a large facial tumor, and they got funding to take him to the US and perform surgery on him. And it's great, he would have a normal happy life. And then the guy dies on the operating table. And the reason why he died has nothing to do with the tumor; they realized he had a rare genetic allergy to the anesthetics used. And if you get an anaphylactic shock while sedated and under tons of drugs on the operating table, it's pretty much 50-50 that you die. He's not the only example. There was actually a TV host in the Bay Area a few years ago who went in for elective surgery, a facelift or something. And then he died on the operating table. Because these are rare, right? You don't eat anesthetics, so you don't know if you have an allergic reaction to them. The first time you find out is when you are on the operating table.
The reason why that happens is very much related to the sequence to structure to function idea that I brought up. And the reason why that interested me is that it's very related to our research. There are a lot of people in our team doing research on ligand-gated ion channels. We'll talk more about this later when I come to membrane proteins. But this is a protein that is encoded from DNA and that sits in your cell membranes. And tonight, well, I do this purely for scientific reasons, of course, but I might have a glass of wine as a pure scientific experiment. And what happens here is that the alcohol will bind roughly here in these receptors. And that will have an influence when you have neurotransmitters between your cells binding here. The alcohol here will cause the channels to open more easily, which will increase the flow of current through them. And that will influence how your nervous system works. But the same type of channel, closely related channels, are also used for anesthesia. And the anesthetics also bind roughly in this binding site and influence how the channel works. But what then happens, either in this channel or in other places, is that you can be very sensitive to this molecule through mutations; it can be just one single amino acid that is changed. And for whatever reason, this causes the anesthetic to bind in a different way, or it influences things in a way that is non-natural, and then disaster strikes. And this is ultimately caused by a change in the sequence that causes a change in the structure, and it can be a minute change in the structure, that causes a change in the function, and then you die on the operating table. So part of this is that we want to understand it, but the other part is of course that if we understand it better, we might be able to treat it. And second, we might be able to use these things to develop new drugs.
There are quite a lot of heart drugs nowadays that are related to specific mutations that we are aware of that influence your electrocardiogram, and then we can compensate for that with another drug. Yes? When we talk about sequence and structure, you obviously mean the structure of the molecules? Yes. I'm gonna come back and talk a little bit more about that. It's not as obvious as one might think. In general, it's the sequence of bases in DNA. The sequence in DNA corresponds, not one to one, but you can always go from DNA to amino acids through the genetic code. But because of the redundancy in the genetic code, you can't always go back from the amino acids to DNA. But thank God nature doesn't have to do that. We only construct proteins from DNA and not vice versa. So DNA or amino acid, take your pick. And that's actually a great carry-over to what we're gonna be talking about today. So today I think we're gonna jump into the deep end of the pool and look at amino acids. I'm well aware that you've studied some of these things. If you're a chemist you studied this last year or something, or you might have studied it in upper secondary school. Part of this is repetition, because based on previous exams I realized that you need this repetition. And part of it is that I'm gonna take it slightly further. So just to rehash: amino acids are acids. And they have a common structure where you have an amino group, which is an NH3 plus in isolation, and then a carboxyl group, which is COO minus. Then you always, well, on all the normal alpha amino acids, have a small hydrogen and then some sort of larger group, and if we wanna talk about amino acids in general, I typically just write R for that. It's an arbitrary letter. No pun intended. There are 20 different ones. If I don't care which one it is and I just wanna talk about amino acids in general, it's great to put R there, because if I put, say, CH3, you're gonna assume that I mean alanine. R just means that it doesn't matter what it is.
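The point about the redundancy of the genetic code can be made concrete with a small sketch. The table below is only an excerpt of the standard genetic code (a handful of the 64 codons, written as DNA on the coding strand), and the function names `translate` and `possible_codons` are made up for this illustration. Going forward from DNA always gives one answer; going backwards gives several candidates.

```python
# Excerpt of the standard genetic code (DNA codons on the coding strand).
# Only a few of the 64 codons are listed to keep the sketch short.
CODON_TABLE = {
    "ATG": "Met",
    "TGG": "Trp",
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "AAA": "Lys", "AAG": "Lys",
}


def translate(dna):
    """Forward direction: every codon maps to exactly one amino acid."""
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [CODON_TABLE[codon] for codon in codons]


def possible_codons(amino_acid):
    """Backward direction: list every codon in the excerpt for this amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


protein = translate("ATGGCTGGCAAA")
print(protein)                 # ['Met', 'Ala', 'Gly', 'Lys']
print(possible_codons("Ala"))  # ['GCA', 'GCC', 'GCG', 'GCT']

# Number of different DNA sequences that would give the same four residues.
n_dna = 1
for aa in protein:
    n_dna *= len(possible_codons(aa))
print(n_dna)                   # 32
```

Even for four residues there are 32 DNA sequences that encode the same peptide, which is why you cannot go back uniquely from amino acids to DNA.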
You can draw them in a bunch of different ways, and in practice what happens here is that there is a bit of oscillation. You could think that you actually have a full plus one charge there and a full minus one charge there. In that case, you would call it a zwitterion, from the German for a dual ion: you basically have a positive ion on one side and a negative one on the other. So the molecule as a whole ends up being neutral. In practice, there's gonna be a bit of oscillation, so it can also be neutral with NH2 and COOH. I will come back to that on the next slide. In general, this R group is not a hydrogen. There's only one amino acid where it is a hydrogen: glycine. So in general, if this R group is different from the other three, you're gonna have four different things bound to this carbon in the middle. And if you sit down and play with this, and the carbon has a tetrahedral geometry, this is gonna mean that if you pick any four arbitrary groups here, there are gonna be two different ways of arranging them, and one will be a mirror of the other. You can't take the molecule on the left and rotate it into the molecule on the right. That's impossible. And that has to do with stereo geometry. If the red and green group, sorry, I didn't create this plot and I know that's a bad choice if you're colorblind. Let's see, if the green and blue groups are identical, then it's fine. Then you can rotate one into the other. So the second any two of these groups are identical, you can rotate one molecule into the other. For some small molecules, you can have what you call racemization, where one of the stereoisomers, as you call them, would after a while spontaneously change into the other. That doesn't happen with amino acids. These groups are so large that you would have to wait basically for 10 billion years before one molecule would do it.
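The claim that a mirror image cannot be rotated back onto the original can be checked numerically. A minimal sketch, assuming idealized tetrahedral positions for three of the substituents around a central carbon at the origin (the coordinates and the names `signed_volume`, `mirror_x` and `rotate_z_90` are made up for illustration): the signed volume spanned by three labelled substituents keeps its sign under any rotation but flips sign under a reflection.

```python
def signed_volume(center, a, b, c):
    """Triple product (a-center) . ((b-center) x (c-center)) of three substituents."""
    u = [a[i] - center[i] for i in range(3)]
    v = [b[i] - center[i] for i in range(3)]
    w = [c[i] - center[i] for i in range(3)]
    cross = [
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    ]
    return sum(u[i] * cross[i] for i in range(3))


def mirror_x(p):
    """Reflect a point in the yz-plane (a mirror operation)."""
    return (-p[0], p[1], p[2])


def rotate_z_90(p):
    """Rotate a point by 90 degrees around the z-axis (a proper rotation)."""
    return (-p[1], p[0], p[2])


# Hypothetical idealized geometry: central carbon at the origin and three
# labelled groups on tetrahedral corners (the hydrogen sits on the fourth).
c_alpha = (0, 0, 0)
amino, carboxyl, side_chain = (1, 1, 1), (1, -1, -1), (-1, 1, -1)

original = signed_volume(c_alpha, amino, carboxyl, side_chain)
rotated = signed_volume(c_alpha, *(rotate_z_90(p) for p in (amino, carboxyl, side_chain)))
mirrored = signed_volume(c_alpha, *(mirror_x(p) for p in (amino, carboxyl, side_chain)))

print(original, rotated, mirrored)  # 4 4 -4
```

The sign of this quantity is what separates the two hands. No rotation can change it, so the two mirror forms stay distinct as long as all four groups are different.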
So once one particular form here is created, in this case this is what we call an L-amino acid and here's a D-amino acid, it's just letters, it's gonna stay that way until the end of time. So then we can start to argue, does this matter? You might have heard about, well, isomers, or the equivalent in physics would of course be atoms with different numbers of neutrons and everything. From a physical point of view, you could argue it does matter, right? Because you can't take one molecule and rotate it into the other. These will also have some very peculiar physical properties. Any molecule that is chiral, in the sense that you can't rotate it into its mirror image, has a tendency to rotate the plane of polarized light. So these two molecules will in general, and we don't know in advance which way, but if you put them in a particular spectrometer that checks whether they rotate polarized light to the left or the right, they will have different properties. So there are some spectroscopic properties, physics, that will be different for these molecules. If this had been upper secondary school and we had had chemistry, and for a second let's forget about life science here: for a small, stupid chemical molecule, the chemical properties are the same. The boiling point is the same, the density is the same, the energy is the same, the conformations are the same, with the caveat that they are mirrors of each other. So every single chemical property would be the same. But that's not quite true in life sciences, because the problem is that proteins interact with what? Other proteins, right? And if one L-amino acid starts to interact with another amino acid, then this gets complicated, because if an entire protein is constructed of amino acids, then entire proteins could become mirrors of each other. And in general, proteins are not symmetric. In general, you can't just take a protein and make a mirror image of it.
For a very long time, this was hardly known. And there was a scandal, thalidomide, that you might have heard of; the Swedish brand name for this was Neurosedyn. Do you remember it, or where you heard of it? Okay, you're young. So even in my generation, it was common to see middle-aged or elderly people without arms or legs, or at least with tiny arms or legs. And this was because in the 1960s there was a new drug on the market that was approved fairly rapidly, and it was a great drug to treat, no, not pain, nausea in particular. And one of the most common cases where you're nauseous is when you're pregnant. So you start to administer this to pregnant women, and no problems whatsoever until you realize that it causes severe birth defects. And that way you get these tiny arms and legs. And the scary thing is that only one of the stereoisomers did this, and I forget whether it was the left- or right-handed one; it doesn't matter. And this is the complicated part with biology. In the biological sense, there will be cases where the handedness of the molecule not only matters, but is completely critical. So for a long time this drug was obviously prohibited from the market, but it reappeared on the market some five, ten years ago, in particular in South America. And the reason for this is that it has some sort of efficacy in treating HIV infections. But this is of course scary, right? If you're now manufacturing these drugs in cheap plants and you're not really certain about which stereoisomer you're getting, and then you're administering this broadly, and then somebody starts taking it for nausea, and people might not know if they're pregnant. So there's been a lot of worry that this could cause a new wave of birth defects, although I haven't seen it. In humans, we are made of L-amino acids. We don't really know why, that's just the case.
And D-amino acids would not be compatible. You could argue there might have been natural selection or something that caused one of these forms to survive, but because we are made of L-amino acids, the next generation is also gonna be made of L-amino acids, because a protein made of D-amino acids would not be compatible with the L-amino acids. They would not bind. So if you think in terms of physics, this would be spontaneous symmetry breaking. For whatever reason, we ended up on that side of the pond. You can create D-amino acids in the lab, but you essentially never find them in proteins in nature. And this is of course one of the reasons why, with thalidomide, only one of the stereoisomers had the effect, right? Because this stereoisomer interacted with the L-amino acids, and the other one would have interacted with D-amino acids. Sorry, I'm gonna harass you with these amino acids until you're sick and tired of them. There are lots of different ways of representing them. We're gonna come back to what they do. You can display them in different ways. One challenge compared to simple molecules, and it's fun, because my daughter actually had a test, not on amino acids, but on writing chemical formulae the other day. For amino acids, it's not really enough to just draw the chemical formulae. We frequently need to draw them in 3D to understand what they're doing and what the properties are. There are a bunch of different ways. You can draw the bonds. You can somehow try to draw some space-filling diagram. You can try to draw the surfaces, or maybe even the electrostatic potential. In this case, I think the coloring is just based on the atoms, but typically an oxygen tends to be negative, red, and a nitrogen tends to be positive, blue. And there are 20 normal ones that you're gonna see over and over again. We can classify these in a bunch of different ways.
There is no one unique way to classify them, but we can just think of amino acids as positively charged ones; that would be arginine, histidine, and lysine. I don't expect you to know these things by heart, but you probably will by the end of the course. There are negatively charged ones. So these are effectively ions in the side chain. There are amino acids that are polar, so they're not charged, but they're definitely not hydrophobic. That's serine, threonine, asparagine, glutamine. And then there are lots of amino acids with hydrophobic side chains. So they're basically oil-like in the side chain. And here you can probably guess that these side chains in particular are not gonna be very soluble in water. And then there are some special cases that we will come back to. So we could, for instance, group these and say, are the side chains small or large? Or are the side chains slightly polar, so that they have some sort of charges? Among the polar ones, you have a few that are positively charged and a few that are negatively charged. There are so-called aromatic ones that have benzene rings in them. There are aliphatic ones that are just linear carbons. The point again, you see, is that these are not disjoint groups, right? Pick any type of classification you want. But you need to be able to think; I might ask you to explain a couple of different ways of classifying amino acids, so be creative. The reason why this is important we're gonna see when we create protein structures. We very rarely care about specific amino acids, but it's gonna turn out that the class of amino acid will matter. It will matter that we frequently have charged or polar amino acids on the outside of protein structures. We very frequently have the non-polar, the hydrophobic ones on the inside of proteins, but exactly which hydrophobic one it is, I don't care about.
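One way to see that these classes overlap is to write them down as sets. This is only a sketch of one possible classification: the positive and polar groups follow the lists above, while the exact membership of the hydrophobic, aromatic and aliphatic groups is one common textbook choice and an assumption here, as is the helper name `classes_of`.

```python
# One possible classification, using three-letter codes. The groups are
# deliberately allowed to overlap; other textbooks draw the lines differently.
CLASSES = {
    "positive": {"Arg", "His", "Lys"},
    "negative": {"Asp", "Glu"},
    "polar": {"Ser", "Thr", "Asn", "Gln"},
    "hydrophobic": {"Ala", "Val", "Leu", "Ile", "Met", "Phe", "Trp"},
    "aromatic": {"Phe", "Tyr", "Trp"},
    "aliphatic": {"Ala", "Val", "Leu", "Ile"},
}


def classes_of(residue):
    """Return every class (alphabetically) that a residue belongs to."""
    return sorted(name for name, members in CLASSES.items() if residue in members)


print(classes_of("Lys"))  # ['positive']
print(classes_of("Phe"))  # ['aromatic', 'hydrophobic']
print(classes_of("Leu"))  # ['aliphatic', 'hydrophobic']

# The groups are not disjoint: some residues are both hydrophobic and aromatic.
print(sorted(CLASSES["hydrophobic"] & CLASSES["aromatic"]))  # ['Phe', 'Trp']
```

A lookup like this is often all a first-order model needs, since what matters is the class of a residue rather than exactly which residue it is.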
We spoke yesterday about polymerization, what happens when you have two of these amino acids. Individually they have this amine and carboxyl group, but what happens in practice in water is that they tend to polymerize, and not just pair up, but you get an entire chain where they form these peptide bonds. The peptide bond is, I'm not sure you remember this from upper secondary school, a fairly stiff bond. So this is really an entire group that ends up being planar, and you can't really rotate freely around that bond. It's almost always trans, because the hydrogen there tends to be positively charged and the oxygen negatively charged, and the reason why it's planar is complicated, but it has to do with electron resonance in the entire bond. But the point is, you now have these very stiff bonds that are planar and that you can't rotate around. And then you keep forming more such bonds, and then you get an entire sequence with the different side chains. Oh yes, sorry, I even had a slide on the bonds. So this is a different way of drawing it. The simple way would be to say that you have a double bond, but we don't really have that many electrons. We typically draw this as a single bond, but be aware that it's stiff. It can be either cis or trans, if you remember that from upper secondary school. If you don't know, with 99% probability, guess trans; they're virtually always trans. There is only one special case, proline, which on occasion has them in cis, but if we don't know anything, it's just trans. And before you know it, you too are gonna start drawing amino acids. It's not as difficult to draw amino acids as you think. There is one pattern you need to remember: N, C-alpha, C. So you have a nitrogen and then a carbon and then a carbon, and then you have a nitrogen and a carbon and a carbon, and then we continue like this. That's a bit irritating. Carbon, carbon, which carbon is which?
Well, there's one carbon that has the side chain bound, the central atom, this one, and we like to call that alpha. And the reason why it's called alpha is that when we enumerate the carbons, it's the first one in the molecule. Once we have that, I know that the other carbon has to be the one that has the oxygen. And after that, I know that the nitrogen needs to have the hydrogen going in that direction. And that's really all I remember. I always start by drawing this backbone, NCC, NCC, NCC. And then the method of exclusion will get the rest for you. Yes. Well, we're not quite there yet, right? But we typically draw them from left to right, so NCC. The challenge here, though, is that even though we typically can't rotate around that bond, if we couldn't rotate around any bonds, these would just be long linear molecules that would be fairly boring. So there will be some degrees of freedom here that we need to think a little bit about. And let's see, this might be easier to see here. Yes, so here's a long chain; full disclosure, they're never gonna look like this in solution. So if you believe me for a second that we're not gonna rotate around that peptide bond, let's only look at the backbone, the central long chain. Why would we only look at the backbone? Because we wanna understand how these molecules might move and what different conformations they can adopt. I spoke a little bit about that on Tuesday. Remember, the key to living in biology is that you need to learn to approximate, and approximate violently. Always go for a first order approximation, right? Before you know it, we're gonna be looking at proteins with 10,000 atoms in them. Don't even think of creating models for everything, right? You need something that's so simple that we can understand it just by looking at it. Let's pick an example. Let's pick that side chain.
If we rotate around that bond, will it change the conformation of the molecule? Well, technically it will. It's gonna change the position of that hydrogen, that hydrogen, and that hydrogen. Do we believe that's gonna be important? Likely not. So let's rotate around another one. If you rotate that bond, that entire part of the molecule is gonna start flipping around, right? So that bond is probably pretty important. And that bond, we've said, we can't rotate. And it's the same there. So we pretty much only have two bonds: the bond before the C-alpha and the bond after the C-alpha. And if we have two bonds, we're gonna need to call them something. Where the names come from, actually I should know this; there might not even be a reason. But for historical reasons, the first one, the one before each C-alpha, we call phi. And the one after each C-alpha, we call psi. And given three atoms here, you can define a plane. So we essentially have a plane here, and then a plane after the C-alpha. So if we rotate that bond, it's the phi angle. It's really defined by the atoms carbon, nitrogen, C-alpha, C; rotating that bond is gonna change that angle. And the next one is then gonna be nitrogen, C-alpha, C, and the nitrogen in the next residue. Don't worry, we're gonna come back to this. The other nice thing about saying that it's before the C-alpha and after the C-alpha is that the C-alpha belongs to a particular residue, a particular amino acid. So that means that no matter how complicated this molecule is, the first approximation is, well, there are two angles, or torsions, as I should call them, per residue that we need to care about. So if you have 10 residues, in theory that means that there are roughly 20 angles that we need to care about. And that doesn't sound like a whole lot, right?
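As a side note on the torsion definitions above: a torsion is the angle between the two planes defined by four consecutive atoms. For phi those atoms are C of the previous residue, N, C-alpha, C, and for psi they are N, C-alpha, C, N of the next residue. A minimal sketch of how such an angle can be computed from coordinates follows; the function name `torsion_deg` and the test points are made up for illustration and are not real protein coordinates.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def torsion_deg(p0, p1, p2, p3):
    """Torsion angle in degrees for four points, in the range -180 to 180.

    The angle is measured around the p1-p2 bond: 0 means p0 and p3 are on
    the same side (cis), 180 means opposite sides (trans).
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical test geometry with the central bond along the z-axis.
cis = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, trans, quarter)
```

To get phi for a residue you would pass the coordinates of C (previous residue), N, C-alpha and C in that order, and for psi N, C-alpha, C and N (next residue).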
It's far more than you think. So in principle, if we start changing these, we should be able to understand how a small molecule like this works. While I'm at it, since I introduced phi and psi, there is another angle too that you will hear about. And that's called chi. Those are the angles when we rotate around bonds in the side chains here. So the first one would be chi one. If there are many heavy atoms here, there could be a chi two and a chi three. And then there's one that I will likely never, ever ask you about, but again, for completeness: the peptide bond torsion you occasionally call omega. The reason why I'm not gonna ask you about it is that it's pretty boring to talk about an angle that doesn't really change. And with that, you know pretty much, well, again, to first approximation, everything you're gonna need about protein flexibility. That was easy. So now we can skip the remaining 11 lectures of the course and give you some free time, and you know everything you need to know about biophysics. The problem is that the devil is in the details. So first, we haven't talked about how likely it is for these molecules to make transitions between states. But long before that, we're gonna have a problem. What I love about physics is that any physicist worth his salt should be able to estimate anything in the world to within an order of magnitude. So let's try to do that here. If we have a small molecule here, how many states can it adopt, roughly? So we take each of these phi and psi torsions and we sample it. Well, if you just change it by one degree, that's not really a different state. So to first approximation, let's say that if it's changed by 10 degrees, then you might have a different state. It's completely arbitrary. You can pick 30 if you want to. So that means that for each such torsion, if it can sample an entire turn, that would be 36 states, right? And then there are two such torsions per amino acid.
So that means that each amino acid or residue, and I will use those two words interchangeably. Residue just means a bead in a long chain, right? But in our case, the residues are always amino acids. So each amino acid could adopt roughly 36 squared states. Some of them will be impossible because things are gonna collide. But again, don't worry about the details. And then if we have 100 such amino acids, you have 36 squared raised to the power of 100. So that's 36 to the power of 200, or roughly 10 to the power of 311. I would challenge you to bring up your calculator and try to calculate what that number is. You probably don't have one in front of you, so write a small program on a computer and try to calculate what that number is. The problem is that single precision math will overflow at roughly 10 to the power of 38. This will create an overflow even in double precision math, which tops out around 10 to the power of 308, so that even computers can't calculate a number this large with ordinary floating point. So this is hundreds of orders of magnitude more than the number of atoms you have in the universe. And we just talked about 100 residues here. There are proteins that are significantly longer than 100 residues. So even though we've made horrible approximations, we've assumed that there are only two degrees of freedom per residue, and that anything within 10 degrees is gonna be the same, there is an insanely large conformational space here, which is, of course, partly responsible for the diversity you have for proteins in your body. And only one of these conformations is gonna be the structure that we call native, the biologically important structure, the one that actually works and that we need to find. And that leads to a bit of a problem, because how on earth are proteins gonna adopt that structure in your cells automatically within a split second?
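The order-of-magnitude estimate above is easy to reproduce. A small sketch, using the same arbitrary assumptions as in the lecture (10-degree steps, two torsions per residue, 100 residues); the variable names are made up. Python integers have arbitrary precision, so the exact count can be formed, while converting it to a double precision float overflows.

```python
import math
import sys

bin_width_deg = 10                               # arbitrary choice of what counts as a new state
states_per_torsion = 360 // bin_width_deg        # 36
states_per_residue = states_per_torsion ** 2     # phi and psi: 36 squared = 1296
n_residues = 100

total_states = states_per_residue ** n_residues  # exact integer, 36 to the power of 200
log10_total = n_residues * math.log10(states_per_residue)

print(states_per_residue)          # 1296
print(round(log10_total, 1))       # 311.3
print(len(str(total_states)))      # 312 decimal digits

# Largest finite double is about 1.8e308, so the conversion fails.
print(sys.float_info.max)
try:
    float(total_states)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)                  # True
```

Working with the logarithm of the number of states, as in `log10_total`, is the usual way around the overflow.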
There are some cases where the molecules can have cis-trans isomerization, so that proline in particular could have the oxygen and the hydrogen on the same side. I'll skip this for now; it's not an important slide. The reason why normal amino acids would hate that is that for a normal amino acid, that would mean that these two things, the side chain on the first amino acid and the side chain on the second amino acid, would collide. This doesn't seem horribly bad, but remember that this is not one atom; that is an entire large side chain and that is an entire large side chain. So that's gonna be very bad in general. Proline, on the other hand, has this small ring, and here you actually get a collision between the ring and the side chain on the previous residue if they are in trans. So proline sometimes likes to go over and be in cis. And this is something we hate too, because sometimes here literally means roughly 50-50. So we can't assume that proline is always cis. So with proline all bets are off. We're gonna need a computer or something to predict what it is. And that starts to be pretty complicated, right? How on earth do we keep track of things? And 36 squared, roughly 1,000 states, is quite a lot just to understand a single amino acid. It turns out that things are much easier in practice. So the good news is that thanks to the people in the 1960s who started determining crystal structures with X-rays, we know a ton of structures of proteins. When I took a course roughly like this in 1993, I think it was, first year for Cheyenne, he had us strike out a sentence in the lecture compendium he had written, because it said that there are roughly 100 protein structures known, and that's almost 350 now, he said with a very proud voice. That's an important change. Do you have any idea how many protein structures we know now? Roughly 100,000, and that's in your lifetime.
And this is a revolution for drug design, molecular biology and everything. We're starting to understand not just the alphabet, but the structure of life. All modern drug design depends on this. And when we know the structure, we can start calculating what the torsions are. So what values do they take? And again, you're physicists, right? So if you have two variables, phi and psi, that's something we can plot in a two-dimensional space. For each amino acid, let's put phi on the x-axis and psi on the y-axis and put a small dot for each amino acid. And then it's gonna turn out that, well, each black dot is an amino acid, and this is just one protein, I think, or two. So this is phi and this is psi. And exactly how you define the angles, I'm not gonna go into details there now. There are different ways of doing it. So apparently it's very common to have stuff here. It's somewhat common to have stuff here. There are a few over here. You never see anything in the white region here. Never ever. So I'm not sure about you, but a thousand seems a bit excessive. I would not say that there are a thousand different states here. We could even be radical and say, let's group this into one state, and that is one state, and that is one state. Maybe that too. So maybe I have four states and not a thousand. That's a pretty nice simplification before 11 a.m. Any idea why this never happens? The white stuff? So if you look at the part on the left here: would you in general imagine that you could rotate these two planes absolutely any way you would like? In general they're gonna collide and bump into each other, right? And there are gonna be some conformations here where things are virtually guaranteed to bump into each other. For instance, those two atoms. And when they're virtually guaranteed to bump into each other no matter what side chain you have, then we're in the white region.
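The idea that real torsions cluster in a few regions, rather than filling all 36 by 36 cells, can be sketched by binning phi and psi pairs on the same 10-degree grid and counting how many cells are occupied. The angle pairs below are made-up toy values chosen to cluster around a few points; they are not measurements from any real structure, and the function names are hypothetical.

```python
def bin_index(angle_deg, bin_width=10):
    """Map an angle in degrees (from -180 to 180) to a bin number on the grid."""
    n_bins = 360 // bin_width
    return int((angle_deg + 180) // bin_width) % n_bins


def occupied_cells(phi_psi_pairs, bin_width=10):
    """Return the set of (phi bin, psi bin) grid cells hit by at least one residue."""
    return {(bin_index(phi, bin_width), bin_index(psi, bin_width))
            for phi, psi in phi_psi_pairs}


# Toy (phi, psi) values in degrees, invented for illustration only.
toy_angles = [
    (-60, -45), (-63, -41), (-57, -47),     # one cluster
    (-120, 130), (-125, 135), (-119, 128),  # a second cluster
    (60, 45),                               # an occasional outlier
]

cells = occupied_cells(toy_angles)
total_cells = (360 // 10) ** 2

print(len(cells), "of", total_cells, "cells occupied")  # 6 of 1296 cells occupied
```

With real structures the occupied cells would be many more than six, but the same count makes the point: most of the 1296 cells stay empty, so a handful of coarse states is a reasonable first approximation.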
It doesn't really matter what protein we have or what amino acid we have. They will never wanna be there. There are some minor differences between amino acids here. General amino acids roughly obey this pattern you saw. And again, I'm gonna tell you what these regions are in a second. There are two special amino acids. First we have glycine. Glycine is the amino acid that has two hydrogens on the alpha carbon. Glycine is very flexible because it doesn't really have that side chain. And a large reason for this big white space is that side chains will bump into things. So if you don't have a side chain there are obviously gonna be fewer things you bump into. So glycine has a number of areas here where it actually can be. So it turns out that glycine is gonna be very common when you need to do a tight turn or something. If you need lots of flexibility in your chain it's great to have glycine there. The other amino acid is proline. Proline hardly even wants to play the game. So proline might wanna be there and there, but in principle proline hates to have any neighbors. And that also means proline can frequently destroy the structure. This was known indirectly. We didn't know the structure; people had a guess about the structure of amino acids, but they also used spectroscopy and realized there appear to be two patterns in the spectrum we see. And this is, again, the problem here: when we teach science, we teach retroactively, right? So we already know what the answer is, but if you only have two structures, could you imagine what you would call them? If you don't know anything about them, or two patterns. We would call them alpha and beta, right? It's the first two letters of the Greek alphabet and that's pretty much how it started. There might be two patterns here that we see in the spectrum. They must correspond to something. We don't know what it is. 
And of course with the power of hindsight we know that assembled proteins are diverse, but they're not that diverse. So in the ion channel, actually, it's easier to see: these things tend to have these spiral patterns that you might have seen before. They're going to correspond to this alpha pattern. We call them alpha helices, and in this other part you see on the left here, do you see that you have these straight planar things here? It might not be an ideal way of showing it, but we call those sheets, and beta sheets in particular. And once we've assembled the structure here, well, in this case we might have roughly 100 to 200 amino acids. Here you have in the ballpark of 1,500 amino acids. This actually consists of five different chains stuck together. So you get a lot of diversity in the high-level structure, but somehow even here there appear to be common building blocks. We don't need to understand every amino acid here. You just want to see the patterns, right? And this is going to come back, that at some point now we're going to stop looking at atoms. We start looking at amino acids. We're going to try to go away from amino acids, and can we start looking at these patterns? And once we have these pattern building blocks we will try to iteratively go up to larger and larger and larger building blocks. And these building blocks are very much related to this diversity I spoke about. I'll come back to the specific building blocks; that will probably have to be after the break. But there's one very obvious question here. How do we get the big structures that I showed in the last slide from the amino acid sequence? What determines that? Is it an easy problem in the first place? Sadly, I think this is even frequently presented as an easy problem. It's not an easy problem at all, right? You have something very complicated, required for life in the cell. In general, in biophysics, most things required for life we need energy for. 
We need some complicated machinery to construct it. You could certainly imagine that we needed a very complicated machinery to construct our proteins and that we'd have some sort of other wizard components in the cell that created the proteins. But that's actually not the case. So Christian Anfinsen had a very famous experiment in the 1950s, I think it was, where he could show that you could take a small protein, luciferase, it was a protein that is fluorescent. And then you could take this protein and destroy it. I think he used a strong acid, but it doesn't matter. So you could destroy the protein so that it unfolded and was no longer native. But he did this in a test tube. And then they used chemicals to remove the acid again. And then you start to see the light again. So now you can show that even outside of a cell, as long as you have the sequence of amino acids, they will somehow spontaneously convert back to the folded protein just in an aqueous solution. And it's one of those things that when I typically give this course, every student says that, yeah, but that's obvious. Trust me, it wasn't obvious in the 1950s and that's why he got a Nobel Prize for it. So you are physicists, most of you, and speaking in a physical way, what Christian Anfinsen explicitly said is that the native structure of a protein corresponds to the global minimum in free energy for the molecule. It's the lowest possible free energy the molecule can adopt. And as physicists that should make you happy, right? But now we have another problem. How many conformations were there of a simple molecule? Possibly more than the number of atoms in the universe. And how on earth can the molecule find which one is the lowest in a millisecond? There is no way to test all of them and yet it does. So there was another researcher, Cyrus Levinthal, who formulated this into a very famous paradox. And it's important to get to this slide. 
Cyrus did not think that this was wrong, because of course we know that proteins do fold in your body. And Christian Anfinsen's result is also obviously right. But the paradox in the physical sense is an apparent inconsistency that we need to try to resolve. If you make this radically simpler, forget about those 36 states. Let's just say that you have alpha and beta. That's two states per amino acid, right? And we definitely saw that in the experiment. They can adopt both states. For 100 amino acids, that would be two to the power of 100 different conformations. That's a fairly large number. If you work with computers, two to the power of 64 is large enough to handle any integer we ever need to work with, right? So two to the power of 100 is also an insanely large number. There's no way we can test that even with those two states. We're gonna come back to that later on in the course and try to resolve that. But this is very... What I love here is this interface between chemistry and physics. Both of these statements are very much physics. We know that all this chemical diversity and everything has to be explained by physics. Otherwise, this experiment wouldn't have worked in the test tube. And as I said, there are proteins that range from very small, cytochrome c at about 100 residues, up to something like titin. As you're gonna see, titin exists in your muscles. 30,000 amino acids per protein. So forget about two to the power of 100. Try to calculate two to the power of 30,000. So these are some pretty darn complicated molecules. Oh yes, I have a case study on that. Just to show you an example. So what happens in your muscle fibers, and apologies to all my MD friends... So you have the muscles and then you have a muscle bundle and then we get down to the muscle fiber. Sorry for skipping this a bit, because I'm not interested in the muscles. I'm interested in the molecules. 
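The counting argument above (two states per residue, so two to the power of N conformations) can be made concrete with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function name `conformation_estimate` and the sampling rate of 10^13 conformations per second are assumptions picked for illustration, not numbers from the lecture. Logarithms are used so that the 30,000-residue case does not need a 9,000-digit integer.

```python
import math

SECONDS_PER_YEAR = 3600 * 24 * 365

def conformation_estimate(n_residues, states_per_residue=2, rate_per_second=1e13):
    """Return (log10 of the number of conformations, log10 of years to try them all).

    rate_per_second is a hypothetical sampling rate, assumed for illustration.
    """
    log10_conformations = n_residues * math.log10(states_per_residue)
    log10_years = (log10_conformations
                   - math.log10(rate_per_second)
                   - math.log10(SECONDS_PER_YEAR))
    return log10_conformations, log10_years

for n in (100, 30000):
    log_conf, log_years = conformation_estimate(n)
    print(f"{n} residues: about 10^{log_conf:.0f} conformations, "
          f"about 10^{log_years:.0f} years to enumerate")
```

Even with the radically simplified two-state model, a 100-residue chain would need on the order of billions of years at this assumed rate, which is the apparent inconsistency with folding on millisecond timescales.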
Far deep in the muscle fibers, you find what you call the myofibril. This is still a biological construct. You can see a myofibril in the microscope. And if you dig further down into the myofibril, you start to see what you call a sarcomere. And now you're starting to get things that are almost on the molecular scale here. And inside the sarcomere, you have some strange, very elongated molecules. And these are these 30,000-residue molecules I spoke about. So what happens inside the sarcomere is that these molecules are bound to what you call actin filaments, which are also part of the muscles. And then, depending on the environment, these molecules, I think some domains of these molecules, are flexible, almost like springs. So they have a structure here with tons of beta sheets. But if I pull on the left and the right side here, I can pull the molecule apart. It works. But I have to apply some force. And when I release the force, the molecule will, just like a rubber band, go back to its previous state. And depending on how much, well, I need to expand this, I'm gonna need to add energy. But now, depending on how I add energy, I can get a molecule to move. The cool thing is, this happens every time you move a muscle. You have millions of these molecules moving, contracting and expanding. And this works all the time. So you have a nerve signal going from my brain down to a particular muscle that's telling the muscle to contract. And then you have a gazillion of them immediately going through these features. But the reason why this works is that this is a purely physical process. We even have colleagues who have been able to calculate this with computer simulations and show what the forces required are and how strong the counter force will be and everything. So even something as biological as the muscles in your chest and everything, we can explain that on the physical level. 
And here too, there are plenty of examples where you have mutations in these proteins. And if you have bad mutations here, it can mean that you have very weak muscles. Here too, there are plenty of genetic diseases. We're not gonna, this is not a course about muscles. I think I'm gonna skip titin for your sake because 30,000 is a bit too large for our taste. There are three different classes, but I'm gonna use one more minute. No, two slides, one minute. And then I'm gonna let you have a break. There are two large classes of proteins that we're gonna spend a little bit of time next week talking about. And if you're interested, I put all the slides throughout next week on Canvas already. There are plenty of fibrous proteins in your body. They are somewhat important, but fairly boring. Skin, nails, hair, all these large building blocks, even part of the stuff that goes into your bone and everything is actually protein. And it's of course important, but they are large and repetitive structures. And for that reason, I don't find them particularly interesting, but we need to go through them. The common workhorse proteins in your bodies are the water-soluble ones. They're typically called globular, and that just has to do with the shape. They're roughly spherical. They are cool because they can exist in a ton of different folds, and, well, that also gives them very diverse functions. That's usually what we're gonna look at when we wanna understand how proteins fold and what it is that causes a protein to fold in the first place. And then we have my particular love story here at the end, where I have to confess that I'm horribly biased, and the books don't talk that much about this for two reasons. Partly because it's complicated, but also because it's new: membrane proteins. So these are proteins that sit in the lipid bilayers in your cells; roughly 25 to 30% of all the proteins do that. 
That makes them exceptionally difficult to study experimentally because they basically sit in oil. And I'm not sure about you, but I haven't seen a whole lot of oil crystals. So things that need to be solubilized in oil have historically been virtually impossible to determine the structure of. But because they sit in the membrane, these are the doors and windows of your cells. Anything that needs to act on a cell. The molecules that are responsible for conducting the nerve signals have to be membrane proteins. The ones that receive the nerve signal are membrane proteins. Anything that receives signals, if one cell is telling another cell, hey, divide, has to be a membrane protein. Pumping energy in and out of your cells has to be membrane proteins. So this is gonna be very directly related to biological function and that's why we find it fascinating. It turns out that in contrast to most other molecules, proteins are both hard and soft. Most chemical molecules, if you look at benzene or something, that would be hard. If you pick a small domain of a protein, it too can be fairly hard. You can turn it into a crystal. But larger multi-domain proteins, I showed you titin, right? They can be floppy, flexible and kind of soft. And in contrast to benzene or the small molecules you might be used to, this means that the key thing is a molecule that can have many conformations. And it's not just that it can have many conformations, but during the course of its normal function, it will change the conformation. An ion channel will open and then it will close again. It's all based on physics. We can understand it, although it's not easy. There is no magic going on here. And it should also be said that when proteins fold, they tend to be all or nothing. Either we have it folded or we have it unfolded. They don't tend to gradually loosen up a bit. 
Why, we don't know at this stage, but that's very much what this course is going to be about, understanding these complex molecules. It's two minutes past the hour. I'm not going to steal more of your break here. Let's meet here at 16 or 17 minutes past the hour. So we compensate for that after the break. And then I'm going to continue, get back to talking about protein structure. So I spoke a little bit about protein structure and I deliberately introduced this some sort of alpha and beta structure that we have no idea what it is. And if we go back to this sequence thing again, that you asked a very good question about this morning, the main, real sequence we have is a sequence that's coming from DNA. But on the other hand, if we now look at proteins, we could of course start to look at DNA and say AGCT. But before long we're going to run out of space because these sequences will be so long. We don't want to deal with DNA. And we know that all these different DNA bases will code for one of 20 amino acids. And given a triplet of DNA, we know what amino acid it is. So for us, when we think about proteins, let's forget about DNA. So in terms of proteins, we start at protein structure. And the first thing we start with is the sequence, which we call the primary sequence or the primary structure. Nobody would really call that a structure, but it is of course some sort of one-dimensional structure. But that is the sequence I normally mean when I talk about sequence, the sequence of amino acids in the chain. Then this will form some sort of next-level structure. We don't know what that is yet, but I will show you in the next few slides. But even if we don't know, after primary comes secondary, right? After the secondary structure, these small common building blocks will be grouped into common patterns. And again, you don't know what it is yet, but some sort of chain, they will fold up in some particular shape. Let's call that level-three structure, tertiary structure. 
And at some point, my ion channels that I showed you, where I have multiple such chains, you might even have a quaternary structure. So again, think of this in levels, because when I'm here and I'm looking at 1,500 residues, each with 15 atoms, trust me, there is no way you can keep 20,000 atoms in your head and try to understand what they do. So when it comes to these secondary structures, I will spill the beans a little bit: these two areas in the Ramachandran diagram that I argued appear to be common, they correspond to two very common building blocks. And one of them is when the amino acids are curled up in a spiral, almost like DNA, but this is not a double helix, just a single helix. And the other one up there is when they are just straight. So when they are just straight, they are actually positioned pretty much in the pattern I showed here. This curled-up one is slightly different. So why would it like to curl up? Well, the curled-up one is the helix, what we call the alpha helix. And one of the reasons for curling up, and this is so not obvious, I can try to bring a molecular building kit next week so you can play with it if you want. It turns out you can form hydrogen bonds from one turn here to the next turn. So an amino acid four steps away can form a hydrogen bond with this one. So each amino acid here will participate in one hydrogen bond going down and one hydrogen bond going up. So that means that you form lots of nice beneficial interactions, and you can think of this as lots of glue or something. So you're gonna get a nice, very firm structure here. It's virtually perfectly packed. There is no extra space and there are no groups colliding, and all the side chains will point out away from the helix so there's plenty of space for those too. You could actually almost imagine that if you knew what an amino acid looked like and you sat down with a building kit and thought really hard about it, you could come up with this yourself. 
There are a few other helices. The only reason I bring them up is that you might hear of them if you start working with protein structures. The default one is called the alpha helix. You can think of taking the spiral there and just twisting it harder. And if you twist it hard enough, at some point the hydrogen bonds will jump. So that's, I'll let that out. So here you see the hydrogen bonds in purple, right? If you twist this hard enough, the hydrogen bonds might jump one step. So instead of being a hydrogen bond from residue one to five, so N to N plus four, if I twist it really hard it's gonna be N to N plus three. This creates a very tightly wound helix. It's called the 3-10 helix. That three just has to do with the fact that you're making a hydrogen bond to residue four, three residues away. And you could also do the opposite. You could take this spiral staircase and try to unwind it. The reason why these are not common is that this one is gonna be too tight, so the atoms will collide a bit, and this one is, well, it's too open. So it's almost gonna be vacuum here in the middle, and nature abhors a vacuum. So in practice, every time you would like to arrange residues like that, they're always gonna end up in this pattern. So a hydrogen bond from residue one to five, from two to six, from three to seven, N to N plus four. And this is such a common building block that occurs in lots of proteins, right? If we just take proteins and melt the overall structure a bit, this is why this was present in the CD spectrum. If you just look at how different proteins turn light, this is gonna be one of the very common patterns we see, the alpha helix. This beta sheet that we had there, that looks almost boring, and it is boring. If you think about it, which one of these do you think is more stable? I already hinted that the alpha helix should be quite stable, right? This one, is there any reason at all why it would be stable? So why does it occur in the first place? 
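The hydrogen-bond bookkeeping for the helices discussed above (N to N plus four for the alpha helix, N to N plus three for the tighter one) is easy to write down as index pairs. The sketch below is only an illustration: the name `backbone_hbonds` is made up, and the N to N plus five entry for the more open helix (commonly called the pi helix) is added here as an assumption, since the lecture describes that helix without naming it.

```python
# Residue offset between hydrogen-bond partners for each helix type.
# "pi" (n -> n+5) is the usual name for the unwound, more open helix;
# it is included as an assumption, the lecture does not name it.
HELIX_OFFSETS = {"3-10": 3, "alpha": 4, "pi": 5}

def backbone_hbonds(n_residues, helix="alpha"):
    """List (n, n + k) residue pairs, 1-based, for an ideal helix of this type."""
    k = HELIX_OFFSETS[helix]
    return [(n, n + k) for n in range(1, n_residues - k + 1)]

for name in HELIX_OFFSETS:
    print(name, backbone_hbonds(8, name))
```

For an eight-residue alpha helix this gives the pairs one to five, two to six, three to seven and four to eight, the same pattern read off the slide.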
Because it obviously does occur. We could see that from the Ramachandran diagram. Actually, I wouldn't say repelling. You can think of this in terms of physics or math. There are necessary versus sufficient conditions. So the necessary condition is that things can't collide, and it fulfills that. As you saw here, right, the side chains are on different sides and things don't collide, and that's important because if we don't fulfill that we're not gonna have anything at all. What is not obvious is, yeah, but what would be good about having it in that structure? So if you just take one of these structures, it doesn't really help, but what if you have two of these or three of these? So many of them right next to each other, right? Then they can start to make hydrogen bonds to each other. And you sometimes talk about this as a single strand, and when you put many of them right next to each other we talk of it as a beta sheet. So this is actually not stable at all. So it's intentional that I'm only showing you one. This would not be stable. But if you have more than one, they start to be stable. So this is a bit of a freak of nature. I mean, it's not really clear why, well, we have these because amino acids are apparently stable in them. And what's true for both of them is that local structure is highly ordered, both for helices and sheets. There are some turns too, we will talk a little bit about that next week, but because these are small stable building blocks that can be adopted by almost any amino acid, you will see them all over the place. No matter what protein you see, it will have either lots of helices or lots of sheets or both. And now we can take a step up this ladder, right? So rather than worry about atoms we started to think in terms of amino acids. But rather than worrying about amino acids maybe we should start to think about helices or sheets. 
We don't really have to worry about whether that amino acid is a leucine or isoleucine. It doesn't matter. And that's why secondary structure is important. It allows us to understand things on a conceptual level. And as I said, for the beta sheet, what happens in practice is that there are two ways to arrange them. You need to put things next to each other, and these arrows, well, the direction of the arrows is just the order of the residues, one, two, three, four, five, et cetera. And they can be either anti-parallel or they can be parallel. And among these it's far more common to have them anti-parallel, because if you have them anti-parallel you can go up and then you just make a small turn and then you go down and then you make a small turn and then you go up. Here you have to go up and then you have to go out of the board here and then have some other structure and then you can come back and go up again, and then we have to go maybe inside of the screen and then connect here and then up again. But both of them occur. We will come back and talk a little bit about that next week too. But for now just realize that both alpha helices and beta sheets are good because, A, there are no things colliding and, B, they are reasonably stable because they can hold lots of hydrogen bonds in these structures. And I haven't really told you yet why hydrogen bonds are so beneficial. The cool thing with this structure is that, oh, I'm gonna pass around that paper. These were discovered by Linus Pauling. Same guy that had that horrible DNA structure yesterday. Pass it around. This is a review paper by Dave Eisenberg, written in 2003. So there was a series of eight papers in PNAS in 1951 where they went through all these structures. What is impressive with this? It's insanely impressive if you're a physicist. Look at the year. You might not remember that. When were the first protein structures determined? With X-ray crystallography. 
Yeah, so the point is that, well, that was in particular, like with DNA, the late 1950s, right? The point is, they predicted these structures based on physics and theory. They predicted that there must be a structure that looks roughly like an alpha helix. And they even said that there could be two, I think this is 2-7 and then this is either 3-10 or 3-16. This might be the alpha helix actually. But the point is that all these different helices and everything, they predicted based on theory and interactions and physics. And then a few years later, it was all confirmed experimentally that this is indeed the way the structures look. And it's insanely cool that they were perfectly right. And that kind of forgives everything about this horrible DNA structure, right? You see the power of physics. They dared to have a model, dared to make predictions. And in Linus' defense, when Watson and Crick presented their DNA structure, he was the first one to say that their structure is obviously right, ours is obviously wrong. There is no problem whatsoever in being wrong, but you need to be willing to admit that, oh, my model is not as good as this other model. Let's give up on my model. Those eight papers are a bit lengthy to read and everything, but if nothing else it's fun to look at the pictures in these papers. And nowadays we know that virtually all the structures in proteins belong to these alpha or beta classes. So if we go back to this Ramachandran diagram, what we know is that we have this region here that is alpha helices and we have this region up here that is beta sheet, and pretty much everything else we can forget about. So this alpha helix is right-handed, and that's just the way the spiral is going, and that has nothing to do with left- versus right-handed amino acids, it's just the orientation of the helix. But you are physicists, right? Are the laws of nature symmetric or not? 
Why is there no left-handed alpha helix? Why are all helices right-handed? Why are all alpha helices right-handed? So the electromagnetic forces, that's the law of physics. Are the laws of physics symmetric or not? So this is kind of violating the laws of physics, right? So why are alpha helices always right-handed? I spoke about that before the break. Kind of the first thing we brought up. The amino acids are not symmetric. The amino acids are chiral, right? The amino acids have a handedness. So let's try to shake hands. Are you right-handed? Yes. Okay, let's shake hands. We're not compatible, right? That is compatible. So the problem here is that you're not compatible. Because all the amino acids happen to be, and this is the sad part, they happen to be what we call left while the helix is right. Sorry, because when we defined the amino acids, we didn't know that the helix would be right-handed. So when all the amino acids are the left version, they can only build a right-handed type of helix. In a theoretical parallel universe where all you had was D-amino acids, the alpha helix would be left-handed. And then we would just have a mirror image of the Ramachandran diagram. So now we start to see that this thing that looked like a small chemical detail or artifact leads to a symmetry break all over the place. There is a handedness built into the biology, and that's why your left hand is not the same as your right hand. You can't take your right hand and rotate it to be the same as your left hand, it's impossible. So that's gonna determine how molecules interact and what molecules can interact. The helices are also pretty cool in the way that, well, if you're a physicist at least, in those peptide bonds the oxygen has a fairly large negative charge and the hydrogen a fairly large positive charge. So there's gonna be a fairly large dipole over each of the peptide bonds. 
And if you look at this, it's not obvious, but if you look at this in the helix, it turns out that all these dipoles will line up in a helix. So this will actually turn the entire helix into a dipole, as if you had a large negative charge here and a large positive charge here. So the helix itself will be like a dipole. That is not just a fun artifact, we will come back to that. No, actually, it's the next slide. That might appear like a fun, small, minor physical detail again. The very first membrane ion channel crystallized, by Rod MacKinnon, KcsA; the K stands for, it's a potassium channel. Let's forget what the csA is right now. Potassium channels are super important in your cells too. The sodium-potassium balance and everything, and driving your cells. So there are plenty of cases where we want channels that should be able to pass potassium. The only problem: we absolutely do not want these channels to pass sodium, because again, we need to have different concentrations of sodium and potassium. There is one small problem here though. So the channels are simple in a way. Channels don't use energy. Channels are just holes, basically, that need to open and spontaneously let something through. The only problem is that, and this might be secondary school chemistry that I don't expect you to remember, potassium is larger than sodium. So you now need to create a hole that always lets the large thing through, but not the small thing. I'm not sure about you, but that's a complicated hole. So the way nature does this: sodium is a small ion. The radius is small. And all ions have some water around them. And when the ion is small, it's gonna bind the water fairly hard, while if the ion is larger, it's not gonna bind the water so hard. So sodium, being smaller, is gonna bind its water harder. So in this channel, you have four helices here and they're all pointing into a center here, really. 
So when the ion gets in here, all these dipoles will coordinate the ion, so that the positive ion here will feel the negative ends of these dipoles. And then it will happily let go of its hydration water. And then you just have the ion. And then the bare ion will happily slide through and go in there. Sodium, on the other hand, will hold onto its hydration water because it's not happy enough with the stabilization. And while potassium is a larger ion, potassium without the water is smaller than sodium with the water. And now nature has created a hole where the large thing will go through, but not the small thing. So there are tons of things like this. But again, the reason why this works goes down to the physical properties of the constituent secondary structures, which is due to their amino acids, which is due to their atoms. And there is no fancy thing with energy involved or anything here. We can explain this entirely from physics. Movies. This is a fun old movie. I got this from Mike Levitt a few years ago. 1966, you have no idea how complicated it was to make these movies, because they needed to film with an eight millimeter camera and then they needed to sync the camera with every image they were showing on the monitor. There was no such thing as just exporting it as an MPEG at the time. This was so cool because when they got the first X-ray crystallography structures, again, in those days, you just had numbers on paper, right? The mere fact that you could visualize molecules and rotate them and see what they looked like in three dimensions, this was insanely cool. And I think there is another one for, let's see if we see that movie. Lysozyme, it's another protein, roughly the same thing. In the interest of time, I will skip them. These are available on Canvas too. So virtually all the names you see here, they've been part of one or more Nobel prizes since. So this was in Cambridge in the 1960s. 
When people started to look back, we can take the coordinates of these structures that you determined by X-ray crystallography and first put them into a computer. That means that you can visualize them, but it also means that you can actually analyze them and start to understand, can you try to optimize the position of the atoms? Can you try to minimize what the structures look like? So the whole concept of starting to model proteins, large molecules, with computers was entirely new at the time. This is the computer that these movies were created on. And I'm not sure, it's a very special computer. It's actually a MAC, a Multi-Access Computer, not the type of Mac you're used to. So that's the terminal. Note that there is no screen on the terminal. So that's the only screen. And you could only display graphics on the screen. The whole concept of having a terminal that could display text was utterly unknown. The text was displayed by having a printer write out the statements. And you would program this with punch cards. And you could have something kind of like a mouse. I'm not sure, at least you could try to rotate it. And the reason I'm showing you that: the guy who was behind all of this was Cyrus Levinthal. And you might have seen before the break that you had this official picture, which I bet was from Cambridge, where he has a tie. Scientists don't look that way. So this is what Cyrus Levinthal really looked like. So Cyrus was one of the first computer geeks in this business. And I managed to find, this was hard to find, I had to dig through a library. A few years after this, Cyrus wrote a Scientific American paper on this, which is really fun. Scientific American in those days used to have actually really good scientific articles written by scientists, but in a semi-popular fashion. Not the way that, say, Illustrated Science would do it today, but they expected people to be educated and to want to understand the details. 
So this is an entirely extracurricular activity. In a biological sense, this was an eternity ago, like 40 or 50 years, but in terms of physics, we're getting to fairly modern science. What I haven't touched upon so far is that the reason why this works has to do with interactions. We will happily ignore the electrons, but you probably should be aware that all the interactions I'm going to speak about are ultimately due to the electrons. But we would run out of time if I had to cover that. Actually, I have some content here, like orbitals and everything. If you want to read up on this, go right ahead, but the point is, this is all chemistry-course material. We want to understand physics and proteins. So read up on this if you want to understand the background, but I need to start studying what the interactions in proteins are, why things interact, and what that leads to. So we certainly have some sort of fluctuations in charges in proteins, and those are probably the most basic interactions. Take any atom, even xenon or something. If the electron cloud around this atom just vibrates, at some point in time the electrons will be a little bit to the left here, and that means I have a small dipole on this atom. That dipole will perturb the electrons in the other atom to do the same thing. So you just have a bit of noise: when the electrons are fluctuating around the atoms, you create a tiny dipole in one atom that causes, and interacts with, a tiny dipole in another atom. You sometimes call those induced dipole-dipole interactions. You don't have to understand the basis of that, but this is what you call van der Waals interactions, or Lennard-Jones interactions: all atoms interact this way, and in particular, at large distances, all atoms will attract each other. This is the reason why even noble gases will eventually condense and form a solid phase, even if there are no charges whatsoever involved.
But from our point of view, we're just gonna be happy to note that there are some interactions which mean that things can't overlap or collide, if nothing else because of the Pauli exclusion principle, and that at large distances everything will attract everything else. So that's one of the things we need to be aware of. The energy of this particular attraction is roughly proportional to one over the distance to the power of six. So the obvious question is, do you need quantum mechanics? Well, of course you need quantum mechanics. We know that quantum mechanics is true, and you can't describe the world properly without quantum mechanics. On the other hand, the problem with quantum mechanics is that if we were to do it properly, we can only handle tiny systems. When I was your age, people were so proud because they'd been able to calculate the electron density of benzene with quantum chemistry. Six carbons. And six carbons does not form a protein. That's not even a side chain in a protein. So even if you could handle 100 atoms, there is no way we could treat proteins. That's a horrible excuse, though. If you're not gonna do the right thing just because you can't do it, then maybe we shouldn't do this type of science at all. The other argument, though, is that even the quantum mechanics is not exact. First, to do it accurately, you should use time-dependent relativistic quantum mechanics. And then we're probably down to one electron, maybe. Well, at least the hydrogen atom. Because the other thing you typically assume is what you call the Born-Oppenheimer approximation: you assume that the atomic nuclei don't move. So yes, if you have a hydrogen where the nucleus does not move, we can solve the time-dependent relativistic Schrödinger equation for that one electron. But if I just talked about proteins having different conformations, assuming that atoms can't move is kind of a deal-breaker for us, right?
So the point is not that quantum mechanics is simply right. Quantum mechanics makes other approximations. With one of those approximations, you ignore things that are supposedly completely irrelevant, such as water. For proteins and cells, you can't ignore that. You can't say it's zero Kelvin without water. Then we've just thrown out everything that has to do with biology, and yet we're arguing that this is technically more accurate because it's quantum. And this is a bit of a problem here, because this is playing with fire. I'm not saying that this is good. But on the other hand, if you're going out and playing football (I can't really play football anymore, but if you're playing football), how frequently do you solve the wave equation for the football? It's completely pointless. We know that technically it's a correct description, but it's a massive overkill that would just cause us pain and problems. And it's the same thing here. I would argue that for the way proteins move, we don't really need quantum chemistry in 99 cases out of 100, while quantum chemistry itself comes with other approximations. So virtually everything we do is instead going to be based on statistical mechanics, because statistical mechanics deals much better with this complication of many states. The fact is that it's not just one benzene molecule; you have a protein that can, again, adopt a billion different states. Which one is more likely? All of them are relevant, but we need to talk about probabilities. And that is something quantum mechanics does not deal well with. You could in principle do some sort of extrapolation that starts from quantum chemistry. But if you extrapolate by 15 orders of magnitude, that's gonna be like me trying to point at Paris from here. I'm not sure if any of you have lived in Paris, but the Eiffel Tower and the Arc de Triomphe are not exactly close to each other. They're in different parts of Paris.
So try to point at the Eiffel Tower and make sure you don't point at the Arc de Triomphe. It's a bit difficult, right? And that's definitely not 15 orders of magnitude. So it's difficult to extrapolate. These three gentlemen who got the Nobel Prize in Chemistry a few years ago came up with a really beautiful idea: you can kind of cheat. Instead of trying to extrapolate from quantum mechanics, we can look at experiments. So we adapt simple parameters to make sure that our simple models reproduce experiments. Instead of trying to determine parameters for, say, water from quantum chemistry, we can determine parameters for water such that we reproduce the diffusion, the density, the heat of vaporization, maybe the dielectric properties of water. And that's a much easier problem. You can, of course, argue that it's horrible, that it's not proper quantum mechanics, and that's certainly true. But if our goal is to model water, it will actually be a better model. And the reason why this got the Nobel Prize was not just that idea; this is really what has made it possible to simulate proteins moving in computers. Burke is gonna tell you more about that later in the course. This is why we're able to predict binding, why we can do drug design and everything. Everything where we actually do calculations on real proteins tends to use this type of interactions. And those are the simplified interactions that I'm gonna spend the last 15 minutes going through here. By far the simplest thing we can imagine has to do with bonds and bond stretching, right? You have some bonds there, and the bond here between those two heavy atoms can move. And we're gonna need some sort of way to describe that. And because you're physicists: if you put a spring there, you all know this, but you probably haven't been thinking of it for this type of molecule.
Depending on what the distance between those two atoms is, there's gonna be some sort of energy here, right? And the energy can be high or low, and hopefully you're all aware that low energy is good in physics, while high energy is bad. So if the energy is too high, there's gonna be a point when the bond can't stretch further and it collapses. And for this particular thing, you know this really well too: this is a quantum oscillator. So it gets kind of complicated, with a ground state and different energy levels and everything. But we just said, let's not do quantum. So we can approximate this with some horrible thing, like the green curve, just a harmonic. Why do you approximate things with harmonics? Physicists love to approximate things with harmonics. We love it so much that we have fooled an entire generation into thinking that it describes the energy of a spring. It doesn't. Yes, but not just that. So what would you use if you didn't pick a harmonic? It's actually much easier than that. Looking at this, forget about the value, because this is an arbitrary scale; I can say that the value is zero or 47 there. I know that there is a minimum where I'm happy, because if I am here, I can't get lower, right? So what is the first derivative at that minimum? It's not a trick question. Yes, it's zero, good. So we have a function that should have no first-derivative term there, right? And then if I want to study what happens around this value, well, it goes up, so it should have a second derivative, right? What do you know about that second derivative? In this particular case, you have a minimum, so it should be positive, right? What more can you say about the value of that positive second derivative? Nothing really. Good, so we're happy. Let's not make things more complicated. You have a harmonic. The point is not that there isn't anything more, right?
But remember what I said: stick with the first and easiest possible approximation. The problem is that we know this approximation is horrible, because if you take two atoms and pull them apart, what's gonna happen is not that the world blows up; eventually you're gonna break the bond. So in real chemistry, you would have something like the blue one, where the atoms can't overlap, and at some large distance here you reach a dissociation energy when you break the bond. But that almost never happens in proteins, so let's ignore that. Horrible approximation. The other problem is that it turns out that this harmonic approximation is not really ideal near the minimum either. In principle, you might wanna do this quantum-wise and calculate the different levels, but in an entire protein, whether the bond length between that carbon and the hydrogen there is 1.1 angstrom or 1.12 angstrom, we probably don't care that much. So we have something that can describe that vibration, and then we're happy. There's gonna be some sort of force constant there. A simple spring. And in contrast to quantum mechanics, this is something that we can calculate in like five or 10 cycles on a computer, which is important. You can do the same thing with an angle. So then you have three atoms. You can define this vector and you can define that vector, so you have two vectors. There will always be an angle between those two vectors. Can you come up with any good functional form we could use to describe that angle? Yeah, a harmonic, right? There are some other ways you could in theory describe it, but the harmonic works pretty darn well. It's not quite as rigid as the bond, but it's good enough. And the point is that "good enough" is critical here. There are some slightly fancier ways you can describe it, but this is not where our challenges are gonna be. Remember the Ramachandran diagram.
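As a rough illustration of the bond and angle terms just described, here is a minimal Python sketch. The harmonic form is the one from the lecture. The Morse function is only one common choice for a curve with a repulsive wall and a finite dissociation energy; it is assumed here as a stand-in for the blue curve, which the lecture does not name. All parameter values (`R0`, `DEPTH`, `A`) are illustrative assumptions and are not fitted to anything.

```python
import math

# Illustrative, unfitted parameters (assumptions, not taken from the slides).
R0 = 1.1        # equilibrium C-H bond length in angstrom
DEPTH = 100.0   # assumed dissociation energy in kcal/mol
A = 2.0         # assumed Morse width parameter in 1/angstrom
# Pick the spring constant so that both curves share the same second
# derivative at the minimum: d2E/dr2 = 2 * DEPTH * A**2 for the Morse form.
K_BOND = 2.0 * DEPTH * A**2   # kcal/mol/angstrom^2


def harmonic_bond(r, r0=R0, k=K_BOND):
    """Simple spring: zero at r0, grows without bound in both directions."""
    return 0.5 * k * (r - r0) ** 2


def morse_bond(r, r0=R0, depth=DEPTH, a=A):
    """Morse form: steep wall at short r, levels off towards `depth` at long r."""
    return depth * (1.0 - math.exp(-a * (r - r0))) ** 2


def harmonic_angle(theta, theta0, k_theta):
    """The same spring idea for the angle between two bond vectors (radians)."""
    return 0.5 * k_theta * (theta - theta0) ** 2


if __name__ == "__main__":
    # Near the minimum the two bond curves agree closely; far away they do not.
    for r in (1.08, 1.10, 1.12, 1.50, 4.00):
        print(f"r = {r:4.2f} A  harmonic = {harmonic_bond(r):9.3f}  "
              f"morse = {morse_bond(r):9.3f} kcal/mol")
```

The only point of the comparison is that for small vibrations around the minimum the cheap harmonic term and the more realistic curve are nearly indistinguishable, while the harmonic term never lets the bond break.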
It's not the angles or bond lengths that are our main concern. The main star of the show really is this part. If you have four atoms, there's gonna be a way in which the central bond can rotate. In contrast, the bond length is just gonna vibrate a little bit. The angle is also just gonna vibrate a little bit, but these atoms can actually move, and they can move substantially. So the molecule can take significantly different shapes, and these are gonna be important. We can't ignore the other ones, because if nothing else, we need them to make sure that bonds don't explode and atoms don't overlap, but they're not really gonna change a whole lot. This will change. And the other point here is that we can't really use a harmonic for this, because it needs to be periodic, right? If this bond rotates an entire turn, I can't have the energy explode and go to infinity, as if that couldn't happen. So here you're gonna need some sort of trigonometric function or something to describe that it's periodic. These are the degrees of freedom that you see in the Ramachandran diagram. You can explain it with two planes. Three atoms define one plane, the other three atoms define another plane, and it's the angle between the two planes. In mathematics, you would call the angle between two planes a dihedral angle. We use dihedral and torsion interchangeably. Torsion would be the rotation around the bond, but I mix them up all the time. I don't mean anything special with torsion or dihedral. So if you have four atoms, ijkl, it's the angle from the plane ijk to the plane jkl. And here too, there will be some sort of energy that is a function of how this bond is rotated. The thing with bonds is that if you try to rip a bond apart or crush it, the energy would instantly go to very high values. And very high values might be interesting if you're designing nuclear bombs.
Very high values are not particularly interesting in biology, because if the value is too high, it just means that it will never happen. So we know that atoms don't collide, at least not at room temperature. The interesting thing with these torsion potentials is that things actually do happen. If you look at this small molecule rotating, you can probably see that it's not really good when those two large, heavy atoms are overlapping, right? But all of these conformations are possible. Some of them will likely be better and some of them will be worse. And how good or bad they are, we can describe with a potential where you have the angle here on the x-axis and the energy on the y-axis. And the energy for a small molecule... oh yes, sorry, all of this is butane. For butane, the energies we're talking about here are a couple of kilojoules, a couple of kilocalories per mole. That's another thing. I know that we are in Europe, and we should all use SI units. For historical reasons, people frequently use kcal in these areas. So we're gonna use them interchangeably. Thank God you're physicists, you can convert them. So it turns out that the best conformation is when the two CH3 groups are on opposite sides, opposing each other. The worst one is when the two CH3 groups are overlapping. But all of these are possible, and the molecule can rotate into any of these conformations. So this is our first example of what happens in protein folding, with all those billions of conformations, right? This is a small molecule. You could say there are really just one, two, maybe three or four conformations, if you wanna count the ones at the peaks. But all of them can exist. This will be the most likely one. This will be the least likely one. And how common that one is, is a question. If you increase this barrier, it's gonna be harder to change, and then this process will actually be slower. If you drop this barrier, it's gonna be easier for them to interchange.
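To make the torsion discussion concrete, here is a small Python sketch that computes the dihedral angle from four atom positions i, j, k, l (the angle between plane ijk and plane jkl) and evaluates a periodic energy as a function of that angle. The cosine-series form and the coefficients `v1` and `v3` are assumptions chosen only so that the curve has the qualitative butane shape described above (lowest with the end groups opposite, highest with them overlapping, a few kcal/mol in between). They are not fitted butane parameters and not necessarily the form on the slides.

```python
import math

KCAL_TO_KJ = 4.184  # 1 kcal = 4.184 kJ (thermochemical calorie)


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p_i, p_j, p_k, p_l):
    """Signed angle in radians between plane ijk and plane jkl.

    0 means atoms i and l are on the same side (overlapping),
    +/-pi means they are on opposite sides.
    """
    b1, b2, b3 = _sub(p_j, p_i), _sub(p_k, p_j), _sub(p_l, p_k)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.atan2(y, x)


def torsion_energy(phi, v1=1.5, v3=3.0):
    """Periodic torsion energy in kcal/mol (illustrative coefficients).

    Maximum at phi = 0 (end groups overlapping), minimum at phi = pi.
    """
    return 0.5 * (v1 * (1.0 + math.cos(phi)) + v3 * (1.0 + math.cos(3.0 * phi)))


if __name__ == "__main__":
    for degrees in range(0, 361, 30):
        e_kcal = torsion_energy(math.radians(degrees))
        print(f"{degrees:3d} deg  {e_kcal:5.2f} kcal/mol  "
              f"{e_kcal * KCAL_TO_KJ:6.2f} kJ/mol")
```

Because the energy is built from cosines, rotating the bond a full turn brings you back to exactly the same energy, which a harmonic term in the angle could not do.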
And the cool thing is that the kinetics matter. In physics, we only care about things at equilibrium, but at equilibrium, you're all dead. In chemistry, it matters whether things can happen in one second or a hundred years. Because if it takes a hundred years, you will be dead before it has happened. You can take these phi and psi torsions and plot the energy as a function of the two torsions. The Ramachandran diagrams would usually just say, is it allowed or not? Black and white, right? In practice, we should rather ask... oh, sorry, my bad. No, there. Are things blue, meaning good or really good, or red, meaning really bad? And then there might be multiple different states the small molecule can be in. This is the simplest type of amino acid you can imagine. It's just an alanine residue where you have one phi and one psi torsion that can change. And it's already starting to be a pretty complicated landscape of different energies here, right? You can imagine how complicated this is gonna be with a hundred residues. But the point here is that now we're not talking chemistry. This is physics. It's an energy that is a function of two variables, phi and psi. In theory, what is it, 15 atoms or so? 15 atoms, and each atom has x, y, and z coordinates. So in principle, this is a function of 45 variables. And I'm not sure about you, but I find it really difficult to visualize a 46-dimensional space. So for me, the concept of introducing residues and simpler degrees of freedom means we can think of this at least three-dimensionally. But this is a simplification. Of course the bond length matters, but not as much as the torsions. There are other things that are important. These van der Waals effects we talked about are certainly important. Essentially, things can't overlap, and at some distance, all atoms will attract each other. That is a fairly weak effect, and the most important one is what I mentioned at the top: electrostatics.
All molecules, even uncharged ones, tend to have partial charges. So in benzene, you have slightly negative charges on the carbons and slightly positive ones on the hydrogens. These charges have been determined with quantum chemistry. That is what quantum chemistry is good for. In water, you have roughly minus 0.8 on the oxygen and roughly plus 0.4 on each hydrogen. They're very charged. So even though an entire water molecule is not charged, there can be very strong electrostatic interactions between two different water molecules, roughly 100 times stronger than van der Waals interactions. I know you're physicists and you don't want to learn things by heart, but there are some things that you have to know by heart in chemistry, and that includes the orders of magnitude of interactions. If you have two charges separated by roughly one angstrom, you're talking about a few hundred kilocalories per mole. You don't know yet whether this is a low or high energy, but next week I'm going to convince you that it's a high energy. It's a very high energy. Bond rotation: a few kilocalories per mole. So if the molecule had to rotate a bond to get two charges closer to each other, and they are opposite charges, the molecule would instantly do that. But if they have the same sign, so that they repel each other, it would much rather rotate the bond away. I already spoke briefly about those van der Waals interactions. I'm not going to go into too much detail about them because, again, I prefer to focus on the physics. In principle, at long distance, the attraction is described by this one over R to the power of six term. And you can show that from dipole-dipole interactions, or rather induced-dipole-induced-dipole interactions. And you're all physicists.
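The order-of-magnitude claim for electrostatics can be checked with a few lines of Python. This is a minimal sketch assuming two point charges in vacuum with no screening; the function name is hypothetical, and the conversion constant of roughly 332 kcal·Å/(mol·e²) is the standard value of e²/(4πε₀) expressed in these units.

```python
# e^2 / (4 * pi * eps0) in kcal * angstrom / (mol * e^2), rounded.
COULOMB_CONSTANT = 332.06


def coulomb_energy(q1, q2, r):
    """Energy in kcal/mol of two point charges (in units of e) r angstrom apart.

    Assumes vacuum: no water or other screening between the charges.
    Negative values mean attraction, positive values mean repulsion.
    """
    return COULOMB_CONSTANT * q1 * q2 / r


if __name__ == "__main__":
    # Opposite unit charges at increasing separation.
    for r in (1.0, 2.0, 4.0, 8.0):
        print(f"+1 and -1 at {r:3.1f} A: {coulomb_energy(1.0, -1.0, r):8.1f} kcal/mol")
    # Same-sign unit charges repel with the same magnitude.
    print(f"+1 and +1 at 1.0 A: {coulomb_energy(1.0, 1.0, 1.0):8.1f} kcal/mol")
```

Two unit charges one angstrom apart come out at a few hundred kcal/mol, compared with the few kcal/mol quoted for rotating a bond, which is why the molecule would rather rotate than keep like charges close together.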
So you know that when the electrons of two atoms get too close, if you know your quantum chemistry, the repulsion from the Pauli exclusion principle is going to be an exponential. So it becomes exponentially more expensive to get two atoms to overlap. You're going to get to insanely high energies. If you were planning to take the nuclear explosive device course, that would be very relevant, but this is not a nuclear device. And we're also not doing fusion. We're not simulating anything at a million Kelvin, which means I could not care less. Actually, I do care, because we both work a lot with computers. So calculating one over R to the power of six: first I take R and multiply it with R, so I have R squared. I multiply that with R, and I have R cubed. Then I take R cubed and multiply it with R cubed. So it's three multiplications, and then I have R to the power of six. Each multiplication takes one cycle on a modern computer, so three cycles. Quick. The exponential function, that's roughly 150 cycles. But I'm not designing nuclear bombs. I want something much simpler. I just want something to make sure that atoms can't overlap. So if I have one over R to the power of six and I multiply that number with itself, I get one over R to the power of 12. That's something that goes up quickly. Good enough. And that takes one cycle on a computer. And I know that if you're a physicist, you should cry. But the point is, this has worked great for 40 years in this field. So what people do in practice in computers is that you just model this repulsion as one over R to the power of 12. In general, it's a much weaker interaction than these torsions, and it's not so important. And at that point, the person in the room who thinks that quantum chemistry is important should rightfully step up and say that's an absolutely horrible approximation, and they would be correct. And of course, it would be much more exact to use that exponential. But the problem is that even that is not enough to do quantum chemistry properly.
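The cycle-counting argument above translates directly into code. The sketch below uses the common 12-6 form with two parameters, here called `epsilon` (well depth) and `sigma` (distance where the energy crosses zero); those names and the argon-like numbers in the example are conventional assumptions rather than values from the lecture. The point to notice is that the repulsive term is obtained by squaring the attractive term, with no exponential anywhere.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 energy for two atoms at distance r (same length unit as sigma).

    The r**-12 repulsion is just the r**-6 term multiplied with itself,
    so it costs one extra multiplication instead of an exponential.
    """
    inv_r2 = 1.0 / (r * r)
    s2 = sigma * sigma * inv_r2      # (sigma / r)^2
    s6 = s2 * s2 * s2                # (sigma / r)^6, the attractive part
    s12 = s6 * s6                    # (sigma / r)^12, the repulsive part
    return 4.0 * epsilon * (s12 - s6)


if __name__ == "__main__":
    # Argon-like illustrative parameters (assumed): kcal/mol and angstrom.
    epsilon, sigma = 0.238, 3.4
    for r in (3.0, 3.4, 3.8, 5.0, 8.0):
        print(f"r = {r:3.1f} A  E = {lennard_jones(r, epsilon, sigma):8.4f} kcal/mol")
```

With this form the energy is zero at r = sigma, reaches its minimum of minus epsilon at r = 2^(1/6) sigma, rises steeply at shorter distances, and decays as one over R to the power of six at long range.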
It's not enough to consider pairwise interactions. You also have interactions between three atoms, and four and five atoms at the same time, all the way up to all N atoms interacting at the same time. And we can't do that with a computer. So what we do in practice is use this as a very simple approximate functional form, with these two parameters. And we simply fit these parameters to make sure that we reproduce experimental properties well. And surprisingly, that works better than using exact quantum-chemistry-derived parameters while only considering pairwise interactions. I'm gonna stop here, because I will have to save the last two or three slides. I think this is frequently indicative of the difference between chemists and physicists. I am a physicist, so I am allowed to say this: physicists are far smarter than the average chemist when it comes to mathematics. But beware, because being smart is not necessarily the same thing as being wise. What I am frequently exceptionally impressed with is that chemists, while they're nowhere near as smart as physicists at math, are good at considering what's important. And they can always get the computer to do that advanced math for them. And it's important: if you sit down and assume that quantum chemistry must be better because it must be, did you even consider the alternative? Maybe there is a smarter way to work around that. And I think part of the goal in this course is to get you to be both smart and wise. Physics is good, and math will enable you to do really fancy things, but don't assume that just math and brute force is always the answer. This is one example where thinking about it was actually a better approach than brute-forcing it. Given that it's the weekend, I will spend just one more slide here, explaining why this is gonna be important. Hydrogen bonds are caused by these electrostatics, and yet not quite.
Exactly what the complication with hydrogen bonds is, I'm gonna defer until next week. But hydrogen bonds, as I already mentioned, are super important in the alpha helices. They're gonna be critical in the beta sheets too. They're gonna explain a ton of things about protein folding. So just as the torsions are the most important degrees of freedom in proteins, I would argue that the most important energy we're gonna speak about is the hydrogen bond. And it's roughly the same order of magnitude of energy as the torsions, three or four kcal per mole. But I will save the last few slides until next week. If you wanna read ahead, be my guest. There are study questions for this week too that you should look at. And then I'll continue with the last few slides here. But next week I'm gonna be in all-physics mode. Then we're gonna have equations both on Monday and Tuesday at least. Have a nice weekend.