All right, lecture two, good morning. As I told you, well, I think most of you were here yesterday, but we're going to do this in a slightly different setup. The first part here is not going to be a lecture; we're going to be talking about the stuff that I brought up yesterday. And then the second two hours, roughly, I'm going to spend talking more about electrostatics and interactions in all these molecules. And then we're going to head into the Boltzmann distribution, which is going to be a recurring theme of this course. But I don't expect, you're not physicists, so we're going to approach the Boltzmann distribution more for what you can do with it. But you are going to do the math. Yesterday, we spoke about, let's see if this works. Ta-da, it does. We spoke about the background: why are we doing biophysics in the first place? We spoke a lot about structure determination and how we can get access to these structures. And I also repeatedly brought up this concept that the reason we're trying to organize is to understand. In a way, a protein is very simple. A protein is just a set of coordinates if you think about it. But understanding 10,000 atoms with three coordinates each is difficult for you. So the reason why we're doing this is to organize and make sense of it. We did a bit of repetition on amino acids and peptide bonds. Understanding this is important, not because I'm going to ask you about the detailed properties of individual amino acids. But as I mentioned, as we're moving up this ladder and looking into more advanced stuff, it's important to have this gut feeling that, oh, a proline has these properties, a proline can't rotate around that bond. And if you constantly have to look that up in a book, you're not going to have time to focus on the actual problem. So you need a gut feeling for all amino acids and the degrees of freedom. We spoke a bit about the phi and psi torsions; that's going to come back today.
A very large part of our world is actually going to be phi and psi torsions, because those are, by far, the most important degrees of freedom in proteins. I mentioned that we have the side-chain characteristics of the amino acids that determine the properties of proteins. That's also something we're going to come back to now again. And then we spoke a little bit about Anfinsen and Levinthal and secondary structure elements. But I'm going to suggest that we move on to these questions that I gave you already yesterday. There might be some extras here that I didn't add on the slides yesterday. And it varies a bit from year to year how we handle this. We can try. Anybody can basically take a question and answer it. You don't have to stick to these questions, and you can answer to me, to others, that works too. If you are a bit tired in the morning and shy and everything, I'm just going to start from one end of the classroom, but it's going to be random, and then go through and ask each and every one of you to answer one question. Make sure that you answer one question. Try to pick the question that's most difficult for you, not the one that's easy, because the one that's easy you already know. You don't need me to confirm that you already know it. So should we get started? Does anybody want to answer anything? OK? Let's do that. How many charged amino acids do you know? Which ones? So this is actually a bit of a trick question. There are not just five. It depends on the pH. Well, no, tryptophan is typically not charged. But the point of all charged amino acids is that they are titratable. So you have either an NH3 group or something with an extra hydrogen that, depending on the pH, might lose this hydrogen. And conversely, if you have a hydroxyl group, depending on the pH, it might pick up an extra hydrogen. So when we say that they're charged, we typically mean around pH 7. And of the five you mentioned, well, there are quite a few.
Actually, I would say four of them are definitely charged at pH 7. And those are aspartic and glutamic acid, lysine and arginine, two negative and two positive ones. Histidine is a royal pain. Why is histidine a royal pain? So there are two parts that are royal pains with histidine, actually. First, it has two sites where you can have a hydrogen. You can have a hydrogen either on the epsilon or the delta nitrogen. So it can jump around a bit. The other pain is the measure, and we're going to come back to this. Do you remember the measure you use to decide whether a compound is protonated or not as a function of pH? It's called a pKa value in chemistry. So the pKa value is pretty much the pH where something is 50% protonated. And the problem with histidine is that its pKa value is roughly 6, and in a protein the surroundings can easily push it higher. It's really close to 7, so close to 7 that it's going to depend on your salt concentration. It's going to depend on your surroundings. So the problem with histidine is pretty much that all bets are off. Strictly at pH 7, it would be neutral. But it's so close to 7 that you can't assume that it's neutral. And it's not that one site will always be protonated first and then the second one. That's going to depend on your surroundings. So normally we might not say that histidine is charged, but I think it's a very good idea to assume that it might be charged. Because again, all bets are off. You have to assume that a histidine could be charged. But even other amino acids that are typically not charged, if you push the pH far enough, you can change things. And an amino acid in isolation is zwitterionic. So you have a positive site in the amino group and a negative site in the carboxyl group. Shift the pH far enough, and one of them is going to become neutral while the other one remains charged. That's not going to happen in proteins, though. So it's quite correct: 4 or 5, depending on how you count. And there are always exceptions to the rules.
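To make the pKa point concrete: the Henderson-Hasselbalch relation turns a pKa and a pH into a protonated fraction. Here is a minimal sketch in Python; the pKa values are approximate free-amino-acid textbook numbers, not figures from this lecture, and in a real protein the local environment can shift them considerably.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of a titratable site that still
    carries its extra proton at the given pH. At ph == pka this is 0.5,
    exactly the 50%-protonated definition of the pKa."""
    return 1.0 / (1.0 + 10 ** (ph - pka))

# Rough textbook side-chain pKa values for the titratable residues.
for name, pka in [("Asp", 3.9), ("Glu", 4.1), ("His", 6.0),
                  ("Lys", 10.5), ("Arg", 12.5)]:
    print(f"{name}: {fraction_protonated(7.0, pka):.1%} protonated at pH 7")
```

With these numbers, Asp and Glu come out almost fully deprotonated (negative) and Lys and Arg almost fully protonated (positive) at pH 7, while His sits close to the crossover, which is exactly why all bets are off for histidine.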
So which ones are the acidic amino acids of the charged ones? Because remember, in isolation, all amino acids have a carboxyl group at the C-terminus. And what is, let's see. Yes, you want to take that one, too, in that case? Same for the basic amino acids? Well, I would say arginine and lysine. Histidine is, technically, a basic site, but when we talk about basic amino acids, it's usually just arginine and lysine. I think I have the next slide about that. Actually, we can show it right away. It has to do with this one. There is not just one unique classification of amino acids. It's up to you to decide what's important. Is it the size, or is it the charge, or is it whether it's aliphatic or aromatic, negative or positive? So it depends. It's not a unique definition in the sense that each amino acid belongs to one class and one class only. So in the case where you really care about whether it's aliphatic or aromatic, then all of them would be grouped on one side. If it's about hydrophobic or hydrophilic, you might split them. All right. Next person: one amino acid is not chiral. Which one? Glycine, why not? Yes, and specifically, rather than the hydrogen as such, it's the properties. So what is the chirality property? What is required for a molecule to be chiral? Yes, so you have some sort of what you call a chiral center, which is usually an atom. And around that center, there have to be four different surroundings. If there are only three different surroundings, or two of them are identical, you can rotate them to superimpose. But if there are four different ones, they're going to be mirror images. Which is related to the next question. What would happen if you had some D isomer amino acids in a protein? Exactly. Well, the first thing is that technically you could form peptide bonds. Most peptide bonds in our bodies, though, are formed by enzymes, and those enzymes would likely not recognize the D amino acids.
But in theory the chemistry around the peptide bond is the same, right? So they could in theory link, but at the higher level, they're really going to be incompatible with the other amino acids. A whole lot of the secondary structure, as you saw yesterday, has a handedness to it, things like helices. And if you then have one or more amino acids that are mirror images, they're not really going to be compatible with the other ones. But on the low chemistry level, it's the same. So what are the levels of structure organization in proteins? And why exactly those? And I'm going to be nasty again. But why exactly those? Well, as with so many other things, this is very logical in hindsight. But it's important that this is a completely arbitrary definition. You could of course decide to classify by, I have no idea, let's say the hydrophobic versus the hydrophilic, the inside versus the outside parts of a protein. The reason why we use this classification is that it turns out to be a classification that works well. It's reasonably universal to proteins. As we're going to see more and more, protein structure tends to use these building blocks. For whatever reason, nature reuses secondary structure elements. And therefore it makes sense for us to classify things by secondary structure elements rather than trying to look at all atoms, because there's too much information if you look at everything. But it's entirely arbitrary. Well, you could argue the primary sequence of a protein is somehow information provided by nature. But all the other levels, they're man-made classifications. They're useful classifications, but they are arbitrary in a way. So describe the relation then between sequence, structure and function. So that's quite fair. I would even say this is something you should know by heart. What you said is quite correct, but the formulation you need to know is sequence leads to structure leads to function.
And that has to come like running water if I wake you up at three AM. So this is called something: the central dogma of molecular biology. This was a concept that emerged at the very start of molecular biology in the 50s and 60s. So does it only go that way? Well, formally, yes. But there is, of course, this part back here, right? Which is what? Evolution and natural selection. So on that note: the function does not directly alter the sequence, but if your particular sequence leads to bad function, you're not gonna survive. And conversely, if for whatever reason you happen to have a sequence mutation that makes you better, you're gonna have more offspring, which is good in nature, at least. And that's related. Does function induce structure? Yes? So binding in general, there are lots of things that bind. Every time you have a nerve impulse, there are ions binding, sorry, there are neurotransmitters binding to the receptor, and that causes the ion channel to open. An enzyme binds things and causes something to happen. So when you say that the structure leads to function, the structure then is the specific binding site that makes it possible for one neurotransmitter, but not another neurotransmitter, to bind. Or in an enzyme, it makes sure that two specific amino acids can bind, but not something else. So you get the selectivity here that leads to a specific function. So related to that, the question you might wanna ask is: does function induce structure? Exactly, that's it. Normally not. Normally it's quite true that it goes in this direction, but you could imagine that a particular function might cause a protein to change shape or something. And we're gonna see that later on. When you actually bind, the protein virtually always changes shape. And this can be a tiny change in shape. It can be so little that we hardly care about it.
In some other cases, even the hemoglobin in your blood, the protein actually undergoes fairly significant structural changes as it's binding and as it's doing its function. And then does that mean that the structure leads to the structure, or the function leads to the structure? Well, the point is, this is still true, but the border here can be a bit blurred. There are some cases where the specific function, when the hemoglobin binds oxygen, it will start to change its structure just a little bit. And the really cool thing is that that's actually gonna alter its function too, but we'll come back to that much later in the course. So generally true, but as I mentioned yesterday, there is not a single rule in biology without an exception. So how are amino acids linked into a protein chain? Sorry? Peptide bonds. And what's the property of the peptide bond? It's planar, and it's based on electron resonance, so that you normally cannot turn around the actual peptide bond. It's a planar bond, almost exactly planar, actually almost exactly 180 degrees, that is either trans or cis. So what is it normally, cis or trans? Trans. And what is the main exception? And what is it for proline? So that's where I got you. Same thing as with the histidine: all bets are off. With proline, we frequently say, and I bet that I might even have said that on previous recordings, that cis is more common for proline. And that's probably strictly correct, but more common might mean that it's 60-40. And that's the problem, by the time you have 20 prolines in a protein, right? If you start assuming that all of them are cis, we're gonna be wrong in 40% of the cases. Sadly, if you go through and look at structures in the PDB, there are even several errors there: things that have been assigned as cis when they really should be trans prolines.
One thing I didn't put up: so that is, of course, the property of the bond and the type of the bond, but how is the peptide bond formed? Or broken, for that matter? Yes, but in your bodies. So where do we get the amino acids from? Well, no, not so much fruit. Ha ha. Food, yes. Sorry, my bad. Food, yes. In particular protein, meat, or maybe beans or so. And what do you do with it? You get the amino acids, because the problem then, you're gonna have titin in particular, right? You're gonna have the specific proteins that are in the muscle if you eat the meat. So what do you do with it? Amino acids? Exactly. You break it down to amino acids and then build them up again. Yeah. So the first part is at least correct. You have enzymes that break things apart to create the building blocks, the amino acids. But how are amino acids stitched back into proteins? Well, I guess technically you could say that it's an enzyme. Sorry? An enzyme? Well, exactly: the ribosome. So this is what I wanted to get at. Just because so much else is an enzyme doesn't mean that everything is an enzyme. So it's the ribosome, right? And this way you can actually even think of the ribosome as a gigantic enzyme. So the ribosome takes your transfer RNAs, where each transfer RNA comes with three bases and a small amino acid attached. And when we put these right next to each other, the entire surrounding in the ribosome creates, of course, enzymatic binding sites that help us to then stitch these two amino acids together in a peptide bond. It's an amazingly large machinery. But the point is, you need proteins everywhere. In principle, it doesn't happen spontaneously. So how do you determine protein structure? And there are actually more methods than I brought up yesterday. Yep. Go right ahead. That's a very good question. What do you think? Yeah, but take a guess.
So what determines the structure of a protein, and where does the structure of a protein appear? The point is, what does the ribosome create? The ribosome only creates a chain in principle. It only creates the primary structure. So how do you go on from there? And the primary structure is the same whether, actually, no, sorry, with proline in particular, it's likely gonna be bound together either in cis or trans already. But in general, the proline cis-trans isomerization can happen later too, though it's gonna be very rare; it's mostly not gonna happen after the protein is formed. So in general, cis-trans isomerization is gonna happen in the nascent chain, in the exit tunnel of the ribosome. And in the specific case of proline, there are no different tRNAs, but it might actually depend on the specific geometry of the earlier part of the chain. I'm gonna look that up. I actually don't know that part. In some cases, you can definitely isomerize outside of the ribosome. I'm not sure whether there is a bias already when you create the chain. A really good question. I'll at least try to look it up. I'm not even sure whether anybody knows. Could be a good thesis topic. So how can you determine protein structure? So that's one good method. It was one of the very first ones people started with. And the idea of X-ray crystallography, of course, is that if you pack things, well, people did this for a long time, even I have done it, for salt, sodium chloride. So anything that crystallizes, in principle, you can determine a structure of with X-ray crystallography. It's certainly not the only method. We brought up cryo-EM yesterday too, so I'm not gonna ask you about cryo-EM, but can you think of something completely different? NMR is a possibility. So NMR has to do with interactions and how atoms are related to each other, in particular in solution.
And NMR actually has a funny history; this is gonna sound strange, so I need to put it straight for the record. When NMR first appeared, it was believed to be an awesome method, particularly in solid-state physics, where you could really determine the properties of materials. The problem that happened then is that people realized NMR builds on the spins, the nuclear spins of atoms. The problem is that this spin depends on your neighboring atoms. And initially this was seen as a problem with NMR, right? Because the fact that it depends on the surroundings is gonna screw up the spectrum. And in biology, this is the amazing part of NMR: it depends on the surroundings. So it's not just a spin on a nitrogen or a carbon, but based on how these spins are behaving, you can draw conclusions about their surroundings, in particular how the side chains are packing and everything. So part of the reason why NMR has never been that popular in, say, physics, apart from solid-state physics, is exactly the reason why it works in biology. And what you effectively get from NMR is a huge number of distance restraints. Are there other ways you can get information about proteins? They're building a gigantic facility in the south of Sweden. Do you know what it's called? Which is not the MAX lab I showed you yesterday. The MAX lab is a synchrotron for X-ray crystallography. But there is a second facility very close to it. ESS, have you heard of it? It's the European Spallation Source. So it's a neutron source. And neutrons are different, because if you compare electrons and neutrons, what's the main difference? Charge. So when you shoot electrons at a sample, they're primarily gonna interact with the electrons in the sample, and you get an electron density. When you shoot neutrons at a sample, they're primarily gonna interact with the nuclei, the neutrons and protons. And that means that you see different things with neutrons than you see with electrons.
It's hard to get high flux with neutrons and everything. But neutrons provide some really cool things, and in theory you don't need crystals, and you can even do it at room temperature, which you can't even do with cryo-EM. So far, I think there are only a handful of structures where people have just proven, as a proof of principle, that you can determine structures with neutrons. Who knows, in 10 years this might be a really important method, because you can find out things that you can't find with any other method. Can you come up with any other way to determine protein structure? Protons? I guess technically that's a good question. Technically it probably could work, but it doesn't have any advantage, because anything you could do with a proton, you could do with an electron too, much easier, simpler, and at higher resolution. So I think protons would combine the worst parts of electrons and neutrons. In theory you probably could, but you would need the LHC to do it. Light microscopy? Well, probably, yeah, you can get rough shapes from it, but the best super-resolution microscopy methods are something in the ballpark of 20 nanometers. So we can occasionally find out rough shapes; if you thought that cryo-EM was blobology, this is far worse. You can find out whether two proteins are interacting, but you can't find any structure inside proteins. Some of you just took some related courses before this one. So historically, had I or Arnor or somebody said this 20 years ago, people doing this stuff would say, yeah, right, they think they can determine the structure of proteins with computers. This is changing, and it's changing very rapidly. I would still say, as an experimentalist, if I'm running a drug discovery company, then really, at some point, you're gonna want to go into the lab.
When you're doing this at high throughput, many of the computational models are now starting to be so good that it's not worth determining the structures. That's certainly not always the case, but if you have a protein, and for instance there is a close relative in the database, the model might be so good that we will just trust it. We will even try to do drug design on it. And again, 15 years ago, nobody but people like me and Arnor would have said so; no real experimentalist would have trusted that theoretical model. But that's changing. The other thing you should be aware of in science is the Wayne Gretzky quote: I don't skate to where the puck is, I skate to where it's gonna be. Five or ten years from now, I bet there's gonna be an even stronger trend. Ten years from now, the computational methods you and your peers are gonna develop are likely gonna be so good that in the vast majority of cases, you don't need the experiment. And that's important, because how long does it take you to predict a structure, and how much does it cost? You're talking about days, minutes, a dollar. Determining a single X-ray structure could take four years. The X-ray structure will likely be better; the question is whether it's gonna be so much better that it's worth a million dollars and four years. You will be faster to market if you use the computers. But the point is, there are lots of methods here, and you're gonna see even simpler methods. This is only limited by your imagination. And until recently, everybody thought that cryo-EM sucked, which is no longer the case. So if we then start to look into proteins, and actually not just structure but motion, what are the most important degrees of freedom in a protein? Yes, in particular the backbone torsions, right? So why are the backbone torsions more important than other torsions? We're gonna go through and name other torsions today, I think.
But there is a reason why phi and psi were on this must-know plot, while some other things were not. I think the key word is local versus global, right? If you start changing the backbone, you will have very large changes in the structure at a later place in the backbone. If you're changing a side chain or a hydrogen, it's a very local change. So how do you define these angles? And what are these? We already talked about cis and trans. So we said that it's a rotation around the bond. And how do you define a rotation around a bond? That's the thing we wanna measure. Right, so any three points in space will define a plane, unless they happen to be entirely linear, but in general you will not have that. So that and that atom will define a plane, and that and that atom will define a plane. So it's gonna be the angle between these two planes. And then, as I mentioned yesterday, unfortunately this means that unless the planes are at exactly 90 degrees, there's gonna be a smaller angle between them and a larger angle between them. And that leads to two slightly different definitions of these torsions. One of them is predominantly used in biochemistry and the other one is predominantly used in polymer physics. And they differ by 180 degrees. And cis and trans, one of the really important reasons for using cis and trans, can you imagine why, based on what I just said? This is also the reason why, and I do know the definition, but I'm not gonna bother you with it, I have to think about which is the biochemistry and which is the polymer convention every time. So if I say that a torsion is 180 degrees, what is the first thing you're gonna need to ask me? What convention are you using? And then you need to remember to get this right. Because again, is it 180 degrees or is it zero degrees? So you have a choice. You're more than welcome to use the angles. And then you have to keep track of the conventions all the time.
And you have to think about this every single time. Or you can do like me and call them cis and trans, and then it doesn't matter. This is always trans no matter what convention you use, and this would always be cis no matter what convention you use. Which is kind of nice. That's why we have the terms. So why are cis conformations virtually never observed in peptide bonds? You already said that they are observed for prolines, sorry, yes, for prolines. So what's gonna happen is that when you have normal amino acids in a cis peptide bond, you're gonna have the oxygen, in particular the carbonyl oxygen, on one amino acid clashing into the side chain of the next one. So for glycine it might be possible, and prolines are a bit different, but in general you will clash into things, because in general you have a side chain there. So that's why it will normally never work. I will draw your attention to that again when we have some plots about it. And number 14 brings us to two famous scientists. What did they say? Yep, sure, but again, no, you're not backtracking. I have to confess something: I know the answer to most of these questions. So this is not for me, it's for your sake we're doing this. So sorry, say that again, if you have two. So first, the important thing: this will never happen in a normal protein, we don't normally have D amino acids. But if you have two D amino acids, the point is, if you just look at this definition, what here depends on an amino acid? Where is the R group in this plot? No, I don't have it. So this could equally well be an aliphatic, well, whatever, say butane or something. There's nothing here that depends on an amino acid. And that's of course why all this originated in polymer chemistry, long before we started applying it to proteins. So the concepts of cis and trans have nothing to do with proteins. It's general in chemistry.
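The two-planes definition of a torsion can be written directly as code. This is a sketch of my own, not part of the lecture material: a plain-Python dihedral for four atom positions, reported in the biochemistry convention where trans comes out near plus or minus 180 degrees and cis near 0.

```python
import math

def torsion(p0, p1, p2, p3):
    """Torsion angle in degrees around the p1-p2 bond.

    (p0, p1, p2) define one plane and (p1, p2, p3) the other; the torsion
    is the signed angle between the plane normals. Biochemistry convention:
    trans ~ +/-180 degrees, cis ~ 0 degrees.
    """
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]
    def cross(a, b):
        return [a[1]*b[2] - a[2]*b[1],
                a[2]*b[0] - a[0]*b[2],
                a[0]*b[1] - a[1]*b[0]]
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)            # the two plane normals
    b2u = [x / math.sqrt(dot(b2, b2)) for x in b2]   # unit vector along bond
    x, y = dot(n1, n2), dot(cross(n1, b2u), n2)      # signed-angle components
    return math.degrees(math.atan2(y, x))

# A flat zig-zag chain: the classic trans arrangement.
print(torsion([0, 1, 0], [0, 0, 0], [1, 0, 0], [1, -1, 0]))   # 180.0 (trans)
print(torsion([0, 1, 0], [0, 0, 0], [1, 0, 0], [1, 1, 0]))    # 0.0 (cis)
```

Using atan2 gives the sign as well as the magnitude, so you never have to pick between the "smaller" and "larger" angle between the planes; switching to the polymer-physics convention would just mean shifting the result by 180 degrees.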
And then of course it turns out that it works very well to use this for proteins too. So even if you have two D amino acids together, when we say, well, there is actually one definition here that's a bit unclear. When you say cis and trans, let me draw the amino acids. Let's see, right. So we have nitrogen, C alpha, and then a carbon down there. And then another nitrogen, hydrogen, C alpha, and let's draw a hydrogen there. And I'll just make these glycines, just to make my life simpler. In this particular case, it's easy to talk about trans, because you don't have anything else bound here. If you had something like butane, though, you have lots of carbons, and every single carbon has four other atoms bound to it. So when we talk about cis and trans, we focus on the heavy atoms. In this particular case, there is only one set of atoms, so it's obvious that it's trans. But in general, if you have a chain, you say that something is trans if you only draw the carbons. Each carbon here would still have two hydrogens on it, so technically, there are always four atoms. So if you go from that carbon to that carbon to that carbon to that hydrogen, technically, that torsion would be cis, but you would never say cis there. You focus on the heavy atoms, the main part of the chain. For proteins, it's easier, because whether it's oxygen, carbon, nitrogen, hydrogen, or carbon, carbon, nitrogen, carbon, no matter which ones you pick, it will always be trans. So I'll make your life even easier there. The backbone torsions, we can define them even for glycine. And glycine is not chiral. So chirality would only start to be a complication once you add in the side chains. And the side chains don't influence the definition of the backbone torsions, so the backbone torsions will not depend on it. So what did Anfinsen and Levinthal, Christian and Cyrus, what did they say? Right.
And I think that that's kind of the biological or chemistry way of thinking about it. One of the things that was so profound with both of them is that they connected this to physics. So there was one part, and it might sound like nitpicking, but let's start with Christian Anfinsen. There was one particular thing he said beyond that. You're quite right that it will renature spontaneously for these small proteins, but why does it renature spontaneously? And in particular, what does that mean about the native state? Not the lowest energy, the lowest free energy. And we're going to come back to why those things are different. The fact that the protein spontaneously renatures, that's the observation. And I bet that other people had made that observation before Christian. But the point is the conclusion it leads to. It must mean that it's not the ribosome that folds the protein into a specific structure. So the structure has to be determined by the laws of physics, and in particular there is the prediction that this should be the lowest free energy state of the molecule. And then you're quite right that Levinthal said that if you're going to do that, in principle you need to try every single state, and there is no way that could happen in the age of the universe, or at least not in our lifetimes. Why is water such a special molecule in biophysics in particular? You could argue in physics and life in general. I skipped through this a little bit yesterday, but we're going to talk a lot about water in this course. There was something I said about partial charges in water. Water is a very, very, very polar molecule. And it's going to turn out that that helps with protein folding and everything. Water also has something else. What does water form? A lot of hydrogen bonds, and it forms an insane number of hydrogen bonds.
And that is responsible for something that I didn't bring up yesterday. Water has a very high heat capacity. Water stores a lot of energy when you heat it, and it requires a lot of energy to heat water. There's almost no other molecule like it. And there are tons of other properties, and we're going to spend a bit of today talking about that. And then you read the Watson and Crick paper. Some of you, at least, might have read it yesterday. You could argue that there were two really key findings in the Watson and Crick paper. There was one really smart idea. Yes? It makes sense with the previous experiments that had shown the ratio of G to C. Exactly. Remember that I showed this other seemingly crazy model by Linus Pauling, right? Where you have a spiral ladder, and instead of the bases in the middle, the bases are pointing out. The key observation, and can you imagine how simple this data is? Just look at the amounts of A, G, C, and T. There are four parts of this molecule, and you just start counting them. It's not at all obvious why you would count them, but it's data. Look at the data and think about the data. And the one thing that's actually funny is that they always occur pairwise in the same concentration. And if you see this for lots of different samples, that must mean something. That must be important. And if you have these three, sorry, three backbones, well, three single strands with bases pointing out, there is no reason whatsoever why things need to occur in pairs. So the first conclusion was that, ah, the fact that they occur in pairs makes it really obvious that they should pair up. And the other, what is it? This led to another observation, or two other observations. The first relates to hydrogen bonds. Exactly. So the reason for these pairwise preferences is the way they hydrogen bond. In one case, you can form two hydrogen bonds, and in another case, you can form three hydrogen bonds.
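The counting argument is simple enough to reproduce in a few lines. Here is a toy sketch; the sequence below is invented purely for illustration, but real genomic data shows the same pairwise ratios Chargaff measured.

```python
from collections import Counter

def base_ratios(seq: str):
    """Return the A/T and G/C count ratios for a DNA sequence.
    In double-stranded DNA both come out close to 1 (Chargaff's rule)."""
    c = Counter(seq.upper())
    return c["A"] / c["T"], c["G"] / c["C"]

# Made-up sequence built so every A is matched by a T and every G by a C,
# mimicking what base pairing enforces in real double-stranded DNA.
seq = "ATGC" * 250 + "TAAT" * 50 + "ATTA" * 50 + "GCCG" * 25
a_t, g_c = base_ratios(seq)
print(a_t, g_c)   # prints: 1.0 1.0
```

The base composition itself (how much A+T versus G+C) varies wildly between organisms; it is only the pairwise ratios that are pinned to 1, which is exactly the hint that the bases pair up.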
And that means that there's going to be a preference among the pairs for how they will pair up. And then there is this absolutely, insanely beautiful formulation towards the end of the paper — if you haven't read it, note that you should. Today, in any modern paper, this would have been 16 pages in the supporting information plus a follow-up paper. And I think it's such a beautiful piece of British understatement — "It has not escaped our notice..." — in other words: don't think we hadn't thought about it. And that is the point in time where, A, you confirm that DNA carries the genetic material, and B, you also describe how the genetic material is replicated. It's a two-page paper. But the point is that the reason they could say that is that they based it on data. It was not just a crazy model. And based on this data, they were able to build models that were compatible with the X-ray data. All these dihedrals — I don't think I brought them up yesterday, so I will do that today. Can you describe energies that are typical for a few interactions in proteins? I also touched upon that yesterday, but in the interest of time, we might do more of it today. Can somebody describe one energy in a protein? How strong some sort of interaction is? Which would be how strong? Well, hold that thought. We'll come back to that later. Why should you know some of these things by heart? It's the same thing I mentioned about the amino acids. In theory, you can look anything up. But what you're frequently going to find in science is that you have a ton of data to work with. And if you have to go and look things up every time, you're going to be so busy looking things up that you don't have time to focus on the data. And you will also realize, just as Watson and Crick did, that you will see data flying by, and you need to have the frame of reference to see that it is significant.
And if you have to look things up every time, you're not going to spot that something must mean something; whether it's 14 or 14 million, it's just a number to you. And that's why I'm going to force you to learn some of these numbers. I don't care whether you're wrong by 50%, but you need to be roughly in the right order of magnitude to have a gut feeling about the energies. But I'll talk about that today. So how much ATP does your body use in a day? Did you look that up? Is that a small or large number? How much is it? It's roughly your body weight. And of course, you don't consume these molecules: ATP is converted to ADP, and then you recharge it back to ATP. But the turnover is roughly 70 kilos of ATP per day in your body. And that's what you use most of the energy for when you eat; that's why you need 2,000 kilocalories per day. We're going to come back to that. And what is a post-translational modification? Not really this course, but since we brought it up in the slides. Yep? Proteins undergo translation, but some parts start being modified already — like removing parts of the C- or N-terminus, or other post-translational modifications. Yeah. So in principle, as always, the details can be super complicated, right? But the conceptual model is that we move from DNA to RNA, and then from the RNA we create the polypeptide chain. Post-translational modifications are things that come in after that step: we modify the polypeptide chain. And that is actually one of the reasons why the whole idea that proteins always fold spontaneously is not strictly true — there are always exceptions. But we might talk a little bit about that. This starts to go more into biology. So that brings us to what we're going to talk a bit more about today.
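The 70-kilo figure is easy to sanity-check with a back-of-the-envelope calculation. The inputs here are my rough assumptions, not numbers from the lecture: about 2,000 kcal of food energy per day, roughly half of it captured as ATP, about 12 kcal/mol released per ATP hydrolyzed under cellular conditions, and a molar mass of about 507 g/mol for ATP:

```python
# Back-of-the-envelope ATP turnover; all inputs are rough assumptions.
food_energy_kcal = 2000.0      # daily intake, kcal
capture_efficiency = 0.5       # fraction of food energy routed through ATP (rough)
kcal_per_mol_atp = 12.0        # free energy per ATP hydrolysis, cellular conditions
atp_molar_mass_g = 507.0       # molar mass of ATP, g/mol

mol_atp = food_energy_kcal * capture_efficiency / kcal_per_mol_atp
kg_atp = mol_atp * atp_molar_mass_g / 1000.0

print(f"~{kg_atp:.0f} kg of ATP turned over per day")
```

Even with these crude assumptions, the answer lands at tens of kilograms per day — the same order of magnitude as body weight, which is the gut-feeling number worth remembering.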
So personally, I quite like this. I much prefer talking with you rather than talking at my slides. I like the two-way communication. If you don't like that, let me know, and we will change it. And if you would like to do even more of it — if you prefer to watch all the videos at home, go right ahead and do so. Then I can stop lecturing, and we can spend the two or three hours each day talking. So, when we talked about amino acids, I already showed you this plot: there are a huge number of ways that you can divide them up. Again, this is not a plot you need to know by heart, but you need a gut feeling for roughly how the amino acids are divided. So for instance, what is the difference between a phenylalanine and a tyrosine? Oh, sorry — yes, the slide notes for today. My bad, my bad. I do have them; I'll distribute those for today. So, what is the difference between a phenylalanine and a tyrosine? Phenylalanine is only an aromatic ring, so it's going to be hydrophobic. Tyrosine is the same ring, but with an added OH group, which is going to give it different properties. Tryptophan, on the other hand — what is special about tryptophan? I'm not going to ask you to draw a tryptophan, but you need to know roughly what a tryptophan looks like. It is aromatic, but more than that, it is gigantic. So typically, the way you would use this as a chemist or structural biologist: if there are binding sites, you typically do scanning. And scanning is just a fancy word for taking one amino acid at a time in your protein, replacing it with this amino acid, and seeing if something happens. And with a tryptophan, for instance: if you have a small interface between two helices and you put a tryptophan there, that would completely disrupt the interface, right?
While if you put a tryptophan on the surface of something, it's likely not going to have much of an effect. So if you want to see what happens when you put something large and bulky somewhere, tryptophan is your amino acid of choice. Glycine is the other extreme, right? Glycine doesn't have any side chain at all. So if you want to see what happens when you remove the side chain, you go to glycine. And same thing with the difference between arginine and lysine: it's not really that significant. They're both large and they're both positively charged. So what I might ask you about is basically to pick two or three ways of classifying amino acids and give examples. Actually, to tell the truth, if I had to draw this plot, it would likely not look exactly like this, because these are not necessarily small and large — size might not be the most important classifier to me. I might not say that cysteine is tiny. The point is not exactly where things are; the point is to have a gut feeling about most of them, because you will use them. Even if you think you're only going to go into biotech and do high-throughput sequencing or something, you need to understand the amino acids. There was another one of the papers I passed around yesterday, about the sequence of insulin. So in those days, what did they do? How did they describe amino acids? In three-letter codes. From one point of view, you could argue that was much easier for students. So what amino acid is this? It's kind of easy, right? And what amino acid is this? Yeah. And lysine, on the other hand. The three-letter codes are somewhat hard to mistake. So whose idea was it to move away from that? In one-letter code, that's R, that's K, and that's L. So how long was that insulin sequence? It was tiny. It was so tiny that they could even write the entire sequence as flowing text in the manuscript. And it didn't matter.
It was obviously much better to have a rich, descriptive notation so that you don't make mistakes. I remember talking to my father's generation — and when I started studying these courses, some years ago — they primarily used three-letter codes. And my father still prefers three-letter codes because he understands what they mean. But your generation in particular does this with computers, at a very large scale. You need to be able to do alignments and everything. And the point is that having one column per amino acid means that you can fit, say, 80 positions per row instead of 10. And we have so much sequence data nowadays that it would be complete insanity to waste three positions per amino acid. So there are a couple of tricks you can learn here. The R is aRginine. Valine is easy, threonine is easy, tyrosine is tY-rosine. Tryptophan is W — I'm not sure about that one. Phenylalanine is F instead of Ph, right? Histidine is very easy. Glutamine is Q-tamine, and serine is easy. Asparagine, N, is not entirely obvious — it's a bit of a stretch. But the point is, and I'm not joking here, this is how I remember them. Arginine, actually, I work with so much that I just remember it; glutamine I think of as Q-tamine. Learn these tricks, because otherwise you're going to need to learn 20 of them by heart. And mnemonics always beat learning things by heart, because you don't forget mnemonics. And as I said, we also spent a lot of time — well, some time at least — talking about the different torsions: the phi and the psi, at least. You need to know where the phi is. You need to be able to draw something like this, with all the labels removed, and place the phi and psi torsions. You also need to be able to draw what I did: two amino acids, the backbone; you can just write R for the side chains.
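As a minimal illustration of the two notations, here is a small lookup from three-letter to one-letter codes. These are the standard IUPAC codes, but only a subset of the 20 is included here, covering the examples mentioned above:

```python
# Partial three-letter -> one-letter amino acid table (standard IUPAC codes;
# only a subset of the 20 residues is shown here).
THREE_TO_ONE = {
    "Arg": "R", "Lys": "K", "Leu": "L", "Trp": "W", "Phe": "F",
    "Gln": "Q", "Tyr": "Y", "His": "H", "Ser": "S", "Asn": "N",
    "Val": "V", "Thr": "T", "Gly": "G",
}

def to_one_letter(seq):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in seq.split())

print(to_one_letter("Arg Lys Leu"))  # -> RKL
```

One row of one-letter codes holds eight times as many positions as the same row of three-letter codes with separators, which is the whole argument for the compact notation.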
You need to be able to draw amino acids and place all these bonds, for your own sake, because you're going to be handicapped in science otherwise. So there is one more torsion here that I didn't draw for you. We call it omega, lowercase omega — you can use uppercase omega too; people tend to mix up lower- and uppercase for these. So what bond is that? What torsion is it? Sorry? It's the peptide bond. So the peptide bond is the omega torsion. We don't talk about it as much, typically, because it's either cis or trans; it usually doesn't change much. In general, you also have side chains, and if you have a large side chain, it might start to matter how you rotate around those bonds. For these bonds, we typically just pick a Greek letter, and in this case we pick chi. But you might have more than one, depending on how large your side chain is. So the first one would be chi 1, the second would be chi 2, and the third — well, I don't have a third one here, but you can probably guess what it's called: chi 3. I don't think we're even going to talk about those in this course. If you're looking at a specific structure and sitting there building it, the value of a chi torsion might be important, but they're local, and when it comes to conceptually understanding proteins, they're not really that important. But if somebody asks you about a chi torsion or an omega torsion, you should know that they relate to the side chains and the peptide bond, respectively. And then we also drew these Ramachandran diagrams, which show that there are lots of alpha helices, there are lots of beta sheets, and in a few cases you might have a bit of left-handed helix. If I were you, I would forget about the left-handed helix. It's not important, and it will probably confuse you more. But what is the relation between the left-handed helix and D amino acids?
Yes — but they have nothing to do with each other. A D amino acid would literally mirror the entire Ramachandran diagram: both phi and psi, mirrored along the diagonal. So in the theoretical case where we had D amino acids — which is a small niche in life science, so it's not important — you would have an alpha helix that really is a left-handed alpha helix. But this region here is not really an alpha helix. These left-handed helices occur, rarely, for normal L amino acids, and they're not literally a mirror image of the alpha helix. So that particular point has nothing to do with D amino acids. And I think that sums up what we did yesterday. What I'm going to continue with today is some historical background, and I'll show you why in a second. We're going to talk a little about interactions and empirical modeling, and start introducing computers. At some point, as you would imagine, in particular today, one should ask whether we need to go all the way to quantum mechanics. We're going to talk more about these partial charges that I mentioned for water. Then I'm going to repeat and dig a little deeper into the fundamental properties of amino acids and molecules. And then we're going to start going into physics, mainly because it's going to turn out that things get complicated and we need a more formal language to describe them. So let's start by showing this movie. It's a very old movie that I got from a colleague — a close friend, actually — so this is something you can't find on the internet. You can probably see there was a heme group in there, right? Let's see, I think there might be a second movie here too. Yes — lysozyme. In the interest of time, I'm not going to play all of it, but these are very, very, very early movies of protein structure.
And you might have recognized those names. Kendrew was one of the people who got the Nobel Prize for the structure of myoglobin, and Phillips derived the first structure of lysozyme — both of them at the Laboratory of Molecular Biology. These are actually movies that I got from Michael Levitt, a long-term colleague of mine, who in turn got them from Cyrus Levinthal. So these were some of the very first — actually, I think these were the very first ever — visualizations of proteins. And they're completely unpublished. And when I say Cyrus Levinthal, you're probably going to picture a senior professor in a tweed suit or something. The point is, Cyrus was a computer geek far before you were. He even had a MAC — this is actually true; it stood for multi-access computer. And I think that is some sort of mouse. Let's see. And this is the screen. Again, the whole point is you had a cathode ray tube. This was in the late 50s, right? Things were a bit more primitive, and you didn't have any text terminals or anything. So how did you program them? Punch cards. So you sat up all night with your punch cards, and then you'd get computer time at two a.m.; you would have 30 minutes of access, and you would run there with all your punch cards, put them in the loader, and the computer would spit them out 30 seconds later and say "error". And then you would need to go home. So if you complain when you sit down and work with Python, you have no idea how spoiled you are. But this enabled them to actually start studying these proteins. What you get from X-ray crystallography is really just a set of coordinates — structure factors, even. So this was when they realized you can start to use computers to understand structures and proteins: actually draw out these helices, rotate them, show them in three dimensions and everything.
I think, yes, then we have a picture of Cyrus. There were many things that I liked better in science in those days. Today science is very competitive; most of what we do tends to be focused on somehow getting impact. But they published all of this in Scientific American, not in a scientific journal. So this is an extract of Cyrus's original paper in Scientific American, where he describes that we can build models of molecules. And compared to all these Nature and Science papers that can be pretty impenetrable, I think it's a beautiful paper that explains the emerging role of computers in structural biology. Eventually this also led to a number of people starting to use computers not just to draw proteins, but to calculate on proteins, to calculate on the sequence. Can you take this X-ray structure and try to minimize it? Can we make sure that atoms do not overlap here? It was very esoteric at the time, and I'll come back to that in a few slides. And on one level, this means that it's easy. Because what Anfinsen said is that all the interactions in proteins are governed by physics, and we know physics. We know physics exceptionally well; it's all based on quantum chemistry. So now we're gonna spend the next 12 lectures talking about quantum chemistry. No. That is far beyond the scope of the course, but in quantum chemistry, most things have to do with the electrons. Depending on the shapes of the electron clouds around the atoms, you're gonna create either constructive or destructive interference between them. In some cases, the two atoms will then form a bond; in other cases, they're gonna move away from each other so that they don't bind. And that is the story behind every single covalent bond.
All the electrostatic interactions you're gonna see are based on a surplus or a deficit of electrons on some atoms, which makes them effectively positively or negatively charged. And again, the formally correct way of doing all of these things is quantum chemistry. It is not necessarily complicated, but it is a lot of bookkeeping, because you have a lot of electrons and really complicated equations to solve. And at very short distance, if you had two electrons in the same orbital with the same spin, they would repel each other — you can only have one spin up and one spin down. That is the reason that if you push two atoms close enough together, at some point they're gonna start to repel each other. I'm not gonna go through the derivation, because it would take an hour, but this is why it's universal: at very long distances, all atoms will attract, and at very short distances, all of them will repel, and it's all due to the electrons. Oh, sorry, I kind of spilled the beans here. This is basically just confirming what I just said: depending on what these spins are, you either get a covalent bond or the atoms prefer to stay apart. But the other part is not quite as obvious. I think it makes sense that things will repel each other if you push them close together, but why will things attract at very large distance? Well, think of xenon. Xenon is a boring atom that doesn't do anything, right? It's a noble gas. And then we think about water. Water has these large dipoles, with more plus on the hydrogens and more minus on the oxygen. So if we now take this water here, that's gonna be a dipole in the electrostatic interaction. What's the charge of a xenon atom? Zero, yes. But how does the xenon get that zero charge? Well, you have lots of positively charged protons on the inside, in the nucleus, right? And then we have lots of negatively charged electrons on the outside. So what happens now?
Normally we have the plus charge here and the minus charge here. So what happens if these start to move? Because the electrons can move relative to the nucleus. What can happen here is that this dipole — if it's minus here — might cause the electrons to move slightly in that direction, because then we will create the same type of dipole there. So a dipole or a charge can cause the electrons in another atom to start moving a bit. And what can even happen — imagine now that you have multiple xenon atoms. If we have three xenon atoms, normally each of them is neutral, but for each of these three atoms, let's take all the electrons and move them slightly to the left. That means we will have a slight negative charge here and a slight positive charge here; this one's gonna like that, because there's a slight negative charge here, slight positive there, slight negative there. So suddenly all of these xenon atoms will start to interact, just a little bit. These dipoles are super weak. You actually don't need quantum chemistry to derive this; you can just talk about small fluctuations in charge, and then you can determine what the shape of the interaction is gonna be. It turns out this shape is gonna be one over r to the power of six, and these are called London dispersion forces. And that's gonna be the case for every single atom; they all have it. Yes. So, the second part. In this case, you had a dipole here inducing a dipole in the xenon, right? But if you only have xenons, there is no dipole to start with. But the second you have a finite temperature, things will start to move just a little bit. So at some point in time, let's say these electrons are slightly to the left because they moved a bit. Forget about those two xenons for now.
But if these electrons have moved slightly to the left, that suddenly means we now have a very, very small dipole in this xenon, right? It's a fluctuating dipole — it will fluctuate back in a second, but for now we have a small dipole there. That dipole is now gonna play the same role as the water did and induce a dipole in the next xenon. So what you call this is an induced-dipole-induced-dipole interaction. Just the fact that there are small fluctuations means that you will have a bit of noise, essentially, where these atoms start to attract each other very weakly. Very, very weakly, but they will attract. And this is the reason why even noble gases such as xenon or helium will eventually condense, very close to zero Kelvin. If there were no interactions between them, they would never form a condensed phase such as liquid helium. You do not need to understand the quantum chemistry here; the take-home message is that just as all atoms repel each other if you push them close together, at long distance all atoms attract each other. Very, very weakly. And you might think that weakly means it's not important, but it's gonna turn out that it is. The closer they are together, the stronger this effect is gonna be; if they're very far away from each other, it's gonna be weak. You can actually derive this, but it takes a bit of time, so we're not gonna do it. If you do, you can show that the potential of this interaction is proportional to one over the distance raised to the power of six. And when we say r to the minus six, that just means that that's the shape of the potential: it decays very quickly with distance. So it's weak, but not insignificant. A purist would then say: OK, but that means we need quantum chemistry everywhere. We're gonna need to do this properly.
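In force fields, this pair of behaviors — universal short-range repulsion plus the long-range 1/r⁶ dispersion attraction — is usually lumped together into a Lennard-Jones potential. A minimal sketch; the ε and σ values below are illustrative, roughly xenon-like numbers of my choosing, not parameters from the lecture:

```python
def lennard_jones(r, epsilon=0.45, sigma=4.1):
    """Lennard-Jones pair potential in kcal/mol, with r in angstrom.

    The 1/r^12 term models the short-range electron-cloud repulsion,
    the -1/r^6 term the London dispersion attraction.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

r_min = 2 ** (1 / 6) * 4.1       # distance of the energy minimum
print(lennard_jones(4.1))        # zero exactly at r = sigma
print(lennard_jones(r_min))      # -epsilon at the minimum
print(lennard_jones(10.0))       # already very weak at long range
```

The 1/r⁶ attraction is exactly the derived dispersion shape; the 1/r¹² repulsion is just a computationally convenient stand-in for the steep short-range wall.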
There is no way we can make shortcuts. So, since I just spent these three slides on it: is quantum chemistry the solution, then? The only problem is that there are gonna be some compromises here. The good thing with quantum chemistry — again, you have no idea how spoiled you are. When I was your age, people were excited that we had been able to calculate the first wave functions for benzene. Six carbons and six hydrogens — an amazingly large molecule. Today you can probably go a bit larger. So how large a protein can you handle? Maybe up to 100 atoms. Not 100 amino acids, mind you, but 100 atoms. We're gonna be somewhat limited in the biophysics space here, right? And you can't even do this completely accurately, because there are things we are still ignoring. You actually need to use the time-dependent relativistic Schrödinger equation too, because, again, we said we are gonna do things correctly. The good news is that we can do that correctly for the hydrogen atom. The bad news is that generations of physicists have already studied the hydrogen atom this way, and the hydrogen atom is not gonna tell us a whole lot about proteins. So yes, a purist could say that this is important, but the purist is not gonna get anywhere. There are some other details, approximations we make there. There's a fundamental approximation in quantum chemistry called the Born-Oppenheimer approximation, which means that only the electrons move. The nuclei are not gonna move, which means you're effectively modeling things at zero Kelvin — in life science. So it's very easy to think that if we don't use quantum mechanics, we're gonna make horrible approximations. My point here is that quantum chemistry, as we can actually apply it, is a pretty horrible approximation itself.
There is another thing you don't have in quantum chemistry: water. Seriously — zero Kelvin, without water, and you're pretending to do biology? Proteins will not even fold. So you can forget about the fact that you're theoretically right with your time-dependent relativistic Schrödinger equation. If you get the protein to be stable, you made a mistake, because it shouldn't be stable under those conditions. So the point here is that quantum chemistry is not gonna work; it has made a ton of approximations of its own, while we need to focus more on the biological part. That does mean that we will throw out some things. But think of it this way — well, my kids play soccer. Technically, the formally correct way to describe a football is of course quantum chemistry, with a wave equation. Now, due to wave-particle duality and everything, the wave properties of a football are usually not that pronounced. It's a fairly decent approximation to represent it as a particle. And it's gonna turn out that for everything we do with proteins, it works great to treat atoms as particles, unless you're forming or breaking bonds, which we normally don't do. This might seem obvious now; it was not obvious in the 1960s. A lot of the work that happened in the 1960s was really that people started to bridge quantum mechanics to classical mechanics. They introduced these very high-level descriptions: forget about quantum mechanics, let's describe each atom in a protein as a particle — an oxygen particle, say. But then we're gonna need to say that this particle has a charge of roughly minus 0.6. When you first look at this, it's a truckload of horrible approximations. But there is one key way we can cheat: we can parameterize this based on experiments, which you can't do in pure quantum chemistry. So I can start tuning the charges in my water molecules until I reproduce the properties of liquid water.
So I don't need to derive them from first principles with quantum chemistry. Again, my goal is to reproduce experiments and predict experiments, right? And if I can do that, it's a good model. The only thing that defines the quality of a model is whether it's useful. A model that is theoretically beautiful but practically useless is, I would say, a useless model. And there were a number of people, around a group in Israel in particular, who started doing this. I am biased here, because Michael Levitt is my close colleague and postdoc mentor. Arieh Warshel in particular was the one who came up with these so-called semi-empirical ways of treating it, which at the time seemed almost like blasphemy against quantum chemistry. But this is what you use in bioinformatics and everything: these force fields that we use to calculate on proteins. They were awarded the Nobel Prize a few years ago, in 2013. And after this slide, we're gonna forget about quantum mechanics. I already introduced partial charges to you. The reason we have this 0.82 and 0.41 is, again, that we know the molecule overall has to be neutral. So I really only have one parameter here to change: if I increase the oxygen charge, I have to decrease the hydrogens by the same amount, and the hydrogens are identical, so they need to have the same charge. I can of course also decide how large the water radius is — how much it will bump into, or attract, other things. But we also have a huge amount of experimental data for water, so it works quite well to parameterize this model to reproduce experimental water. If you're looking at something like a benzene molecule, you can use quantum chemistry to calculate the charges, and today that works quite well.
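As a quick check that these numbers are sensible, you can compute the dipole moment such a three-point water model implies. A sketch, assuming a rigid geometry with an O–H bond length of 1.0 Å and an H–O–H angle of 109.47° — those geometry values are my assumption (they match the classic SPC model, which uses the −0.82/+0.41 charges):

```python
import math

q_h = 0.41                       # partial charge on each hydrogen (units of e)
r_oh = 1.0                       # O-H bond length, angstrom (assumed geometry)
half_angle = math.radians(109.47 / 2)

# The net dipole points along the H-O-H bisector; with the oxygen at the
# origin, only the two hydrogen charges contribute to the moment about it.
mu_e_angstrom = 2 * q_h * r_oh * math.cos(half_angle)
mu_debye = mu_e_angstrom * 4.803   # 1 e*angstrom is about 4.803 debye

print(f"model dipole ~ {mu_debye:.2f} D")
```

The result comes out somewhat above the gas-phase water dipole of about 1.85 D, which is deliberate in condensed-phase water models: the enhanced dipole mimics, on average, how liquid water polarizes its neighbors.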
And it turns out that on the carbons you get roughly minus 0.06, and on the hydrogens then plus 0.06, in the benzene molecule — which has to do with the distribution of the electron clouds. And you can do this for every single amino acid too. There are people who have spent their careers deriving parameters and force fields. If you're gonna calculate things for proteins, that's gonna be one important part. I'm deliberately just going to jump through a couple of these here. We also have bonds. I talked a little bit about bonds. Superficially it's easy: there's a bond between those two atoms, so how long should it be? In principle, though, it's super complicated. When you start bringing in quantum chemistry and everything — well, technically, if you pull a bond far enough apart, you can break it, so that's something we'd need to allow and describe. And the other problem: a biophysicist like me might say, you know what, let's just reproduce a bond with a small harmonic spring. Properly doing this would of course require quantum chemistry, describing the energy levels of this spring — and if you handle this with quantum mechanics, the bond can't take on just any length; it's discretized into particular levels. You can add as many complications here as you want. Until you look a bit more at this and realize something funny: at room temperature, all these normal bonds are gonna be in the ground state. So it doesn't really matter. Yes, if you wanna simulate proteins at 3,000 Kelvin, it's gonna be very important. But at room temperature, these bond lengths — we know this from X-ray crystallography too — don't change. So it's gonna work beautifully to have a spring constant there, or in some cases to just force the bond to be a fixed length.
Again, horrible blasphemy to a quantum chemist, but we're not quantum chemists — we wanna understand proteins. So this is gonna be trivial: we just describe the energy based on how long the bond is. Sorry, what I didn't say: you can even get the force constants, and the way you get them is from spectroscopy. The force constant describes the frequency with which that particular bond oscillates. So if you use a spectroscopic method, you can find out at what frequency, say, carbon-hydrogen bonds absorb. And if you know that frequency, we know the force constant. So we can get this from experiments. Same thing for angles — it's exactly the same, except an angle is defined by three atoms. It's gonna move just a little bit, but it's not really important: if it's 120 degrees, it might vary between 115 and 125. It's gonna work beautifully to use a harmonic there. And then we have the torsions that we already spoke about, the ones you define from four atoms: i, j, k, and l. The torsions are super important, but here the quantum mechanics is even less important: you're not gonna break or form any bonds, the energies are low, and the potentials are very smooth. I will show you the torsions here, and then I think it's time for a break. So if we look at a few torsion potentials — let's see, can I start the animation? Yes, I can. So this is butane. As the butane rotates around this bond, you can imagine there are gonna be different energy levels, depending on the orientation of the last atoms. For a very simple molecule — if we didn't have the heavy carbons here, if it was just CH3-CH3 — it would go between low energies when the hydrogens are not overlapping and higher energies when they are.
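The spectroscopy-to-force-constant step is just the harmonic oscillator relation k = μω², with μ the reduced mass of the two bonded atoms and ω = 2πc·ν̃ computed from the measured wavenumber ν̃. A sketch for a C–H stretch; the ~2900 cm⁻¹ value is a typical textbook number I'm assuming, not one from the lecture:

```python
import math

C_LIGHT = 2.998e10      # speed of light, cm/s
AMU = 1.6605e-27        # atomic mass unit, kg

wavenumber = 2900.0     # typical C-H stretch absorption, cm^-1 (assumed)
mu = (12.0 * 1.0) / (12.0 + 1.0) * AMU   # reduced mass of a C-H pair, kg

omega = 2 * math.pi * C_LIGHT * wavenumber   # angular frequency, rad/s
k = mu * omega ** 2                          # harmonic force constant, N/m

print(f"C-H force constant ~ {k:.0f} N/m")
```

This lands in the several-hundred N/m range, which is why a stiff harmonic spring (or simply fixing the bond length) works so well at room temperature.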
And in this particular case, the peak here is roughly three kilocalories per mole. That's a fairly typical torsion potential; it's low. We will be able to get over that with thermal energy, and I will come back to why. With butane, it's more complicated, because with butane, this is by far the worst case, right? When you have the entire CH3 groups overlapping, you get a high energy here, roughly six kilocalories per mole. The lowest energy is when it's fully trans, so the two CH3 groups are trans. And then there are these intermediate potentials. Let me go back and show you that again. So here, let's see, somewhere there that hydrogen is passing that one. That's really good. And here we are somewhat bad again, somewhat good, and then we go to really bad. So for more complicated potentials, you're gonna go between really good and really bad and then these intermediates. They even have different names that are not that important. And if we go to a real protein, then things get even more complicated. So we pick alanine. This is the simplest molecule imaginable, dialanine: it's basically an alanine, and then we have added just enough of the neighboring amino acids so that you have a phi and a psi torsion. And if you then plot out this energy two-dimensionally: just as the Ramachandran diagram only showed you what was allowed and what was not allowed, here we can plot this as an energy level in kilocalories per mole. Black would be zero; that is the best place to be. The red or yellow here would be 25 kilocalories per mole. So it turns out that is a good place to be, and that is a good place to be. And if you have to move from that place to that place, you would have to go roughly here. So you can calculate some quantum chemistry for every single amino acid. Don't worry, I'm not gonna do that for you.
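As a sketch of how a force field encodes a torsion, a single periodic cosine term already captures the ethane-like case; the coefficient k = 1.5 kcal/mol here is an illustrative choice that reproduces the roughly 3 kcal/mol barrier mentioned above (butane would need a sum of several such terms):

```python
import math

def torsion_energy(phi_deg, k=1.5, n=3, phase=0.0):
    """Periodic torsion term E = k * (1 + cos(n*phi - phase)), kcal/mol.
    n = 3 gives three equivalent minima per full turn, as for ethane."""
    phi = math.radians(phi_deg)
    return k * (1.0 + math.cos(n * phi - phase))

# Eclipsed (0 degrees) sits at the barrier top,
# staggered (60 degrees) at a minimum:
print(torsion_energy(0.0))   # 3.0 kcal/mol
print(torsion_energy(60.0))  # ~0 kcal/mol
```

For a real dihedral, several terms with different multiplicities n are summed to shape the intermediate minima and maxima.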
But in principle, we can understand the role of the torsions by understanding roughly the relative energies of what is allowed. This is a good place to take a break, but I'm just gonna remind you that I said it's important to have a gut feeling for things. The barrier here might be, say, five kilocalories per mole or so, maybe 10; no, it's probably closer to five. When I told you about electrostatics, the interaction between two charges at one angstrom or so, roughly how large was that? I mentioned that yesterday. It's a good frame of reference to compare with. Electrostatic interactions can easily be in the hundreds of kilocalories per mole, roughly 300 if the charges are at one angstrom distance. And here we're talking about barriers of five kilocalories per mole. So this is a barrier you can overcome. Overcoming an electrostatic energy of 300 kilocalories per mole is gonna be more difficult. So the relative strengths of these interactions are gonna start to come into play here too. But we'll talk about that after the break. It's 10.20 now. What about if we give you 30 minutes and we reconvene here exactly at 10 minutes to 11? And then we'll continue. All right, so before the break, we spoke about torsions. There's one thing I'm gonna go back to, because I got a question about it: bonds. When I showed this figure, there were two curves here, and in the interest of giving you your break, I didn't go through them. This super simple physical model where I say there's just a spring between two balls, that is probably the simplest model you can have for a bond, but it does have one shortcoming. That would be the green curve here. At very short distances, it's gonna push the atoms apart, and as the distance increases, it's always gonna try to pull them back together. But you all know that's not what happens with atoms. If you take two atoms and start pulling them apart and apart and apart, what happens at some point?
At some point, the bond breaks, right? It's gonna cost you a bit of energy, but once you've broken the bond, it's broken and the distance doesn't matter anymore. And you can describe that with a slightly fancier form. The exact shape is not the critical thing, but then you would get the blue curve: it is bad at short distances, it costs energy at longer distances too, but at some point it levels off at a finite value. And that finite value is the energy it took to break the bond. Once you've broken the bond, we're done. So let's see, we ended with torsions, real torsions, in the landscape here. These energy plots correspond closely to the Ramachandran diagrams that we spoke about in particular yesterday. The simplest Ramachandran diagram you can imagine is one where we fake it completely, and I've deliberately picked a couple of illustrations from the book here. So if we just care about the backbone and pretend that all the other atoms did not exist. You can't do that, but we're gonna come back to this, and this is part of the tradition called the Gedankenexperiment, a term that Albert Einstein in particular made famous. It is very useful to think of an experiment that you can do entirely in your head. You can't really do the experiment; the point is not doing it, but by thinking about it, it can teach us something. So in this case: if we took all the phi torsions and put them at zero, we would always get clashes between the heavy atoms in the chain, so we can't put the phi torsions at zero. That's gonna be a completely forbidden region. And the psi torsions, if they are zero, well, it would be a bit bad. But overall, lots of white areas where most things would be allowed. What then happens is that we start adding heavy atoms here, particularly the oxygen, but this is still stuff on the backbone, not the amino acid side chains.
So with the oxygen there and the hydrogen there, there are gonna be some larger regions here that are now forbidden, where things would bump into each other. And this is important, because these are things that are true for all amino acids, right? Well, glycine and proline are a bit of an exception, but these areas will always be forbidden no matter what amino acids you have. And then we keep adding things on the side, in particular real side chains, and when it comes to side chains, alanine is as simple as it gets. Then there are all these gray areas here that are fairly bad. Now there's a small region here we're allowed to be in, and a small region here you're allowed to be in. And you can already start to recognize the Ramachandran diagrams that I showed you yesterday, right? For most amino acids, there's pretty much only one small place where they are really happy, and then you're gonna be there. So this is already starting to help us a little bit to solve Levinthal's paradox. You don't need to test every single combination of every single amino acid, right? Most amino acids are likely gonna wanna be up here and maybe down here. There's much less freedom per amino acid than you might think when you just see this as a chain. And that brings us to the last interaction. We've actually already covered van der Waals interactions; it's just that I didn't call them van der Waals interactions. So when I showed you these dipoles and the quantum chemistry, I said two things. At some distance, if you push atoms close enough, they're all gonna repel each other. And at very long distance, every single pair of atoms is gonna attract each other, although very weakly. And you can describe this, and this is fun, because this is actually an exact equation. And it's completely useless; I'll tell you why in a second.
At very short distances, the repulsion comes from something called the Pauli exclusion principle, and you can actually show that it's an exponential. It goes up very quickly as the radius goes to zero, and these are gonna be some sort of constants. And at very long distances, I already told you that the dispersion interactions, the induced-dipole-to-induced-dipole interactions, all go as one over R to the power of six. The problem is that to handle this accurately, we would have to do it with quantum chemistry. It's gonna be between all pairs of atoms, all triplets of atoms, all quartets of atoms, all quintuplets of atoms. It's impossible; we can't derive this. It also turns out that it's a bit expensive to calculate, in particular the exponential. Calculating a multiplication or an addition on a modern computer takes one cycle. This probably takes 100 cycles; that's a bit bad. But we can cheat, and we're gonna cheat in two horrible ways that turn out to be fairly simple. The first thing we can do, if we forget about the equations for a second, is that if there is some sort of shape to this curve, we can fit it to experiments. Assume that I only have one kind of atom, to make things really simple; let's say argon, because it's a noble gas, so you don't even need to worry about molecules. If there is one part that attracts and one part that repels, that means there are pretty much just two parameters we're gonna need to find out. And for most gases, one of them is gonna come from the density: how dense is this substance? Because that's gonna depend a bit on how much the atoms attract each other. And the other is that at some point, when you move from liquid to gas phase, or solid to liquid, there's a heat of vaporization: how much energy do you need to add to get it to change phase? Two experimental numbers, and there are two parameters we wanna fit. It's gonna work.
So that creates some sort of effective interaction that makes sure we reproduce experiments and that we don't have to care about the quantum chemistry. The problem is that if you do that in practice, we're still gonna be very sad, because it takes too long to calculate this in a computer, or at least it took too long in the late 1960s. So people came up with a horrible approximation. If you've already calculated one over R to the power of six, then instead of doing an exponential, well, we just want something repelling, a function that goes up very quickly as r goes to zero. So take one over R to the power of six and multiply it by itself. That gives you one over R to the power of 12, and that takes one clock cycle instead of the hundred. Then you get this other form, which is a strong repulsion, that blue line. And when you add them together, you get almost exactly the same curve. And all these things you can parameterize from different small gases, so we can calculate what these two parameters are. Again, a quantum chemist would be upset and say: but you can't take an exponential and approximate it with one over R to the power of 12; they are completely different functions. And you're quite correct, they are completely different functions. That's gonna be super important if you're up here. So if you're designing nuclear weapons, it's very bad, because if you're gonna have a density of a million kilos per cubic centimeter, that repulsion matters. I haven't checked lately, but there aren't a whole lot of applications of proteins in designing thermonuclear weapons. The point is, at room temperature, you're not gonna be up here. At room temperature, we're gonna be down here. We might move around a little bit here, but two atoms in a protein will never be at a distance of 0.1 angstrom. It will not happen.
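A minimal sketch of that 12-6 trick, with roughly argon-like parameters (sigma about 3.4 angstrom, epsilon about 0.24 kcal/mol; these are the two numbers fitted to experiment, and the exact values here are illustrative):

```python
def lennard_jones(r, sigma=3.4, epsilon=0.24):
    """12-6 Lennard-Jones energy in kcal/mol, r in angstrom.
    The repulsion is just the already-computed 1/r^6 term squared:
    one extra multiplication instead of an expensive exponential."""
    sr6 = (sigma / r) ** 6   # weakly attractive dispersion term
    sr12 = sr6 * sr6         # cheap 1/r^12 repulsion
    return 4.0 * epsilon * (sr12 - sr6)

# The minimum sits at r = 2**(1/6) * sigma, with depth -epsilon:
r_min = 2 ** (1.0 / 6.0) * 3.4
print(lennard_jones(r_min))   # about -0.24, the well depth
```

At room temperature the atoms stay near this shallow minimum, which is exactly why the unphysical shape of the 1/r^12 wall at very short distances never matters in practice.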
And because it will not happen, we're not gonna worry about it. In other parts of physics it's super important, but for the life sciences this works great. So: a repulsion term that's one over R to the power of 12, and a dispersion term, a weakly attractive term, that's one over R to the power of six. I haven't mentioned it explicitly, but one thing you see in all these figures is that negative energies are good. The lower you are in energy, the better it is. For now, you can take that as an axiom; I'm gonna show you why in a couple of slides. So that means that this attractive term, which is good, is negative, and the repulsive term, which is bad, is positive. And based on this, we can start to look at some of the things we learned, for instance the hydrogen bonds in proteins. I already showed you a small slide of this yesterday. All the images you've seen of, sorry, not proteins, water, are wrong, because you usually see something like this, right? The way it really looks is rather something like this. You have an oxygen that has a gigantic Lennard-Jones radius, and then you have two hydrogens. But that is not very fun to draw, and it would look a bit strange in simulations, so we tend to draw it this way. But these hydrogens are so weak that we frequently even ignore the Lennard-Jones on them, because, as you can imagine, this is not exactly a horrible approximation, right? It still works roughly the same way. We're gonna need the charge on the hydrogens, of course, but as for the repulsion of the hydrogens, there are so few electrons out there that pretty much all the electrons are gonna sit on the oxygen. What happens when we actually do this properly in quantum chemistry is that, the way things are oriented, and I'll use this pointer because it shows up on the recording, you're gonna have two directions here where you have the hydrogens, and then you're gonna have things arranged pretty much as a tetrahedron.
So there are gonna be two other directions, at the corners of the tetrahedron, where you have lone pairs of electrons. And that has to do with the way the orbitals work around the water. When these waters are now floating around and interacting with each other, these hydrogens are gonna love to interact with the lone pairs here. So you're gonna have lots of tetrahedral structure in the water, and these are gonna be electrostatic hydrogen-bond interactions. These O-H bonds are covalent interactions, while the hydrogen bonds are just electrostatic interactions. And we're gonna come back to the entropy part here shortly, but this is a very strong interaction. It's gonna be so strong that the waters will not like to break those. For now, that is just something you need to trust me on, but there's an obvious experiment you would like to do. In ice, you have hydrogen bonds everywhere, but if you look at liquid water, how many hydrogen bonds are there? You can determine that with spectroscopy, by checking at what frequencies water absorbs. In ice, you would have absorption at one wavelength; that would mean you have lots of hydrogen bonds formed. And the extreme opposite would be: if you wanted to study water without hydrogen bonds, we would somehow need to have one water molecule and make sure it doesn't have neighbors. The way you could do that is to study water in, say, carbon tetrachloride, CCl4. It's methane where you replace all four hydrogens with chlorines. That gives a completely different spectrum. So what happens when you boil water, no, not boil, when you melt ice? If all the hydrogen bonds broke, you would see the number one curve move all the way down to the number two curve here. But that is not what happens. What happens in liquid water is that you end up with something like the red curve here.
So you have a much more spread-out spectrum, which means that there will be lots of hydrogen bonds formed, but some waters will have more hydrogen bonds and other waters fewer. Mostly, the hydrogen bonds are still present even in liquid water; we don't break them. So there must be something else that happens. And this is intimately related to something called the hydrophobic effect, which you've definitely seen. You've all tried to put a droplet of oil in water or something and realized that it doesn't dissolve. The hydrophobic effect is super complicated. Chandler, who was one of the founders of this field, spent, I think, an entire career of 40 years on it, and there are still tons of open questions. We don't understand water. It's a much more complicated molecule than you might think. So, part of the reason why oil forms droplets in water is that if you put an oil drop here, that's gonna be bad, because the oil cannot form hydrogen bonds with your water. And by far the most obvious explanation is that the reason oil is bad is that we're gonna break all these hydrogen bonds, and because we're breaking hydrogen bonds, things don't dissolve in water. It's a completely reasonable, completely obvious explanation, and it's also completely wrong. If you measure the number of hydrogen bonds, it turns out that it is virtually exactly the same. And that is based on the fact that it's so expensive to break hydrogen bonds that water will not break them. The water is gonna do almost anything it can to maintain those hydrogen bonds. So what's gonna happen instead is that all the waters around this droplet of oil, or around an argon atom, are gonna reorient themselves to maintain the hydrogen bonds. They're gonna form hydrogen bonds with other waters instead. So you're really forming a shell-like structure around it, and this picture is actually based on a simulation.
So you're gonna form a shell-like structure around your hydrophobic solute. And the net effect of this is that the net number of hydrogen bonds is pretty much exactly the same. So we don't change the number of hydrogen bonds when we solvate something hydrophobic. Is that good or bad? Well, for now we don't really know, but I'm gonna spill the beans here and tell you that it's not entirely free. It's gonna cost you something; water doesn't like to do this. So if you now have two drops like this: if the drops are isolated, we're gonna have one area of shell around the first drop and another area around the second drop. But if you instead take those two drops and bring them together, the total area around the drops in the second case is gonna be smaller than the sum of the areas for the two separate drops. So it's gonna be an advantage to move from that case to that case. And that means that if we have lots of small dispersed droplets of oil in water, they're gradually gonna form larger drops of oil. So you separate the oil from the water. And that goes for most things that we consider insoluble or hydrophobic. The reason they are insoluble is that you can certainly stir things up and try to force them to stay dispersed in the water, but they will very quickly move together, so that instead of having lots of these small shell structures, you have one larger shell structure around all of them. These shell structures even have a name in chemistry: they're called clathrates. You don't need to know that. For now, we still don't really know why this is bad, but it's gonna have to do with the degrees of freedom of the water. And I'm gonna argue that this is very important for proteins. One important thing: if you have this long protein chain solvated in water, with lots of hydrophilic groups here, each of those hydrophilic groups will likely be interacting with waters.
They might form hydrogen bonds to your water molecules. And then, sorry, in this case it's a hydrophobic group, so in this case it might be bad: they still have to interact with waters. And somehow, what's gonna happen when we end up folding proteins is that these yellow hydrophobic groups are gonna move to the inside and not really interact with water anymore. And then the waters out here will be free to form hydrogen bonds with each other. And this is why water is so important. Whether proteins fold and whether they form their structure is just as much related to the water as to the protein itself. Which is also gonna be a difference between membrane proteins and water-soluble proteins. And that is why that approximation in quantum chemistry of not having any solvent around the protein is pretty bad, because we lost one of the most important components. Yep? You were asking about water in the crystal structures. There is water. There is always crystal water. There's not as much water as you would have around a protein in solution, but there is definitely crystal water in proteins. And you can even see this in the Protein Data Bank. In most PDB structures of proteins, there are a few waters present, and those are the crystal waters. And how much water do you need? It's hard to say. For a protein in solution, at some point you reach bulk water, and the protein concentration in your cells would be sub-micromolar, or nanomolar even. So inside a cell, most things are not protein; you have billions of times more water molecules. Now, I think the relevant question is rather: how much water do you need around the protein for the protein to behave as if it was in a cell? And that is probably a couple of layers. Whether that is five or 10 we can argue about, but you don't need 100 layers of water around it. And in the crystal structure, you might have one layer of water at least. It could influence it a bit.
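Coming back to the droplet argument for a moment: the claim that one merged drop exposes less shell area than two separate drops follows from simple sphere geometry. A sketch with idealized spherical drops and made-up unit volumes:

```python
import math

def sphere_area_from_volume(v):
    """Surface area of a sphere with volume v."""
    r = (3.0 * v / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 4.0 * math.pi * r ** 2

# Two separate unit-volume drops vs one merged drop of volume 2:
separate = 2.0 * sphere_area_from_volume(1.0)
merged = sphere_area_from_volume(2.0)
print(merged < separate)   # True: merging shrinks the total shell
print(merged / separate)   # 2**(-1/3), about 0.79
```

Because area scales as volume to the power 2/3, coalescing any number of droplets always reduces the total hydrophobic surface, which is why dispersed oil gradually demixes.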
So this hydrophobic effect is not unique to amino acids. For any small hydrocarbon compound that doesn't want to dissolve in water, the cost of solvating it is roughly proportional to the surface area of the hydrophobic part, the carbon part. And I'm gonna argue that's roughly the case for amino acids too. Alanine is a small example, but take, say, leucine: if you have a large side chain here that's hydrophobic, the cost of introducing it into water is gonna be roughly proportional to the area around the hydrophobic part. And then we have all these other things we need to care about: how atoms are interacting, with charges, with Lennard-Jones interactions, the bond lengths and torsions and everything. And based on how you place the different atoms, what they're interacting with, whether they are in a protein or interacting with a solvent or in a membrane, depending on how you change all these things, it's gonna be better or worse. So for now, we just consider these the degrees of freedom. If you change the bonds or torsions, you can end up with different things. And then we need to decide: is this good or bad? And this is just an amino acid. If you were to do this for a protein, we would like to use all of this to calculate an energy. If I could calculate the energy, I would know whether it was good or bad. The only problem is the size of that calculation. If you have 10,000 atoms, each atom has an x, y, z coordinate, so this is now a function of 30,000 variables. And then we need to add the water around it, so we probably have a function of a million variables or something. And I'm not sure about you, but that's a bit more complicated than I like to do before lunch, at least. So we're gonna need to do something simpler to try to understand this. And that brings us to energy landscapes.
So the energy landscape reduces this to the most complicated thing we can still understand easily, and that would be something of two variables. Just like the Ramachandran diagram, right? You have a phi and a psi torsion, and then you can say whether things are good or bad, because that can easily be visualized. Same thing here. Whether these degrees of freedom are Ramachandran torsions or distances between two atoms doesn't really matter. In practice, it's gonna be much more complicated than that, but if you want something concrete to think about, think about the Ramachandran diagram. For now, depending on what coordinates you pick, for atoms or bonds or something, we're gonna end up with things that are either good or bad, and we can somehow describe this in an energy landscape. We're also gonna use a convention here: when the energy is high, we're in the red part, and that's bad; when it's low, we're in the blue part, and that's good. So this is something that's good, and we expect the protein to like to be here. And this is something that's bad, where atoms are bumping into each other or something. The reason we use this convention for energy is that it matches the derivatives and it matches everything we do in daily life. If you take a kilo and carry it up a flight of stairs, you've added energy to it. But if you allow that kilo to drop, it would like to drop; everything would like to go to lower energy. And that's why it makes sense to say that low energy is good and high energy is bad. So we assume that red is high energy and bad, while blue is low energy and good. And here we are kind of stuck, because we don't really get any further: we lack a language. Here I just had to say that high is bad and low is good. That's hand waving. We need some sort of methods, concepts, whatever, to start to describe this.
Does good mean that if you have 10 proteins, all 10 of them will be here, or will it be 50-50? How many proteins are gonna be here? How many will be here? That is obviously the best one; the question is, is that one good enough? Could you have some proteins there or there? At some point, if the protein moves, it might have to go from there to there. When does that happen? We need a language to describe these things, and that's why we need to go back to physics. This part of physics is called statistical thermodynamics or statistical mechanics. There is a classic book in this field, written by David Goodstein, States of Matter. If you would like to spend six months studying only statistical mechanics, it's a great book. And there's a great, fun preamble in this book. It says: Ludwig Boltzmann, who spent much of his life studying statistical mechanics, died in 1906, by his own hand. Paul Ehrenfest, his student, carrying on the work, died similarly in 1933. Now it is our turn to study statistical mechanics. Perhaps it would be wise to approach the subject cautiously. So this subject certainly has a reputation for being exceptionally difficult; it's one of the most difficult parts of modern physics. And again, I'm not gonna pretend that it's easy. But anything in the world that's worth knowing is hard, and this is something that is hard. If you learn to grasp it, it's gonna help you throughout your career, not just in this course. It will help you understand, throughout chemistry in particular, what happens, why things happen, and when reactions happen. That does not mean you need to drill down into every single equation, but understanding the concepts here is important. However, because you're also not physics students, I'm not gonna start by throwing all the equations at you right away.
There will be some equations thrown around here, but I'm gonna deliberately follow the book, because then, if you don't understand this, you can actually read the book tonight. What the book does is introduce this for one special case, so that we get a decent feeling for it. And then next week, I think it is, we're gonna do the real general case and prove that the things I'm gonna show you are universally true. Why is statistical mechanics difficult? Well, the reason it's difficult is that we don't assume anything. If you study atomic physics, everything you're studying is limited to atoms. If you're studying materials science, it's limited to materials. The strange thing here is that the equations I'm gonna show you are universal: physics, chemistry, life, everything. They don't even depend on a specific formula. I bet statistical mechanics is gonna be more long-lived than quantum mechanics, because we're talking about general mathematical concepts that must be true. They're not based on an axiom that something else in nature must be true. But this generality is occasionally what makes it hard. And that's why I say, let's first approach it without the generality, and then we can look at the generality slightly later, when you understand the concepts. Have you seen this equation? The Boltzmann distribution. Also something that you should know if I wake you up in the middle of the night. The Boltzmann distribution is seemingly super simple, and the devil is in the details. Rho is usually just something we use for probability or density. You can even think of it more generally: the likelihood of something happening or occurring, say the likelihood that a protein is in the red position. And this symbol just means proportional to. Then there's the exponential function raised to minus the energy divided by something.
What this means is that if the energy here is very low, negative even, then this is gonna be a large positive number, so we really like to be at very low energies. And similarly, if the energy is very high and positive, this is gonna be something very small. So this conceptually describes what I told you in the previous plot: we're very likely to find things at low energy, and it's very unlikely to find things at high energy. But we want something slightly better than hand waving. And you can see this in physics, in gases and everything. If you look at the speeds of individual molecules in a gas, say nitrogen: at low temperature, 100 Kelvin, there will be a distribution of speeds, with some sort of average speed here, and only a few molecules having very high speed. And as we increase the temperature, we're adding more and more energy, and the average speed of the molecules will increase. And that's one complication: apparently it doesn't just depend on the energy here, it's also gonna depend on the temperature. At low temperatures, most things are gonna be at very low energy, and as we increase the temperature, it becomes increasingly likely to also see things at high energy. And that is another fundamental difference between physics the way we study it and quantum chemistry, right? It matters that proteins are at 300 Kelvin, not zero Kelvin. The distributions you would see at zero Kelvin would be very easy, because there would not be any speed. The problem here is that when I was an undergraduate, this was frequently brought up as if it were some sort of axiom. And the thing with axioms is that an axiom is something you just have to take as given, an observation that might be proven wrong. This is not an observation. You can prove this. And it's not super difficult to prove. It is actually difficult to prove in the general case.
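The energy and temperature dependence just described can be sketched numerically. The 3 kcal/mol number below is the torsion-barrier scale from earlier in the lecture, used here purely as an illustration:

```python
import math

KB = 0.0019872  # Boltzmann constant in kcal/(mol*K)

def relative_population(dE, T):
    """Boltzmann weight of a state dE (kcal/mol) above the ground
    state, relative to the ground state: exp(-dE / (kB*T))."""
    return math.exp(-dE / (KB * T))

# A 3 kcal/mol barrier: easily visited at 300 K, essentially never at 100 K.
print(relative_population(3.0, 300.0))  # about 0.006
print(relative_population(3.0, 100.0))  # about 3e-7
```

The same two-line calculation is why a 300 kcal/mol electrostatic contact is effectively unbreakable at room temperature while a 5 kcal/mol torsion barrier is crossed all the time.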
Actually, now that I say it, in one way it's easier to prove in the general case, but the problem is that to prove it in the general case, we would need to spend the next hour just focusing on pen and paper and mathematics. We would talk about systems and energy, and you lose touch with reality very quickly, particularly if you haven't worked with the equations before. So I'm not gonna throw that at you right now; I'm gonna throw that at you in a week or two. Instead, what the book does, and I like this, is to do it for one very specific case that is likely easier for you to grasp. Imagine that you have gas in a very, very thin cylinder, a capillary even, and whether this is one meter or a thousand meters high doesn't really matter. If you're out flying at very high altitude, the air is thin, right? And as you go down, the air gets thicker. Why is that? Well, yes, gravity pulls the molecules down, but if gravity pulled all the molecules down, it would be too crowded down here, right? And then we would have vacuum up here. So the point is, it's a balance. On the one hand, we have this gas and a height. Let's see, there we have it. At low altitude, there's very high density and low potential energy. Low potential energy is good, but high density is bad, right? That comes back to the ideal gas law: you can't put all the atoms in the same place. And up here it's gonna be better in terms of the gas law, because the density is lower, but the potential energy is high, which is bad. So there's gonna be some sort of balancing act here. As you said, gravity is definitely pulling things down, but the pressure, the gas law, is opposing that; you can't have everything down here. The gas law means that, individually, it's better for the molecules to move up. And at some point there's an equilibrium: if you leave this thin capillary to itself, things will relax, and then the distribution is not gonna change anymore.
So what we're gonna do is take a thin slice here in the middle and try to understand what happens exactly there. How many molecules do we have in this slice? We already know we're gonna have more molecules per slice at low altitude and fewer molecules per slice up here. And now, with only a little hand waving: the number of molecules in a slice is really proportional to the probability of each molecule being there, right? So this is really the ρ, the probability, of the Boltzmann distribution, and it is a function of h, the height, which corresponds to the energy. So if I can derive the density of molecules as a function of height, I am really deriving the probability as a function of energy, for one special case. And we don't need very advanced mathematics here. You've all done this: the gas law. Depending on your background, you're probably most familiar with the version pV = nRT, where R is the gas constant and n is the number of moles. That works great. But we are doing physics here, and it turns out, particularly for the Boltzmann distribution, that while chemists love to calculate the number of moles and use the gas constant, a mole is a fairly arbitrary definition, and we don't want that 6×10²³ to enter everywhere. So in physics, if you're gonna find an absolute magnitude, it's much easier to just count the molecules, not 6×10²³ of them at a time. So n now corresponds to the number of molecules, and that just means you get a slightly different constant there: Boltzmann's constant.
It just differs from R by that factor of 6×10²³. The other thing is that it's a royal pain if we constantly have to calculate the volume of this tube; I don't want units of length entering everything we measure. So let's make this a bit simpler: take the volume and move it down into the denominator. Instead of the number of molecules, let's count the number of molecules per volume, and we introduce a lowercase n for that. This is something that's gonna come back, and it's important to learn: when you're sitting and working with an equation, don't be afraid of trying things. Most derivations in classes like this are horrible in a way, because of course I already know how to derive this and you don't. When I'm deriving things in my research, do you think I already know the answer? I'm usually doing it for the first time, right? So when you're working with equations, don't be afraid of playing with them. If you introduce an assumption, the worst thing that can happen is that it goes wrong, and trust me, you're gonna find out six hours later when you're tearing your hair out and things don't work. But I have a flight to catch at 2:30, so if we did everything that way we'd have a bit of a problem; you will have to trust me on some of these steps, even though some of them are not entirely obvious. Doing something like this, I like it, because suddenly there's one less letter to care about and I don't have to worry about the volume. And that usually works really well. So the gas law simplifies to: the pressure equals the number of molecules per volume multiplied by kT, and k and T are just constants for now. And if we now wanna study things as a function of the height in this tube, the obvious question is: how do things change with height?
We have a name for that: the derivative, how the pressure changes with height. We would normally need the product rule for derivatives here, but it's very easy, because neither the temperature nor Boltzmann's constant depends on the height. So the derivative of the pressure is just the derivative of the number of molecules per volume with respect to height, times those two constants: dp/dh = kT·dn/dh. We're still talking about upper secondary school math, very easy. All I've done so far is simplify the equations a bit and describe how the pressure changes with height in terms of how the number of molecules per volume changes with height. So that was the part that's pushing us up: the gas law. The overall idea is that this has to be equal to the part pulling us down, which you noted is gravity. This is, incidentally, a very common way of deriving things: you find two things that must be equal, set them equal, and then work on the left-hand side and the right-hand side until you get somewhere. First we worked on the left-hand side; let's try the right-hand side. This is even easier math, I hope, because it's lower secondary school. The potential energy of something is the mass times the gravitational acceleration times the height, E = mgh, right? It doesn't get any easier than that. And if you now move up by a very small amount dh, the weight of the gas pressing down changes by the mass, multiplied by the gravitational acceleration, multiplied by the number of molecules we have per volume, multiplied by that small height dh. It's not as hard as you think. If you're not used to this notation: under some conditions we are actually allowed to treat the numerator and the denominator of these derivatives separately. So compared to the previous slide, we have now calculated how much we are pushing up.
And same thing there: if you move from the lower slice to the higher slice, what is the net effect of the gravity pushing down? Those two things have to be equal. What would happen if they were not equal? Things would start moving, and then we would just wait until they stopped moving, and then we're at equilibrium. By definition, at equilibrium things don't move anymore on average, so they have to be equal, otherwise there would be an acceleration. On the last slide we said that the rate at which the pressure changes with height was equal to the rate at which the number of particles per volume changes with height, multiplied by those two constants. And then we had this other term: how much gravity is pushing down. By definition, they need to be equal, so we can set an equal sign between them, but now we have to be careful, because here we're pushing down, so that term gets a negative sign. So the −mgn has to be equal to kT times the way n changes as we go up. So now we know that the derivative of the number of particles with respect to height, multiplied by some constants, equals some other constants multiplied by the number of particles itself. An equation with the derivative of n on the left side and n on the right side: it's gonna be a differential equation. You can simplify it a bit by moving the kT over to the right-hand side, so the derivative of n equals some constant multiplied by n itself: dn/dh = −(mg/kT)·n. So how do you solve this? This is the same reason why you need to know the amino acids by heart. This is not a course on math, and I would not necessarily expect you to be able to solve that differential equation from scratch. But just as you need to know enough about amino acid properties to have a gut feeling for them, in mathematics, if you've worked enough with it, you're gonna recognize this.
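Even before solving dn/dh = −(mg/kT)·n analytically, you can convince yourself what the solution looks like by integrating it numerically, in the spirit of the simple Python exercises in the lab. This is my own sketch, with the constant mg/kT set to 1 so the exact answer is exp(−h):

```python
import math

def integrate_density(h_max=5.0, dh=1e-4):
    """Forward-Euler integration of dn/dh = -(mg/kT) * n, with mg/kT = 1.

    Starting from n(0) = 1, step the density up in height and return n(h_max).
    """
    n, h = 1.0, 0.0
    while h < h_max:
        n += dh * (-n)   # dn = -(mg/kT) * n * dh
        h += dh
    return n

# The numerical answer lands very close to the analytic exp(-5)
print(integrate_density(), math.exp(-5.0))
```

The step-by-step result matches the exponential, which is exactly the recognition the lecture asks for: a quantity whose rate of change is proportional to itself decays exponentially.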
And the way you recognize it is this classical rule: the derivative of the logarithm of n equals one divided by n, multiplied by the derivative of n itself, d(ln n)/dh = (1/n)·dn/dh. You can prove that, but we're not gonna prove it here; you just have to remember that, ah, this looks like the derivative of a logarithm. So you can rewrite the equation on the last slide and say that the derivative of the logarithm of n (the one-over-n absorbs the n we had on the right) equals a constant. And then we're essentially done. It might not look like we're done, but if you integrate both sides, it says that the logarithm of n equals −mg/kT multiplied by h; integrating the constant with respect to height is what gives us that h. And if we take the exponential of both sides, the exponential of the logarithm gives us n, and there's an exponential on the right-hand side too: n(h) ∝ exp(−mgh/kT). And here we can now recognize what we had on the previous slide: the mass multiplied by the gravitational acceleration multiplied by the height, that is the energy, relative to some reference position. For now we're not gonna care what that reference is; we can call the whole thing ΔE. So the number of particles you have at a particular height is proportional to the exponential of minus the energy at that height divided by kT, where k is Boltzmann's constant and T is the temperature. Now, this is one very specific case, and strictly it's worthless in the sense that we can't use it for proteins just because I showed it for air in a gas column; proving it completely universally is slightly more difficult. But the point is that with nothing more than upper secondary school methods, you could derive the Boltzmann distribution.
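To see what the result n(h)/n(0) = exp(−mgh/kT) actually means for air, here is a small sketch plugging in real constants; the nitrogen molecule mass and the temperature are values I assume for illustration:

```python
import math

KB = 1.380649e-23   # Boltzmann constant, J/K
G = 9.81            # gravitational acceleration, m/s^2
M_N2 = 4.65e-26     # mass of one N2 molecule, kg (assumed value)

def relative_density(h, T=300.0, m=M_N2, g=G):
    """Barometric formula: n(h)/n(0) = exp(-m g h / (k T))."""
    return math.exp(-m * g * h / (KB * T))

for h in (0.0, 1000.0, 5000.0, 10000.0):
    print(f"h = {h:7.0f} m -> n/n0 = {relative_density(h):.3f}")
```

At ten kilometers, roughly cruising altitude, the density has dropped to about a third of the ground value, which matches the "air is thin up there" intuition the derivation started from.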
And that is also why this will never be proven wrong. We might well realize that quantum mechanics is incomplete and merge quantum mechanics with relativity or whatever, but this will never be proven wrong, because it does not build on quantum mechanics. It builds on mathematics, observations and probabilities; we only assume mathematics and statistics here. In the particular case of the gas we assumed some things about the gas column, but we're gonna show this later on without assuming anything about the system. So this is universally true for any system that can exist at different energies: we can calculate how likely the system is to be in a state based on what the energy is, and there's always gonna be an exponential involved. This might sound mundane, but do you have any idea how fast the exponential function grows? Trust me, you have no idea. I think I have a slide about this, probably in the next lecture. By the time the energy starts going up here, you're very quickly gonna be in I-will-eat-my-left-shoe-if-it-happens territory. The exponential function grows so rapidly that you will never see things at very high energy. On the other hand, you will see things at low energy, and even at slightly higher energy, even if it's slightly less favorable. And the scale for this is gonna be the temperature and Boltzmann's constant: the higher your temperature is, the more likely it is to see things at higher energy, because a high temperature compensates for the energy being high. What this means in practice is actually easier than it might seem from the equations. Now I just take the Boltzmann distribution and ask for the probability of something. Think of this as a protein, but it is universally true for any system.
The probability of being in a state A where we have the energy E_A is a constant multiplied by the exponential of minus that energy divided by kT: P(A) = C·exp(−E_A/kT). That's exactly what we had on the last slide, right? And let's stay with the same system, because if it's the same system, the constant is gonna be the same. If you are in state B instead, the probability is exactly the same thing, but with energy E_B. That's what the Boltzmann distribution says. What is C? We have no idea, and that's the problem. You could try to normalize this by requiring that the total probability is one, but then we would need to enumerate all the states. So we will cheat. And we will cheat by saying: you know what, what I care most about is how likely it is to be in A relative to B. Is it more likely to be in A or in B? So we just take the quotient of the two, and then the C's disappear. Also a neat trick: if there is something you don't know, try to get rid of it. Instead of spending the next six hours worrying about how to normalize with that C, I spent two seconds and got rid of it. And if you know your exponential laws, an exponential divided by another exponential means you take the difference of the arguments. So the quotient corresponds to exp(−(E_A − E_B)/kT), the difference between A and B. So now, instead of absolute energies, we have a delta, an energy difference. That is also neat because, as I didn't tell you before, an energy is always measured relative to something, right? The energy of lifting something is relative to the ground or relative to the desk, but is the desk on the first or the second floor, and is the ground here or the ground down by the subway?
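The cancellation trick is easy to check numerically. In this sketch I put energies in kJ/mol and use kT ≈ 2.5 kJ/mol, roughly room temperature; the units and numbers are my choice, not the lecture's:

```python
import math

def population_ratio(e_a, e_b, kT=2.5):
    """P(A)/P(B) = exp(-(E_A - E_B)/kT).

    Energies in kJ/mol; kT ~ 2.5 kJ/mol near room temperature.
    The unknown normalization C cancels in the ratio.
    """
    return math.exp(-(e_a - e_b) / kT)

# A state 5 kJ/mol below B is about 7.4x more populated...
print(population_ratio(0.0, 5.0))
# ...and the reverse ratio is exactly the inverse.
print(population_ratio(5.0, 0.0))
```

Note that only the difference E_A − E_B enters: shifting both energies by the same constant, moving the "ground floor", changes nothing.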
The second you want an absolute reference point, things become difficult, because you need to define what that point is. So that's another cool thing here: by looking at relative probabilities, I got rid of the absolute part. The only thing that matters for how likely state A is compared to state B is the energy difference between them, inside an exponential. And what this says is that low-energy states will be much more populated. This is amazingly cool, much, much cooler than you think, because this is where we can start drawing conclusions about proteins, water, anything you can imagine. Let's say we have a state A there and a state B there. Which one is most likely? A, okay. But that is not the same thing as saying that things will never move over here, because we also have thermal fluctuations, right? There is a finite temperature. Just as with the gas in that column: it's not that the air molecules stop moving; it's just that on average the density is constant, while the individual molecules keep moving. So here, too, there will be some molecules moving from B to A and some molecules moving from A to B. And then there are two things that can happen. Either there are more molecules moving from A to B, and then we are not yet at equilibrium, we are net moving in this direction; or there are more moving from B to A, and then we're net moving in that direction. So the Boltzmann distribution leads to something pretty cool called detailed balance. The number of molecules moving from A to B is very simple: it is the number of molecules we have in A, multiplied by the probability of an individual molecule moving from A to B. The number multiplied by the probability. And if we are at equilibrium, that must be exactly the same as the flow in the other direction, right?
Which is the number of molecules in B multiplied by the probability of moving from B to A. So with that one equation from the last page, we can now say that the number of molecules in A relative to the number of molecules in B is not just proportional but identical to the probability of moving from B to A divided by the probability of moving from A to B, per unit time. So even though, superficially, Boltzmann just told us something about the likelihood of being in different states, we can now also start discussing rates: how likely is it for a molecule to move from one place to another? What are the relations between these two energies? How many molecules will move in this direction versus that direction? And we're gonna keep coming back to this. It's pretty cool, because the derivation I showed you assumed things about gases, but within a week you're gonna prove that this is universal. There's nothing here that assumes a protein; it's gonna be equally true for stellar systems as for proteins as for electrons. And this is what I find so amazing about statistical mechanics. Who is a bit confused? Okay, so there are four of you who are honest. You're gonna spend the entire lab tomorrow playing around with this, and you're actually gonna do a simulation; I think they've designed a beautiful lab on this. The point is that you should not assume things. They're pretty much just gonna give you the Boltzmann distribution and detailed balance as simple laws, and then you're gonna end up modifying some very simple Python code. You're gonna sit and play around with this and get a feeling for what it means, how populated the states will be, and so on. This is deliberately a Gedankenexperiment, because you can do this for a protein too.
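The lab will give you something along these lines; the following is my own minimal two-state sketch, not the actual lab code. It hops between a low-energy state A and a high-energy state B with transition probabilities chosen to satisfy detailed balance, and checks that the occupancies settle to the Boltzmann ratio:

```python
import math
import random

def simulate_two_state(delta_e=1.0, kT=1.0, steps=200_000, seed=1):
    """Hop between states A (energy 0) and B (energy delta_e).

    Detailed balance fixes the uphill acceptance probability at
    exp(-delta_e/kT) when downhill moves are always accepted.
    Returns the time spent in each state.
    """
    rng = random.Random(seed)
    state = "A"
    counts = {"A": 0, "B": 0}
    p_up = math.exp(-delta_e / kT)   # probability of accepting A -> B
    for _ in range(steps):
        if state == "A":
            if rng.random() < p_up:
                state = "B"
        else:
            state = "A"              # downhill moves always accepted
        counts[state] += 1
    return counts

c = simulate_two_state()
print(c["B"] / c["A"])   # should approach exp(-1) ~ 0.37
```

The point of the exercise is exactly the one in the lecture: the microscopic hopping rules only mention transition probabilities, yet the long-time populations come out Boltzmann-distributed.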
You can do it with water, but there's nothing here that assumes protein or water, and by adding those details you're just crowding the picture and occupying your mind with things that are not important. That's why they're doing this for a completely abstract, very simple system first: if you've seen and understood it there, it's gonna be much easier for you to move to proteins later. Unfortunately, the simple picture is not entirely complete. If you had to pick one of these shapes as the container for your gas, which one would be best, and why? This is deliberately not obvious. Take one minute and try to talk to the person next to you and convince them why you're right. Thirty more seconds. Okay, let's break there in the interest of time; I'll reserve more time for this discussion next time. Do we have any suggestions? Yes. No, they are not all equally good. Sorry? Why do you like the round one? Yeah. You could certainly argue for the round one, because the overall volume is good, and you have volume in the middle that's reasonably good, and also some other reasonably good volume. Which one do you think is the worst? Any other takers for the worst one? I would hate this one, and the reason is that it has very little volume in the place where it's really good to be, right? And lots of volume in the place where we don't want to be. I'm not proving this, just hand waving, but think of it this way. If I try to sort you into a corridor, and everybody who gets to the end of the corridor gets a thousand dollars, it would be pretty crowded at the end of the corridor, right? But everybody could not fit there. So ideally, you would all like a corridor with more space where you want to be.
Whereas in a corridor where you would all want to be in one spot, but where there is unfortunately only room for one student, that student is going to be happy and all the others will be miserable. On the other hand, as you mentioned, I should probably modify the slide a bit: it's also good to have an overall large volume. I think that both that one and that one should be reasonably good. The volume here is slightly larger overall, and this one is good because we have lots of volume, space for lots of atoms, where most of the atoms want to be. But here you suddenly realize that the simple derivation I gave you was a bit of a lie. How does the Boltzmann distribution account for this? It doesn't, right? The Boltzmann distribution only said that the probability was a function of the energy, and the energy here is low and there it is high, via the height. So somehow the amount of space, the multiplicity, the number of atoms that can be in a given state, also matters. It might actually be easier to do this with equations. We'll still use the setup from detailed balance: one state A and one state B, and for now we don't care what they are. In the Boltzmann distribution we just spoke about one state at a time, and we never imagined a state having, say, two positions in it, so that each atom could be in either. But let's now say that state A has some sort of volume V_A, and I'm not gonna say how I measure volume, but it somehow describes how large state A is, and there's another volume V_B that describes how large state B is. And then I can say that the number of available positions is obviously proportional to the volume, right? Because for every particle that can be in state A, if I make the volume twice as large, I have room for twice as many particles.
So it's reasonable to say that this is proportional to the volume. So the relative probability of finding something in A versus B is still gonna have the Boltzmann exponentials for A and B, but now we also need the prefactors: the numerator is gonna be proportional to volume V_A and the denominator proportional to volume V_B. Are you following? And here we should get sad, because suddenly the stupid teacher reintroduced the volume that we got rid of four slides ago. That's bad, and I hate it when I introduce bad things, but live with it for now; let's see if we can get rid of it. And this is a completely arbitrary-looking trick that would likely take me half an hour to discover if I were doing it for the first time. I can choose to write the volume as the exponential of the logarithm of the volume; the logarithm is just the inverse of the exponential, right? So the V_A and V_B that previously sat outside the exponentials can each be written as an exponential of a logarithm, multiplied by the Boltzmann exponential. And when I multiply two exponentials, that corresponds to adding the arguments. So the logarithm now ends up as an added term inside the exponentials, in the numerator and in the denominator. And just to make things tidier, let's multiply and divide the logarithm term by kT, because then I can move it into the same fraction: the energy minus temperature times k times the logarithm, all divided by kT. And the funny thing is, that looks exactly like the Boltzmann distribution: an exponential raised to minus something divided by kT.
It walks like a Boltzmann distribution, it quacks like a Boltzmann distribution. It's just not exactly a Boltzmann distribution the way we were used to it: instead of E, it says E minus the temperature multiplied by Boltzmann's constant multiplied by the logarithm of this strange volume. So we're gonna need some other letter here. We called the energy E; what is the next letter in the alphabet? Let's call this F. That's the trick: when you can't get further in physics, you define something. E itself was never fundamental either; E was the sum of all the electrostatics and the bonds and the torsions, super complicated per se, but by introducing a letter for it, calling it the potential energy, it became easier to work with. And now it turns out, oh no, there are some other terms we're gonna need to account for; let's just introduce another letter so we don't have to worry so much about them. So we'll call that F, and that is the concept of free energy. It's not quite energy, but because we now account for these volumes, it turns out that it very much describes how much energy is available to perform work. "Available energy" would actually have been a better name, but if we called it that, we would spend the rest of our careers explaining to colleagues what we mean. So let's call it free energy, because that's what everybody else does. But you still have this horrible part with the volume, and again, when we can't get further, physicists define things. The temperature is nice to keep separate, since it can change, and Boltzmann's constant is a fundamental constant of nature, so that we'll keep too.
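The bookkeeping trick, folding the volume prefactor into F = E − kT·ln V, is easy to verify numerically. A sketch with made-up energies and volumes in arbitrary units:

```python
import math

def free_energy(E, V, kT=1.0):
    """F = E - kT ln V, with V standing in for the number of microstates."""
    return E - kT * math.log(V)

def ratio_explicit(E_a, V_a, E_b, V_b, kT=1.0):
    """P(A)/P(B) with the volume prefactors written out explicitly."""
    return (V_a * math.exp(-E_a / kT)) / (V_b * math.exp(-E_b / kT))

def ratio_free_energy(E_a, V_a, E_b, V_b, kT=1.0):
    """The same ratio, but as a Boltzmann factor in the free energy."""
    dF = free_energy(E_a, V_a, kT) - free_energy(E_b, V_b, kT)
    return math.exp(-dF / kT)

# A higher-energy state can still win if it has enough volume (states):
print(ratio_explicit(2.0, 20.0, 0.0, 1.0))      # both give the same number
print(ratio_free_energy(2.0, 20.0, 0.0, 1.0))
```

The two functions agree exactly, and the example shows the lecture's point about the shapes: state A sits two kT above state B in energy, yet its twenty-fold larger volume makes it the more populated one.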
And then the remaining part somehow describes the number of different microstates, a count. In the extreme case, you can literally count: spin up is one state for an electron, spin down is another. So at some level we can count these volumes. So instead of saying k multiplied by the logarithm of the volume, let's call that S. And that's this horrible concept, entropy, that you've probably bumped into and found difficult to understand. If we keep the temperature separate, this letter F is just equal to the energy minus the temperature times the entropy: F = E − TS. (Strictly speaking, it's a bit sloppy to call the first part energy; to be correct it should really be called the enthalpy.) The first part describes the direct interactions in the system, what energy you are at. The second part, in particular S, describes the number of states available to the system, how much space there is. And the point is, it really is that simple. How many of you have worked with entropy before? Yeah, and how many found it easy? The problem is that you try to approach entropy through order or something; you try to understand it. This is not something you understand; you define it. It turned out to be convenient to define it to get rid of that horrible volume. It's a definition. If you wonder what entropy is, it's very clear: entropy is k multiplied by the logarithm of the number of states, the number of ways you can organize the system. It's as simple as that. And here is the cool thing: now you can say that the relative probability of any two states A and B is the exponential of minus ΔF divided by kT. And now we can also account for systems where you might have two states with energy four, six states with energy nine, and 39 states with energy fourteen.
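That last example, two states at energy four, six at energy nine, and 39 at energy fourteen, can be worked out directly: you weight each energy level by its number of states times the Boltzmann factor. The kT value below is my own assumption for illustration:

```python
import math

def populations(levels, kT=2.5):
    """Normalized probability per level, for levels given as (degeneracy, energy).

    Each level is weighted by g * exp(-E/kT); the sum normalizes them.
    """
    weights = [g * math.exp(-E / kT) for g, E in levels]
    Z = sum(weights)
    return [w / Z for w in weights]

# The lecture's example: 2 states at energy 4, 6 at energy 9, 39 at energy 14
levels = [(2, 4.0), (6, 9.0), (39, 14.0)]
for (g, E), p in zip(levels, populations(levels)):
    print(f"g = {g:2d}  E = {E:4.1f}  ->  p = {p:.3f}")
```

The lowest level still wins, but notice how the 39-fold degeneracy keeps the highest level far more populated than its energy alone would suggest: that compensation is exactly the entropy term in F = E − TS.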
That will show up in the entropy part, because the energy would be the same within each of those groups of states. And this means that it's a much more realistic way of treating nature. That's why free energy really describes whether reactions will happen, while plain energy can't take into account whether there are multiple states. And that's why, any time you use the Boltzmann distribution, you're likely gonna use a free energy. And that equation, my friends, is the third or fourth one that you should be able to recite in the middle of the night. We're gonna spend about a third of this course talking about that equation. You probably thought that the Boltzmann distribution looked complicated, but the Boltzmann distribution is fairly simple: low energy is good, high energy is bad. Understanding this equation is much, much harder, because what it means is more subtle. I'm gonna do a small example. No, this is fake, this is not my desktop. How many states does this correspond to, if you think of the ways you can place the icons here? What is the volume of that state? Imagine this was a protein organized in one particular way. That is a very low number of states; exactly, this pretty much corresponds to one state. And, sorry, now I've spilled the beans, but let's compare it to this state, which sadly is probably a more representative view of my desktop. How many states does that correspond to? You might think that the first is one state and this is many. And the point is that that's wrong: each of them corresponds to exactly one state. And this is where F = E − TS starts to be complicated. The trick is: how many states are there like the first one? There are very few such states, right? In the extreme case, there might only be one state that is that well-ordered.
I do this experiment in my kids' room every week too, trust me. It's very unlikely to end up in a state like that spontaneously. But there are lots of states that are like this one, in the sense that we would call them all disordered. So if you start with a room in a random configuration and stir in some energy, where will you on average end up? Here. But the likelihood that you end up in exactly that disordered state, with that icon placed exactly there and that icon placed exactly there, is of course no higher than the likelihood of ending up in exactly the ordered state. And it's gonna turn out to be the same for proteins: there are lots of states that are like the unfolded ones, but each specific position of every single atom is of course just one microstate. There are lots of states like the unfolded ones, and there are very few ordered, folded states of proteins. So the entropy is gonna be crucial here. Yes? That's a really good question; we might not even have time to go into it today. If you go into my kids' room and say that it's ordered, there might be only one state where everything is ordered, right? But if you go in and say that it's a mess, you might think it's always the same mess, but no: technically, the toys have probably never, ever all been in exactly these positions before. On the microscopic level, I would have to classify the x, y and z coordinates of every toy and its rotation. I don't do that, because it's pointless. Instead I look into the room and classify it as ordered or not ordered. There are very few configurations I would call ordered, but there are lots of random positions of coordinates where I would call the room disordered. That's my higher-level description; on the microscopic level, it's just positions of atoms. We're gonna come back to this lots of times with proteins.
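The messy-room argument is really just counting. Here is a toy sketch where a "room" is t distinguishable toys, each placed in one of s spots; the numbers are arbitrary, and k is set to 1 for convenience:

```python
import math

def microstates(toys, spots):
    """Number of ways to place `toys` distinguishable toys into `spots` spots."""
    return spots ** toys

def entropy(omega, k=1.0):
    """S = k ln(omega), with Boltzmann's constant set to 1 for convenience."""
    return k * math.log(omega)

toys, spots = 10, 20

# "Perfectly ordered" corresponds to exactly one arrangement...
print(entropy(1))                         # S = 0
# ...while "messy" means essentially any arrangement at all.
print(entropy(microstates(toys, spots)))  # S = 10 ln 20
```

Any one messy arrangement is just as improbable as the ordered one, but because there are 20^10 of them and only one ordered one, a randomly stirred room is overwhelmingly likely to look messy. That lopsided count is all the entropy term is recording.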
And this is of course what the entropy describes, right? The entropy here is very low, only one state, and the entropy here is very high, lots of states. That's why we have this association of entropy with order: if there are lots of potential states, we call that disordered and the entropy is high, and with only one state, the entropy is low and it's ordered. That leads us to the laws of thermodynamics, and particularly the second law. You can read these later. The zeroth law says that if two systems are each in thermal equilibrium with a third system, they are also in equilibrium with each other. The first law is pretty much that the energy is constant. The second law of thermodynamics is probably the most difficult one to understand, and it says that in a natural process, the sum of the entropies of the interacting thermodynamic systems increases and approaches a maximum at equilibrium. In other words, entropy always increases, so the entropy will decide whether things happen or not. If you think about it, energies are fundamental properties of a system, but they don't by themselves describe whether things happen. Entropy, and in particular free energy, is so important because it actually tells us which things happen and which do not. There's also the postulate, the third law, that at zero Kelvin the entropy is at its minimum. And it turns out we can use this to understand things that were not so easy to understand before. Look at any compound, such as water, that undergoes a phase transition: when it's ice, you have a very well-ordered system and all the hydrogen bonds are formed.
And at high temperature, we start to break some of those hydrogen bonds. If we did not have entropy, that would be bad, right? Because the energy would be worse, and we don't want bad energy, we want low energy. So if we did not have entropy, you would say that the best state for any water molecule to be in should always be the ice state, no matter the temperature; there would be nothing by which the temperature could tip the scale. But obviously that's not true. We know that water will at some point become liquid, and at some point it will become a gas. And the reason for that has to do with the balance between energy and entropy. Yes, I have two minutes, so I can take us through this. At low temperature, the energy here is very low, which is great; all other things equal, low energy is awesome. But all other things are not equal: you also have a very, very low entropy, because each water molecule here can't move. There's pretty much only one position, one state, the water can be in. Exactly what units we use for the entropy is not really relevant here; you can even just count the states. That water can't move to another state, because then you would break the ice, at least at zero kelvin. Liquid water, on the other hand: you've just started breaking up the hydrogen bonds, so it's going to be unhappy; the energy is much higher. However, the entropy is starting to go up, because there is now some freedom for waters to move between different states. And if you start to look at this equation, F = E − TS, which again is the most important equation in this entire course, what is the part that determines the balance between E and S here? T. So let's assume that we are at zero kelvin. What determines F?
That it's only the energy, because the entropy term is irrelevant at zero kelvin. And at zero kelvin, where are you? There, right? Then it's only the energy that's important. In the other limit, if we forget about the gas phase for a second, as the temperature increases more and more, which of these terms is going to matter more? The entropy term becomes more and more important. And at some point it's going to be more important for the system to have high entropy, because then the −TS term makes the whole free energy low, than to have low energy. And what happens at the exact point where these terms trade places? Can you guess the temperature at which that happens? Zero degrees centigrade, because that's where you get the phase transition. Suddenly it's more important to be in the other state, and the ice will melt, and you will be in the liquid phase. Now I'm just hand-waving, but what you might start to guess is that this is also going to be the case for proteins. This is very much related to proteins; folding is not strictly a phase transition, but it's a similar process. And you can even use this to explain the hydrophobic effect, which is slightly more complicated than understanding the melting of water. So I would suggest that as your homework task for Friday morning; we'll talk about it a little bit then. Try to use this equation to explain the hydrophobic effect. And this is what I mentioned, right? Superficially, this looks like a super easy equation, but it's one of the most powerful equations we have in chemistry. I would even say the most powerful, because it helps you understand things that you can't understand otherwise. With that equation you can explain exactly how the hydrophobic effect happens, why there's this relation between hydrogen bonds and where the waters want to move, and everything. It's pretty cool. And if you give up, you can look at the end of chapter four in the Finkelstein book.
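The whole ice-versus-liquid argument can be written down as a toy two-state model. The energies and entropies below are made-up illustrative numbers in arbitrary units, chosen only so the crossover lands at 273 K; the real content is just F = E − TS.

```python
# Two-state sketch of the melting argument: each phase has a fixed
# energy E and entropy S (made-up values), and the stable phase at
# temperature T is whichever has the lower free energy F = E - T*S.

def free_energy(E, S, T):
    """Free energy F = E - T*S of a phase at temperature T."""
    return E - T * S

E_ice, S_ice = -10.0, 0.01               # low energy, low entropy
E_liq, S_liq = -8.0, 0.01 + 2.0 / 273.0  # higher energy, higher entropy

def stable_phase(T):
    """Return the name of the phase with the lower free energy."""
    f_ice = free_energy(E_ice, S_ice, T)
    f_liq = free_energy(E_liq, S_liq, T)
    return "ice" if f_ice < f_liq else "liquid"

print(stable_phase(250.0))   # ice     (the energy term wins)
print(stable_phase(300.0))   # liquid  (the -T*S term wins)

# The trade-off point is where the two free energies are equal:
# E_ice - T*S_ice = E_liq - T*S_liq, i.e. T = dE / dS.
T_melt = (E_liq - E_ice) / (S_liq - S_ice)
print(round(T_melt))         # 273
```

The same crossover logic, with folding enthalpy against chain and solvent entropy, is the hand-waving version of protein stability referred to above.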
So what I'm doing here is gradually moving over to physics. We will come back to the amino acids and everything, but the reason we only come back to amino acids in lecture two is that all the things I hand-waved about yesterday, I can now start to treat properly with free energy. What is the hydrophobic effect? Why is a particular amino acid good or bad in a given place? We can start using the free energy concept to answer that, with much more rigor. And before that, you're going to spend the entire afternoon tomorrow in the computer lab, playing around and getting a good gut feeling for this, which is more important than you think. You need to have a good gut feeling for the Boltzmann distribution. And then we have a bunch of study questions that I will go through on Friday morning.