the recording then. So welcome everyone, also if you're watching this on Moodle. Today is proteins. We're doing all the biomolecular levels in order, one by one: we started off with DNA, then we had RNA, and of course the next level is proteins. There will be a little bit of overlap. I will be talking again about the ribosome, because it's important for protein synthesis as well, and we will add some more information about things like wobble bases. And we will begin to introduce some of the techniques, like mass spectrometry and NMR, that are used to measure these 3D and 4D structures. But we'll get to that.
So did I make an overview slide? Yeah. The overview starts, of course, with history, because proteins have a history as well, slightly different from that of RNA and of DNA. We will talk about structure, because when we're talking about proteins, structure is everything. In DNA, sequence is everything. In RNA, it's kind of a mix of sequence and structure. But in proteins, structure is the only thing that counts. We will talk a little bit about how to identify proteins, like how to do purification. And then we will do some function prediction, so looking at protein domains. And then there's going to be a very complex and confusing part in which I try to explain to you guys what an orthologue, a paralogue and a xenologue are, and the difference between an in-paralogue and an out-paralogue, and all of these things. And then we will also have a little bit of phylogenetic trees.
There will be assignments. Last week I was planning to make assignments, but I submitted two papers in the last week, and besides the papers I was doing a job application. That took up all of my time, because it had to be finished before yesterday, and we spent a lot of effort on that. But today, proteins, so let's just start.
I think actually, if we're quick, we can do it in like two hours. But of course, if I'm rushing through it and you want to ask questions, then feel free.
All right. So the history of proteins starts around 1800, slightly before, so like 1795. At that point people were doing a lot of chemistry. It's the Age of Enlightenment, right? The discovery of electricity and optics and these kinds of things made it possible for new discoveries to be made. Around 1800 they figured out that proteins were a very distinct class of biological molecules. And only 38 years later we have the first real description of what a protein is: that a protein more or less consists of amino acids, so that it's just a chain built up out of them.
In 1895 we have the discovery of X-rays, by Röntgen. But then it takes another 20 to 40 years until the first X-ray diffraction experiments, and only then do people start learning about the 3D structure of proteins. Because when you do X-ray diffraction, you're able to figure out where the side chains of the amino acids are. You can see that there are different side chains, and you can start getting an idea of how a protein is shaped.
In 1926 we have "proteins are enzymes". So people figured out that proteins are actually enzymes. That means that they are involved in chemical processes, right? They can speed up or slow down the transformation of one chemical into another, but they are not used up in that process. They do take part in the chemical reaction, but they don't get consumed during it. And that is of course very important, because if you used up proteins in chemical reactions, then in the end you would have to produce a lot of proteins. In 1933 they worked out the theory of the secondary structure of proteins.
Before, people had no real idea of what proteins would look like. They did do X-ray diffraction experiments, but the theory of the secondary structure of proteins is when people started realizing that hydrogen bonds and sulfur bonds are the main ways in which a protein chain gets more or less attached to itself. And that gives it a certain secondary structure.
In 1946 there's the development of nuclear magnetic resonance, NMR. The MRI machine in a hospital is based on the same principle; outside of a hospital, such a machine is called an NMR machine. This is a way to study not only the structure of proteins, but also the dynamics of proteins. So you can see how they move, right? You can see them in action. When you do protein X-ray diffraction experiments, the big issue is that you have to make a crystal first, and of course within this crystal the protein is unable to move. But with NMR you can actually observe proteins doing their enzymatic reaction. Well, you don't literally see them move, but you can get data on what is going on in a more dynamic way.
All right, 1949 was a very important year, because this was the year in which people first succeeded in making synthetic insulin. Insulin is the molecule that you are lacking when you are diabetic. Type 1 diabetics are born without the ability to make insulin, or they have massive death of the beta cells in the pancreas, and that doesn't allow them to make insulin. So the synthesis of insulin is a major leap forward, because it made it much, much cheaper for these people to stay alive. Before that, people used insulin from cows. Every time you needed insulin, you would go to a pharmacist, and the pharmacist would buy his insulin at the slaughterhouse. He would buy a whole bunch of pancreases from cattle, and then the insulin would be extracted from the cattle pancreas.
Of course, there are some issues there, because cattle insulin is slightly different from human insulin: a few amino acids in the protein itself differ. But synthetic insulin was made for the first time in 1949, and that is a major, major invention.
In 1958 we have the first protein structures being published, and not far behind we have the first real electron microscopy crystallographic experiment. This is using a microscope which works with electrons. You cover a crystal, or you cover a protein, with gold atoms, and then you have a very fine needle which allows you to more or less scan it. It's not really a microscope, because you're not looking through an ocular, but it is in a way a microscope in that it uses the same idea.
In 1967 the first protein structure by X-ray crystallography was published, and this is the "FIST". There's a typo in this slide; I will fix it before I put it on Moodle. And then three years later, in 1970, the bioinformaticians became really involved, because this is when the first protein database was established. The PDB is one of the oldest databases in the world. It was founded in the 1970s and it's still operated today. So it's one of these databases which is more or less older than the internet: even before we had the real internet as we know it today, with HTTP, the protein database was already there. The protein database holds protein sequences and protein structures, and nowadays it does a lot more.
All right, 1975 brought a big advance in the study of proteins and protein mixtures: the invention of 2D gel electrophoresis. Before that, proteins would be purified through centrifugation and other techniques.
But in 2D gel electrophoresis you separate the proteins along two axes. One of the axes separates them based on the, what's it called, the isoelectric point. The isoelectric point of a protein is the pH at which the protein is neutrally charged, because every protein has a little bit of charge, so it's either a little bit acidic or a little bit basic. You can use that: when you make a pH gradient, you can separate proteins by their isoelectric point. On the other axis you separate them based on size. And then you can do protein identification, because if you know the isoelectric point and you know the size of a protein, you know more or less which protein it has to be.
Then 1976 brings a major advance in computer graphics: the first visualization of a protein structure. It's very similar to the thing that I showed you in the RNA lecture, where we looked at the ribosome. This is the first time that anyone used a computer to visualize a 3D protein structure, so that's a big advance. In 1981 we have the invention of the ribbon diagram. We will talk a lot about ribbon diagrams; they are just an easy way to draw the 3D structure, or the tertiary structure, of a protein. And in 1999 the ribosome structure was more or less solved. Using X-ray diffraction experiments combined with computer modeling, they were able to make a ribosomal model which is more or less still valid today.
And every time we talk about a protein, you have to remember that you can't observe a protein through a microscope, unless you're using electron microscopy. So when we are talking about structure, we're talking about a model, a kind of mathematical model of how the protein chain is folded in 3D space. And when we talk about the quality of a structure, we're talking about how accurate the model is compared to the experiment.
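The isoelectric point described above can be estimated from a sequence alone. The following is a minimal sketch, not a definitive method: the pKa values are approximate textbook-style numbers chosen for illustration (published tables differ), the function names `net_charge` and `isoelectric_point` are made up for this example, and the model ignores folding and disulfide bonds. It sums the partial charges of all ionisable groups at a given pH and then searches for the pH at which that sum is zero.

```python
# Approximate pKa values; these are illustrative assumptions, not lecture data.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}            # positive when protonated
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}   # negative when deprotonated
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 2.0


def net_charge(sequence, ph):
    """Estimated net charge of a linear polypeptide (one-letter codes) at a pH."""
    # Henderson-Hasselbalch: fraction of a group that carries its charge.
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))     # free amino end (+)
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))    # free carboxyl end (-)
    for residue in sequence:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Bisection search for the pH at which the estimated net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        middle = (low + high) / 2
        if net_charge(sequence, middle) > 0:
            low = middle      # still positive: the neutral point is at higher pH
        else:
            high = middle     # already negative: the neutral point is at lower pH
    return (low + high) / 2


if __name__ == "__main__":
    for peptide in ("DDDD", "KKKK"):
        print(peptide, round(isoelectric_point(peptide), 2))
```

Bisection works here because the estimated charge only ever decreases as the pH goes up, so there is exactly one crossing through zero. A peptide rich in aspartic acid ends up with a low isoelectric point and one rich in lysine with a high one, which is the property the first axis of the 2D gel exploits.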
So you can have a structure which is solved to a resolution of two angstroms, or solved to a resolution of 1.4 angstroms. The smaller the number, the more the model looks like the real protein in crystal form. That's just something that you have to keep in mind. When we talk about the structure of a protein, we're always talking about a model. Models are not always false, but models are never the reality. The reality is more complex. When we talk about proteins, it's the distance of the model from the real protein structure that counts, and that is measured in angstroms, which is a unit for atomic distances.
All right, so first off, some nomenclature about proteins. An amino acid is a building block, so it is a very small chemical molecule. Usually when we talk about amino acids, we talk about the 20 or the 21 amino acids that are essential for building proteins. An amino acid is a fundamental unit, and you can link these together, because every amino acid has an N-terminus and a C-terminus, and the C-terminus of one amino acid can couple to the N-terminus of the next amino acid. Then you can make a chain: just like DNA is a chain of base pairs, a protein is a chain of several amino acids.
However, nomenclature-wise, we have to be very careful. When we talk about one chain of several amino acids coupled together, we are actually talking about a polypeptide. If you have one or more polypeptides together, we're talking about an apoprotein, as long as we do not include the co-factors. Co-factors could be things like an iron, zinc or copper ion. These things are not part of the amino acid chain, so they are not part of an apoprotein. But when we have one or more of these polypeptide chains together, then we're talking about an apoprotein.
And when we're talking about a protein, we mean the finished product. So it's an apoprotein, but with the co-factors included, like the iron or the zinc or the other molecules which are not part of any of these chains but are required for the functioning of the protein. All right, so I hope that's clear.
Amino acids are called that because they are amino carboxylic acids. That means that they have an amino group, so an NH2 group, at one end, and they have a carboxylic acid group, which in chemistry is a C double-bonded to an O, plus an OH group. Of course, when you put this in water, this part will lose the H. If you dissolve a single amino acid, it will create an H3O+ ion, which is acidic, and the negative charge that is left on this part will be distributed between the two Os. So the H can drop off in water and there's a small negative charge. This is counterbalanced by the other part, because that is a C bound to an N with two Hs, and in water it has the ability to become NH3, more or less. It won't bind the extra H permanently; it's a kind of balance. Sometimes you will have two hydrogens coupled to the N and sometimes three, so there's a little bit of residual positive charge.
So if you look at a single amino acid, dissolving it in water gives a slightly negative charge on this end and a slightly positive charge on that end, and that is why two of them can be connected together. This is called the primary carbon atom, and this is the secondary carbon atom. The side chain, so the R here, is the thing that determines which amino acid we are looking at.
So the most elementary amino acid, of course, is when the side chain here is just a single H. There's always one H at the side already, so then there are two Hs, and this is the simplest amino acid that you can imagine. It's called glycine. The general structure of any amino acid is just this, with the side chain here, and the side chain can be anything, and I mean literally anything. There are like 200, 300 plus amino acids known to man, and they all have different things on this side chain. For example, if the side chain is a CH3, so what's that called again? You guys in chat probably know. CH3 as a side chain, my chemistry is failing me here, but it doesn't matter. With the CH3 it's called alanine. But you can also have a benzyl group here, which contains a ring made out of carbon atoms. Methyl, that's it, it's a methyl group. But hey, it can be as complex as you want. One of the results of this is that an amino acid is actually chiral. So yeah, atelbe, thank you, methyl group, yeah.
So hey, if you have a glycine and you add a methyl group to the alpha carbon atom, it will transform into an alanine. Almost all amino acids that exist are chiral, and that is because this C atom has four different groups attached to it. Glycine is of course not a chiral molecule, because you have two Hs here. I have a different slide about chirality to explain that, but just remember that amino acids have a right-handed form and a left-handed form, and naturally occurring amino acids are almost always in the L form, at least in humans and other eukaryotes. There are some bacteria which produce D-form amino acids, but they are very uncommon. And like I told you, there are a lot of amino acids: because this side group can be anything, there are around 500 different amino acids that are found in nature and that are known. And of course, in theory, you could nowadays synthesize any amino acid that you want and put anything on this side group.
All right, so chirality arises because this C atom has the ability to make four bonds. Here you have the C atom in the middle, here the carboxyl group, and here the amino group. It's just the previous slide turned on its side. Here you see the H, and here you see the R group. Because there are three little legs here and one little leg here, at different angles to each other, there is no way that you can take this thing in 3D and turn it in such a way that it becomes this one. So this one is the left-handed form and this one is the right-handed form. In nature, almost only these occur.
If you chemically synthesize amino acids, you have to be very careful, because normally you would produce both variants. If you start with glycine, with just two Hs, and add a side group in a chemical reaction, both versions will be produced. This is very important in medicine, because there are a lot of cases where synthesis goes wrong: they want to synthesize L amino acids, but they get a mixture of L and D amino acids, and the D form of a molecule can be toxic to humans while the L form has beneficial properties.
If you think about chirality, it just means that for this molecule, if the R group is not an H, there is no way to turn this one into that one. You can visualize it in your head as a little tripod with a stem pointing up. You can turn it around the stem and you can flip it around, but if you have COOH, NH2 and R in one order, that is different from having them in the other order. So that's chirality.
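The tripod picture can be checked numerically. The sketch below is a geometric illustration under stated assumptions, not real molecular coordinates: the three direction vectors are idealised tetrahedral positions for the COOH, NH2 and R groups around the central carbon, and the function names are invented for this example. The sign of the triple product of the three directions acts as a handedness label; turning the molecule keeps the sign, mirroring it flips the sign. Which sign corresponds to L and which to D is deliberately left open.

```python
import math


def triple_product(a, b, c):
    """a . (b x c): signed volume spanned by three direction vectors."""
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2]


def handedness(cooh, nh2, r_group):
    """+1 or -1 depending on the orientation of the three substituents."""
    return 1 if triple_product(cooh, nh2, r_group) > 0 else -1


def rotate_about_z(vector, angle):
    """Turn a vector around the z axis (turning the tripod around its stem)."""
    x, y, z = vector
    return (
        x * math.cos(angle) - y * math.sin(angle),
        x * math.sin(angle) + y * math.cos(angle),
        z,
    )


def mirror(vector):
    """Reflect a vector in the y-z plane (the mirror image of the molecule)."""
    return (-vector[0], vector[1], vector[2])


# Idealised tetrahedral directions from the central carbon (assumed geometry).
COOH = (1.0, 1.0, 1.0)
NH2 = (1.0, -1.0, -1.0)
R_GROUP = (-1.0, 1.0, -1.0)

original = handedness(COOH, NH2, R_GROUP)
rotated = handedness(*(rotate_about_z(v, 1.0) for v in (COOH, NH2, R_GROUP)))
mirrored = handedness(*(mirror(v) for v in (COOH, NH2, R_GROUP)))

if __name__ == "__main__":
    print("original:", original, "rotated:", rotated, "mirrored:", mirrored)
```

No rotation turns the `+1` arrangement into the `-1` one, which is the statement made with the tripod. For glycine the label loses its meaning, because R is just a second H and swapping two identical groups gives the same molecule.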
I hope that's understandable. I always found chirality a really hard concept, and in chemistry it is also pretty hard to predict which one of the two forms you get, because normally you have standard chemistry, but this depends on stereochemistry. So, autobases: Contergan is an example for side effects of chiral molecules. All right, yeah.
I know that the way they used to measure whether something was L or D was to put it in a little cuvette. Normally, if you're working in a lab, all of the containers are round, but a cuvette is a little square holder where you can put a liquid in. Then there is a little slit where you can shine a light through, and behind it you have a polarization filter, like in sunglasses, which you can turn to the left and to the right. When you turn the filter, at a certain angle the light will extinguish. If you have a mixture which consists of the L form, you turn the filter to the left to extinguish the light, and if you have a mixture which contains D amino acids, or other stuff in the D form, then you have to turn the filter to the right. So that's where this stereochemistry naming comes from. I don't want to dwell too much on it, but just remember that if there's a question on the exam and the question is "most amino acids are in the ___ form", then you have to write "L form", right?
It's that simple. So all right, let's quickly go through the 21 amino acids that are there and show the different side groups. Of course all of these will have a COOH group, right, and they will have an NH2 group. So this is the C-terminus and this is the N-terminus of the single amino acid, and here we see the side group. This one is a very long side group: it has one, two, three C atoms, and then you have this part which is composed of a nitrogen with other nitrogens bound to it. All of the amino acids in the A group here are amino acids which have electrically charged side chains.
So why are we doing only 21, why are we not going through all 500? Well, first off, time is limited and I don't want to do all 500 of them. But these are the ones which are essential for building proteins. If you're missing any one of these, then you actually die. You can't live without having these 21, because if you don't have all 21 of them you can't build, for example, a ribosome, and if you can't build a ribosome you can't make new proteins. So these are the ones which you absolutely have to have before you can start doing anything.
So arginine, histidine and lysine are the three amino acids which are positively charged, and most of this positive charge is located on an NH3 group. In NH3 the extra H can fall off again, so this N actually has four bonds, while a nitrogen normally can only have three. So arginine, histidine and lysine are the positive ones. Then there are two negative amino acids, aspartic acid and glutamic acid, and these have a carboxyl group here in the side chain.
Of course this part here, the backbone, also has a charge at each end. But because you have many amino acids behind each other in a chain, there's only a little bit of negative charge at the C-terminus and a little bit of positive charge at the N-terminus, and that kind of evens out. The side chain charges are real charges. If you dissolve aspartic acid in water, this will cause the water to become acidic, and the positive ones will cause the water to become basic, so the opposite of acidic.
All right, so those are the first two groups. Then we also have amino acids with uncharged polar side chains. These don't have any charge, and they are not the ones which like being near fat. They have chains of C atoms, but not too long a chain, and they love dissolving in water because they are polar: they have, for example, an OH group, so they feel very good when they are in water. If you're thinking about a protein, then usually the outside of the protein, the part which is in contact with water, will be made out of serine, threonine, asparagine or glutamine. So these are not the ones that like cell membranes; these are the ones which like being in water.
There are some special cases, and the special cases are mostly there for structural reasons. Cysteine and selenocysteine will come back later, because they have a sulfur atom (or selenium, in selenocysteine) in their side chain, and these are able to form bridges. You can have a real atomic (covalent) bond between a cysteine and another cysteine, which means that physically they are coupled together, just like the NH2 group is coupled to the next carboxylic group. So these cause physical attachments, not like metal-ion binding, but a real atomic bond between two parts of the same chain.
Then you have glycine. Glycine is of course weird because it has no side chain; it's the basis on which everything is formed. And then you have proline, and proline is a little bit weird because it has a ring structure, and the ring is coupled to the amino group. Normally you would have an NH2 group here, but proline doesn't follow the standard structure. Here there should be the side chain, but the side chain folds back and touches the amino group, so the secondary carbon atom is connected to the N atom twice. That's why it's a little bit strange, and this makes it planar. It is a flat amino acid: all of the others have things sticking out or sticking to one side, but proline is actually flat, so it's a kind of real 2D molecule.
All right, and then we have the remaining amino acids, and these are the amino acids with hydrophobic side chains. The hydrophobic ones are the ones which like being in the cell membrane, because they have these carbon chains, or a lot of carbon, like little carbon grappling hooks. Methionine is a little bit interesting, because it also has a sulfur atom. But all of these are very much into being in an environment with other carbon atoms, so they don't like to be in water; they like to stick together and clump up. If you remember from chemistry, you have things which are hydrophilic and things which are hydrophobic. These are hydrophobic, so they don't like water, and the hydrophilic ones, the ones we saw before, are the ones which like water. So the hydrophilic ones are on the outside of the protein, and the hydrophobic ones are generally found on the inside of the protein. Or they are found on the outside, but then the protein is embedded in the cell membrane, because the cell membrane is of course made out of fat molecules.
All right, so those are all the amino acids, so 21. You don't have to learn them all, but you have to remember which groups there are, and remember that glycine is the only one which is not a chiral amino acid.
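The grouping that matters for the exam can be written down as a small lookup table. This is a sketch under assumptions: the lecture names the members of the charged, polar and special groups explicitly, while the membership of the hydrophobic group (including tyrosine) follows a common textbook grouping rather than anything stated above; the one-letter codes and the names `AMINO_ACID_GROUPS` and `group_counts` are used here only for illustration.

```python
from collections import Counter

# One-letter codes per group; U is selenocysteine.
AMINO_ACID_GROUPS = {
    "positively charged": "RHK",   # arginine, histidine, lysine
    "negatively charged": "DE",    # aspartic acid, glutamic acid
    "polar uncharged": "STNQ",     # serine, threonine, asparagine, glutamine
    "special case": "CUGP",        # cysteine, selenocysteine, glycine, proline
    "hydrophobic": "AVILMFYW",     # assumed membership, see the note above
}

# Reverse lookup: one-letter code -> group name.
GROUP_OF = {
    code: group
    for group, codes in AMINO_ACID_GROUPS.items()
    for code in codes
}


def group_counts(sequence):
    """Count how many residues of a sequence fall into each group."""
    return Counter(GROUP_OF[residue] for residue in sequence.upper())


if __name__ == "__main__":
    # A short made-up peptide, just to show the output format.
    print(group_counts("MKDSGA"))
```

Counting groups along a sequence gives a first rough idea of whether a stretch prefers water or a membrane, in line with the rule of thumb that hydrophilic residues sit on the outside and hydrophobic ones on the inside.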
All right, so then the next step: we go from amino acids to polypeptides. Polypeptides are amino acids chained together using peptide bonds. A peptide bond is what you get when you take an amino acid and bind it to another amino acid. Here we see the N-terminus of the first one. Which one is this, people, with the methyl group? So it's glycine with a methyl group. Quick question for you guys, just throw it in chat. No answer, no guesses? You guys knew it was a methyl group, but now you've already forgotten the name. All right, the name of course is alanine. Very good, alanine.
No, not glycine. Glycine is the one without any side chain. So here, the third one, this is glycine, because it has this H here, which means that there are two Hs, so it's glycine. And if you have a CH3 group, so a methyl group, attached to it, it's alanine. I always get them mixed up anyway. Don't remember the names, it's just "methyl glycine". Oh, that's a nice one, yeah, you could always say something like that. No, we're not going to call it methyl glycine.
But it's not that important what they are called. I'm also not going to have you learn all of the short names for them. Every amino acid has a full name, a three-letter code and a single-letter code, but that's not what you need to learn. What you need to learn is that they come in four to five groups: they are electrically charged, so positively or negatively charged; they are polar, or hydrophilic, meaning that they like water; there are some special cases, like the ones which have the sulfur group; and then you have one big group, and those are the ones which are hydrophobic, so those are the ones that do not
like water. If you know it in that detail, that's more than enough for me.
If we then talk about polypeptides, so making a chain out of different amino acids, we always arrange them from the N-terminus to the C-terminus. In DNA we write everything from five prime to three prime so as not to confuse ourselves; in protein work we always go from N to C. Those letters are not in that order in the alphabet. With five prime to three prime, five is bigger than three, so that might make sense as a rule of thumb, but going from N to C doesn't really make sense. Just remember that you write them from the N-terminus towards the C-terminus. Writing them the other way around is just wrong.
All right, and these things are called peptide bonds. When the carboxylic group of one amino acid is bound to the amino group of the other one, the OH is used up, so there's no COOH anymore, and the C couples to the N. This is a peptide bond, and these are atomic bonds, so they are very strong. You can't easily break them, and that's why a polypeptide or a protein generally has a very tough structure.
All right, so the primary structure is very comparable to DNA. Like I told you, we write it down from the N-terminus to the C-terminus. However, if you have cysteines in there, it is not a normal linear structure. If you write down a sequence of DNA, you say CT, AG, AT, CC, CG, AT, blah. But in polypeptides it's not that simple. If you write down the primary structure of a polypeptide, the problem is that you sometimes have cysteines, which have this sulfur group, and the sulfur group couples to another sulfur group. Here, this is the primary structure of oxytocin, the happy hormone, so the thing that makes you feel happy, and this hormone has two cysteines in there and these
couple together. So the proper way to write this down is to start with the cysteine, because that's at the N-terminus, then you have tyrosine, isoleucine, glutamine, asparagine, and then you have another cysteine, and this cysteine is physically attached to the first cysteine, because the two sulfur atoms form an atomic bond. So the cysteine here is coupled to its neighbours in the chain, but it's also coupled to the other cysteine via this bridge. You already see that some structure starts getting into the sequence here, because the sequence is not just a linear sequence when you talk about proteins; it is more or less already a kind of flat surface. Then the cysteine goes on to proline, leucine and glycine, and then we are at the C-terminus.
So this is a very small protein, or a very small polypeptide, but the structure is already a little bit more complex. If you wrote this down as just cysteine, tyrosine, isoleucine and so on, and you did not deal with this S-S, so this sulfur-sulfur bond, it would be wrong. If you just write the residues in a sequence, like you would do with DNA, that is not the proper primary structure. And oxytocin, the happy hormone, is a very basic example.
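One way to keep the sulfur-sulfur bond as part of the primary structure is to store it next to the sequence. The following is a minimal sketch, assuming a made-up `Polypeptide` class, three-letter codes written from N-terminus to C-terminus, and 1-based residue positions; it is one possible representation, not a standard file format.

```python
from dataclasses import dataclass, field


@dataclass
class Polypeptide:
    """A chain written N-terminus first, plus its disulfide bridges."""
    residues: list                                        # three-letter codes
    disulfide_bonds: list = field(default_factory=list)   # (position, position), 1-based

    def validate(self):
        """Every end of a disulfide bridge must sit on a cysteine."""
        for first, second in self.disulfide_bonds:
            for position in (first, second):
                if self.residues[position - 1] != "Cys":
                    raise ValueError(f"residue {position} is not a cysteine")
        return True

    def describe(self):
        """Sequence followed by the bridges, so the S-S bond is not lost."""
        chain = "-".join(self.residues)
        bonds = ", ".join(f"Cys{a}-Cys{b}" for a, b in self.disulfide_bonds)
        return f"{chain} (S-S: {bonds or 'none'})"


# Oxytocin as read out above: nine residues, one bridge between Cys1 and Cys6.
oxytocin = Polypeptide(
    residues=["Cys", "Tyr", "Ile", "Gln", "Asn", "Cys", "Pro", "Leu", "Gly"],
    disulfide_bonds=[(1, 6)],
)

if __name__ == "__main__":
    oxytocin.validate()
    print(oxytocin.describe())
```

Storing the bridge as a pair of positions keeps the linear N-to-C order intact while still recording which cysteine binds which, which is exactly the information that a plain sequence string would drop.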
If we look at insulin, for example, the molecule which is very important when you have diabetes, then writing down the primary structure is already much, much more difficult than with oxytocin. That is because these cysteines, which have these S-S bonds, so these sulfur-sulfur bonds, can actually bind two polypeptides together. So here we're talking about a protein, or an apoprotein. I don't know exactly if insulin has a cofactor, but if it has one, then this would be an apoprotein, because we're not writing down the cofactor.
Insulin actually consists of two polypeptide chains: you have the alpha chain and you have the beta chain. Because you always write from N-terminus to C-terminus, you have to do a little bit of trickery here, but these two chains are physically attached to each other. Counting 1, 2, 3, 4, 5, 6, 7, the cysteine at position 7 connects the alpha chain to the beta chain, and again 1, 2, 3, 4, 5, 6, 7, it is the cysteine at position 7 of the beta chain that is connected to the cysteine at position 7 of the alpha chain. The cysteine at position 6 of the alpha chain is coupled to the cysteine at position 11 of the alpha chain. And then it continues, and the cysteine one before the end of the alpha chain is coupled again to the beta chain. So you can see that this already becomes a lot more complex, right?
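The insulin bookkeeping can be sketched in the same spirit. Assumptions to flag: the two sequences are the human insulin chains as commonly listed in sequence databases and should be verified against a database entry before being relied on; the chains called alpha and beta above are labelled A and B here, as is usual in the literature; the bond at B19 is the partner of the cysteine "one before the end" of the first chain; and the helper names are invented for this example.

```python
# Human insulin chains in one-letter code, N-terminus first (check against a
# sequence database before relying on them).
INSULIN_CHAINS = {
    "A": "GIVEQCCTSICSLYQLENYCN",             # 21 residues ("alpha chain")
    "B": "FVNQHLCGSHLVEALYLVCGERGFFYTPKT",    # 30 residues ("beta chain")
}

# Each bond joins two (chain, 1-based position) ends.
INSULIN_DISULFIDES = [
    (("A", 6), ("A", 11)),    # within the A chain
    (("A", 7), ("B", 7)),     # between the chains
    (("A", 20), ("B", 19)),   # between the chains, one before the end of A
]


def cysteine_positions(sequence):
    """1-based positions of every cysteine in a chain."""
    return [index for index, residue in enumerate(sequence, start=1) if residue == "C"]


def bonds_are_valid(chains, bonds):
    """True if both ends of every bond sit on a cysteine."""
    return all(
        chains[chain][position - 1] == "C"
        for bond in bonds
        for chain, position in bond
    )


def split_bonds(bonds):
    """Separate bonds within one chain from bonds that link two chains."""
    within = [bond for bond in bonds if bond[0][0] == bond[1][0]]
    between = [bond for bond in bonds if bond[0][0] != bond[1][0]]
    return within, between


if __name__ == "__main__":
    print(cysteine_positions(INSULIN_CHAINS["A"]))
    print(bonds_are_valid(INSULIN_CHAINS, INSULIN_DISULFIDES))
    print(split_bonds(INSULIN_DISULFIDES))
```

Naming each bond end by chain and position is what makes the two-chain case writable at all: a single N-to-C string cannot express that residue 7 of one chain is attached to residue 7 of another.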
But since the primary structure is based on atomic bonds, you do have to take care of these atomic bonds when you write it down. I think this is enough about primary structure, but primary structures in proteins are already complex and difficult to write down, because you have to know which cysteine binds which cysteine. And then you still have selenocysteine, which can also do that. But remember that the primary structure of proteins or polypeptides is based on atomic bonds, so not on hydrogen bonds, not on van der Waals forces or other things. All right, so then we have the next level of structure, because primary structure is only the first level. It is relatively easy for a computer to figure out. Since we're doing bioinformatics, I want to say that if you give a computer the two polypeptides of insulin as a non-folded structure, so just the amino acids in order, then it can figure out for you which cysteines are going to bind together, based on the distances. So for a computer, this is relatively easy to figure out.
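As a rough illustration of what "based on the distances" can mean, here is a minimal sketch that pairs up cysteine sulfur atoms that sit close enough to be bonded. It is a simplification under stated assumptions: it needs 3D positions of the sulfur atoms as input, so it is not a prediction from the sequence alone. The coordinates are invented toy numbers, not measured insulin data, and the 2.5 Å cutoff is an assumed threshold chosen to sit a bit above the roughly 2.05 Å length of a sulfur-sulfur bond. The function name `find_disulfides` is hypothetical.

```python
import math
from itertools import combinations


def find_disulfides(sulfurs, cutoff=2.5):
    """Return pairs of cysteine labels whose sulfur atoms lie within `cutoff` angstrom.

    `sulfurs` maps a label such as "A7" to an (x, y, z) position in angstrom.
    """
    pairs = []
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(sulfurs.items(), 2):
        if math.dist(xyz_a, xyz_b) <= cutoff:
            pairs.append((name_a, name_b))
    return pairs


# Toy sulfur positions (invented for this example), labelled chain + residue number.
toy_sulfurs = {
    "A6": (0.0, 0.0, 0.0),
    "A11": (2.0, 0.3, 0.0),
    "A7": (10.0, 0.0, 0.0),
    "B7": (10.0, 2.0, 0.4),
    "A20": (0.0, 12.0, 5.0),
    "B19": (1.9, 12.0, 5.5),
}

for left, right in find_disulfides(toy_sulfurs):
    print(f"{left} is bonded to {right}")
```

With real data the positions would come from a measured or modelled structure, and the same distance check would then recover the three insulin bridges described above.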
When we talk about secondary structure, we're talking about the 3D form of local segments of the biopolymer. A protein or a polypeptide can also be called a biopolymer. In proteins there are two main secondary structures. One of them is the alpha helix, which is when the chain rolls up into a helix, and the other is the beta sheet, which is when two stretches of chain lie more or less parallel to each other. A secondary structure does not describe the specific atomic positions in 3D space, it just tells you how the chain is folding up onto itself. This is still relatively well predicted by a computer, so computers are pretty good at predicting secondary structures as well as primary structures. Here you see an example of an alpha helix. Here we see the N-terminus, here we see the C-terminus, and here we see a piece of DNA, so this is a leucine zipper. The leucine zipper is a protein made out of two polypeptide chains, and the DNA in this case is its cofactor, so we're really looking at a protein. Every chain consists of two of these alpha helices. An alpha helix is formed when the NH group of an amino acid forms a hydrogen bond with the CO group of the amino acid four residues earlier. When you look at the turn of this helix, there are around 3.6 residues in each helical turn. The amino acid side chains are on the outside of the helix and point towards the N-terminus. So how does this look if we zoom a little bit into the alpha helix? The secondary structure, when we are talking about alpha helices and beta sheets, is still dependent on the atomic bonds, but on top of that
you are now incorporating hydrogen bonds between the N-H and C=O groups of the backbone. So it's not about side chains, this is about the chain of amino acids itself. Here we're talking about the H on the nitrogen of one residue coupling to the C=O group of a residue in the same chain, four residues away. You can see it here: this is one residue, two residues, three residues, and then four residues. So where are the side chains here? This picture doesn't really show them, but the side chains are of course on this C, so every time on this C there's a side chain, and the side chain is always pointing towards the N-terminus, so it's pointing kind of backwards. Here the side group would go this way, so if this were one of these long side chains with a lot of C atoms, it would always point this way, and this one would also point this way, and this one as well. That is because of the 3D structure, which points the side chains more or less towards the start of the helix. All right, so those are the alpha helices. Here it is shown as a helix, but when you draw them on paper in a secondary structure picture, they are shown as little barrels, so as cylinders. Beta sheets come in two forms. A beta sheet is a flat structure. If you think about proteins as being like construction vehicles, like cranes and elevators and these kinds of things, then the beta sheet is really flat, like a single-molecule layer in which the chains lie more or less in the same plane. They can be very big, so they can more or less make a whole square construction site to do things on. An alpha helix is something which is strong and a little bit flexible, a beta sheet is rigid and you can put things on there, so
that's how these things function. So again, it's a flat structure, it's based on hydrogen bonding, and the strand folds back onto itself. What we see here is really one strand, it's the same strand. If the N-terminus is here, then the chain loops around, and the C-terminus would be here. The same thing holds here: if this is the N-terminus, then it loops around, and this would be the C-terminus. There are two important forms of beta sheets, the parallel form and the anti-parallel form. In the parallel form the side chains are all pointing in the same direction, so both of them here are pointing left, and both of these are pointing right. For you guys it might be the other way around, but you can see that here both of the side chains point in the same direction, and here they point in the other direction. In an anti-parallel sheet it's different, because these two side chains are pointing towards each other, and then on the next amino acid they're pointing away from each other. So this is a very big difference if you talk about how these things are structured. A beta sheet is always shown in higher-level structures as an arrow, and this is where the ribbon diagram comes from. In the 1980s someone had the idea that if we draw a protein structure, then we should draw a beta sheet as an arrow and an alpha helix as a little cylinder. And that was enough for a Nature publication back then, just saying: alpha helix, that's a barrel, beta sheet, that's an arrow. So that's one major invention. They didn't get a Nobel Prize for it, but almost. All right, so here we see the tertiary structure, and the tertiary structure
of a protein is its three-dimensional structure as defined by atomic coordinates. The forces which determine the tertiary structure are not just atomic bonds and hydrogen bonds. Now we also take into account the ionic bonds from any ions, like plus-minus bonds, and we take the hydrophobic interactions into account, and all the disulfide bonds. But the disulfide bonds are part of the atomic bonds, so together with the amino acid sequence they make up the primary structure. The secondary structure then includes the hydrogen bonds, and the tertiary structure also includes the ionic bonds and the hydrophobic interactions. So here we see a protein, and if you count them, this protein consists of about 15 beta sheets, give or take. There is a very small alpha helix here, and then there is a very big alpha helix here, or this could actually be three different alpha helices. Here we see what a ribbon diagram can do for you, because it can show you the structure of the thing in three dimensions. But the tertiary structure of a protein is actually based on the three-dimensional coordinates. This is what you measure when you do X-ray diffraction experiments: you're determining the tertiary structure of a protein. And this is almost impossible for a computer to predict. I say almost impossible because we're getting better and better at it, and we will get back to that, because there has been a major advancement in the last year. I made one new slide for you guys about it. I thought it only warranted one slide, but it is a major advancement in protein bioinformatics, based on predicting these structures from the original sequence of amino acids.
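To show the difference between a folding pattern and actual atomic coordinates, here is a minimal sketch that turns the alpha helix rule (about 3.6 residues per turn, hydrogen bond to the residue four positions away) into idealized 3D positions. Only the 3.6 residues per turn and the four-residue spacing come from the lecture. The rise of 1.5 Å per residue and the radius of 2.3 Å for the alpha carbons are textbook approximations added as assumptions, and the function names are hypothetical. A measured tertiary structure would contain a coordinate for every atom, not just one point per residue.

```python
import math

RESIDUES_PER_TURN = 3.6   # from the lecture
RISE_PER_RESIDUE = 1.5    # angstrom along the helix axis (assumed textbook value)
CA_RADIUS = 2.3           # angstrom, approximate alpha-carbon radius (assumed)


def ideal_helix_ca(n_residues):
    """Return idealized alpha-carbon (x, y, z) positions for a straight alpha helix."""
    step = 2 * math.pi / RESIDUES_PER_TURN  # 100 degrees of rotation per residue
    return [
        (CA_RADIUS * math.cos(i * step), CA_RADIUS * math.sin(i * step), i * RISE_PER_RESIDUE)
        for i in range(n_residues)
    ]


def backbone_hbonds(n_residues):
    """Pair each residue i (C=O acceptor) with residue i + 4 (N-H donor), 0-based."""
    return [(i, i + 4) for i in range(n_residues - 4)]


coords = ideal_helix_ca(19)

# Neighbouring alpha carbons come out a bit under 4 angstrom apart.
print(f"CA(i) to CA(i+1): {math.dist(coords[0], coords[1]):.2f} angstrom")

# After 18 residues (5 full turns at 3.6 residues per turn) the helix is back
# at the same x/y position, 27 angstrom further along the axis.
print(f"Residue 18 sits at z = {coords[18][2]:.1f} angstrom")

print("Hydrogen-bonded pairs:", backbone_hbonds(8))
```

The list from `backbone_hbonds` is the secondary structure information, while `coords` is the kind of data a tertiary structure is made of.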
All right, so then we talk about the quaternary structure, because there's always an additional level in proteins. The quaternary structure is the arrangement of multiple folded or coiled proteins in a multi-subunit complex, with cofactors. Here we see the structure of hemoglobin. Hemoglobin is the thing that transports oxygen in your red blood cells. It is made up of four polypeptide chains, two alpha chains and two beta chains, so the red ones are alpha and the blue ones are beta, or the other way around. And then you have four iron ions. So here we include the cofactors: the cofactors are included in the quaternary structure, they are not in the tertiary structure. The tertiary structure is more or less the apoprotein, and this is the final level, the assembly of multiple apoproteins together with their cofactors. I think the iron ions are in the middle here, in the heme groups. The heme group is an iron ion surrounded by a little net, and oxygen comes in and is bound here, so every hemoglobin molecule can carry four oxygen molecules with it, to supply oxygen from the lungs to the muscles, for example.
All right, let me look at the time. I have been recording for 50 minutes, so I think we should do a short break so that I can stop the recording and show you guys some commercials. Again, I'm not in it for making money, I just find it funny that you already paid for a cup of coffee for me. I will be back in like 10 or 15 minutes. I have to go to the toilet and drink something because my throat hurts a little bit, and then we will continue talking about proteins, how to detect proteins, and how bioinformatics is involved in proteomics. All right, so...