Hello everyone, and welcome to today's lecture. We are now approaching the end of this course, so in the last five lectures I will give you a glimpse of NMR spectroscopy in structural biology: how to determine a peptide structure by NMR, then how we can understand protein-protein interactions by NMR spectroscopy, and finally how to get the shape and size of a protein or peptide molecule by NMR spectroscopy. These are the topics I will cover in the next five lectures.
Now let us start today's lecture. What we want to do today is this: if you have a primary sequence, given in terms of amino acids, can we take this amino acid sequence, record various sorts of NMR spectra, and translate that information into a three-dimensional structure? That is the overall goal.
As you know, three techniques are primarily used in structural biology for getting the atomic-resolution structure of a biomolecule. One is crystallography, where you need to crystallize the protein: you concentrate the protein and put it in a crystallization condition. If you get a crystal, this crystal must diffract, and after diffraction you can determine the structure of the protein. The second is NMR spectroscopy, where you need to solubilize the protein and record various sorts of NMR spectra, which is what we are going to discuss today, and at the end, using all those NMR parameters, you determine the three-dimensional structure. The third one, which is emerging, is cryo-electron microscopy, where you do not need to crystallize; you just spot the protein sample on a plate or sheet, put it in the electron beam, and that creates various diffraction patterns that are used for structure determination.
NMR is a technique that gives both structure and dynamics. As you have studied in relaxation, relaxation essentially gives the dynamics. In a separate course, sometime later, we will take up how to use NMR extensively for getting the structure and dynamics of a protein. But today I will just give you a glimpse of the steps involved in NMR spectroscopy, going all the way from the primary sequence of a protein to the three-dimensional structure.
The first step in NMR structure determination is sample preparation: how to prepare a clean sample that is amenable to NMR, whether it is a peptide or a protein. For today's lecture let us concentrate on peptides. You have to synthesize the peptide, either by bacterial expression or by chemical synthesis, and then you need to purify it so that you have maximum purity, more than 98% or 99%. Because we are doing solution-state NMR, the next step is solubilizing the peptide in an appropriate buffer. Many peptides are not soluble in water, so if you want a solution structure you have to choose a buffer in which the peptide is soluble.
Once the sample is prepared, the next step is data collection. We need to collect various sets of NMR data, as we discussed in the past. We need data for sequential resonance assignment, so we record COSY, TOCSY, or double-quantum-filtered COSY (DQF-COSY). Then we record the X-nucleus correlation spectra, like 13C HSQC or 15N HSQC, which I am going to discuss later. So that is data collection.
Once your data is collected, the next step is to assign those data. If you look in the background here, I have given a spectrum.
You need to identify what each of these peaks in the 2D data set means, and today I am going to give you some examples. That I call assignment of the peaks, or analysis of the data. After you assign these peaks, as we have discussed earlier, from the NOESY data set you can generate distances: looking at a NOESY peak tells you the distance between two protons, and that is called a restraint. So by measuring the intensity of those peaks one can generate the restraints, and once the restraints are generated, all of them can be used in structure calculation. So we need all those assignments and all those distance restraints, and then one can use them in structure calculation.
Once your structure is calculated and you get a reasonably good structure, you need to validate it, which means checking that it does not violate the allowed phi and psi torsion angles in the Ramachandran plot. That is called structure validation. After validation your structure is ready and is probably of good quality, and then we can deposit it in the Protein Data Bank (PDB). That is a single repository for all the structures determined by various scientists, and you can look at the coordinates of each of these structures.
So let us go through all of these one by one. If you ask me what the timeline is: sample preparation depends on whether you are getting the sample from a friend or synthesizing it yourself. Peptide synthesis can take a few days and purification about one day. Then you can dissolve it and collect data for two or three days. Assignment, if you are doing it manually, might take two weeks to one month or even longer, depending on how expert you are.
Once assignment is done, the major problem is solved. Generating the distance restraints takes a few hours. Structure calculation is an iterative process that you do step by step, so that may take a day or so, and then validation takes an hour or two. So in total, if you are doing it de novo and you are a student who is just starting, a structure determination may take two, three, or four months, but if you are an expert you can do it very fast, and with automated software it can now be done really fast.
So let us go one by one, starting with sample preparation. As I said, the sample first has to be pure and then it has to be soluble. Purity you can determine by various methods. You can run a mass spectrum, for example by MALDI, or you can run an SDS-PAGE gel. You should see a single band in the SDS-PAGE gel, or a single peak corresponding to this peptide in the MALDI spectrum, so at the expected m/z there has to be a single peak. If you get that, then you know your peptide is pure.
Next is solubilizing. You have to choose an appropriate buffer, and the concentration has to be decent enough. As we now know, NMR is one of the less sensitive techniques: whereas other techniques require just micromolar concentrations, say 10 micromolar, for NMR you need a few hundred micromolar or maybe even millimolar concentration. So typically for a small peptide I would say 1 to 3 millimolar of peptide should be dissolved in 0.6 ml of a suitable solvent, mostly water, and you need 10% D2O, deuterium oxide, just for locking the magnet, so that the field does not drift during the experiment and you get a very clean spectrum. The sample should also be really transparent: there should not be any turbidity, and the sample should be clean and clear.
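As a rough sketch of the arithmetic behind these sample numbers, the mass of peptide needed is simply concentration times volume times molecular weight. The molecular weight of 2200 g/mol used below is an assumed value for a hypothetical peptide of about 20 residues, not a number from this lecture, and the function names are illustrative only.

```python
def peptide_mass_mg(conc_mM: float, volume_mL: float, mol_weight: float) -> float:
    """Mass of peptide (mg) needed for a target concentration and volume.

    conc_mM    : target concentration in millimolar
    volume_mL  : sample volume in millilitres
    mol_weight : molecular weight in g/mol (assumed, peptide-specific)
    """
    moles = (conc_mM * 1e-3) * (volume_mL * 1e-3)  # (mol/L) * L
    return moles * mol_weight * 1e3                # g -> mg


def d2o_volume_uL(total_volume_mL: float, d2o_fraction: float = 0.10) -> float:
    """Volume of D2O (microlitres) for the lock, as a fraction of the sample."""
    return total_volume_mL * 1e3 * d2o_fraction


if __name__ == "__main__":
    ASSUMED_MW = 2200.0  # g/mol, hypothetical ~20-residue peptide
    for conc in (1.0, 3.0):
        mass = peptide_mass_mg(conc, 0.6, ASSUMED_MW)
        print(f"{conc:.0f} mM in 0.6 mL: {mass:.2f} mg of peptide")
    print(f"D2O for lock: {d2o_volume_uL(0.6):.0f} uL")
```

With that assumed molecular weight, the 1 to 3 millimolar range in 0.6 ml corresponds to roughly 1.3 to 4.0 mg of peptide, with about 60 microlitres of D2O.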
If you have done this, then you are ready for the experiment, so the next step is going ahead and recording the data. What kind of data? For a peptide you have to start with total correlation spectroscopy, TOCSY, which we have discussed. In TOCSY, as we know, you correlate from, say, H alpha to H beta to H gamma to H delta, so it gives proton-proton correlation through bonds. If these protons are correlated, you get a peak in the TOCSY spectrum. That transfer of magnetization, if you recall, happens through scalar coupling, so in TOCSY you can go across several bonds. The other one is NOESY, nuclear Overhauser effect spectroscopy. This gives correlation through space, and that is mediated by the dipolar coupling. These are the two proton-proton experiments.
Then, at natural abundance, you can record an HSQC spectrum. A proton-nitrogen HSQC spectrum gives you the proton-nitrogen correlation. If you remember a little bit of biology, each amino acid has one NH from the backbone, and a few have NH in the side chain, so essentially your HSQC will give a number of peaks equal to the number of amino acids in your peptide sequence. After that one has to record the 13C HSQC for the side-chain correlations, like H alpha-C alpha, H beta-C beta, H gamma-C gamma, and so on. This helps with resonance assignment. So TOCSY helps in identifying the unique protons in the peptide, and NOESY helps in determining the distance between different protons. All of this can be done on a decent magnet, 500 MHz or above, for a peptide. Let us see.
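To make the link between NOESY peak intensity and distance concrete, here is a minimal sketch. It assumes the usual isolated-spin-pair approximation, in which cross-peak intensity scales as the inverse sixth power of the proton-proton distance, and it calibrates against a reference pair of known distance. The reference distance of 1.8 angstrom (a typical value for two protons on the same carbon) and the upper-bound bins of 2.7, 3.5, and 5.0 angstrom are illustrative assumptions, not values given in this lecture.

```python
from typing import Optional


def noe_distance(intensity: float, ref_intensity: float,
                 ref_distance: float = 1.8) -> float:
    """Estimate a proton-proton distance (angstrom) from a NOESY intensity.

    Assumes I ~ r**-6, so r = r_ref * (I_ref / I) ** (1/6).
    ref_distance is an assumed calibration distance for the reference pair.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def upper_bound(distance: float) -> Optional[float]:
    """Round an estimated distance up to a loose restraint bin (assumed bins).

    Returns None when the distance is beyond the last bin, i.e. no restraint.
    """
    for limit in (2.7, 3.5, 5.0):
        if distance <= limit:
            return limit
    return None


if __name__ == "__main__":
    ref = 64.0  # intensity of the reference peak, arbitrary units
    for peak in (64.0, 8.0, 1.0):
        r = noe_distance(peak, ref)
        print(f"intensity {peak:5.1f} -> r = {r:.2f} A, restraint <= {upper_bound(r)} A")
```

Binning into loose upper bounds rather than using the exact estimate is a common design choice, because measured intensities are also affected by effects such as spin diffusion and internal motion.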
Suppose I want to determine the structure of a small peptide. Here I am taking a peptide of typically around 20 amino acids, and this one is insulin, which is slightly longer, around 50 amino acids. The first thing, as I said, is that we need to know the primary sequence, the 1D structure. We need to know the amino acid sequence of each of these peptides. That is the first prerequisite: NMR does not give you the primary sequence, so you need to know it from other techniques.
Once we have the primary sequence, we want to get the structure. If you simply solubilize this peptide in water, at say one millimolar, and record, you get a proton spectrum something like this. We know that these are the HN protons, here you can have the aromatic protons, these are the H alpha protons, here are the H beta and H gamma protons, and these are probably the methyl protons. So we know these are the chemical shift ranges for a peptide. Now if you look at the spectrum of a peptide of more than 20 amino acids, or a small protein of 50 or 60 amino acids, you see how many lines we have. These lines cannot be resolved in 1D. In chemistry 1D is often enough, although now 2D is also required there, but for biology the minimum requirement is a 2D spectrum. That is why I said we start with a TOCSY experiment. That is the complexity of peptides, so we go from 1D to 2D.
Now here we have recorded a spectrum of a 37-amino-acid peptide in double-distilled water at pH 3.7. You may ask, why pH 3.7? As I said, the peptide has to be soluble. We found that this peptide was soluble at lower pH, so we had to reduce the pH to get a decent spectrum.
One can see here that we are getting lots of correlations. If you look here, all these are correlations from HN to H alpha to H beta to H gamma, and that is a representative TOCSY spectrum. Then you record a NOESY spectrum and you get many more peaks. Each of these dots is a correlation. So the next job, after solubilizing the peptide and recording the NOESY and TOCSY spectra, is to analyze these spectra and assign what the peaks in the NOESY spectrum mean. Before we analyze any of these peaks, we have to work out which spin system each peak comes from: which amino acid the peak comes from, and what the number of that amino acid is, whether it is lysine number 10 or lysine number 15 or alanine number 12, and so on. That is called resonance assignment of the peaks.
If you take a bigger protein you get an even more crowded spectrum. Just look at this NOESY spectrum and how much wealth of information you have. This is on a protein of about 50 amino acids; if you record TOCSY on that you get a lot more peaks, and also in the NOESY spectrum, and this can be done typically in 13 hours of experimental time. The next job is to assign these peaks to each of the amino acids, and in a moment I am going to tell you how to do that with an example.
But before assignment, as I said, we record the HSQC. Here I am showing you a natural-abundance HSQC of a peptide of 18 amino acids. If you record long enough, each of these dots in the HSQC spectrum is one peak: this one is glycine number 18, this is lysine number 15, this is lysine number 16. A priori, when I record this spectrum, I do not know which is which. Only after the resonance assignment was I able to pinpoint them, but you do get as many peaks as you have amino acids in your peptide or protein sequence.
HSQC gives you a fingerprint. If you record this spectrum you know that each of these peaks comes from one amino acid, so you know the number of amino acids, and you also know that your peptide sample is good enough. There is only one amino acid that will not give a peak in the 15N HSQC, which is proline. Why? Because proline is an imino acid, not an amino acid. Proline in a peptide does not have an NH, and this is an NH correlation, with 15N here and proton here, so proline will not give any peak. If you look at my sequence here, GGLRSL, all the way from G to the end, all residues are giving peaks: since I do not have any proline, essentially all 18 amino acids are probably giving me a peak. You might miss one or two peaks because of dynamics or some modification.
So that is the 15N HSQC. Similarly I can collect data for the 13C HSQC. If I do it for the same 18-amino-acid peptide that I showed you, this is the region from the aromatics. A few of the amino acids have an aromatic ring: tryptophan has an aromatic ring, and tyrosine has an aromatic ring, so these two will give peaks in the aromatic region, and that is what you get from the aromatic part. Then in the aliphatic region all the CH alpha, CH beta, CH gamma, and so on will come. Here is your C alpha-H alpha correlation, here you have the C beta-H beta correlation, and here are your methyl C-H correlations. This I know because we know the chemical shifts, but I do not know which peak is from, say, alanine number 5; the sequence-specific assignment I do not know. For that I need to do resonance assignment.
So now we know about sample preparation: the sample has to be pure, it has to be soluble, and 10 percent D2O needs to be added for locking purposes. Once we have a sample of good concentration and high purity, we record a series of experiments.
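The peak-counting argument for the 15N HSQC fingerprint can be sketched in a few lines: count the residues and leave out proline, which has no backbone NH. The sequence below is a made-up example, not the peptide shown in the lecture, and the `count_n_terminus` switch is an added assumption reflecting that in practice the N-terminal amine is usually not observed either; side-chain NH peaks are ignored here.

```python
def expected_backbone_nh_peaks(sequence: str, count_n_terminus: bool = True) -> int:
    """Expected number of backbone NH peaks in a 15N HSQC.

    sequence         : one-letter amino acid sequence
    count_n_terminus : if False, skip the first residue (assumption: its
                       amine protons exchange too fast to be observed)
    """
    seq = sequence.upper()
    residues = seq if count_n_terminus else seq[1:]
    # Proline ("P") has no amide proton, so it gives no NH correlation.
    return sum(1 for aa in residues if aa != "P")


if __name__ == "__main__":
    hypothetical = "GSPLKAWRPG"  # made-up 10-residue sequence with two prolines
    print(expected_backbone_nh_peaks(hypothetical))
    print(expected_backbone_nh_peaks(hypothetical, count_n_terminus=False))
```

Comparing this count with the number of peaks actually observed is a quick check on sample quality; as noted above, one or two peaks may still be missing because of dynamics or modification.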
The minimum set of experiments we listed is TOCSY for spin system assignment and NOESY for distance restraints, that is, the sequential and long-range correlations through space. Then we recorded the 15N HSQC, which is a fingerprint for a peptide or protein, and the 13C HSQC for the aromatic region and the aliphatic region.
The next job, and the most important one, is to assign these peaks, so I will give you some examples of how to start assignment. Assignment should start with the TOCSY spectrum. I have one experiment here, but I will give you examples for a few more amino acids. Let us start with a simple amino acid, alanine. Suppose alanine is there in the peptide chain; let us write it down here with its NH. What protons do we have in alanine? First the NH proton, second the C alpha H proton, and then the C beta, or methyl, protons. So in TOCSY we should essentially see three peaks: one for the NH proton, HN, which will be around 8 ppm; one for the H alpha proton, around say 4.5 ppm; and one around 1.8 or 2 ppm for the methyl, or H beta, protons. If I see this kind of spin system in my TOCSY spectrum, I know it is coming from an alanine. So I have identified alanine, but I still do not know which alanine it is.
Similarly, take glycine. In glycine we have NH, CH with two H, and CO. So glycine should have one NH and two H alpha, say H alpha 1 and H alpha 2. HN will again be around 8-point-something, say 8.4 ppm, and the two H alpha are around 4 ppm and 3.9 ppm, something like this. If you see this pattern you know a priori that this is glycine, and that one is alanine. This is the way we assign the TOCSY spectrum. Let us now move to a slightly more complex system; we have taken here the longest one, arginine.
What do we have in arginine? Let us start. For simplicity I will write the backbone as NH, because we are talking about arginine in a peptide sequence. Arginine has an NH, the amide proton; then we have CH alpha, C beta, C gamma, C delta; and then these are the side-chain NH, all three of them. Let us see what we are getting. Here is our backbone NH, at 8.27. Then we get H alpha here, H beta here at 1.89 and here, then the gamma here, then the delta here, and this one, around 7.7, is from the side-chain NH. Let us repeat it. On the diagonal, what I am getting here is the backbone NH at 8.27. Then we move to H alpha at 4.38, then the H beta at 1.89 and 1.79, then the gamma at 1.7, then the delta at 3.32, and then the side-chain NH, this one, at 7.7. So if I see this kind of pattern in my TOCSY spectrum, I know that this is an arginine spin system.
Now let us take tryptophan as an example; this is again a bigger amino acid. Again for simplicity we will take this as HN, because it is in a peptide bond. So we have HN from the backbone, then CH alpha and CH beta, and the rest are all from the side chain: these are from the aromatic ring, these are from the five-membered ring, and this is the side-chain NH. So we have two NH, one from the backbone and one from the side chain. The backbone NH generally comes around 8, so this is the backbone NH, and the side-chain NH comes around 10 or higher, which is what we have here. Now look at the TOCSY correlations. From the NH you get the H alpha, this one around 4.66, and then the betas. We have two betas: one comes at 3.22 and one at 2.99, that is about 3 ppm. That is what we get in TOCSY. Now from this side-chain NH we can get the side-chain correlations, which are here.
From these aromatic protons you again get all the aromatic correlations. Here you have the aromatic correlations, and you can assign them from these protons. Then here again you have the ones for the five-membered ring, and you can assign all of these. From H alpha you again get correlations to the two H beta, which are these, and from one H beta you get a correlation to the other H beta. By looking at this pattern we know that this is tryptophan. That is the assignment you are doing, and in the same way we get the amino-acid-type assignment for each of the rest.
Just to make you more comfortable, let us take another amino acid, serine. I will write serine in a peptide bond: CH, CO, and say OH. So we have one proton here, one here, and one here. Let us see how it comes out, with 1H on omega 2 and 1H on omega 1. Here we are getting HN for the backbone, this one, and then H alpha and H beta. Now H alpha and H beta in this case will be quite close. Why? Because, if you remember, O is an electronegative group, and that deshields the beta protons. So this is H alpha and this is H beta, and they come quite close. In the case of a methylene, if this were just a CH2, in many cases you see the CH2 around 3 ppm, and in the previous case the H beta were at 1.79. In the case of serine, H beta will come around 3.6 or 3.8 ppm and H alpha at 4.2 ppm, so they come quite close.
So you can see we are getting a unique pattern from each amino acid. For alanine, as I said, 8.34 ppm, 4.2 ppm, and one around 2 ppm: if these three peaks are there, you know it is alanine. For serine you get H alpha and H beta quite close. For glycine you get two H alpha around 4 ppm.
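The idea of recognizing an amino acid type from its TOCSY pattern can be sketched as a simple lookup. The reference shifts below are the approximate aliphatic values quoted in this lecture (alanine, glycine, serine, tryptophan, arginine); the 0.3 ppm tolerance, the 6 ppm cutoff used to drop NH and aromatic peaks, and the matching rule itself are illustrative assumptions, not an established assignment procedure.

```python
from typing import Optional, Sequence

# Approximate aliphatic proton shifts (ppm) quoted in the lecture,
# listed from downfield to upfield. Amide and aromatic peaks are left out.
REFERENCE_PATTERNS = {
    "alanine": (4.2, 2.0),
    "glycine": (4.0, 3.9),
    "serine": (4.2, 3.8),
    "tryptophan": (4.66, 3.22, 2.99),
    "arginine": (4.38, 3.32, 1.89, 1.79, 1.7),
}


def classify_spin_system(shifts_ppm: Sequence[float],
                         tolerance: float = 0.3) -> Optional[str]:
    """Guess the residue type from the proton shifts of one TOCSY spin system.

    Only peaks below 6 ppm are compared (assumed cutoff that removes NH and
    aromatic protons). A pattern matches when it has the same number of peaks
    and every peak lies within `tolerance` ppm; the closest match wins.
    """
    observed = sorted((s for s in shifts_ppm if s < 6.0), reverse=True)
    best_name, best_dev = None, tolerance
    for name, pattern in REFERENCE_PATTERNS.items():
        if len(pattern) != len(observed):
            continue
        worst = max(abs(o - p) for o, p in zip(observed, pattern))
        if worst <= best_dev:
            best_name, best_dev = name, worst
    return best_name


if __name__ == "__main__":
    print(classify_spin_system([8.34, 4.2, 2.0]))
    print(classify_spin_system([8.4, 4.0, 3.9]))
    print(classify_spin_system([8.27, 7.7, 4.38, 3.32, 1.89, 1.79, 1.7]))
```

A real assignment would use tabulated chemical shift statistics for all twenty residue types and would have to cope with overlapping or missing peaks; this sketch only mirrors the handful of patterns discussed here.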
So looking at the pattern you will know what type of amino acid it is. For tryptophan you see the pattern here, H alpha, H beta 1, H beta 2, and by looking at it you know that this is tryptophan. Similarly, we looked at the long TOCSY correlation of arginine. In this way we identify the different amino acids, so the individual spin systems are now identified.
Once the spin systems are identified, we have to find the correlation with the next amino acid. So far I have found that this is tryptophan, this is arginine, this is alanine, this is serine, this is glycine, and so on. But which alanine, which glycine? What is the number of that alanine, alanine number 5 or alanine number 12, glycine number 7 or lysine number 11? For that I need to know what is near to it, and then we can solve this puzzle. So in the next class I will start from this point: how to use the NOESY spectrum to find the nearest neighbor, which will help us identify the number of each amino acid. We will continue from there and see how to do the complete resonance assignment.
Thank you very much, and I look forward to seeing you in the next class.