Hello, and welcome to today's lecture. We were discussing structure determination of peptides by NMR spectroscopy. Just to repeat what we did in the last class: we have a primary sequence, the amino acid sequence, and we want to work out how this polypeptide folds into a three-dimensional structure. For shorter peptides, say 20 to 50 amino acids, we want to see how we can get the three-dimensional structure using NMR spectroscopy as a tool. That is what we were discussing.
We said there are some primary steps in NMR structure determination. The first is sample preparation. We need to synthesize the peptide, or express it in bacteria and purify it. The pure peptide has to be dissolved in the solvent in which we want to solve the structure, and then 10 percent D2O, or deuterated solvent, is added for locking the spectrometer. That is our sample preparation. The sample has to be transparent, and there should not be any aggregated particles, so we have to make sure that the sample is absolutely pure and looks transparent. For purity we can use orthogonal techniques like MALDI mass spectrometry or SDS-PAGE gel electrophoresis. A single band on the gel and a single peak in MALDI tell us about the purity of the sample.
The next step is to record a series of NMR data sets that we discussed in the last class: TOCSY (total correlation spectroscopy) and NOESY (nuclear Overhauser effect spectroscopy), then a couple of others like 15N HSQC, 13C HSQC, or double-quantum-filtered COSY. We need to record all these data sets, and that takes a day or maybe two to three days. After that, we were discussing the assignment of peaks, that is, how we are going to interpret this data. So we ended the previous class at the assignment and analysis of the data.
So now we will discuss that, and today we will discuss more about how you use this data to generate the restraints that are used for calculating the three-dimensional structure. Once the structure is calculated, we need to validate it: does it satisfy the Ramachandran plot statistics and so on, and is it a minimum-energy structure? Once it is validated, and all the angular constraints and distance constraints are satisfied, the structure is good for submission to the Protein Data Bank. That is what we will be discussing today.
We were looking at a simple thing, the assignment of a few amino acids, which we showed for alanine. So this is our alanine in a polypeptide chain, and let us also write it down for glycine. Now we look at how the TOCSY pattern will come. Here is our spectrum, with omega 1 and omega 2. Around 8.2 ppm our NH peak will come. Then we get a peak from H alpha, and then a peak from the methyl protons. The methyl will be around 1.9 or so, H alpha around 4.4 or something like that, and NH at 8.2. If we get this kind of pattern in a TOCSY spectrum, we understand that this is alanine. Similarly, for glycine the NH can be around 8.4 or so, and then we have two close peaks around 3.9 and 4. That is the pattern for glycine. So these amino acids have a typical pattern in the TOCSY spectrum, which we can use to identify them, and that will be our starting point. Then we need to combine this with the NOESY spectrum to walk along the backbone.
So here I am giving the analysis of the data. Now we take a valine. For valine we have NH. NH will be correlated with H alpha, and H alpha will be correlated with H beta.
H beta will be correlated with the methyls. So suppose I write down the pattern we get in a TOCSY spectrum. Here is our NH, and we get a peak for H alpha at 4.5, then H beta at 1.8, and then H gamma 1 and H gamma 2, which are very close. If you get peaks something like this, we know that this is coming from valine. Similarly, we can get a pattern for any of the amino acids, like proline here. In proline we do not have an NH, right? So that is missing here. We have two peaks here, others can be here, and so on. If we get a pattern like this, then we know that this is coming from proline. Similarly we can take some of the others here: NH will be correlated with H alpha, H beta, and so on and so forth. Each of these is a simple spin system, and each spin system gives a typical pattern in TOCSY.
Now we want to see which is near what. For that we need a distance-based spectroscopy, and that is NOESY. In NOESY, this CH2 can show a correlation peak with this NH. If we see this correlation from NH to CH2 in the NOESY spectrum, then we know that this valine is near this residue. Similarly, if we see some H alpha showing a correlation peak with these proline protons, then we know that these two are near in space, and that is why they have a correlation peak in the NOESY spectrum. That is how we try to identify, from TOCSY and NOESY, how the residues are connected.
Here, just as a schematic, we have shown two residues, i and i plus 1. Residue i is up to here, and this is i plus 1. In TOCSY you will get all the blue correlations, which are intra-residue: NH to CH alpha and then to the methyls. In NOESY you can get correlations within the same amino acid as well as to the neighboring amino acid. So we can get intra-residue as well as inter-residue correlations.
Here inter-residue means that this CH alpha will be correlated with the NH of the next residue. This NH can also be correlated with the NH of the next amino acid, and that helps us in identifying the residues that are in proximity. From TOCSY we know that this is alanine, this is serine, this is, say, arginine, and if we combine NOESY with this we can know that this alanine, serine and arginine are near in space. That is how we can identify them.
So I will walk you through some of this assignment protocol. I am taking a sequence of 18 amino acids, which is GGLRSLGRKILRAWKKYG. This is the single-letter code of the different amino acids, and this is one peptide, an 18-mer. Now we have a beautiful TOCSY spectrum. For simplicity, at the moment I just concentrate on this region, which is NH to H aliphatic. So these are all H alpha, here will be H beta, then H gamma and delta, and here are the H methyls. I am focusing only on this region.
Here are my patterns. For arginine, here is H alpha, H beta, beta to gamma. For lysine we have one here and then a series here, and then you see an intense peak coming from glycine. Both H alpha protons of glycine are showing a peak at the same position because their chemical shifts are not too different, and that is why we have only one peak for that. Similarly we can find the pattern again for arginine, and here for isoleucine: H alpha, H beta, H gamma 1, gamma 2, and similarly all the way to the methyl. For tryptophan we have H alpha and H beta; for lysine you have all those peaks, H alpha, H beta, H gamma, H delta. And again all those for the rest. So looking at this zoomed TOCSY pattern, we can find out which residue type it is, but the residue numbers, like 7, 4, 9, do not come from TOCSY alone.
For that I need the NOESY spectrum; we combine the two, and I am going to show that in the next slide. But other than that, if you zoom into this region, this is the region of NH-NH correlation. Each amino acid has at least one NH, and the peaks populating this region show that NH and NH are connected. This gives very important structural information. As we have studied, NOESY gives you distance-dependent correlations. This is the TOCSY spectrum, and I will come to the NOESY spectrum, but if an NH-NH correlation is seen in the NOESY spectrum, it gives information about the distance. Here in the TOCSY spectrum, an NH-NH correlation says that an amino acid which has NH in the backbone as well as in the side chain can also give peaks here. Also, if you look at 7.2 to 6.7 ppm, these peaks are from aromatic amino acids. Aromatic amino acids are very important. If you look here, Y and W are aromatic amino acids, and they can give correlations among the aromatic side-chain protons, and that is shown here. So the NH-NH correlations will be seen in the NOESY spectrum that I will show you, and the aromatic proton correlations that come around here are from tryptophan and tyrosine.
So let us combine this TOCSY information with a NOESY spectrum. Here, for the same 18-amino-acid peptide, TOCSY is shown in black and NOESY is shown in green. As we discussed, NOESY will give you intra-residue correlations as well as inter-residue correlations. The intra-residue peaks will be the same as in TOCSY, but the inter-residue peaks will be different. If you look here, many of the green peaks coincide with the black peaks; these are coming from intra-residue correlations. However, all these other peaks are coming from inter-residue correlations, not intra-residue. So green overlapping black is intra-residue, and green alone is inter-residue.
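The black-versus-green comparison can be sketched in a few lines of Python. This is only an illustration: the function name `split_noesy_peaks`, the matching tolerance, and the peak positions are all made up here and are not taken from the spectrum shown in the lecture.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (omega2 in ppm, omega1 in ppm)


def split_noesy_peaks(
    tocsy: List[Peak], noesy: List[Peak], tol: float = 0.02
) -> Tuple[List[Peak], List[Peak]]:
    """Split NOESY peaks into (intra_residue, inter_residue) candidates.

    A NOESY peak that coincides with a TOCSY peak in both dimensions
    (within `tol` ppm) is treated as intra-residue; a NOESY peak with
    no TOCSY partner is treated as an inter-residue candidate.
    """
    intra, inter = [], []
    for peak in noesy:
        has_partner = any(
            abs(peak[0] - t[0]) <= tol and abs(peak[1] - t[1]) <= tol
            for t in tocsy
        )
        (intra if has_partner else inter).append(peak)
    return intra, inter


# Hypothetical peak lists, for illustration only.
tocsy_peaks = [(8.20, 4.40), (8.20, 1.90), (8.40, 3.90)]
noesy_peaks = [(8.20, 4.40), (8.40, 3.90), (8.40, 4.40)]

intra, inter = split_noesy_peaks(tocsy_peaks, noesy_peaks)
print("intra-residue:", intra)
print("inter-residue:", inter)
```

In a real spectrum the overlap is rarely this clean, so the tolerance and any manual inspection of crowded regions matter much more than in this toy case.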
Now we identify these green peaks. Because this is a distance-dependent, dipolar-coupling-based correlation, we know which residue is near to what, and that is what helps us in the identification. Again, this is the NH to H alpha section. If you zoom into this section, we have the same thing here. We can start from anywhere; let us start from glycine 18. As we said, the glycine H alpha will be very intense, and it comes at around 3.7 or 3.8 ppm. So we know that this is glycine. Now the glycine H alpha is showing a correlation peak with another amino acid here. So this is a correlation from the H alpha of glycine 18 to something which is nearby. What could be nearby? It can be either 17 or 19. If you take this peak and go along this line, there is a peak here, and this peak has another correlation that we have identified. It can be traced back to a Y type of residue, that is, tyrosine. So now we know that Y and G are near in space. Now we go back to the sequence and look for where Y and G are together. Here they are. So using this we can identify that this G is 18 and Y is 17.
Similarly, let us go to another one. What we have here is K15. At the moment I do not know whether this is 15 or not, but I know that it is coming from a lysine residue. The lysine also shows a correlation peak at this position; both of these overlap. Now we go back here and find that there is a residue which can be identified as tryptophan. So tryptophan and lysine seem to be together, and when we look at the sequence, that means this is lysine 15 and this is tryptophan 14. We can do the same exercise for a few more amino acids, for example glycine.
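The "go back to the sequence and look for where Y and G are together" step is a simple search over the 18-mer. Here is a minimal sketch of it; the helper name `find_adjacent_pairs` is invented for this illustration, and only the sequence itself comes from the lecture.

```python
from typing import List, Tuple


def find_adjacent_pairs(sequence: str, first: str, second: str) -> List[Tuple[int, int]]:
    """Return 1-based positions (i, i+1) where `first` is directly followed by `second`."""
    return [
        (i + 1, i + 2)
        for i in range(len(sequence) - 1)
        if sequence[i] == first and sequence[i + 1] == second
    ]


SEQUENCE = "GGLRSLGRKILRAWKKYG"  # the 18-mer used in the lecture

print(find_adjacent_pairs(SEQUENCE, "Y", "G"))  # [(17, 18)]
print(find_adjacent_pairs(SEQUENCE, "W", "K"))  # [(14, 15)]
print(find_adjacent_pairs(SEQUENCE, "L", "R"))  # [(3, 4), (11, 12)]
```

A pair that occurs only once, like Y-G or W-K, fixes the residue numbers immediately. A pair that occurs twice, like L-R, stays ambiguous until a further NOESY connection to a third residue resolves it, which is part of why the backbone walk takes time.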
Here we had a glycine. The glycine H alpha shows a correlation peak, and R8 shows a correlation peak, which is this one, and this shows a correlation peak with this glycine. So now we know that an arginine is showing a correlation peak with a glycine-type residue, and in the sequence arginine and glycine are next to each other here. So that is 8 and 7. Similarly, we need to do this exercise for days, and we can walk along the backbone to identify each of these peaks using the TOCSY and NOESY spectra. This takes time; as you see, there are several overlaps, and there will be several ambiguities. We can then also make use of this region, which is the NH to H delta region, and by using the whole spectrum we can identify how the peaks in TOCSY and NOESY are correlated, through bond and through space respectively. That is how we can identify the amino acid numbers, like G7, R8, A13, and so on and so forth.
This also helps us in building a model. If you look, the NH to H alpha correlation is happening backward. Backward means 8 is showing a correlation with 7, 13 with 12, and 18 with 17, and these are the kinds of correlations we are seeing here. So this helps us in building a model of how the NOESY peaks will be assigned. Similarly, if we see long-range NOE constraints, for example from here to here, then we know that these two amino acids are close in space. By finding several correlations of this kind, one can build a model of how the polypeptide chain is folded, and that can essentially be identified using TOCSY and NOESY. This has been a very established technique for almost 30 years. Professor Anil Kumar is a pioneer who developed the NOESY technique, which is now at the core of structural biology.
He showed for the first time how you can use NOESY for structure calculation of a polypeptide chain. So building this model helps us in identifying and fixing the assignments. Great. So that is what we discussed: TOCSY, total correlation spectroscopy, which correlates through scalar coupling, can be used for intra-residue correlations, and NOESY, nuclear Overhauser effect spectroscopy, can be used for inter-residue correlations. If we combine this with double-quantum-filtered COSY, that gives you the three-bond J coupling, 3J, and if you remember, we said that this 3J is related to the phi torsion angle through the Karplus equation. I will just give you some hints on that.
From DQF-COSY you can measure a coupling constant: by looking at the splitting pattern you can measure the HN-H alpha coupling constant. If you remember where HN and H alpha are in the polypeptide chain, this coupling reports on this torsion angle. If you remember your Ramachandran plot of phi and psi, this gives you the phi value. The Karplus equation is 3J(HN-H alpha) = A cos^2(phi) + B cos(phi) + C. So if you know 3J(HN-H alpha), which is experimentally determined, you can calculate the phi torsion angle. Here is the phi torsion angle, and this is psi. You know this is the beta-strand region, this is the alpha-helix region, and this is the left-handed (L) region; these are the three prominent regions. So just by knowing this J value, one can know what kind of conformation the polypeptide chain has. For a random coil, which does not have any persistent structure, it is typically 6 hertz or so. If 3J moves towards 10 to 11 hertz, we are in the beta-strand region, and if it goes down to 3 to 4 hertz, we are in the alpha-helical region.
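To get a feel for these numbers, here is a small Python sketch of the Karplus relation. The lecture gives only the general form with A, B and C; the coefficients below (A = 6.51, B = -1.76, C = 1.60 Hz) and the 60 degree phase offset between phi and the H-N-C alpha-H dihedral are an assumption, taken from one commonly quoted parameterization, and other published sets give slightly different values.

```python
import math
from typing import List


def karplus_j(phi_deg: float, a: float = 6.51, b: float = -1.76, c: float = 1.60) -> float:
    """Predicted 3J(HN-H alpha) in Hz for a backbone phi angle in degrees.

    Uses J = A cos^2(theta) + B cos(theta) + C with theta = phi - 60 degrees.
    The default coefficients are an assumed, commonly used parameter set.
    """
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


def phi_candidates(j_obs: float, tol: float = 0.25) -> List[int]:
    """All integer phi values (degrees) whose predicted J lies within `tol` Hz of `j_obs`."""
    return [phi for phi in range(-180, 180) if abs(karplus_j(phi) - j_obs) <= tol]


print(round(karplus_j(-60.0), 2))   # helical phi, about 4.1 Hz
print(round(karplus_j(-120.0), 2))  # extended (beta) phi, about 9.9 Hz
print(phi_candidates(6.0))          # several phi ranges fit the same J
```

The last line shows why J alone is a restraint on phi rather than a measurement of it: the curve is multi-valued, so an intermediate coupling such as 6 Hz is compatible with several phi ranges, while very large couplings narrow phi down to the extended region.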
So just by looking at the J measured from double-quantum-filtered COSY, we know where my polypeptide chain is. Suppose we are getting 3J around 9 to 12 hertz; then my polypeptide chain is in this region. If you remember the axes, this is 0, this is minus 180, and this is plus 180. So just by looking at the value of J, one can estimate the torsion angle, essentially phi. So from DQF-COSY you can experimentally determine J, and you can already guess the conformation of the polypeptide chain.
Now we have got the assignment of peaks, we have got the through-space correlations, and we have got some idea about the dihedral restraints coming from DQF-COSY. All of that will be used in generating the restraints, and these restraints can then be used for calculating the structure. Two things are important. One is the torsion angles: what kind of conformation my polypeptide chain has. The other is the distance restraints: how these protons are correlated in space. If you go back to the correlations we have shown, what is the distance from here to here, here to here, here to here? If we get these distances and we get the torsion angles phi and psi, then one can find out the conformation of the polypeptide chain. So these two things we need.
Over the years people have developed an algorithm called TALOS, torsion angle likelihood obtained from shift and sequence similarity, which can be used for calculating the phi and psi dihedral angles. It was developed by Professor Bax's group at NIH. TALOS, or TALOS+, uses the chemical shifts that you have assigned from TOCSY and NOESY to predict the phi and psi torsion angles for each amino acid.
What TALOS does, basically, is take the chemical shifts for three consecutive amino acids in the polypeptide chain. Suppose we have A R S W Y G L, something like that. If I want to calculate the phi and psi torsion angles of one residue, then TALOS needs 15 chemical shifts as input: five from here, five from here, five from here. Which chemical shifts does it need? H alpha, C alpha, C beta, CO and HN. These are the five chemical shifts it requires. Then it matches these against a database, because so many protein structures have already been determined. It matches the chemical shifts and also the tripeptide sequence against the database, and from the search it gives you the phi and psi values that were found in those determined structures. It then predicts values with a range, say psi of 120 plus or minus 10 and phi of minus 120 plus or minus 10, something like this. You can use those as torsion angle restraints in the calculation.
Now, as we have seen, the NOE intensity of each peak depends upon the distance r. We now know which two protons, from two different amino acids, each of these peaks comes from, so if we determine the intensity of the peaks, we can find out the distance between those protons. The NOE intensity between protons i and j is proportional to 1 over r to the power 6, and that gives you the distance. So you can use the dihedral constraints obtained either from DQF-COSY or from TALOS, then use the distance restraints coming from the NOESY peak intensities, and then generate a template. The template is essentially a random-coil polypeptide chain.
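The 1 over r to the power 6 dependence is usually turned into a distance by calibrating against a peak whose proton-proton distance is already known. The sketch below shows that arithmetic only; the function name `noe_distance`, the reference intensity and the reference distance are hypothetical values chosen for illustration, and real protocols typically convert the result into loose upper bounds rather than exact distances.

```python
def noe_distance(intensity: float, ref_intensity: float, ref_distance: float) -> float:
    """Estimate a proton-proton distance from an NOE peak intensity.

    Since I is proportional to r^-6, two peaks satisfy
    I / I_ref = (r_ref / r)^6, so r = r_ref * (I_ref / I)^(1/6).
    """
    if intensity <= 0 or ref_intensity <= 0 or ref_distance <= 0:
        raise ValueError("intensities and reference distance must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


# Hypothetical calibration: a reference peak of intensity 100 at 2.5 angstrom.
REF_INTENSITY = 100.0
REF_DISTANCE = 2.5

for peak_intensity in (100.0, 25.0, 100.0 / 64.0):
    r = noe_distance(peak_intensity, REF_INTENSITY, REF_DISTANCE)
    print(f"intensity {peak_intensity:8.3f} -> distance {r:.2f} angstrom")
```

Because of the sixth root, a peak that is 64 times weaker than the reference corresponds to only twice the distance, which is why NOE peaks fade out beyond roughly 5 angstrom and why errors in intensity translate into fairly small errors in distance.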
It contains the residue names, the atom numbers, and the list of bond lengths and bond angles, and from that one can generate a random-coil polypeptide chain. This does not require any experimental data. Then one can fold it through molecular dynamics. So what are we doing? We generated a polypeptide chain like this. Using the chemical shifts and so on, we now have angular constraints: what the angle should be here, and all these angles. And we have distances from here to here, here to here, here to here. That is what you need: a template, the angles that will be required for folding, and the distances. With the torsion angle restraints coming from chemical shift data through TALOS or from DQF-COSY, and the distance restraints coming from NOESY data, we are all set for structure calculation.
What we do is something called simulated annealing. We take a starting structure and heat it to a high temperature in the simulation, so the atoms of the starting structure get high thermal mobility. At high thermal mobility everything is quite flexible. Then you cool it stepwise, and the experimental restraints are applied. So first you heat it, and then slowly, in an unbiased way, you cool it. The structure calculation algorithm uses these restraints and minimizes the energy, and an energetically favorable final structure appears, guided by whatever force field we are using. So the energy-minimized structure comes out of this simulated annealing process.
Here one can see it. You do restrained molecular dynamics simulation. You get a template, and, as in distance geometry, you start from an extended polypeptide chain, heat it up, and then slowly let it cool down. The structure evolves step by step, as you can see here, and at the end we finally obtain a folded structure.
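The heat-then-cool idea can be illustrated with a toy example. The sketch below is not the protocol used by any structure calculation program: it uses Monte Carlo moves instead of molecular dynamics, a handful of beads instead of atoms, and a made-up energy that is just the sum of squared distance-restraint violations. All names and numbers are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Restraint = Tuple[int, int, float]  # (bead i, bead j, target distance)


def restraint_energy(coords: Sequence[Sequence[float]], restraints: List[Restraint]) -> float:
    """Sum of squared deviations from the target distances."""
    return sum((math.dist(coords[i], coords[j]) - target) ** 2
               for i, j, target in restraints)


def anneal(coords, restraints, t_start=5.0, t_end=0.01, n_stages=40,
           moves_per_stage=200, step=0.5, seed=0):
    """Toy simulated annealing: start hot, cool stepwise, keep the best structure."""
    rng = random.Random(seed)
    current = [list(p) for p in coords]
    e_current = restraint_energy(current, restraints)
    best, e_best = [p[:] for p in current], e_current
    for stage in range(n_stages):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (stage / (n_stages - 1))
        for _ in range(moves_per_stage):
            k = rng.randrange(len(current))
            old = current[k][:]
            current[k] = [x + rng.uniform(-step, step) for x in old]
            e_new = restraint_energy(current, restraints)
            # Metropolis rule: always accept downhill, sometimes accept uphill.
            if e_new <= e_current or rng.random() < math.exp(-(e_new - e_current) / temp):
                e_current = e_new
                if e_new < e_best:
                    best, e_best = [p[:] for p in current], e_new
            else:
                current[k] = old  # reject the move
    return best, e_best


# Extended six-bead chain, 3.8 units between neighbours (hypothetical values).
start = [[3.8 * i, 0.0, 0.0] for i in range(6)]
bonds = [(i, i + 1, 3.8) for i in range(5)]
long_range = [(0, 5, 5.0)]  # one "NOE-like" restraint pulling the two ends together
restraints = bonds + long_range

e_start = restraint_energy(start, restraints)
folded, e_final = anneal(start, restraints)
print(f"energy before: {e_start:.1f}, after annealing: {e_final:.3f}")
```

At high temperature almost every move is accepted, so the chain explores freely; as the temperature drops, only moves that satisfy the restraints better survive, and the extended chain bends so that its ends approach the restrained distance. Running it with different seeds gives slightly different final structures, which is the toy analogue of an NMR ensemble.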
A similar thing is shown here. It starts from many conformations at high temperature; slowly, by using the restraints obtained from experiments and making the correct kind of contacts, the secondary and tertiary contacts, a structure evolves and finally gives us a folded structure. That is how you determine the structure of a polypeptide chain. Essentially it solves some of those equations: it minimizes the distance violations and it minimizes the energy. So by using this E(NOE) energy term and minimizing the energy, it gets the minimum-energy structure.
One thing to notice here: NMR never gives you a unique structure. It gives you an ensemble of structures that have minimum energy. If you look at any of the protocols that are used, either CYANA or XPLOR-NIH, it gives you an ensemble of either 10 or 20 structures. These are the minimum-energy structures obtained by this distance geometry and simulated annealing protocol. So, in a nutshell, what we have done: starting from the polypeptide chain, we did an MD simulation, used the restraints to force it into the correct contacts, and got a 20-member ensemble of minimum-energy structures. From there you can take one representative structure, and that will be the structure of your peptide. If you look here, for the same 18-amino-acid peptide that we started with, some amino acids form a helix and some appear as a tail. So that is how the structure determination protocol based on minimum energy is used for getting this kind of structure.
The next step is to validate this structure. For assessment of the quality of a structure, one has to use a protein structure validation suite.
Essentially, it provides standard constraint analysis and the statistics of the PDB validation protocols. Your structures are tested against already solved structures using these protocols, a quality score is given for your structure, and it is suitably integrated with the database. So what we are doing is checking our structure against the already solved structures in the PDB to see where it falls. Finally, you need to check the stereochemical quality of your protein structure. That means you have to do the acid test for your solved structure by putting it on the Ramachandran plot and seeing whether all the amino acids fall in the allowed regions of the Ramachandran map or not. Some of these protocols will find out how many residues are in the favorable region of the Ramachandran plot and how many are in the additionally allowed region. If you look at the previously determined structure, most of the residues are either in the allowed region or in the additionally allowed region, and essentially none of them is in the disallowed region. That means the quality of this structure is very good.
RMSD is the root-mean-square deviation. The deviation of the C alpha atoms across the ensemble is very small, which means the structures that have been solved are very coherent. You can see it here: all these structures overlap on top of each other. These are the 10 minimum-energy structures. So once you do that, we have the final structure determined.
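For reference, the RMSD over C alpha atoms is a short calculation. The sketch below assumes the structures have already been superimposed, which real tools do first; the function names and the tiny coordinate sets are made up for illustration.

```python
import math
from itertools import combinations
from typing import List, Sequence

Coords = Sequence[Sequence[float]]  # one (x, y, z) per C alpha atom


def ca_rmsd(coords_a: Coords, coords_b: Coords) -> float:
    """Root-mean-square deviation between two equally sized, pre-aligned coordinate sets."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def mean_pairwise_rmsd(ensemble: List[Coords]) -> float:
    """Average RMSD over all pairs of structures in an ensemble."""
    pairs = list(combinations(ensemble, 2))
    if not pairs:
        raise ValueError("need at least two structures")
    return sum(ca_rmsd(a, b) for a, b in pairs) / len(pairs)


# Hypothetical two-atom "structures", for illustration only.
model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_3 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

print(ca_rmsd(model_1, model_3))                          # 1.0
print(mean_pairwise_rmsd([model_1, model_2, model_3]))    # about 0.667
```

A small mean pairwise RMSD means the members of the ensemble sit on top of each other, which is what the overlay of the minimum-energy structures shows visually.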
To sum up what we did here: we started with a sample, dissolved it in a proper solvent, and added 10 percent D2O. Then we took it to the high-field magnet, or whatever magnet you have, and recorded data for days. Then we sat together and did the data analysis for weeks, and then we calculated the structure, which gives the final representative structure of the peptide. Then we need to validate it, doing the PROCHECK analysis to see that the structure is of high quality. If we obtain this, we are done: we can submit to the PDB, and that will be our final structure determined by NMR spectroscopy using the solution-state protocol. So I hope I gave you a glimpse of how you determine a peptide structure using NMR tools. I hope this will be useful for you, and if you have any doubts, please do write to us and ask; we will try to resolve them. Thank you very much.