Our next job is to assign the individual spin systems to specific residues along the polypeptide chain. This is done from the NOESY spectrum. How? Let us look at this here. Here we have the polypeptide chain running from the N terminus to the C terminus. This is residue i: NH, C-alpha, CO; then NH, C-alpha, CO; NH, C-alpha, CO; NH, C-alpha, CO, and so on. About four residues are written here, the fourth only up to the NH, and the same thing continues along the chain. We make use of the NOESY spectrum to identify the correlations between the individual residues. As I showed you, the cross peaks in the NOESY spectrum depend on the distance between the two protons. So we have to see what the short distances are between protons along the polypeptide chain, between sequential residues or between short-range residues. That is classically indicated in this slide, which is again taken from Wüthrich's book, NMR of Proteins and Nucleic Acids. You can see that some distances are indicated by thick arrows and some by dotted arrows. What are the thick arrows indicating? They indicate the near-neighbour interactions. For example, from the NH of residue i+1 I see to the alpha proton of residue i, and that is indicated as dαN. From the NH I also see to the NH of the previous residue, which is indicated as dNN, and then you have NH to the beta protons, which is indicated as dβN. So dβN, dαN and dNN are the near-neighbour interactions, the sequential ones between immediately adjacent residues. Notice that you do not see to the right side; you only see to the previous residue. You see from i+1 to i only, and therefore this provides directionality, whereas the NN connectivity can of course go either way.
So it can go from here to here, or from here to here as well, whereas the αN and βN ones go only to the residue previous to the amide proton you are looking at. You will see i+1 to i only for the αN and the βN. There are also other distances, which are indicated by the dotted lines. Let us look at those. From this alpha proton there is a really long one, dαβ(i, i+3), to the beta proton of a residue three positions further along the chain. You might see this one under certain circumstances; I will show you where it will be seen. From the alpha proton you can also see to the NH of residue i+2. Then from this alpha proton there are others: dαN(i, i+3), where the NH of i+3 shows a peak to the alpha of i, and similarly dαN(i, i+4), where the NH of the fourth residue shows a peak to the alpha of residue i. That is what these symbols mean: dαN(i, i+4) means the amide proton of i+4 to the alpha proton of i, and dαN(i, i+3) means the amide proton of i+3 to the alpha proton of i. Likewise you have dNN(i, i+2), which is between the two amide protons of i and i+2, and dαN(i, i+2), which is the amide proton of i+2 to the alpha proton of i. These are the short distances that are indicated; they are less than 5 angstroms, and the same thing is listed here. Now, this depends on the secondary structure in the polypeptide chain. Different segments of the protein chain will have different kinds of secondary structure.
We discussed the various structures earlier, so what sort of distances are present in these individual structures? Suppose I take the alpha helix. Then dαN is 3.5, dαN(i, i+2) is 4.4, dαN(i, i+3) is 3.4, dαN(i, i+4) is 4.2 and dNN is 2.8. This is very interesting: 2.8 is a very short distance, and the αN distance of 3.4, for i to i+3, is also short, while the other two are a little longer. dNN(i, i+2) is 4.2, and dβN is 2.5 to 4.1, which is a wider range. dαβ(i, i+3) is 2.4 to 4.4. So if you have an alpha helix, these are the kinds of distances that are present. Now if you take the 3₁₀ helix, it is very similar, except that dαN(i, i+2) is somewhat shorter: it is 4.4 in the alpha helix and 3.8 here. dαN(i, i+3) is very similar, and you do not have dαN(i, i+4), because it is a 3₁₀ helix. dNN is also similar, and the remaining distances on the slide are similar as well, so you may see those peaks too. That is how you get these peaks in the helices. Compared to that, what kinds of distances do you have in the beta structures? If you have the antiparallel beta strand, dαN is very short, only 2.2 angstroms, and therefore you will see this as a very strong peak. The NN distance is long, 4.3, and the βN distance also covers a wide range, 3.2 to 4.5. Where does this range come from? It comes from the torsion angles along the side chain.
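To make the comparison concrete, the sequential distances quoted so far can be put side by side. The following is only a minimal sketch in Python: the dictionary holds the sequential alpha-N and NN values mentioned above for the alpha helix and the antiparallel beta strand, and the names SEQUENTIAL_DISTANCES and strongest_sequential_noe are made up for this illustration. It assumes simply that the shorter distance gives the stronger NOE.

```python
# Sequential distances in angstroms, as quoted in the lecture.
SEQUENTIAL_DISTANCES = {
    "alpha_helix": {"d_alphaN": 3.5, "d_NN": 2.8},
    "beta_antiparallel": {"d_alphaN": 2.2, "d_NN": 4.3},
}


def strongest_sequential_noe(structure):
    """Return the sequential contact with the shortest distance.

    Assumption for this sketch: shorter distance means stronger peak.
    """
    distances = SEQUENTIAL_DISTANCES[structure]
    return min(distances, key=distances.get)


for name in SEQUENTIAL_DISTANCES:
    print(name, "->", strongest_sequential_noe(name))
```

For the helix the NN contact comes out as the strongest sequential peak, and for the beta strand it is the alpha-N contact, which is the pattern used to tell the two apart.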
So there is a variation in this distance because of the side-chain torsion angles: depending on how they vary, you will have a certain range of distances. But this is a very interesting distance, because the beta structure is characterized very well by this αN distance. The parallel beta strand also has this very short distance; this one is slightly different in this case, but the other ones are similar. So on the basis of these NOEs it is often difficult to distinguish a parallel beta strand from an antiparallel one. Now, turns are also something we come across. In a turn you have these distances of 3.4, 3.2, 3.6 and 3.1, and all of these are observable. There is also a very interesting distance, 2.6 or 2.4, which is the NN distance. So the short NN distance is present in this turn. It is very similar to the helical regions, but it is not present in the beta sheet; you find it in the helical region and you also see it in the turn. We indicated earlier that there are different types of turns. The type I turn has these characteristic features, but the type II turn has different features. This particular distance is very short in the type II turn, more like the beta strand in an antiparallel or parallel beta sheet, and therefore it discriminates between the two types of turns. Then you also have dαN(i, i+2), which is 3.6 and 3.3 in the turns, and you will not find this in the beta sheet structure. You may sometimes find it in the 3₁₀ helix, and you may not find it in the alpha helix, because 4.4 is a little longer than these distances.
But in the type I and type II turns you will find this distance very short. dαN(i, i+3) is also observable: 3.1 to 4.2 and 3.8 to 4.7. The distinction is not very easy to make from this one, but this other one is the characteristic distance that discriminates between the two structures. With regard to the NN distance: in the type I turn it is short and very observable, similar to the alpha helix, but that is not so in the type II turn, where this one is not observable. The βN distance, 3.6 to 4.0, will also be observable. It covers a certain range, so you may or may not see it depending on the side-chain torsion angles: wherever the torsion angle is such that the distance becomes more than 4.5 angstroms, you may not see it. So these are not definitive, but where a clearly short distance is indicated, those peaks you will always see. This is the kind of distance pattern we have, and it is extremely useful in determining structures.
Now here is a typical example of the patterns you will get. You have the COSY spectrum here, and the NOESY spectrum will show you the sequential correlations. These are the sequential ones, to the immediate neighbour, and this region runs from the amide protons to the alpha protons and then to the beta protons: up to here is the alpha region and this is the beta region. It is one particular region taken to indicate what sort of peaks you will get. Here you have two peaks: one is to its own alpha, which will also be present in the COSY, and the other is the sequential peak, which is in the NOESY spectrum. Again you have two peaks here, the NH to its own alpha and then to the sequential alpha; the same here, the same here, and then of course here you see a peak to the beta proton as well.
Then here you see a peak to the beta proton, and this one is a single peak; it is something different, not coming from this NH proton but from somewhere else. And here the alpha proton itself is shifted upfield a great deal, which can happen in certain situations. Here you have the alpha, then the beta, and then also the two gammas; typically you will see such things for residues with long side chains.
Now, using this, one can do a sequential walk along the polypeptide chain. This is schematically indicated here. Suppose you have a helix, where we said the NN distance is very small. This is the diagonal, and these are the amide protons, four amide protons, and you see the connections: one connection here, two connections there and one connection here. So you can walk along the helix in this manner. You walk from one residue to another: say you start from here, you go from this to this, then from here to here, so this is connected to this; then from here to here, and so on. You see the connection running over the entire stretch. These are the NH-NH connectivities, and as I said, NH-NH connectivities do not have a directionality: they can go from one residue to the next in either direction. So we cannot say that we are going from i to i-1 to i-2 to i-3; we may equally be going from i to i+1, i+2, i+3.
So at a particular place you will see two cross peaks. Why are we seeing two cross peaks there? Because that amide proton shows a peak in both directions. If I look here, there are two cross peaks, one here and one there; this one is in one direction and this one is in the other direction. That is how you establish these connections, walking along the polypeptide. The lines which are drawn show a walk in one particular direction, and this additional peak actually goes in the other direction. So in the helix you get this sort of connection along the polypeptide chain.
Now, on the other hand, look at the NH to alpha proton region: the NH protons are along here and the alpha protons along here. Let us say we start from here, and suppose this is the self peak of a particular residue i. From this you go to the sequential peak to i-1, then to the self peak of i-1, then the sequential peak to i-2, then the self peak of i-2, then the sequential peak to i-3, then the self peak of i-3, then the sequential peak to i-4, and you go on like that. You can label it either as i, i-1, i-2 or as i+1, i and so on. Those are the sequential connections shown in this case: i+1 to i, then to its own, then to the next one, and you can go on. That is what is shown as the sequential connectivity in the NH-alpha region. Now this shows the kinds of short distances you will get in the experimental spectrum, considering a certain stretch of amino acid residues. Let us say you have this stretch of amino acid residues 1 to 7.
There are 7 residues; let us consider what sort of peaks you will get. If it is an alpha helix, for dαN(i, i+4) you will see three peaks in a row: from here to here, from here to here and from here to here. How many i to i+4 contacts are there? Each one reaches four residues further along, so in this stretch there are three. Similarly, for dαβ(i, i+3), how many will you get in the 7-residue stretch? You get 4 of them. For dαN(i, i+3) you again get 4 peaks for the alpha helix, and if you take dNN(i, i+2) you have 5 peaks, as indicated here. What is shown is from which residue to which residue: this one is 5 to 7, then 4 to 6, 3 to 5, 2 to 4 and 1 to 3. Those are the i to i+2 ones. Similarly you can draw these lines for the others. For i to i+3 you have 7 to 4, 6 to 3, 5 to 2 and 4 to 1; that is how you get 4. And for i to i+4 you have 7 to 3, 6 to 2 and 5 to 1, so you have 3 peaks. In the same manner you can draw all of these. You must have all these peaks in order to be able to conclude what sort of helix you have: you must have this whole stretch of connectivities in your NOESY spectrum. After you establish the individual spin systems, you must be able to establish these connections from the amide protons to the various side-chain protons or backbone protons.
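The counting for the 7-residue stretch can be checked with a few lines of Python. This is just an illustrative sketch; the function name medium_range_pairs is hypothetical, and residues are numbered 1 to 7 as above.

```python
def medium_range_pairs(n_residues, k):
    """List all (i, i+k) residue pairs that fit in a stretch 1..n_residues."""
    return [(i, i + k) for i in range(1, n_residues - k + 1)]


for k in (2, 3, 4):
    pairs = medium_range_pairs(7, k)
    print(f"i to i+{k}: {len(pairs)} contacts {pairs}")
```

It lists 5 pairs for i to i+2, 4 pairs for i to i+3 and 3 pairs for i to i+4, the same numbers as counted above.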
So for the 3₁₀ helix you will have this sort of pattern, and for the turn you will have this sort of pattern. See, the type I and type II turns are quite distinct. These patterns are extremely useful for identifying what sort of turn you have. There are two additional turns which are slight variations of these, and you have a half turn. They are slight variations, but nonetheless you have these distinctive patterns for the secondary structures. For the beta structures you will have only the sequential connectivities: this NN connectivity, and of course the sequential αN connectivity. This whole range of sequential connectivities will be present in antiparallel as well as parallel beta sheets. And of course you will also have them in these other areas, because the sequential connectivities are for the immediate neighbours. This is how you obtain the sequence-specific assignments first: from the immediate neighbours. After that there will be peaks left in your spectrum, and on the basis of those you try to identify these sorts of connections, to figure out what sort of secondary structure the particular stretch of amino acid residues belongs to. So you first establish all the sequential connectivities, and after that you look for the additional peaks present in the NOESY spectrum; they will establish the very characteristic peaks for the secondary structures, for the alpha helix, the 3₁₀ helix, the type I turn and so on.
Now this is a typical NOESY spectrum of a protein. It is a relatively old spectrum, and not a very good one, but it is one of the very early ones and therefore it is important to show it. It shows the connections between the different residues, and it brings out the point about what sort of peaks we will see for the long-range connectivities; that is the point we are trying to make here. We have already talked about the short-range ones and the near-neighbour ones, so now we talk about the long-range ones. Suppose you have a polypeptide chain which goes like this, with protons A, B, C and D on it, and the A proton chemical shift is here, the B proton chemical shift there, the C proton chemical shift here and the D proton chemical shift here. Notice that we have already identified the chemical shifts of the individual protons through the sequential connection procedure. So now you know which ones are where. For any other peak which is present, we must be able to establish a correlation between the two protons connected by that peak, because you already have all the assignments, not only of the backbone but also of the side chains. All the side-chain spin systems and the assignments are made, and therefore we will be able to identify the peaks which are present. Now what sort of peaks will we get for the long-range interactions? The chain is drawn here in an elongated, extended manner. Now suppose the polypeptide chain folds in this manner; then these two protons come close together in space, and when that is the case you will see a cross peak between those two protons. If the polypeptide chain folds like this, then these protons come close together in space, and therefore you will see a cross peak between these two.
On the other hand, if it folds like this, you will have a short distance between these two protons and you may see a cross peak between them. So these are three different types of structures, labelled i, j and k, and depending on the nature of the fold in the polypeptide chain you will see peaks between different protons along the chain. Therefore we call the NOESY spectrum a fingerprint of the structure of the molecule. Another point you notice here is that the different peaks have different intensities. Why is that so? Because the distances are different. Although the protein folds and brings the protons close, the distances themselves may vary: it may be 2.5 angstroms or 3.5 angstroms or 4.5 angstroms. So depending on the way the protein is folding you may have different distances, and therefore you will see different intensities for the different cross peaks. That is structural information. We can extract it and use it to calculate the structure of the molecule, and that is the step indicated in this slide. Once you have a particular intensity, we convert that intensity into a distance, and we say that this distance must lie between two limits, a lower limit and an upper limit. We do not say that the particular distance has to be exactly 2.5 or 2.6 angstroms. We say that the distance Rij, between the two protons i and j, should be between this lower limit and this upper limit. It may be between 2.5 and 3 angstroms, or between 3 and 3.5 angstroms. You can classify these distance ranges depending on the confidence you have in the intensity measurements. There can be errors in the intensity measurements, and that will determine what sort of range you want to specify.
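One common way of converting an intensity into a distance, which is assumed here since the calibration was not spelled out above, is to use the r to the minus 6 dependence of the NOE in the isolated two-spin approximation and to scale against a reference peak whose distance is known. A minimal Python sketch, with hypothetical function names and made-up intensities:

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate a distance in angstroms from a cross-peak intensity.

    Assumes intensity is proportional to r**-6 (isolated two-spin
    approximation), calibrated on a reference peak of known distance.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def within_limits(distance, lower, upper):
    """True if the distance lies between the lower and upper limits."""
    return lower <= distance <= upper


# Made-up numbers: a reference peak at 2.5 angstroms with intensity 1.0,
# and a second peak that is 64 times weaker.
estimate = noe_distance(1.0 / 64.0, 1.0, 2.5)
print(round(estimate, 2))
print(within_limits(estimate, 4.5, 5.5))
```

Because of the sixth root, even a sizeable error in the intensity changes the estimated distance only a little, which fits with specifying a range rather than an exact value.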
If it is a weak peak, you will generally want to give a longer range, like 3 to 4 angstroms or 3.5 to 4.5 angstroms. If the peak is stronger, you narrow down the distance range: it must be between 2.5 and 3 angstroms, or 3 and 3.5, and so forth. So you collect a large number of such distances. In a polypeptide chain with 100 amino acid residues you will have thousands of peaks. Of course several of them will be short range, several will be sequential, and there will be many other ranges; however, we include all the distances here, not only the sequential and short-range ones but also the long-range ones.
Now what is the next step? We want to calculate a structure which is consistent with all these interproton distances. These are called distance restraints, and your structure must satisfy them. An initial model may not satisfy all of them, so you have to optimize your structure until the distance restraints are satisfied. What is done is that you define an energy function which contains two parts. One is this E_F part, the standard one, which contains all the short-range terms, the steric interactions and things like that. There should be no steric clashes, and you take care of all the bond angles, bond distances and so on. All of those are included in this function, and they are the basic energy-determining terms. Then the NOE distances which you have derived are included as a separate potential function, E_NOE, defined in this manner. It is defined with these harmonic potentials, with a particular force constant; you define it like a spring.
The violation of the lower bound is given by (Rij^l - Rij) squared, multiplied by the force constant, and this other term is the corresponding violation of the upper limit, with Rmn measured against its upper limit. You run this through the entire set of distance restraints you have. If all the distances are satisfied, E_NOE comes to 0, and this is what you want to optimize: you want to optimize your model such that E_NOE comes down to 0. Often you may not achieve a complete 0, but it will converge to a particular small value, and that is when you can say this is an acceptable range. That also determines to what accuracy and confidence level your structure is determined: how many distances are violated in your structure, and by how much. This you document in a table. You say: NMR restraints in the structure calculation. You have intra-residue distances, 419; sequential distances, 475; medium range, |i - j| less than 5, 302; long range, |i - j| greater than 5, 407. You have hydrogen bonds also indicated as distances, and therefore a total of 1655 distance restraints. You also have dihedral angle restraints, which come from your coupling constant measurements, and then there are the other terms with regard to the geometry, the bonds, the angles and the impropers; these are also indicated. At the end of the day, after you have done these calculations, you have to see how large the violations are: are the restraints all satisfied, or are they all within a certain range? In the RMSD from the experimental restraints you see that the violations are very, very small, and if this is the kind of range you get, then you say it is acceptable.
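The NOE energy term described above can be sketched in Python as a harmonic penalty that is zero inside the allowed range. This is only an illustration under stated assumptions: a single force constant for all restraints, and hypothetical names (noe_energy, restraints, distances). Actual structure calculation programs may define the term differently.

```python
def noe_energy(distances, restraints, force_constant=1.0):
    """Sum of harmonic penalties for violated distance restraints.

    distances:  {(i, j): current distance in angstroms}
    restraints: {(i, j): (lower_limit, upper_limit)}
    A distance inside its limits contributes nothing.
    """
    energy = 0.0
    for pair, (lower, upper) in restraints.items():
        r = distances[pair]
        if r < lower:
            energy += force_constant * (lower - r) ** 2  # lower-bound violation
        elif r > upper:
            energy += force_constant * (r - upper) ** 2  # upper-bound violation
    return energy


# Made-up example: the first restraint is satisfied, the second is
# violated by 0.5 angstroms at the upper limit.
restraints = {("A", "B"): (2.5, 3.0), ("C", "D"): (3.0, 3.5)}
distances = {("A", "B"): 2.8, ("C", "D"): 4.0}
print(noe_energy(distances, restraints))
```

When every distance lies within its limits the function returns 0, which is the target of the optimization described above.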
So you consider the RMSD. Typically you do not find one structure; you find a set of structures, typically about 5 to 10, which satisfy these restraints, and so there is a certain range of violations. Therefore you put this plus-minus on the values. Then, among the structures you have selected, say your 10 structures, you consider the RMSD for residues 2 to 95: how much is the variation among the backbone atom positions? You calculate that for the backbone atoms, and for all heavy atoms it is this much. Then of course you have to verify the phi and psi torsion angles measured from this structure in terms of the Ramachandran plot. The most favoured regions have 74.5, the additionally allowed regions 23, the generously allowed regions 2.0 and the disallowed regions 0.5; therefore this is an acceptable structure, because it satisfies the Ramachandran plot. This is typically the way you report it.
Now, how do we actually do this in practice? With the so-called distance geometry algorithm, which we also discussed earlier in some way. You start from a polypeptide chain structure, typically like this, and you start from many different initial structures, because you do not want your calculation to be biased by the choice of initial structure. Here you have 6 different initial structures for a particular protein. You go through the various steps of the calculation, the intermediate steps here, and you see how the protein is folding. In the end all these structures come out to be similar: there are many structures overlaid here, and they are all very similar. Therefore the RMSDs among them are very, very small, which tells you that your distance restraint set is quite good and that you are able to obtain a unique structure from it.
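The spread within a set of calculated structures is often summarized as a root-mean-square deviation over atom positions. As a rough sketch only, the Python below computes the mean pairwise deviation for an ensemble. It assumes the structures have already been superimposed and that each one is given as a list of (x, y, z) coordinates in the same atom order; the function names and the tiny coordinate sets are made up for illustration, and real analyses usually superimpose the structures first.

```python
import math
from itertools import combinations


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def mean_pairwise_rmsd(ensemble):
    """Average the deviation over all pairs of structures in the ensemble."""
    pairs = list(combinations(ensemble, 2))
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


# Made-up two-atom "structures": model_b is model_a shifted by 1 angstrom in x.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(rmsd(model_a, model_b))
print(mean_pairwise_rmsd([model_a, model_a, model_b]))
```

A small value across the ensemble is what indicates that the different starting structures have converged to essentially the same fold.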
So I think we can stop here. In the next class we will go into more complicated structure determinations, for more complex proteins.