So, we have been discussing the protein folding phenomenon; let us have a quick recap. We said protein folding is described by this sort of diagram, which is called a folding funnel. We can look at the folding transition: this is the native state here, and at the top you have the unfolded state, which consists of millions of millions of different conformations. Each molecule in your ensemble may have a different conformation; it will start folding down the funnel, crossing various energy barriers, and eventually lead to a single conformation, or maybe a small number of related conformations, here. This is called the native state ensemble. So it may be a few conformations which are, by and large, very similar to each other; they also belong to the entire set of conformations that are possible for the protein. To monitor these folding transitions, equilibrium folding transitions, we can start with the denatured state here. We create the denatured state by using urea or guanidine hydrochloride or some other denaturing agent, or DMSO, and then slowly start changing the conditions to induce the protein to fold, that is, to go down the funnel, and you will start seeing changes in the HSQC spectra. These are amide proton N15 HSQC spectra, and you will see that there will be small shifts in the peaks, the peaks will start becoming narrower, and things like that. See, the spectral dispersion will increase as you go down, and this is at zero molar. This is the folded state. In the native state the spectral dispersion is very good, it goes all the way from 7 ppm to 10 ppm, whereas in the unfolded state there is only 0.3 ppm dispersion, and therefore the peaks look broader in this situation.
As you go down, there will be changes in the chemical shifts, and there will also be changes in other parameters like the relaxation properties, which we will use. As the protein starts to fold, new contacts are made between different residues, and these contacts put restrictions on the free movements of the individual residues. Therefore the dynamics in the protein gets reduced. This is reflected in the relaxation: the relaxation times, particularly the transverse relaxation times, get reduced, which means the transverse relaxation rates, the R2s, go up. Therefore changes in the relaxation rates are good monitors of the folding of a particular protein. The motions start slowing down, and that is reflected in residue-wise changes. That is how we can identify which residues are folding, what the pathway is, and how the protein is trying to fold. This is what we showed in this particular slide. You start from an unfolded state which is completely random, with very high frequency motions, and as you go down the motions change. This is indicated by the change in the color here; the motions start slowing down, and that is reflected in the R2 values, which start going up. Then, as structures start getting formed, you see secondary structural preferences. In these areas there are no structural preferences, but here we get some. They are still not permanent structures; they are structural preferences. So you get a helix made here and the motions get further restricted, and as you go further down to 5 molar you see that more helices are formed transiently. They make and break, make and break; that sort of situation. Go to 4 molar, and one of these helices actually disappears, and these two helices are there.
When you go to 3 molar, all the helices have disappeared, and you see that the structure has no structural propensities. But the R2 values have increased at various places. This is the final structure of the protein, and what is colored in red are the places where the R2 values have gone up. That means these residues are coming in contact and going away, forming and breaking contacts. So there is exchange here; this is the kind of folding-unfolding transition happening here. These residues therefore experience an exchange process, and the exchange process results in an increase in the R2 values. The interesting thing is that at this point the protein is getting ready to form native-like structures. The earlier structures were non-native, therefore they were removed, and once this state is formed, these kinds of structural transitions happen inside the ensemble. Then, as you further reduce the urea concentration, you start seeing the beta sheet formed, the two helices are the native helices, and the protein forms a very stable structure at zero molar urea concentration, that is, in the absence of urea. So this is how you can monitor the residue-wise changes that are happening in any given protein.
Now here I show you an example of folding studied in a similar manner, in guanidine hydrochloride, but this is with HIV protease. HIV protease has intrinsic preferences here. These initial preferences are called foldons; these are the areas where folding can get initiated. This is a dimer, and this is called the flap. The two individual monomers are held together by a kind of beta sheet at the terminal here, where the C-terminals come close.
So these are the initial folding preferences in the protein. Then you initiate the folding, and you start getting more folding preferences; that is again indicated by the color code. You get the blue ones, then the green ones, then the red ones. On what basis? On the basis of the slowing down of the motions, as seen in the relaxation rates, the R2 values. Now you see that the changes that are happening are adjacent to each other, which means that when a particular portion of the protein folds, it induces the neighboring portions to fold, and this is called cooperativity. So there is cooperativity in the folding; this is clearly demonstrated in this example of HIV protease. This is the final structure of the protein. These are all beta sheets here. Of course I have shown everything on the same final structure; that does not mean these structures are present in the initial phases. These are the preferences which are present, the propensities of the protein to fold in that manner, and that is why they are indicated on the same structure. So it shows that there is cooperativity in structure formation. Initially there are preferences, and eventually they form the structures. This is the flap area, and this is the so-called active site of the protein; the active site is at the interface of the two monomers. The flap is also very dynamic, and this dynamics of the flap is responsible for the protein's function. The substrate has to come in here and then go out: the flap opens for the substrate to come in, the flap closes, and then the cleavage happens. This is a protease, right, so it is supposed to cleave proteins. The substrate comes in here, and once the reaction is over the flaps open and it goes away.
So the dynamics of the protein also plays a very crucial role in the biological function. I will illustrate this to you later with another example. This is how protein folding happens: you have initial preferences, and when you provide the folding conditions there is cooperativity in the folding transitions, and the protein eventually finds its way to the final folded structure. The final folded structure here has one helix and a whole series of beta sheets. There are two beta sheets here, from the two monomers, and they form the flap.
So far we have been looking at small proteins. Now we see how NMR can be used to study large protein assemblies. When I say large, I do not mean some 100 or 150 amino acid residues; these are of hundreds of kilodaltons to several megadaltons in molecular size. How do we study such large protein assemblies, and what is the problem for NMR? Here is a slide which shows the line widths of the protein signals according to the molecular size. What is plotted is line width on this axis, here for the amide protons and here for the N15, and on this axis the spectrometer frequency in megahertz. The various numbers indicated on the curves are the correlation times. The correlation time reflects the size of the molecule: 20 nanoseconds is a relatively small molecule, 60 nanoseconds is larger, and 320 nanoseconds is an extremely large molecule. These can be assemblies: multiple copies of the same protein can assemble together in an ordered form to form large assemblies. In such a situation we can see how the line widths change along both the proton axis and the N15 axis. The line widths are so large, you see, 50 hertz, 30 hertz and things like that. So these are very large line widths.
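To get a feel for how the correlation time goes with molecular size, here is a rough sketch in Python. It treats the molecule as a rigid hydrated sphere and uses the Stokes-Einstein-Debye relation, tau_c = 4*pi*eta*r^3 / (3*k*T). The partial specific volume (0.73 cm^3/g), the hydration layer (2.4 angstrom), the viscosity of water at 298 K and the function names are all assumptions made for illustration; none of them are taken from the slide, and real assemblies are not spheres, so the numbers are only indicative.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
N_A = 6.02214076e23    # Avogadro constant, 1/mol


def correlation_time_s(mass_kda, temp_k=298.0, viscosity_pa_s=0.89e-3,
                       specific_volume_m3_per_g=0.73e-6, hydration_m=2.4e-10):
    """Rotational correlation time (s) of a hydrated sphere (assumed model)."""
    # Volume of one molecule from its mass and an assumed partial specific volume.
    volume = specific_volume_m3_per_g * mass_kda * 1000.0 / N_A
    # Radius of the equivalent sphere plus an assumed hydration layer.
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0) + hydration_m
    # Stokes-Einstein-Debye relation for isotropic tumbling.
    return 4.0 * math.pi * viscosity_pa_s * radius ** 3 / (3.0 * K_B * temp_k)


def line_width_hz(r2_per_s):
    """Full width at half height (Hz) of a Lorentzian line with rate R2 (1/s)."""
    return r2_per_s / math.pi


if __name__ == "__main__":
    for mass in (20, 130, 900):
        tau_ns = correlation_time_s(mass) * 1e9
        print(f"{mass:4d} kDa -> tau_c of roughly {tau_ns:.0f} ns")
```

Under these assumptions a 20 kilodalton protein comes out at several nanoseconds and a 900 kilodalton assembly at a few hundred nanoseconds, which is the range of the largest correlation time on the slide. Since the line width is R2 divided by pi, and R2 grows with the correlation time for large molecules, the lines broaden as the size goes up.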
When that is the case it is very difficult to see the signals in the HSQC spectrum. Of course there is also a field dependence, which is also indicated here. You see the line widths are smaller at this particular frequency, which is almost 1 gigahertz. This makes the case for going to larger and larger spectrometer frequencies, especially if you are going to larger molecules. Of course, when you are dealing with small molecules it does not matter. For a small molecule with a correlation time of 20 nanoseconds, if you go from here to here the line widths have not changed much; it is about the same line width, so it does not matter. But as you go to larger molecules it becomes very crucial to go to higher magnetic fields, where you have a smaller line width. You see it is about 10 hertz here at 1 gigahertz; if you take 500 megahertz it is about 30 hertz, and if you go further down it goes to 50, and so on. Therefore, as you go higher in molecular size you also need to go higher in spectrometer frequency to get sharper lines. This has important implications, as we will see. It is true for both the proton line widths and the N15 line widths.
To make use of these different line widths depending on the spectrometer frequency, and to generate a spectrum with better line widths, a technique was developed which is called TROSY, transverse relaxation optimized spectroscopy. This was developed by Pervushin in the laboratory of Kurt Wüthrich, and it became extremely useful. The paper is titled "Attenuated T2 relaxation by mutual cancellation of dipole-dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution."
We will not go into the theoretical details of this. Basically it makes use of a kind of interference between the chemical shift anisotropy mechanism of relaxation and the dipole-dipole coupling mechanism of relaxation. The two go in opposite directions, and at some stage they sort of cancel to produce a very small line width; that is indicated here. Suppose you are recording an HSQC spectrum. On the proton axis the amide proton is a doublet because of the proton-N15 coupling, and the N15 is also a doublet with the same coupling, provided we do not decouple this amide proton-nitrogen coupling. If we do not decouple, the particular proton-N15 correlation peak will have four components. But these four components have different characteristics with regard to relaxation, and that is indicated here. Typically, in the normal HSQC spectrum, this is what we see: because we decouple, we remove this coupling and we remove this coupling, a single peak appears at the center here. That one peak is what we get in the normal HSQC spectrum, which is what I showed you earlier. If we do not decouple, the couplings are present in both dimensions and there are four components. All four components have very different line widths, and therefore you can see that their intensities are also very different. Among these, this one is the sharpest, these ones are relatively less sharp, and this one of course is the broadest. So the different components are different, and this is the point one makes use of in TROSY: let me pick up only this one and forget about the others; I will use a technique whereby I pick up only this component of the correlation peak.
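The cancellation between the two mechanisms can be put into numbers with a small sketch. It is a simplified picture, not the full theory: it assumes slow tumbling so that only the zero-frequency term matters, an axially symmetric N15 chemical shift anisotropy tensor lying along the N-H bond, and magnitudes only (signs are dropped). With those assumptions the two components of the N15 doublet have widths proportional to (p - delta)^2 and (p + delta)^2, where p is the dipole-dipole term and delta is the chemical shift anisotropy term, which grows with the magnetic field. The bond length (1.02 angstrom), the anisotropy (160 ppm) and all the function names are values assumed for illustration.

```python
import math

MU0_OVER_4PI = 1.0e-7        # T m / A
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752218744e8     # rad / (s T), proton
GAMMA_N = 2.7116e7           # rad / (s T), N15 magnitude (true sign is negative)
R_NH = 1.02e-10              # assumed N-H bond length, m
CSA_N = 160e-6               # assumed N15 shift anisotropy magnitude (160 ppm)


def dipolar_term():
    """Dipole-dipole term p (rad/s); it does not depend on the field."""
    return MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_N / (2.0 * math.sqrt(2.0) * R_NH ** 3)


def csa_term(b0_tesla):
    """Chemical shift anisotropy term delta (rad/s); it grows linearly with B0."""
    return GAMMA_N * b0_tesla * CSA_N / (3.0 * math.sqrt(2.0))


def relative_widths(b0_tesla):
    """Relative widths of the narrow and broad doublet lines and the decoupled line."""
    p, delta = dipolar_term(), csa_term(b0_tesla)
    narrow = (p - delta) ** 2        # the component TROSY selects
    broad = (p + delta) ** 2         # the component TROSY discards
    decoupled = 0.5 * (narrow + broad)  # decoupling averages the two
    return narrow, broad, decoupled


def optimal_field_tesla():
    """Field at which p equals delta, so the narrow component is sharpest."""
    return dipolar_term() * 3.0 * math.sqrt(2.0) / (GAMMA_N * CSA_N)


def proton_frequency_mhz(b0_tesla):
    """Proton Larmor frequency in MHz at field B0."""
    return GAMMA_H * b0_tesla / (2.0 * math.pi) / 1.0e6


if __name__ == "__main__":
    b_opt = optimal_field_tesla()
    print(f"cancellation near {b_opt:.1f} T, "
          f"about {proton_frequency_mhz(b_opt):.0f} MHz proton frequency")
```

With these assumed numbers the cancellation for N15 comes out close to 1 gigahertz proton frequency, which is in line with the field dependence on the slide. At lower fields the narrow component is still sharper than the decoupled peak, but the cancellation is only partial.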
Then you see what we get in this spectrum: this is the so-called TROSY spectrum, and you have only one peak here, which is much sharper and much larger in intensity. We will see an example immediately in the next slide. See here: this is a normal HSQC spectrum. It is also called COSY, because it is an N15-proton correlation spectrum, but this is the normal HSQC spectrum, which looks like this. In TROSY the same spectrum looks like this, because you have picked up the sharpest component of the multiplet. Notice therefore that you cannot afford to decouple the proton-nitrogen-15 coupling, either in the F1 or omega 1 dimension or in the omega 2 dimension. You devise the pulse sequence in such a way that you pick up only one component out of the four, and that is the sharpest signal; that is what is represented here. So you see how sharp these peaks are. Now, three cross sections are taken: this peak, this peak and this peak, and these three cross sections are shown here. You take a horizontal cross section at each of these positions in the COSY, so it shows the line width along the F2 dimension.
So here is the case of the proton. This is the COSY for all three; now look at the TROSY: the peak intensity is so much larger, the noise level is the same, and the line is so sharp. Similarly along the N15 axis: this is the COSY and this is the TROSY, with such a sharp line for each one of these. Naturally, if you do this, the resolution in your spectrum is going to be higher and you will be able to see a larger number of peaks. This is a very exquisite illustration of this gain in resolution. This is a large protein of 130 kilodaltons or something like that. This is the COSY spectrum, or the HSQC spectrum, and here is the TROSY: the peaks are so well resolved and sharp that you can count them, measure their intensities, and do all kinds of things. Therefore you can actually work with large proteins in this way. It may be a single protein, or it may be an assembly of smaller molecules which eventually leads to a larger molecular size; in either case one can use this.
Notice, however, that if you use TROSY at a very large spectrometer frequency for small molecules, you actually stand to lose. Why? You saw here that we are picking out one component of the four, whereas in the normal case we get contributions from all the components. If you are working with a small molecule, these components do not have very different intensities, so when you decouple you get a very sharp peak with high intensity, even at 1 gigahertz. But if you do a TROSY experiment on a small molecule of 10 or 20 kilodaltons, you are throwing away the intensities of all of these and picking out only one, so of course you lose intensity. Therefore we should not do that; you should use TROSY only when you are working with larger systems, and then go to higher spectrometer frequencies. And then of course one went further
to develop two more pulse sequences; this was done by the same group once more. These are called CRIPT and CRINEPT. They depend upon polarization transfer by cross-correlated relaxation in solution NMR with very large molecules. We will not go into the theoretical details of this, but I will simply show you the result that was obtained. This was an application to the 900 kilodalton GroEL-GroES complex. All of you know that the GroEL-GroES complex is a chaperone, and this is its structure. It was determined by X-ray crystallography, so the structure is known. It is not that you are determining the structure; the intention here is only to demonstrate the application of the technique, to observe signals even in such large complexes. You see, this is the structure, and the same thing is drawn out in this particular form. There are three different parts here: you have SR1, then you have this domain here, and then this domain here, all sitting on top of each other. There are multiple copies of a particular small protein; in this case there are seven copies of a small protein which is 10 kilodaltons in size, and one is trying to look at the signals from this particular protein. What you can do is reconstitute the complex with some portions labeled and some portions unlabeled, and then see what you get. This is the spectrum of the free assembly, without the other components, and you can see a reasonably good signal here. The moment this is bound to SR1, that is, when all of these are present together, you see that the signals have completely vanished in the TROSY. What this is trying to demonstrate is that where TROSY also becomes insufficient, you can use
the CRIPT-TROSY, and you see this is the spectrum which is recovered when it is bound to SR1, and this is when it is free. When it is free, of course, what you see in the CRIPT-TROSY has a much larger number of peaks compared to this one, so it has already shown a certain enhancement. But even when it is bound to SR1, bound here, you still have a large fraction of the peaks present. So this is how this technique can be used. This was published, as you can see, in Nature in 2002, and it was an important development. The point to remember is that you must use these techniques only when you need to work with very large molecules. With large protein assemblies it is possible to study the association process, and therefore you can stretch your application from small molecules to very large systems.
Going further from this, we would like to see how one can study self-association. In the previous case the associated state was already known: the structure is known, and how the association happens is known. So we know the pattern of association of the individual molecules, and the structure does not change from the individual molecules to the associated state, because the spectral peaks have the same pattern. That was one application, to demonstrate that you can actually see the signals even in large assemblies. Now, suppose you are working with a small molecule which associates. This is a very common phenomenon in protein NMR: whenever you express proteins, many of them tend to self-associate. How do we understand the self-association process? Does it lead to an association which is constructive or destructive? Is it functionally relevant or not? So self-association has to be studied as an important
phenomenon in biology. Therefore we try to see how one can study this, and I will illustrate it to you with a particular example. This is a process called endocytosis. What does that mean? If a small molecule has to be internalized, it comes here and is engulfed by the membrane, which forms a kind of encapsulated system. Then there is a protein which binds at the neck of this formation, and this protein's name is dynamin. This is the budding clathrin-coated vesicle. Dynamin forms a big rope kind of thing, and when you have GTP hydrolysis the vesicle gets separated out, and you have the molecule internalized. This is the process known as endocytosis. The interesting point here is the dynamin molecule: how does it work, and what is its role? This is a functionally relevant process. Now, what is dynamin? Let us look at the structure of dynamin, which is presented here. Dynamin is a protein of 864 amino acid residues. It has these domains: the GTPase domain, the middle domain, the PH domain, the GED domain and the PRD domain, five domains in all. This protein self-associates to form huge assemblies, several megadaltons in size; that is how it is able to form a big rope. When it forms a big rope it can wrap around the neck of the budding vesicle, and then with GTP hydrolysis it squeezes, it puts pressure, and the whole thing gets separated out. Among these domains, it is the GED domain which is responsible for the self-association of dynamin. So can we study this domain? Let us see. You can individually express these domains, produce them in solution, and
this is what you get. You take the GED, express it, and you see it has a nice structure. This is the circular dichroism spectrum, and it has a very beautiful helical signature. Of course this will not tell you what the size of the molecule is. And this is the electron microscopy image, transmission electron microscopy; you can see that the protein is forming particles of this size. That means these are extremely large assemblies, an associated state. Typically you get 20 to 30 nanometers in size here, which, if you translate it into molecular weight, will be several megadaltons, 10 to 15 megadaltons. With such a large molecular weight it should be impossible to get any signals. Now we look at the HSQC spectrum of this. How many peaks do you see? Only about 25 peaks, roughly. The GED itself runs from residue 618 to 752, so it is about 134 residues. Discount some prolines and things like that, and you should expect about 125 to 130 peaks. In contrast to that you are seeing only 25 peaks. So what is happening? Where are these peaks coming from? Obviously they are coming from a certain section of the protein which is flexible and therefore has a different correlation time, and so we are able to see the signals for these residues. And where are they? One can analyze these numbers: they come from a particular segment of the protein, at the N-terminal. Look at the residue numbers here, 3, 14, 21, 11, 25, 15; these are mostly in the N-terminal of the protein. Therefore the N-terminal is free, whereas the
rest of the protein gets into an associated state to form a large assembly. When it forms a large assembly you get electron microscopy pictures like this, and for all those portions which are in the interior of the assembly you will not see the peaks, because that forms a huge molecular mass which tumbles very sluggishly; the line widths will be very large and you will not be able to see those signals. Among the peaks you do see, there are particular types of residues. You can see a few glycines, serines and threonines, and then largely charged residues. So these are either residues with hydroxyl groups or charged residues: glutamates, aspartates, arginines, a tyrosine here, and one or two others such as valine. By and large these are mostly charged residues, and it is reasonable to expect that charged residues are on the surface of the assembly. Because of their flexibility, the copies from the different molecules do not show up as different peaks. Suppose there are some 10,000 molecules in the assembly: the N-terminal segments of all these 10,000 molecules look similar, so they produce only one peak, not many different peaks. What you are getting is the ensemble average. These residues are charged, therefore they are expected to be on the surface, and the ensemble average over all of them gives you one peak for each of these residues. You are able to see those peaks, and you can assign the individual residues by using the standard procedures which we have discussed earlier for small peptides or small proteins. Now, how do we understand this process? Our objective is to understand this assembly process. So we have to use
a particular strategy to identify this. I think I will take this up in the next class; we will stop here.