So far we have talked about NMR methods, the various techniques developed over the years, and now we are ready to move on to the applications of these methods in structural biology, which is the main theme of this course, NMR in structural biology. We have prepared the ground with the methods; now we are going to see how they can be applied. For that we have to prepare a little bit of ground for the biology itself. What do we mean by structural biology? What sort of molecules are we going to look at? That will be the theme from now onwards, and we are going to spend a little time introducing the different concepts and the different biological molecules.
Everybody, especially the biologists, will know that the central dogma of molecular biology is described in this manner. Every cell has DNA, which sits inside the nucleus, and this is the hereditary material. Different individuals have different DNA, so heredity comes from this molecule: the offspring are similar to the parents because of the similarity of their DNA, and this is in fact why DNA is used to identify people. DNA stands for deoxyribonucleic acid. Nucleic acids are a kind of polymer, and DNA is transcribed into another kind of nucleic acid called RNA, ribonucleic acid. This process is called transcription. Different segments of the DNA are transcribed into particular RNA molecules, which we will describe a little later. The segments that are transcribed are called the genes, and the RNA they produce is called the messenger RNA.
So, this process is known as transcription, and from the messenger RNA you get what are called the proteins. There is a code in the RNA, called the genetic code, and it specifies what kind of protein has to be produced. You know that proteins are made up of amino acids. There are 20 different types of amino acids, and a set of 3 units of RNA codes for one particular amino acid; the 20 amino acids are coded by different triplets, and together these make up the genetic code. The process of expressing the information in the RNA as proteins is known as translation. Proteins are the biological machines. For a long time it was thought that only proteins play this role, but today we know that RNAs can sometimes carry out biological functions as well. Still, the general picture holds true: proteins are the real biological machines, and they carry out the biological functions inside the cell.
Now, to explain this a little more. DNA is a very, very long polymer, a huge molecule with approximately 10 to the power 9 units, and these units are called nucleotides. On this DNA there are certain small segments called the genes (there are a lot of other things in the DNA as well), and from the genes you get the RNA, which is called the messenger RNA. There are three different types of RNA: the messenger RNA (mRNA), the transfer RNA (tRNA) and the ribosomal RNA. If I draw the cell here, you have the nucleus, which contains the DNA, and this is where the mRNA is produced. The rest of the cell, outside the nucleus, is called the cytoplasm.
So, the genes are expressed in the form of mRNA. The other kinds of RNA are the tRNA and the ribosomal RNA, and the tRNA is involved in the translation process. So from here to here, from the gene to the mRNA, is transcription, and from here to here is translation, which generates the proteins. Ultimately our objective is to understand the structures and how these processes happen. Of course this is extremely complex: many different kinds of proteins are involved in every step. There is a kind of synergy in the actions of these various proteins and nucleic acids, and in order to understand that synergy we have to understand the structures and the dynamics of these molecules. This constitutes the main theme of structural biology.
There are also other molecules inside your cell. These are small molecules, which are the sources of energy, or metabolites. The food you eat is degraded into various kinds of small molecules, and then there are ions and things like that. But here we will be concerned with the macromolecules: the DNA segments, the RNA segments and the proteins. The ribosomal RNA is present inside what are called the ribosomes, and these are the machines where the proteins are actually synthesized. They are a kind of factory for proteins: the mRNA comes into the ribosome, the tRNA also plays its role there, and the ribosome itself contains many proteins. The whole thing is therefore a big assembly, and the structure of the ribosome is also an important thing to study.
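As a rough illustration of the flow just described (gene to mRNA by transcription, mRNA to protein by translation), the following Python sketch reads an mRNA three bases at a time using the standard genetic code. The function names `transcribe` and `translate` and the short example gene are made up for illustration, and transcription is simplified here to copying the coding strand with U in place of T; the real processes involve many proteins, as noted above.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per amino acid ('*' = stop),
# listed in the codon order UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Simplified transcription: same sequence as the coding strand, with U for T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA in triplets from the first base and stop at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCTTGGTAA"  # hypothetical 4-codon gene, for illustration only
    mrna = transcribe(gene)
    print(mrna, "->", translate(mrna))
    # 4 bases taken 3 at a time give 64 triplets for 20 amino acids plus stop
    print(len(CODON_TABLE), "codons,", len(set(AMINO_ACIDS) - {"*"}), "amino acids")
```

Since 64 triplets code for only 20 amino acids, most amino acids are specified by more than one triplet, which is why the table string above contains runs of repeated letters.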
So, one looks at the molecules individually: the nucleic acid segments, the mRNA segments, the tRNA segments and the proteins. Now, understanding these structures was of course an important topic. Several people had worked on it, and various empirical rules had evolved. On the basis of what is called fiber diffraction data, these three people proposed a model of the DNA structure, and they got the Nobel Prize in Physiology or Medicine in 1962 for it: for the discovery of the molecular structure of DNA, which helped solve one of the most important of all biological riddles, namely the replication process. Why is DNA the hereditary molecule? How is it that the offspring have features similar to the parents? Why is DNA so important, what is its structure, and how is the information transferred from the parent to the offspring and retained in the same manner? That was an important riddle in biology, and these people gave an answer to it in the form of the structure of the DNA molecule.
So what did they propose? On the basis of the fiber diffraction data they showed that DNA is actually a double helix. One strand is wound like this, and then you have another one winding along with it. I cannot draw this properly here, but this is the double helical structure they proposed, and it became an important contribution to understanding DNA. There are certain rules as to how the double helix is formed, and we are going to look at all of those. But they based their arguments on fiber diffraction data, and fiber diffraction does not give you information at atomic resolution.
It only gives the symmetry elements in the molecular structure: whether there are regular spacings between certain groups, and what kind of structural elements may be present inside the big polymer. Remember, this is a very big polymer. Among these people, Francis Crick and Maurice Wilkins were the crystallographers, the ones who interpreted the fiber diffraction data, and Watson was the one who actually built the model using ball-and-stick models. It turned out to be such an important contribution that they got the Nobel Prize in 1962.
Now we have to go into understanding the structures of these molecules in greater detail. Before we do that, let us look at these various biological functions a little more closely, as indicated here. This also makes an introduction to NMR, because it comes from an article in Analytical Chemistry on metabolomics and metabolic profiling, which also describes the various elements in structural biology. So you go from the genome to the metabolome. This is the DNA, which is also called the genome, and in a typical living cell there are about 30,000 genes in the DNA. These 30,000 genes are characteristic of the individual and define what is called the phenotype; of course, the other things here also contribute to the phenotype. Now, this genome is transcribed into what is called the transcriptome. You have 30,000 genes, which are expressed and produce 30,000 mRNAs, and this set of mRNAs is called the transcriptome.
This is very characteristic: every cell has its own transcriptome, so it becomes important to study the process going from here to here. Then, from these 30,000 transcripts you produce nearly 100,000 proteins, which means that one gene is giving rise to more than one protein. In other words, there is what is called mRNA splicing: the transcripts are altered by cutting and joining segments in different combinations, and that results in nearly 100,000 proteins in a particular cell. This is called the proteome. Certain changes also happen after translation, and those modifications are called post-translational modifications. These are important events in biology, and they are required for particular processes in biological function.
Then from here you have the metabolome. The proteome deals with the metabolome, that is, with the various small molecules that interact with the proteins. The metabolites include the small molecules into which the food you eat is degraded, and there are ions as well. There are nearly 40,000 small molecules present in a cell, and their levels are upregulated or downregulated depending on the disease condition and on the individual: on environment, diet, age, lifestyle, drugs and disease. Therefore one has to target various molecules in these categories. Sometimes you may have to target here, sometimes here, sometimes here, and all of this contributes to research in drug development and disease control. So it is a huge topic to study all of these.
There are many challenges, and NMR has now come of age to provide solutions to many of these situations. In order to see how one can apply NMR, we have to go into the details of the individual molecular structures. So let us begin with the DNA or RNA structure. As I said, DNA is a polymer made up of nearly 1 billion units, and these units are called nucleotides. The nucleotide is drawn in this manner. This is the sugar ring, a 5-membered ring as you can see, and attached to it is a base. There are 4 different types of bases, or 5 if you count both DNA and RNA, which we will describe soon: DNA has a particular base called thymine, whereas the corresponding one in RNA is called uracil. Then you have the phosphate group here, which forms the phosphodiester linkage. This is the monomer unit of the DNA, and it repeats itself throughout the polymer. What changes? The only thing that changes from one unit to another is the base, and the order of the bases is called the DNA sequence. That is the information carrier: the sequence of the bases present in the DNA. The chain also has a certain direction. We represent the chain direction as going from unit n minus 1 to n to n plus 1; it goes in this direction.
So we label the ends as 3 prime and 5 prime: this is the 3 prime and this is the 5 prime. The atoms of the sugar ring are labeled in this manner: this atom is called C1 prime, then C2 prime, C3 prime, C4 prime and C5 prime. The C5 prime of this unit is attached to the phosphate group here, and this phosphate group is attached to the C3 prime of the neighboring unit at this point. You see there is an oxygen in each case; the C3 prime attaches through its oxygen and the phosphate group to the 5 prime end of the next nucleotide, and so the chain goes on. One end is the 5 prime end, the other end is the 3 prime end, and the chain runs from the 5 prime end to the 3 prime end. One particular nucleotide unit, unit n, runs from its 5 prime end to its 3 prime end: we include this phosphate group in this unit, and the next phosphate goes with the next unit. When you write a DNA sequence, you typically write it in this manner: going from 5 prime to 3 prime, you write the bases present in the successive nucleotide units, and that is called the DNA sequence.
So this is the monomer of the entire polymer, and there are about 1 billion of these units. Now you can imagine how long this would be. Count the bonds along the backbone from here to here: 1, 2, 3, 4, 5, 6, 7. If you take the length of each bond as approximately 1.5 angstroms, then the length from here to here will be roughly 10 angstroms.
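The back-of-the-envelope estimate being built up here can be carried through numerically. The sketch below uses only the round numbers from the lecture (7 backbone bonds per nucleotide, about 1.5 angstroms per bond, about 1 billion nucleotides); the constant and function names are chosen for illustration, and the result is an order-of-magnitude figure for a fully stretched chain, not a measured length.

```python
# Round numbers taken from the lecture; all are rough estimates.
BONDS_PER_NUCLEOTIDE = 7      # backbone bonds counted across one unit
BOND_LENGTH_ANGSTROM = 1.5    # assumed typical bond length
NUCLEOTIDES_IN_DNA = 1e9      # about one billion units
CM_PER_ANGSTROM = 1e-8        # 1 angstrom = 10^-8 cm


def nucleotide_length_angstrom() -> float:
    """Approximate contour length of one nucleotide unit along the backbone."""
    return BONDS_PER_NUCLEOTIDE * BOND_LENGTH_ANGSTROM


def stretched_length_cm() -> float:
    """Length of the whole polymer if it were a completely linear chain."""
    return nucleotide_length_angstrom() * NUCLEOTIDES_IN_DNA * CM_PER_ANGSTROM


if __name__ == "__main__":
    print(f"one unit: about {nucleotide_length_angstrom():.1f} angstroms")
    print(f"whole chain: about {stretched_length_cm():.0f} cm")
```

With these inputs one unit comes to about 10 angstroms and the whole chain to about 100 centimeters, that is, of the order of a meter.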
Now, 10 angstroms multiplied by 1 billion is 10 to the power 10 angstroms. How long is that? 1 angstrom is 10 to the power minus 8 centimeters, so this is 10 to the power 2 centimeters, which is about 1 meter. So if the DNA were completely linear, stretched out like that, its length would be about 1 meter. However, the size of your cell is only of the order of microns, and 1 micron is 10 to the power minus 6 meters, and the nucleus is even smaller than that. So how is this entire DNA packed into that small space inside the nucleus? Obviously the molecule adopts a highly folded structure: all kinds of folding happen, and it is wrapped around in various ways so that it fits inside the nucleus. And it has to be unpacked when it is to be expressed: when the mRNA has to be produced, the DNA has to be unpacked, and then the proteins have to be produced. You can imagine how complex this whole process is. A lot of research therefore goes into trying to understand all of this, and how NMR can contribute is what we are going to see.
Now, this is a somewhat more detailed explanation at the atomic level. You can see the atoms are labeled here. These are the carbons of the sugar ring: C1 prime, C2 prime, C3 prime, C4 prime, and then the O4 prime, which is an oxygen in the ring. At this point you have the 5 prime carbon, and this carbon is attached to a phosphate group here.
So we have the 5 prime oxygen here, and that is where the phosphate attaches. This sugar is a deoxyribose. Deoxyribose means that at the C2 prime position there are two hydrogens, so this is a CH2 group. The C1 prime is connected to this ring oxygen, and the C4 prime is also connected to this oxygen. The C1 prime is also connected to the base; the base goes from here. The C3 prime is connected to an oxygen. In the case of RNA there is a small difference: the 2 prime position also carries an oxygen, as an OH group. So in RNA, ribonucleic acid, there is one hydrogen and one oxygen at C2 prime. In DNA the two hydrogens at C2 prime are labeled H2 prime and H2 double prime, and in RNA you only have H2 prime; you do not have H2 double prime. The other carbons also carry protons: on the C3 prime there is a proton called H3 prime, on the C1 prime there is a hydrogen called H1 prime, likewise on the C4 prime there is a hydrogen called H4 prime, and on the C5 prime there are two hydrogens, H5 prime and H5 double prime, which are better seen in this picture here. So there are a number of protons, and therefore one can use proton NMR to study the structure of these individual nucleotides.
Now, what are the bases? As I said, the C1 prime is connected to the base. There are four different kinds of bases, or rather five. These are adenine, guanine, cytosine and thymine, the four which are present in DNA. In RNA thymine is not present; it is replaced by uracil. Now, these two, adenine and guanine, are called purines.
This is the purine structure: a five-membered ring and a six-membered ring fused together. These others have only the six-membered ring and are called pyrimidines. So this is the basic pyrimidine entity of organic chemistry, and these two are the purines. Uracil is also a pyrimidine; it has two oxygens, and the same is true of thymine. The difference between thymine and uracil is that thymine has a methyl group where uracil has a hydrogen. There are also hydrogens at these ring positions, here, here and so on. So these are the structures of the bases, and they are shown here in a color-coded manner: the nitrogens are blue, the oxygens are red, the hydrogens are white and the carbons are green.
Notice that there are NH2 groups here: cytosine, guanine and adenine each have an NH2 group, and these carry the amino protons. There are also imino protons: guanine has an imino proton, that is, the ring NH group here, and thymine and uracil also have a ring NH. So as far as the nitrogens are concerned, there are amino groups, there are imino groups, and there are also some nitrogens which do not have a proton attached. In cytosine it is at this ring nitrogen that the sugar is attached: the hydrogen shown on this NH disappears and the nitrogen is bonded to the C1 prime carbon of the sugar ring. Similarly, the sugar is attached here in this base, here in this one, and so on. This is how the structures of DNA and RNA are built. This slide shows the same molecular structures of the bases in a more explicit manner. And this is a kind of summary of what I have explained so far, namely the primary structure of the DNA or the RNA.
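The bookkeeping of protons and bases described above can be collected into a small lookup, which is handy when thinking about which proton signals to expect. This is only a sketch: the dictionary and function names are invented for illustration, only the carbon-bound sugar protons are listed (the hydroxyl proton at the 2 prime position of RNA is left out), and the base protons are summarized just as "amino" or "imino".

```python
# Carbon-bound sugar protons, using the prime / double-prime labels from the lecture.
SUGAR_PROTONS = {
    "DNA": ["H1'", "H2'", "H2''", "H3'", "H4'", "H5'", "H5''"],  # deoxyribose: CH2 at C2'
    "RNA": ["H1'", "H2'", "H3'", "H4'", "H5'", "H5''"],           # ribose: OH at C2'
}

# Ring type of each base: purines have fused 5- and 6-membered rings.
RING_TYPE = {
    "A": "purine", "G": "purine",
    "C": "pyrimidine", "T": "pyrimidine", "U": "pyrimidine",
}

AMINO_BASES = {"A", "G", "C"}   # bases carrying an NH2 group
IMINO_BASES = {"G", "T", "U"}   # bases carrying a ring NH
ALLOWED_BASES = {"DNA": set("ACGT"), "RNA": set("ACGU")}


def describe_nucleotide(base: str, nucleic_acid: str) -> dict:
    """Summarize ring type and proton groups for one nucleotide."""
    if base not in ALLOWED_BASES[nucleic_acid]:
        raise ValueError(f"{base} does not normally occur in {nucleic_acid}")
    return {
        "ring": RING_TYPE[base],
        "has_amino": base in AMINO_BASES,
        "has_imino": base in IMINO_BASES,
        "sugar_protons": SUGAR_PROTONS[nucleic_acid],
    }


if __name__ == "__main__":
    for base in "ACGT":
        print(base, describe_nucleotide(base, "DNA"))
```

Note that guanine appears in both sets: it has an NH2 group and a ring NH, whereas adenine and cytosine have only the amino group and thymine and uracil only the imino proton.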
So the sequence goes like this. As I indicated earlier, you have a base here, and this is the ribose or deoxyribose ring, as the case may be. Only the carbon skeleton is shown; protons are shown only for the CH2 groups, but there are protons here, here and there which are not drawn. You have the phosphate group here, and this is called the phosphodiester linkage because there are two ester bonds to the phosphate. Every dinucleotide step has a phosphodiester linkage. This CH2 is the 5 prime carbon, and this is the 1 prime carbon, which, as I told you earlier, is attached to the base at a nitrogen. In the pyrimidines the sugar is attached at this nitrogen, and likewise in the purines at this nitrogen: in the purines the sugar is attached to the 5-membered ring, and in the pyrimidines it is attached to the 6-membered ring at this position. To repeat, this is the 1 prime position; this is the 2 prime, where in the deoxyribose you have 2 hydrogens; there is 1 hydrogen at the 3 prime and 1 hydrogen at the 4 prime; and this repeats itself all along the sequence. In a linear sequence, this is how the structure is made.
Now this is shown in a color-coded manner here, to show how the DNA sequence is formed. You have the 5 prime phosphate group here, and then the base attached to the sugar ring. The deoxyribose rings and the phosphodiester linkages together constitute what is called the DNA backbone, and the bases are indicated here: C, G, A, A, T and so on.
So the bases define the sequence of the DNA, and this is called the DNA backbone. Now, what Watson and Crick, building on the diffraction work of Franklin and others, proposed in their model is that DNA is not a single strand but a double strand, and that this is where the hereditary property comes from. The molecule exists as a double helix. In the double helix one strand runs like this, from its 5 prime end to its 3 prime end, and the other strand runs in the opposite direction, again from its 5 prime end to its 3 prime end. The two strands are held together by hydrogen bonding between the bases, and these are called the base pairs. You have a CG pair here, a TA pair here, a GC pair here and a CG pair here. In the model, C can pair with G and T with A, and nothing else. Of course this was based on information already available to them, namely that in DNA the number of Cs is equal to the number of Gs and the number of Ts is equal to the number of As. On that basis they put forward this model. C and G are paired through three hydrogen bonds, and T and A through two hydrogen bonds. So this is how the model goes: only certain bases will form pairs.
Now you see what happens if the two strands of this DNA separate. If a new DNA strand is synthesized on the basis of one existing strand, naturally the strand that is made will have the complementary sequence. If the sequence on one strand goes C T G C A, the complementary strand, read from its own 5 prime end, has to be T G C A G. Therefore the information is transferred from one molecule to the other.
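The pairing rule and the example above can be checked with a few lines of Python. This is a minimal sketch, with function names chosen for illustration: it applies the C with G and T with A rule, reverses the result because the two strands run in opposite directions, and counts hydrogen bonds using 3 per G C pair and 2 per A T pair.

```python
# Watson-Crick partners and hydrogen bonds per base pair.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(seq_5_to_3: str) -> str:
    """Return the partner strand, also written 5 prime to 3 prime.

    Each base is replaced by its partner, and the order is reversed
    because the second strand runs in the opposite direction.
    """
    return "".join(PARTNER[base] for base in reversed(seq_5_to_3.upper()))


def hydrogen_bond_count(seq_5_to_3: str) -> int:
    """Total number of base-pair hydrogen bonds in the duplex formed by this strand."""
    return sum(H_BONDS[base] for base in seq_5_to_3.upper())


if __name__ == "__main__":
    strand = "CTGCA"
    partner = complementary_strand(strand)
    print(strand, "pairs with", partner)
    print("hydrogen bonds in this duplex:", hydrogen_bond_count(strand))
    # Copying the copy regenerates the original: this is how information is retained.
    print(complementary_strand(partner) == strand)
```

Because every C on one strand faces a G on the other, and every T faces an A, the duplex as a whole automatically has equal numbers of C and G and of T and A, which is the base-composition observation the model was built on.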
If this is the parent molecule, then during replication the DNA strands separate, and each of them becomes the starting point for a new DNA molecule. When the new strands are synthesized, they maintain the complementarity of the base sequences, and that is how the offspring get the same DNA information. That is heredity.
Now this is shown a little more clearly here. You have the 5 prime end of one chain, which runs like this, and the second DNA chain runs like this. They are wound together in an intertwined fashion, and the base pairs form between the two strands. You have the phosphate backbone, and you have the base pairs between the two strands, which run in opposite directions. This is how the hereditary information is carried in the DNA molecule.
This slide shows explicitly how the hydrogen bonds are formed. Between A and T there are two hydrogen bonds. The oxygen of the thymine, this keto group here, pairs with a proton of the NH2 group of the purine, the amino group of adenine; and the imino proton of thymine pairs with a ring nitrogen of A. In the G C pair there are three: the amino group of cytosine goes to the oxygen of guanine, the oxygen of cytosine goes to the amino group of guanine, and then there is an N-H...N hydrogen bond from the imino proton of G to the ring nitrogen of cytosine. You can follow this from the color code.
The blue indicates nitrogen and the red indicates oxygen, so the picture tells you clearly what sort of hydrogen bonds are formed in the A T and G C pairs. This was the basic discovery that led to the Nobel Prize for Watson, Crick and Wilkins. So, we will stop here.