Dear students, we now start a new chapter on homology modeling. In homology modeling, we study how protein structures can be predicted from the primary sequence of proteins. As you already know, proteins are three-dimensional objects that function within cellular systems, and the structure of a protein gives it its function. If a protein is unfolded and in its non-native form, it cannot carry out the function it performs when it is folded. Moreover, because most protein structures are very difficult to determine experimentally, we have the field of homology modeling, or protein structure prediction, which employs computational strategies to predict the structure of a protein from its sequence.
Proteins have four levels of structure, often abbreviated 1°, 2°, 3° and 4°, that is, primary, secondary, tertiary and quaternary structure. The primary structure of a protein is essentially its amino acid sequence. The amino acid sequence is typically determined by Edman degradation or by mass spectrometry, so once you have an unknown protein with you, you can find its amino acid sequence using one of these two strategies. Next you have the secondary structure of proteins. Secondary structures include helices, beta sheets, loops and coils. These form when stretches of the primary sequence fold locally into these shapes. Next is the tertiary structure of the protein. Here the secondary structure elements, that is, the helices, beta sheets, loops and coils, come together in various conformations.
So once the secondary structures combine to take an overall form, this is the tertiary structure, or the overall structure of the protein. Then comes the quaternary structure, which forms when two or more complete protein chains come together and make a complex through protein-protein interaction. So these are the four levels of structure that proteins can have.
As I just mentioned, the experimental determination of these structures is very difficult and expensive. Two established strategies exist, X-ray crystallography and NMR spectroscopy, with which you can take a protein and find out its three-dimensional structure. However, since they are difficult and very expensive, bioinformatics comes to the rescue. Bioinformatics tools and algorithms can help predict the three-dimensional structure of a protein from its primary sequence.
We know that the primary sequence of a protein gives rise to its structure. So suppose you have two proteins. For one protein, both the sequence and the structure are known. For the other protein, only the sequence is known. If you take the sequences of these two proteins and compare them, and the comparison is good, that is, the sequences are similar, then you can infer the structure of the unknown protein by looking at the structure of the known protein. This is very important: it only works because the structure of the similar protein is already known. So this is essentially homology modeling, or a basic definition of it. Using such homology modeling approaches, it becomes possible to predict the structure of proteins whose sequence is known but whose structure is not.
So in conclusion, I would like to leave you with two important concepts. The first one is sequence identity and the second one is alignment length. Sequence identity essentially means that if you have two aligned protein sequences, how many amino acids match exactly between the two? So let's say five out of ten aligned amino acids are exactly the same between the two sequences, then the identity is 50%. Secondly, the alignment length is the number of positions over which the two sequences are aligned with each other, and this count includes the matches as well as the mismatches and gaps. And lastly, which combination of identity and alignment length gives you the most reliable basis for homology modeling? This is a question which we will deal with later.
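To make these two concepts concrete, here is a minimal sketch in Python. It assumes the two sequences have already been aligned by some alignment tool and that gaps are written as `-`. The function name `alignment_stats` and the example sequences are made up for illustration. It also assumes one common convention, namely that percent identity is the number of exact matches divided by the alignment length; other tools may divide by a different length, so treat the denominator as an assumption.

```python
def alignment_stats(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (matches, alignment_length, percent_identity) for two aligned sequences.

    Assumption: both strings come from the same pairwise alignment, so they
    have equal length and use `gap` to mark gap positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Alignment length counts every column: matches, mismatches and gaps.
    alignment_length = len(aligned_a)

    # A match is a column where both residues are identical and not a gap.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Assumed convention: identity = matches / alignment length.
    percent_identity = (
        100.0 * matches / alignment_length if alignment_length else 0.0
    )
    return matches, alignment_length, percent_identity


# Hypothetical example: five of ten aligned residues are identical.
print(alignment_stats("MKTAYIAKQR", "MKTAYLLEHG"))  # (5, 10, 50.0)

# Hypothetical gapped example: gap columns add to the length, not to the matches.
matches, length, identity = alignment_stats("MKT-AYIAKQR", "MKTWAY--KQR")
print(matches, length, round(identity, 1))  # 8 11 72.7
```

Notice that in the second example the gaps make the alignment longer without adding any matches, so the identity drops. This is why identity and alignment length have to be read together.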