Now we turn to the main subject of this course, genome informatics. First, recall that genome sequencing provides us with the sequences of the whole genomes of organisms. A major application in bioinformatics is then to analyze those full genomes and to find the genes that are predicted to have particular functions.

Let us see what genomics is. NHGRI defines genomics as the study of all of a person's genes (the genome), including the interactions of those genes with one another and with the person's environment. So it is the study of genes and their interaction with the environment. Genome informatics can then be defined as the field in which computational and statistical techniques are applied to derive biological information from genome sequences. LoC4 defines genome informatics as including methods to analyze DNA sequence information and to predict protein sequences and structures from it.

Once we have genome sequences, one thing we can do in genome sequence analysis is discover and use sequence polymorphisms. Sequences vary from one another, so we can identify those polymorphisms and use them to help identify different traits of organisms. This gives us an opportunity to explore genetic variability between and within organisms.

In genome analysis we mainly perform the following tasks. We do sequencing first, and then assembly. Sequencing is done by breaking the whole genome down into short fragments, and once those fragments are sequenced we need to put them back together. That step is called assembly. Once we have assembled the genome, we try to find the regions that contain a large number of repeats, because assemblies tend to go wrong where there are repeats. So we need to find those regions.
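To make the idea of finding repeat regions concrete, here is a minimal sketch in Python. It is not how real repeat-finding tools work; it simply reports every substring of length k (a k-mer) that occurs more than once in a sequence, together with its start positions. The function name `find_repeated_kmers`, the toy sequence, and the choice of k = 4 are all assumptions made for illustration.

```python
from collections import defaultdict


def find_repeated_kmers(sequence, k):
    """Return {kmer: [start positions]} for every k-mer seen more than once.

    Hypothetical helper for illustration only: exact-match repeats,
    forward strand only, no handling of ambiguous bases.
    """
    positions = defaultdict(list)
    # Slide a window of width k along the sequence and record each start.
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start)
    # Keep only the k-mers that occur at two or more positions.
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


# Made-up toy sequence, not real genomic data.
toy_genome = "ACGTACGTTTGACGTA"
repeats = find_repeated_kmers(toy_genome, 4)
print(repeats)  # {'ACGT': [0, 4, 11], 'CGTA': [1, 12]}
```

A short fragment such as ACGT could be placed at any of its three positions here, which is the kind of ambiguity that makes repeats hard for an assembler.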
So it is important to inspect those assemblies while keeping the repeat-rich regions in mind. We will go over repeats in a later section.

After we have an assembled, finished genome, we can go on to gene prediction. We can find genes by using different patterns or features of those genes. We can also look at ESTs and cDNAs. Genes that are expressed are transcribed into messenger RNA; that mRNA is reverse transcribed into cDNA, and we can then look at where those cDNAs match in the genome. That tells us which regions are expressed, that is, the regions from which messenger RNAs are made.

In a similar way we can do genome annotation, in which we find the functions performed by different genes. We can do expression analysis: once we know the regions of the genome that are expressed, we can quantify how much those genes are expressed. Once genes are expressed, their products interact with each other and perform different metabolic roles in the form of metabolic pathways and networks, and we study that in genome analysis as well. Functional genomics is where we look at the functions performed by different regions of the genome under the control of different genes, and at the effect of changes in those genes, especially if we want to study genes related to diseases. We can also find the location of genes and map them onto the chromosomes.
That is called gene mapping. We can also do comparative genomics, in which we take one genome and compare it with another to find what is present in the first, what is not present in the second, what they have in common, and so on. Or we can identify clusters of functionally related genes: genes that may have similar structures and similar sequences and that perform similar functions. Knowing what those genes are can give us an idea about evolution, so we can do evolutionary modeling.

Sometimes we are interested in finding genes that are duplicated within the same organism. To do that, we use self-comparison of the proteome. The proteome is the collection of proteins derived from the genome, so the whole collection of one organism's proteins can be called its proteome. We compare it with itself and find the sequences that are duplicated in it.

Most of the time, the objective of genome sequencing projects is our own health. We are looking to correct some disease, to improve crop varieties so that we have more food, or to look for drugs against different organisms. So it is a good idea to have model organisms that help us study these processes in the lab. A range of model organisms has been selected so far. E. coli is a bacterium, then we have yeast, C. elegans is another example, we have a fly, and in this set we also have zebrafish, mouse, and Homo sapiens, that is, you and me. From the plants, Arabidopsis is considered a good model.

In this diagram we see the universal tree of life, which has been made with the help of the small-subunit ribosomal RNA.
It divides all living organisms into three groups. We have bacteria at the top; we have archaea, many of which live under harsh conditions; and then we have eukarya, which is the biggest among all. We pick model organisms from the important branches of this tree of life. For example, E. coli is right here; we pick yeast as an example of fungi; from animals we have worms, flies, fish and mice; and Arabidopsis, rice and soybeans are the examples from plants. So we try to choose organisms that are the best representatives of different classes, from important branches of this tree of life.

In the end, we conclude that genome sequencing paves the way for future discoveries, and that model organisms are the best source with which to study these genomes, so that we can then extrapolate those results to higher organisms.
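As a small illustration of the comparative genomics step described earlier (what is present in the first genome, what is missing from the second, and what the two share), the comparison can be sketched with Python sets. This assumes each genome has already been reduced to a list of gene identifiers that are comparable across organisms, which in practice is the hard part. The function name `compare_gene_sets` and the gene names are hypothetical placeholders, not real annotations.

```python
def compare_gene_sets(first_genome_genes, second_genome_genes):
    """Split two gene lists into shared and genome-specific sets.

    Hypothetical helper for illustration: identifiers are compared by
    exact name, with no sequence similarity search involved.
    """
    first = set(first_genome_genes)
    second = set(second_genome_genes)
    return {
        "shared": first & second,          # present in both genomes
        "only_in_first": first - second,   # present in first, absent in second
        "only_in_second": second - first,  # present in second, absent in first
    }


# Made-up gene identifiers for two imaginary organisms.
model_organism = ["geneA", "geneB", "geneC"]
higher_organism = ["geneB", "geneC", "geneD"]

result = compare_gene_sets(model_organism, higher_organism)
print(sorted(result["shared"]))          # ['geneB', 'geneC']
print(sorted(result["only_in_first"]))   # ['geneA']
print(sorted(result["only_in_second"]))  # ['geneD']
```

The shared set is the part of a model organism's gene content where results are most likely to carry over to another organism.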