 Genes are expressed as messenger RNA, and when we talk about gene expression we generally mean the messenger RNA sequences. We can measure those messenger RNAs with different techniques, such as microarrays or, more recently, a technique that is now well established called RNA-seq. So we can classify microarray and RNA-seq data as gene expression data, and gene expression databases store that data. Basically, these are public repositories for archiving and storing gene expression data. These data are MIAME-compliant. MIAME is not the city of Miami in the USA; it is an abbreviation for Minimum Information About a Microarray Experiment, and I have placed some links to it that you can look up on the internet. Whenever you submit a microarray experiment, you need to provide, for example, a description of how the samples were prepared, the normalization methods you used, and the raw counts and normalized files as well. In keeping with these rules, the data sets contain different kinds of records. GEO is convenient for depositing gene expression data, which different funding agencies require, and it is also a curated resource where we can browse, query, and analyze this kind of data. So we have a page for GEO, the Gene Expression Omnibus, under NCBI. You can go to NCBI and look into the GEO databases, which hold the different GEO data sets. There are expression profiles, where we can see the change in expression of genes across different treatments, and we can also analyze the expression data with a tool called GEO2R. We can also use BLAST there; we'll talk about that later.
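Besides browsing the web page, GEO can also be queried from a script through NCBI's E-utilities. What follows is only a minimal sketch, not something shown in the lecture: it builds the address of an `esearch` request against the `gds` database, which is the E-utilities name for GEO DataSets. The helper name `geo_search_url` and the example search phrase are assumptions made for illustration, and the code only constructs the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities esearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def geo_search_url(term: str, max_results: int = 5) -> str:
    """Build an esearch URL for a free-text query against GEO.

    geo_search_url is a hypothetical helper name chosen for this sketch.
    """
    params = {
        "db": "gds",             # E-utilities database name for GEO DataSets
        "term": term,            # free-text query, as typed in the search box
        "retmax": max_results,   # how many record IDs to ask for
        "retmode": "json",       # ask for a JSON reply instead of XML
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Example search phrase (an illustration only).
    print(geo_search_url("colon cancer RNA-seq"))
```

Opening the printed address in a browser, or fetching it with any HTTP client, should return a list of record IDs that match the query.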
GEO has four kinds of records, four kinds of data files, in keeping with the MIAME rules. GSM files store the sample information: how the samples were prepared, how the treatments were given, and how the experimental design was set up. Information about the platforms is stored in GPL files, so there we can see whether it is microarray data or RNA-seq data, and what kind of microarray it is. There are different protocols coming from different agencies, so we can get that information too. Sometimes different treatments are recorded as separate files, so GSE files are where related samples are put together in the form of a series; a series is a set of samples that are related in some way. GDS files are the data sets: curated collections that GEO assembles from the submitted series so that the samples in them can be compared. So again we are back on the GEO page, and if we look at the different types of data it has, towards the top right side, we can see records for series, records for platforms, and records for samples. If you look at the series types, there are several; for example, there is expression profiling by array, and in our course we will be getting some RNA-seq data, which comes under expression profiling by high throughput sequencing. In the same way, the other techniques used for measuring expression are listed below, and the number of data sets available for each is given in the column called Count. If you want to look for a data set, you can simply type a query; say, for example, we want colon cancer RNA-seq data. We just type that in the search box, it leads us to a set of matching records, we click on one of them, and we land on its record page. At the top is the information about that particular data set. Each data set is submitted as a series, and each series has a unique number.
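The four record types can be told apart just from the three-letter prefix of the accession number. As a small sketch of that idea (the function name `classify_accession` and the short labels are chosen here for illustration and are not part of GEO itself):

```python
import re

# Accession prefix -> kind of GEO record, as described above.
GEO_RECORD_TYPES = {
    "GSM": "sample",
    "GPL": "platform",
    "GSE": "series",
    "GDS": "dataset",
}

# A GEO accession is one of the four prefixes followed by digits.
_ACCESSION_PATTERN = re.compile(r"^(GSM|GPL|GSE|GDS)(\d+)$")


def classify_accession(accession: str) -> str:
    """Return the record type for a GEO accession such as 'GSE57043'.

    classify_accession is a hypothetical helper name used in this sketch.
    """
    match = _ACCESSION_PATTERN.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised GEO accession: {accession!r}")
    return GEO_RECORD_TYPES[match.group(1)]


if __name__ == "__main__":
    # 'GSE57043' is the series accession used as the example in this lecture.
    print(classify_accession("GSE57043"))
```

A lookup table keeps the mapping in one place, so the same dictionary can be reused when sorting a mixed list of accessions copied from a paper.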
For example, here we see GSE57043, which is the ID number for this data set. On this page we can get an idea of the experiment: the organism it comes from, the type of experiment, a short summary of the experiment, and the contributors' names, their publications, and addresses. This is a big page, so I have split it into two sections. Here we have the bottom part of the same page. Since all GEO submissions need to be associated with a platform, the platform information is here: these sequences were produced on an Illumina HiSeq 2000 machine, which you will learn about in later lectures. There are six samples in total in this data set, and the individual samples are listed together, labeled as GSM. We can also download the expression counts or values in different formats. The normalized counts are here in this box. Normally these are compressed files, and the big ones, the raw data, are available in a format we call SRA, the Sequence Read Archive, which stores the raw data. You need to mention this accession in your publications, since funding agencies and publishers require your data to be submitted and shared with the community. Here is an example of a publication in which the authors put the GSE number, so that other scientists can access the data using that ID number; if you are submitting a paper, you need to provide this information to the publisher. So we conclude that the Gene Expression Omnibus, GEO, is a public repository for archiving and retrieving gene expression data, and it is one of the main resources for obtaining microarray or RNA-seq data.
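Because the GSE number is the key to a data set, a reader who finds it in a paper can go straight to the record. Here is a minimal sketch that turns a series accession into the address of its GEO record page and of its download directory. The function names are chosen for illustration, and the directory layout (the last three digits of the number replaced by "nnn") is an assumption about how GEO organises its download site, which is worth checking against the GEO documentation. The code only builds the addresses and does not download anything.

```python
import re

GEO_ACCESSION_PAGE = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"
GEO_SERIES_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo/series"  # assumed layout


def series_page_url(accession: str) -> str:
    """Address of the GEO record page for a series accession."""
    if re.fullmatch(r"GSE\d+", accession) is None:
        raise ValueError(f"not a GSE accession: {accession!r}")
    return f"{GEO_ACCESSION_PAGE}?acc={accession}"


def series_download_dir(accession: str) -> str:
    """Address of the download directory for a series accession.

    Assumes series are grouped in folders named after the accession with
    its last three digits replaced by 'nnn' (for example GSE57nnn).
    """
    match = re.fullmatch(r"GSE(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GSE accession: {accession!r}")
    digits = match.group(1)
    folder = "GSE" + digits[:-3] + "nnn"
    return f"{GEO_SERIES_ROOT}/{folder}/{accession}/"


if __name__ == "__main__":
    print(series_page_url("GSE57043"))
    print(series_download_dir("GSE57043"))
```

Keeping the two addresses as separate functions means the record page can still be built even if the assumed download layout turns out to differ.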