Welcome to the MOOC course on Introduction to Proteogenomics. In the previous lectures, you learnt how data is generated from various omics technologies. The amount of data is huge, and it is very challenging to draw meaningful insights from such big data sets. To understand mechanisms at multiple levels, data visualization tools make the job easier. For example, a tool can help a researcher find the correlation between a gene and its mRNA, protein, or even a microRNA. In today's lecture, we will take a look at LinkedOmics, an online tool that helps in the visualization and correlation of multi-omics data sets. So, let us welcome Dr. Bing Zhang for today's lecture.
So, first I will give you a brief introduction to the motivation and the basic functions of LinkedOmics, what we can do there. This is a tool that basically tries to bring the data and the tools together. Something like WebGestalt is just a tool, right? You have to provide your own data or results in order to do the analysis. The motivation for this project is that recent studies like TCGA and CPTAC have produced a huge amount of data, and for us, especially for those of you who work in the cancer area, this provides a very important resource to explore. But for ordinary biologists who do not know how to program, it is not easy to get access to the data and do the analysis. What we want to do here is to develop tools centered around this data resource and then allow everyone to access and use the data. Of course, with such a huge amount of data you can do a lot of things, right? But we are asking: what are the most typical questions a biologist will want to ask about these data sets?
Of course, one question a lot of people are interested in is survival. Let us say you have survival data; then of course you want to ask which mRNAs are associated with survival. Maybe you also want to do the analysis at the copy number level, to see whether any copy number changes are associated with survival, or you can do this at the protein level, to see which protein changes are associated with survival. You can do these analyses separately, but sometimes you also want to compare the results. Maybe you want to prioritize some biomarkers, and then you want to see whether there are genes that are commonly associated with survival at different omics levels, meaning copy number changes, mRNA, and protein are all associated with survival, right? Of course, you might also identify some unique things, like a gene whose protein is associated with survival while its mRNA is not. And it is not only survival. If you are interested in the biology, let us say even if you are not interested in cancer but you are interested in a particular microRNA, say microRNA-200c, then you want to see what the target genes of microRNA-200c are, right? We know a microRNA might inhibit gene expression through mRNA decay or by inhibiting translation. So we can correlate the microRNA expression with mRNA or with protein, and this gives us the mRNAs or proteins that are negatively correlated with the microRNA; those could be potential candidates for targets of the microRNA. And then you might be interested in a mutation, let us say a PIK3CA mutation, and you may want to see what the transcriptomic or proteomic consequence of the mutation is, right?
Let us say we have this mutation; maybe we want to ask which changes in the proteome, or even in the phosphoproteome, are associated with it. And of course, for all these analyses we end up with a list of genes or some statistical analysis results, and then we need to convert these to a pathway- and network-level understanding, so we need to do something like pathway enrichment analysis to better understand the results. So, altogether, we want to build a tool that can help users use this data to discover, compare, and interpret omics associations. You can start with any of these omics platforms or from the phenotype, and get connected to any other platform, and then you can also compare your results across cancer types or across platforms. In order to do that, we developed three modules in the system. The first is called LinkFinder. Basically, you start from any attribute you are interested in; like we said, it could be survival, microRNA expression, any gene expression, protein expression, or a mutation. Then you want to compare it against another space, for example, a mutation against the phosphoproteome, or a microRNA against the proteome or transcriptome, right? Depending on the data type of your query attribute and of your target search space, you have to choose an appropriate statistical test to do the analysis. I think Dr. Mani already talked about many of these tests and which type of statistical test you should use for which type of data. This provides a basic summary: if you have binary data as the query and your search space is continuous, then you use a t-test or a Wilcoxon test; one of them is parametric and the other is non-parametric. If both the query and the target are binary, then you use Fisher's exact test.
But if you are interested in survival as the query, then for continuous data you use the Cox model to do the analysis. All these tests have already been implemented in LinkedOmics, so you can just pick the right one, and the system can actually help by recommending the right test for your analysis. After you do the analysis, you get your overall result as a volcano plot. Here you have the effect size, for example the t statistic, on the x-axis and the minus log p-values on the y-axis. And then for individual genes you have the differential, correlation, or survival results shown as different types of plots. Then LinkCompare basically gives you some visualizations to compare the results from multi-omics studies or from pan-cancer studies. For example, if you have the mRNA, the whole transcriptome, correlated to microRNA-200c, and also the proteome correlated to microRNA-200c, you can use a scatter plot to compare the results; or, after you have some significant genes, you can use a Venn diagram to compare; or, if you have, let us say, survival results for many cancer types, then you can use a meta-analysis to compare the results. And the LinkInterpreter part is easy to understand now, because basically we use WebGestalt to do the interpretation. So, this is an overview of the system. It has the data from TCGA and CPTAC, and it uses LinkFinder to identify associations starting from one query attribute, which could be survival, microRNA expression, a mutation, or anything you are interested in. Then you define a search space and you get some results, and these results can be visualized. Then you can compare results across different platforms or across different cancer types using visualization or meta-analysis in LinkCompare. And the results from these two can be used as input to LinkInterpreter to generate pathway-level results.
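As a rough illustration of the test selection and the volcano plot axes described in the lecture, here is a minimal Python sketch. It is not LinkedOmics code: the names `TESTS_BY_DATA_TYPE`, `recommend_tests`, and `volcano_point` and the data-type labels are hypothetical, only the combinations named in the lecture are covered, and base-10 logarithms are an assumption, since the lecture only says "minus log p-values".

```python
import math

# Query/target data-type pairs mapped to the tests named in the lecture.
# Labels and names here are illustrative, not LinkedOmics identifiers.
TESTS_BY_DATA_TYPE = {
    # Binary query, continuous search space: parametric and non-parametric options.
    ("binary", "continuous"): ["t-test", "Wilcoxon test"],
    # Both query and target binary.
    ("binary", "binary"): ["Fisher's exact test"],
    # Survival query, continuous search space.
    ("survival", "continuous"): ["Cox regression"],
}


def recommend_tests(query_type, target_type):
    """Return the candidate tests for a query/target data-type pair."""
    key = (query_type, target_type)
    if key not in TESTS_BY_DATA_TYPE:
        # Other combinations are not spelled out in the lecture.
        raise ValueError(f"combination not covered in the lecture: {key}")
    return TESTS_BY_DATA_TYPE[key]


def volcano_point(statistic, p_value):
    """Return (x, y) for one gene on a volcano plot.

    x is the effect size or test statistic; y is -log10(p).
    Base 10 is an assumption made for this sketch.
    """
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    return statistic, -math.log10(p_value)


if __name__ == "__main__":
    print(recommend_tests("binary", "continuous"))
    print(volcano_point(2.5, 0.001))
```

Keeping the mapping in a plain dictionary makes it easy to see which query and search-space combinations the lecture covers and which it leaves open.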
We are going to use ovarian cancer survival-related genes as an example. The idea is that ovarian cancer has been studied by both TCGA and CPTAC, right? From TCGA we have copy number data and RNA-seq data, and from CPTAC we have proteomics data. Then we want to ask this question: which genes are correlated with poor prognosis in ovarian cancer, based on copy number, mRNA, and protein? It is an interesting question, but it is not that easy. If you want to do it by yourself, you first have to download the data from TCGA, and then you have to learn R to do the survival analysis, and so on. So, basically there are a lot of things you need to do in order to achieve this, but within LinkedOmics you can do it in probably just 20 minutes, assuming, of course, there is no internet traffic problem. So, let us go to the website. If you just google LinkedOmics, you should be able to find it. When you go to the website there are two options: you can enter as a guest or you can register for an account. You can do any of these analyses without registration; it is free, and the registration is also free. But the beauty of getting registered is that you can save your results in the database, so the next time you come back you do not have to repeat your analysis. For example, if you do an analysis today and you have an account, the next time you just log in and all your results will be saved in the system. But if you enter as a guest, the results will only be good for today, and the next time you come back you have to repeat everything. So, it is up to you: you can register now or later, or, if you want to make it easier today, you can just enter as a guest.
So, whether you enter as a guest or register, you come here, and the menu is shown on the left. If you click on the new analysis option, you will perform a new analysis, and this shows you the multiple steps that you need to go through in order to perform an analysis. If you click on the analyzed results option, this will show you all the results you have generated so far, right? Of course, as a guest user or a new user you do not have any results, but if you are a registered user with some results, when you log in next time you should be able to see them here. For example, this is my account, and all my analyzed results are saved here. I have a lot of analyses already performed, so I just click on one and I can retrieve the result. But you do not have any results yet.
I hope today you have learned that LinkedOmics comprises three modules. Module 1 is LinkFinder, which helps in comparing data from two attributes. For survival studies the Cox model can be used; it is very widely used in many publications. Module 2 is LinkCompare, which helps in comparing two or more data sets from LinkFinder. Module 3 is LinkInterpreter, which makes use of WebGestalt to interpret data from modules 1 and 2. In the next lecture, Dr. Bing Zhang will continue the hands-on session on the use of LinkedOmics. Thank you.