So, this is a topic we actually get many questions about, from the beginning to the end, so we left it to the last section: how to integrate metabolomics with other omics, or multi-omics that includes metabolomics. What we would like to do in this last module is understand multi-omics study designs involving metabolomics, that is, what the common study designs are, because based on those we are going to introduce the different strategies for integrating them. In my lab, over the past decade, we have been studying how to integrate metabolomics with different omics, so we actually have a lot of tools. We are going to introduce some of them. There's no demo, but they are designed similarly to MetaboAnalyst, so you should be able to follow, and this should be easy.
We're doing metabolomics, but we know metabolomics is a relatively smaller field compared to genomics or the microbiome. And now there are new fields called foodomics and exposomics. Metabolomics is really unique: you can see it intersecting with a lot of these fields, and you cannot really understand biology without knowing metabolomics. On the other hand, it's also true that if we just focus on metabolomics and ignore genomics, the environment, and the microbiome, we won't be able to understand biology. So overall, the future is integration, multi-omics. I think metabolomics is hopefully the last missing link in this omics family, and we should be able to integrate and focus on the biology and the data rather than on the very raw data fundamentals.
So if we put metabolomics as the bait and think about what can be linked to it, since we are talking about systems biology: first, genetics, and how it influences the metabolome. There have been well-published studies, I think since 2008, on the concept of genetically influenced metabolites, or GIMs.
There are also microbially influenced metabolites, or microbially contributed metabolites. Some are unique, some are shared with the host. So those are the microbial MIMs. We also know that with diet, when we eat, some metabolites are going to change, and the same with exposures: for example, drugs and pollutants are going to show up in your blood and urine. So there are a lot of studies on exposure and toxicology. Overall, metabolomics sits at the center of a lot of these intersections and functions.
At the early stage, when we talk about integrating metabolomics with other data, the easiest and most intuitive one is integrating metabolites with genes. If I detect that these genes changed and these metabolites changed, can we see whether certain pathways changed? The thought is that if the genes are regulated and the metabolites change, then a pathway involving both the enzymes and the metabolites is more likely to be changed. So we put them together in a pathway and do the analysis. This is very reasonable, and actually a lot of people are doing it manually. What we have done is put it in MetaboAnalyst and let people do it through a web interface. One of the things we found hard, but managed to get done, is how to get peaks in there, because peaks can also be meaningful. How can we integrate peaks together with genes? This is called joint pathway analysis, and you should explore it if you have peaks or compounds and genes.
So, if we really think about joint pathway analysis, we have two types of input: significant genes from RNA-seq or microarray, and a compound or peak list. We want to project them onto pathways and do enrichment analysis. And that one is at the feature level.
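To make the feature-level idea concrete, here is a minimal sketch in Python. It is not MetaboAnalyst's implementation. It assumes a plain over-representation (hypergeometric) test, and the pathway names, feature IDs, and the `joint_enrichment` helper are all made up for illustration. Genes and compounds are pooled into one hit list and tested against pathways whose member sets contain both.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N features are drawn from M, of which n belong to the pathway."""
    upper = min(n, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / comb(M, N)

def joint_enrichment(sig_genes, sig_compounds, pathways, universe):
    """Pool genes and compounds into one hit list and test each mixed pathway."""
    hits = (set(sig_genes) | set(sig_compounds)) & universe
    results = {}
    for name, members in pathways.items():
        members = set(members) & universe
        overlap = len(hits & members)
        p = hypergeom_sf(overlap, len(universe), len(members), len(hits))
        results[name] = (overlap, p)
    return results

# Hypothetical toy data: 12 genes and 8 compounds form the background universe.
universe = {f"g{i}" for i in range(1, 13)} | {f"c{i}" for i in range(1, 9)}
pathways = {
    "pathway_A": ["g1", "g2", "g3", "c1", "c2"],  # pathway definitions mix genes and compounds
    "pathway_B": ["g7", "g8", "g9", "c5", "c6"],
}
sig_genes = ["g1", "g2", "g3", "g10"]  # e.g. from RNA-seq
sig_compounds = ["c1"]                 # e.g. from a compound list

results = joint_enrichment(sig_genes, sig_compounds, pathways, universe)
for name, (overlap, p) in results.items():
    print(f"{name}: overlap={overlap}, p={p:.4g}")
```

Note that four of the five hits in this toy input are genes, so the genes carry most of the weight in the result.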
So we can directly merge genes and metabolites into one single list and project it onto pathways whose definitions include both genes and metabolites. Then we can do enrichment analysis like for a single omics. We just merge them, mix them, and the library also contains both, so we can treat them as one. This seems simple and easy to do, and MetaboAnalyst will allow you to do that. The issue here is imbalance. A lot of the time transcriptomics is more complete: you can get a lot of significant genes, like hundreds, but only about 10 significant metabolites. If you do the integration this way, you will find the biology dominated by the transcriptomics. That's usually true.
The other approach is pathway-level integration. We can do pathway analysis on the metabolomics data and on the genomics or transcriptomics data separately, get significant pathways from each, and then integrate the pathways based on their p-values. So this is a later-stage integration, and in this case we can partly address the imbalance at the feature level. We cover it later because p-values are slightly more complicated. Overall, we're not saying which one is best or when to do what. We just tell you both are reasonable, but both also have shortcomings. You are welcome to try, and MetaboAnalyst just gives you a platform to try without writing code.
So what we try to address is this: one is feature-level integration, and if your features are comparable, it is more or less fine. If your features are not comparable, the result could be dominated by a single omics. The latter, pathway-level integration, is more similar to meta-analysis. Basically you calculate the p-values for pathways and integrate the p-values. And how do we integrate p-values?
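One way to picture the p-value integration step is the sketch below. It is only an illustration under assumptions: it uses Fisher's and Stouffer's methods, two standard meta-analysis techniques for combining independent p-values, with optional weights in the Stouffer version. The pathway names and p-values are invented, and this is not claimed to be how MetaboAnalyst implements it.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def fisher_combine(pvals):
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k df under the null."""
    k = len(pvals)
    half_x = -sum(log(p) for p in pvals)  # this is X / 2
    # Closed-form chi-square survival function for even degrees of freedom (2k).
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return exp(-half_x) * total

def stouffer_combine(pvals, weights=None):
    """Stouffer's method: average the z-scores, optionally weighted. Needs 0 < p < 1."""
    nd = NormalDist()
    weights = weights or [1.0] * len(pvals)
    z = [nd.inv_cdf(1.0 - p) for p in pvals]
    z_comb = sum(w * zi for w, zi in zip(weights, z)) / sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(z_comb)

# Hypothetical pathway p-values from two separate analyses.
metabolomics_p = {"pathway_A": 0.05, "pathway_B": 0.50}
transcriptomics_p = {"pathway_A": 0.05, "pathway_B": 0.01}

combined = {
    name: fisher_combine([metabolomics_p[name], transcriptomics_p[name]])
    for name in metabolomics_p
}
# Weighting the transcriptomics p-value 3:1 over metabolomics (an arbitrary choice here).
weighted_b = stouffer_combine([0.50, 0.01], weights=[1.0, 3.0])
unweighted_b = stouffer_combine([0.50, 0.01])
print(combined, weighted_b, unweighted_b)
```

The weights are a judgment call. In this sketch they simply let the data set you trust more pull the combined result harder.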
There are a lot of well-established methods, such as Fisher's method or Stouffer's method, and you can actually combine them. You can also integrate weights to emphasize the stronger p-values or the higher-quality data set. This is what is used in meta-analysis. In this case, when you integrate p-values, we can actually integrate peaks directly with genes, because, as we just mentioned, using mummichog we can use peaks directly to get pathway significance. In that case we get pathway p-values that we can integrate with the p-values based on genes from transcriptomics. So we are able to do pathway-level integration for untargeted metabolomics and transcriptomics. For targeted metabolomics you can go either way, pathway level or feature level.
We realized the challenges of analyzing different data types. We started with metabolomics and MetaboAnalyst, which I built during my PhD and have continued to upgrade and maintain, and it's getting better. There are also other tools we continue to update. The one below MetaboAnalyst is called ExpressAnalyst, which is for RNA-seq transcriptomics data analysis. We just published it in Nature Communications early this year, and it is really for RNA-seq, for both model and non-model species. Our focus is really not just on model organisms and human, because for human it seems a lot of things have been solved, especially for transcriptomics, but there are a lot of issues for non-model species: you need to do more, and annotate genomes or transcriptomes. That is very hard for non-model species. What we have done is use a much more efficient approach, so you can go directly into RNA-seq data analysis and get to the biology without doing the annotation and assembly yourself. If you study non-model species using RNA-seq, you should try this one.
And if you have used Blast2GO before, this is like a new generation of Blast2GO. The one at the bottom is called MicrobiomeAnalyst. We just published version 2 this year, and before that there was a Nature Protocols paper; basically we are actively supporting microbiome data analysis. It also has a lot of modules that are very, very close to MetaboAnalyst. These tools have thousands of citations and hundreds of thousands of users, so they are quite popular.
Beyond these mainstream omics, we also have a lot of other tools for other omics and multi-omics. For example, we have NetworkAnalyst for gene-based integration: protein-protein interactions, transcription factors, tissue-specific expression, a lot of systems biology for genes or gene lists. You should try NetworkAnalyst, which is also quite popular. If you think about Cytoscape for gene expression data, this is like that, but online, with a very good graphical interface and network visualization. For multi-omics, you can see in the second column that we have miRNA-based network analysis, and mGWAS, which I'm going to introduce and which is basically about how to integrate SNPs with metabolites. OmicsNet and OmicsAnalyst are the two general approaches: they allow you to integrate the signatures from different omics data, or to directly use data-driven approaches on different data tables. So they target different inputs and give you more flexible control. The last part is more about algorithms and databases. A lot of them are online, because we need annotations and knowledge bases to help with hyperlinking and understanding.
So my next focus is how we actually link metabolomics to genomics.
Before, I already mentioned joint pathway analysis, which aims to link metabolomics with transcriptomics based on significant genes and metabolites, or peaks and genes. Now the main part here is linking metabolomics with SNPs, with genomics. We focus on SNPs, which have downstream functions, and what the field does with genomics and metabolomics is called mGWAS: metabolite-based or metabolomics-based genome-wide association studies. They scan people's SNPs using a SNP array, basically genotyping everybody, and also measure their metabolomics profiles using some kit-based or array-based platform. Then they try to associate the SNPs, the genotypes, with metabolite concentrations. This is similar to a traditional GWAS, where you look for SNPs associated with disease phenotypes, but this time there are no disease phenotypes. Instead it uses metabolite concentrations as the endpoint, as a molecular phenotype.
There are about 65 studies published so far. We manually curated all of them, used their raw data, tried to apply consistent cutoffs, and put the studies together. Overall they have about 4,000 metabolites measured, and then there are around 2,300 metabolite ratios. Metabolites are important, but sometimes calculating a ratio can improve the signal. The number of SNPs is about 17.4 thousand. That is not the number of SNPs measured; it is the SNPs that have a significant association with metabolites. Here we are basically using the popular cutoff of 10^-8, which is a common cutoff for selecting significant associations in GWAS. You can see the studies mostly use blood, and also urine, saliva, and CSF, and one study is on mitochondria.
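The core mGWAS test for a single SNP and a single metabolite can be sketched roughly as follows. This is a simplified illustration, not the pipeline of any of the curated studies: it assumes an additive genotype coding (0, 1, or 2 copies of the variant allele), a simple linear regression with no covariates, a large-sample normal approximation for the p-value, and simulated data. The effect size, allele frequency, and function names are all hypothetical. The 10^-8 cutoff is the one mentioned above.

```python
import random
from math import erfc, sqrt

def snp_association(genotypes, metabolite):
    """Regress metabolite level on genotype dosage; return (slope, two-sided p-value)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_m = sum(metabolite) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (m - mean_m) for g, m in zip(genotypes, metabolite))
    slope = sxy / sxx
    intercept = mean_m - slope * mean_g
    sse = sum((m - intercept - slope * g) ** 2 for g, m in zip(genotypes, metabolite))
    se = sqrt(sse / (n - 2) / sxx)
    z = slope / se
    # Normal approximation to the t-test; reasonable for large n. erfc keeps tiny tails accurate.
    return slope, erfc(abs(z) / sqrt(2))

rng = random.Random(0)
n_people = 2000

def draw_genotype(allele_freq):
    """Simulate 0/1/2 copies of the variant allele."""
    return sum(rng.random() < allele_freq for _ in range(2))

snp_a = [draw_genotype(0.3) for _ in range(n_people)]  # simulated to influence the metabolite
snp_b = [draw_genotype(0.3) for _ in range(n_people)]  # simulated to be unrelated
metabolite = [0.5 * g + rng.gauss(0, 1) for g in snp_a]

CUTOFF = 1e-8  # genome-wide significance cutoff mentioned in the talk
slope_a, p_a = snp_association(snp_a, metabolite)
slope_b, p_b = snp_association(snp_b, metabolite)
print(f"SNP A: slope={slope_a:.3f}, p={p_a:.3g}, significant={p_a < CUTOFF}")
print(f"SNP B: slope={slope_b:.3f}, p={p_b:.3g}, significant={p_b < CUTOFF}")
```

A real study repeats this for every SNP against every metabolite or ratio, which is why such a stringent cutoff is needed.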
So overall there are a lot of studies, and they try to understand the genetically influenced metabotypes. When we put all of them together, we annotate where they were published, the paper, and how many samples. You can see the sample sizes are quite large; you can pick out the UK Biobank, UKBB, with about half a million. They are mostly Europeans, plus Amish, and some include South Asians. For each data set we keep its unique ID at the beginning, and you see the platforms: Biocrates, Metabolon, and there's a Q Exactive. There are a lot of differences in the platform used and the cutoff threshold, so we try to respect what each study used. Mostly it is 10^-8. But here, I guess because this is untargeted and you get more significant signals, they use a more stringent cutoff, 10^-12.
You can actually view all the results. Here we choose the one I just showed: blood, European origin, about 9,000 people. If you click, you will see the main findings. This is the chromosome position, shown by the colors, 21, 22, all the chromosomes, and the SNPs are organized by chromosome location, with the p-values and the metabolites associated with them. Here is the network view, where you can see some of the SNPs identified by ID, like rs296. This is not a gene, it is really a SNP ID, so you don't know clearly what it is, but it is a unified ID. They are associated with peaks; I think some peaks are annotated and some are not, just the m/z and RT. A lot of them really give you a hint that the SNP is functional, because the SNP change is associated with the peak change. This one is untargeted; I'm pretty sure it is better for targeted, where you get IDs.
Okay, so there are a lot of studies, and we are collecting all the data. We also want to help people understand how genetic changes actually impact metabolic changes. So the link is from SNPs to metabolites.
From SNPs to metabolites, there is the GWAS study and the statistical analysis of the association. We already know that at 10^-8 they are significant. We just want to know why: why do the changes actually impact metabolites? In between there must be a pathway or some regulation that passes this on, right? We also know the SNPs are associated with disease. There are a lot more publications, a ton of publications, on GWAS associating SNPs with disease. We also know SNPs are associated with genes, like eQTLs, quantitative trait loci. So genomics has a lot of data, and the SNPs have downstream effects. We also know proteins come from genes, and proteins act on metabolites, so there are a lot of signaling and interaction pathways in between. Some of them are transporters; actually a lot of the SNPs affect transporters, which affect metabolites. So overall this is rich information. We know they are associated, and we want to know why they are associated. This is what we are intellectually interested in.
So on one hand we would like to fill the knowledge gap with mechanistic hints about why they are linked. On the other hand, we also want to filter the statistical links, because there are a lot of significant connections; you remember, those are tremendously large networks. How can we move the signals from association toward causality? There is a lot of correlation, and only a few, much fewer, are causal relationships. So how do we get that? We always have this question. But when you have SNPs, there is actually a way to address causality. The method is called Mendelian randomization. It is actually very intuitive, because our genetic data doesn't change.
In a population, the different genotypes, the different alleles, are distributed among people randomly. That is very similar to a clinical randomized controlled trial. Genetic inheritance, through the random segregation of alleles, does the same thing. So you have the wild type with certain disease outcomes, and the variant with its outcomes. When you do this filtering and control the background, you can look at people who are the same except for this variant and ask what the chance is that they have the disease. If we can control the other variation using the genetic background, we know the only difference is the SNP. And if we know the SNP affects the phenotype through the metabolite, then we know the metabolite has a causal effect on the disease.
So that is Mendelian randomization. If we put it at scale, we know a lot of associations between genotype and disease from GWAS, and we know a lot of associations between genotype and metabolites from mGWAS. How do we know whether they are causally associated? By leveraging the many GWAS where we know the genotype is associated with disease. If the SNP affects the disease through the metabolite, we can filter out a lot of the confounders, which cause a lot of the correlations that are not causal, because the genotype doesn't change. So we use it as a background control. Fortunately, there are a lot of other studies on SNPs affecting phenotypes, and we can match them. So you have your mGWAS study, and you can find a GWAS study to control the background and do the test. This is called two-sample Mendelian randomization, and using it we can filter out a lot of the associations and identify which ones are most likely to be causal.
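For a feel of what a two-sample Mendelian randomization test computes, here is a small sketch. It assumes the inverse-variance weighted (IVW) estimator, which is one common choice and not necessarily the one used by the tool discussed here. The inputs are summary statistics: each SNP's effect on the metabolite (from an mGWAS) and on the disease (from a separate GWAS). All numbers and names below are invented for illustration.

```python
from math import erfc, sqrt

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted causal estimate from per-SNP summary statistics.

    beta_exposure: SNP -> metabolite effects (must be non-zero)
    beta_outcome:  SNP -> disease effects
    se_outcome:    standard errors of the SNP -> disease effects
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]  # per-SNP Wald ratios
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    p_value = erfc(abs(estimate / se) / sqrt(2))  # two-sided, normal approximation
    return estimate, se, p_value

# Hypothetical summary statistics for three SNPs used as instruments.
beta_metabolite = [0.10, 0.20, 0.15]  # from the mGWAS (sample 1)
beta_disease = [0.05, 0.10, 0.075]    # from a disease GWAS (sample 2)
se_disease = [0.01, 0.01, 0.01]

estimate, se, p_value = ivw_estimate(beta_metabolite, beta_disease, se_disease)
print(f"causal estimate={estimate:.3f}, se={se:.4f}, p={p_value:.3g}")
```

The estimate is the slope of the SNP-disease effects against the SNP-metabolite effects. When the per-SNP ratios agree with each other, as they do in this toy input, the signal is coherent across instruments.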
So this is mGWAS-Explorer. We just published version 2 two or three days ago. Version 1 really tries to understand what the possible links between SNPs and metabolites are and what the possible mechanisms between them are. Version 2 mainly focuses on Mendelian randomization: how can we filter the correlations and identify those more likely to be causal? I'm not going to go into a detailed demo, but the point is that if you have SNPs and you study humans, you can find a lot of information on this website, because it integrates a lot of GWAS, mGWAS, and disease phenotypes all together.
You can see some of the results, and they are not pathways but networks. Why do we use networks? Because there are a lot of links. You go through them and think about them, and some of them will be highlighted because they have more evidence. A lot of the others you can probably filter out. This is general for version 1, and in version 2 we have a filter so you can focus more on those more likely to be causal. We also performed large-scale studies. Here we show 800 metabolites against 236 disease phenotypes. These passed Mendelian randomization, so we feel they are more likely to be causal links. We also manually checked the top ones, and a lot of them have actually been confirmed by other literature. So it is very interesting to know that we can leverage genomics to improve the statistical analysis. When we have this very huge table, we just manually go through it, and some entries show very interesting links: for example, serine associated with atherosclerotic heart disease, and some carnitines associated with inflammatory disease.
You can see how many SNPs are associated with them and what the p-values are; the significance is very, very high. If we do a little research, we see a lot of anecdotal reports talking about the link. The link is likely to be true, but they are all anecdotes. Now we are talking about large-scale studies: we merge multiple studies, and we really found the significance to be very high. But we are talking about tens of thousands of these potential causal links, so we definitely have no time to manually go through them. This is published as a supplementary table, and you can go to the website to download it or go through it interactively yourself. You will see that when you go to large-scale studies, the number of potential leads is so high that it is overwhelming.
The other thing is that we also did a lot of tests for when you really study one particular metabolite and how it is associated with phenotypes. You can do the test with Mendelian randomization using the tools and see how many SNPs you can actually leverage to support this causal link, and see how coherent the signal is. The slope tells you what is likely to be causal or not, and the strength. So overall, if you are looking at SNPs, metabolites, and disease, this allows you to really go through it and focus on the evidence across multiple studies. The table I showed earlier is a summary, but if you are looking at an individual link you can go very deep and click to see which studies give this evidence, what they found, and how big the population is. For human studies you really need to match the population: Asian, African, European, North American populations have different backgrounds, and sometimes that makes a difference.
So that's SNPs with metabolomics. Now we talk about metabolomics with the microbiome.
This is actually one of the key focuses of my research, because when we talk about the gut microbiome, living inside our gut, the question is how it impacts our health, development, disease, and immunology. A lot of this happens through metabolites, bioactive metabolites. The microbes do have some physical contact through the membrane, but the majority is through bioactive metabolites. A lot of people believe so, and there are a lot of studies on this. Our group is unique in that we are more computational, so we do a lot of large studies and really want to see how we can improve this at a large scale.
We started from a smaller example. For example, tryptophan. David mentioned yesterday that tryptophan is very important; most of it we get from food, and gut microbes can do a lot of transformation of it. Some of the results are beneficial, some are actually detrimental. If we compare the microbes based on their potential to metabolize tryptophan versus their taxonomy, there is a huge difference. Taxonomy versus metabolic function is a huge difference. So to understand functional microbes, we need to study their function, their metabolites, rather than saying, oh, these belong to the same taxonomic genus, they must do the same thing. No, they are very different. Here we just show that they are quite different. We manually collected from the literature what the possible pathways are that the microbiome can carry out, based on genome-scale models, and based on this knowledge we built a very simple statistics-based model to predict how they are going to generate the different tryptophan metabolites.
We can do it across different genera, for human and mouse, and predict the different tryptophan metabolites. We can actually measure them, because our collaborators have the samples and can test them. We tested a few of them, and we also compared with popular tools for this kind of prediction, such as PICRUSt and Tax4Fun. We tried, and we can actually do a better job. So at least for tryptophan we know we have a very accurate model, and we can use the statistics to predict well compared to most other popular tools. So for this part we did the benchmarking and we also did the wet lab. In the end we went beyond tryptophan, and we know it actually works generally across a lot of metabolites. Predicting metabolites based on the genome-scale models really depends on which metabolites you are interested in: some metabolites are predicted very, very well, and some are still hard, which is true for a lot of other algorithms too. But overall it's very useful.
The tool is called MicrobiomeAnalyst. We just published version 2, and it's very comprehensive. You can upload your microbiome marker gene data, like 16S, 18S, or ITS for fungi. You can do shotgun data, and you can do enrichment analysis such as taxon set enrichment analysis. What we show here is how we can do metabolomics and microbiome data integration, and it is really based on what I just mentioned. When you upload your metabolomics and microbiome data together, you can get a much more meaningful result. If you look at this PCA, you see the separation between disease and healthy control, and you can see the top microbes that drive the groups. And what is their potential for generating the metabolites?
Here we're plotting a heatmap of the top most significant microbes and their potential. You can see they stand out with this unique potential to generate metabolites. It's very different from the other microbes that are not significant. So when you see that, you know the metabolic potential is what makes them the drivers, the very significant contributors to this separation. This is called mechanistic insight. When you integrate metabolomics with the microbiome, you get the prediction, and because you actually measured the metabolites, since you have metabolomics, you can see the consistency between the prediction and the measurement. So overall this is quite a meaningful result.
The other thing we can do is this. When we try to understand the functions of microbes, we use KEGG pathways, but a KEGG pathway is really an aggregated pathway. If we know the composition from 16S, we can customize the pathways, because if you don't have the enzymes, you don't have that pathway. We can reduce the KEGG pathways by almost half based on the 16S composition. Then we use the more accurate KEGG pathways for the functional analysis, and you get a more accurate pathway enrichment result. With this more accurate enrichment result, if you click on a compound, you will see which microbes have the highest potential to generate it, and you can look at the candidate list to see the potential across the overall data. The omics data analysis tools, including MetaboAnalyst and MicrobiomeAnalyst, allow you to get quickly from your large-scale data to at least some functional insight. They do not give you a definite answer, like this is this compound or this is this microbe.
The tools were never designed for that. They are designed to give you the most likely hints, and it's up to you to do the next step: targeted analysis, growing the microbes, or isotope labeling to really find out whether it is true or not. A lot of people ask, how can I be 100% sure it is this one and not that one? The tool is not designed for that; for that you need to do the wet lab. Here we just give you the leads. This is the most likely candidate, but it cannot tell you with 100% certainty, because omics is exploratory: it gives you information and insight and helps prioritize your different hypotheses.
The last thing we are going to talk about is multi-omics more generally: how do we link metabolomics to exposomics? Exposomics is really multi-omics, because with exposure we are talking about a lot of omics to study. So what does exposomics data look like? Here's one of the published studies. If you look at the omics data, you will see epigenomics, which is the DNA methylation data; gene expression data; proteomics data; and metabolomics data, both serum metabolomics and urine metabolomics. So this is naturally multi-omics, but it is more than multi-omics. It also has external exposures from the environment, like air pollution, noise, traffic, air, and water. So this is a lot. Basically, if you study exposomics, you need to understand the environment and also the personal side: some exposures are related to chemicals, and others to diet, smoking, physical activity, sleep, and social factors. There is a lot of emphasis on health, not just preventing disease. So we need to monitor the environment, monitor personal lifestyle and exposure, and put that together with multi-omics data to actually improve the health outcome. And the health outcome: what are the related measures?
They include birth weight, BMI, and some other things. So overall, how do we understand this multi-omics data within the context of the environment and personal choices, and link them to outcomes, so that we can maybe change some modifiable behaviors to improve health and longevity? The understanding here is that multi-omics is quite core for exposomics.
The other thing is what we call complex metadata. I mentioned in our last lab session that you really need to think about complex metadata. MetaboAnalyst already supports it, and our other omics tools also support complex metadata, because complex metadata has become the norm. If you study humans, you don't have full control and you have to consider a lot of factors. If you use MetaboAnalyst or our other tools, you can upload a metadata table. A metadata table is very noisy. It's not like a table generated by a machine, such as gene intensities or metabolite concentrations. A metadata table is compiled manually by physicians or field researchers, so it usually has a lot of errors. We cannot just do missing value imputation or try to fix it automatically. Instead, we flag some of the columns for people to edit. It sounds easy, but it turned out to be quite hard to automate; we managed to get it done. Here you can see a metadata table: a lot of the columns are unordered, and some have an order, so you can order them. We also allow people to view the metadata graphically to see patterns and make corrections. So overall, for multi-omics or exposomics data, complex metadata has become the norm. So do not think metabolomics is very complicated. It is not. What's coming is more complicated: big data. We need to embrace that and consider it.
So if we have multiple metadata factors, how can we control for them? For example, here we show that if we want to test the significance between healthy control and disease, we may need to control for age, BMI, and other things such as glucose tolerance, some proteins, inflammation, or disease stage. For this we have tools called linear models. They are similar to a t-test but much more flexible: they allow you to embed a lot of factors in the analysis, see what the changes are, and see the factors or potential interactions. Overall, we need to consider this, and there are tools to help you do it. If you really go into the field, you don't have control, and multiple factors are really the norm.
So how do we actually combine different multi-omics data? I showed you before joint pathway analysis, Mendelian randomization with mGWAS for SNPs to metabolites, and also microbiome with metabolomics. But how do we do it in general? If we don't have tools, it is user driven: we analyze the individual data sets, and you synthesize and integrate them in your mind, because you read the papers, you understand, and you come up with a story. This is what people usually do. But we can actually automate it. One approach is called knowledge driven. If you have a knowledge base, you can map your results onto it and see whether they are telling the same coherent story. ChatGPT is actually doing something like that. The thing is that we don't have the brain power to memorize everything, while the knowledge base holds a lot of knowledge. So we try to automatically map our data and find the patterns in it, to see whether the multi-omics data sets have similar patterns of change. If the patterns are consistent, they could potentially be meaningful.
So, here is knowledge-driven integration. The idea is really connecting the dots. For each omics dataset, you run a significance test and get a list of significant genes, metabolites, or proteins. Those are the seeds, because they are potential biomarkers, potentially important points. But how are they related to each other? We talked about putting them into joint pathway analysis, but pathways sometimes cover only a small percentage of the molecular space, so we need something bigger. The bigger thing is a network. A network is naturally broader: you can put in protein-protein interactions and regulatory relationships, many of them identified from large-scale studies. So overall, the network is a much broader net that can catch a lot of things and complement the pathways. And we can still do pathway analysis within the network, no problem. Within this big net, we hope all the individual dots can be connected with each other. Sometimes they are directly connected; sometimes they are not, and they need to walk one more step through a neighbor to be connected. Why do they need a neighbor? Sometimes the neighbor was simply not measured, so we did not see it. Sometimes the neighbor's change is subtle, so it did not pass the threshold, but it is still very important. So overall, with a network you can find modules. Hopefully your signals converge on a module where a lot of things have changed, and you focus on that module and its function. Once you identify a module, you can do enrichment analysis, just like what your TA showed you: you see a heatmap, you see some patterns that change a lot, and you want to understand what the functions are. Here it is the same thing.
This is not a heatmap but a network, and we can identify modules that have a lot of changes. Then we perform enrichment analysis on them and see what they are about. That is network analysis. Next is data-driven, or model-driven, integration. Suppose we are not working with a model species, or we have microbiome data and so on. We just want to see how two omics datasets share patterns of change. We don't know the biology and we don't use prior knowledge; we just use the two data matrices themselves. This mainly uses dimensionality reduction, similar to PCA, but we try to rotate the results and see whether they match, and if they match, how similar they are. Here we show microbiome and metabolomics data analyzed with an approach called Procrustes analysis. It is a very common one and the simplest, and it tries to identify coordination between microbes and metabolites. There are more advanced ones; this just shows the simplest, to give you an idea of the dimensionality-reduction, PCA-based approach.

Now it is time to introduce the tools. The overall design is this: the raw data is big and noisy, but for individual omics we have more or less standardized and streamlined the analysis. For LC-MS, we use MetaboAnalyst to get functions and significant features. For RNA-seq, we use ExpressAnalyst to get significant genes. For 16S, we use MicrobiomeAnalyst to get microbiome tables, significant features, and taxonomies. Then we can put them together using OmicsNet, which is knowledge-based, network-based integration, or use OmicsAnalyst, which is data-driven and basically uses dimensionality reduction to put them together and show the common patterns. So, here are more slides on OmicsNet. OmicsNet is really designed to be intuitive and easy to use, with very good graphical support.
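To make the "rotate them and see whether they match" idea from the Procrustes discussion above concrete, here is a minimal sketch. It assumes each omics table has already been reduced to a 2-D ordination (for example by PCA), and the ordinations themselves are simulated; the function name `procrustes_fit` and all data are hypothetical, and this is not the implementation used by any tool named in the talk.

```python
import numpy as np

def procrustes_fit(reference, target):
    """Align target to reference by translation, scaling and rotation/reflection.

    Returns the aligned target and the disparity: the residual sum of squares
    after alignment, between 0 (same shape) and 1 (no shared structure).
    """
    a = reference - reference.mean(axis=0)
    b = target - target.mean(axis=0)
    a = a / np.linalg.norm(a)          # scale both configurations to unit size
    b = b / np.linalg.norm(b)
    u, s, vt = np.linalg.svd(b.T @ a)  # orthogonal Procrustes solution
    rotation = u @ vt
    aligned = s.sum() * (b @ rotation)
    disparity = float(np.sum((a - aligned) ** 2))
    return aligned, disparity

# Simulated 2-D ordinations for the same 30 samples (hypothetical data).
rng = np.random.default_rng(1)
n_samples = 30
microbiome_2d = rng.normal(size=(n_samples, 2))

# A "metabolome" ordination with the same sample structure, but rotated,
# rescaled, shifted, and slightly noisy.
theta = np.deg2rad(40)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
metabolome_2d = 2.5 * microbiome_2d @ rot + 3.0 + 0.05 * rng.normal(size=(n_samples, 2))

# An ordination with no relation to the microbiome samples.
unrelated_2d = rng.normal(size=(n_samples, 2))

_, d_shared = procrustes_fit(microbiome_2d, metabolome_2d)
_, d_unrelated = procrustes_fit(microbiome_2d, unrelated_2d)
print(f"disparity, shared structure: {d_shared:.4f}")
print(f"disparity, unrelated data:   {d_unrelated:.4f}")
```

A low disparity means the two omics layers arrange the samples in nearly the same way once rotation and scale are factored out. In practice the significance of the disparity is usually judged by permuting the sample labels, which is left out here.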
The idea is that you start with seeds, your dots, from the individual analyses: SNPs from genotyping, significant genes, proteins, metabolites, even peaks, as well as miRNAs, transcription factors, and microbes. You put them together, and the tool searches our online knowledge base to find out how they are connected. Sometimes the connection is direct; sometimes it needs a neighbor, walking one step further, to find the connection. Eventually it connects your dots and establishes the edges. Suddenly they are not individual dots anymore but connected dots, and once they are connected, they become a network. Now you can do a lot of network analysis. Network analysis by itself is almost an art, because most people just use visualization plus enrichment analysis to get a feel for the data. We understand that network analysis is important, but we also want to engage users, so that you use your biological background to really see which parts are more important. So we spent a lot of time on 2D and 3D views and on making the layout structurally more meaningful and intuitive. Some of what you see here is the default: if you just go there and do a little work, you get a similar result with good visualization, and you can make it more artistic if you like. Some people really want to spend a lot of time on it. It is all directly on the web; you don't need to install anything. The engine we use is one that people use to design games. I thought we could use the same thing to design visualization for science, and it is actually working well. The tool is similar to what we have done with MetaboAnalyst: we have interfaces that allow you to input different data types. You can input one, two, or three lists, I think up to five, each according to its type; if you input five, you get five lists.
Next you use a database. We have multiple databases for different purposes, and you just select one. Each input type is connected to its own databases, so eventually you use a database to build the connections for each input. Hopefully some nodes will be shared between inputs, not common to all, but enough to bridge different dots and make them connected. So for each input you create a network that expands slightly beyond the seeds. After creating the individual networks, you merge them, all five if you have five. Sometimes the result can get very big, just because the seeds are so far apart that you have to walk multiple steps to connect them. The farther apart they are, the less likely they are to be related, so you need some control. If they are too far apart, you trim the network: we only want direct connections, or connections through one step. One step is what we call the first-order network: you take one step to find a partner that connects you with other dots. This is common practice. If you are new to network analysis, you need to read a little of the literature on how networks are built. I could give a whole workshop on that, but here we can only tell you what the common practice is; it is all well established and accepted. You will get a lot of networks. Most of the time you find one big continent, and most of the genes and metabolites of interest will be on it, and then there are a lot of small satellites, often isolated nodes. It is like a typical distribution: we always find the same pattern, one huge continent with a lot of small islands. Most of the focus goes to this huge continent, and we can always visualize, explore, trim, and customize it.
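The seed expansion and the "continent and islands" pattern described above can be illustrated with a toy example. This is a sketch under stated assumptions: the interaction list `KNOWN_EDGES`, the node names, and the trimming rule (keep a non-seed neighbor only if it bridges at least two seeds) are all hypothetical and chosen for illustration; they are not taken from OmicsNet.

```python
from collections import deque

# Hypothetical knowledge base of known interactions (undirected).
KNOWN_EDGES = [
    ("geneA", "geneB"), ("geneB", "metX"), ("metX", "hub1"),
    ("geneC", "hub1"), ("hub1", "metY"),
    ("geneD", "far1"), ("far1", "far2"), ("far2", "metZ"),
    ("geneE", "geneF"),
]

def build_adjacency(edges):
    """Map every node to the set of its direct interaction partners."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def first_order_network(seeds, adjacency):
    """Keep seed-seed edges, plus edges to neighbors that bridge >= 2 seeds."""
    seeds = set(seeds)
    kept = set()
    for seed in seeds:
        for neighbor in adjacency.get(seed, ()):
            is_bridge = len(adjacency[neighbor] & seeds) >= 2
            if neighbor in seeds or is_bridge:
                kept.add(frozenset((seed, neighbor)))
    return kept

def connected_components(nodes, edges):
    """Breadth-first search; returns components sorted from largest to smallest."""
    adjacency = {node: set() for node in nodes}
    for edge in edges:
        u, v = tuple(edge)
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node] - seen:
                seen.add(neighbor)
                queue.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)

# Significant features from several omics layers (hypothetical seeds).
seeds = ["geneA", "geneB", "metX", "geneC", "metY", "geneD", "metZ", "geneE"]
network_edges = first_order_network(seeds, build_adjacency(KNOWN_EDGES))
components = connected_components(seeds, network_edges)

print("continent:", sorted(components[0]))
print("islands:  ", [sorted(c) for c in components[1:]])
```

Here `hub1` was never measured, yet it is pulled in because it connects three seeds in one step. `geneD` and `metZ` would need three steps to reach each other, so they stay as islands instead of inflating the network.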
This is something I spent two minutes on, just to spread it out and reduce the overlap. You can change the background. On the left are the individual nodes; if you click on one, it is highlighted in the network here. You can also do path analysis and module analysis. Here you identify a connected network, and within it you can find more densely connected regions and do functional analysis on them. OmicsNet allows you to create a customized network, view individual nodes, do module analysis and functional analysis, and zoom in to find more. So overall, you need to think hard about what story you are getting. There is a lot of flexibility, and if you change a certain step, you will get some differences in the result, but this is expected. A lot of people like it. Why? Because it allows them to think creatively, to think back and forth, and really find out what the data is about. This is why we designed the tools to be open-ended, so you can think more about which story is most likely based on the data and what your next step is going to be. It cannot be that fixed. And here is some art. When you get to a certain stage, you really want to relax your brain, and you can get art like this; you can try different layouts, and it looks quite interesting. And this is 3D. We really wanted to look into whether we can improve our visualization from 2D to 3D. As I mentioned, if we can play 3D games, we can do the same thing and have a 3D network. In each network we can have modules, with each module wrapped inside a bubble, so it becomes more simplified. We can click a bubble to bring it into focus, and all the background fades away. Then we can rearrange it, extract it, and see what is inside. You can see here one node connected with a lot of things, probably the major regulator.
So you see the pattern, you zoom in, and you do a more focused analysis. In this sense, we found that even with a complex multi-omics system, we can make it simpler and still get what we want. The hidden goal, actually, is that we wanted to put this into VR: we want to see and touch it in a virtual world. So the goal was not just 3D, but eventually we did not proceed with that. The goggles, at least so far, are still not ideal, and they are expensive. On the software side, we have already explored it and it seems doable. We have parked it for now, and hopefully Apple, Facebook, or Google will give us some surprises in the next few years. All of this 2D and 3D is available if you use OmicsNet. This is what we published last year; you can get very good visualization results. It is an online tool. We started developing it almost 10 years ago, actually trying to compete with Cytoscape. I don't think Cytoscape can do 3D, and we can do 3D easily, and you can drag things around. It is well maintained and very easy to use.

Next we talk about OmicsAnalyst. We are going to have something new out soon. It is data-driven: we want people to upload their data matrices, integrate them using dimensionality reduction, and view them in the same space to see the shared patterns. There are a lot of algorithms built in. For individual omics, we still need to process them and identify significant features, so we can use these as bait and see where they fall in each omics layer. But overall, we directly use the whole data matrices, do some normalization, clustering, and dimensionality reduction, and visualize them together.
So overall, we have the correlation network, the heatmap, and the 3D scatter plot, which visualizes the dimensionality reduction result. One thing we decided for OmicsAnalyst is that it does not do omics-specific processing and normalization. We realized that individual omics, including metabolomics, microbiome, transcriptomics, and epigenomics, each have their own algorithms for normalization and statistical analysis. We respect that, so we only allow people to upload normalized data, which we found is more flexible, and a lot of people actually like it. You can upload up to five datasets. Then you can visualize them pairwise, or you can integrate them all together using one of several dimensionality reduction algorithms. Some of you may have tried one of them, called DIABLO, which is a reasonably good one; we found that it can sometimes identify very meaningful patterns that separate the groups across the omics layers. Here we show how the individual omics separate the groups when the multi-omics data are integrated together. We can see the scores and the loadings, and see which features actually drive the separation. The small bubbles here are the scores, drawn without the individual data points, just to show the group separation. The arrows are the features, showing how they drive the separation. And here you can see more: how things separate and where they are located when we put all of them together. This is called a biplot. It is similar to PCA, just for multi-omics, with more advanced matrix manipulations. And here we show the correlation networks, how the features are correlated.
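As a rough illustration of the cross-omics correlation network mentioned above, the sketch below correlates every feature in one normalized table with every feature in another and keeps the strong pairs as edges. It is plain pairwise Pearson correlation on simulated data, not DIABLO and not the OmicsAnalyst implementation; the feature names, the simulated effects, and the 0.7 threshold are assumptions for the example.

```python
import numpy as np

def cross_correlation(x, y):
    """Pearson correlation of every column of x with every column of y.

    Both tables have samples in rows, in the same order.
    """
    xz = (x - x.mean(axis=0)) / x.std(axis=0)
    yz = (y - y.mean(axis=0)) / y.std(axis=0)
    return xz.T @ yz / x.shape[0]

def correlation_edges(x, y, x_names, y_names, threshold=0.7):
    """Return (x_feature, y_feature, r) for pairs with |r| >= threshold."""
    corr = cross_correlation(x, y)
    rows, cols = np.where(np.abs(corr) >= threshold)
    return [(x_names[i], y_names[j], float(corr[i, j])) for i, j in zip(rows, cols)]

# Two already-normalized tables for the same 50 samples (simulated).
rng = np.random.default_rng(2)
n_samples = 50
genes = rng.normal(size=(n_samples, 5))
metabolites = rng.normal(size=(n_samples, 6))

# Plant two real relationships: one positive, one negative.
metabolites[:, 0] = genes[:, 0] + 0.2 * rng.normal(size=n_samples)
metabolites[:, 1] = -genes[:, 1] + 0.2 * rng.normal(size=n_samples)

gene_names = [f"gene{i}" for i in range(genes.shape[1])]
met_names = [f"met{j}" for j in range(metabolites.shape[1])]

edges = correlation_edges(genes, metabolites, gene_names, met_names)
for gene, met, r in edges:
    print(f"{gene} -- {met}: r = {r:+.2f}")
```

With real data, the threshold would normally be paired with a multiple-testing correction, since the number of feature pairs grows with the product of the two table widths.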
Once you identify some features, the good thing is that if you understand the biology and you are comfortable with high-dimensional dimensionality reduction, you can go back and look at the correlated features, and you find that a lot of them are telling the same story. It is very satisfying when multiple independent algorithms identify features that more or less point to the same relationships; it gives you much more confidence. That is why we chose to give you both correlation and dimensionality reduction. Multi-omics analysis can still give you false positives, but when you use different approaches and see that they tell the same story, you gain more confidence. So, these are the last few slides. As I mentioned about five slides ago, we tried 3D and we wanted to try VR, but it seems the goggles and VR devices will not be ready soon. I had hoped they would be ready by now, but unfortunately progress is slow. But now we have new toys: we really want to get into ChatGPT and chatbots, and they seem very promising. The point is that we can design an interface that lets you upload data and run the analysis, but we could actually do the same through conversation, as long as the model understands what you are doing and confirms it with you. If you try something that is not possible, it will give you a warning and tell you not to do that. So overall, omics data analysis has become very standardized. If you have attended the lectures from this morning until now, you should feel the pattern, and this pattern of knowledge is very generalizable. So if we take ChatGPT or another large language model and train it to do this, it can definitely understand. Natural language is more flexible, and when you refine it, or confine it to metabolomics, let's say, it can get very accurate.
So overall, a few years from now we should be able to get there. Whether we can get 3D interaction like this guy, I don't know. But at least, even for multi-omics, I think it will be easier than what we experience today. I think in three years we should get there. Thank you.