So, this is our last module, module 8, and it's sort of a wrap-up. We'll talk about a few other tools and techniques that might be of interest, but we'll also be talking about a lot of applications. And the idea is to try and leave you with, I guess, a sense of excitement, perhaps, of what's possible in the field of metabolomics. Obviously, I'm sort of preaching to the choir. A lot of you are quite engaged in metabolomics already. So, there's this progression I talked about, where we go from spectra to lists, from lists to clusters, then from clusters to pathways, and then eventually to, I guess, ultimately integrating that into understanding genes and proteins and more of systems biology. That progression is one of the things we're going to talk about a little bit more. We're also going to address some of the applications of metabolomics to clinical medicine. Obviously, there are many applications outside of medicine, but typically for a field to get more than a toe-hold in the world, people like to be able to show that there are some applications to health. And metabolomics has struggled, as has proteomics. On the other hand, genomics has had fairly smooth sailing and has received a bit of support. We'll talk about biomarkers and a concept called receiver operating characteristic curves, or ROC curves. This is an important area that people are starting to appreciate more, but it's still largely unknown. We'll learn a little bit about metabolomics applications in the pharma industry and then some new trends in metabolomics. So, in the last session, we learned a little bit about how we were able to go from lists to pathways and identify important pathways and learn a little bit about topology, but we also learned about the statistics that allow us to identify groups of metabolites that are significant. And that's important: you have to be able to identify those compounds that are significantly altered before you can really think about pathways and metabolism.
We've talked about pathway databases before and actually you've seen the links to both KEGG and SMPDB. We've talked a little bit more about the SMPDB database and I won't reiterate it, just that there's a lot that you can do with it. And it's similar now to the Cyc databases, where there's this capacity to do concentration mapping. Pathways really allow you to integrate metabolism with proteomics, genomics, and the concept of systems biology. So, from pathways and lists, there's obviously some hope, and this is the central focus, I think, of a lot of systems biology, of actually modeling systems: modeling metabolism, predicting things like the dynamics and kinetics of metabolism, predicting flux and flux balance. Bernhard Palsson, who's in San Diego, is quite well known for developing metabolic reconstructions. And in fact they've done a metabolic reconstruction very recently of human metabolism. They did one on human metabolism called Recon 1; they're now on Recon 2. And I think the number of compartments is now up to 15, and the number of compounds has almost doubled, I think. But this is a systems biology map of human metabolism, where all the reactions are mass balanced and charge balanced. Compartments are carefully identified. All proteins and enzymes are identified. The transcripts and genes are also mapped. And it was a major effort, both for Recon 1 and then the recent one, Recon 2, which was published a few months ago and had something like 30 different authors. Palsson is famous in particular for using these metabolic reconstructions, of both humans and a number of single-cell systems, for doing flux balance analysis, or FBA. And he has invented a term called bibliomic, or bibliomics, where they do literature surveys and searches to start from the gene lists, to build the protein lists, to build the reaction lists and the metabolite lists, and to do validation and iterative debugging.
But this process actually has allowed them to identify gaps in metabolic pathways and also allowed them to model metabolism. They don't need specific reaction kinetics or binding constants. What flux balance analysis does is set things up to figure out what the inputs are, what the outputs are, whether everything is balanced, and what happens when you perturb the balance. So it's sort of like if you could imagine the traffic flow on the 401 or other busy roads, and the feeder highways into it, and then if someone has an accident, what's the consequence? Or there's a flood, what's the consequence? And that's what flux balance can sort of do, because there's a flux of traffic, a flux of cars, and they enter and leave and there are various pathways that are followed. Flux balance analysis is widely used. Another thing that you can do is actually more kinetic modeling. Some years ago we worked on a thing called SimCell. If anyone has ever heard of SimCity, it's the sort of simulation where you're building a city and taxing people and then building amusement parks and everything else. I don't know, has anyone heard of SimCity? Just one with a thumbs up, yeah. But it's a very popular game, and in fact it's picked up again, I think, because they keep on improving it. But what it does is it's based on a concept called cellular automata. So you're not actually solving differential equations, partial differential equations or stochastic differential equations. You're just letting computers sort of flip coins or roll dice and perform actions, and you're letting it all happen sort of simultaneously. These are called agent-based model systems, and some agent-based modeling is used in some of these virtual worlds, if anyone's ever tried those. And you can apply it to modeling cells, and you can model a lot. You can model metabolism, you can model cell signaling and interactions and drug interactions.
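To make the "flip coins or roll dice" idea concrete, here is a minimal sketch of a dice-rolling simulation of a single enzymatic conversion. It is not SimCell and makes no claim about how that tool works internally; the function name `simulate_conversion`, the parameters and the per-step probability are all assumptions chosen for illustration, and positions on a grid (which a real cellular automaton would track) are left out.

```python
import random


def simulate_conversion(n_substrate=200, n_enzyme=5, p_convert=0.02,
                        steps=100, seed=1):
    """Toy agent-based model: substrate -> product, one dice roll per molecule.

    All names and numbers here are hypothetical. No differential equations
    are solved; each substrate molecule simply gets a random chance per step
    of being converted, and that chance grows with the number of enzymes.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    substrate, product = n_substrate, 0
    history = [(substrate, product)]

    # Probability that at least one of the enzymes converts a given molecule
    # in one time step (assumes independent encounters).
    p_step = 1.0 - (1.0 - p_convert) ** n_enzyme

    for _ in range(steps):
        converted = sum(1 for _ in range(substrate) if rng.random() < p_step)
        substrate -= converted
        product += converted
        history.append((substrate, product))
    return history


if __name__ == "__main__":
    trace = simulate_conversion()
    for step in (0, 10, 50, 100):
        s, p = trace[step]
        print(f"step {step:3d}: substrate={s:3d} product={p:3d}")
```

Running it with different seeds gives slightly different traces, which is the point of a stochastic model: the average behaviour resembles a kinetic curve, but individual runs fluctuate.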
And for this particular tool we built an interface that allows you to draw out cells and draw out sort of simple pathways. This is one, I think, for the tryptophan operon in E. coli. You can set some numbers, choose values that are in the literature or choose random numbers. You can identify genes, enzymes, transporters, metabolites, literally draw pathways. And in fact the intent was to tie this to SBML, the Systems Biology Markup Language, and tie it to SMPDB, so you can actually do real modeling. It turned out to be a lot harder than we planned, but, as I say, the vision is to take an SMPDB diagram and translate that to a SimCell diagram through what's called the Systems Biology Markup Language. And then compartments are modeled, and flux and kinetics and dynamics and conversion can be modeled using essentially stochastic cellular automata. We're in the process of converting all the SMPDB pathways to SBML, which would then allow us to use SimCell, but, as I say, it's taken a long time. So through modeling it is possible, and people are starting to do this more and more. It gives us almost mathematical insight into metabolism. It's fundamentally what systems biology is supposed to be about, which is to be able to predict and model events at an atomic scale. It's pretty cool, although no one's really found any practical applications for it yet. On the other hand, there are lots of practical applications for metabolomics, and this is a list that we saw way back yesterday, just some of the applications that people are using metabolomics for. And if you were to measure it in terms of millions of dollars or thousands of jobs, it's quite significant. It's just that most people don't attach the word metabolomics to it. So I'll just discuss some of the applications to clinical medicine. Are there questions? So the people who work on the analysis of coffee, the flavor of the coffee, that is metabolomics?
Yeah, it is, although they're probably called flavor chemists or something else, anything but metabolomics. And I don't think anyone should feel offended because they don't want to use the term. I mean, most of us can't even pronounce the word metabolomics. But the concept is that you're using analytical chemistry to look at a bunch of different things simultaneously and trying to interpret it in a biological context. So, applications to clinical medicine. In fact, metabolomics really got its start in the characterization of inborn errors of metabolism. They're called IEMs. And you guys, I think, did a little exercise last night where you probably picked up on phenylketonuria being evident in a particular sample. But it's a case where you can use databases like HMDB, where either lists of masses or lists of analytes or metabolites can allow you to identify compounds that are substantially up-regulated or down-regulated and then associate those with metabolic disorders. You could put in LC-MS peak lists and, again, the same sort of thing is possible. On the browse-diseases function: the update made a few weeks ago to 3.5 unfortunately took this browse-diseases search off, but I've asked them to put that back on. But it's an example where, if you put in a list of metabolites, it will give you the likely diseases based on the number of hits. So that's an example of how you can use a database and just simple matching. It's kind of a trivial example, but this is something that does inform people and has been used to inform in the past. You know, there are 600 diseases that we know about that have significant metabolic perturbations. No one's going to memorize them. And given that many inborn errors of metabolism involve changes to levels of metabolites that none of us can even pronounce, it is helpful to have these kinds of references online and readily searchable. That's one example.
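The "simple matching" idea, ranking candidate disorders by how many of the submitted metabolites hit each one, can be sketched in a few lines. This is only an illustration: the lookup table below is a tiny hand-made example rather than data pulled from HMDB, and the names `TOY_DISEASE_TABLE` and `rank_diseases` are hypothetical.

```python
# Hand-made toy table (NOT HMDB content): disorder -> metabolites that are
# typically elevated in it. Real databases hold hundreds of such entries.
TOY_DISEASE_TABLE = {
    "Phenylketonuria": {"phenylalanine", "phenylpyruvic acid", "phenyllactic acid"},
    "Maple syrup urine disease": {"leucine", "isoleucine", "valine"},
    "Alkaptonuria": {"homogentisic acid"},
}


def rank_diseases(observed, table=TOY_DISEASE_TABLE):
    """Rank disorders by the number of observed metabolites they match.

    Returns a list of (disease, hit_count, matched_metabolites), best first.
    Matching is by exact lower-cased name, which is a simplification.
    """
    observed = {name.strip().lower() for name in observed}
    hits = []
    for disease, markers in table.items():
        matched = sorted(observed & markers)
        if matched:
            hits.append((disease, len(matched), matched))
    # Most hits first; ties broken alphabetically for a stable order.
    hits.sort(key=lambda h: (-h[1], h[0]))
    return hits


if __name__ == "__main__":
    altered = ["Phenylalanine", "phenylpyruvic acid", "valine"]
    for disease, count, matched in rank_diseases(altered):
        print(f"{disease}: {count} hit(s) -> {', '.join(matched)}")
```

A real search would also need to handle synonyms, mass tolerances for peak lists, and the direction of change (up or down), none of which this sketch attempts.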
Another interesting example was something I became aware of a few years ago, and this is the relationship to diabetes progression; that should be type 1, not type 2. Type 1 diabetes is sometimes called juvenile diabetes, and it's been a real puzzle. In fact, it's largely thought to be almost a purely genetic disorder. But there are unusually high levels of type 1 diabetes in Finland, and they have a lot of concern over that. What is it? Are they eating something, is there an exposure, or is it just unusual genetics there? And this particular study was looking at young patients where the parents or siblings had a family history of type 1 diabetes. And what they found with this particularly long-term study, where they were doing some broad analyses, was that children who develop type 1 diabetes are almost born with unusually high levels of glutamic acid in their blood. Which, you know, so what? It's something that we can use and metabolize. But then what happens is that as the glutamic acid levels start to normalize, they also see another large increase, in GABA, shortly after, as the person is getting older. And then what typically characterizes type 1 diabetes are what are called auto-antibodies to an enzyme called GAD, and it's glutamate, something, something, decarboxylase, which is an enzyme that is exclusively produced in the pancreas. And evidently it seems that this particular enzyme has to be pumped up in order to start dealing with these very high levels of glutamate and GABA, and the body doesn't like seeing these high levels of this protein, and it thinks the protein is foreign, so it creates antibodies to it. So the antibodies attack this enzyme, but in the process they attack the pancreas and eventually destroy the pancreas.
So it's this balance of the body essentially trying to mitigate the very high levels of glutamate, responding by producing high levels of an enzyme, which the body then responds to by producing antibodies to get rid of the enzyme, which inadvertently leads to diabetes. So the cause of type 1 diabetes may in fact be a metabolic imbalance very early on, with very high levels of glutamate, and this is quite striking because most people are instead looking at the antibodies or looking at the pancreatic dysfunction. Now, there are multiple causes of type 1 diabetes and this is not the only one, and there are other examples where the pancreas can be destroyed by various toxins and other conditions, but it is fascinating to think that there is in fact a metabolic control mechanism, where the body is trying to mitigate the activities of enzymes through antibodies, ultimately leading to a disease. HMDB, as you guys have perhaps seen and I'll emphasize again, has a lot of information linking diseases with metabolites, and even some of the stuff about glutamate has probably made it into the database, and you can use that, and could or should use it, as a resource to learn a little bit more about the disease associations of metabolites. Metabolomics can also be used in applications for bacterial identification. This obviously could be used in environmental metabolomics, but it's being used in the clinic, and it's essentially trying to identify, or more rapidly identify, bacteria based on their chemical footprint.
So we've got rapid methods for sequencing bacteria, but you still have to grow enough of them, often in culture, and that takes a little while. But if someone had a urinary tract infection and you wanted to know what kind of bacteria is causing it, or whether it's a pathogen of one type or another, you can look for very specific metabolites that are produced by very specific strains of the bacteria, and this was actually implemented a few years ago, where they actually used NMR. So yes, the usual route is 24 to 48 hours of culture, but with NMR you could just simply take a urine sample and look to see if there are any metabolic byproducts in the urine. And then, if there are bacteria in the urine, you can start adding substrates, and because there are a few bacteria, they can actually do some pretty quick metabolic conversions. So there are four substrates they have that allow them to see whether they had E. coli, Klebsiella, Pseudomonas or this Proteus mirabilis bacterium, each of them with unique bacterial metabolic pathways and each of them with substrates that allow you to identify it. So not only can you identify whether there's a urinary tract infection, you can identify the cause, and you can do it fairly quickly. And given how insensitive NMR is, you can imagine it could be much, much faster if you used mass spec. So those are some sort of curiosity-driven ones, as I say, the diabetes example and the urinary tract infections and bacterial detection. But there are other things that come up more frequently, and certainly at last week's Metabolomics Society conference there were lots of discussions about biomarkers. This is the application of metabolomics to the clinic, but it's also the application of metabolomics to drug discovery and drug development. There are different types of biomarkers, at least five. There are diagnostic biomarkers; those are ones that tell you whether you have a condition or a disease.
In many cases people can just simply look at a person and say you've got the disease, but there are some diseases, an example being chronic fatigue syndrome, where it's really, really hard to diagnose. Parkinson's is another one; in the early stages it takes up to two years to figure out whether you've got the disease or not. ALS is another one. There are lots of conditions where the diagnosis is not easy and costs tens of thousands of dollars and takes months or years. Prognostic ones: typically the case is that you have a disease, but how will you do with it? So some people have conditions and in fact they live long, healthy lives with the condition. Others, for a particular disorder, can go very bad very quickly, and so prognosis is an important thing. It's actually somewhat predictive, but it's essentially saying how you will do with this condition. Predictive biomarkers are different: you are currently healthy; can we predict whether you will develop a disease? So the classic predictive biomarkers are the BRCA1 and BRCA2 genes. They can tell a woman whether she might have a high propensity for developing breast or ovarian cancer, although, as much press as there is about the BRCA1 and BRCA2 genes, they are not very predictive. In the world of food and drugs and toxicology, there are markers of response or toxicity: how will you respond to this toxin or this drug? Some people are responders, some people are non-responders. Some of you probably know this yourselves: you try some medication and it just does nothing for you, whereas for another person it knocks them out. There are also markers of exposure, and these are becoming increasingly important, and it's a field called the exposome. So there's the metabolome, the genome and the exposome.
The exposome is an indicator of what you're exposed to. So it could be polychlorinated biphenyls, it could be pollution, it could be lead, it could be mercury. Those are the bad ones. There are also good exposures, antioxidants and other things. But these represent, again, what you may have taken in and what may have consequences for your life expectancy or general health. There are some interesting biomarker statistics. This has a lot to do with how quantitative a given technique is. So gene sequencing is quantitative. It produces letters and positions, and it has confidences in terms of the reads and read lengths, and so a mutation is something that someone can quantitatively identify: it is this mutation, A to T at position 4,267. And because it's a quantitative, reproducible technique, no matter where you do it the sequence will always be the same, no matter which instrument, country, city or clinic. So that makes gene sequencing a very good biomarker technique. The result is that there are more than 2,000 genetic tests that are used and approved in North America. However, they only cover very rare diseases or conditions, because most diseases are actually not genetic. So as extensive as the biomarker sets are for genomics and gene sequencing, they only cover a relatively small portion of diseases. So everyone jumped onto the transcriptomics and RNA-seq bandwagon, more so transcriptomics and microarrays, thinking that really it's not the genes, it's going to be gene expression. The problem was you couldn't get quantitative measurements on microarrays, and so a number of companies struggled mightily over almost a decade to try and get at least some microarray tests approved. Some of them are: there's one for, again, breast cancer, a test called MammaPrint, and there's one for colon cancer. But none of these have taken off, and I think many of the companies that started them have sort of bailed out. So at one point there were 5; I don't know if there are any still going anymore, because the techniques proved to be very non-reproducible
because they're not quantitative. Proteomics: there are 111 approved protein tests in North America and Europe, and only one of them is based on proteomics. The reason why there are so few is because, again, no proteomic assays are quantitative, and this is something that has really hurt the proteomics community for a long time. And so there's a substantial shift in labs, particularly in labs like Christoph Borchers' group, where they're leading with the idea of moving towards quantitative proteomics. Metabolomics, which is actually the youngest of all of these omics, is actually doing pretty well. If you look at the number of tests that are associated with metabolite readouts, there are quite a few. The most common is the glucose monitor, but there are other compounds that are read routinely, hundreds of thousands of times every day, that are based on small molecules. They're all in the realm of clinical chemistry or newborn screening, and they cover many of the common diseases: heart disease, cancer, diabetes, obesity, atherosclerosis, lung disease. All of these things are actually detectable through small molecules. Some examples. These are just, again, some sort of random examples, but this one happened to be done at the University of Alberta. You can actually use urine to determine if someone has pneumonia. Now, pneumonia actually is a very hard condition to diagnose. It's one that people mistake for a cold, bronchitis, a bad cough, a bad day. There's also a distinction between viral pneumonia and bacterial pneumonia, and again it's extremely difficult to diagnose. So this is an example of a disease that is hard to diagnose, and if you could actually just do a simple urine test it would really change the field. And this is what they actually demonstrated: that you can distinguish bacterial pneumonia from many other conditions that present with the same symptoms. So the usual approach is to sort of take a lung or sputum sample and grow it up over multiple days, and if it's pneumonia and you're waiting
for two to three days for the bacteria to grow, either the person will die because you're waiting so long, or, if it turns out to be a viral pneumonia, you will have been treating them the wrong way, which is also bad. But if you could do a urine test and figure out whether you have pneumonia and what type, viral or bacterial, it would take a few seconds to a few minutes. And what they found is there are very distinct metabolic profiles: people with TB, who sort of present with pneumonia, people with real pneumonia, and then people who are healthy. And you can just see in this case, you don't have to be a genius in NMR to see some very distinct differences in aromatic features, but also some distinct differences in aliphatic regions, and then very distinct differences in the branched-chain amino acids and sort of organic acids that seem to distinguish quite strongly individuals with pneumonia. Now, this has nothing to do with amino acid synthesis and breakdown. This probably has a lot to do with signaling, and in particular branched-chain amino acids play a significant role in signaling, insulin secretion and controlling general stress responses. So the why still has to be sorted out, but it's remarkable that a urine test could actually tell you whether someone has something wrong with their lungs. Another example of something that is difficult to diagnose is organ rejection. The most common transplant is the kidney transplant. There are thousands done every year, thousands of lives saved. Organs can last a long time, but obviously they can be rejected. If an organ is rejected and you can't find a replacement, a person has to go on dialysis; in some cases they die. The thing most of you may not know is that even with organ transplants that seem to be stable, you have to go in fairly routinely and have a big long needle stuck into your kidney or your heart or your lung or whatever organ has been transplanted, and then they have to do a histology assay. So it hurts, it's painful, and it's costly because it has to be analyzed by a
pathologist, and the pathologist who usually looks at it gets it wrong about a third of the time. So it's not a very good assay, but it's still better than nothing. So they've tried, they've tried microarrays; those don't make good assays. The question is, could you actually just do a blood or urine test that would work? We looked at this, actually, and we looked at just urine samples from individuals who were identified by pathology and tissue samples as actually having a rejection issue. And one of the things that struck us was the very, very high levels, and these are quantitative measures, and I'll emphasize that for both the one I just gave on pneumonia and this one we're using quantitative metabolomics, we identified very high levels of carnitines in the urine. And these are the VIP weights and the VIP scores from MetaboAnalyst, and it says there's something going wrong, or something definitely in carnitine synthesis that's heavily modified. You can do your PLS-DA and you can see very distinct separation and very strong sensitivity and specificity in terms of what this assay is suggesting. Why carnitines? Well, it turns out carnitines are produced by white cells, and when they're metabolically active they have very active beta-oxidation pathways, and carnitines are critical to the production of fatty acids and the breakdown of fatty acids. And in fact carnitines generally are great markers for both inflammation and high white blood cell activity. So again, it doesn't really have anything to do with, say, the biosynthesis. It has a lot to do with the fact that you're seeing a signal arising from certain classes of cells being metabolically active in a region where they're not supposed to be, namely the kidney, and then what you see are the soluble components, carnitines, coming out in the urine. One study that received a tremendous amount of attention at the Metabolomics Society last week, though it's been around since 2010 or '11, is on the relationship between metabolomics and cardiovascular disease, particularly
atherosclerosis. So this is the clogging of arteries, and when you get clots, that can lead to heart attacks and stroke and also reduced efficiency in the heart and sometimes also neurodegenerative conditions. What was found by Hazen's group in Cleveland, and this was published in Nature, was this fascinating relationship between the gut microflora and what you eat, particularly if you eat lots of fatty foods. Fatty foods have phosphatidylcholine. You get it in eggs, you get it in butter and French fries. And what happens is that when you have fatty foods, the choline is stripped off the phosphatidylcholine and it's converted to betaine. The betaine will kind of float around as well, eventually ending up being converted to trimethylamine by the bacteria, the microflora, which will then be converted to trimethylamine oxide, or it can also be converted through the liver. Certain types of bacteria are very good at stripping choline and producing betaine and generating TMA, or trimethylamine. It turns out the toxic chemical here is trimethylamine oxide. So if you inject TMAO into arteries you can get atherosclerosis; that seems to be the trigger for it. It also explains an interesting phenomenon, that some people can eat all the fatty foods they want and never develop atherosclerosis, while other people just sort of walk by a slab of butter and instantly seem to get atherosclerosis. And that has a lot to do with their gut microflora, the bacteria that they have or that they were born with. So this connection between what you eat, what's in your gut microflora and ultimately the chemicals that lead to atherosclerosis, it's a fascinating story, and they've published several more papers since this one showing quite solidly that this is the case. And when you think about the amount of money that's been spent looking at the genetics and the proteomics and the diet issues associated with cardiovascular disease, this suggests perhaps a remarkably simple solution and a whole avenue for
treatment, because we've done very poorly actually treating this disease, and this is where metabolomics may help with the great scourges of medicine and health. This just illustrates the process in a nice diagram that they had, where it's just taking the fats from the foods we all like to eat. So choline is produced, particularly by the gut microflora, and then transformed into TMA, which then goes to the liver, where it is converted to TMAO, which then causes atherosclerosis, heart attack, stroke and other cardiovascular conditions. Another one where a fair bit of headway has also been made, and I've mentioned it a couple of times and it's been discussed lots, and it's not just by the Gerszten group but also by several other groups in the US, in North Carolina in particular, is this observation that diabetes, which is a disease of sugar metabolism and the pancreas, and this is primarily type 2 diabetes, is a disease that can be predicted extremely well by the presence of 5 amino acids: the branched-chain amino acids, phenylalanine and tyrosine. And this study describes the effort that they made to do this. They confirmed it first on the Framingham Heart Study, a very famous study where they've collected lots of data on people for many, many decades, but then validated it on a Swedish cohort, looking at blood samples. And they looked at people who were basically healthy, generally young, of average weight, looked at their blood, and looked at their health status 10 to 12 years later, and found that they were able to predict very accurately those who would develop diabetes and those who wouldn't. And all they had to do was look at these branched-chain amino acids, and if they were elevated somewhat significantly, then it was pretty certain that those individuals would develop it. And as I say, looking at these individuals, the ones that didn't and the ones that did were phenotypically identical. It was only these amino acids that tended to differ. The fact that you can predict the
development of a condition not one month, not 12 months, but 12 years before it happens is quite striking. The fact that it looks to be a condition that's more associated with amino acids than genetics or anything else is also quite striking. But the lesson that has been learned is that, in fact, leucine and isoleucine are insulin analogs. They are signaling molecules that act on the insulin receptor. They also act on essential control enzymes, particularly mTOR, and have a key role in metabolic functions. This is just an illustration of how branched-chain amino acids can come from excess nutrients or perhaps metabolic imbalances in the gut microflora. You can see where branched-chain amino acids impact the mTOR protein. They can also interact with insulin and insulin receptor functions, directly or indirectly, and then they can give rise to diabetes, or essentially insulin resistance. Think of leucine and isoleucine as insulin analogs: if there are high levels constantly, eventually the system develops resistance to that, and that's equivalent to insulin resistance. More than $150 million, $200 million has been spent on GWAS studies for type 2 diabetes. They've found a few genes that carry a slightly increased risk. Just these 5 amino acids, in terms of the relative risk, in terms of their ability to predict, are 12 to 20 times better than what the current genetic tests are. Another amino acid, called aminoadipic acid, which is the sixth amino acid, increases that blue bar by another chunk. So these combinations of amino acids, in terms of predicting risk, are the best thing that we've found so far in understanding and in predicting diabetes risk. How do you find biomarkers? This goes back to how big a sample size should be. Typically the minimum size is 30 cases and 30 controls, that's 60. 100 cases and 100 controls, that's better. You want to match in terms of age and gender and general health status. You need to be able to get biological samples from the individuals. A biomarker that requires a muscle biopsy is a terrible biomarker; no one
wants to give up chunks of muscle or heart tissue or kidney. So if you can get something from urine or saliva or breath condensate, that's better. Generally you want to make sure you collect things at the same time, same place, same way. There are lots of faulty biomarker studies where people didn't control for their collection conditions. To do a validated biomarker study, it actually has to be quantitative. More than 90% of the biomarker studies that I'm seeing are using non-quantitative methods, and so none of the data can be used. It produces a curious "guess what I found" story, but someone's going to have to repeat exactly the same study again using a quantitative method just to repeat the discovery process. So in that respect it's almost a total waste to try and do a biomarker study where you're not quantifying metabolites. Once you have discovered the biomarkers, then you actually have to do a validation test. That's a requirement. So it means repeating the whole thing all over again. For technologies, we've seen this slide before; different ones are suitable for different things. Sometimes just sticking with the known metabolites is safe, because those are identifiable, they're knowable, they're quantifiable. But the aminoadipic acid that was found in the study involving diabetes required mass spec, so that was an example where doing some exploratory work allowed them to identify things that were completely surprising. Once you've got your list of metabolites measured for your controls and your cases, you can use a tool that was developed by Jeff here, called ROCCET, to actually identify your biomarkers and to calculate what's called the ROC curve. And generally, in order to make a useful biomarker, you want to choose a very small number of genes, proteins or metabolites. So, ROC curves. I've mentioned the term receiver operating characteristic curve, and these are a plot of sensitivity versus specificity. So this can be sensitivity, this is specificity. They were
developed back in World War II, when people were trying to measure the accuracy of artillery shells and how good they were at hitting tanks and things like that. But the idea progressed and moved out into the clinic, and it's now used to save lives, I suppose. It's a way that we assess biomarkers. A good ROC curve should have an upside-down-L or a logarithmic type of shape, so that's sort of a log curve. A really bad ROC curve would be a straight line with a slope of one, which basically means a random guess. We can measure the area under the ROC curve: this is the area for this one, and this is the area for this one. The area for a perfect ROC curve would be one, for a bad ROC curve it would be one half, and anything between a half and one is something better than random. So generally, if a ROC curve has an AUC, area under the curve, of 0.75, that's pretty good. ROC curves with an AUC of one are very rare, but obviously most desirable. So this is an example of ROC curves calculated for different ones: pretty much random, with an area under the curve of 0.5 or 0.6; the red one, 0.8; and these are up in the 0.9 to 0.95 range. The higher the AUC, the better the biomarker or biomarker set. So what are some examples of common biomarkers? 
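To make the idea of a ROC curve and its area a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `roc_points` and `auc` and the toy concentrations are made up for this example, and it is not how the web tool mentioned above is implemented. It follows the usual convention of plotting sensitivity (true positive rate) against 1 minus specificity (false positive rate) while the decision threshold is swept from high to low.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points.

    scores: a marker value per subject (higher = more likely a case).
    labels: 1 for cases, 0 for controls.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")

    # Sort subjects from the highest score to the lowest.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])

    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Subjects with identical scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical metabolite concentrations for 3 cases and 3 controls.
    concentrations = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
    is_case = [1, 1, 0, 1, 0, 0]
    curve = roc_points(concentrations, is_case)
    print(curve)
    print(round(auc(curve), 3))  # 8 of the 9 case/control pairs are ranked correctly
```

With this toy data the area is 8/9, about 0.89. A marker that separates cases from controls perfectly gives 1.0, and a marker that carries no information gives 0.5, which matches the random-guess diagonal described above.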
Mammograms, for distinguishing between benign and malignant tumors. You might have heard some of this on the news a few months ago: we finally did the test and tried to figure out how good mammograms are. They're very good at identifying masses, but they can't distinguish between benign and malignant, and the area under the curve for these things is about 0.53, which is just a little better than flipping a coin. So when you have a fairly expensive, somewhat painful and invasive test, and it doesn't tell you a whole lot, why use it? And this is why the recommendations have been probably not to use it. PSA: men over 50 typically have to go through this. It's a measure of whether you have prostate cancer, and it's religiously followed, religiously used. It has an AUC of 0.65. So these are probably among the world's most widely used biomarkers and also among the worst. So if this is the standard, then to develop a good set of biomarkers you don't have to get much better, arguably. ROCCET is a web page that was developed to do biomarker identification, and like MetaboAnalyst, you start at the top and click on your data. There are some example data sets, just like MetaboAnalyst, that allow you to choose something if you want, but you can use your own data set. There's a data integrity check, not unlike MetaboAnalyst. After you check the data, you can go on to the next step and start doing some data processing, so you can start removing some low-quality data, do some missing value estimation, and do some data transformations if necessary. We've seen this before, where you make sure that things are normally distributed; that's key. And then you can actually start doing the ROC curve analysis. There are three options. You can look at single features, which we call the univariate analysis mode. So you can say, I think this metabolite is important, someone told me it's important, let's see if I can get a ROC curve with that. Then you might do it for another one and another one and see what's working. But 
the one that most people are more interested in is trying to use a multivariate ROC curve. The point of omics is that you want to be able to use not just glucose to determine whether you're diabetic, or just one amino acid out of the five to determine whether you might be at risk for diabetes. You want to use all five or six. Same thing with genes or proteins. So a multi-parameter ROC curve is what you want to calculate, and that is what one of the options does. There's also another one for doing model evaluation. So in this case, with the default data, we just sort of click, click, click, did the multivariate one, and here is the model for this particular one. It actually happens to be for something called preeclampsia, and these are using two, three, five and ten different biomarkers out of a set of about, I don't know, 200 that it was looking at. So it selected the best metabolites, developed a model (this one I think is an SVM model) and produced ROC curves. If you recall, the mammogram and PSA tests are about 0.6. This is 0.95 or better, so a phenomenally good ROC curve for actually predicting a disease. In this case it's a disease called preeclampsia, which pregnant women will develop typically around six or seven months into the pregnancy. It's the number one cause of maternal morbidity in the western world. It places both the baby and the mother at high risk, and many women die from it. So as I said, it's the number one cause of both morbidity and mortality in pregnancy. The thing is that this is serum that was collected at three months, and they were perfectly healthy at the time. So this is actually a predictive biomarker that tells you whether you will get preeclampsia three months later. That's pretty good for a predictive biomarker. You can go a little further with this tool and identify some of the significant features: what were the ones that were helping you get this result? Glycerol, hydroxybutyrate, choline and acetate are really important. So 
those are examples, and so we can run a bunch of tests and examples through ROCCET and other software that's been developed. These are examples taken from the Metabolomics Innovation Centre, which is in Edmonton, and we're looking at some examples where we're trying to predict disorders and diagnose difficult-to-diagnose diseases. So here is the preeclampsia one, and this is a ROC curve, and the green represents sort of the error bars. This is early preeclampsia. Another one is late preeclampsia, which is developing preeclampsia at seven or eight months instead of six months into the pregnancy. But these are predicting, so they're predicting at three months into the pregnancy: take a serum sample, will you get preeclampsia? So a lot better than the PSA and a lot better than the mammogram test. There are also trisomy disorders, trisomy 18 and trisomy 21, which is Down syndrome. You can identify fetuses that have those disorders, but you have to use amniocentesis generally. That's invasive, and it puts the mother at risk and also the baby at risk, so they'd like to have a non-invasive blood test, and this is just simply using blood. If we used age, particularly for this one, this would go straight up and across. So just combining another common clinical feature with biological data gets some very, very good performance, and it's a non-invasive test. Another one is congenital heart defects. Again, infants develop this. We can actually do operations in utero to repair heart defects now, but this is one of the number one causes of infant mortality and a huge drain on many care units when heart defects are not detected in time, and there's no way to actually identify infants with those, especially in utero. But this is an example where they're taking blood, again I think at three months, where this sort of metabolite combination allows you to very clearly identify infants at risk or those with heart defects. Cancer cachexia: this is the muscle wasting, and this is using some of the data that you guys were playing with earlier, the type 
or range that you typically get. Again, this is a spot urine test. The person looks normal to us; we just know they have cancer. The question is, will they develop this muscle wasting disease that comes with about half of all cancers? And this approach seems to be perhaps 80 to 90% accurate in terms of the area under the curve. Transplant rejection: I showed you some examples where we did the PLS-DA, but then you can run it through ROCCET, and you can look at both adult kidney transplant rejection and also pediatric kidney transplant rejection. These all suggest that sampling urine can tell you very, very quickly whether someone is rejecting the organ, with high precision. Chronic fatigue syndrome: some people here may know individuals with it. It's a hard-to-diagnose disorder, no one knows what its cause is, and again, using just a combination of two or three metabolites, it looks possible to identify it. Mordecai is working on eosinophilic esophagitis, and this is some very preliminary data, but it looks like it too is something that can be easily diagnosed using urine samples, while it's very hard to diagnose using conventional methods. So these are just examples. Many other groups are doing many other studies and getting similar kinds of results, looking at blood and urine, looking at those predictive and diagnostic biomarkers. What metabolomics does better than any other omics is that it measures the phenotype, or the phenotype that's about to develop or is in the process of developing. It's something I mentioned right at the very beginning: metabolomics seems to be the best method for doing quantitative phenotyping. Over the last 10 years we have invested quite a bit in the realm of genome-wide association studies, hoping to find mutations, common SNPs and variants that might cause a variety of chronic diseases, and it's been a frustrating process. Because we're looking at so many parameters and having to do so much characterization, it has meant that we've had to spend a lot of money for each study, 
analyze tens of thousands of individuals, and to date the ROC curve performance for these things is generally in the low 50% range. Some of the metabolomic studies I've shown you, and others, are pretty cheap, literally $100 in some cases up to $200,000. We don't need lots of patients, because we're not measuring so many markers, and the number of markers that typically come out is relatively few. Many of them are simple. The ROC curves are very, very good. If these tests could be translated to the clinic, they could be substantially cheaper than what other kinds of tests would be. So that's biomarker research. Pharmaceutical research: again, it's sort of a medical field, but there's an issue, and it's an issue for the whole field of drug development. Developing a drug takes a long time and costs a lot of money, and success rates are very, very low. It's a major concern in the drug industry, and it's a major concern for governments as well. The interesting thing is that metabolomics can be used in just about every phase of drug development, from discovery to phase 1, 2 and 3 trials, and even what's called phase 4, the post-approval trials. Genomics and proteomics can also be used, have been used and still are used, but they're typically focused on the early-phase work, whereas applications of metabolomics can persist all the way through the drug pipeline. So that's something that's particularly appealing. These are just more detailed indications of where metabolomics can be and has been used in drug development: for toxicology, looking at biomarkers of efficacy, safety biomarkers, clinical safety, clinical efficacy. They can be used as biomarkers, but they can also be used in monitoring and toxicology tests. There are some examples where studies in metabolomics, particularly around cancer, have identified some novel targets. In prostate cancer there's a compound called sarcosine; in glioma, 2-hydroxyglutarate; and actually a number of cancers seem to have 2-hydroxyglutarate showing up over and over and over again. So this is a study that was done 
in Michigan, where they did a really very carefully designed study in which they looked at biopsies and tissues and distinguished between people with benign hyperplasia and metastatic prostate cancer, and they consistently found very, very large increases in sarcosine, which is another amino acid. They looked at large numbers of samples and found high elevations in metastatic prostate cancer, so that served as a potentially prognostic biomarker. The fact that it was sarcosine suggested that it would be relevant to a particular enzyme, glycine N-methyltransferase, and they found that if they knocked this gene out or knocked it down, they actually attenuated prostate cancer invasion. So this is a case where they found a marker, found an enzyme, found an approach to therapeutic treatment and found results. And then the addition of sarcosine, or the knockdown of sarcosine dehydrogenase, also induced invasive cancer, so that was the converse experiment. So a very, very good study. People have struggled to reproduce it, but it might be because they just don't have the facilities or the techniques to accurately measure these things. So again, there's a technique called the genome-wide association study, and there's a growing technique called the metabolite-wide association study. As we talked about before, genome-wide studies have been used for a lot of conditions to investigate and identify some target genes, and although GWAS hasn't been spectacularly successful in finding drug targets, we have learned a lot. We've identified a number of important genes that seem to play a crucial role in the understanding of familial forms of these diseases: familial Alzheimer's disease, familial Parkinson's, familial cancer. And that gives us a deep biological understanding. But the concept was originally sold on the idea that doing genome-wide association studies would give us drug targets, and so people have used GWAS, they've used very large libraries to screen against these targets, and then they go through the iteration 
of pre-clinical and clinical trials to eventually produce a marketed drug. However, things haven't been that successful. The costs of the GWAS studies, as I say, are in the millions. The fraction of success, where they actually are able to find genes and pull out those that could be druggable, is somewhat limited. Finding hits is a tough one, and the numbers that we know from the statistics are about one in five, and that takes a while. Then there's the one-in-500 level once you've gone from the pre-clinical to the clinical stage, and even among the ones that do make it into phase 3 or beyond, a lot of them are getting pulled. So if you add all the odds up, the time, the dollars, it comes to the same numbers that the drug industry has been talking about: a billion dollars, 10 to 15 years, and a marginal success rate from the very beginning. But remember, not all diseases are genetic. There's very good evidence that somewhere between 80 and 90 percent of the major conditions, whether it's cancer, heart disease or neurodegenerative diseases, are actually probably environmental, diet or exposure related. Many probably have something to do with the microbiome. The genes that we know about, the ones that have the highest risk factors, like BRCA1, only account for 1 to 2 percent of all cases of breast cancer. Heritability for polygenic diseases rarely is above 50 percent; in most cases it's somewhere around 20 percent. Even for ones that we largely consider to be very genetic, like cancer, approaching almost 40 percent now actually seem to be caused by bacteria, viruses or other infectious or parasitic organisms. So this is something that we don't appreciate, and I think perhaps to our detriment. So what about metabolites and metabolite-based discoveries? The idea here is to look to see if we can find metabolites that are associated with the disease conditions, just like the ones that I showed you with those ROC curves, and figure out which metabolites are up and which ones are down. Let's say diabetes. Okay, 
leucine, isoleucine, valine: those are up. Check our pathway chart. Well, some of them aren't so good, but at least we know that isoleucine, leucine and valine moderate insulin secretion. We can look to see if there's anything that knocks down leucine or isoleucine. We can go to BRENDA or look at DrugBank for some of the compounds that affect those enzymes that play a role, and we can also go to our local supplement store or think about some other things. Okay, we've got too much leucine, isoleucine or valine; why not cut back on that? And Chris Newgard in North Carolina did the experiment, and lo and behold, he was able to convert pre-diabetic rats into normal rats. So what you can do then, after you've come up with the therapy, which might be just simply a dietary change, or a supplement if you're short, is start monitoring. Leucine, isoleucine, valine: are they dropping? The aminoadipic acid levels: are those dropping? The TMAO levels for atherosclerosis: are they dropping? So that's where metabolomics also allows you to start monitoring things. In terms of the metabolomic MWAS sort of things that we've been doing with biomarkers, about half, actually it's more like three quarters, of the studies work. The data analysis: you guys just did this, you did that today. The pathway analysis you can also do in a day, to sort of identify potential inhibitors or to decide what your therapy might be. It doesn't take much to say, if you're high in isoleucine, leucine and valine, why not cut back? What to do? Well, maybe you don't need a drug; you just have to say, eat less of that. And then in terms of the cost of monitoring, it's not much. So add these things up, and in fact the route to suggesting potential therapies is much shorter, much cheaper, much faster. So you might say, okay, you're just sort of exaggerating. Well, let's pretend we had dialed back 200 years, and we had all these sailors who were coming in dying from a disease called scurvy, and we wanted to figure out what was the cause of 
the disorder. We send the blood sample into the mass spec, and we see there's no ascorbic acid in their serum. They must be short on vitamin C. What should we do? Develop a drug that actually produces more vitamin C? Or you could just say, let's give them vitamin C. That's what they did, and scurvy is gone. One that's actually more modern, and that's probably affected many of you, is that your mothers probably took folate supplements. The reason was that they made an observation in the 1980s and 90s that low folate led to spina bifida and neural tube defects. That simple supplementation has now profoundly changed the frequency with which neural tube defects are found, and it's also affected a lot of other things related to neural tube defects. 80 years ago, anemia, pellagra and rickets were very common. None of you were alive then, but those were the number one concerns of physicians. They were seriously affecting many hundreds of thousands of people, they were affecting the economy, and people were dying early. Again, if we had had metabolomics then, we could have identified what the problem was and suggested a solution. Thanks to nutritionists, they did do some of the chemical analysis, they did figure it out, and that's why we have supplemented cereals and foods. Iodine and goiter: that was another one, and it led to the appearance of iodized salt. Again, metabolomics could have picked that up, and that very change cures the disease. Epilepsy actually can probably be most effectively managed through a ketogenic diet. Again, you typically find very low ketone bodies in people with epilepsy. And there are a number of other conditions and disorders where metabolomics, or techniques related to metabolomics, can identify the markers of the disease, and you can make adjustments, either through drugs, supplements or dietary changes, that have pretty profound effects. More examples, more conditions, more solutions. So this isn't unreasonable. It's not as if it had never been invented or thought of before, but it is something 
that could be, and potentially should be, applied. There are also applications in pharma with what's called absorption, distribution, metabolism and excretion (ADME), which is an important part of drug development, and this is really where metabolomics was first applied in the drug industry. It's where you give someone a drug in the early stages and look to see what comes out and what is altered or changed. So we can look at urine, we can look at blood, and when we're doing pre-clinical studies, we do these on rats. The old way of doing it meant that you typically had to raise hundreds of rats, sacrifice them all, and do necropsies or autopsies, and it was expensive and time consuming. But potentially, with metabolomics, you can do this in metabolic cages and see what happens. You can see whether there are trajectories as the rats are being given drugs or toxins. And this is what Jeremy Nicholson spent the first 10 years of his metabolomics life doing, which was essentially looking at the responses of rats to a variety of different drugs or drug-like compounds. This is an example of some of their work, where they were looking at these perturbations and what they meant for toxicology. You can see how some drugs or some poisons affect certain parts of the kidney: the glomerulus, the cortex, the medulla. Each of these produces certain characteristic metabolites. You don't need to do a dissection or a necropsy; you just simply look at what's perturbed, and you can say, that's where it's hitting. It could be a different drug or a different toxin, and you can do the same thing. And you can repeat these studies, because often the animals recover fairly well, so you don't have to sacrifice the rats. You're just simply collecting urine, and it doesn't cost much or cause any harm. So you can localize damage in both liver and kidney, and there are certain types of metabolites that are very much associated with these. This is, again, the work of this group over many years. Same thing 
with organ toxicity for the liver: very characteristic features that, again, are tabulated everywhere. And it's not only in toxicity. You can go into phase 1 and phase 2 trials, where you're looking at people just simply taking the drug. Typically you have to trust that they are taking it, and so you can start looking to see, is the drug there? And, well, here they didn't take it. Okay, so you can record the adverse response for that day: it's not the drug, it's because they didn't take the drug. So this is important for the drug industry, especially if you're worrying about safety or efficacy. The same sort of thing happens with compliance. When you take drugs, you're typically told, don't drink, don't take other drugs. Everyone says, I'm not drinking and I'm not taking other drugs, but people lie, and so it is possible to identify when they have taken something they weren't supposed to. So there are other applications, and if we saw ethanol there, that brings us to an application in food analysis. This is about what it is that we eat and what it does to our bodies. We saw the application with fatty food and atherosclerosis, but you can think about what diets and chronic food consumption do to metabolites in our blood and urine. This is of great interest to people in nutrition research who want to understand how food is processed in the body. We also want to understand more about the foods that we eat, because there are some good things in them and there are also some bad things in them. Gatorade is an example of a simple food; you can usually get the ingredients. But if you went to something like beef, it doesn't have an ingredients list. In fact, there are probably 7,000 or 8,000 metabolites in beef, and getting an exact list is hard. It's probably very similar to the human metabolome. But again, it's important. Obviously beef and tomatoes are both red, but they have totally different compositions, and those are very important differences. Again, what's in food very much dictates what's in our body and what 
comes out of our body. Some foods are adulterated, and in fact the application of GC-MS, LC-MS and NMR has been used to identify adulterated juices. You can get cheap juice to taste like expensive juice by substituting things, and this is done much more widely than we're aware of. But being able to distinguish between different sugars is something you can really only do with metabolomics: beet sugar, corn syrup. There are certain allowed sugars in certain foods and very strict regulations, and again, metabolomics allows you to distinguish between those, and that's quite valuable. Dietary biomarkers: nutrition is all about finding out what we've eaten. People, as we've seen, tend to lie about what they eat, but we also tend to tell the nutritionists what they want to hear. Did you eat all your vegetables today? Are you eating seven different helpings of greens and other fibrous foods? Most everyone will say yes, yes, yes, and then they come back with very surprising results: that we're all eating remarkably healthily, but we're all getting sick. And that's because in fact we don't tell the truth, and we actually all have relatively poor recall of what we generally eat and how much we eat. So they'd like to be able to go from what are essentially questionnaires or surveys to actually chemically measuring what we're eating. And we have this association, we think, between certain types of food and whether you're overweight or underweight, or how long you live, or your propensity for diabetes. So there are actually markers that metabolomics has identified for a variety of foods: tea, wine, coffee, alcohol, grapefruit. These are all tabulated in various studies that people have picked out, and this potentially could allow a substantial change to how nutrition research is done, which has always been sort of marginalized, going from simply a survey science to one that's actually a chemically based science. We are also learning, to our surprise, that what we eat substantially influences what grows in our intestines. This is the 
microbiome, and we can detect those things. We can detect what's in the microbiome from what shows up in the urine; actually, the example there, as I say, is probably the choline and TMAO for cardiovascular disease. But there's also growing evidence, which was first described seven years ago but there have been many, many studies since, that obesity really doesn't have to do with how much you eat, but probably with whether you were breastfed as an infant or not. So the bacteria that grow in your gut have a very strong influence on whether you will become obese or not, and the abundance of these divisions of bacteria, Bacteroidetes and Firmicutes, is determined by breast milk, which has very complex carbohydrates that encourage the growth of certain bacteria, which persist in your body for most of your life. There's been a long-standing association between people who were breastfed, who tend to be lean, and those who were formula fed, who tend to be obese. That's been reproduced many times, and this association has been seen with the gut microflora. They're also identifying nutritional phenotypes in different countries. These represent the types of foods that people eat, no matter where they live, but culturally defined. So this is a nutritional phenotype, but it defines very much what's in your microbiome, which is reflected in what's in your urine. They've also looked at and monitored how stable the urine metabolome is in individuals over many, many months, and basically your urine metabolome is as unique as your fingerprint, and it stays with you for most of your life, over at least months and perhaps decades. It does change. It changes typically as you go from age 1 to 2, where you go from milk to solid foods; it changes at puberty; it changes somewhere around age 40 to 45, middle age; and it typically also changes again around the 70s or 80s. So what's ahead for metabolomics? Some examples of what it does are what we use for applications, and they're mostly human examples, but understanding what's in our 
food also touches on agriculture and the environment, and also the field of toxicology. Obviously metabolomics tells us a lot about biological processes and pathways. I haven't talked a lot about that, but to try and survey that for all the systems that everyone talks about is just impossible. So what's ahead? One of the most exciting things, I think, is chemical imaging. Is there anyone who's ever tried to do metabolic imaging? A few people from the Borchers group. This is a fascinating area, and I think it has the possibility of revolutionizing a lot of what we do. It's a matter of actually seeing not just stains, which is how we traditionally image things, but actually looking at chemicals. So you can do imaging. You can actually do metabolite imaging through MRI, and this has been around for a long time. This is magnetic resonance imaging. People haven't applied the term metabolomics to it, but it exists, and in fact it's a very powerful method. You can read out concentrations of metabolites in the brain and the muscle, about a dozen, maybe up to 15 metabolites at a time, and it's macro-scale imaging. You can also do micro-scale imaging, and this is done now with MALDI, which is matrix-assisted laser desorption/ionization. What you do is lay out a sample on a plate, and then you shine your laser and make it blast away through a raster or a grid, and you collect data every few microns or tens of microns through that raster grid. Each spot corresponds to a mass spectrum. The mass spectrum can be analyzed for ions, the ions can be colored, and you can assign the ions to certain compounds, so you can start building a false-color image that identifies the concentrations of metabolites. So people are doing microscopic imaging; they're looking at features that are a few microns across, thanks to MALDI imaging. Most of the metabolomics I've talked about is metabolomics using mass spec and NMR. These are big instruments, and they're expensive. Can you do metabolomics with handheld devices? There's a thing 
called the i-STAT, which is produced I think by Abbott Labs, and that is, if you want, small-scale metabolomics, so it's bedside metabolomics. Volatiles: we haven't talked a lot about those, but it's certainly an area where gas chromatography excels, and there's a growing interest in looking at volatiles. There are also some handheld devices now that are essentially gas chromatographic instruments, volatile sensors that allow you to pick up things, and they're making use of some really interesting technology, sort of nano- or micro-electronic systems. They can produce profiles that allow you to distinguish between certain types of volatile compounds. They're not terribly quantitative yet, but essentially it's headspace chromatography. So, metabolomics, and we're wrapping up; I'm going to have to leave shortly as well. It is part of all of the omics. It shouldn't be thought of as alone or as a loner. It does play a critical role. There are some things it does very well, and there are some things it doesn't do very well. Because we understand a lot about metabolism, it certainly opens the door to things like modeling and prediction, which is what we saw at the very beginning of this particular talk, because we can use well-understood principles of chemistry and differential equations or stochastic modeling systems. I've shown you examples of how metabolomics is being used in lots of areas of medicine, pharmaceutical and clinical chemistry. I think what I'd like to emphasize is the importance of having well-designed experiments. This has been a historic problem in metabolomics. In many cases people walk up and say, I have some samples that have been in the freezer for the last 20 years, can you look at them? 
I don't know anything about them; hopefully you'll find something useful. That's the recipe for a terrible, terrible study. But it's also an issue for us as analytical chemists that many labs don't spend enough time or attention on quality assurance and quality control, and it's not really taught in chemistry labs. Most of us just do a single experiment in a chemistry lab, but if you're actually being told to analyze hundreds of samples for thousands of analytes, quality assurance and quality control are absolutely essential. They're key for people doing proteomics and genomics and transcriptomics, and so they have to be key if we want to do it in metabolomics. Understanding the data analysis principles, statistics, multivariate statistics: these are important. Hopefully you've learned a little bit about them, and hopefully you'll be inspired to learn more. In terms of the trends, certainly I think metabolite imaging is one of the hotter new trends, and if you have an opportunity to try it or do it, try it. There's a great facility in Victoria that's offering this as a service. Metabolomics is certainly moving to much more automation, whether it's analysis, sample loading, data crunching or sample prep. All of those things are becoming automated, and they need to be. Obviously there's the importance of detecting volatiles. It's not done enough, and I think we're finding that there are many volatile compounds that are very important, both as biomarkers and for understanding more about human, mammalian and plant metabolism. How things smell actually has a lot to do with how things taste, so this is very important in the food industry. So, volatiles. Quantitative, quantitative: if that's the only message I get across, I hope that's the one that sticks. Quantification is critical if any field in omics is going to be useful to the general public, if it's going to last and persist. Sequencing has been such a success because it is quantitative. Proteomics has struggled and transcriptomics has struggled because they 
weren't quantitative, largely by choice. In metabolomics we have that choice, and we come from a long history of analytical quantitative chemistry. We shouldn't ignore it. We should embrace it and use it.
Using smaller devices, making things cheaper: we've seen that time and time again. DNA sequencers from a decade ago were the size of refrigerators, and now they're not much larger than a toaster. UV specs 80 years ago used to fill a room, and now, again, they're the size of a toaster. It's happening, and it will happen for metabolomics. Many of the devices probably will drift away from big million-dollar FT-ICRs or NMRs to possibly handheld instruments. And I think as these things get smaller and cheaper and more accessible and easier to use, we'll start to see them more frequently used in doctors' offices or in the field, on the farm, and that I think can have a profound change on how we do things.
So I think the future of metabolomics is bright. The heyday of genomics, I mean, it's still there, but a lot of the technology has been done. We can't get much faster. The techniques are cool, and it's now a mature technology. We just have to crunch through these things. Proteomics, with advances in mass spectrometry, improvements in the technology, and the adoption of quantitative proteomics, I think is still very much in its prime in terms of technological and computational development. Metabolomics is only about a decade old, and who knows where it's going to go. Hopefully it'll pick up, hopefully it'll be important, and we'll see innovations, the same things that revolutionized genomics and proteomics.
So it's time for me to wrap up. I want to thank Jeff for all of his help and Michelle for all of her guidance, and also thank you all for listening. There's one survey, and I appreciate all of your feedback. It's supposed to rain today, so do be aware. Can I just also ask your assistance in helping me clean up the room? If you could gather all of your various garbage and just deposit it in the bins, and we'll help
you prepare for the next class coming in. I think you're sitting beside a big power cord, if you could just unplug