So we have about an hour here where we're going to try to go through module two of the workshop. And again, these are just the usual Creative Commons license points. The focus today is on targeted quantitative metabolomics. I gave you a bit of an introduction to untargeted, just so you know what I'm talking about, but what we're focused on in both the lab and the lecture today is how to do targeted quantitative metabolomics. And just to note our timing: we've just finished our break here, so in Montreal it was lunch, here it was coffee. After this module it'll be lunch in Edmonton and a coffee break in Montreal.

To emphasize a couple of things: we're going to focus on understanding the differences between targeted and untargeted metabolomics, and on why it's important to quantify things in metabolomics. Then we're going to look at the three major platforms in metabolomics, and we're going to show you how NMR-based quantitative metabolomics is done, how targeted GC-MS metabolomics is done, and how targeted or quantitative LC-MS metabolomics is done.

You saw this slide earlier. There are two parallel paths: a targeted approach to metabolomics and an untargeted approach. Targeted is a quantitative approach; untargeted is a non-quantitative approach. Targeted methods are often available in plate-format systems. Targeted can be done with triple-quadrupole MS or QTRAP MS; it can be done with NMR; it can also be done with GC-MS. There are kits available for these approaches, and it can also be done in-house; it's done by many groups. Untargeted metabolomics, on the other hand, typically has to be done with mass spec, LC-MS, usually high-resolution mass spec. The process, as I said, is to collect lots of spectra, do lots of alignment, clustering, and peak detection, and then do things like feature selection and reducing the number of signals so that you can do some metabolite annotation or identification. So the focus in untargeted is to do the stats first and the identification later, or shoot first and ask questions later, while the targeted approach asks the questions first and then uses the quantitative data to identify biomarkers and make biological interpretations.

So, targeted versus untargeted. The targeted side, as I said, has defined coverage. It's a pre-selected set: targeted can do as few as 10 metabolites at a time and as many as 1,400 now. Targeted is generalizable to all the major platforms, and you guys are going to see this today. It was developed specifically for hypothesis testing, but a lot of people are now using targeted metabolomics for discovery: discovering biomarkers, discovering biology. And this is because, if you quantify things, you'll find that most biomarkers are based on quantitative values, on whether you're above or below a threshold. If you have a threshold, you have a biomarker. Targeted metabolomics focuses on absolute quantification, meaning you're getting millimolar, micromolar, nanomolar values. It's a very fast approach, as you'll see. It's good for automation, it's good for kit systems. As I said before, it's something that can be very standardized and increasingly is being standardized: you can follow ISO guidelines and put it into practice for clinical work.

Untargeted, as we said before, is open-ended. You're measuring tens of thousands of features. It's really specific to high-resolution mass spec; you can kind of do it by NMR, but that's not really done anymore.
Untargeted is, in principle, ideal for compound discovery. It's good for exploration, it's good for hypothesis generation. You can do what's called relative quantification: this is higher than that, but you don't get concentrations. And of course the relative values vary from lab to lab and day to day. It's not very fast, it's not very automated, it's not very standardized, so this has been a problem with it. Of course, it's fun if everything is non-standard; it makes it more interesting for students and scientists because they get to invent their own processes, but it doesn't really get you very far in terms of translation.

So the myths about targeted metabolomics are many and widespread. Some of the things people say about targeted or quantitative metabolomics: most people think it's too expensive; most think it's too time-consuming; most think it's less sensitive; most think it lacks coverage and is not as comprehensive; and many people think it's less likely to lead to significant discoveries because you can't do hypothesis generation. So what typically happens is that targeted or quantitative metabolomics gets relegated to a minor role of confirmation, while untargeted metabolomics has been getting the starring role of discovery. And that's reflected in the number of papers that have appeared. In the early days of metabolomics, targeted metabolomics was the most common approach: 76% of all papers 10 or 12 years ago were targeted, and untargeted was rare, at 24%. And then if we classify things further: yes, you can do targeted metabolomics, but were you also doing quantitative metabolomics? That fraction was even smaller. You can do targeted metabolomics in a way that's not quantitative, that is, you aren't using reference standards, you aren't using calibration standards, you aren't using C13 isotopes or other things. Now, in 2022, which is the last year we looked at, untargeted has moved ahead: almost two-thirds of papers are untargeted and one-third are targeted. And the percentage of quantitative metabolomics papers has continued to drop.

This is a very disturbing trend for me, and perhaps for many other people, because the only way you can get metabolomics to move into industry, the only way you can get it to move into pharma, the only way you can reproduce it, is to have something that's quantitative. That's the essence of science. So at some level I worry that metabolomics is moving toward extinction, because if we're only doing untargeted work, then it's a free-for-all and nothing will be reproducible; nothing will be used in the world of industry.

So why have we been shifting to untargeted? Again, people think: if I do untargeted, I'm going to discover a new compound, a new biomarker. I can speak from almost 20 years of experience looking at this: an average of only one to two new compounds are identified each year, so only one to two compounds for every 10,000 published metabolomics studies. So if all of you published, say, two papers a year, every year, over your lifetime you would likely never report the discovery of a new metabolite. The odds are against you. You're not going to discover new metabolites by untargeted metabolomics. And generally you can get more compounds identified by targeted methods than by untargeted. Right now the current targeted methods average about 500-plus compounds, and some methods can get up to 1,400.
When I've done literature reviews on untargeted, the average number of compounds identified by untargeted methods is less than 100. So the fact is, targeted methods identify many more molecules than untargeted; there's a much larger set they're working with, and therefore probably a better opportunity for discovery.

There's also the assumption that untargeted is faster and easier than targeted methods. The fact is that untargeted methods, because there's so much data analysis and data processing, all the follow-up on all those peaks that have to be characterized, all the batch-control approaches you have to implement, all the challenges with data processing and reprocessing and reprocessing again, are generally five to 50 times slower than targeted methods. People usually say that untargeted is cheaper. In fact, targeted assays can typically be run for as little as $20 a sample, while if you go to a lab or core facility, untargeted methods often cost $200 to $300 per sample because of the high data processing and informatics costs.

In the world of untargeted there's certainly more opportunity for software development and software integration. That's good on one level, but if none of it is standardized, it means you're writing software for an audience of two, because everyone's got a different method, a different approach, a different style. So yes, you can publish a lot of software, and that's what a lot of people do, but let's just say these things aren't getting used or cited, because it's just for your lab and your friend.

Another myth: you're more likely to find novel, patentable biomarkers with untargeted methods. The fact is you can't patent natural compounds, and most biomarkers that are found and assessed are ones where you're using concentration thresholds. It's not about the marker itself; it's about the concentration. And if you don't measure concentrations, you can't get FDA approval: the FDA, Health Canada, and regulators everywhere else require precise quantitative measures. If you don't quantify, you can't get anything translated to the clinic, to industry, to pharma, to anywhere else.

So, in terms of targeted metabolomics... yes, there's a question. [Question from Montreal:] I've noticed many targeted kits and extraction methods are specific to tissues or fluids and model organisms. How much of the increase in untargeted publications is coming from plant or non-model animal species that don't have any kit options? So it's true, a lot of the targeted methods are specific to biofluids or in some cases tissues, and the quantitation and normalization have been done for common or frequently used samples. But there are targeted kits or systems that have been published for tissues, there are ones that have been published for plants, and others are being established for the microbiome, so it's starting to appear. But yes, if your system is not part of the favored targeted set, then you have to develop your own targeted assay, or you have to use untargeted methods.

So, in NMR-based metabolomics the number of compounds you can identify is between about 50 and 200; GC-MS, between 20 and 120; something called direct injection or direct infusion mass spec, 150 to 400 compounds; LC-MS-based methods, 300 to 800; and lipidomics, up to maybe 3,000 compounds that can be identified and semi-quantified. Sensitivities range from micromolar to nanomolar: NMR is the least sensitive, and MS is the most sensitive, where you can get down to nanomolar.
One of the things we have to remember when we're doing metabolomics is that we're essentially doing analytical chemistry. Analytical chemistry has been around for 100 years; it is the branch of chemistry that deals with the quantitative determination of the chemical components of substances and mixtures. That's a definition of analytical chemistry, and the word underlined and marked in red there is "quantitative." I think what's happened in both metabolomics and also proteomics is that we've forgotten the definition of analytical chemistry; we've forgotten the importance of quantification. We've been affected by what I call proteomics influencers and proteomics refugees, because a lot of people who do metabolomics originally came from the proteomics world. Today's proteomics workflows and MS-based proteomics concepts have been widely adopted in the MS community. Some of that's good, but it's also been problematic: the number of papers in the field of proteomics has been steadily declining because people didn't bother quantifying, and as a result nobody has been able to identify any useful protein biomarkers for more than two decades. And there's the general belief that medically useful biomarkers are qualitative measures. That's wrong. Every biomarker that's used, except for histology imaging markers, is a quantifiable entity. We have concentration values for glucose and creatinine and for everything else that's used by doctors, or by anyone else in the world of toxicology or environmental testing.

So the issue of quantification in metabolomics is not new. There's a feature article in Trends in Biotechnology with a title along the lines of "a spectre is haunting metabolomics: the spectre of quantification." It's written by two scientists, and they basically came to the conclusion that the field has to become more quantitative if any of the findings are going to be translated to practical applications in human health and medicine, but also in environment and agriculture and any other field.

When you quantify, and this is what analytical chemists have known for many years, it's reproducible: you've got a value. It's quoted in nanomolar, micromolar, millimolar, and therefore it's verifiable. The concept of nanomolar, micromolar, millimolar is universal; anyone, no matter which country you're from or which discipline you're in, knows what that means. The units don't change over time. It means you can do direct comparisons to standard reference values, and there are lots of reference values. Anyone who's been in clinical chemistry knows there are textbooks of these values, and HMDB has thousands of these reference values. This means that if you can use reference values, you don't always have to have hundreds of healthy control samples in every study. That's the normal design in untargeted metabolomics: you have to have dozens to hundreds of healthy controls because you don't have anything to reference to. With quantitative values, you can just look things up in the standard reference tables. These are universal, and they are published for different age groups, population groups, and ethnic groups; they all have reference values.

If we look at the most important metabolomics discoveries over the last 15 years, most of these have involved measuring concentration differences of well-known metabolites. Trimethylamine N-oxide (TMAO) has been identified as critical in the development of cardiovascular disease.
TMAO is well known; threshold values were identified, and that made these high-impact papers published in Nature and Science. Branched-chain amino acids and aromatic amino acids were identified in the late 1990s and the 2000s as being critical in diabetes. These are well-known metabolites, but their concentration differences between normal, healthy individuals and diabetic ones is the fundamental finding. The discovery of host-gut-microbiome metabolic interactions, the discovery of uremic toxins: always changes in their concentrations. These were known compounds, but their concentrations were the things that made them important. The discovery of oncometabolites: again, these compounds were known, fumarate, succinate, lactate, 2-hydroxyglutarate, but they change in their concentration, and that's what makes them oncometabolites. The role of metabolites as immune signaling molecules: again, all well-known molecules, and again, it's the change in concentration that was the discovery that made the impact. And it goes on and on: well-known molecules, where it's the change in their concentration, their absolute concentration, that determines whether this is a major discovery or something minor. It's not a new compound that's being discovered; it's the change in concentration that led to these papers, which have been cited thousands and thousands of times.

So if quantitation is important, how do you make it simple? You can do quantitative metabolomics yourself. There are protocols that are published, and if you read the methods carefully and you know a little bit about chemistry, you can do it yourself. You obviously need an instrument. You need to spend some money to get some isotopic standards or labeling agents for LC-MS, or in some cases you don't even need labeled agents because you can do it by NMR or GC-MS. You do have to make reference calibration curves; it's dull, but it's important. And you can sometimes use existing software, or develop in-house software, or many vendors supply the software. So you can do it; many dozens of labs create their own quantitative metabolomics assays.

You can also spend some money. You can buy things like the NMR FoodScreener from Bruker, or Bruker's B.I.-LISA or B.I.Quant packages; these are instruments bundled with the software: just install, press go, and you're off to the races. You can get, or used to be able to get from SCIEX, something called the Lipidyzer. You can get a Seahorse. You can also get a commercial clinical analyzer. These are instruments that quantitatively measure many metabolites. They're expensive (the Seahorse is cheaper), but these are things that you can do.

You can also send samples to academic labs and core facilities that do metabolomics services, including certified labs. TMIC has a number of labs that do quantitative, absolutely quantitative metabolomics. There are institutes in the U.S.: the Broad Institute at MIT, the University of Washington in Seattle, Beaumont in Michigan, Chapel Hill, Duke, the West Coast Metabolomics Center. All of these centers offer some quantitative metabolomics assays that you can buy at sort of an at-cost level. You can also send samples to commercial labs. A lot of groups, especially large epidemiological groups, will send their samples to Metabolon, which has almost 20 different quantitative assays. Biocrates is a company in Austria. Chenomx is a company here in Edmonton. Nightingale is a Finnish company. Metware is another one. These are commercial facilities that will also do quantitative metabolomics.
Or, if you've got the instrumentation yourselves, you can buy and run kits, just like you would in molecular biology. These can be kits, or they can be methods. You can get kits from Biocrates. You can get kits from TMIC here in Edmonton. You can get methods from SCIEX, like the lipidomics protocols. You can get software and tools from Chenomx, from various MS vendors, and from white papers. So whether it's the kits or the methods, you can do this yourself as well.

So I'll talk about some of the kits that we have developed in TMIC, and this is the basis for what you guys will be doing in the lab. The reason we're pushing them is because this is a made-in-Canada approach, it's being done by TMIC, and TMIC is one of the sponsors of this workshop. But it also opens the door for people to do quantitative metabolomics, and we'll show you how easy it is.

There's an LC-MS kit system that's been developed, which can range from measuring 140 compounds up to 650 compounds, and it's being extended to measure up to 1,200 or 1,300 compounds. Just like molecular biology kits, these are relatively easy to use: everything you need comes in the box, there are training videos, and it's been transferred to several labs. It's relatively cheap, anywhere from $45 to $90 per sample depending on the type of assay, and you can run lots and lots of samples in a 96-well format.

There are also GC-MS kits. This is also a quantitative approach. It measures up to 100 compounds; it's bundled with the chemical standards and the derivatization reagents; it has the software; it accepts a variety of file formats; it's very fast, as you guys will see; and you can get very high-accuracy identification. It can run on several different sample types; we'll be working with one that's specific to urine. It certainly requires some sample preparation and derivatization, which is what you have to do with GC-MS.

We've also developed, and we'll explore, NMR kits. The idea again is to help standardize things, to help other labs standardize how they do NMR. There are a number of new software tools for this. It makes things very fast, and it allows you to interactively work with the spectra. The kits themselves have all the reagents, the standards, and the software, and you can analyze up to 60 or more compounds in different sample types. This is much cheaper than the GC and LC runs; it's about $10 per sample.

I'm going to talk about each of these methods: NMR first, then GC-MS, then LC-MS. This is just to give you an overview, and then after lunch we'll do the lab, where you'll actually work with some real samples and the software. We're not going to put you into the lab, because otherwise we'd have to have everyone approved for lab access, and we don't have that many instruments; instead, we're going to give you data that was collected on instruments using these kits, and then you're going to process it.

So, the NMR kit, as an overview: typically you've got your sample, maybe a urine or blood or serum sample. If it's serum, we do an ultrafiltration to remove the proteins. We add a couple of compounds that come in the kit: one is used as a pH reference, another is a chemical shift reference. We put the sample into an NMR tube, we collect the NMR spectrum, and we process the NMR spectrum, and that's where the software we'll use comes in.
We'll analyze it using a software tool called MagMet, and it'll produce a list of metabolites and their concentrations. The focus for you is going to be these last three steps; the first part is stuff that someone else has done for you.

Typically when we do NMR, we take a spectrum and we get what's called the free induction decay, or FID. That's that noisy, ringing-bell signal that I talked about before. We use the Fourier transform to convert that so we can see the spectrum in a form that makes sense to us. But often when you do the Fourier transform, things are upside down, or what we call dephased. So you have to do a manual step called phase correction, where you change the spectral or display parameters so the peaks are pointing up and things are correctly phased. You have to make sure that the baseline is flat. You have to get rid of the water peak, which is the big thing on the left. And then you have to reference it, so we get a zero-point chemical shift reference. That's something NMR people kind of like to do, but it takes about 10 minutes per spectrum. So if you've got 100 spectra, that's a lot of time, and it's just menial, tedious stuff.

Once you've done that manually, you do the thing called spectral deconvolution, which is illustrated with this figure. Let's pretend we have a mixture containing three compounds: A, B, and C. In NMR, things add based on their concentrations and their peak intensities. So if you visually look at A, B, and C and see where the peaks line up, you can see how the sum of A, B, and C equals the mixture. Deconvolution is the reverse of that; it's an inverse problem. It takes the mixture and asks: what are the components in there? It does this by spectral matching: it has to match these peaks against a known set of peaks from pure compounds in a database, and then it's got to adjust and match, adjust and match, to see if those peaks will all fit, and whether you can come up with a solution that says yes, this top spectrum is the sum of these three compounds, not the sum of four different compounds or 12 different compounds. It's the intensities and the positions. That's spectral deconvolution; there's a minimal sketch of the idea in code below.

Spectral deconvolution can be done manually, semi-automatically, or fully automatically. The manual approach is done with a software tool from Chenomx, which is a company based here in Edmonton. You fit the spectrum by hand: they have a large library, and you do this guess-and-check deconvolution, like I showed here. Here's the spectrum at the top; guess which compound it might be. Could be compound Z, could be compound X; you try it, does it fit? No. Try compound Q? No. Compound A? Sort of. What about compound C? Yes, maybe. And it's back and forth, a guess-and-check process, where you're dragging and dropping and adjusting the spectrum to match the peaks.

Now, if you can automate that process, not only the deconvolution but also the phasing, peak correction, and baseline correction, then you can make it a lot faster, potentially 10 to 25 times faster. And if the precision and recall are proven, it means you can let the thing run overnight instead of spending hours staring at a computer.
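To make the mixture-equals-sum-of-components idea concrete, here's a minimal sketch of deconvolution posed as a nonnegative least-squares problem in Python. This is not the actual Chenomx or MagMet algorithm (those also shift peak positions and adjust linewidths during fitting); the reference "spectra" here are made-up intensity vectors, just to show the inversion step.

```python
import numpy as np
from scipy.optimize import nnls

# Toy reference spectra for pure compounds A, B, C,
# all digitized on the same chemical-shift grid.
A = np.array([1.0, 0.0, 0.5, 0.0])
B = np.array([0.0, 1.0, 0.0, 0.2])
C = np.array([0.3, 0.0, 0.0, 1.0])
library = np.column_stack([A, B, C])

# A mixture made of 2 parts A, 1 part B, and 0.5 parts C.
mixture = 2.0 * A + 1.0 * B + 0.5 * C

# Deconvolution: find nonnegative weights w with library @ w ~= mixture.
weights, residual = nnls(library, mixture)
print(weights)  # -> [2.0, 1.0, 0.5], proportional to concentrations
```

In a real spectrum the fit is iterative, exactly as described above: the software shifts, scales, and re-fits until the weighted sum of library spectra reproduces the observed mixture.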
It also means it's consistent and reproducible, because it's not prone to user bias or the errors that humans make, and it can often detect signals that humans don't easily detect or that individual analysts disagree on. The software tool that automates this spectral deconvolution is called MagMet: magnetic resonance for metabolomics. We originally had a program called Bayesil, but MagMet is newer, faster, and gives us greater flexibility. It uses a combination of machine learning and expert rules to do the pattern fitting and deconvolution. It has automatic phasing, automatic chemical shift referencing, automatic water removal and baseline correction, and automatic peak deconvolution, meaning it identifies and quantifies everything. On average it can identify about 55 to 60 serum metabolites in about 10 minutes, completely without human intervention. So you can run it in parallel or overnight and do hundreds and hundreds of spectra. It has been adapted to things like wine, beer, juice, and other fermentation products.

This is what it looks like in concept. You're doing peak integration: this is the sort of spectrum you see with a mixture, and where you see elements that are covered or overlapping, that indicates where the fitting has been performed. On the other side you're seeing the iterative fitting, where it looks at some peaks and says: if I shift left or right, does it fit? If I scale up or down, does it fit? So you're seeing the outline of the spectrum, and then you're seeing where the fitting has been done and how well it overlaps. The color indication tells you how confident you are in the metabolite assignment, and the integrated area gives you the concentration.

The serum version, which is the one you guys will try, can run in five to 10 minutes, identifies about 57 metabolites, and the variation in the concentrations is often well below 10%, sometimes as good as 5%. You can run it on fecal water, you can run it on CSF, and we've done it on wine and beer. It's fully automatic, and it does the spectral deconvolution, the area calculation, and the concentration calculations.

In terms of operation, it'll typically take the spectrum and do the Fourier transform in about five seconds. It'll do the phasing in about 15 seconds. It'll then do further pre-processing, cleaning up your spectrum, in about another 30 seconds. The rest of the time is spent fitting the peaks against the library of compounds in its spectral library. Once it's done, it produces a fitted spectrum. So this is the full spectrum, and below it an expanded view, so you can zoom in and see how the fitting performed and how good the match is. Most of the time you don't have to do any other adjustments. Sometimes it goes off on a wild goose chase and does need a bit of intervention, so the idea is to have the software and the visualization tools that allow you to make those adjustments in case it went astray.

Once it's finished its deconvolution, it gives you the list: the compound names, the HMDB identifiers, the concentrations, and also a confidence score. Not every metabolite is perfectly identified; some have just a single peak, others have 30 peaks. If you have a metabolite with 30 peaks and they all match, then you can be essentially 100% confident it's there. If you have a single peak,
it could be something like acetate or formate, but if it's in a non-unique region then you're not so sure, so there's a confidence score associated with each metabolite.

[Question from the audience about whether the method can distinguish isomers, such as D- versus L-forms.] No, NMR can't generally distinguish between stereoisomers. But we know which of these compounds are the ones found in the human body: there isn't D-carnitine in mammals, and D-glucose is the preferred version found in mammalian systems, so there's an assumption about what these things could or should be.

So, on to GC-MS metabolomics. Again, I'm just giving an overview, because we're short on time here. As I said, there are kits available, and I've highlighted some of them and what they do. The one we'll be using is specifically optimized for urine; the sample data we'll be using in the lab is for urine. And yes, running the kit takes a few hours, but once you've got the data, the analysis in principle should take only a minute or two.

The kit assay for this is a little different. We take a urine sample and add reagents like methoxyamine and MSTFA, incubate for an hour, and then transfer the mixture to an autosampler vial. Once you've got the sample vial, you put it into the GC-MS instrument; it runs things through the GC column, and you get your chromatogram and spectra. The sample set will also include some blank runs, because blanks are typically run in GC-MS, and some calibration points, which are also typically part of a GC-MS run. You take the data that comes out and run it through this software called GC-AutoFit, and just like MagMet for NMR, it'll produce a list of metabolites and their concentrations.

The concept in GC-MS is this: you've got your GC chromatogram, and you choose a peak. Sometimes the peak is a pure compound; sometimes it's two or three compounds. So you do a sort of spectral deconvolution, and you look at the fragment-ion spectra; these are the EI-MS spectra. In this example, this particular peak had three compounds that were eluting at almost the same time, and they had three different EI-MS spectra. Then we compare those spectra against our database; in this case, GC-AutoFit has its own database. It matches these things, it identifies the specific compounds based on those spectral matches, and you can see, just by visually comparing the different spectra, how they match. Once you've got a good match, you've got your compound, and you can integrate and figure out the concentration.

Again, we talked about this earlier: with EI-MS spectra we have multiple peaks. You can collect them on pure compounds, or you can also predict them. That gives you the structural uniqueness: each spectrum here is unique, and each of them corresponds to a compound and a specific structure.

[Question: do these software tools require certain types of data files?] Yes, there are certain data file types that are allowed; those are mentioned in the software, and we'll get into that in the lab, about what file formats they accept.

So again, with GC-MS you're typically working with EI-MS, and you're also working with derivatization: TMS derivatization, TBDMS, methoxyamine derivatization. These things change the chemical structure, and you're also going to change the mass.
So if you're looking for a metabolite but it's derivatized, it's going to have a different mass than what's listed in, say, HMDB. You need to know that to make sure you're identifying the right compound. In terms of applications, most people apply GC-MS to things like amino acids, organic acids, sugars, fatty acids, and other low-molecular-weight compounds. I mentioned that gas chromatography has very high resolution, and it's more reproducible than LC; this is why it's generally quite popular, and it's also a lot cheaper than LC-MS. EI-MS also has very standardized ionization: it's always exactly the same energy, 70 electron volts, whereas with ESI it's all over the place. That means the comparison between EI spectra and the library is really good, whereas LC-MS/MS spectra are all quite different, even for the same compound: different fragment ions, different intensities. It's actually quite confusing. So EI-MS and GC-MS are highly reproducible and very standardized, which is great for translation and applications.

Anyway, the spectra are compared to a database and scored by their similarity, and the similarity is computed through a score called the match factor. It's basically a dot product. If you remember, A · B = |A||B| cos θ; or, for two vectors written out in components, a1b1 + a2b2 + a3b3, that's a dot product as well. You normalize the intensity vectors and multiply by 1,000. So that's the formula, but it basically says: how many peaks match between the observed and expected spectra, and how well do they match? If you've got a perfect match, the match factor is 1,000: the normalized A · B equals one, times 1,000, equals 1,000. If A and B don't match too well, it might be 0.3 or 0.5 times 1,000, in other words between 300 and 500. (There's a small code sketch of this after the retention index example below.) Here's an example where the match factor is 823, or 0.823: you can see almost all the peaks match, though the intensities don't quite. And here we're looking at 2,5-dichlorophenol and 3,4-dichlorophenol. They have exactly the same mass, they're just isomers, and their mass spectra match almost perfectly; most people would say they're the same compound. But they probably have different retention indices, and that would be the way to distinguish them. Typically a threshold of 800 is used, and above 750 is usually considered a match. In this case you'd be wrong, but it's darn close.

Now, to do all this, first of all you need some standards. You use alkanes going, say, from octane to hexadecane. These are your calibration standards for your retention index, but they can also be used to help with some quantification. Then there's the blank sample, which contains your solvent and derivatization reagents, like the MSTFA or methoxyamine. Those are there because the reagents show up when you run your GC-MS, and you don't want to chase useless peaks. Then you run your sample of interest. So there are three runs that you do. And from there you create your calibration file, which uses your retention index standards to set the retention index values for all the other metabolites in your sample of interest.

This is what your alkane standard run might look like: there are these eight or nine standards, and they separate, in this case over about 10 minutes, ranging from octane to hexadecane. From this set of standards you can calculate your retention index. The retention index basically works by looking at the known peaks on either side of your unknown. In this case, two is the unknown, and one and three are the known ones.
One is hexane, three is heptane; that's known from your calibration. So we knew that hexane came off at six and a half minutes and heptane came off at 11 and a half minutes. You apply a two-minute correction for the solvent delay, and then you calculate the index using this equation: RI = 100 × [n + (N − n) × (log t′(unknown) − log t′(n)) / (log t′(N) − log t′(n))], where n and N are the carbon numbers of the alkanes eluting just before and just after the unknown, and the t′ values are the adjusted (dead-time-corrected) retention times. In this case hexane has six carbons and heptane has seven, so n is six and N is seven, and you're interpolating on the log of the adjusted retention times. The retention index for compound number two comes out to 644. So even though its retention time was 6.25 minutes, its retention index is 644. And you can do this for all the other compounds using your hexane, heptane, octane, and so on. The retention index normalizes everything so that you're consistent; that's a real strength of GC-MS. Then you can analyze your sample data once you've got these retention indices calibrated. (There are small code sketches of the match factor and this retention index calculation below.)

[Question.] Yes, there's an isothermal retention index, and then there are some modified ones. But in essence, if you've separated your alkane standards using the same temperature profile, then you should be able to create a retention index that is the same, because you're always referencing to your alkane standards, and you're using the same formula. So in principle those retention indices should be the same. They're nominally called isothermal, meaning the same temperature, but when you look at the literature, even if you have a temperature program where you're ramping, they still match almost perfectly to the isothermal ones.

So: you take your alkane standards and get your retention indices; you use your blanks to get rid of the false positives, which cleans up your extra peaks; and now you can start using the match factor and the integrated areas under the peaks, together with the retention times or retention indices, to identify and quantify your compounds. That process of matching, identifying, and quantifying is done by the software called GC-AutoFit. It needs three spectra sets: it needs your sample, your urine sample in this case; it needs a blank, which has your solvent and your derivatization mixtures; and it needs the alkane standards. And also, and this is sort of the dirty secret, there's a calibration run on certain compounds, and that calibration run was already done for you, so that you can do the quantitation. Once you've done that calibration run, you can quantify for months.

With those spectra, which you guys will get, the software will do the alignment, it'll do the peak retention index calculation, it'll match those retention indices to its known reference set of retention indices, it'll do the peak identification and the peak integration, and then the concentration calculation using the calibrations that were run months ago. It handles different file types, netCDF and mzXML files. It's fast, on the order of a minute per spectrum, and you can identify up to 100 compounds with pretty high accuracy. The one you're running is optimized for urine, but it can be adapted to other biofluids. You still need to do the derivatization and the sample run, so it's not press-a-button-and-close-your-eyes; there's a bit of work, as there is with any molecular biology or chemistry kit.
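Since we're short on time, here are two tiny Python sketches of the calculations just described. First, the match factor as the simple normalized dot product described a moment ago, scaled by 1,000. (Production implementations like NIST's also weight the intensities, for example by square roots or by m/z, so treat this as the conceptual version only; the intensity vectors are made up.)

```python
import numpy as np

def match_factor(library_spec, observed_spec):
    """Normalized dot product of two intensity vectors, times 1000.
    1000 = identical spectra; roughly 300-500 = a poor match."""
    a = np.asarray(library_spec, dtype=float)
    b = np.asarray(observed_spec, dtype=float)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1000.0 * cos

# Toy spectra: intensities binned at the same m/z values.
lib = [100, 40, 0, 15, 60]
obs = [ 90, 45, 5, 10, 55]
print(round(match_factor(lib, obs)))  # -> 996, a very strong match
```

And second, the isothermal (Kovats) retention index. The numbers below reproduce the lecture example, assuming the quoted times are the adjusted retention times after the two-minute solvent-delay correction: hexane (C6) at 4.5 minutes, heptane (C7) at 9.5 minutes, and the unknown at 6.25 minutes.

```python
import math

def kovats_ri(t_unknown, t_before, t_after, n_before, n_after):
    """Isothermal Kovats retention index from the adjusted retention
    times of the bracketing n-alkanes (carbon numbers n_before < n_after)."""
    frac = ((math.log10(t_unknown) - math.log10(t_before)) /
            (math.log10(t_after) - math.log10(t_before)))
    return 100.0 * (n_before + (n_after - n_before) * frac)

print(round(kovats_ri(6.25, 4.5, 9.5, 6, 7)))  # -> 644
```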
There are different input files: the alkane standard, the blank, and the sample files. These are all required; you have to upload them. Some may require file conversion, which is explained on the web server. You guys, I don't think, will have to do that; everything is bundled for you. But depending on the instrument, there is support for this. If you need to convert, there's a really good program called ProteoWizard, which allows you to convert almost any GC-MS or LC-MS file into some kind of standardized MS file format. So be aware of ProteoWizard; this is something that many, many people use all the time.

This just shows you how you would upload the GC-MS files. You can upload your spectra files individually, the alkane standards, blanks, and samples, or they can all be zipped together, and the program will figure out which one is the alkane run, which one is the blank, and which one is the sample. There are example samples you can run so you can get acquainted with it; for the lab, you guys won't be running the examples, you'll be running real samples collected for this course. If you're running these things yourself, you should check your alkane standards, and there are tools to visualize them online. You can see your spectrum; in this case it looks like there are about a dozen peaks in the reference alkane set. Here's your sample spectrum; you can see there are quite a few peaks. If this is a urine sample, there are literally hundreds of peaks, and you can zoom in and scroll around just like you can with the NMR spectra.

As it runs, it produces a list, the same kind of thing: here's the HMDB identifier, here's the compound name, here's the calculated retention index and retention time, the intensity, the match factor, and then the concentration. All the peaks in this case are identified and quantified in the spectrum, which is also interactive. You can then export all of that to CSV or Excel format and download the results, and as an aside, you can also view the spectra interactively, just like with the NMR one.

There's a difference between running the kit, which means you're at the instrument with gloves on, pipetting and running things for an hour or two, versus this, where the data has been collected for you and the question is how you interpret it. Running GC or LC or NMR is something people have been doing for decades; this interpretation software is new, it's not yet widely done, and the automation is relatively recent. That's what's made metabolomics so interesting.

So we've done NMR and we've done GC-MS, and, Nia has a question? Oh, I've got five minutes left. So we're going to talk about LC-MS metabolomics, and specifically targeted metabolomics by LC triple-quad. This is a tandem mass spectrometer; we say QqQ, meaning quadrupole, collision cell, quadrupole, but it's still called tandem mass spec. Things get ionized, things get dissociated, things get detected. We've seen this slide before, and the focus for this particular lab and this example is multiple reaction monitoring (MRM), which means choosing a specific analyte. It's the one that's shown here: red, triangle, blue, that's a compound. We've chosen a specific precursor; we know its mass and its retention time. And then we look at its product ion, a specific product ion: it's only the red product ion that we're looking at. So it's a precursor-product ion pair that we look at.
That pair, the mass and retention time of the precursor together with the product ion, is sufficient to fully identify that molecule, even if our mass resolution is only one Dalton. And it simplifies our spectra: we're not seeing thousands of peaks anymore, just pairs. We can integrate the peak areas, because we have reference calibration samples and isotopic standards in there, and so we can actually get the concentration. Here's a total ion chromatogram from a single quad: pretty messy. If we've done the triple-quad MRM and chosen just one precursor-product pair, say 228.6 to 157, eluting at 9.89 minutes, we see basically that one peak. So instead of masses of peaks stacked on top of each other, it's just this one. And that one peak is one we can integrate easily, and that's what allows us to (a) identify and (b) quantify.

So we have to have a list of precursor-product ion pairs, and we need their collision energies, their declustering potentials, and their retention times; that's the input information, and it's required for all the target molecules. If your kit has 140 metabolites, you have to do that for 140 metabolites; if the kit has 640, you have to do that for 640, in principle. This is used as input to the acquisition software, and this is the table that goes into the software you guys will use. (There's a small sketch of what such a transition list and a calibration fit look like after this overview.)

Typically it's run on 96-well plates, so a lot of the software used by vendors or companies will have ways of registering the samples: matching which wells are the calibrants, which are blanks, which are standards. Typically you have to add isotopic standards, you have to generate calibration curves, and you have to do the integration to make sure that the points, say seven of them, follow a clear line as you go up in concentration. Those things are usually built into the kits, as part of the MRM list and as part of the calibration process. Once you've got those, the MRM transitions are identified from your actual samples, qualifier ions are identified, peaks are picked, peaks are integrated, and concentrations are determined. And then those integrated areas and concentrations for all the metabolites are used to produce a list, just like for GC-MS and just like for NMR. All of them produce lists of metabolites and concentrations.

The kit that was run for your samples is one of the ones that measures about 140 metabolites, so it's a smaller kit, relatively fast, in 96-well format. In the actual process used in the kits, samples are split into one derivatization for amines and another derivatization for organic acids, then separated and run on these triple-quad instruments. You can see the derivatization reagents that were used; this helps in quantification, and it also helps in enhancing the signal intensity. It makes things a lot easier. We've got a mix of liquid chromatography mass spec as well as flow injection, using anywhere between 20 and 50 microliters. It takes about 30 minutes per sample, so 96 samples run automatically on an instrument over about two days, measuring many different types of molecules. And there's software; typically we want the software to process each sample very quickly. We use chemical tagging to allow a single-column separation. There are different panels: one for amines, one for organic acids, and another one, flow injection, for lipids.
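To make the input side concrete, here's a minimal sketch of what an MRM transition list looks like as data. Every value below is illustrative only, not the kit's actual acquisition parameters.

```python
# One entry per target metabolite: precursor (Q1) and product (Q3) m/z,
# collision energy (CE), declustering potential (DP), retention time.
transitions = [
    # name,            Q1,     Q3,   CE,  DP,  RT (min)
    ("Phenylalanine", 166.1, 120.1,  15,  35,  3.2),
    ("Carnitine",     162.1, 103.0,  20,  40,  1.8),
    ("Metabolite_X",  228.6, 157.0,  25,  50,  9.89),  # hypothetical
]
```

And here's the quantification step just described: a linear calibration curve built from the analyte-to-internal-standard area ratios of the calibrants, which is then inverted to turn a sample's measured ratio into a concentration. The seven calibration points are invented for illustration.

```python
import numpy as np

# Seven-point calibration: known concentrations (uM) vs measured
# analyte / isotope-labeled internal standard peak-area ratios.
conc  = np.array([0.5, 1.0, 5.0, 10.0, 50.0, 100.0, 200.0])
ratio = np.array([0.011, 0.021, 0.10, 0.20, 1.02, 1.98, 4.05])

slope, intercept = np.polyfit(conc, ratio, 1)  # fit ratio = m*conc + b

# Unknown sample: measured area ratio -> back-calculated concentration.
unknown_ratio = 0.62
print((unknown_ratio - intercept) / slope)  # roughly 31 uM
```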
We have isotopic standards, and these have to follow very strict ISO requirements: high inter- and intra-day accuracy and precision, defined lower limits of detection, and CVs typically less than 15%. Chemical tagging is used a lot now in quantitative metabolomics because it offers a cheap way of getting isotopically labeled standards. If you had to buy or synthesize isotopic standards for every single compound, these kits would cost millions, but chemical tagging lets you do this much more cheaply, and it also improves stability, improves reversed-phase separation, improves ionization, improves sensitivity, and reduces costs. For the kit that was run, you have to follow a specific plate design, and there's lots of calibration; in fact, almost a sixth of the plate is used for calibration. Of the 96 wells, 14 are used to standardize, calibrate, and clean things up, and the remaining 82 wells are used to measure samples. This is the rigor that has to be applied to get quantitative, reproducible metabolomics.

Alan has been working on a lot of these. He developed the TMIC PRIME and then the TMIC MEGA along with his wife Tammy, and he's also working on a new one called the GIGA. The one you guys are going to be working with measures about 145 metabolites, and you get ratios as well. The MEGA, which is now more frequently used, measures about 650 compounds. Because you can measure and quantify absolutely, you can calculate sums and ratios, and that's actually really advantageous: you can pool all the triglycerides into one group, you can sum all the branched-chain amino acids together, you can calculate ratios of phenylalanine to tyrosine. You can't do that with untargeted or unquantified methods. (There's a small sketch of such sums and ratios below.) The GIGA, which should hopefully be ready in the fall, should measure about 1,400 compounds. Again, this is many, many times more than what you can identify through untargeted methods.

The software has been developed too. Vasu, who is here, has helped develop a lot of the software, along with Alan in terms of design. You choose the biofluid, you choose the files, you've got the retention time list, and you drag and drop it in; you'll have a chance to do this. In an ideal world, processing samples can be done quite quickly, and you guys will have a chance to do some processing. The goal is to do it all in about five or 10 minutes; right now it takes a couple of hours for about 640 compounds per sample. It's been used on a whole bunch of different sample types: blood, urine, and stool. The software is available as a web server, just like GC-AutoFit and just like MagMet. Right now the current version of the software takes about an hour or two to identify the metabolites in a sample; ideally that should shrink down to a couple of minutes. You guys are working with a smaller sample set, so it should take you much less time; we'll try that out shortly in the lab. You'll be doing interactive calibration-curve editing, interactive peak adjustment, and peak integration. Just like the others, it produces a list of concentrations in table format, and you can view the spectra to see how things are looking.

So, to wrap up: I think targeted quantitative metabolomics is actually much easier and much faster than untargeted, on all the major platforms, and you guys will have a chance to see that. There are different workflows and different types of software; that's certainly expected. Some platforms have some advantages; other platforms have others.
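Going back to those sums and ratios, here's a small sketch of why absolute concentrations matter for them. With quantified values in micromolar, derived quantities like a branched-chain amino acid sum or a phenylalanine-to-tyrosine ratio are directly comparable across labs; the numbers below are invented for illustration.

```python
import pandas as pd

# Absolute concentrations (uM) from a quantitative assay (illustrative).
df = pd.DataFrame({
    "Leu": [120.0], "Ile": [60.0], "Val": [220.0],
    "Phe": [55.0],  "Tyr": [60.0],
})

# Sums and ratios are only meaningful with absolute concentrations.
df["BCAA_sum"]   = df["Leu"] + df["Ile"] + df["Val"]   # 400 uM
df["Phe_to_Tyr"] = df["Phe"] / df["Tyr"]               # about 0.92
print(df[["BCAA_sum", "Phe_to_Tyr"]])
```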
And certainly we're going to try these things out on the web servers in the lab, which happens after the break in Montreal or after lunch here.