 OK, sorry about that. Hopefully we'll have a trouble-free morning. So welcome, everyone. Glad you could join us. I think as Michelle remarked, I think we've got a pretty diverse group with people with, I think, a wide range of experience and backgrounds. When you've got such a wide range of skills and interests, we're not going to be able to, I'm not going to be able to hit on everything that you would like to see. And in some cases, this might seem a bit of a review for you. Other cases, it's going to be brand new material and seems way above your head. We'll try and meander through that rather difficult territory to make sure that everyone's at least learning something. But if you are finding some of the material routine, well, you can sort of turn off for a little while. But I'm sure that there'll at least be something that will be new for you. So as with everything Michelle said, we're going to have this Creative Commons slide in front of all of our general lectures and even the labs. The other thing that we'll do is try and introduce each of the modules. You guys can follow along in your books. And then we'll also try and provide you with some learning objectives before each lecture and each lab. So hopefully we'll address those things. And then you can kind of tune in or tune out as necessary if you're quite experienced. So this one is really an introduction to metabolomics. Many of you are actually already doing metabolomics. But a few of you are fairly new and to the field. And so this is sort of a get acquainted lecture. We're going to talk about some of the applications of metabolomics or potential applications, many of them already mentioned by people. And we'll also talk about the technologies. The major platforms, primarily chromatography, liquid chromatography, gas chromatography, mass spectrometry, and NMR. Now some of you are already using these instruments. Some of you are familiar with one, but not the other. 
Some of you actually have more of a computational background. And so this will be new to you. Others are probably far more familiar with these technologies than I am. So again, we'll probably see a range. If you have some things you'd like to add or feel that we're missing something, let us know. Because again, it's a small group, and we could all share in that kind of expertise. And then we're going to talk a little bit about the differences between targeted and untargeted metabolomics, which I think is an important distinction. Michelle's already given you the schedule. And we'll try and follow that fairly closely. I think we're so far right on track. So I'll dive right in. And I think I'm going to continue sitting down, partly because it's a little easier for me to pivot and see people, but also so that I can keep close to the microphone. So I often begin a description of metabolomics with a picture like this, which I call the Pyramid of Life. And it illustrates a few things. One is that at the base is the thing that constitutes every living cell. It's DNA, which is the genome. And whether it's our genome or a microbial genome or a parasitic genome, the study of all of those genomes is done through genomics. Genes code for proteins, and the method for studying proteins, so the proteome, is proteomics. And then proteins are really designed to manipulate, catalyze, absorb, and transport metabolites. And so the study of the metabolites or the metabolome is metabolomics. So generally, we think of the base as the genome and everything else follows from it. But you can also look from a top-down perspective. And what I've illustrated on the one side is this influence of environment and physiology. And this is something that I think is quite important. The genome is remarkably stable. In fact, it's intended to be stable. We're isolating DNA from samples that are thousands, hundreds of thousands of years old, and they're largely unchanged. 
Proteome is a little less stable. Some proteins, eye lens proteins, are stable for up to 100 years, but most proteins turn over in a matter of hours or a few days. So they're not designed to last forever. The metabolites are something that are exquisitely affected by the environment. And they go up and down, depending on what you just ate, what you're drinking, the coffee you have, the tea you have, how you're breathing, how much you're breathing. And it's this fact that the metabolome is at that interface between the genome and proteome, and the environment makes it particularly useful for characterizing the phenotype. And so in that regard, the metabolome is regarded as the window on the phenome or the phenotype. And that's really important to remember, and that's why metabolomics has been picking up more and more in terms of popularity. Because outside of humans, you can't really ask a plant or a mouse how it's feeling, and you can't really expect to get an answer from a bacterium. So quantitative phenotyping is what metabolomics is all about. The other thing to remember, particularly for multicellular organisms, like humans, animals, even plants, that there's a tremendous influence on physiology. So in large part, the genome is the same in every single cell. Methylation will change the gene expression. Proteome is largely similar, but again, it'll be altered. But the metabolome differs from organ to organ quite profoundly. And we have in our own bodies very specialized organs for specific metabolism. So the stomach, the intestine, the brain and heart all have very, very different metabolomes and functions. Liver is another example of a dedicated metabolic organ. Pancreas is another dedicated metabolic organ. And we can go through, and if you know something about physiology, just about every organ and tissue has a fairly unique metabolism, but often identical genomes and somewhat similar transcriptomes. 
So with the metabolome, we have to think of the many compartments, and we have to think about the many purposes of organs, and we can't regard everything as just a single cell. And that's a little unfortunate because most of what's in biology and most of what we're taught in molecular biology and even biology today is to just consider a human as a single cell. And that's really not the way it works. In terms of a definition of metabolomics, again, everyone has their own, but I think the most useful thing to do is just to relate it to something we're all familiar with, which is genomics. And the standard definition is, you know, essentially it's a life science field that uses high-throughput technologies to characterize all the genes in a cell, tissue, or organism. Metabolomics has an identical definition except we replace the word genes with small molecules or metabolites. Another important distinction or definition is what is a metabolite? And that's been evolving over a number of years, but I think people are settling on this. So basically, it's any molecule, it doesn't actually have to be organic, so I should probably cross that out, but it's any molecule detectable in the body where the molecular weight is less than 1,500 daltons. So that means it does include some small peptides, less than about 10 or 12 residues. It includes some oligonucleotides. But it also includes the more characteristic metabolites that we generally think of. So the sugars and organic acids and ketones and amino acids. It also includes most lipids. There are some massive, massive glycolipids which are really more macromolecules than metabolites. But it also includes things that are exogenous or foreign. So that could include things like plant alkaloids and foods, food additives, which are synthetic, a variety of toxins and pollutants, drugs and their metabolites. And these are things that we will find in humans. We'll find them in animals and fish and plants. 
And these too represent metabolites. In the case of humans or other animals, it also includes microbial products because most animals have microbes that help with food and digestion, even plants obviously have them. In terms of a metabolite, generally the limit of detection right now with some very sensitive techniques is maybe about a picomolar. So that essentially is not only what there is, but there's a practical definition of what we can detect. If we could go to femtomolar and sub-femtomolar or zeptomolar, that could be included. But today's technology doesn't allow us to detect those things so therefore in our eyes, they don't exist. So the other thing to remember is that the metabolome includes small molecules, both in cells but also in organs. Remembering that organs are metabolic factories, it also includes tissues, and it can also include entire organisms. Knowing the location of the metabolites and what you're characterizing is very important. As I said, it varies tremendously. Whereas worrying about the location of a gene or genome isn't quite so important because in large part the genes are the same regardless of the cell. They are obviously specific to organisms. As I said, it includes exogenous and endogenous. There are many cases where we know the molecules exist but we have not detected them yet, partly because either they're at low concentrations or they're transient. So in some regards, and you'll see this on some of the databases, we're including theoretical molecules, but chemistry tells us that they're there or analysis in model organisms tells us that they have to be there. As I said, the limit is defined by detection technology, one picomolar, but because of this issue of temporality, detection limit, variability from organs, and then this influence of exogenous compounds, the size of the metabolome is always ill-defined. On the other hand, we can give you very precise numbers of the exact size of the E. coli genome, the Drosophila genome. 
We're still having a trouble, I guess, finalizing what it is for the human genome, but we're pretty close in terms of knowing those numbers, but the metabolome is intrinsically always ill-defined. But it is useful to look at this slide and to sort of get a perspective in terms of the chemical complexity in the major kingdoms of life. So mammals, or if you want animals, have a relatively simple metabolome. It might consist of 60, maybe up to 100,000 chemicals. In large part, that's because mammals and other animals have to, they're not autotrophic, they need to get a lot of their nutrition from other organisms. They don't need to be able to synthesize everything, but they can also run away from threats. On the other hand, microbes don't move very quickly and plants don't move at all. And as a result, these organisms, these kingdoms of life, actually have to use chemistry to defend themselves. And that's one of the reasons why the complexity of both microbial metabolites in particular plant, the metabolome, is much greater than animals or mammals. Now these are still estimates and it's still hard to know exactly, but we certainly do know that the plant metabolome, plant kingdom is incredibly complex and a great source of natural products. Many new drugs and drug leads come from plants. And likewise, microbes communicate not by speaking, but by chemical communication. And so the diversity of chemicals that they use to communicate, to signal, to track is quite large. They also use chemicals to kill each other. And this is the source for most antibiotics. So there is a pyramid if you want in terms of metabolome complexity. These numbers are changing all the time. So I'm sure next year they'll be different again. Yes? You said about the metabolome in the model. And then you mentioned the microbiome, probably, basically including more. That's right. So when you're talking about the size of human metabolome, which one is bigger? 
The humans in the office come by the bigger or is the microbiome a complex? The gut is... David, can you just repeat the question? Sure. So the question is, humans have microbes in their gut. We have a lot and so therefore should the human metabolome include the microbial metabolome? And it does. The number does include that. Humans have maybe about 500 to 1,000 different microbial species, but there's literally millions of microbial species. And the diversity in microbial metabolism is quite broad, particularly for things living in exotic niches or plants or thermal vents. And the other point is that microbes on their own, a single microbe might have about 2,500 metabolites. It's just the fact that there's millions of species. It gives you that diversity. So with respect to the microbes, what they produce in humans, it's probably on the order of a few thousand different metabolites, many of which overlap with humans. So we can't really easily distinguish them. There's maybe about 500 compounds that are uniquely from the microbes that human cells can't produce. So when we look at humans, and obviously not everyone is studying humans, but I think we're all interested in ourselves. So it's a useful reference in terms of thinking about this. So it's the same sort of thing you could imagine for other animals. It'd be a little different for plants. But the range in metabolites goes from picomolar to molar, almost. Urea in humans is the most concentrated metabolite found in urine, and you can get up to several hundred millimolar. So a huge range, a factor of 10 to the 15 or more in terms of concentrations. About 20,000 endogenous metabolites have been identified, and in some cases quantified. They're kept in a database called the Human Metabolome Database. Humans take drugs. Generally you hope they don't take all 1,500 drugs, but drugs can be found in human samples. And so if you're looking at a population, it's actually important to know all of those drugs. 
Now notice that drugs are not as numerous as endogenous metabolites; there are about 1,500. They're generally at a lower level; at least in blood and urine, you'll find them at micromolar ranges. I notice the scale here has kind of got corrupted, but it should be molar, millimolar, micromolar, nanomolar, picomolar, and then femtomolar. So you might want to write over that scale if it's messed up. Foods, we eat lots of different foods. A lot of plant foods, about 80 to 90% of our calories come from plant foods. And there's lots of different metabolites and phytochemicals and additives in those. Again, they're at the concentrations typically found for drugs. Drugs are metabolized, they're broken down into drug metabolites. Again, they're at lower concentrations by about a factor of 10. And then you hope at the very, very lowest level are the toxins and in some cases environmental chemicals. And those are usually in the low nanomolar or picomolar range. Anyways, these represent the metabolomes that you can find in humans. So only a fraction really is endogenously produced. A lot of it comes from external sources. And this doesn't cover everything. This tallies up to around 40,000 to 45,000 compounds. But these represent the ones that we know or have some good estimates for concentrations or we have some authentic data or confirmation. I've also listed the locations for these databases or the materials. So the T3DB is a toxic exposome database. DrugBank contains material on drugs. And FooDB is a database on food metabolomes. We'll talk about these things a little more in a lecture later today. But there are also other types of metabolomes. These are sort of theoretical ones where we haven't got the authentic compound, but in some respects we can use what we know about biochemistry to predict things. 
So in the case of knowing all the fatty acids and all the head groups for lipids, you can actually come up with up to 100,000 different lipids that should or could exist in humans. Given the number of drugs, 1,500, and the number of typical drug metabolites, anywhere from five to 10 per drug, you can estimate there's at least 10,000 different drug metabolites. So again, it's theoretical. We don't have authentic data for all of them. If there are 20,000 different food chemicals, again we could assume that each food chemical is metabolized into many different forms through phase one and phase two metabolism and microbial metabolism. So multiplying the 20,000 times five you end up with about 100,000. And then there are metabolites of metabolites. So your liver and liver enzymes can't distinguish between endogenous metabolites and exogenous ones. So they will put other compounds through phase one, phase two metabolism thinking that they're foreign. So they'll also be processed. Again, the number of these metabolites of metabolites could be 10,000, could be 20,000, could be 100,000. We really don't know. So there's considerable interest actually in trying to predict these because in many respects these theoretical metabolites represent the unknowns that many of us are trying to identify. So can I give an example of metabolites of metabolites? So probably the best example I'm aware of is some of the acyl glycines and acyl carnitines. So these are compounds that are commonly found in humans. The acyl chains are sort of fatty acids and several groups, including Liang Li's group at the University of Alberta, identified up to 400 different types of acyl glycines and 400 different types of acyl carnitines. Most of those seem to have hydroxylated fatty acids that are attached to them and most of those hydroxylated fatty acids would have had to have been processed by a cytochrome P450 enzyme. 
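As an aside on the back-of-the-envelope estimates just given for the theoretical metabolomes, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the input numbers are the rough figures quoted in the lecture, and the helper name `estimate_products` is made up for this sketch.

```python
# Rough size estimates for "theoretical" metabolomes.
# Inputs are the approximate figures quoted in the lecture, not measured values.

def estimate_products(n_parents: int, products_per_parent: int) -> int:
    """Number of derived compounds if every parent yields the same number of products."""
    return n_parents * products_per_parent

# About 1,500 drugs, each giving roughly 5 to 10 metabolites.
drug_low = estimate_products(1_500, 5)
drug_high = estimate_products(1_500, 10)

# About 20,000 food chemicals, each assumed to give about 5 metabolites.
food = estimate_products(20_000, 5)

print(f"drug metabolites: {drug_low:,} to {drug_high:,}")
print(f"food metabolites: {food:,}")
```

The drug range brackets the figure of about 10,000 mentioned above, and the food estimate reproduces the 100,000. Both are order-of-magnitude guesses, since real compounds do not all yield the same number of metabolites.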
The number of fatty acids that are naturally produced in the body is something on the order of 30 or 40. So we should have only expected 30 or 40 acyl glycines or 30 or 40 acyl carnitines. So to get 400 means that they had to have been processed through some kind of phase one or phase two metabolism. Bacteria could also produce these things and that's the other form of metabolism of metabolites. Okay, so that's I guess a big-picture view, a 30,000-foot view of metabolomes, just to get you acquainted with the different sizes, perspectives and some of the challenges that are facing us with respect to metabolomes. So why is metabolomics important? Why are you guys interested in it? Why are we studying it? Why is it actually picking up in popularity? There's a few reasons. When you look at the common medical tests today, more than 95% of them actually are for small molecules. If you're under the age of 25, you have had a metabolomics test. It's called newborn screening and it's universally practiced in every province and state in North America and most countries in Europe. So small molecule testing is pretty much standard. All of us have had at least one of them. Almost 90% of all drugs are small molecules even though most of the money these days is being made in biologics. The vast majority of newly discovered drugs, newly introduced drugs are still small molecules and that will continue to be the case for many years. Most of our inspiration, it's actually higher than 50%, it's getting up to 60% of all drugs are actually derived from existing metabolites or natural products. So we look to nature for most of our inspiration for drugs. Even in the world of genetic diseases, 30% of the genetic diseases involve errors or problems in small molecule metabolism. And then when we think about it, small molecules are essential for most of the operations of the body and cells; without them, most protein signaling and signaling events would not occur. 
So small molecules are important. They're also sometimes called the canaries of the genome and this is one reason why they're so frequently used, not only in clinical testing but also for phenotyping. So in some cases, a single base change in your DNA can lead to a 10,000-fold change in metabolite levels. So finding the single base change in a genome is like finding a needle in a haystack. Measuring a 10,000-fold change in metabolite levels, that's pretty easy to do. And so that's one of the things that's made metabolomics particularly appealing. There's a temporality to metabolomics which is both a blessing and a curse. So this is a picture of someone eating very heartily. If we could track them and were measuring their genome, either while they're eating or shortly after or even months afterwards, it should not change, and you would hope this would be the case. It couldn't. I mean, if our genomes were sensitive to what we ate, we'd be a real mess. If we do eat food, we do get some changes in the proteome. Insulin levels will change. Ghrelin levels will change. There's about a half dozen proteins and peptides that'll rise and fall over the course of a meal or for an hour or two afterwards. And then things stabilize. But with the metabolome, things are all over the place and they rapidly change within seconds and continue to change for minutes, hours, and sometimes even days. So that sensitivity to what we eat or drink and that temporal sensitivity is something that's quite powerful and it's how and why a lot of nutritionists are very interested in metabolomics. It's also a weakness because if your metabolome is changing depending on what you just ate, the time of day or even your mood, then is it really reflecting some of the physiology that we want to detect? Is it gonna tell us something about disease or propensity for disease or effects? 
So these temporal issues have to be dealt with and therefore the experimental controls in metabolomics generally have to be much more careful than in other omics fields. Metabolism is well understood. Thanks to the efforts of hundreds of biochemists through the 40s, 50s, and 60s, we have these metabolic wall charts. These represent the most extensive pathway diagrams ever created. And with all the efforts by hundreds of thousands of other scientists to create pathways for protein signaling and gene signaling, they're still not as extensive or as well understood as metabolism. Metabolism is so well understood we can write hundreds or even thousands of differential equations and simulate the precise metabolism of single cells remarkably well. So above all other fields, metabolism is our best understood field in biochemistry. The other thing to remember is that the metabolome is connected to all the other omes. The genome codes for the proteome, the proteome codes for the metabolome, but the metabolome, including many small molecules, also affects the expression of genes. It can also affect the stability and methylation of genes. So again, where does the methyl group come from? It's small molecules. So there's this communication between genome, proteome and metabolome, up and down. So that connection is very important and obviously without small molecules you couldn't synthesize the genome. Without small molecules you couldn't synthesize the proteome. So that connection comes from synthesis: AMP, GMP, CMP, and TMP are the constituents of the genome and the transcriptome. 20 amino acids make up the proteome. Lipids give cells their shape, their integrity, their structure, and constituents for the cell walls in microbes and plants. The energy factories, ATP, sugars and lipids. And then of course the cofactors that are critical. So really, the metabolome is connected very much so. 
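As a small illustration of the earlier remark that metabolism can be written as differential equations and simulated, here is a minimal sketch for a single enzyme-catalyzed reaction. It assumes Michaelis-Menten kinetics and a simple forward Euler integrator, neither of which was named in the lecture; the function name and parameter values are hypothetical and chosen only to make the example run.

```python
# Minimal sketch: one reaction S -> P with Michaelis-Menten kinetics,
#   dS/dt = -vmax * S / (km + S),   dP/dt = +vmax * S / (km + S),
# integrated with forward Euler. Parameter values are illustrative only.

def simulate_michaelis_menten(s0, vmax, km, dt=0.01, steps=1000):
    """Return a list of (time, substrate, product) tuples."""
    s, p = s0, 0.0
    trajectory = [(0.0, s, p)]
    for i in range(1, steps + 1):
        rate = vmax * s / (km + s)   # reaction rate at the current substrate level
        s -= rate * dt               # substrate consumed during this time step
        p += rate * dt               # the same amount appears as product
        trajectory.append((i * dt, s, p))
    return trajectory

traj = simulate_michaelis_menten(s0=2.0, vmax=1.0, km=0.5)
t_end, s_end, p_end = traj[-1]
print(f"t={t_end:.1f}  substrate={s_end:.4f}  product={p_end:.4f}")
```

A whole-cell model couples hundreds or thousands of equations like this one, typically with a proper ODE solver rather than a hand-written Euler loop.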
And I think there's a growing view, although a little reluctant by many, to actually say that the genome and proteome largely evolved to catalyze chemistry. It wasn't the other way around. So the chemistry was happening. It just needed to happen faster. And so that's how the first enzymes, the RNA world, and proteins started appearing. Yes? Does that mean that? We'd like to think so. No, but I think it is remarkable how much, and if we look at the really simple organisms, that almost everything that they code is just for metabolic control. And then as we get up to more complicated organisms, there's elements obviously to control the cell, cell communication and adhesion. But again, that's also mediated by sugars and larger, small molecules. But yeah, it's an interesting perspective. So this idea of connectivity between the metabolome, the proteome and the genome is why metabolomics actually is facilitating a lot of work in systems biology. So most of the big successes in systems biology actually have been based in metabolomics, even though the systems biologists don't want to admit it. But it's because we know metabolism so well, it allows us to write those equations. It allows the systems biologists to do the modeling and say, look, look, we've been able to mutate this gene or modify this protein and we predict exactly what would happen. But usually it's about metabolism. But that integration of bioinformatics and cheminformatics allows that linkage between the proteome and the genome and the metabolome. We've heard this morning when everyone was describing some of the work they're looking at and applications, there's lots of things that people can use and are using in metabolomics. Considerable interest in toxicology, this new field called the exposome, which is getting a lot of interest and traction in Europe and the US. It's now used in clinical trial testing, wine, beer, scotch. 
All of those also use sorts of metabolomics, not only for quality control but for monitoring. Drug phenotyping, water quality testing, petrochemical analysis, these are all using the tools and techniques developed in metabolomics. There are these applications in clinical areas and then emerging applications in imaging, both with NMR and mass spectrometry. So that's, again, just a 30,000-foot view of metabolomics, why it's important, why it's worthwhile studying, why you're here. And I'm going to switch gears again and talk about metabolomics methods. And again, some people are very, very familiar with some of these technologies and so you can shut your brain off and look at your Facebook page if you want. Others of you are probably brand new to these things and wanna learn a little bit more. So hopefully we'll try and keep it at a pace that's understandable for those of you. So a standard metabolomics workflow begins with a biological sample. So it can be tissues, it could be organs, it could be cells, a whole variety of things and we heard some of those examples just today. In most cases, it's hard to analyze tissues. You can use solid-state methods to look at them: solid-state NMR is one approach, and tissue imaging by mass spectrometry or MRS is possible. But generally what people prefer to do is to take the solid tissues and extract them, and extraction allows you to get fluids, and that's what most analytical chemistry systems are designed to look at, either fluids or vapors. In some cases, you don't have to extract them, you can get them for free, so you can get them from plant sap, you can get them from urine or blood. Those are nicely sterile fluids that contain an awful lot of information. So once you've got your fluids then you can do that chemical analysis. So it's chromatography, mass spectrometry, NMR spectroscopy, infrared. And the big development really in the last 10 years for metabolomics isn't really the chemical analysis. 
The technologies have been around for decades. It's that last step, it's taking a mixture and interpreting the components of that mixture because up until recently, the whole point of analytical chemistry was to analyze pure compounds and identify pure compounds. But in metabolomics, we're looking at mixtures and in many cases we don't really make a whole lot of effort to separate things to pure compounds. And so it's the data analysis component which has revolutionized metabolomics. It's the capacity to look at mixtures, to quantify them, to deconvolute them. And a lot of that's based on the development of spectral databases, new kinds of software and analytical tools to parse out the spectra and their meaning. So one of the challenges with metabolomics is that even with the technologies of NMR, mass spec, GCMS and chromatography, we don't often identify much more than about 200 compounds. Now if you're doing untargeted metabolomics, everyone will claim that they've got 5,000 features or 10,000 features or 20,000 features. But in the end, they're usually lucky to identify 200 or 300 of those features. And that's probably on the high side. Many people are quite content to just talk about 10 or 20 metabolites. So that's an issue. In the world of proteomics, with newer developments, it's fairly routine now to even get up to 10,000 proteins in human cells, certainly on the order of 1,500 to 2,000 proteins in microbial cells, largely covering a good portion of the genome. And then in the case of genomics or transcriptomics, if we're talking about humans, it's pretty routine to be able to characterize all 22,000 known genes. So in terms of completeness, there is a range. Genomics is very complete. Proteomics is about half complete. And metabolomics, we're typically getting just 1% of what the genome codes for and 1% of what's probably out there. So there's a real trend, I guess, in terms of what's possible. 
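The completeness comparison just described can be put into rough numbers. The sketch below is only illustrative: the numerators are the figures quoted in the lecture, while the denominators (22,000 genes as the reference for the proteome, 20,000 endogenous metabolites as the reference for the metabolome) are assumptions chosen for this example, and the function name is hypothetical.

```python
# Rough "completeness" of each omics layer for human samples.
# Numerators are figures quoted in the lecture; denominators are assumptions.

def coverage_percent(identified: int, total: int) -> float:
    """Percentage of the reference set that is routinely characterized."""
    return 100.0 * identified / total

layers = {
    "genomics":     (22_000, 22_000),  # all known genes
    "proteomics":   (10_000, 22_000),  # proteins seen vs. known genes (assumed reference)
    "metabolomics": (200, 20_000),     # compounds identified vs. endogenous metabolites (assumed reference)
}

for name, (identified, total) in layers.items():
    print(f"{name:12s} {coverage_percent(identified, total):6.1f}%")
```

With these assumed references the proteome comes out near half and the metabolome near 1%, in line with the rough figures given in the lecture.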
Obviously, we'd like to be able to increase that by a factor of 10 or 100-fold in metabolomics. But the reason has a lot to do with the complexity. So we can characterize the genome completely and routinely and at low cost because the genome is only made up of four chemicals. Chemistry for that is well-known and pretty simple. We can do so well in the characterization of a proteome because all proteins are made up of just 20 amino acids. And again, the chemistry for sequencing and characterizing proteins was actually worked out in the 50s and mass spectrometry just exploited that to some extent. In the case of metabolomics, we're looking at several hundred thousand different chemicals. So it's not 20, it's not four. Each of them requires a different type of separation. Each of them requires a different kind of analysis. Some are volatile, some are insoluble, some are hydrophobic, some are hydrophilic. Some go down columns nicely, some don't go down columns at all. Some are big, some are small. The range in size, the range in concentration is profound. So the chemical diversity and complexity is what makes metabolomics so difficult. It's also what requires metabolomics to use a vast array of technologies. A wider array of technologies than are typically used even in proteomics. So yes, you use chromatography, but in many cases you also use capillary electrophoresis, microfluidics. In many cases you have to use all the different types of mass spectrometry to get the kind of resolution you want, the separations and robustness you want, the fragmentation patterns you need. NMR spectroscopy is routinely used in metabolomics; it's not routinely used in proteomics and never used in genomics. GCMS, another critical tool, never used in proteomics. Crystallography, to characterize the structures, infrared spectroscopy, all are legitimately used in metabolomics. So metabolomics labs are usually pretty complicated analytical chemistry labs. 
And yes, many people doing metabolomics were formerly in proteomics or are still actively doing proteomics, but you'll also find people who were formerly in structural biology or people who were traditionally analytical chemists. So the diversity of equipment is one reason why I'm gonna spend the next 30, 40 minutes on the techniques used in metabolomics. So chromatography is one of the most important tools in metabolomics. And whether it's to sort of separate pure compounds or to separate clusters of compounds, it's still helpful. And it's particularly useful for mass spectrometry. So again, most people are very familiar with it, but for those who are new to it, it's essentially a separation process. You have both a mobile phase and a stationary phase. So the mobile phase contains material that's dissolved or in some cases, if it's a gas, it's a vapor. And it moves through a stationary phase which is not moving and which typically has some chemical moieties on it. And so the mobile phase interacts with the stationary phase and the interactions may be transient and brief, but they cause a partitioning and they cause a separation or a delay in the movement of the mobile phase down the stationary phase. So in chromatography, you can separate things by columns. You can also separate through plates or thin layers. Mobile phases can be liquids or gas. The stationary phases can perform their separation through affinity, so through recognition, perhaps by antibodies or chemical recognition, through ions and ion exchange, attraction of positive and negative ions. Through sizing, small molecules moving faster or slower than big molecules, again depending on the type of separation. You can separate on hydrophobicity, which is called reverse phase, and you can speed up or slow down the separation. So you can just let separation happen through gravity or through electric fields, or you can speed it up by adding higher pressures. 
So the pressure technique is perhaps the most useful method for accelerating chromatography. It's also a way, in many cases, of actually improving the resolution of chromatography. So high pressure or high performance liquid chromatography, HPLC, is the standard that's used in many metabolomics applications. So the field is old, it's getting onto 40 odd years. High pressure, it's 6,000 pounds per square inch, and small, small particles, five microns, which are typically decorated with some kind of hydrophobic chains. It allows you to separate things, allows you to detect things at very low levels. It's widely used in the field of environmental chemistry but also in many other applications. And you can separate a whole range of small molecules and even large molecules. So there are three major forms of HPLC. Reverse phase, which is for non-polar molecules. So those are lipids, typically, if you think of them, or things that contain aromatic groups. So in reverse phase, the particles are very hydrophobic and the mobile phase is usually a very polar solvent like acetonitrile, water, methanol. Normal phase HPLC is usually used to separate non-polar molecules. In this case, instead of a non-polar stationary phase, you have a mildly polar stationary phase and you use a non-polar mobile phase. Normal phase chromatography isn't as good as reverse phase. So it's not widely used. On the other hand, HILIC, H-I-L-I-C, or hydrophilic interaction liquid chromatography, is becoming quite popular and it's very good for separating polar molecules. So it uses a polar stationary phase but also sort of a mixed non-polar mobile phase. So between HILIC and reverse phase, a lot of useful separations in metabolomics are done. Columns in HPLC are different than the columns that you might have used if you were in chemistry or biochemistry, which are gravity feed columns. They're very thin or narrow columns, five or 10 millimeters, maybe max 20 millimeters across. 
They can be made of glass, although that's rare. Most are stainless steel or, if you're dealing with things that are harsh on metals, you can use a plastic polymer called PEEK. So the narrower the column, the narrower the bore, the more typically it's used for analytical separations. And then if you're trying to separate large quantities, you use a preparative column, which is much wider. Columns aren't terribly long. Some can be very short, just as short as maybe 20 millimeters. I've never seen a half meter HPLC column, but most of them might be on the order of 30 or 40 centimeters. In terms of the composition of those particles, most are made up of silica, so, if you want, tiny sand particles a few microns across, and typically they're decorated or derivatized with hydrophobic molecules in the case of reverse phase. So you've probably heard the terms C6, C18, C4, C12 columns. This represents the number of carbons in the aliphatic group that's attached to the surface of the silica particles. So they can be aliphatic chains, which is great for reverse phase, but you can also have some other moieties, which are shown in very small print, unreadable for me actually. But you can include aromatic groups, polar groups, all of which can be attached to or derivatized on the silica. And that allows you to make either a polar phase or a mixed polar phase, or a non-polar phase. So it's the chemistry on the silica particles that's really responsible for the separation. And as with most things, you've probably heard that like dissolves like, so it's again the molecules that are similar to what's attached to the porous silica that stick or preferentially dissolve. Polar molecules, for example, don't like hydrophobic surfaces and so they would slip through a reverse phase column very quickly. So there are some issues with respect to separation efficiency. The shorter the column, the faster the runtime, but the poorer the separation. 
So if you want to separate things really nicely, you need a longer column. So that's just illustrated here. It could be the same two molecules: run them through a 50 millimeter column and they barely separate; run them through a 100 millimeter column and they're nicely separated. But it was also discovered that you could work with much smaller particles. And as you shrink the particles down, not only do you get finer or greater separation, but you also get narrower peaks. And so the use of very small particles is what led to the development of UPLC or UHPLC. So that's ultra high pressure liquid chromatography. And so the pressures there are much, much higher because the particles are much smaller, and to move molecules through these small particle columns requires lots of pressure. So the advantage of working with UPLC is you get the separation that you'd normally get in a long column in a short column. And you can get the advantage of the short column in terms of speed. So many people who are using UPLC are able to get separations in two or three minutes which used to take 10 to 30 minutes on an HPLC. So in terms of what an HPLC system looks like, typically you've got some kind of solvent well, and you have a pump that pumps the solvent into the column. And then you'll have a sample injector, which puts a bolus of your sample in as the solvent is flowing through the column. So material is separated on the column and you might attach a detector. It could be an ultraviolet detector, fluorescence detector, evaporative light scattering detector. Even a mass spec can be used to detect things coming off the column. In some cases you collect and reuse, in other cases you discard what comes out. And you track it. So this is an example of a single solvent HPLC system, but you can also have binary or ternary solvent systems, where you start to play some fun chemistry, mixing different types of solvents to create a gradient. 
And good HPLC chemists will happily use two, three or four different solvents to get some pretty remarkable separations. And they'll work out different schemas where different solvents are added at different levels and different rates. And this just simply takes advantage of the fact that solvents interacting with molecules and the stationary phase will lead to different types of separation. But outside of that mixing of solvents, the same process downstream is followed in terms of injection and detection. So what can you normally get with respect to an HPLC or UPLC? This is an example of a chromatogram that you can collect. In this case it would be UV absorbance. And you can see probably on the order of 50 or 60 different peaks. Some are big, some are small, and this sort of emphasizes the huge range in concentrations that you'll see. Now under any of these peaks, there could be dozens of small molecules. So even though you're seeing 60 or 70 peaks, there could be 600 to 700 different chemicals in this separation. Obviously you hope you're lucky and in fact each peak is a single molecule, but that's not often the case. So that's an example of liquid chromatography. It's a mainstay for a lot of mass spectrometry-based metabolomics techniques. Another type of chromatography is gas chromatography. And this in many respects is actually more powerful than liquid chromatography. It's been around longer, but in essence what gas chromatography is about is separating where the mobile phase is a gas, not a liquid. And so you can use a carrier gas; it might be nitrogen or helium or argon or hydrogen. You separate on a column just like you do with HPLC, except the column is much longer, much narrower, and it's heated. And then you detect things. So you could detect with flame ionization, but you could also detect with mass spec and many other different approaches. So the trick in gas chromatography is to vaporize your sample. So it has to be a gas. 
Now there's lots of tricks which will actually allow you to take things that are normally liquid and turn them into vapor. And so that's why gas chromatography can actually be used for just about anything. As I say, it's injected into a column, not unlike HPLC. It's transported through a column. The mobile phase, as I say, is an inert gas, like nitrogen or helium or argon. The column itself is decorated, so you don't deal with particles. The interior of the column is like a hollow pipe. Then you put some kind of polymer that's adsorbed to that surface. And instead of being 50 or 100 or 200 millimeters in length, these columns are 10 and 20 meters in length. And instead of being five to 10 to 20 millimeters across, gas chromatography columns are two or three millimeters across. To get things that are normally liquid to become gas, you have to derivatize them. And one of the best derivatives is trimethylsilyl, or TMS. And there's other types of derivatization agents that will convert a molecule that is normally liquid or solid into a very gas-like phase. And so this is how you could sort of take a sugar and derivatize it. Yes. No. Yeah, you can have controls. These are very efficient reactions. So they go to very near completion, but you'll get a combination, so you might get one, two or three silyl groups attached and so you'll see two or three different peaks for a particular thing. If it hasn't been derivatized, it won't be volatilized, so you'll never see it. You have to, I mean, the protocols are quite rigid in the sense that you must do it at this temperature for this many hours. I know that when we were first doing it, we discovered that our protocol was probably only getting about 60 or 70% derivatization and just simply adding five minutes to the time allowed it to go to full completion. 
So the methods, once you've got them working, are as efficient as some of these protein labeling schemes where the reactions are very, very efficient and very, very complete. So as I say, this sort of illustrates the type of reaction, but there's many different ways of volatilizing or derivatizing molecules for GCMS and we won't go into them today, but just be aware that they're there. Once they're derivatized, they're put into a column and this is sort of this illustration here, but what you will notice if you compare the HPLC with the GC spectrum is that the separation efficiency in GCMS is many times better or GC than LC. So what's called the plate count is a measure of the separation and peak width and maybe rather than getting 50 or 60 peaks that you might get from LC, getting several hundred peaks from GC is pretty routine. A separation with GC, as I said, involves a very long column, 10, 20 meters in some cases and this is what the columns look like. As I say, they're pipes, they're not filled with particles and the derivatization is within the column and so what they do is they will put on this polysiloxane or other kind of polymer coating, which has, in this case, benzyl groups attached, which act like a sort of a hydrophobic component that allows molecules to interact. So if there's benzyl groups and methyl groups, things with benzyl or methyl groups will interact preferentially with that surface. So things pass through a column, it takes minutes, sometimes even up to an hour for stuff to completely pass through a column and we measure what's called a retention time. It's the amount of time it takes for an analyte to pass through. Liquid chromatography also has a retention time. 
So that retention time is affected by many things. Just as in liquid chromatography it's affected by pressures and particle size, in gas chromatography it's affected by the column dimensions, the type of stationary phase, the flow rate of the carrier gas, the pressure, the temperature of the oven, the type of gas that's the carrier gas. All of those things can influence it. And so by playing around with those components, you can get enhanced separations. So the retention time is variable, but if you normalize it to a set of standards, you can get something called a retention index. So that's the retention time normalized to the retention times of a series of n-alkanes. And the calculation of retention indices in gas chromatography is very standardized. And so that means if you use the right columns and the right n-alkanes and run it according to protocol, you can actually get a pretty good idea what a compound is based on its retention time or retention index. So there are tables of thousands of retention indices for compounds. The other thing that's nice about chromatography, if the detector's sensitive, whether it's liquid or gas, is that you can not only identify compounds by where they come off, but you can identify how much is coming off by looking at the area under the curve. So you can see that top one has a very small area under the curve. It's coming off at exactly 2.85 minutes. And you can probably estimate the difference in quantity by about a factor of 10. So one picogram versus 10 picograms. So this could be either liquid chromatography or gas chromatography, but because gas chromatography is so much better in terms of retention index and normalization, and because it's so much better in terms of resolution, it's generally easier to quantify compounds through GCMS. Yes. So what you typically do is you have a set of standards you've run through before, and they're often somewhat similar in chemical nature or they're the exact compound. 
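To make the retention index idea a bit more concrete, here is a minimal sketch in Python. It assumes the linear (temperature-programmed) form of the index, where an analyte is placed between the two n-alkanes that bracket it and each alkane is assigned 100 times its carbon number. The lecture does not say which formula was used, so treat that choice as an assumption; the function name and the alkane retention times are made up for illustration.

```python
from bisect import bisect_right


def linear_retention_index(t_analyte, alkane_times):
    """Linear retention index of an analyte.

    alkane_times maps n-alkane carbon number -> retention time (minutes),
    measured on the same column with the same method (hypothetical values
    here). Retention times must increase with carbon number.
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    if not times[0] <= t_analyte <= times[-1]:
        raise ValueError("analyte elutes outside the alkane ladder")

    # Index of the alkane eluting just before (or with) the analyte.
    i = min(bisect_right(times, t_analyte) - 1, len(carbons) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]

    # Interpolate between the two bracketing alkanes.
    fraction = (t_analyte - t_n) / (t_next - t_n)
    return 100.0 * (n + (n_next - n) * fraction)


# Hypothetical alkane ladder: C10, C11 and C12 retention times in minutes.
ladder = {10: 5.0, 11: 6.0, 12: 7.5}
print(linear_retention_index(6.75, ladder))  # halfway between C11 and C12
```

Because the index is expressed relative to the alkanes rather than in raw minutes, it changes much less than the retention time itself when the flow rate or oven program is adjusted, which is what makes tabulated indices usable across labs.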
So you have a reference that you've run, and normally in GC studies you'll run your reference early in the day, or you'll have a set of standards at the beginning, middle, and end of the day that help you calibrate those things. Some people also put in isotopic standards as well, so that'll help you get calibrations. But yeah, there's a pre-run standard so you can always quantify. So this is just a blown up picture again, just showing you the incredible resolution you get with gas chromatography, and there's a large number of peaks. The ones that are numbered are the ones that are identified, but there's obviously many other peaks that were not identified there. And in some cases, those peaks actually represent some variations in the silylation or incomplete reactions or fragmentation patterns. So how do you identify things once you've separated them on a column? Well, one of the best ways is to weigh them, and that's where we use mass spectrometry. So mass spectrometry is pretty ubiquitous, widely used. How many people have actually used a mass spectrometer here? Two, three, four, five, six, seven, eight, nine, so about half of you, okay? So again, you can kind of shut off if you don't want to hear this, but for the other half of you who are new to mass spectrometry, this'll sort of go through different techniques. So there's all kinds of mass spectrometers. This is an older one, but this is a time of flight mass spectrometer. So it's a long, horizontal tube. It's part of it. It has all kinds of gadgets attached to it. And it's for measuring fairly accurate masses. And just like chromatography allows you to identify compounds by their retention time or retention index, mass spectrometry allows you to identify compounds by their weight. And so technically, we could identify everyone in this room by their weight, because I doubt if anyone has exactly the same weight or mass. 
So if we have a list of everyone's name and their weight, and then we have a scale over in the corner here, and we blindfolded Michelle, for instance, she would only need the weight list to be able to tell who's who. So we can do this with small molecules. Many compounds have very precise weights, and typically we can know their exact molecular weight down to seven or eight or nine decimal places. So that capacity to measure precisely the mass of molecules really is a very powerful way of identifying them. So as I said, we can determine their molecular weights down to one PPM with high resolution QTOF or Fourier transform or Orbitrap instruments. And that usually is sufficient to determine the molecular formula and in some cases to actually solidly identify the compound. We use mass spectrometry for proteomics, although typically again that one or two PPM accuracy means that's about maybe one Dalton for a 50 kilodalton protein. So I'll use the term Dalton a lot, but that also means one atomic mass unit or one AMU. So they're all the same. So if you attach a gas chromatography instrument to a mass spectrometer, you have a GCMS instrument. So that's used for separating and identifying volatile compounds. If you attach a liquid chromatography system, HPLC or UPLC, to a mass spectrometry instrument, you have an LCMS system. And it's used generally for compounds that are a little more delicate, that don't derivatize so well, but it can also be used for many other compounds. And then if you attach two mass spectrometers together, you have a tandem mass spectrometer and you can actually separate using two types of mass spectrometry. Tandem mass spec doesn't just measure the mass of the parent or intact molecule; it typically helps you separate or fragment those molecules into smaller components. 
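Since mass accuracy is quoted in parts per million throughout, a short Python sketch of what that number means may help. The function names and the m/z values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def mass_window(theoretical_mz, tolerance_ppm):
    """Lower and upper m/z bounds that fall within a ppm tolerance."""
    delta = theoretical_mz * tolerance_ppm * 1e-6
    return theoretical_mz - delta, theoretical_mz + delta


# Hypothetical ion at m/z 200.0000 measured as 200.0002.
print(round(ppm_error(200.0002, 200.0), 3))   # error in ppm
print(mass_window(200.0, 5.0))                # +/- 5 ppm search window
```

The same ppm tolerance corresponds to a wider absolute window at higher mass, which is one reason a formula that is unambiguous for a small metabolite may not be unambiguous for a larger one.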
So if perchance there were two people in this room who had exactly the same mass or exactly the same weight and we wanted to distinguish between the two of you, the way that an analytical chemist would try and deal with the problem is, rather than asking your names, they would carve you up and weigh your arms and legs separately to figure out who was who. And this is what we do in mass spectrometry: fragment the molecule and look at those components, because molecules are fundamentally different in their structure, even though they may have exactly the same mass. And so the fragments will differ and that will give you a unique signature of what it is. In the old days, we could only measure masses to sort of one decimal place. And so in many respects, masses were reported as the average mass. Today, almost every modern instrument can measure down to 10 or 20 ppm. And so the result is that for a given single molecule, in this case a molecule that weighs 1,155 Daltons, we would see all of the different isotopic variants, the isotopomers if you want. So the ones with the carbon-12s, the carbon-13s, the nitrogen-15s, the deuterium derivatives, all of which are in low abundance. So the result is that we now have to talk not about the average mass, but about the monoisotopic mass. And that's the mass calculated from the most abundant isotopes. So in the case of organic molecules, that means the carbon-12, the nitrogen-14, oxygen-16, and hydrogen. If you were to average all of those masses based on their abundance, you would come up with an average mass, which in this case is marked as 1156.3. It's not the simple average of all of those other ones; it's a weighted average by their abundance. So if you're unfortunate enough to have a molecule that has chlorine, which has two isotopes that are almost equally abundant, you can end up with a kind of confusing mass spectrum. But here's an example, so it's chlorobenzene. 
And you can calculate its mass using just the hydrogen, carbon and chlorine-35. But because of the natural isotopic abundance, so there's 0.02% of deuterium, 1.1% of carbon-13, and about 32% of chlorine-37, you can end up with all these variants or fractions. And in this case, there's six different fractions with six different intensities made up of these different isotopic combinations. So the mass spectrum for this compound isn't this nice diminishing profile of four or five peaks. It's this one where you have a high peak, a low peak, another high peak, another low peak. But that intensity pattern actually is quite unique and would uniquely identify this compound, because it would have to have chlorine in order to have that kind of peak character. So again, people can look at not only the abundance and the position of these isotope peaks, but then also use the patterns to actually determine the molecular formula quite precisely. So in the case of electrospray mass spectrometry, which we'll talk about, it's the most common one used in metabolomics and in proteomics. Typically, it'll take a sample, run it through a column and then it'll go through what's called an ionizer. The ionizer converts the molecules into ions. No matter what, for mass spectrometry to work, you have to have ions. You can't weigh neutral molecules, or you can't measure the mass of neutral molecules. So once molecules are ionized, and they can be positively or negatively ionized, they fly through a mass analyzer, a series of electric fields or magnetic fields. Different masses respond to the electric or magnetic fields in different ways, which allows you to measure them precisely. And then those ions hit a detector. And by tracking the detector, we can measure how many of those ions are hitting, so we can measure intensity, sort of pseudo concentration. And then by the time it took for them to reach the detector, we can determine their masses. 
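The difference between monoisotopic and average mass for chlorobenzene can be worked through in a few lines of Python. This is only a sketch: the isotope masses and natural abundances in the table are standard reference values typed in by hand, and the function names are made up for illustration. Note that chlorine-37 makes up roughly 24% of all chlorine atoms, which is the same thing as about 32% relative to chlorine-35, the figure quoted above.

```python
# (isotope mass in Da, natural fractional abundance) for each element.
ISOTOPES = {
    "C": [(12.0, 0.9893), (13.00335484, 0.0107)],
    "H": [(1.00782503, 0.999885), (2.01410178, 0.000115)],
    "Cl": [(34.96885268, 0.7576), (36.96590259, 0.2424)],
}


def monoisotopic_mass(formula):
    """Mass using only the most abundant isotope of each element."""
    total = 0.0
    for element, count in formula.items():
        mass, _ = max(ISOTOPES[element], key=lambda iso: iso[1])
        total += count * mass
    return total


def average_mass(formula):
    """Abundance-weighted average mass over all isotopes."""
    total = 0.0
    for element, count in formula.items():
        total += count * sum(m * a for m, a in ISOTOPES[element])
    return total


chlorobenzene = {"C": 6, "H": 5, "Cl": 1}  # C6H5Cl
print(round(monoisotopic_mass(chlorobenzene), 4))
print(round(average_mass(chlorobenzene), 3))

# Expected height of the chlorine-37 peak relative to the chlorine-35 peak.
(_, a35), (_, a37) = ISOTOPES["Cl"]
print(round(a37 / a35, 2))
```

The half-Dalton gap between the two masses comes almost entirely from chlorine, and the roughly 3:1 pair of peaks two mass units apart is the signature that flags a single chlorine atom.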
So this is an example of a typical mass spectrum of a small molecule, aspirin. You can see a parent molecular weight of about 180 Daltons. And then we can see some fragments. And this is likely a typical case of something that's gone through electron impact ionization or maybe an MSMS, where there's fragments that have broken up as well. They have different peaks. So the intensities of these peaks do not necessarily correspond to the abundance. So in that respect, mass spectrometry is notoriously bad for measuring or quantifying things unless you do some special tricks. So we saw LC peaks were broad. GC peaks were narrow. MS peaks are incredibly narrow. They're almost like delta functions. Very sharp, very narrow peaks, incredible resolution. So on the x-axis is the mass to charge ratio. So it's roughly a measure of the weight, but it's usually divided by charge because everything is ionized. And then the y-axis is the relative abundance of the ion. And that abundance is just really a measure of that ion's ability to desorb or fly. Not its abundance. So some ions fly better than others. And that's why the intensities are different. So it has nothing to do with abundance overall. It's just how well it desorbs. In mass spectrometry, we talk about resolution and resolving power. There are some types of mass spectrometers which have very good resolution, very good resolving power. Typically, it's a measure of how narrow the peaks are. So the better the resolution, the better and more expensive the instrument, the better the mass accuracy. So it's measured by the peak width and the observed mass. So in some cases, we can talk about resolution through this where we look at the half-height width. So that's a delta M. Or we can talk about 5% delta M, which is almost at the baseline. And you can see in these cases, upper one, two peaks resolved. Lower one, these two peaks would be barely resolvable based on their peak widths. 
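As a rough illustration of how peak width relates to resolving power, the sketch below uses the common definition of resolving power as the mass divided by the peak width at half height. The function names and the example numbers are hypothetical, and the "two peaks are resolved if they are at least one peak width apart" test is a simplification rather than a formal criterion.

```python
def resolving_power(mz, fwhm):
    """Resolving power m / delta-m, with delta-m the width at half height."""
    return mz / fwhm


def peak_width(mz, power):
    """Peak width at half height implied by a given resolving power."""
    return mz / power


def roughly_resolved(mz_a, mz_b, power):
    """Crude check: peaks count as resolved if they sit at least one
    half-height width apart (an assumption made for this sketch)."""
    mean_mz = (mz_a + mz_b) / 2.0
    return abs(mz_a - mz_b) >= peak_width(mean_mz, power)


# Two hypothetical ions 0.05 Da apart near m/z 500.
for power in (1_000, 30_000):
    print(power, peak_width(500.0, power), roughly_resolved(500.00, 500.05, power))
```

At a resolving power of 1,000 the peaks near m/z 500 are about half a Dalton wide, so the two ions merge into one mound; at 30,000 the width drops to under 0.02 Da and they appear as separate peaks.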
So here's an example of two different resolution instruments. So the top one is a low resolution ion trap, or it could be, say, a single quad or triple quad mass spectrometer. And what you're seeing is the same compound, but it looks like one giant mound. So this is a low resolution instrument. So the resolution in this case, measured at full width half max, that's FWHM, is about 700. So if you go to a high resolution instrument, this is a TOF instrument, it has a resolution of 6,000. And in this case, you can see a total of seven clearly marked peaks. So this again is just another illustration. Different resolution instruments. So the blue low resolution instrument has an M over delta M of 1,000. A medium resolution instrument, 3,000, which is the red peak, again you still can't distinguish things. But then if you get to an Orbitrap or a high quality QTOF, you're into the black region and you've got a resolution of 30,000. And once you've got that type of resolution, you can do lots of things, particularly with characterizing molecular formulas and positive identifications. So remember the mass spec concept: there was this ionization, then there's this mass analyzer and the detector. Connected to a mass spec is some kind of liquid chromatography, gas chromatography, direct injection, syringe, whatever, that allows you to put the molecules in, but then those molecules have to be converted to ions. So that ionization can be done through laser desorption, electrospray, ion spray, atmospheric pressure ionization, chemical ionization, electron ionization. All of these things convert what might have been neutral molecules to charged molecules. That's the first step and often the most critical one. Different ionization steps are either considered hard or soft. So electron ionization, which is used for GCMS largely, is ideal for small molecules. It fragments things. It's great for determining structure. Chemical ionization is a little more gentle. 
Again, it's ideal for very small molecules. Spectra aren't as complicated as EI spectra. And then you have a very soft ionization method, electrospray. It works for small molecules, as well as for very large molecules. Typically what you'll just see is the parent ion mass. Things won't fragment too much. And then another soft ionization technique is called matrix assisted laser desorption ionization, or MALDI. Again, it's soft and it lets you look at much larger molecules. But you can also look at smaller molecules, especially with matrix free systems. So as I said, for gas chromatography, we use electron ionization or electron impact ionization. Gases are released into a container and then ionized. And then sent off to an analyzer. But in that ionization step, they apply a standard 70 electron volts, which is enough to fragment things. So there's a bombardment phase that happens with electron impact. You bombard things with electrons, you shatter the molecule, and those shattered components are sent into the mass spectrometer. So for even something like methanol, you can fragment it into a whole bunch of different ions. And these are sort of the fragmentation patterns, just for methanol as an example. So it's pretty rough, even on a very stable molecule. And to predict the fragmentation patterns is not trivial or obvious. There are some old style GCMS people who can basically look at a molecule and tell you what the fragment patterns would be, but most of us can't. So that's why we use databases. But this is an example of what the mass spectrum would look like from methanol and the types of ions that you'd see. But if you use the same standard ionization energies, same standard way of delivering things, same standard way of sending them off into the analyzer, these spectra are very reproducible. So they serve as fingerprints. And so now there are libraries, databases, with tens of thousands of these molecule fingerprints that allow you to identify the compounds. 
So we've seen one picture of the electrospray ionization. The other approach is to use MALDI, which typically puts a compound on a surface, plates it over with an aromatic compound, alpha-cyano-4-hydroxycinnamic acid is one, and then you shine lasers onto the stuff. The matrix absorbs it and blows up and ions are sent off. In electrospray it's a little different. The solution is sent through your HPLC system into a very narrow capillary, and then you also send some gas around. So there's a sheath of gas surrounding your capillary, and it heads out into an empty space at atmospheric pressure, but it's not unlike an aerosol can where you spray things. And as you spray things out of this capillary, this very narrow tip, you also send it into progressively lower atmospheres or higher vacuum. And so the droplets essentially evaporate. And as they're evaporating, they're also propelled forward using a whole series of plates, electronic plates that have different charges that accelerate the ions. So things are sprayed out, equivalent almost to an aerosol can. The droplets start drying off because they're now in a vacuum, and then they go through these accelerator plates, electric fields, to move out. And eventually what happens is that these droplets evaporate down to just a single ion. And those single ions are the ones that you're detecting. So typically in electrospray, you use a polar aqueous volatile buffer. You don't want any salts. You pump it through a very narrow syringe or capillary, apply a very strong voltage to help get this sort of aerosolization, and then you send that aerosol through a vacuum to dry it off, to shrink the droplets down to a single ion. And by playing with different combinations, based on the voltage, polarity of the solvent, viscosity of the solvent, you can get different types of sprays or spray cones. You can work with very small nano sprays; those are often preferred now. As I say, it's very sensitive, you can get by with very tiny amounts. 
But with electrospray, if you have salts or detergents in your samples, it's totally messed up. So you have to work with very, very clean samples. And then depending on whether you want to look in the positive ion mode or a negative ion mode, you have to add a certain co-ion. So you might add formic acid to the solvent for positive ion mode, or ammonia if you want to work in a negative mode. So once you've got things ionized, then you can start analyzing them, sending them through a series of electric plates, quadrupoles, even magnets. Yes? Why would you choose a positive ion mode versus a negative ion mode? It has a lot to do with the character of the molecule. So if it has acids or amino groups, you will have a preference for positive or negative ion modes. And some molecules fly equally well, amino acids for instance can fly equally well, but in other cases you have to work exclusively with a negative or positive ion mode, depending on its inherent charge state or preferred charge state. So Michelle's reminding me that I need to pick up a bit of speed here, but as I say, there are different types of mass analyzers, different types of mass spectrometers, literally dozens of types now. Some are old style, magnetic sector ones, very high resolution; quadrupole instruments, which are very cheap instruments, low resolution ones; time of flight or Fourier transform instruments, which have the highest mass resolution, but are probably the most expensive. So depending on your budget or your friends, you can have access to very high resolution or moderately low resolution instruments. The best ones these days are the TOF instruments. They're actually getting down closer to one PPM now, but if you can get Orbitrap and FTMS instruments, they're down to even tenths of a PPM. And then the low resolution instruments are these ion traps or triple quad instruments. 
Now, when you're running a system, you'll have your gas chromatography and then you'll have, or liquid chromatography, and then you'll have your MS system. You can produce different types of results if you want. In terms of sort of a chromatogram that comes out. And so we call them chromatograms, but we're not measuring UV or fluorescence. We're actually measuring the masses or ion currents. And so you can end up with what's called a total ion current chromatogram, a base peak chromatogram, or an extracted ion chromatogram. And each of these are of use or of limited use, depending on what you want to use. So one that a lot of people like to show is sort of this base peak chromatogram, but it shows sort of the most intense peaks from each spectrum. So you're getting things that are coming off in your column, but then within each column, with each peak, you're also seeing many, many masses. And so you're seeing the mass spectrum summed up over all of the different masses under each of the peaks coming off of your column. And so that means it's peaks layered on tops of peaks, so it gets very intense. So the total ion current chromatogram is almost indecipherable, because there's so many peaks overlapping. Base peak chromatogram, we're just taking the most intense mass peak, and so now it's largely separable. The extracted ion chromatogram is just down to basically one or two peaks taken from the base peak chromatogram. So that means there's a lot of data that comes out of both GCMS and LCMS instruments, where you have masses and fragment mass peaks, as well as for all the compounds corresponding to each of the peaks coming off of your separation column. So 60 separation peaks times 10 single ion peaks plus all their fragments, which may be 10, so 60 times 10 times 10, 6,000 peaks that are coming off. So here's a base peak chromatogram of sort of a biological mixture from tomatoes or plants. I'm gonna jump now to another kind of spectroscopy, which is NMR. 
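The three kinds of chromatogram can be sketched in Python from a list of scans. This is an illustration under assumptions: each scan is represented as a retention time plus a list of (m/z, intensity) pairs, the data are invented, and the extracted ion chromatogram follows the usual convention of summing intensity inside a ppm window around one chosen m/z.

```python
def total_ion_chromatogram(scans):
    """Sum of all ion intensities in each scan."""
    return [(rt, sum(i for _, i in peaks)) for rt, peaks in scans]


def base_peak_chromatogram(scans):
    """Intensity of the single most intense ion in each scan."""
    return [(rt, max((i for _, i in peaks), default=0.0)) for rt, peaks in scans]


def extracted_ion_chromatogram(scans, target_mz, tolerance_ppm):
    """Intensity within a ppm window around one chosen m/z, per scan."""
    delta = target_mz * tolerance_ppm * 1e-6
    return [
        (rt, sum(i for mz, i in peaks if abs(mz - target_mz) <= delta))
        for rt, peaks in scans
    ]


# Two made-up scans: (retention time in minutes, [(m/z, intensity), ...]).
scans = [
    (1.0, [(100.0, 10.0), (200.0, 50.0)]),
    (1.1, [(100.0, 30.0), (200.0, 20.0), (300.0, 5.0)]),
]
print(total_ion_chromatogram(scans))
print(base_peak_chromatogram(scans))
print(extracted_ion_chromatogram(scans, 200.0, 10.0))
```

All three are projections of the same two-dimensional data set, so the choice is mostly about how much of the overlapping signal you want to keep in view at once.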
In NMR, we don't use masses. We use magnetic fields. We don't use columns to separate. We actually just throw the whole mixture right into a giant NMR spectrometer. How many people have ever done NMR? One, two, three, four, five, so not so many. Anyways, this is an example of an NMR spectrum. It looks like a mass spectrum: very narrow peaks, different intensities. We look at them not on the basis of mass to charge, but by their chemical shifts. So for NMR to work, you have to put a sample, a liquid sample, under a strong magnetic field, and then you send radio waves into the sample and you see how well those radio waves are absorbed. Some are strongly absorbed, some are weakly absorbed, and you get an absorbance spectrum over a whole range of frequencies. So what we're measuring in NMR is not mass to charge; we're measuring nuclear magnetism, or changes in nuclear magnetism. We're measuring the absorbance of radio waves due to changes in nuclear spin orientation. You can only get NMR to happen if things are put in a strong magnetic field, and it works on the principle that different nuclei under different chemical conditions will absorb at different radio frequencies. What we normally do in NMR is measure hydrogen atoms, and just for the sake of simplicity we call them protons. And protons have a spin. They can be spinning like the Earth, and some of them are spinning counterclockwise, some are spinning clockwise. So we call it spin up or spin down. Protons have a positive charge, and when you have a charge spinning, it acts like a little magnet. So when they have spin up, the north pole is at the top and the south pole is at the bottom. When the spin is down, the counterclockwise spin, the north pole is down and the south pole is up. So if you have a sample that's in a strong magnetic field, you'll have some spins that are up and some spins that are down.
If you shine a radio wave or send a radio wave into the sample, some of those down spins will flip up. So from down to up, so you go from low energy to high energy. And then once you take the radio waves off, they relax back. So they'll flip back down. That's what we're really detecting in NMR, just the spin flips and then the relaxation, and we collect that information. In NMR you like really, really big magnets, and the bigger the magnet, the stronger the field, basically strong enough to pick up a city bus. So 15 to 20 tesla, or 20,000, sorry, 200,000 gauss. So you'll get different spin frequencies: higher fields, higher spin frequencies. So typically with NMR you have a superconducting magnet, you have a bunch of samples in liquid, you take those samples out, inject them into the magnet, then you send in some radio waves, and the signals are picked up by a transceiver and processed by a computer to produce your spectrum. Magnets are huge. As I said, they're superconducting, they're sheathed in liquid helium, which then is bathed in liquid nitrogen. They have space blankets wrapped around them, several layers. The magnets are made out of niobium tin, they're wound with wire, the wire stretches for miles, and they cost about a million dollars. So they're not trivial things, and they're basically large thermoses with a relatively small magnet inside. So the thermos, as I said, has the space blankets, has liquid nitrogen, liquid helium, vacuums, but then it has this electromagnet, which is a superconducting magnet made of niobium tin. It's a hollow magnet, and inside the hollow magnet you stick a probe. The probe has a coil, it's a radio coil, looks like a little wire. It's called a saddle coil and it's where all the radio action happens. So it's basically a very little, simple radio receiver transmitter.
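To put numbers on the field strengths and spin frequencies mentioned above, here is a rough sketch that converts tesla to gauss and estimates the proton resonance frequency at a given field. It assumes the standard proton gyromagnetic ratio of about 42.577 MHz per tesla; the function names and the example field values are illustrative and not taken from the lecture.

```python
# Proton gyromagnetic ratio divided by 2*pi, in MHz per tesla (approximate).
PROTON_GAMMA_MHZ_PER_TESLA = 42.577

GAUSS_PER_TESLA = 10_000


def tesla_to_gauss(field_tesla):
    """Convert a magnetic field strength from tesla to gauss."""
    return field_tesla * GAUSS_PER_TESLA


def proton_frequency_mhz(field_tesla):
    """Approximate radio frequency at which protons absorb at this field."""
    return PROTON_GAMMA_MHZ_PER_TESLA * field_tesla


if __name__ == "__main__":
    # Example field values chosen for illustration.
    for field in (11.7, 14.1, 18.8):
        print(
            f"{field} T = {tesla_to_gauss(field):.0f} gauss, "
            f"protons absorb near {proton_frequency_mhz(field):.0f} MHz"
        )
```

This is why NMR instruments are usually named by frequency rather than field: a magnet of roughly 14 tesla is what people call a 600 MHz spectrometer.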
Inside that radio transmitter receiver, you put a little tiny test tube about as narrow as a pen or a pencil, and the sample, about 500 or 600 microliters, is put into that. When you collect a spectrum, you'll see peaks, and those peaks are characterized by chemical shifts. We measure them in parts per million. Here's a sample. It's ethanol, so it's got signals... no, it's not ethanol, it's some sort of benzene ring compound. So it's got a peak at one ppm and a peak at seven and a half ppm. You'll see splitting patterns from a phenomenon called spin coupling, which is based on how close hydrogens are to each other. And then the intensity of the peaks, the height of the peaks, actually has to do with the quantity of material or the number of hydrogens. So in fact, unlike mass spectrometry, NMR is inherently quantitative. You can get concentrations from peak intensities. So mass to charge is unique for different compounds, and chemical shifts are unique to different compounds. So you can actually identify compounds based on their chemical shifts. And the positions of these hydrogens and the patterns of those hydrogens and the fingerprints from those spectra allow people to uniquely identify compounds. And so there are large libraries of hundreds or thousands of characteristic chemical shift patterns. And so some patterns are based on whether they're methyl groups, amino groups, aromatic groups, acidic groups. They all have different chemical shifts, ranging from about zero all the way up to about 12 parts per million. So again, this is a simple spectrum, but it just illustrates the positions of those chemical shifts. So the B methyl groups are around 1.9 parts per million and the A methylene groups around 3.8 ppm. And you can see one is a triplet and one is a quartet. And the splitting or multiplicity of those peaks has to do with the couplings of adjacent protons. People assign spectra by looking at their chemical shifts and their coupling patterns.
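Two of the ideas above lend themselves to a small worked example: how a frequency offset becomes a chemical shift in parts per million, and why a methyl next to a methylene shows up as a triplet while the methylene shows up as a quartet. This is a simplified sketch: the function names are hypothetical, the 600 MHz instrument and the 2280 Hz offset are made-up numbers, and the multiplicity function uses the simple first-order "n + 1" rule, which real spectra do not always follow.

```python
def chemical_shift_ppm(peak_offset_hz, spectrometer_mhz):
    """Chemical shift in ppm from a peak's offset (in Hz) relative to the
    reference signal, on a spectrometer running at spectrometer_mhz."""
    # Hz divided by MHz is already parts per million.
    return peak_offset_hz / spectrometer_mhz


def multiplicity(n_neighbors):
    """First-order n + 1 rule: a signal coupled to n equivalent
    neighboring protons splits into n + 1 lines."""
    names = {1: "singlet", 2: "doublet", 3: "triplet", 4: "quartet"}
    lines = n_neighbors + 1
    return lines, names.get(lines, "multiplet")


if __name__ == "__main__":
    # A peak 2280 Hz from the reference on a 600 MHz instrument.
    print(round(chemical_shift_ppm(2280.0, 600.0), 2), "ppm")
    # A methyl group next to a CH2 has two neighboring protons.
    print("methyl:", multiplicity(2))
    # A methylene group next to a CH3 has three neighboring protons.
    print("methylene:", multiplicity(3))
```

Because the shift is expressed as a ratio, the same compound gives the same ppm values on any magnet, which is what makes the chemical shift libraries mentioned above portable between instruments.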
And again, if someone's trained on it, and many chemists are, they can often look at a single pure spectrum and determine the structure of a compound. NMR spectra need to be processed and fixed in different ways. They often come out kind of messy. They have to be phased and shimmed. Water has to be suppressed. They have to be referenced. They have to get baseline correction. All of these things take a bit of art and skill. But once they've been fixed, they look quite nice. And as I said, these are all steps that often require a fair bit of expertise and training. And we would have shown you guys how to do this, but there's a tool we'll show you later that actually does all of this for you automatically. So just like with GCMS and LCMS, this is what an NMR spectrum of a mixture looks like. Lots and lots of peaks. And instead of retention indices or m over z, we're seeing parts per million. So last five or six slides here, I know Michelle's getting anxious, but we've looked at three different techniques: LCMS, GCMS, and NMR. They differ in terms of their sensitivity and limits of detection. So the least sensitive method is NMR, and the most sensitive method is LCMS. So why would anyone want to use NMR? Well, in NMR you actually are characterizing mostly known compounds. And so in fact, an NMR study allows you to generally interpret your data a little more easily. In LCMS, many of the components that you're looking at are not identifiable yet. And so you're in the realm of the unknowns. And it's often hard to write papers about unknown things. However, in LCMS, there's opportunity for new discoveries. And so that's particularly appealing. Whereas in NMR, discovery is mostly based on whether things change in intensity or concentration rather than on identifying brand new compounds. GCMS is right in the middle, where you're getting a nice mix of both knowns and unknowns, but generally there are far more knowns that you see in GCMS.
And so that too is quite appealing. This is mostly a comparison between the different techniques and the sample volumes that are needed. You can see that with mass spectrometry, you need very, very little volume. Whereas with NMR, typically you require anywhere from 100 to 500 microliters of material. GCMS again is sort of intermediate. There are different types of samples that can be used, and different types of metabolites that are detected. So these things are not all overlapping. They're not all redundant. Often you need to use all three methods to be effective. Some take a fair bit of time for prep. Some take less time. Some take very little time for a run. Some take more time. Some have very high limits of detection. Some have very low limits of detection. And as a rule, mass spec, GCMS, and NMR rank differently on each of these. The number of metabolites that can be detected varies. GCMS actually can generally get more than 50 nowadays, though it depends on the type of sample. NMR is highly variable, and mass spec is highly variable as well, depending on the sample. By using multiple platforms, you can also cross-check the identity of what you're seeing. And this is one more reason why multiple platforms should be used. So in terms of modern state-of-the-art instrumentation and techniques, this is sort of what's possible. NMR, anywhere from 50 to 200 compounds. GCMS, 70 to 120. Direct injection mass spec, maybe 190. LCMS, 300 to 500. Lipidomics, 3,000. And then there's all kinds of specialty systems. What I wanted to finally talk about was the distinction between the two types of metabolomics, targeted and untargeted. Targeted is something that I will be emphasizing for this course. Untargeted is what's historically been used in metabolomics. It's still a very useful technique, but one where we use more chemometric perspectives and where the emphasis on quantitation and identification isn't as strong.
So the trend in metabolomics these days, and it's very strong now, is to move towards quantitative or targeted methods. Now, when I say quantitative or targeted, they're not the same. So NMR is an untargeted technique, but it is quantitative. There are types of mass spectrometry which are relatively untargeted but are quantitative. There are other methods that are highly targeted and therefore quantitative. But the point in quantitative slash targeted metabolomics is you identify and you quantify. And unless you identify and quantify, you generally can't publish. And unless you identify and quantify, you can't compare between labs or instruments or platforms. And so that's why people are realizing quantitative metabolomics is really the route to go. So historically the method with untargeted techniques is to do lots and lots of data collection, many samples, highly controlled. And then to treat the samples as sort of images of clouds. And then to compare the clouds to each other and say which ones are most similar and which ones are most different. And then from there, from that data reduction phase, it might allow you to do the metabolite identification. But that's still very difficult. In the targeted one, you take your spectra and you spend a long time identifying and quantifying everything. And then from your list of metabolites and their concentrations, you do the data analysis and reduction. And from there, you can very quickly go to the biological interpretation, or to the patent office and make your millions. So what we're gonna do over the next two days is we're gonna go from spectra to lists of compounds. We're gonna see how we do this with GCMS, LCMS and NMR. And then from the list of compounds and their concentrations, we're gonna go from lists to pathways. And we're gonna see how you can help interpret those things. And from pathways, and from the lists and models, it's also possible to go to biomarkers.
So we're gonna look at those specific informatics challenges both in the next lecture and the next lab, and then we'll continue on for the next day on those two things. So we're gonna break now for coffee and for stretching, and hopefully everyone can find the bathroom if they need to.