So we're going to cover some issues around chemical, spectral, and biological databases. I've mentioned some of these already, but this is partly to help with the later assignment, which, as we said, is optional for you, and it makes use of some of the ideas and concepts we've already mentioned. So today we've talked about metabolomics in general and why it's important. We've gone through some examples where we used the Chenomx software as a kind of model, because a lot of the concepts in the Chenomx software are similar to what you would do with GC-MS or even LC-MS, and I tried to reiterate that in the third lecture. This last one is really about the databases, because without the databases we couldn't really do metabolomics. We've seen how we can analyze these things, but even the Chenomx software depends on a database. Everything we talked about in identification, MyCompoundID, PubChem, HMDB, is all about databases. It's a theme we'll go over again and again. And it's the same elsewhere: proteomics couldn't happen, and genomics couldn't happen, without their databases. So we're going to talk about the databases, database models, and the different types of databases: spectral databases, pathway databases, and then public metabolomic databases.
One of the things that became quite evident, especially in the early days of metabolomics (and Jeff lived through those early days, and in fact even Allison was involved in the early days, she's that old), is that there were bioinformaticians, and then there were people called cheminformatics specialists, and they really didn't talk to each other. That's partly historical. Cheminformatics is actually a very old discipline. It was established mostly by commercial enterprises in the 1960s, primarily to work with organic chemists, and partly through the American Chemical Society.
The result was that they produced things like user-pay systems, with very, very limited public access. Large companies such as MDL, Beilstein, and Sigma, along with the Chemical Abstracts Service, all established the concepts and ideas behind cheminformatics. Then, 30 years later, this thing called bioinformatics appears. The idea was to address the needs not of chemists but of molecular biologists, particularly people doing gene sequencing and gene characterization. They decided to make things open access and web-based. And instead of being funded through private companies like Beilstein, Sigma, and MDL, they were funded by NCBI, EBI, Genome Canada, NIH. So the result is two very different models, and that's why they were two solitudes. What's happened, though, thanks to things like the PubChem project, DrugBank, HMDB, and MetaboLights, is that the barriers are being broken down. The monopoly that commercial companies have had is largely disappearing, and so there's now a blending of cheminformatics with bioinformatics, which I think is good.
We create databases, whether they're bioinformatics or cheminformatics databases, for certain reasons. We do it to make data and information linkable and consolidated, so you can find it in one place. If you're trying to get information these days, everyone goes to Google. Even though all of the information behind Google is distributed, Google is still the conduit, so it's sort of your database. Where do I find cool facts about interesting things? I go to Wikipedia. It's, again, a central resource. So those are examples of database portals or databases. We do information retrieval; query matching is one of the better use cases for databases. In the case of scientific databases, we like them for reference values. We like them for reference sequences, like GenBank, and for reference images. If you're trying to put together a slideshow, you can usually get them from Google Images.
A lot of us use the data in databases to train or improve algorithms. All of the mass spec prediction algorithms in CFM-ID were trained on databases. They were trained on data in METLIN, on data in HMDB, on MassBank data. Without that training set, we couldn't have developed a predictor. The other thing that we use, and it's probably marked in red here, is the idea of similarity searching and prediction. Databases are invaluable for looking for similar images, similar sequences, similar text, similar spectra. We've done similar-spectra searches in Chenomx already. We've seen examples of how it works in AMDIS and NIST. We've seen examples in CFM-ID and METLIN. If you use BLAST, you've seen how it works. If you've ever done structural biology, you can see it in the PDB. We also use databases to help with prediction. It's through the data in databases that we can learn certain trends, facts, numbers, and so with databases we enhance or improve our capacity to predict. So databases are invaluable, and I don't think people really appreciate just how critical they are, and how critical they have been, in developing modern science.
Interestingly, most databases start out as hobby projects, and I can speak from experience. All of the databases that my lab has produced started out essentially as hobby projects. Sometimes they were partly graduate student projects; sometimes it was something of my own. They're very simple: flat files, Excel, text files. A lot of the early databases we built, I just built in Microsoft Word. Obviously, when they're hobby databases, they're not so big, and they don't cover the information so well. Then as you progress, either because people like the database or because you feel an obligation to the community, you start making them curated. You might make them relational rather than simply flat file. You typically house them in one place. That essentially gives you a web database.
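As a rough illustration of the step from a flat file to a relational layout described above, the sketch below loads two spreadsheet-style rows into two linked tables using Python's built-in `sqlite3` module. The table names, column names, and spectrum labels are made up for the example; they are not the schema of HMDB or any other real database.

```python
import sqlite3

# Flat-file style: one row per compound, with all spectrum types
# packed into a single text cell (illustrative values only).
flat_rows = [
    ("alanine", "C3H7NO2", "1H-NMR;MS/MS"),
    ("glucose", "C6H12O6", "1H-NMR"),
]


def build_relational(rows):
    """Split flat rows into a compound table and a linked spectrum table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compound ("
        "id INTEGER PRIMARY KEY, name TEXT UNIQUE, formula TEXT)"
    )
    conn.execute(
        "CREATE TABLE spectrum ("
        "id INTEGER PRIMARY KEY, "
        "compound_id INTEGER REFERENCES compound(id), kind TEXT)"
    )
    for name, formula, kinds in rows:
        cur = conn.execute(
            "INSERT INTO compound (name, formula) VALUES (?, ?)",
            (name, formula),
        )
        # One spectrum row per spectrum type, linked back by compound id.
        for kind in kinds.split(";"):
            conn.execute(
                "INSERT INTO spectrum (compound_id, kind) VALUES (?, ?)",
                (cur.lastrowid, kind),
            )
    return conn


def compounds_with(conn, kind):
    """Return names of compounds that have a spectrum of the given kind."""
    rows = conn.execute(
        "SELECT c.name FROM compound c "
        "JOIN spectrum s ON s.compound_id = c.id "
        "WHERE s.kind = ? ORDER BY c.name",
        (kind,),
    )
    return [r[0] for r in rows]


conn = build_relational(flat_rows)
print(compounds_with(conn, "MS/MS"))
```

The point of the relational form is that a question like "which compounds have an MS/MS spectrum" becomes a join rather than a text search through a packed cell.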
At that point you can publish on them. The last phase of a database is one that most databases never reach, and that's where they become an open deposition database. GenBank is an open deposition database. PDB is an open deposition database. MetaboLights is a new database for metabolomics, and that's an open deposition one. They are usually relational databases; they don't want to go back to flat files. They're often distributed. They may be housed in several countries to make sure the data is never lost. That's the case with GenBank and the PDB. The problem is, when you go from the hobby database to these really extensive, archival, open deposition databases, they become very, very expensive to maintain. But as they become big and expensive, the community appreciates them much more. So moving from the early stages to the later stages, you have to add more capabilities. You need to standardize data, data retrieval, data entry. You need to depend a lot on automation. You'll have archivists as well, but you do a lot of computer automation. And you also have to improve your querying capabilities.
You've seen this slide before, but this was the situation in metabolomics about five or six years ago, where we just didn't have the databases to do anything practical in metabolomics, whereas genomics and proteomics did have those data resources. The reason it was so difficult is that most of the data, and it's still a problem, was in books and print journals. It wasn't electronic. There wasn't the tradition of depositing things that there was for DNA sequences or proteins. My own estimate is that metabolomics still lags behind genomics and proteomics by about 20 years, which is a lot of time to catch up. The other thing is that metabolomics is a field with a much wider and more diverse user base than proteomics or genomics. So yes, there are the metabolomics researchers, and many of you might count yourselves among them.
But there are also analytical chemists who couldn't tell a metabolite from a rug or something; to them, it's all about the instruments. There are people who work only in plants. There are people who work only with humans, under clinical circumstances. There are physicians. We've looked at the very different technologies, NMR, mass spec, GC, and again, these are quite diverse. You don't see that instrumental diversity in proteomics. And then there are the cheminformaticians and the bioinformaticians, those two solitudes, who also have to be in there. So when you build a database, you have to think of all of these people and try to appeal to all of them, and that makes it particularly hard in metabolomics. The result is that we have a bit of fragmentation. We have databases that are somewhat specialized: specialized NMR databases, specialized MS databases, specialized compound databases. And then for the people who are more biological, we have specialized pathway databases. It's only relatively recently that we've started to see whether we can consolidate them into comprehensive databases.
So I'm going to talk about all of these, one by one. Some of these you know about. Some of them I've already mentioned. Some you have yet to hear about, and hopefully you'll learn a little bit more about them. The spectral databases, as I said, we've touched on. Some are only about small molecules, and some are not necessarily metabolite oriented. Many of the compound databases are fairly simple. Pathway databases are the ones that I think most of you would find more appealing, more interesting, and these are areas where I think we need a lot more development. And then last, as I say, are the comprehensive ones.
So here are four important spectral databases. How many people have seen or heard of any of these? OK. Well, that's good. What's that? How many have ever heard of these databases?
Oh, there must be something wrong then. No, I think Rose is the only NMR person, so she's probably the only one who would know these.
This is the Spectral Database System, SDBS, which was developed about 25 years ago in Japan. The website is here. It's a real gem, and there's a lot of material here. It's run by their national standards institute. It has a huge number of mass spectra and a huge number of NMR spectra, and it has a large collection of different compounds, probably the largest we know of. These are different, distinct compounds, whereas the NIST one has a lot of duplicates. Again, not all the compounds are metabolites, but some are, and that's great. It also has some nice spectral search tools, which is very, very useful.
BMRB, or BioMagResBank. This is the Wisconsin resource, and I showed you the BMRB peak search. They have also maintained a collection of reference compound spectra, and they've done a really nice job with it, both in terms of the viewers and the data collection. They probably spent more than $1 million assembling the data and putting it online. As I say, there are about 1,000 reference metabolites. They support all kinds of searches. They have the NMR reference spectra. The focus historically was on plants, Arabidopsis. And most of the spectra are assigned, which is also very useful.
Another database that very few people know about is called NMRShiftDB. This has NMR spectra that have been collected by organic chemists, who have pooled their resources over many years. It now has 50,000 spectra. Not all of these are metabolites, but many are drugs, some are toxins, and a few are metabolites. This particular database was actually started by Chris Steinbeck. He's the guy who heads up ChEBI, and he's also the guy who heads up MetaboLights, so he's a very, very active person in the field of metabolomics. They have a capacity to predict chemical shifts.
So we talked about being able to do mass spec prediction; NMRShiftDB can do chemical shift prediction. It's not great, but it's free, and it's about the only free one I know of. You can search by name or by structure, and you can do chemical shift searches. The issue here is that these are mostly compounds used by organic chemists, so it's not stuff in water, whereas in metabolomics almost everything is.
The last one is the Madison Metabolomics Consortium Database, MMCD. It's also based in Wisconsin, and it's an adjunct to the BioMagResBank. What they've done is take all the BioMagResBank data and then go and find a bunch of metabolite data in the literature. So they put in the literature chemical shifts for protons, for carbon-13, and for HSQC, and they also collected literature MS data for about 2,000 other metabolites. So yes, these are mostly NMR databases, but at least two or three of them have MS data and other kinds of data, and some of them have predictive capabilities. They're not necessarily the ones I would use every day, but there are resources here that could and should be mined, and I think people could find a lot of useful nuggets of information here. So that's just a description of the MMCD database.
So we've talked about the NMR databases, but there's also another set, the mass spec databases. I've mentioned the NIST database before, and it's not just a GC-MS database; it also has triple-quad data. I've mentioned the METLIN database. I've mentioned the Golm database. But I haven't talked about MassBank. How many people have heard of MassBank? Again, no one. MassBank is also based in Japan. It's a really nice resource, really nicely designed, very rich in information, and it was designed for metabolomics. So it's different from METLIN, and it's different from the HMDB. It has a huge amount of experimental MS data, and you can do things like peak searches.
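To make the idea of a peak search concrete, here is a minimal sketch that ranks reference spectra by how many query m/z values they match within a fixed tolerance. It is not MassBank's actual algorithm or API; the function name, the tolerance, the library layout, and all the m/z values are placeholders chosen for illustration.

```python
def peak_search(query_mz, library, tol=0.01):
    """Rank library spectra by the number of query peaks they match.

    query_mz: list of observed m/z values.
    library:  dict mapping a spectrum name to its list of reference m/z values.
    tol:      absolute m/z tolerance in Da (an assumed default).
    Returns a list of (name, hits, fraction_of_query_matched), best first.
    """
    results = []
    for name, ref_mz in library.items():
        # A query peak counts as a hit if any reference peak lies within tol.
        hits = sum(
            1 for q in query_mz if any(abs(q - r) <= tol for r in ref_mz)
        )
        if hits:
            results.append((name, hits, hits / len(query_mz)))
    # Most hits first; ties broken alphabetically for a stable order.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


# Placeholder spectra, not real reference data.
library = {
    "spectrum_A": [100.002, 150.000, 300.0],
    "spectrum_B": [100.0, 200.3],
    "spectrum_C": [400.0],
}
query = [100.000, 150.005, 200.000]

for name, hits, fraction in peak_search(query, library):
    print(name, hits, round(fraction, 2))
```

Real spectral search tools usually also weigh peak intensities, for example with a dot-product score, which this sketch deliberately leaves out.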
The search produces output, so it's somewhat similar to METLIN and somewhat similar to HMDB. They do a really great job of maintaining it. You're getting Q-TOF and triple-quad data, and you also get GC data, FT-ICR, and ion trap data: 15,000 compounds, 41,000 different spectra. And it's international. It takes data from all kinds of depositors in many different countries. So again, if you're not having any luck with NIST or with METLIN, consider this database, because the data is freely available.
So those are two sets of examples: the NMR databases and the mass spec databases. There are also compound databases. These definitely don't focus on spectral information, but they do have information about compound names, compound formulas, compound structures, and some of their predicted properties. We've talked about PubChem a lot. It's sort of the big, big database. How many people have used or heard of ChemSpider? A few. Anyway, it's sort of a parallel analog of PubChem. It has more of a private focus, and it's maintained more by British chemists, whereas PubChem is more an American enterprise.
Ligand Expo is a collection of all the small molecules in the Protein Data Bank. This is actually quite valuable, because many of these small molecules are either drugs or natural products, and many of them are crystallized with their protein binding partners. So the relationship between the small molecule and the enzyme is right there for you to see in 3D color images. They spent a long time extracting that information out of the PDB. It used to be just terrible, but they realized how important small molecules are, thankfully, and they've done a nice job with that.
ChEBI is one that I've mentioned briefly, but it's a great resource. Again, Christoph Steinbeck is the one who's been putting it together.
ChEBI is up to about 40,000 compounds, and they have an annotation team of about three or four people working steadily every day, collecting the compounds and annotating them. They gather data from other resources: from KEGG, from LIPID MAPS, from DrugBank, from current news articles. Their focus is on ontology, chemical ontology: describing chemicals and naming them properly, similar to the Gene Ontology, if you've heard of that. They've given you tools to search by names and formulas, and you can draw structures and search by structure, so that's obviously very useful. And as I say, the compounds there are mostly natural products, mostly relevant to the world of metabolomics.
I mentioned PubChem; it's 75 million substances. They basically can't be proteins, but they can be pretty large. They collect data from all kinds of depositors and vendors. They distinguish between substances and compounds, and sometimes that's confusing. The compounds are sort of the pure, unique ones. They have a lot of data, including names, synonyms, and some properties that are calculated from the chemical formulas. InChI is the International Chemical Identifier. This is the way we're all moving now, toward almost all chemicals being described with these InChI strings. There's another way of describing chemicals, through what are called SMILES strings. These are character representations of chemical structures. Originally, PubChem's goal was actually to use these compounds for drug research, and the bioactivity assays are what they had collected them for. And it kind of went sideways. Everyone forgot about the bioactivity data and just glommed on to the fact that they had all these chemicals to work with. They still try to push the bioactivity, and that's sort of the main funding goal, but everyone likes PubChem because it's got all these chemicals in it.
I mentioned ChemSpider. It's not as big as PubChem, but it probably draws from more data sources.
The other thing is that ChemSpider has collected a lot of data on spectra, whereas PubChem doesn't have that. They're also obsessive about their synonym sets, so the synonym quality in ChemSpider is much better than it is in PubChem. Last time I looked, they didn't have great tools for searching by chemical formula or mass, but I hope that's changed, or maybe it's changing now.
I mentioned Ligand Expo, and as I said, this is a much smaller database. It's only perhaps five or ten thousand compounds, but it's unique in that it ties the compounds to the proteins, and in many cases the protein is associated with a specific organism. So there's a lot of biology there and a lot of chemistry there, and it's very minable, very accessible. A lot of that data is now linked to DrugBank and HMDB as well.
There are other compound databases that are less well known. 3DMET is sort of a 3D structure database of natural compounds. KNApSAcK is a plant metabolite database. ZINC covers compounds that are commercially available. And then there's LIPID MAPS, a lipid database with about 30,000 compounds from mammals, plants, and insects, developed at UCSD. So they're somewhat more specialized than PubChem or ChemSpider, but they are very useful in their own right. All of them have, say, five to ten data fields describing the chemicals, and typically not a lot about the spectra. And again, this is just the more detailed information. ZINC is mostly about these commercially available compounds; they list two and a half million. In many cases, these are ligand or chemical libraries used or developed by drug companies for drug screening. They're not really commercially available unless you're a drug company, but it's still a vast portion of chemical space.
Okay, so we've talked about spectral databases, and we've talked about compound databases. There are a lot of compounds; some are metabolites, and some are just general chemicals.
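The InChI strings and the formula and mass searches mentioned for these compound databases can be tied together in a small sketch. In a standard InChI, the second `/`-separated field is the molecular formula, so the code below pulls that field out, computes a monoisotopic mass from it, and checks an observed mass against it with a ppm tolerance. The function names and the 10 ppm default are assumptions for illustration, and the formula parser only handles simple, single-component formulas.

```python
import re

# Monoisotopic masses (in u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.973762,
    "S": 31.972071,
}


def formula_from_inchi(inchi):
    """Return the formula layer (second '/'-separated field) of an InChI."""
    parts = inchi.split("/")
    if len(parts) < 2 or not parts[0].startswith("InChI="):
        raise ValueError("not an InChI string")
    return parts[1]


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C2H6O'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def within_ppm(observed, theoretical, ppm=10.0):
    """True if the observed mass is within the given ppm of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= ppm


# Ethanol: SMILES "CCO", standard InChI below.
ethanol_inchi = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
formula = formula_from_inchi(ethanol_inchi)
mass = monoisotopic_mass(formula)
print(formula, round(mass, 5), within_ppm(46.0420, mass))
```

A mass search over a compound table is then just this tolerance check applied to every row, which is why databases that store a precomputed monoisotopic mass per compound can answer such queries quickly.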
These are fairly lightly annotated. There's not a lot of biological context, except perhaps with the Ligand Expo database, and even ChEBI has very little biological context. Where you actually get into the biological context is in the pathway databases, and this is where I think databases are perhaps most useful. I would hope that almost all of you have heard of these. The most famous one is KEGG, the Kyoto Encyclopedia of Genes and Genomes. One that's probably just as old as KEGG, but not as well known, is the BioCyc/MetaCyc set of databases. Has anyone heard of those? A few, three, four. Then the Reactome database is something that has a Canadian connection, but it's also been picked up now by EBI. Has anyone used or seen the Reactome database? You've seen it? One, two, three. And then another one developed in Canada is the Small Molecule Pathway Database, or SMPDB. I don't think anyone's heard of that, except maybe Allison.
The point is that the pathway databases give you biological context. I complained earlier about network diagrams, or hairball diagrams, which just show links but don't show any directionality. Pathway databases can relate genes, metabolites, and proteins together. They can relate them to diseases. They can relate them to signaling events. They can relate them to all kinds of processes in the cell and within a physiological context. The better pathway databases allow you to visualize or map gene and protein concentrations, positions, or presence and absence. A number of them cover many different species, which is also useful, and this is particularly evident with KEGG. KEGG has these network diagrams showing pathways of metabolism, which I think everyone has probably seen. The network diagrams are hyperlinked, so you can click on them, and they pop up a card that gives you about 10 data fields describing the particular compound or, in some cases, the enzyme.
KEGG is getting on to about 20 years old now. It has about, say, 17,000 metabolites, 10,000 drugs, and a huge number of glycans, or carbohydrates, and they've created about 450 different pathways. The number of compounds in KEGG hasn't grown a whole lot; it's in fact somewhat smaller than most metabolome databases now. The number of drugs is almost insanely high, and people are completely puzzled as to why they've got so many drugs, because the FDA has only approved something like 1,500 drugs. So a lot of these might be illicit ones, I don't know. But it is sort of the number one resource that people use for pathway analysis.
I'm going to talk a little bit about another database, the Small Molecule Pathway Database. This one, as I say, was developed in Canada. Allison has played a big role in developing it, along with other graduate students in my group. This one was developed to be, basically, a metabolomics pathway database. KEGG was developed when the word metabolomics wasn't even around, so it has a slightly different mandate or expectation in terms of what it does. SMPDB actually has a lot more pathways than KEGG; I think it's getting up to about 650 pathways. KEGG does have drug pathways, but they show you how the drug is synthesized. SMPDB has drug pathways that show you the drug and how it works. KEGG does have a couple of disease pathways showing how cancer develops, for, I think, maybe three or four diseases. SMPDB has about 230 disease pathways. A lot of them are metabolic disease pathways, which, as I said, account for something like 30% of all genetic disorders. The other thing is that when you look at metabolism, you can cut pathways many different ways. You can have a pathway that just says alanine to aspartate and call that a new pathway, or you can have two branches or two sides of the citric acid cycle and call that another pathway.
So people can slice and dice pathways in lots of ways, but when you try to create pathways that are sort of comprehensive, it basically boils down to about 100 basic metabolic pathways that are found in essentially all mammals, most plants, and most microbes. SMPDB also has those. And then there are about 30 signaling pathways, and this is the vast dark matter of metabolomics. Metabolites aren't just the bricks and mortar; they actually function as signaling molecules in more cases than not. We don't appreciate that, and many of those signaling pathways aren't in your textbook, and they aren't taught in school or university. That is where I think the next great challenge in metabolomics lies, to get those.
What SMPDB also does is try to draw pathways not the way KEGG does, which is just sort of a wiring diagram with points and dots, but in a way that actually gives you some biological context. You know, a lot of metabolic activity happens in the mitochondria. Have you ever seen a mitochondrion depicted in a KEGG pathway? There are about 20 different organelles in a cell, and each of them is often very metabolically specialized. Do you see that in KEGG? Do you know whether the proteins, which are usually given with EC numbers in KEGG, are dimers or multimers, or whether they have cofactors? These are things that are not typically illustrated in pathway maps or pathway databases, but we try to do this in SMPDB. We also try to let you overlay data: if you have microarray data, RNA-seq data, or metabolomic data, you can upload it and visualize it on the pathway. In the same way, you can take gene, protein, or chemical lists that you've gathered from a metabolomics experiment or a proteomics experiment and see where these show up in abundance in particular pathways, to help with things like disease diagnosis.
So this is an example of a pathway from SMPDB. Obviously it looks different from a KEGG pathway.
For one, you can actually see the structures. You can see the different organelles; you can see the mitochondria. You can also see the organs that are affected. In this case, I'm not sure if this is PKU or something, you can't tell. The metabolites that are produced seem to cause damage to the brain, to the eyes, and it looks like to the kidney. And then you can see the different proteins. Some are dimers, some are trimers, some have cofactors. And of course there's the directionality, and you can see where the metabolites are sitting, inside the cell or outside the cell.
Sure. So if you type a list of metabolites into SMPDB, say you've done a metabolomic experiment and got 50 compounds, or 100 compounds, it will take that list and essentially identify where the metabolites are sitting in the different pathways. It'll run through all 650 pathways. The pathways that are most heavily enriched with those metabolites will be listed first, and they'll have a score. Then you have a list that you can browse, just like with a mass spec search, which gives you a list of metabolites. You get little thumbnails of all the pathways, the score that you got, and the number of metabolites that were identified. That way you can identify which pathways are consistently enriched. It won't show you the entire metabolic map. I mean, there is value in the giant metabolic map that KEGG has, but this one might also identify specific diseases, or specific pathways that aren't in KEGG. And if you're seeing the same or similarly named pathways consistently, then that obviously tells you there's some connection or link.
All of the images are hyperlinked. The structures are linked to HMDB, and the proteins are linked to UniProt. This is an example where you can put in your lists of metabolites. You can map them and click checkboxes to highlight them.
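The ranking step described above, taking a metabolite list, scanning every pathway, and listing the most enriched ones first with a score, can be sketched as follows. This is not SMPDB's actual scoring method, which the lecture does not specify; the score here is simply the fraction of a pathway's members found in the query, and the pathway contents are small, truncated toy sets.

```python
def rank_pathways(query, pathways):
    """Rank pathways by the fraction of their members present in the query.

    query:    iterable of metabolite names from an experiment.
    pathways: dict mapping a pathway name to a set of member metabolites.
    Returns a list of dicts with the pathway name, matched metabolites,
    and score, highest score first.
    """
    query_set = {m.lower() for m in query}
    ranked = []
    for name, members in pathways.items():
        members_lc = {m.lower() for m in members}
        hits = sorted(query_set & members_lc)
        if hits:
            ranked.append(
                {
                    "pathway": name,
                    "hits": hits,
                    "score": len(hits) / len(members_lc),
                }
            )
    # Highest score first; ties broken by pathway name for a stable order.
    ranked.sort(key=lambda r: (-r["score"], r["pathway"]))
    return ranked


# Toy pathway memberships, heavily truncated for illustration.
toy_pathways = {
    "Glycolysis": {
        "glucose", "glucose 6-phosphate", "phosphoenolpyruvate", "pyruvate",
    },
    "Citric acid cycle": {"citrate", "succinate", "fumarate", "malate"},
    "Alanine metabolism": {"alanine", "pyruvate"},
}

for row in rank_pathways(["Pyruvate", "Alanine", "Citrate"], toy_pathways):
    print(row["pathway"], row["hits"], round(row["score"], 2))
```

A production tool would more likely use a statistical over-representation test, such as a hypergeometric test, so that large pathways are not penalized or favored just because of their size.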
You can put in concentrations, relative or absolute values, and these will show up in different colors, red, green, yellow, within the pathway. So there's a fair bit of interactive work. It's designed to be viewable the way you look at Google Maps, so you click and drag, or click and zoom. It's a very useful way of navigating through the pictures.
The last set of databases I want to talk about, before we wrap up and you get to some of your assignments, is the comprehensive metabolite databases. KEGG is one example of a comprehensive database. HMDB, we've mentioned. I've mentioned MetaboLights, which is the archival resource. DrugBank is a drug database. There's also a database for yeast metabolomics and another one for E. coli metabolomics. The HMDB, as mentioned, is about 42,000 compounds. It also identifies gut microbial metabolites. It has lots of information about normal and abnormal concentrations, data on diseases, and lots of spectra. And because all the metabolites are also linked to genes and proteins, you can do BLAST searches. You can do spectral searches, and you can browse. All of the information is connected to SMPDB and also to KEGG pathways. You can search for data in different biofluids: saliva, CSF, blood. You can do structure searches, so you can draw a partial structure or a complete structure and it will look for similar structures. You can do very complicated text searches, and all of the data is freely available.
It actually grew out of a project called the Human Metabolome Project, which started in 2005 and technically is still continuing, although most of the funding for it ended in 2010. The goals of that project were to associate metabolite concentrations with diseases, to make all the data freely available, which we have, and also to help develop some new technologies to improve metabolomics and metabolite coverage. So in large part, I think we've succeeded with that project.
And certainly the Human Metabolome Database is its strongest legacy. Again, you've seen this sort of picture. The Human Metabolome Database was one part of that project, but we also created DrugBank, we created FooDB, and we created the Toxin Target Database. All of these represent components that can be found inside the human body. Among these databases, DrugBank is actually far and away the most popular. It gets something like seven or eight million hits a year. The reason it's so popular is that it was the first database to link drugs to drug targets, and that's something that apparently most drug research firms and most drug companies didn't know. So it's now standard for all of the major pharmas to have this particular database. We've done the same thing with the toxin database, toxins and their targets. And with the Human Metabolome Database, we've tried to associate as many metabolites as possible with their targets or enzymes. The food database is still under development. It's going to be very hard to associate food components with genes and enzymes, but I think food has a different purpose, so a lot of the information we're recording in FooDB is about flavor, aroma, taste, and color.
So I'd mentioned that HMDB has lots of information about chemicals, that it's searchable with spectra, and that there are pathway links. We've seen some of those examples. You can browse through the data, and there are lots of data fields. Unlike, say, KEGG or PubChem, which average about 10 data fields, HMDB has over 100 data fields. These are some examples of spectral searching, and these tools are evolving even as we speak. There are pathway tools for pathway searching, again to be able to type in lists of metabolites and then identify the most enriched pathways based on those. Humans have different biofluids. We have blood, we have saliva, we have urine.
These fluids surround different compartments, so one of the objectives of the Human Metabolome Project was to annotate each of those biofluids. You can search through this database and look at the normal and abnormal concentrations. It's very, very rich, and it's one of the more unique parts of HMDB.
DrugBank, as I mentioned, is another tool, and this is to cover the "drugome", the drugs and drug components that can be detected in a population. If you do any clinical studies, you will see large numbers of people with drugs in them. Hopefully no one has all 1,400 drugs in them, but usually in a population you'll see at least a few. So that was the purpose of DrugBank, but as I said, because it linked drugs with their targets, that's when it sort of took off. It has ways of browsing, and you can look by categories and indications. You can draw a structure or partial structure and look for similar drugs. You can search against the drug targets using BLAST, and then you have this relational database query tool called Data Extractor, which allows you to do very complex queries. The Yeast Metabolome Database, T3DB, the E. coli Metabolome Database: all the databases that are maintained in Edmonton have a very, very similar structure, design, and makeup.
The other one that's important, and it's relatively new, is called MetaboLights. All of the databases I've shown you up till now are curated, if you will private, databases. Curators assemble the data, they push it out, and they release it every couple of years. GenBank and the Protein Data Bank are archival databases, where you can deposit your data, and there's been a strong need for the metabolomics community to have a GenBank or a PDB. And this is it; this is MetaboLights. You can deposit your metabolite data into MetaboLights. It's operated by Chris Steinbeck, who I've mentioned before. It's maintained at the EBI, and you can upload your experimental data: the spectra, your compound lists, their concentrations, the experimental design, the experimental results.
MetaboLights has a lot of data, and they've been adding to it, so you can actually find metabolomic data for different organisms, probably about a hundred different organisms. Some of it they've gathered from data in ChEBI, and some of it they've gathered from our databases and elsewhere. It complies with the Metabolomics Standards Initiative (MSI), so it's compliant with the standards that are expected in the community. This is just a table that compares the different types of databases to each other: ChEBI, SDBS, Reactome, METLIN, the Cyc DBs. Each of them will have information about nomenclature or spectra or descriptions or chemical properties. But I think to be truly comprehensive, you'd like to have check marks on most of these. That's what's needed to appeal to the broad community, and it's also what's needed, I think, to deal with the specific requirements of metabolomics. Many of these databases were developed before the word metabolomics was even invented, so it's only the more recent ones that have actually tried to adjust their data structures to meet the needs of people in the field of metabolomics. So in the remaining time, that's 40 minutes or so before we break, but also if you guys want to work after the break or all night, we have a few options. Some of you have been playing around with the Chenomx software, and you might want to keep playing around with that. We have an exercise handout. We also have a tutorial. You can choose any one of the three, or all three, to work on. They're mostly intended to make use of some of the databases, tools, descriptors, and techniques we talked about over the last two sets of lectures. And essentially, as they say, the best way of learning is doing. I've been talking at you, but if you want to really use these resources, it might be worthwhile just poking around on the web, seeing what they're like, and seeing how you could use them.
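To make the "learning by doing" point a bit more concrete, here is a minimal sketch of the kind of relational query that linking drugs to their targets makes possible. It is not DrugBank's actual schema or its Data Extract tool: the table names, column names, and the three-drug toy dataset are assumptions made up for illustration, and it runs on an in-memory SQLite database rather than any real download.

```python
import sqlite3


def build_demo_db():
    """Build a tiny in-memory database with a hypothetical drug/target schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE drugs (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE targets (id INTEGER PRIMARY KEY, gene TEXT);
        -- Link table: one row per drug/target association.
        CREATE TABLE drug_targets (drug_id INTEGER, target_id INTEGER);
        """
    )
    # Toy rows for illustration only, not an export from any real database.
    conn.executemany(
        "INSERT INTO drugs VALUES (?, ?)",
        [(1, "aspirin"), (2, "ibuprofen"), (3, "atorvastatin")],
    )
    conn.executemany(
        "INSERT INTO targets VALUES (?, ?)",
        [(1, "PTGS1"), (2, "PTGS2"), (3, "HMGCR")],
    )
    conn.executemany(
        "INSERT INTO drug_targets VALUES (?, ?)",
        [(1, 1), (1, 2), (2, 1), (2, 2), (3, 3)],
    )
    return conn


def drugs_sharing_target(conn, drug_name):
    """Return the names of other drugs that share at least one target with drug_name."""
    rows = conn.execute(
        """
        SELECT DISTINCT o.name
        FROM drugs AS q
        JOIN drug_targets AS qt ON qt.drug_id = q.id
        JOIN drug_targets AS ot ON ot.target_id = qt.target_id
        JOIN drugs AS o ON o.id = ot.drug_id
        WHERE q.name = ? AND o.id != q.id
        ORDER BY o.name
        """,
        (drug_name,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(drugs_sharing_target(demo, "aspirin"))
```

The design point is the link table: once drugs and targets are stored separately and joined through associations, questions like "which other drugs hit the same target" become a single query.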
I mean, a theoretical understanding of them is helpful, but it really does pay off to actually click on the website and see if you can solve a problem or two with them. I've tried to give you a wide variety of database choices, and I think you'll find that some of them are more appealing to you just because of the makeup or the layout. Those are the ones you should stick with. It's nice to have options, just like when you go shopping, it's nice to have different choices. So that's it for me right now. If you have any questions, I'm here. If you want to get started on some of the exercises, you can do that. If you need to take a break, this is also a good time to do it. Okay.
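As a possible starting point for the exercises, the pathway-searching idea from this lecture (type in a list of metabolites, then find the most enriched pathways) can be sketched as a simple over-representation test. This is only one common way to score enrichment, using a hypergeometric tail probability; it is not necessarily what HMDB or any other tool does internally. The function names, the abbreviated pathway lists, and the background size of 100 metabolites are all assumptions chosen for illustration.

```python
from math import comb


def hypergeom_pvalue(hits, query_size, pathway_size, background_size):
    """P(X >= hits) when drawing query_size metabolites from the background,
    where pathway_size of the background metabolites belong to the pathway."""
    total = comb(background_size, query_size)
    upper = min(query_size, pathway_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, query_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def rank_pathways(query, pathways, background_size):
    """Score each pathway by over-representation of the query metabolites.

    Returns (pathway name, number of hits, p-value) tuples, most enriched first.
    """
    query_set = set(query)
    results = []
    for name, members in pathways.items():
        hits = len(query_set & set(members))
        p = hypergeom_pvalue(hits, len(query_set), len(members), background_size)
        results.append((name, hits, p))
    return sorted(results, key=lambda item: item[2])


# Abbreviated pathway memberships, for illustration only.
TOY_PATHWAYS = {
    "TCA cycle": [
        "citrate", "isocitrate", "alpha-ketoglutarate", "succinate",
        "fumarate", "malate", "oxaloacetate",
    ],
    "Glycolysis": [
        "glucose", "glucose-6-phosphate", "fructose-6-phosphate",
        "phosphoenolpyruvate", "pyruvate",
    ],
}

if __name__ == "__main__":
    # Hypothetical list of identified metabolites and an assumed background size.
    query = ["citrate", "succinate", "malate", "glucose"]
    for name, hits, p in rank_pathways(query, TOY_PATHWAYS, background_size=100):
        print(f"{name}: {hits} hits, p = {p:.4g}")
```

A real analysis would use the full pathway definitions from a pathway database, a background matched to what the platform can detect, and a multiple-testing correction across pathways.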