So this was a lecture that Fiona Brinkman was originally planning to present. Unfortunately she was unable to come today, so instead you have the opportunity to have me present this final lecture on microbiome biomarker discovery. I should say that all of the slides we're presenting today were created by Yelin's group, so Taya Van Roosom and Yuriy Krissen, and maybe Yelina herself, contributed to this slide deck. I just wanted to make sure it's acknowledged that those are the people who really contributed this set of slides.

Okay, so what are biomarkers and how do we find them? How many people work on biomarkers and try to identify biomarkers? Okay, just two people. How many people are interested in biomarkers? Okay, good. And I'm not going to ask the other question.

So what is a biomarker? A biomarker is really some kind of measurable biological property that can be associated with a particular phenomenon or a particular sample. It might be an individual that has a particular infection or disease, or it could be something out in the environment that's indicative of some kind of pollution or some kind of environmental phenomenon. And there are two types of biomarkers. There are functional biomarkers: these might be specific to single organisms, or they might be functions that could be provided by any one of a number of organisms, but either way it's a function which is indicative of what's happening within that particular sample. The other type, which we're probably more familiar with, are taxonomic biomarkers. These might be something like a specific species, or a category of different organisms, and we're all familiar with the term OTUs, or operational taxonomic units.

So what are biomarkers and how do we find them? What are the phases for identifying these biomarkers? There's a discovery phase, where a number of bioinformatics tools look at the data coming from the sample: the sequence data goes through some sort of quality control and biomarker quantification, and then you apply some kind of statistical or computational method to identify, from within that sample, which biomarkers are associated with and specific to that particular type of sample. And then there's a validation stage, where once you've identified these biomarkers you actually go and test them: you design primers, these biological hooks, if you like, which go into your sample and try to pull out your biomarker of interest. And once you have your primers you can apply qPCR, as an example, and see how often you are actually able to identify these markers within your community.

So we can split this up into six different processes. First of all, we have to plan our analysis: what is the "bio" part of the biomarker we're going to go after, and what is the "marker"? We need to obtain our biological samples and extract the DNA. Then, from this, identify the potential biomarkers associated with these samples, and these could be OTUs or functional genes, for example. Validate the biomarker abundance using a separate set of samples. And then from that process we hopefully identify and select some useful biomarkers that we can apply in the real world.

Okay, so how do we put the bio into biomarkers? What are we looking for in these samples? We could look for specific bacteria, using 16S or shotgun surveys, for example.
We could do some kind of shotgun sequencing and identify all the DNA or genes associated with our samples. The bacteria are the best studied, and the ones which have the most methods developed. We might also want to look at viruses. Again, we could use a shotgun or an amplicon kind of approach based on particular viral genes. One of the issues if we're looking for viruses within a sample, though, is that it can be challenging to get enough DNA. We might be able to get over this if there's some kind of population burst, or if we're identifying individuals who are infected with a virus; if we get them at the right time, we might find that they have a burst of viruses. And then in terms of eukaryotes, similar to bacteria, we can also use an amplicon-based method, 18S or ITS. Generally eukaryotes have pretty large genomes, and the protein-coding portion varies from maybe 2% to 30% of the genome. These large genomes can make shotgun sequencing, as a route to biomarkers, somewhat challenging. But again, we can use similar methods to what we have for bacteria.

So what kind of biomarker do we want? We can have a taxonomic biomarker, using amplicon or shotgun sequence data. One issue with this is that you can have strain-level diversity, and this can lead to false positives and false negatives. For example, if you're interested in a particular species and you're trying to find it within a sample, but there's strain diversity where there's been some divergence in the sequence, your biomarker might not recognise that particular divergent strain. That could be a challenge and give you a false negative. Also, in terms of taxonomy, a lot might depend on the actual environment. Depending on the diversity inherent in the environment, whether it's a soil sample, which as we know is much more diverse than a deep sea sample, which might in turn be more diverse than, say, a kimchi sample, the variability within these environments can make it challenging to identify suitable taxonomic markers.

In addition to taxonomic markers, we have gene-based markers. These require some kind of shotgun data, whether it's DNA or RNA. And you might be after a specialised gene, where you don't really care which species is providing it, you just want to know whether this gene is there or not: if this gene is present, then it's indicative of a certain condition, perhaps associated with disease. There was one example, and I don't know how many of you are familiar with it, of certain individuals prone to high blood pressure who were taking a particular statin, I think back in the early 2000s. For certain individuals this statin wasn't actually working, and it turned out that there was one particular microbe, associated with a certain group of people, which had the ability to break down this statin, which meant it was no longer available to the patient. This was a microbe that was in very, very low abundance and wouldn't have been detected through typical 16S or metagenomic surveys. But nonetheless, it's a marker which, if you have it, is going to cause this particular statin not to work for you. So this is where the idea of sufficient sequencing depth comes in, to make sure that you have enough depth to reach these kinds of specialised genes. You can also be misled by the domain architecture.
So if you're thinking about designing a biomarker and you've got a particular gene, you've got to look at the composition of that gene and the various domains that make it up. If you target one of those domains and it turns out to be a very prevalent, very ubiquitous domain found in a lot of other proteins, then maybe that's not going to be very useful.

Outside taxonomic and gene-based biomarkers, there might be other types of biomarkers. You might use something like diversity metrics. We know, for example, that a soil sample is incredibly diverse, so if somebody gives you a sample and it's incredibly diverse, that in itself is a biomarker which tells you this is likely a sample coming from soil. So you could use an indication like that to say something about a particular microbiome. Another idea is that you could use microbiome analysis to suggest other types of markers, for example metabolic markers. This is where we could go back to metatranscriptomics: we perform a metatranscriptomic analysis, and one of the outputs might be a particular pathway that's up-regulated, which might indicate a specific metabolite being produced and going into the host. Could we use that metabolite as an indicator that you have a certain type of community? So those are the different types of biomarkers.

Okay, so how do we go about biomarker selection? What is biomarker selection? Biomarker selection is the process of getting rid of all the non-informative or redundant OTUs, in this case, from an analysis. So what makes a good biomarker? What are we actually searching for when we're trying to design one? First of all, we want a biomarker that is able to discriminate between the different classes of samples that are out there. I know this is a bit washed out, but I hope you can see that there's a red line here and a blue line here. This is an ideal situation: these are the sample frequencies, and these are the abundances of two different taxa. This particular taxon is abundant at a certain level in the red samples, and this one is abundant at a different level in the blue samples. Because they're very far apart, with no overlap in their abundance, it's quite easy to discriminate: if you come across one of these OTUs at a given relative abundance, it's indicative of that particular class of sample. On the other hand, you can have a large amount of overlap, and then if you see a certain abundance of this red taxon, it might not be so easy to associate it with the red type of samples versus the blue type of samples. So we're really looking for things which are able to discriminate between these two different types of samples.

So again, we're looking for class means that are far apart. We also want low within-class variance, so that we have consistent OTU abundances within each of these groups. And ideally, from a statistical perspective, the abundance across samples follows a normal distribution, which makes the statistical calculations a lot easier.

So here might be an example of three different OTUs that we could apply to separate these red samples from these blue samples. Out of those three OTUs, which ones might we continue using to separate these two classes? OTU 1 is very good: the difference between the red and blue classes is large, so it's very discriminatory, with a clear, consistent difference.
OTU 2 is a little bit problematic: there's some overlap between the OTU's abundances in the two sample groups, so we're not sure whether this would be a useful measure for this particular set of classes. And then we might have some OTUs where there's no difference at all between the samples, and those would be of absolutely no use. So again, just to emphasise: a good biomarker should be able to distinguish between different types of samples.

So what else makes a good biomarker? Well, there may already be a number of biomarkers out there doing a reasonable job, so you need to think about what value you're adding compared to current testing procedures. For example, if you're looking at coliforms in water samples, maybe one of the values we can add is to make these kinds of tests a lot more accessible, particularly for people who don't have access to the same kinds of facilities that we have in many healthcare settings. So if you're thinking about developing these biomarkers, can we add some value over the existing tests that already enable us to discriminate between samples?

So once we've done some kind of sequencing of our samples, how do we go about using statistical or computational methods to identify the most optimal biomarkers associated with those samples? There are a number of statistical techniques you might think about applying. These can be incredibly simple: you might just want to do a t-test. Or you might do something more complex, and you could think about writing your own statistical analysis, in R for example. Or you could use some of the more complex methods that have been covered by others; a couple of tools are shown up here, NES, E, and GeneOverse. These are tools that you can feed your data into, and depending on what samples you've given them and what labels you've given those samples, they can suggest appropriate biomarkers for you. Again, and I hope this was highlighted in this morning's tutorial and lecture: whatever you choose, it's really important for you to understand what the statistical method is doing and what the output is actually going to be. You need to understand your assumptions about the data, you need to know what assumptions these different methods are making about the data, and where the statistical methods are going to fall down, what their limitations are. You also need to correctly interpret the results from the output of the statistical method. If you remember, going back to that slide I showed this morning where three different methods, MetaCV, GIST, and NBC, were applied to identify taxonomic groups from those mouse samples, they all gave different results. Anybody might choose any one of those programs, apply it to their particular sample, and end up with a completely different result that they're going to publish, depending on which program they actually used. It's the same kind of idea here: the statistical methods are probably about as good as each other, well, some might be slightly better than others, but you need to correctly interpret what the results actually mean and not just take the output at face value.
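Just to make the simplest end of that spectrum concrete, here is a minimal sketch, with made-up abundance numbers, of what a per-OTU screen could look like: a t-test per OTU, a non-parametric alternative alongside it, and a Benjamini-Hochberg correction because many OTUs are tested at once. NumPy and SciPy are just convenient choices here, not tools from the lecture, and the data are purely illustrative.

```python
# A toy per-OTU screen: Welch's t-test, a non-parametric alternative,
# and a Benjamini-Hochberg correction across all OTUs.
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(0)

# Made-up abundance table: rows are OTUs, columns are samples.
# OTU 0 is constructed to genuinely differ between the two classes.
n_otus, n_red, n_blue = 50, 12, 12
red = rng.gamma(2.0, 1.0, size=(n_otus, n_red))
blue = rng.gamma(2.0, 1.0, size=(n_otus, n_blue))
blue[0] += 5.0

t_p = np.array([ttest_ind(red[i], blue[i], equal_var=False).pvalue
                for i in range(n_otus)])
# Mann-Whitney U drops the normality assumption, which matters for
# sparse, skewed abundance data.
u_p = np.array([mannwhitneyu(red[i], blue[i], alternative="two-sided").pvalue
                for i in range(n_otus)])

# Benjamini-Hochberg adjustment of the t-test p-values.
order = np.argsort(t_p)
q = np.empty(n_otus)
running = 1.0
for rank in range(n_otus, 0, -1):
    i = order[rank - 1]
    running = min(running, t_p[i] * n_otus / rank)
    q[i] = running

for i in order[:5]:
    print(f"OTU {i:2d}  t p={t_p[i]:.2e}  BH q={q[i]:.2e}  MWU p={u_p[i]:.2e}")
```

Whichever test you pick, the ranking it produces is only a starting point, and the same caution about assumptions and interpretation applies.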
So I guess what you're saying is that right now it's better to compare within your own study than to make too many extrapolations to other studies, because they might have used different methods?

Absolutely. And this was a major motivation for comparing the deep sea samples with the mouse samples and so on: each of those papers used a very different pipeline for doing their analysis, and it's only when you use the same standardised tools in the same way for each of those datasets that you can really infer any kind of meaningful comparison. For this kind of biomarker work as well, make sure you understand what the limitations of your methods are and what the results actually mean in the context of your experiment.

So this all sounds great, but we know that biological data is a little bit messy. It's not ideal; it's difficult. A lot of biological data is correlated, so it's not independent: we know that certain taxa like to be found with other taxa, so there are correlations between which OTUs you might find in association with each other. And microbiome data in particular is quite sparse. You could have, say, a hundred different samples, and amongst those hundred samples you might find a huge number of different species, and many of those species might appear in only one or a handful of samples. Most of that matrix of samples versus species is going to be very, very sparse, and statistical methods on the whole don't handle sparse data very well. Statistical methods vary with respect to how correlation and data sparseness can be taken into account, so depending on what your data look like and how sparse that abundance matrix might be, you might want to think very carefully about which statistical method to apply.

So how do others deal with this problem? Well, you can ignore the correlation and use non-parametric methods. This is a very common approach. What's easier might not be right, but it is what's easier, and then it's a question of how low the bar is and how low you actually want it to go. The bar is fairly low at the moment, but maybe as we go through more of these kinds of experiments the bar is going to start rising, and we will need to be a little bit more consistent in how we treat these different datasets. Some people are also using more complex approaches that can take this correlation and sparseness into account. The issue with these more complex approaches is that they can be very computationally intensive and also very time-consuming to use. Also, if you use these complex approaches, again, it's like using MetaCV or NBC or GIST: you really need to understand what is actually going on so you can correctly interpret what the output data look like at the end.

So what are these mysterious statistical methods? There are two types of methods which are generally applied, depending on whether you're trying to predict some kind of label or some kind of continuous variable. We have classification methods, where we might have a set of samples and we want to classify them into one group or another. These try to make some kind of inference about which class the samples belong to, for example somebody coming down with a certain disease. As an example, maybe some kind of cancer where the prognosis is very poor versus a cancer where the prognosis is very good.
Can you classify the samples that you get into some kind of predictor of what the prognosis is going to be? The other type, beyond classification, is regression analysis. This attempts to predict the future value of some continuous variable. So rather than classifying samples into whether a patient is going to do well or badly, with a regression analysis you might ask: what is the expected lifespan of this person, given these samples?

These methods generally belong to one of two categories: supervised methods versus unsupervised methods. In supervised methods, you know that the samples come from known classes. We have these red classes and these blue classes, and we can use the knowledge we have about those classes to optimise the model's ability to identify which of these different OTUs are able to optimally discriminate between the different sample types. So supervised learning is a relatively easier kind of approach. The other type of approach is unsupervised. This is where you know nothing about the samples; you only know that you have differences between the OTUs, and so you're really allowing the data to drive, if you like, the ability to discriminate between different samples. Here we can see that sample three and sample five vary compared to the other samples, so maybe there's something interesting that sample three and sample five share in common that the other samples don't. This is kind of interesting because it can enable you to identify new relationships that you might not have known about before, so it's quite a powerful discovery tool, if you like. And I'm lying a little here, because there are actually semi-supervised learning methods as well.

All right, so some advantages of supervised learning: it's easier and faster to do, that's great, or at least relatively easier and relatively faster. It allows one to create a much simpler study design as well, so the biomarkers can be more robust and relevant, and we might not have so many assumptions to take into account. And the biomarkers are easy to validate, because we know what we're looking for: we know what the classes are and how we're able to discriminate between those classes, so as long as we have additional members of those classes we can validate our approach. One of the disadvantages is that these classes might not be well-defined, so it might be difficult to find biomarkers that really support these predefined classifications. Rather than having two distinct classes, there might actually be a spectrum of different classes that all overlap to some degree, and that could complicate things to some extent.

The advantage of unsupervised learning methods is that they don't assume a class structure, and so when the data is driving the analysis you can actually come across novel relationships. This has the potential to advance knowledge in the field. The disadvantage is that they can be difficult to evaluate.
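As a rough illustration of that contrast, here is a small sketch on a made-up abundance table; scikit-learn, the classifier, the clustering method, and all the numbers are illustrative assumptions, not something from the lecture. The supervised model can be scored directly by cross-validation because the labels are known, whereas the unsupervised clustering produces groups whose meaning still has to be worked out afterwards.

```python
# Supervised vs unsupervised on the same made-up abundance table
# (40 samples x 30 OTUs; the first three OTUs separate the classes).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.gamma(2.0, 1.0, size=(40, 30))
y = np.array([0] * 20 + [1] * 20)   # known labels: red vs blue samples
X[y == 1, :3] += 4.0

# Supervised: labels are known, so accuracy can be checked directly,
# and feature importances point at candidate discriminating OTUs.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
clf.fit(X, y)
print("top OTUs by importance:", np.argsort(clf.feature_importances_)[::-1][:5])

# Unsupervised: no labels are used; the clusters may or may not align
# with anything biologically meaningful, which has to be checked afterwards.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", np.bincount(clusters))
```

The difficulty with the unsupervised half is exactly that last point: there is no accuracy number to read off.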
So if you do have some way of discriminating between different samples, but you had no prior conception of what the differences between those samples are, then working out what those differences are, what is really separating and driving those two groups of samples apart, might not be that intuitive. It might require very large sample sizes to do this properly, and consequently it can be more difficult and much more computationally intensive as well.

So there are two ways to come at this problem of identifying suitable biomarkers, whether supervised or unsupervised: you can come at it from more of a computational, machine learning angle, or you can come at it from a statistics angle. It's interesting to see how the different terms are applied, but it's basically the same thing. Machine learning describes networks and graphs, where statistics would just talk about a model. Machine learning talks about generalisation; statistics talks about test set performance, how well we're able to recapitulate our findings from the training data on other datasets. Here we have unsupervised and supervised learning; there we have regression, classification, and density estimation and clustering, which are very well described in those approaches. There's also a bottom line here: in machine learning a large grant is something like a million dollars, whereas for a statistics person it's more like fifty thousand dollars. So they're using similar kinds of metrics, or similar kinds of ideas, but there's the idea that these are two very different communities. And at the bottom here you can see that the machine learning people apparently like to ski in Utah, while the statistics people like Las Vegas, presumably trying to win money. So the best approach in all of this is really going to be the one that works best for your own dataset.

So, everyone's seen Indiana Jones, haven't they? Has anybody not seen Indiana Jones? Really? Oh, there you go, there's your Saturday night sorted out. This slide might make sense to you after that. So yes, just to emphasise again: be very aware of the choices you're making in choosing the approaches for building your biomarkers, identifying them and validating them.

Okay, so now we have the biomarkers, and we need to validate them. How do we validate biomarkers? Once we've found the gene or taxonomic group that we want to use, we need to design some kind of test for it, and qPCR is a good option. If we're going to design a qPCR test to identify this gene in our particular sample, we need to identify a region within that gene that is going to be specific to that gene. So if we're using a specific gene, we need to identify a region within it which doesn't look similar to anything else that you're going to find in these samples, so that it doesn't get confused. There's a tool called MetaPhlAn 2 which does this automatically, and there are other approaches, like clustering reads, aligning them, and seeing which are the conserved regions and which are the specific, representative regions. So when you're designing these biomarkers, you need to think about bringing in all the different sequences that look similar to your sequence, and making sure that when you design your primer, and the region you want to use as your biomarker, it's not overlapping with anything else that you might expect to find in these samples.
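To illustrate that idea of picking a region that looks unlike its near neighbours, here is a minimal, hypothetical sketch with toy sequences, not a real tool: slide a window along the target gene, and keep the window with the largest minimum number of mismatches to all of the similar sequences, in other words the least conserved stretch. It assumes the sequences are already aligned and the same length, which real data would not guarantee.

```python
# Toy "find the most target-specific window" scan over pre-aligned,
# equal-length sequences (hypothetical data, for illustration only).
def most_specific_window(target: str, relatives: list[str], width: int = 20):
    best_start, best_score = 0, -1
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        # Score = smallest number of mismatches to any relative; a high
        # minimum means the window looks unlike *all* of the neighbours.
        score = min(
            sum(a != b for a, b in zip(window, rel[start:start + width]))
            for rel in relatives
        )
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score

target    = "ATGGCTAAGCGTTTACCGGATATCGCTGAAACCGTTGGCAAATAA"
relatives = ["ATGGCTAAGCGTTTACCGGCTCTAGATGAAACCGTTGGCAAATAA",
             "ATGGCTAAGCGTTTACCGGAAAGTGCTGATACCGTTGGCAAATAA"]
start, mismatches = most_specific_window(target, relatives, width=15)
print(f"candidate probe region starts at {start}, "
      f"min mismatches to relatives = {mismatches}")
```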
And then there are a couple of tools for designing the primers and probes; the last two listed here are examples.

Okay, so this is a case study, and I think this is associated with Will's group as well. Are you involved in this project, Will? Yes. Yeah. So this is an example of a fast-track approach. Do you know why it's called a fast track? Because by sequencing everything through metagenomics all at once, you have a very quick method of identifying the sequences which discriminate between these different types of water quality samples. The initial step was to do bacterial shotgun sequencing using the HiSeq platform, and then compare river water at an agriculturally unimpacted site versus two sites that were impacted. So, a simple comparison.

They used MetaPhlAn. MetaPhlAn offers relatively low sensitivity but high precision: it might not capture all of the differences within the sample, but those it does capture are very precise. It's based on clade-specific gene sequences, so it can identify which gene sequences are associated with a limited set of taxa; it has around 3,000 reference genomes, and it uses these to work out what is specific and what might be more variable and therefore less useful in the analysis. MetaPhlAn is also pretty fast: it goes through these metagenomic reads quickly, in about 10 minutes.

The first step was to process and validate the data: quality trimming, normalising the data across samples, and some kind of mock community validation. To make sure that you can identify these kinds of biomarkers within a community, you might want to try a mock community. This could be DNA-free water that has been spiked with DNA from cultured bacteria. Just as an example with MetaPhlAn, and this really shows the low sensitivity, only 70% could be assigned to a species; however, the precision was high, with 84% of those assignments being correct. So the idea here is that we're using a mock spike-in as a way of validating this kind of approach, validating the entire method of running MetaPhlAn and recovering biomarkers from what was spiked in.

Then this was applied to the various samples collected from these sites, about 15 or so different sites: upstream, at the site, and downstream of these sites. They prioritised the high-abundance taxa, I think these are the top 30 or so taxa found in these samples, and there are two that clearly stick out, taxon one and taxon two, which are incredibly good at differentiating between the unimpacted site and the other two sites. And there were statistical methods used to identify these as well. There were about 60,000 reads assigned to taxon one and only about 2,000 reads assigned to taxon two, so taxon one was in much higher abundance. Previous work has suggested that you should focus on the biomarkers which are in the highest abundance, so taxon one was the one that was preferred and prioritised for subsequent validation.

So for taxon one you can extract the taxon-specific sequences from the MetaPhlAn database; there are 607 sequences which are specific to taxon one. Reads were aligned against these sequences, and the idea is that when you take your sample, you've got these 607 sequences which are specific to this taxon.
But not all of them are going to be equally represented within your sample; some of them will be easier to detect than others, and so you want to choose the regions which have the most hits, because these are the ones which are going to be easier to detect. Just to give you an example graph: if this is a marker sequence up here, you might have some kind of consensus sequence here, and you want an area which has the highest coverage. So when you're aligning all of your reads, all of your metagenomic reads, to this marker sequence, you might find that this region here is the one we might want to design some kind of probe against.

Then they used Primer3 for the primer and probe design. They looked at the relative occurrence rates of the different candidate primers, and they considered matches that were either exact or had just one or two mismatches. So this is, again, looking at the fidelity of the matching of the biomarkers, and you choose those sequences which minimise the non-specific matches. And when they analysed these different sites, they found that there was really good discriminatory potential between those sites and their treatments: when they did this analysis, they could amplify a product of the right size, and it did seem to be able to discriminate between these two different types of samples. The next stages are going to be testing and improving the qPCR to make this a very, very consistent test.

So just to summarise this case study: the initial analysis identified a taxonomic marker based on the differential abundance of a bacterial species. MetaPhlAn was then used to identify the sequences specific to that particular species, and then you map all of the metagenomic reads onto those 600 or so sequences to identify a marker which has a relatively high abundance of reads mapping to it, which suggests it's going to be a very sensitive kind of approach.

The benefits of this so-called fast-track case study, where you apply metagenomics to do the discovery for you, are that it's fast: you can go from sequence data to PCR primers in a couple of days. And it's relatively simple; it doesn't require a large amount of processing power. However, it does have some limitations. It depends on the differential abundance of known bacteria, and we know this can be an issue, for example from metatranscriptomics approaches where we can't map some sequences to known genes because there's a large amount of diversity. So if you don't have known bacteria, or if they're sufficiently divergent from anything in a well-developed database, then it's going to be difficult to identify sequences that are specific to the particular bug you're looking for. And tying into that a bit as well, it also depends on the taxa: some tend to be more variable across environments than others. You might find that some are going to be present and some are going to be absent, and again it goes back to the sparseness of the data: some samples are just going to have this bug and some samples won't. So it's going to depend to some extent on how much variability there is across the environment.
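Going back to the coverage step described a moment ago, here is a toy sketch of how one might pick the region of a marker sequence with the highest read coverage before handing it to something like Primer3. The read positions here are simulated for illustration; in practice they would come from an actual alignment, and the window width is an arbitrary assumption.

```python
# Pick the highest-coverage window of a marker sequence from simulated
# read alignments (in practice the read positions would come from a BAM).
import numpy as np

marker_length, read_len, window = 300, 50, 60
rng = np.random.default_rng(2)

# Simulated read start positions, biased towards the middle of the marker.
starts = rng.normal(loc=150, scale=40, size=400).astype(int)
starts = np.clip(starts, 0, marker_length - read_len)

# Per-position coverage from the read intervals.
coverage = np.zeros(marker_length, dtype=int)
for s in starts:
    coverage[s:s + read_len] += 1

# Mean coverage of every window of the chosen width, via a cumulative sum.
csum = np.concatenate(([0], np.cumsum(coverage)))
window_means = (csum[window:] - csum[:-window]) / window
best = int(np.argmax(window_means))
print(f"best window: positions {best}-{best + window}, "
      f"mean coverage {window_means[best]:.1f}x")
```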
All of that might suggest that gene functions are more appropriate to go after under those circumstances, where you don't really care about the specific taxon, you care about the function that that taxon is actually providing. And just to emphasise, this is a really low-hanging-fruit approach. It's a good first step, but I think as we do more sequencing we're going to develop more specific methods for how we go about doing this. Yes?

So this is like a gene signature, right? Why not follow what's already been developed in the cancer field, where they find a gene signature, then cross-validate it in another sample set, say with leave-one-out cross-validation, which is a statistical procedure that's already developed: are you able to correctly discriminate the sample you left out? That's the kind of thing that's already developed in the cancer world, where it's used to differentiate very closely related cancer types, like leukaemia and lymphoma. So that's the approach, right? When we started on the gene marker I thought you were trying to find a single gene; why not try to find a group of genes, a signature, to discriminate two or three classes of the same thing?

Yeah, I don't see any reason why you can't do both. I guess this comes back to the original issue of what value you're adding with the biomarker. If there's an existing test, what is the actual value you're adding by creating a new one? Perhaps, as you suggest, a suite of different markers might be more indicative of a prognosis and give you greater detail on the subsequent prognosis for a patient than relying on just a single biomarker. So that might be exactly, as you say, one reason why you might want to extend this approach to multiple markers.

We're almost done. Okay, so what are some of the alternative methods? We could do a more complete taxonomic analysis, with Kraken for example, or a couple of other programs. In terms of gene function analysis, we can go back to our friend MEGAN and the KEGG database: can we identify enzymes or particular functions that are indicative of your particular sample? There's also something called Real Time Genomics, which uses these things called FIGfams, which I believe are part of the SEED database as well. And we can also think about some kind of cluster-based analysis: the idea here is that we can predict the proteins ourselves, do some kind of clustering, find different clusters, and then design PCR primers based on all the genes within those clusters. This probably feeds into the idea of finding multiple markers, where these clusters of genes are indicative of a certain type of class, and it gives you a more complex kind of readout.

So how do we design the actual test? Once we have the biomarker, how do we then build that test? Once we've identified the discriminative taxa or functions, we find an informative region for the primer, we design our primers and probes or whatever, and then we can validate these primers in silico, and then validate them in vitro. So that's just a kind of workflow diagram, if you like.

So finally, just remember there are other community-level markers out there. Diversity, as we mentioned before, the Shannon index, might be one way that you could actually indicate whether an ecosystem is healthy or not.
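Since the Shannon index keeps coming up as a community-level marker, here is a small sketch of the calculation itself on two made-up count vectors, one fairly even and one dominated by a single taxon; the numbers are purely illustrative.

```python
# Shannon diversity H' = -sum(p_i * ln(p_i)) over the observed OTUs.
import math

def shannon_index(counts):
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

even_community      = [120, 95, 80, 60, 55, 40, 35, 30, 25, 20, 15, 10, 8, 5, 2]
dominated_community = [900, 60, 25, 10, 5]

print("more even community    H' =", round(shannon_index(even_community), 2))
print("dominated community    H' =", round(shannon_index(dominated_community), 2))
```

The more even community comes out with the higher index, which is the sense in which a single diversity number can act as a crude marker for the kind of sample you're looking at.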
And as you do microbiome analysis, you might find other types of screening tests that could be equally informative. I saw a talk from Dave Wishart, and some of you might have also seen this, where he's been applying metabolomics approaches, these untargeted methods. You don't actually know what the metabolites are; you just get these profiles of peaks, and the peaks represent individual chemical entities. Using this untargeted approach, he applied it to patients who, if I remember correctly, were potentially suffering from diabetes. That was back in about 2005, and by 2010 he could show that the metabolome profiles he obtained in 2005 could have predicted what the prognosis was going to be five years later. So metabolomics is potentially offering a really powerful route for this kind of biomarker discovery and predictive capability. If we can use these microbiome analyses to identify other types of screening tests, such as which metabolites are found in association with different types of microbiome, the metabolites seem to be an incredibly powerful way of discriminating and separating different types of health outcomes.

And finally, as a last emphasis on all of this, biomarkers are only as good as the data they're based on, so you need to design your experiments carefully and include positive and negative controls. You also need to be very careful about how you interpret the data, and understand the methods that led to the generation of that data, so that you understand the limitations in interpreting those datasets. And that is it. Well done for most of you staying awake. All of you staying awake. All right. Are there any... no? Okay, then just a few announcements to follow.