All right, welcome back from lunch, everybody. Who's relieved that the hardcore bioinformatics is over? So this afternoon we're going to talk about sample collection biases, and integrating sample handling into all of this as well. In a lot of ways it's still intense: there's still a lot to think about, and not a lot of hard answers much of the time, but I'll share with you what I've learned from a decade of making a lot of mistakes in this area. All right, that's the slides up here.
Okay, so these are the learning goals, again, for today's lecture. We're going to talk first about sample collection, but it's really two main parts, and then there are some subsections within sample handling that we'll talk about. But really, when you're thinking about sample collection and sample handling in microbiome work, you need to be thinking about what it is that you're going to try to detect. What is your question, and what kinds of biomolecules are you interested in? It's kind of like looking at the microbiome through different sets of glasses: what you're going to see depends on whether you're set up well to see it. So you need to think about that ahead of time. I think where a lot of people in the field have gotten into trouble is taking from the literature, which focuses heavily on DNA stability and processing, collecting their samples based on best practices for that, and then nowadays saying, "Oh, I want to do other stuff, I want to measure carbohydrates and metabolites and proteins." Well, were you set up from the beginning of your study to measure those things or not? My lab's work, and I would say in general the work we're doing in the International Microbiome Center, is heavily focused on multiomics. So we are all about preparing and preserving our samples to be able to do everything, to look at every type of biomolecule at some point.
And also we're adding culture capacity into a lot of our studies, so that we get all this material, it's super valuable, and we can use it for all kinds of analyses in the future. That takes a lot of work, a lot of planning, and a lot of time. So if that's not what you're set up for, a lot of the data and tips I'll give you are also set up just for DNA, because that's where we've really done most of our standardization. A lot of people are only interested in making sure their DNA is okay, and that's good enough, and that's fine, but I'm going to paint the picture from both sides today because we are definitely interested in multiomics.
I'm just going to give you the overview, then, of how you set this up in your thinking. If you're just interested in who's there, then you're going to go for your amplicon sequencing or your shotgun sequencing to get your taxa, and you just need DNA for that. But if you're interested in what they can do, you're going to have to go with shotgun sequencing. These are all things that you've become very familiar with this week. And then, from John's lecture this morning, if you want to know what they are doing right now, that's where you get into metatranscriptomics and potentially, and I think there's growing capacity in the field for this, proteomics and metabolomics. You can see that when you get into this last bit there are all kinds of biomolecules that you may be interested in and may want to capture. But it becomes not so straightforward how you design, collect, prepare, and process your samples to let you look at all of these diverse types of biomolecules in the future. What we've come to as the sort of simple solution to all this in the microbiome field is: freeze as soon as possible, store at minus 80, minimize freeze-thaws. Cool, right?
How hard could it be? It turns out that there actually are a lot of things that come up with this. So this slide shows you the schematic of sample collection. In my role leading the IMC core facilities, which I talked about yesterday, we have certainly seen all kinds of samples. I have processed most of these types of samples except for plants and soil, but I put them on here because I know there are a number of environmental folks. You might be working with whole animals, insects, worms, or tissue samples. Of course we've got our fecal samples. Different types of liquids: you could be sampling water, or urine, or breast milk. We've got all kinds of swabs that capture the surface material and the mucosal fluid from different body sites, and saliva. Anybody know what this one down at the bottom is? Anybody know what that is? Yeah, it turns out to be a great way to get cervicovaginal fluid, so that's a common sample type now as well. So you have all these different types of samples that you might be collecting, you have different devices, and then you have different ways that you need to extract your biomass and get your microbial biomolecules out.
Before you get there, though, you have all these steps. Sample handling really has four components: transport, storage, processing, and then your biobanking or archival. These are all places where you can introduce bias, and you can't go back. We'll talk more about that. And then you need to extract the biomolecules that you're interested in.
Like I said, sometimes there are mitigating factors or things that come up. You say, okay, my plan is to freeze as soon as possible, store at minus 80, and minimize my freeze-thaws. In practice, though, you get into the weeds. Well, what if my sample is at room temperature for one hour?
What if it takes eight hours for courier transport? What if the samples are held at minus 20 in a participant's freezer for one week before they get to my minus 80? There are so many what-ifs when it comes to this idea of freezing at minus 80 as soon as possible. The same goes for the sample handling part, when you want to minimize your freeze-thaws. Okay, so I'm going to thaw my sample and make my aliquots so I don't have to do too many more freeze-thaws. But what if that process takes four hours and everything's on ice: is that stable enough? What if I need to do a 30-minute spin for my process and I don't have a refrigerated centrifuge that will work? What if I didn't add protease inhibitors and I decide later that I want to do proteomics? All of these things compound into lots of what-ifs.
Where we're at right now in microbiome science is that it's such a broad field, there are so many sample types, and there's so much variance in accessibility and in being able to adhere to these best practices, that what you really have to do is plan, incorporate good controls, and have systems for evaluating and monitoring for batch effects and for problems. Try to stick to your plan, but with the monitoring, adjust as needed. That's really what it boils down to. There's no simple "this will always work", and that applies especially to these smaller details. But I will share as we go on some of the things that we've learned along the way, and some of the things that the literature is showing to be efficacious when it comes to these processes.
Okay, so with sample collection we have some technical considerations. The goals here are to maximize our microbial biomass, and in microbiome science that's not always easy. It's not just the biomass of what you're collecting, but that microbial component. That's what we want to measure, especially if you want to look at the taxa and the genes or proteins that they're encoding.
You need that microbial biomass more than you need the host biomass, but sometimes you can't separate them. So you want to maximize that while at the same time avoiding contamination, perturbation, or dilution. When I was working in this field I kept finding that a lot of clinical folks insisted, every time they wanted to take swab samples, "I'm going to put it in Amies transport media." Why are you putting it in Amies transport media? It was kind of like blinders were on from some of the historical processes of collecting swabs for anaerobic clinical microbiology. I'm not necessarily always trying to culture these bugs, so why am I putting it into transport media? A dry swab: don't dilute it, don't change anything, just keep the swab dry, freeze it, and then we go on to microbiome analysis. So you do sometimes find these ideas out there that don't fit. They worked really well with the way science was done before, but now that we're doing microbiome work we have to change our sample collection. You don't have to put the swab in Amies transport media unless you need to culture or you need it for some very specific purpose. Otherwise you just don't want to add anything or perturb the sample in ways that aren't specific for microbiome work.
I know this is going to be a little slanted toward humans, because that's where I work, and I know many of you work in that area as well. Of course we are considering the participant when we design our studies for sample collection. There are a lot of ethical considerations that are beyond the scope of my talk, but even from a technical standpoint we still have to think about it. We want to get the best sample quality and the best biomass we can, but we also have to couple that with participant safety, comfort, and compliance.
So, for instance, one of the things that's been explored and is still discussed quite actively in the field is whether we should be making people scoop their poop all the time. Who likes doing that? And honestly, for certain populations that's really difficult. I'm running a study with autistic kids, and they have a lot of sensory processing differences that make it extremely uncomfortable for some of them to go through this process of having to change the way they poop, and maybe having it right front and center for them after they're done. Of course the parents have to be heavily involved, but it's a lot for certain people in certain populations to do this. So the alternative is just to get them to wipe like normal and then give that to us. This was actually commonly used in one of the early studies, the American Gut Project. But then there was this reaction of, oh, that's not a good process, because they left it at room temperature and it's not enough biomass, and all this. A couple of studies have come out more recently that have gone back and re-evaluated it and have shown it works quite well. Still, it's hard sometimes to convince clinicians that they should just let people send in toilet paper instead of a nice big sample. So it's one of those things that we have to think about: this would be a lot easier if we implemented it more commonly.
Something else that's come up in my work, since we do a lot of swab samples, mostly vaginal swabs, though we've also processed and done studies with respiratory swabs and oral swabs: a simple way to get more biomass is to use a double-tip swab. The participant doesn't have to do anything extra; the swab just has two tips instead of one. There are all different kinds of swabs, and it does matter, and you kind of have to test for yourself. We found that foam swabs work the best for us.
So we use these double-tip foam swabs, and they work great. I think we get twice as much biomass, and we can split it as soon as we thaw it the first time. We split one off: in some of our studies we really do want to treat with protease inhibitors, so we have one protease-inhibitor-treated swab, and the other swab we keep pristine for our DNA and other work where we don't want to add anything. So there are some simple hacks to keep the burden on the participant down. Don't make them collect a bunch of different swabs, but you still get more material to work with.
And of course there are size and weight constraints. Yeah, you want a lot of biomass, but I can tell you, when you've got a whole container of poop it's almost a little too much sometimes, and they take up a lot of space. The McCoy lab people know what I'm talking about. So you have to think about these things in your design as well. Also, you often have to ship these things, and weight can really affect the cost.
Then there are the materials going into your kit. You have to plan around that, especially if you're planning to culture or you're working with low microbial biomass material. You can get microbial-DNA-free materials, tubes and so on, but they're very expensive. They have to be treated with a gas, ethylene oxide I think, or something like that. So if you have an extremely low biomass sample, where even a low amount of contaminating DNA is a concern, it might be worth it, but you have to think about these things ahead of time.
Cold chain or preservatives: again, ideally you maintain cold chain. That's sort of the standard in the field, and we'll talk a little more about this on the next couple of slides. Then there's oxygen exposure. This comes up a lot, mostly if you do want to culture.
We've had a lot of success with these simple little anaerobic bags. It's not just your regular Ziploc; it's got a special, extra-thick zip closure, and you get these small sachets that eat up the oxygen inside the bag. This has worked really well for us. We're culturing some of the anaerobes in our vaginal samples just by having the clinicians pop the sample into this bag before they send it over to us. So there are some simple solutions that can really improve what you can get out of these samples. Of course, there's cost, though: every little thing, even these little anaerobic bags, adds to the cost a lot.
Okay, so like I said, if you don't make smart choices with sample collection there's no going back. You got your samples, you got what you got. So here's a really nice review that goes through, really from the standpoint of DNA-based microbiome science, some of the difficulties with different sample processing steps and how they can lead to sources of error and bias. With sample collection, it kind of echoes the things I just said: inadequate sampling is really about the biomass, and then there's not getting the sample stabilized, not having the right biomolecules stabilized. In addition, you really want to avoid having any contamination in the sampling kits that you start with. And labeling is not always trivial either. Sometimes your swab is in a sterile package, and your participant wants to open up the sterile package and know that they have a sterile swab, but then there's no label on that swab. So they put the swab into a baggie, maybe you've put a label on the baggie, and when that comes to your lab you've got to make sure that the label gets moved onto the swab before it goes in the freezer.
So you really have to think about all of these little steps to make sure that you don't lose continuity of what's what.
Okay, so now we're going to go into storage and handling in more detail. Like I said, this has three main parts, and here we want to stabilize our microbes and biomolecules and then eventually partition, aliquot, and archive the biomass. This is highly sample-type specific. But what I've also learned is that it's not always just the sample type. We tend to think fecal samples are not low biomass, but you get some really young babies and all of a sudden they're kind of low biomass, so these things do matter. It's not just the sample type, it's the age of the participant. Sometimes it's the sex. When I started working with urine, I did not know that the biomass in female urine is like five to 10 times higher than the biomass in male urine. So you have very different requirements for the volumes you collect depending on whether you're working with male or female urine. These are things I've learned along the way. And of course with participants who might be very vulnerable, going through severe illness, there can also be dramatic differences in the samples you get from those participants.
All right, we've already hammered the idea that cold chain continuity is ideal. This really goes all the way through these four steps, though, and maintaining it all the way through is not always trivial.
We talked about maximizing biomass in the sample that you've collected, but you also have to recover that biomass and get it into the processing steps. This is not always simple. When it comes to swabs, getting the material off of the swab is not super straightforward.
We've actually worked out a little hack in my lab. I wanted to get a picture in here and I didn't get it in time, but there are these things called spin baskets. There's a variety of different ones, and we finally found one from one manufacturer that works really well. We clip our swab, put it into the spin basket, and centrifuge the material off, and we've really increased our biomass recovery with the use of spin baskets. So little details in your prep process can make a really big difference.
As you're processing your samples, sometimes it's important to actually quantify your biomass, because certain downstream analyses need to be normalized to weight. A prime example of this is metabolomics: it really only works quantitatively if you can normalize to weight. So if you want to avoid extra freeze-thaws, one idea that we've adopted is that when we do that first thaw and go into sample handling the first time, we create many, many aliquots, and some of them are weighed, especially for feces. We also quantify other things, like eluted material from a swab: for those we track volume, and we also take an aliquot off so that we can count cells. With a cell count and a volume we can infer some biomass numbers; with feces we weigh. In most of my studies I create five to 10 aliquots at that first thaw. Some of them are different weights or different amounts, and they're ready to go into different downstream analyses. If you hadn't quantified anything at the first freeze-thaw, then you're waiting until your second freeze-thaw to get that weighed sample you need for metabolomics. So these are choices that you can make in your own SOPs.
Okay, adding stabilizers. I mentioned this: for the kind of proteomics we're doing, we need to have protease inhibitors in there.
Again, we don't want to add them to all of our samples, because we don't necessarily want to inactivate enzymes. Sometimes we want to measure protease activity, and sometimes we want to measure proteins that haven't been acted on along the way. So we partition our sample and make sure we add the inhibitors at that first freeze-thaw, right away. We don't want to go through multiple freeze-thaws before adding something that's going to protect a biomolecule, because that gives opportunity for things to change. I can tell you an anecdote at the end if you want: we have seen that this actually happens. These enzymes are very active, and things will change even on ice.
Okay, like I said, we make many aliquots that are ready for downstream analysis. This takes a lot of time, but it's worth the time and effort. It does sometimes save time in the secondary steps: you can just pull your samples for metabolomics if they're all weighed out. We often also put an aliquot into a bead-beating tube, not with the lysis buffer, because that can cause problems, but just in the bead-beating tube, so it's already there in the tube ready to go. We do a lot of robotic extraction, 96 at a time, so we can pull our tubes, add the lysis buffer, and we're ready to go. So we do a lot of that kind of pre-planning in the sample handling procedures.
And like I said, you do have to be mindful of biosafety for most human and animal work, and also of DNA-clean procedures. What this means for us, since we're working with a lot of different sample types and a lot of them are lower biomass, is that we have a separate, closed-door room that we call pre-PCR, and we make sure that we don't bring lots of bug cultures in there and we don't bring amplified DNA in there. We also wear gowns, and we do a lot of extra decontamination of this space with bleach, because bleach will get rid of DNA and ethanol won't.
We also have a UV crosslinker in there, and we UV a lot of materials. Not everything, because some things will break down, but whatever we can UV or treat with bleach, we do, and that helps keep the DNA burden down in the space. We have a true biosafety cabinet, mainly to protect the humans from whatever diseases might be in the samples, and then we have some other, smaller hoods that are just PCR laminar flow hoods to try to keep out cells and dust and DNA. Dust is mostly microbial DNA and human DNA, so we try to keep all that dust out of where we're working on your samples. Those are pretty thought-out and planned setups for high-throughput microbiome processing, but a lot of labs can adopt things like that: doing the sample processing in a particular hood, or making sure things are wiped down with bleach before you start. There are lots of things that you can adapt, even in your standard lab.
Okay, any questions so far? Feel free to interrupt me, and we'll have time at the end for some chatter too, because this is not as long a lecture.
I wish I had a really great answer. I'm working on a review, and we're working on a big study as part of IMPACTT with the McCoy lab, and lots of you here have been putting lots of effort into that. We're trying to understand more about sample handling procedures and preservatives, and it's tricky. So far it's mostly been evaluated for DNA. There are lots of commercial options, and they're very widely used. OMNIgene is probably one of the most commonly used, but RNAlater and Zymo have been used quite a bit as well. OMNIgene now has options for RNA and metabolites, and of course RNAlater is good for DNA and RNA; people have used it for a long time. But there's less literature in microbiome science about the effect of preservatives for RNA and metabolites than there is for DNA. So, the DNA literature.
It's really hard to compare and do a meta-analysis. On my next slide I'll show you one meta-analysis that just came out, which I'm still working on understanding, but it's just hard because so many studies do so many different things. Most say that there are some minor differences when you add preservatives. This is not as easily seen with diversity metrics, if that's all you care about. But if you really want to look at enrichment of taxa, there can be some differences. And like I said, I think it's still to be seen whether we really are seeing any consistent differences; with so many different procedures and sample types, it's still hard to iron out. A lot of labs go with one of these preservative methods for big studies, and then they want continuity with their other studies. So certainly, if you want continuity and cohesiveness, sticking with preserved samples is good. But then for cross-study comparisons, if you're interested in comparing to existing data out there where they didn't use preservatives and you did, you might have some bias there. There's some potential for that, for sure.
And then, interestingly, I do think there's evidence in the literature. This is just one study, but the one we're working on is showing the same thing, there are other studies, and I think the meta-analysis on the next slide also keyed in on this: for DNA work alone, there's quite a bit of evidence that room temperature for a week is pretty much fine. That can actually simplify things and reduce costs a lot for microbiome studies. But I think people have been a little shy about it, thinking that it's better to go with a preservative, and I'm not sure the evidence suggests that a preservative is better than room temperature if you're only going to be doing DNA.
But the thing is, the preservative is not going to let you do enzyme activity assays or proteomics. You can't do that right now with preservatives anyway, so you're already limited either way.
Interestingly, too, in what we're seeing in our study (we're going to present a little bit of our study at IMPACTT next week, and it's not yet published, so I didn't want to put the data in), we're seeing an effect of homogenization. There is some evidence of this in the literature as well. I'm still working on understanding why myself. And I think it matters a little bit when you homogenize; one study, this one by Zostac, said that the length of your homogenization matters. So it's a little hard to iron out still, but it's a factor, and maybe it's a more important factor than some of these other things, preservative or not. It could be that when you put things in preservative you're kind of homogenizing at that time, right? You're diluting it and mixing it up with the preservative. So I think it remains to be seen what's really important: is it the homogenization, is it the preservative? And what is the best practice with that?
So this is the meta-analysis that just came out earlier this year. It's one of the first, I think, that's really tried to aggregate all of this literature. Take a look at it, but there's a lot to go through.
Okay, I'll talk a little bit about extraction. Historically, this was thought to be one of the biggest influencers of inter-study variation. Part of the question was just: are you really going to lyse all of the bugs in your sample? Is your extraction kit contaminated with microbial DNA? That's still actually not necessarily trivial to avoid.
And do you have cross-contamination? If you're doing high-throughput extraction, there's always potential for cross-contamination in this process. What's become clear over the years is that bead beating is critical. The thing is, though, its effectiveness is intertwined with the lysis buffer and how you agitate, so sometimes it's a little difficult to tease out. In some studies (I think I have the reference coming up) they've compared a kit that's designed to work with bead beating against a kit where they just added a bead-beating step, and sometimes the one where they just add it on doesn't work as well. We've also had the problem, and maybe some of you have too, where you try bead beating with a kit that's not designed for it and you end up with just a foam mess. Not all lysis buffers can handle that degree of agitation without creating a huge foam mess. So you have to think about not just the bead beating but the whole package for extraction. The bead type and size matter too. We're working on a study that's going to evaluate that, and there's not a great literature on it. But certainly if you're interested in both fungi and bacteria, this is where it becomes really important, because you need bigger beads to lyse fungi than bacteria. That's why a lot of kits now put a combination of sizes together. What's become the go-to is the Qiagen PowerSoil Pro or PowerFecal Pro. These are essentially identical.
They're often a go-to solution. In the study I mentioned where they saw the difference, I think it was the ball study, the comparison was between PowerSoil Pro and another Qiagen kit to which they added bead beating, and PowerSoil Pro was way better. So these are just a couple of studies that have shown that PowerSoil Pro or PowerFecal Pro is better. And I mean, lots of people homebrew their own phenol-chloroform, and they love it, and they've shown it works in their hands. We've tried kits from smaller companies for certain sample types, and we've had hits and misses. Sometimes there's a lot of DNA contamination in the kit; sometimes it's okay. But almost universally, across a variety of sample types, we don't see better yields or better performance from anything else when we compare to PowerSoil Pro or PowerFecal Pro. So there are other options, you don't have to use this, but it's become a pretty standard go-to, and a lot of my colleagues at the IMC feel the same way.
The reason it works well has to do with the mixed bead sizes and the lysis buffer. My postdoc lab always measured inhibition in their DNA, and we found back then, before it was acquired by Qiagen, when it was the company MO BIO, that it was by far the best at having low inhibition. All of our library prep methods, whether you're doing shotgun or amplicon, require PCR, so if you have PCR inhibitors it's going to reduce the efficiency of the process. It can be hard to avoid having inhibitors in microbiome samples, but this particular kit does a good job of reducing them.
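
To put a rough number on why reduced PCR efficiency matters, here is a minimal sketch using the idealized exponential model of PCR, in which each cycle multiplies the template by (1 + E) and E is the per-cycle efficiency (1.0 means perfect doubling). The function name, the efficiencies of 1.0 and 0.8, and the 25 cycles are illustrative assumptions, not measurements from any kit or from the talk.

```python
def pcr_fold_amplification(efficiency: float, cycles: int) -> float:
    """Idealized exponential-phase PCR: template is multiplied by (1 + efficiency) each cycle."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return (1.0 + efficiency) ** cycles


# Hypothetical values chosen only for illustration.
CYCLES = 25
clean_fold = pcr_fold_amplification(1.0, CYCLES)      # no inhibition
inhibited_fold = pcr_fold_amplification(0.8, CYCLES)  # efficiency reduced by inhibitors

# How many times less product the inhibited reaction yields.
shortfall = clean_fold / inhibited_fold
print(f"clean: {clean_fold:.3g}x, inhibited: {inhibited_fold:.3g}x, shortfall: {shortfall:.1f}-fold")
```

Under these assumed numbers, dropping from perfect doubling to 80% efficiency leaves roughly 14 times less product after 25 cycles, which is why a small per-cycle effect from inhibitors compounds into a large difference in library yield.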
I don't have any references for this. I haven't found a study, and I need to go back and look more, but I've been in multiple labs that have tried plate-based bead beating, and it's always been associated with high cross-contamination, so we avoid it. We tried it again recently with whole genomes, and there was still too much cross-contamination. So unfortunately I think best practice is to bead beat in tubes, at least in our hands.
So what if you can't bead beat? For long-read sequencing, where you need long DNA templates, you really can't bead beat. I think it really remains to be determined what works well for microbiome samples. This paper came out last year from the Arab, the Aaron lamb. They're still working on it. They were mainly evaluating, for microbiome samples, the length of the reads and the amount of DNA, and not yet fully evaluating whether you're getting all the taxa lysed and so on. So there's a ways to go with that, maybe, but it's coming along.
Okay. So those are the main sources of bias, really: your sample handling and your extraction. When it comes to your sequencing library, though, there is some evidence in the literature, and again we're seeing this in our own hands, that library prep, particularly for shotgun, can lead to some bias depending on which kit you're using. There's a long-standing notion that there's GC bias in library prep, and this can vary from kit to kit based on how the library is made, how the DNA is sheared, and how it's amplified from there. I don't know for sure yet if it's GC bias or if it's other aspects of library prep that are contributing to some of this microbiome variability. But this is starting to emerge, and it's something to keep an eye out for.
Okay, so this is just a different way to show what I just showed you: all the steps, collection, storage, extraction. And then all these lighter orange circles show the bioinformatics that you've been thinking about for three days.
So, in this long sequence of events that goes into microbiome multiomics, where do you think you need to think about positive and negative controls? Which steps? Starting at extraction. Extraction. Anybody else have any other ideas? Go ahead. I like that idea. Yeah. So it turns out that at pretty much all of these steps you should be thinking about controls. But in line with what you're saying, sometimes these controls are started at the beginning and carried forward, and sometimes they're introduced at certain steps, and you really have to do both to adequately control for all the steps. There should be positive and negative controls, and they should be done routinely, and they are not always.
For positive controls, what we're talking about is known or expected biological content: DNA or whole-cell community standards. We'll talk more about that in the next few slides. For negative controls, we're talking about collection materials that are sham, or lacking biological material. For instance, take swabs. If I have vaginal swabs that are being collected in the hospital, I definitely want the exact same kind of swab, and I want to carry it all the way through all my storage procedures and extraction. Some studies have gone so far as to give the sham swab to the clinician and have them unscrew the tip, open it up to the air in the hospital environment where the regular samples are being collected, and then close it. So you don't introduce it to a patient, but you expose it to all the other environmental features. Again, it's more important to go to those extremes if you're working with very low microbial biomass, where that background signal is really going to matter. But for sure you should think about having some sort of sham or negative control that goes all the way back to sample collection, if you can.
A lot of times you can carry those forward all the way through: you can run them through extraction and library prep and so on. Other times, like I said, you might introduce your community DNA or whole-cell sample at a specific step: whole cells at extraction, community DNA at library prep. That's because then you can separately control for those two different things. If you introduce the cells at extraction and there's some bias in the extraction, what if there's also bias at library prep? You're not going to be able to tell from the extraction control what the bias was specifically from library prep. So that's why you introduce a DNA control at library prep as well, to differentiate between the two. And the purpose of these is to identify those unexpected problems, things that will reduce your sensitivity, introduce contamination, or bias the results, and to detect batch effects, which could be a big clue that something's going wrong.
Okay, so a little bit more about defined community standards, and then we'll wrap up and do some questions. Like I said, these have to be defined: you have to know exactly what's in them. I definitely advocate against trying to home-grow them. I know sometimes in certain fields you have to, because the things you're interested in are not available in a manufactured and standardized form. But it's very difficult to actually know and control the ratios of the bacteria or the DNA or whatever, and keep that consistent. So we prefer using the ones that are mass-produced by companies or by some other standards entity that I'll talk about. You can get them, like I said, as whole cells, so mixed communities of bacteria. They come to you as a frozen pellet, so they've already been frozen but they haven't been lysed. Or you can get the DNA, so the genome copies of the bacteria are mixed in equal ratio, essentially, or a known ratio.
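The reason for running both a whole-cell standard (added at extraction) and a DNA standard (added at library prep) can be sketched in a few lines of Python. This is only an illustration under a strong assumption, namely that the biases act multiplicatively per taxon, so that on a log scale the whole-cell control carries extraction plus library-prep bias while the DNA control carries library-prep bias alone. The function names, taxa, and counts are hypothetical.

```python
import math


def log2_fold_bias(observed_counts, expected_fractions):
    """Per-taxon log2(observed / expected) relative abundance.

    Taxa with zero observed reads are skipped because the log is undefined.
    """
    total = sum(observed_counts.values())
    return {
        taxon: math.log2((observed_counts[taxon] / total) / expected)
        for taxon, expected in expected_fractions.items()
        if observed_counts.get(taxon, 0) > 0
    }


def extraction_specific_bias(whole_cell_bias, dna_bias):
    """Subtract the library-prep bias (DNA standard) from the combined
    bias (whole-cell standard). Assumes the two add on a log scale."""
    return {
        taxon: whole_cell_bias[taxon] - dna_bias[taxon]
        for taxon in whole_cell_bias
        if taxon in dna_bias
    }


# Hypothetical two-member standard mixed at a 50:50 ratio.
expected = {"Bacillus": 0.5, "Escherichia": 0.5}
whole_cell_run = {"Bacillus": 250, "Escherichia": 750}  # added at extraction
dna_run = {"Bacillus": 480, "Escherichia": 520}         # added at library prep

combined = log2_fold_bias(whole_cell_run, expected)
library_only = log2_fold_bias(dna_run, expected)
print(extraction_specific_bias(combined, library_only))
```

In this made-up example the Gram-positive member is strongly under-represented in the whole-cell run but not in the DNA run, which points at lysis during extraction rather than at library prep. Because relative abundances are compositional, the numbers are best read as a hint about where to look, not as exact per-taxon efficiencies.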
So the things that are important to control for with this are, you know: are you lysing both Gram-negatives and Gram-positives well in your extraction process? You can also use these to look at whether you're recovering low-biomass taxa. Some of these standards purposefully have low-biomass members in them, and then you can see: did I extract and detect that bacterium in my control? That gives me some expectation of how low a biomass I'm really detecting in my true samples, my experimental samples. You can also determine what the optimal sample handling procedures are. You need controls to work that out. Like I said, with library prep you can examine taxa or GC bias, and with sequencing, controls can help you detect and account for barcode hopping. Then, getting into the bioinformatics, you can look at your taxonomic assignment: is it sensitive and specific for all of the taxa in your controls? That's a good measure of whether you're on the right track.
Okay, so these are just a couple of slides. This is actually some data that Hen has been putting together for me, along with other staff in the IMC data science core. We're starting to compile all of our standard DNA, sorry, standard sequencing runs from the last few years of experiments, the last few years of sequencing data that we've generated, and just look at the variability. This is where the field doesn't have a good answer: how good is good enough? Is it good enough to just see all the bugs you expected to see, and to have the ratios look pretty much like they were expected to? Actually putting a mathematical number on this, in terms of how well did I do at representing a bacterial profile that was expected and known, the math for that, and the standard application of it, is still kind of lacking in the field. It's still to be developed. There are some papers and algorithms out there, but it's not universally applied in the field at all.
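As one possible way of "putting a number on it", the Python sketch below scores a sequenced standard against its expected composition using two simple quantities: the fraction of expected taxa that were detected, and the Bray-Curtis dissimilarity between observed and expected relative abundances (0 means identical profiles, 1 means no overlap). This is not the field's agreed metric, since no such agreement exists. The function names, the detection cutoff, and the example community are hypothetical.

```python
def relative_abundance(counts):
    """Convert taxon -> read count into taxon -> fraction of total reads."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def score_mock_community(observed_counts, expected_fractions, min_fraction=1e-4):
    """Compare a sequenced standard with its expected composition.

    min_fraction is an arbitrary detection cutoff (an assumption).
    """
    observed = relative_abundance(observed_counts)
    detected = [t for t in expected_fractions
                if observed.get(t, 0.0) >= min_fraction]
    missed = sorted(set(expected_fractions) - set(detected))
    unexpected = sorted(t for t, frac in observed.items()
                        if t not in expected_fractions and frac >= min_fraction)

    # Bray-Curtis dissimilarity: sum |o - e| / sum (o + e) over all taxa.
    taxa = set(observed) | set(expected_fractions)
    numerator = sum(abs(observed.get(t, 0.0) - expected_fractions.get(t, 0.0))
                    for t in taxa)
    denominator = sum(observed.get(t, 0.0) + expected_fractions.get(t, 0.0)
                      for t in taxa)
    return {
        "sensitivity": len(detected) / len(expected_fractions),
        "missed": missed,
        "unexpected": unexpected,
        "bray_curtis": numerator / denominator,
    }


# Hypothetical four-member standard with equal expected abundance.
expected_fractions = {"taxon_A": 0.25, "taxon_B": 0.25,
                      "taxon_C": 0.25, "taxon_D": 0.25}
observed_counts = {"taxon_A": 4000, "taxon_B": 3000,
                   "taxon_C": 2990, "taxon_X": 10}

print(score_mock_community(observed_counts, expected_fractions))
```

Reporting the dissimilarity alongside the detection rate makes it visible when a run "detected every bug" but got the proportions badly wrong, which a detection-only check would miss.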
And people are certainly not held to account that way at peer review. Most of the time in peer review, reviewers only want to see: did you use controls? They want to use controls to maybe look at batch effects, and a lot of times I've even seen authors come back and say, we know we didn't have batch effects because we detected every bug in our controls in every batch. Abundance doesn't matter to them. It's just "we detected it", and that flies at peer review. So it's hard to get these things to change, but so far that's where we're at. We should be running controls, we should be looking at the data, but how good is good enough is hard to say.
So this is what we're getting in the IMC. You can see the expected values for the Zymo community here, the standard Zymo community, on the left, and then you can see what we've been getting. Interestingly, you can even check the expected values by lot, and they also have ways to change and adapt your expected values based on copy number, so you can tweak what the expected is, using your Zymo control, to compare exactly to what you got. So here we've got 16S, and it's pretty consistent. We've also used some other controls that are a little bit more complex but still commercially available. This is the Zymo gut control, and we've also used some of the ATCC ones. You can see it's a little bit more complex, and then a little more variable in how well we replicate what we expect from the control. So overall you can see even more variability from experiment to experiment. And yeah, certain experiments kind of stand out, so what we're doing right now is trying to dial in exactly which methods are responsible. Extraction doesn't vary too much, but library prep definitely has varied over the last few years, as well as sequencing platform: we sequence on NovaSeq and HiSeq.
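The copy-number adjustment mentioned above can be illustrated with a short Python sketch. For 16S amplicon data, a taxon with more 16S gene copies per genome is expected to contribute proportionally more reads, so the expected read fractions are the genome-copy fractions weighted by copy number and renormalized. The function name and the copy numbers below are hypothetical, and real values would come from the standard's documentation for the specific lot.

```python
def expected_16s_fractions(genome_fractions, copies_per_genome):
    """Weight each taxon's genome-copy fraction by its 16S copy number,
    then renormalize so the expected read fractions sum to 1."""
    weighted = {taxon: genome_fractions[taxon] * copies_per_genome[taxon]
                for taxon in genome_fractions}
    total = sum(weighted.values())
    return {taxon: value / total for taxon, value in weighted.items()}


# Hypothetical standard: equal genome copies, different 16S copy numbers.
genome_fractions = {"taxon_A": 0.5, "taxon_B": 0.5}
copies_per_genome = {"taxon_A": 2, "taxon_B": 6}

print(expected_16s_fractions(genome_fractions, copies_per_genome))
```

With these made-up numbers the expected 16S profile shifts from 50:50 to 25:75, so comparing 16S results against the unadjusted genome-copy ratios would make a well-performing run look biased.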
So we're looking at those variables to see if we can sort some of this out.
Okay, so where can you find standards to use in your experiments? This is the International Microbiome and Multi'Omics Standards Alliance, of which I'm a contributing member, and their goal is to continue to develop and build this understanding of best practices in the system. They have a great resources webpage with links to all of these commercially available and publicly available standards. They also have links to all the main microbiome standards organizations, which have other kinds of things, like data sets that you can use as controls.
So then I'll talk a little bit about what we're doing here in Canada with IMPACTT. We've been working on developing our own microbiome standard. The reason is that these, particularly the whole-cell ones, are quite expensive, and you don't get many uses out of each order. We also can't very easily import a lot of the whole-cell ones that are manufactured in the US or other places. They tend to have some bugs in them that are, like, fish pathogens, and then we can't import them. So we're home-growing our own Canadian one here. And we're also trying to make it more complex. Whether or not this is a good idea remains to be seen, but most of the others have no more than 20 taxa, and we are spiking in 45 unique taxa, some of which are very novel, so it will also be a good challenge for your microbiome informatics: are you detecting these bacteria? The other thing we're doing, and this is a collaboration with Emma Allen-Vercoe, it's kind of her brainchild, is spiking the strains into a matrix of human feces. So you actually have all the bacteria and the other stuff in the human feces, and you have to get through that component and control for it, and then detect your spike-in bugs. So it makes it much more of a realistic challenge relative to our samples.
So this is how we have been making these. We got healthy human fecal samples, and we have been doing shotgun sequencing on them. We've also got the isolates of all the spike-in strains, and we've just finished the whole genome sequences on those. These were then pooled in the Allen-Vercoe lab and aliquoted, and we've been distributing them in Canada for a year now, to probably 25 different labs so far. We are hoping to distribute the reference data sets next, so everybody can map their reads back. And then we'll see how they're performing and what we might learn when we challenge our methods even harder. This is kind of what it looks like: it's a lot of bugs. In theory they're at equal abundance. We know they're not, so we'll be telling people, based on our own sequencing, what we found in terms of our typical abundance ratios.
So I'll just close with the IMPACTT website. If you haven't seen it, and I suppose you have because most of you registered here, the resources tab up there has the information on where you can find those standards if you want to sign up and have them shipped to you. There are also links to SOPs. Some of them still need to get uploaded in the next little bit, but these are some of our sample handling procedures, and we are definitely trying to make them public so you can benefit from them.