All right. Talking to a few of you, I wanted to go over some of the concepts, and I also had some questions, so I'll give you my two cents through a couple of vignettes on some of the issues you've been talking about.
Okay, I've shown you this slide maybe five times, but: a k-mer. I don't have the knowledge of Morgan and Rob in terms of how to name a little bug, but I like math, so I like k-mers. Essentially, if you have a two-million-base-pair genome and a k-mer size of 21, or any size k, the number of k-mers is roughly two million (the genome length minus k, plus one): you just move along the genome with that window, one base at a time. So you get lots and lots of these k-mers, and that's a very useful unit for us.
I want to show you two things. First, how we went about the study I talked to you about initially, where we basically gave antibiotics to healthy people. What we learned from that is kind of interesting. We haven't finished analyzing all the data, and I just thought of new ways to do it, which I'll tell the students about a little later. In essence, the test was pretty simple. We didn't do 16S because we had enough money to sequence. We took six controls and 18 healthy people who actually took the antibiotic. We took a sample at day zero. We took a sample at day seven, which was chosen because it is supposed to be the peak of the antibiotic's activity in terms of suppression. And the paradigm, or what people thought, was that 90 days later you should be okay, so we took a sample at that point too. Then we extracted the DNA from the stool samples and sequenced pretty deep, 15 billion nucleotides per sample, just to get decent data.
One of the first things we saw (sorry, it just moved a little bit): these are the controls and these are the exposed.
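As a side note on the sliding window described at the start of this part of the talk, here is a minimal sketch of k-mer extraction. The toy sequence and the function name `kmers` are made up for illustration; a real pipeline works on sequencing reads and usually handles reverse complements, which this sketch ignores.

```python
def kmers(sequence, k):
    """Yield every k-mer by sliding a window of width k one base at a time."""
    for start in range(len(sequence) - k + 1):
        yield sequence[start:start + k]


# Toy 14-base sequence (hypothetical, for illustration only).
genome = "ATGCGTACGTTAGC"
k = 5
windows = list(kmers(genome, k))

# A linear sequence of length L has L - k + 1 windows of width k.
print(len(windows))   # 14 - 5 + 1 = 10
print(windows[:3])    # ['ATGCG', 'TGCGT', 'GCGTA']

# Same arithmetic for the 2,000,000 bp genome and k = 21 from the talk.
n_windows_2mb = 2_000_000 - 21 + 1
print(n_windows_2mb)  # 1999980
```

The count is for a linear sequence; the number of distinct k-mers can be lower when the genome contains repeats.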
They're grouped in threes, day 0, day 7 and day 90, with P for patient, then C for control and E for exposed. The first thing, if you look at that graph: the purple is bacteria, so that's good. The best thing I can say about these types of studies is that an individual is his own best control. If you want to do some microbiome research, take one person and follow him. It seems to be a way to get data a little more easily. Unless you can do 200, 300, 400 patients, there's too much variability between people, and with a longitudinal design you don't need that many. So sometimes go for the lowest-hanging fruit: think about following people, which gives you another vector in the analysis. And you can see that for most people, if you look at the groups of three here, the samples essentially look the same. Then a couple of them went wacky, and that's the person who took iron. She never came back.
So I think that's one of the little vignettes I wanted to tell you. Try to see if you can add another dimension for analyzing the data, not just zero or one, but another vector, by adding some time points.
One of the other points to make: we always have these graphs saying, well, I gave the drug, and these bacteria increase and these decrease. Be careful, because you're really just moving your cursor up and down the iceberg of what your sequencing can see. So that's one of the issues. Another point is that, as you can see and as I've shown you, it's very dynamic. And "normal", in terms of the microbiome, I think everybody here agrees, doesn't exist. Although by Thursday, with the yogurt, maybe you'll feel good. Maybe I should add a disclaimer: I'm actually doing a study with them, I should have said that. But they're not paying me, although they gave me some free Bio-K when I was taking an antibiotic, because I wanted to test it.
Other things, cool ways to look at it: the paper was looking at what happens if you take an antibiotic. And I'm just asking the medical questions here. If you have a very diverse microbiome, are you better off than somebody who has a microbiome that's a little bit more constrained? So we went and looked at, basically, the enterotypes. There's Bacteroides and also Prevotella, so we looked, and that's the patient who went berserk. And then we could cluster by similarity. That's at the level of the enterotype, and then you can start digging down a bit.
Then, about microbiome diversity: that is measured by the Shannon index and all of these types of things, and I won't say OTU here. Some people had very high diversity, the cluster here. And some people had low diversity, the clusters here and here. And these were a different group because they're very different from the others. Now, what was really interesting is that the people with lower diversity were in clusters two and three, and, as you can see at exposed day seven, a lot of those with lower diversity had this one, which is prognostic of not having a good outcome, with diarrhea.
Yes? [Audience] Is this diversity at baseline? Yeah. And I'll come to that. I'll show it to you.
And when you start counting stuff, we can screen all these k-mers, and this is the level of material you've got in there, in some ways. We found 24,000 mobile elements, so about 340 in each one of us. There are 43,000 resistance genes, about 600 per sample. That's a lot. And then resistance genes that are next to a mobile element, because the hypothesis is that if you're next to a mobile element, maybe you have an easier chance of being transferred: 29. And we find a lot of putative beta-lactamases.
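For readers who have not met the Shannon index mentioned above, here is a minimal sketch of how it is computed from abundance counts, H = -Σ p_i ln p_i. The two count vectors are invented for illustration and are not data from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over non-zero abundance counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical abundance counts for four taxa in two samples.
even_sample = [25, 25, 25, 25]   # evenly spread community
skewed_sample = [97, 1, 1, 1]    # dominated by a single taxon

h_even = shannon_index(even_sample)
h_skewed = shannon_index(skewed_sample)
print(round(h_even, 3), round(h_skewed, 3))
```

An evenly spread community gives the maximum value for its number of taxa (here ln 4), while a community dominated by one member scores much lower, which is the sense of "low diversity" used in the talk.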
So in some ways it goes back to: be careful about what happens. You can make all sorts of assumptions, because there's a lot in there and there's a lot moving in there as well, okay?
So we gave cefprozil, and we took cefprozil because that's the only one the ethics committee would allow us to give, because it doesn't do much. I couldn't give something like vancomycin and flush it out pretty well. But in essence, we could see that it had an impact, and that's kind of the puny one. The impact was kind of specific, and it seemed to be reproducible and also predictable. And then there's the low diversity, which is what the question was really about, and which we kind of answered after the study in some ways: the ones that had low diversity had a bloom of Enterobacter cloacae, which is not a good phenotype. And for resistance genes that were undetectable before, in two patients we could actually see the gene that's responsible for this resistance, though I didn't show it to you. So I think this can be generalized to a lot of other things.
Also, can we look a little bit further than taxonomy? Here, this is a plug for Maxime, who's working on a software called Ray Surveyor. We call everything Ray: Ray Meta, Ray Surveyor. Initially, I think I said it was insanely great. Is that what I said? My quote, yeah, okay. And I still like it, and you'll be training on it a little bit. It's not published yet, it will be very soon, and it's obviously available.
So we've talked about how, when you take an antibiotic, you select. And that is all quite interesting. But instead of looking at all of these pathways, could we use the de Bruijn graphs that we had? We created a graph of all of the k-mers and their relationships. Then we have another one at another time point. Could we do this? And could we compare all these genomes as k-mers?
Who cares about OTUs and all the rest? And can we cluster them? We could. And we could set that up in minutes. You'll test that. We can compute distances. It's not phylogenetics, but phenetics. So we could actually start looking at this. And we can do all sorts of funny statistics, but I'll show you a little bit of how I'm thinking about going forward with this.
So look at how the antibiotics affect the metagenomes. Here's essentially the diversity index. Here is the average for the controls, and here the average for all the people who were exposed to the antibiotic. When you give an antibiotic, you start squeezing the diversity a little bit. And if you look a little later, it goes back a little bit. Not much, not totally.
And the thing is, okay, but what did disappear? And what's coming up? In a sense, you gain some k-mers. These are, again, a control and a patient, and both of them are modulating. The controls had nothing except time: they gained different k-mers, and they lost different ones. So that's basically the background variability. If you look at the treated people, they gained some, and they lost more. That's very interesting to me in some ways. And if you look at what happened at day 90, again, about the same k-mers are gained and lost, but you're starting to gain more k-mers. Some of what was there before is coming back at day 90.
What's interesting, and I only just thought about it, is that we could actually look for functions here. What do you lose as a function when you give a particular drug or a particular antibiotic? So I'm thinking along those lines. Feel free to do that, if you're not doing it already. And the same here: what do you gain? Obviously, you're getting some of the same species back, but not always. I mean, you've changed the niche, so I think it's interesting.
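The gained and lost k-mers discussed above can be pictured as set differences between two time points. This is only a sketch under simplifying assumptions: the two short sequences stand in for whole metagenomes, the names `kmer_set` and `gained_and_lost` are hypothetical, and it says nothing about how the actual analysis was implemented.

```python
def kmer_set(sequence, k):
    """Return the set of distinct k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def gained_and_lost(before, after):
    """k-mers present only after (gained) and only before (lost)."""
    return after - before, before - after


# Toy stand-ins for the same person's sample at two time points.
day0 = kmer_set("ATGCGTACG", 4)
day7 = kmer_set("ATGCGTTCG", 4)

gained, lost = gained_and_lost(day0, day7)
print(sorted(gained))  # ['CGTT', 'GTTC', 'TTCG']
print(sorted(lost))    # ['CGTA', 'GTAC', 'TACG']
```

Running the same comparison on a control gives a baseline for how many k-mers change with time alone, which is the comparison made in the talk.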
So now I'd like to introduce you a little bit to what Ray Surveyor is, okay? Ray Surveyor is my attempt to try to understand what's happening. And I've failed, but essentially, it's starting to give us a very nice picture.
Here, I'll show you something about antibiotic resistance. These are all one bug, Clostridium difficile, okay? I've sequenced nearly all the Clostridium difficile in the hospital in Quebec City since 2010. So lots of them. There are certain hospitals you shouldn't go to, and I'll tell you later. But if you look at these k-mer comparisons, along the diagonal, that's essentially where a genome is compared with itself, so it's identical, okay? This is comparing in terms of k-mers. In Canada, and also in Quebec, we have one massive strain called NAP1, which is actually antibiotic resistant. And all of these others are not so antibiotic resistant. So it's a way to see, okay, this is what circulates right now. These are the other ones that circulate, but some of them have some resistance. And you can detect little clusters: maybe that's the next strain that's going to emerge in a hospital, or something like that. I really liked it because it's very, very visual.
Then I asked, okay, this is comparing essentially the k-mers of the whole genome. What happens if I compare this graph, generated with the whole genome, with one built from just the k-mers that are associated with resistance? So I remove everything else and just look at k-mers associated with resistance. In essence, this is what it gives you. It looks like Scottish tartans in some ways. But you can see that for this group, along here, the resistance k-mers are more important. That's normal. I mean, they're resistant. And then within this other cluster, there was a smaller group where resistance was involved. So I think that was pretty interesting.
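To make the whole-genome versus resistance-only comparison above concrete, here is a minimal sketch that builds a pairwise similarity matrix twice: once on all k-mers and once on only the k-mers in a chosen subset. The Jaccard index is used as a stand-in similarity measure and is an assumption, as are the toy sequences, the strain names and the "resistance" k-mer set; the talk does not say which metric Ray Surveyor uses.

```python
def kmer_set(sequence, k):
    """Return the set of distinct k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a, b):
    """Shared k-mers over all k-mers; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(sets):
    """All-against-all similarity; identical non-empty sets score 1.0."""
    return [[jaccard(a, b) for b in sets] for a in sets]


K = 4
# Hypothetical toy genomes; names and sequences are made up.
genomes = {
    "strain_A": "ATGCGTACGTTAGC",
    "strain_B": "ATGCGTACGTTAGG",
    "strain_C": "TTTTGCGCGCAAAA",
}
whole = [kmer_set(seq, K) for seq in genomes.values()]

# Hypothetical set of k-mers annotated as resistance-associated.
resistance_kmers = kmer_set("CGTACG", K)
subset = [kmers & resistance_kmers for kmers in whole]

whole_matrix = similarity_matrix(whole)
subset_matrix = similarity_matrix(subset)

for name, row_all, row_res in zip(genomes, whole_matrix, subset_matrix):
    print(name, [round(v, 2) for v in row_all], [round(v, 2) for v in row_res])
```

Comparing the two matrices shows which groupings are already explained by the chosen subset alone. Here strains A and B are similar on both views, while strain C carries none of the subset k-mers.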
So maybe that's the next one we'll have to check to see if it's okay. And then you can go a little bit further with this. We haven't totally validated that. I put it out here because we have time to do it, and there are some experts in the room to shoot holes in it. So here's Clostridium difficile. Here's the resistance. It seems important. These are the mobile elements, which look similar. And then this is a biosynthetic gene cluster, which really doesn't look like this one. Here's the importance of bacteriophage, maybe more here. And here are the plasmids. So this is all for C. diff.
Then we went a little further with Frédéric and Maxime to look basically at Streptococcus pneumoniae, Pseudomonas aeruginosa and other bacteria. That was 2,600 bacterial genomes or so that we used. And we said, okay, let's see: this is the whole genome. So again, you're identical here. You can't read, obviously, all the teeny things. This is 2,600. Pseudomonas aeruginosa is around 600, and about the same for Streptococcus pneumoniae.
Then you can look at insertion sequences, and we can ask how much this looks like this. So we give the percentage here: 6%, 6%, 0%. So it doesn't seem to be important here. Plasmid genes: 0, and with the full set of bacteria, 97. Resistance genes for Pseudomonas aeruginosa: quite important. For the other one, important but to a certain extent, 28%. And then bacteriophage, where you can see the percentage that look alike. And then here, in the bacteria, plasmids are super important. So, practically, if you just take plasmid sequences, they kind of recreate the same graph here. And here, in terms of plasmids, it's important for Streptococcus pneumoniae. It doesn't seem to be important for Pseudomonas aeruginosa. And biosynthetic gene clusters seem to be important for these two, a little less for Streptococcus pneumoniae, which is in a human niche nearly all the time. So I'm not totally sure what it means right now. Let it fester in your mind as well.
But we went a little bit further and asked ourselves, well, this is all the bacteria. And then we said, look at mobile elements, resistance genes, bacteriophage, plasmids and biosynthetic gene clusters. We have all the families here, and you can look at what's important for each one of them, just as a whole. So if you're looking for bacteriophage, in some ways, you shouldn't look in these. You should look a little bit more in these. That's the way we approach the problem. And, for instance (I've just made it bigger because you probably couldn't read it very well), if you look for biosynthetic gene clusters, it doesn't seem to be a great deal here, and a lot more there. When we did this tree, we had at least 100 or so bacteria per line, so it's starting to work. It's a way to compare, in some ways, the importance of these different agents to the evolution of these bacteria, because if you actually sequenced them, they were alive. Basically, that's what we're doing. But we could do any function here and see how important it is for this particular process. And maybe we can try it with transcriptomics in some ways. So it's very early days for this, but we'll see how it goes. And you'll have a little tutorial, so you're helping us to make it better before we release it.
A little thing I wanted to talk to you about, and it was touched on a little bit: biomarkers. This morning, Fiona talked about biomarker validation. A lot of them are DNA-based. We know that. A lot of them are bug-based: which bug do you have? But I think the problem is that sequencing is not very amenable to a pipeline in a hospital, and I'm in a hospital. So I was trying to look at other ways to do it. I just want you to think about it. I'm not telling you how to do it in any way, shape or form. These bugs produce little molecules, metabolites. And they're the actual effectors.
And I think these are the ones we can measure. We have beautiful instruments to measure this. We have mass spectrometers. We can measure metabolites of specific bugs, and that could basically be a proxy for these particular bugs, rather than just doing sequencing. And it's really cheap. Okay, the instrument is not cheap. But when you run the samples, it's pretty cheap.
The last thing I wanted to say is about supervised and unsupervised learning. In this group, we like supervised. One of the reasons is that it's usually interpretable. And not just supervised, but supervised and parsimonious, meaning the models are sparse. The reason is that it helps you build your house of cards in your mind. If the machine learning, and you will dwell on it a lot, tells you that four or five features are important to make your decision, it's a lot easier to interpret than an SVM that will take millions, basically every peak or every feature you have. So we're concentrating on this a great deal. You'll have plenty of time in the next day and a half to either refute what I just said, or make up your mind that that's where you want to go.
There's a lot of really cool stuff right now with KBase and also PRISM, to find these metabolites that will help us, and also Metasys. This was a comment on Galaxy. Who uses Galaxy here? Okay. It's okay. It's good for the notebook. Pierre-Luc, do you use Galaxy or not? Well, you know how to code, so you get irritated with a GUI and would rather write the command line. But in some ways, these couple of projects are really good for you because they simplify the approach, so you don't have to know too much about command-line approaches. And the whole aim of this school was basically to take the one third of computer scientists and the two thirds of biologists and put you together. I don't think one will become the other.
But you'll be at an interface where you'll have a common language, and that will be very, very useful, I think, for all of you in your careers. So I'll stop there by showing again the figure of the gang. I'll take a couple of questions and then we should go to our symposium. Okay. Any questions? None. Okay.