To pick up the last thing from yesterday: we made these really nice box plots, and we were able to change the colors, change how the axis labels were sized and named, and split the box plots by the individual markers so they sit side by side. What's really notable here, and very distinct from the base R implementation, is that our axes are exactly the same. When you do a facet grid like this, here over the markers, it fixes your axes so they're consistent across all of your panels. That can be really useful for a direct comparison where the scale matters. A marker graphed on its own scale might look like it has a very big spread, but graphed relative to the other markers, its scale or its variance may actually be very small. You can do this in base R too, of course, but it would have to be manual, and you'd have to know what axis limits you'd like to set.
So we have this graph, but we also want to do some statistical testing. The statistical test we're all most familiar with, and the simplest one here, is the t-test, and the script to run it is very similar to the base R box plot. Here we're testing the difference in a marker's mean, so the mean of marker one in this first line, by the treatment factor. Treatment could be coded as zero or one, or it could be a factor variable. However, if you have more than two groups, you will want a factor, because you want everything to be compared to the baseline. With the t-test, though, it's just two groups, so here it makes no difference.
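As a rough sketch of the call pattern just described, here is a formula-style t-test on simulated data. The names `df2`, `marker1`, `treatment`, and `treatment_factor` are assumptions standing in for the workshop data, and the values are made up.

```r
# Simulated stand-in for the workshop data (all values are made up).
set.seed(42)
df2 <- data.frame(
  marker1   = c(rnorm(20, mean = 0.50, sd = 0.10),
                rnorm(20, mean = 0.60, sd = 0.10)),
  treatment = rep(c(0, 1), each = 20)
)

# Encode the grouping variable as a factor; the first level is the baseline.
df2$treatment_factor <- factor(df2$treatment,
                               levels = c(0, 1),
                               labels = c("control", "treated"))

# The base R box plot call has the same shape:
# boxplot(marker1 ~ treatment_factor, data = df2)

# t-test: continuous outcome on the left of the tilde, groups on the right.
res <- t.test(marker1 ~ treatment_factor, data = df2)
print(res)
```

Swapping `boxplot` for `t.test` is the only change between the two calls, which is why copying and pasting the plotting line is a safe starting point.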
But if you're doing an ANOVA or a Kruskal-Wallis, you'll want everything compared to a baseline, so you'll need the grouping variable encoded as a factor. I know I harp on that a lot, but one of the most common errors in R is a grouping variable that's coded incorrectly.
So here you can copy your box plot code and really just revise it to have t.test instead of boxplot in front of it. You'll notice that our base R box plot code looks almost identical to this. So go ahead and copy that, and click Yes once you've done a t-test over each of your markers. I really encourage you to copy and paste as much as possible here, because typos are a scourge. And if you're running into issues, just screenshot it into the Slack. We'll all debug it together, and others will learn from what you're encountering. We saw yesterday that a lot of errors are shared by all of us, so sharing them with the group is a really good practice.
Great, we already have one. Let's move this. Sorry, folks. Awesome. If you've been able to run this, you can also try t-testing different things. Maybe t-test by the sex factor, or by site, and see if there are significant differences there. Experiment: maybe see if age is also significantly different by treatment factor. Explore the output. We'll go over it in more detail, but it's quite a detailed statistical test output. You can even use structure, str, on the output, but we'll do that later today if you're not comfortable experimenting with that.
Oh, so Nagla, what you're finding is that if you've exited R, your environment is going to be empty. Let me just refresh my environment here. Yes. Okay. So if you start up anew and you have your script here but your environment is empty, that means you need to run your script again and get your environment all filled up again.
So you need to go and run set working directory and read in your DF. You don't have to run these lines, but you do want to create your sex factor. Here I'm just holding down Control and pressing Enter as I go, running line by line, and you can see my environment growing. All we need is this DF2, because, if I open it in the viewer, it has each of my markers and it has my factors, and for the t-test that's all I need. So if you've closed R down and no longer have the data there, it just means you have to run it again, and then you should have no problem running these t-tests. To run it all again, I just select everything and run. The main things you need to run are reading in your data, this line here, and then creating your factor variables. Really good question, though. I'm sure others are encountering that as well.
I've tried to play with it a little bit, and I changed the comparison to be between treatment and sex factors, and I'm still getting an error for some reason.
Yeah. So treatment isn't continuous. A t-test compares a continuous value between groups: the continuous value goes in the first position and the groups in the second.
Yeah, makes sense.
Ah, great question. Thank you, Raza. He's pointing out, and correct me if I'm wrong, something we discovered yesterday. Whoops, not there. Here. When I did the facet grid, I wrote marker instead of variable, and you may need one or the other. It depends on whether your melt function named that column marker or, as I found, variable. So that's a potential thing that will go wrong. But if you get an error at that point, it shouldn't actually be an issue for your t-test down the line.
So you're definitely right, Raza, that it will give you an error when you run the whole script. That's our melted data frame for the plots, though, and here we're using DF2. So strictly speaking, right now you just need DF2 and the factors. But absolutely right, there's a potential bug, and I'm just going to go to it here: where you melt, and then where you do your facet grid.
Good question, Christina. So let's actually try without that. t-test, marker one, and then instead of the tilde I'm going to put a comma and treatment factor. My understanding is that if you have a comma, you've already split your groups, and it will compare two vectors, treating them as the two groups. Yeah, that's what's happening. Okay, so let's look at the help documentation.
What the help documentation is saying is that one option is a formula. In R, where there's a tilde separating two things, that's a formula. Your outcome, which is our marker here, is on the left-hand side, and on the right-hand side you have a factor, or one or more factors. We'll see this again later when we do linear modeling. So where it says t-test formula, this is what it's asking for. The other option is where it says t-test x, y. You can see in the arguments down here, where it describes them, that x has to be a non-empty numeric vector and y has to be a non-empty numeric vector. The reason it didn't work when we did it here between marker one and treatment factor is that DF2 treatment factor is not numeric. It is a factor. If we wanted to do a t-test between, let's say, marker one and marker two, then that's totally valid. So if you have two vectors that you want to compare and you've already split them, you can write it out this way, where you have t-test marker one.
So your DF2 dollar sign marker one is your first vector, DF2 dollar sign marker two is your second vector, and you're comparing them. These are two numeric vectors, so a t-test comparing two numeric vectors is one option. But if you have a data frame with one long vector that you want to split based on some other column or factor, then you use the formula, which is specified here. The formula lets you split one vector based on another vector's factor assignment, if that makes sense.
Yeah, thank you.
Awesome. All right, don't forget to click Yes once you've got it working, and we'll move right along to look at the output.
Yeah, so a test for normality. I believe it's the Kolmogorov-Smirnov test. It would just be another statistical test.
Alrighty, so I'm going to start going through the output. Just for marker five, for example, we split the marker five values by the treatment factor, and we're testing for a difference in the means of these two groups. We're doing a Welch two-sample t-test, and it's nice because we get all these details around our hypothesis test. I'm going to jump down to this middle part. The alternative hypothesis is that the true difference in means is not equal to zero. So it's actually subtracting the means from each other and testing whether that difference is different from zero. They show you the mean in group control and the mean in group treated, both based on treatment factor. These names, control and treated, are the ones I gave when I was creating that factor variable. And then there's this 95% confidence interval, which is the confidence interval around the difference in these means. So the point estimate for this confidence interval is actually this minus this, one group mean minus the other, I believe. Let me just stare at that a second. Nope, it would be this.
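The two ways of calling `t.test` discussed above can be sketched side by side. This is only an illustration on simulated data: `df2`, `marker1`, `marker2`, and `treatment_factor` are assumed names, and the values are made up.

```r
# Simulated stand-in for the workshop data (all values are made up).
set.seed(1)
df2 <- data.frame(
  marker1          = rnorm(30, mean = 0.5, sd = 0.1),
  marker2          = rnorm(30, mean = 0.6, sd = 0.1),
  treatment_factor = factor(rep(c("control", "treated"), each = 15))
)

# Form 1: formula. One numeric column is split by a two-level factor.
by_group <- t.test(marker1 ~ treatment_factor, data = df2)

# Form 2: two numeric vectors that are already separate.
two_vectors <- t.test(df2$marker1, df2$marker2)

# Form 2 expects numeric input for both arguments, so check before calling it.
is.numeric(df2$marker2)           # a marker column is numeric
is.numeric(df2$treatment_factor)  # a factor is not, so it cannot be y in t.test(x, y)
```

Checking `is.numeric()` on both columns first is a quick way to tell which form a given pair of columns fits.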
Sorry, it's the first group minus the second group, because it's negative. So your point estimate is actually not shown here at all. You would have to get it by subtracting the group two mean from the group one mean. That's what this confidence interval is around. And then they give you all the statistics around it, the t distribution statistics. Your t statistic is here, and the degrees of freedom, and with those two things you're able to find your p-value, which is reported here as well.
All right. So isn't 0.57 minus 0.51, isn't that 0.05995?
Yeah, that's not right. It's not the point estimate. The point estimate is in the middle of the confidence interval. That's why I know it's this one minus this one: the confidence interval is centered on the point estimate, so the point estimate has to be negative.
Okay, I understand. Thank you.
No, good question. So just to make that more concrete: we're taking 0.51 minus 0.58, roughly, and that's our point estimate. This is the lower limit of the 95% confidence interval, and this is the upper limit around that point estimate. So all this is to say, this is your t-test output. Later today we're going to look at how you actually extract components of it so you can report them in a table. All right, so that's it for this.
I put another short exercise on the website, and I wanted to find out if anyone has done it. It was posted for the end of day two, and I said you can do it or not, no problem. We're just testing whether the distribution of skin cancer biopsy results differs by the city where they were done.
Nugla, is it for marker five? Because there are five markers, so it might be different. Also, the data might be slightly different at this point. Can you just screenshot your output, Nugla? Just do a snip of it. Awesome.
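To make the relationship between the group means and the confidence interval in the t-test output concrete, here is a small sketch on simulated data. The names `df2`, `marker5`, and `treatment_factor` are assumptions, and the group means of roughly 0.51 and 0.58 are simulated to resemble the values mentioned in the discussion, not taken from the real data.

```r
# Simulated stand-in for the workshop data (all values are made up).
set.seed(7)
df2 <- data.frame(
  marker5          = c(rnorm(25, mean = 0.51, sd = 0.08),
                       rnorm(25, mean = 0.58, sd = 0.08)),
  treatment_factor = factor(rep(c("control", "treated"), each = 25))
)

res <- t.test(marker5 ~ treatment_factor, data = df2)

# The printed output shows the two group means, not their difference.
group_means <- res$estimate

# Point estimate: first group mean minus second group mean.
point_estimate <- unname(group_means[1] - group_means[2])

# The 95% confidence interval is centered on that difference.
ci <- res$conf.int
ci_midpoint <- mean(ci)

point_estimate
ci
all.equal(point_estimate, ci_midpoint)

# The remaining pieces of the printed output.
res$statistic  # t statistic
res$parameter  # degrees of freedom
res$p.value    # p-value
```

Taking the midpoint of the interval is a quick way to confirm which direction the subtraction goes when the printed output does not show the difference itself.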
So for the short exercise: you're not going to have to do this right now, but I just want to go over it, since it's optional for you down the line. The solutions will also be posted, though not today. You read in your data, you do multiple chi-square analyses, so different from the t-test, but we went over how you do that on a table, and you create a series of bar plots, something like this. You can see how this matches what was done with the box plots previously. It just gives you an opportunity to redo an experiment and try out more of what we've been practicing already.
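For orientation only, a chi-square test on a table might look like the sketch below. It is not the posted solution: the data frame `biopsies`, its columns `city` and `result`, and all of the values are hypothetical and simulated.

```r
# Hypothetical, simulated data; not the exercise data set.
set.seed(3)
biopsies <- data.frame(
  city   = factor(sample(c("city_a", "city_b", "city_c"), 150, replace = TRUE)),
  result = factor(sample(c("benign", "malignant"), 150, replace = TRUE))
)

# Cross-tabulate the two categorical variables.
counts <- table(biopsies$city, biopsies$result)
counts

# Chi-square test of independence on the table.
chi <- chisq.test(counts)
chi

# One possible base R bar plot of the same table:
# barplot(t(counts), beside = TRUE, legend.text = TRUE)
```

The test takes the table of counts rather than the raw columns, which is the main difference in shape from the t-test calls.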