Hello, and welcome back. This video picks up right at the end of the previous video, where we developed this loop to create a thousand samples using the bootstrap. But we still haven't created the sampling distribution, because that's the distribution of a sample statistic, and in this case we'll use x-bar, the sample mean. So in order to do this, we start off by finding the mean of each bootstrap sample. I'll just call this data frame BootMeans, and we say BootDF.groupby. We wanna group by sample, which is what we called each of our bootstraps up here. We wanna take the mean, and we wanna tack on a rename function so that the column name is a bit more descriptive, because it's technically no longer the outcome. It's now the bootstrap sample means. So we can print the first five rows of that, and now we can see we've got different samples and a different mean for each of those. So this is now our sampling distribution.

But it's usually a little bit better to plot these things. So we can go in here and use ggplot with BootMeans, and we can do a dotplot, which we introduced in the previous series, where our x is just the bootstrap sample means and our dotsize we'll set to 0.25. And we can see that this looks pretty good. Compared to what we previously had, because we have a thousand iterations, we're now starting to see the roughly normal shape that we expect for the mean of dice rolls.

But we wanna continue to build off of this plot, so I want to show you how to add the original sample means from those original four samples to this dotplot. In order to do that, we need to get the index into a column somehow. Previously we've done .reset_index(), but another way you can do it is with .index, so you can just create a new column that is equal to the index. And now we can see that the index stayed the same, but we've got this extra column called sample that matches what the index was. And so then we can build up our previous plot.
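As a rough sketch of the grouping and index steps just described, something like the following should work in pandas. The data here is simulated, and the names `rolls`, `boot_df`, `boot_means`, `sample_means`, and the column names are assumptions standing in for whatever was built in the previous video, as is the choice to resample from all of the original rolls pooled together.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical original data: four samples of five two-dice rolls each.
rolls = pd.DataFrame({
    "sample": np.repeat(np.arange(4), 5),
    "outcome": rng.integers(1, 7, size=(20, 2)).sum(axis=1),
})

# Stand-in for the loop from the previous video: 1,000 bootstrap resamples
# (drawn with replacement), each labelled with its own "sample" number.
boot_df = pd.concat(
    [
        rolls[["outcome"]]
        .sample(frac=1, replace=True, random_state=i)
        .assign(sample=i)
        for i in range(1000)
    ],
    ignore_index=True,
)

# Mean of each bootstrap sample, with a more descriptive column name.
boot_means = (
    boot_df.groupby("sample")
    .mean()
    .rename(columns={"outcome": "bootstrap_sample_means"})
)
print(boot_means.head())

# Means of the original four samples; copy the index into a regular
# column so it can be used later, for example as a fill aesthetic.
sample_means = (
    rolls.groupby("sample")
    .mean()
    .rename(columns={"outcome": "sample_mean"})
)
sample_means["sample"] = sample_means.index
print(sample_means)
```

Assigning from `.index` keeps the index in place and adds a matching column, whereas `.reset_index()` would move the index into a column and replace it with a fresh one.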
I'm actually gonna come up here and just copy this. So we'll run it, and this is our basic plot. If we wanted to add in our original sample means, we can add a second geom_dotplot where x equals sample mean, which is what we called it up here. And we want to fill based off of sample so that we can see the different colors. And then we have to do an additional thing here: we need to specify the data. That's because this data is different than the data we specified up here. The data set given to ggplot is carried through every layer of the plot unless we specify a new one in a specific geom. So because we're working with the sample means data now, we need to add it this way. And then for dotsize, we'll make those a little bit bigger, and we can run this. And so we can see these giant dots. So maybe that's a little too big. We'll do 0.25 there. And so now we can see our bootstrap means in the back as black dots, and we can see our original sample means here. So you can see how the distribution has changed just by increasing the number of samples.

And then the last thing that we'll normally do is plot where the actual truth is. So here I'm using a new geom called geom_vline, which will just add a vertical line. Within the aes, we just say xintercept equals seven, which is the true value. And then outside of the aes, we can set the color equal to blue and the size equal to one. And so now we can see the completed plot. We've got our bootstrap sampling distribution in the back, our original sample means here, and where the true value is. So we can also use this to see that, while the bootstrap does have this nice, roughly normal shape, it sits well toward the lower half of our rolled values. It's not centered on our true value of seven. And that's because the bulk of our data, as we can see from our original samples, was way down here.
And so when we're drawing the bootstrap samples, we're more likely to get those lower values than some of the values up here. And that's a really critical point to remember: yes, you can use bootstrapping to generate any number of samples, but it can't give you anything that you haven't given it. So if you give it biased data, your bootstrap sampling distribution is always gonna be biased. Even if you increase this to 10,000 or to a million samples, it all depends on what you give it. And so this is how we can use bootstrapping to develop a sampling distribution. Later, we can talk about how to actually take this bootstrap sampling distribution and develop a confidence interval around it.
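To make that last point concrete, here is a small sketch using only the Python standard library. The sample values are invented for illustration (a set of two-dice sums that happens to run low), and `bootstrap_means` is a hypothetical helper, not the loop from the video.

```python
import random
import statistics

random.seed(1)

TRUE_MEAN = 7  # expected sum of two fair dice

# Hypothetical original sample that happens to run low (mean 5.3).
biased_sample = [3, 4, 4, 5, 5, 5, 6, 6, 7, 8]


def bootstrap_means(data, n_boot):
    """Return the mean of each of n_boot resamples drawn with replacement."""
    return [
        statistics.mean(random.choices(data, k=len(data)))
        for _ in range(n_boot)
    ]


centres = {}
for n_boot in (100, 1000, 10000):
    # Centre of the bootstrap sampling distribution for this many resamples.
    centres[n_boot] = statistics.mean(bootstrap_means(biased_sample, n_boot))
    print(n_boot, round(centres[n_boot], 2))
```

However many resamples are drawn, the centre of the bootstrap distribution stays close to the original sample mean rather than moving toward `TRUE_MEAN`.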