Hi, in this video I want to show you the effect that sample size has on our hypothesis tests and their results. To summarize where we are: in all five hypothesis tests, our p-value was zero. That was using all the available data, which is what you should normally do; always use all the data you have available. But we had pretty large sample sizes. We were using all the data, a good amount of data, collected every 15 minutes during that two-day duration, so a wealth of data. Because of that, we got really low p-values. We had lots of evidence to say definitively, first with the single-sample tests, that both wind and natural gas failed to meet capacity, and then, when we did the two-sample tests and compared them, that natural gas performed even worse than wind. So both failed, but natural gas failed by a wider margin; that's what the results are saying.

But we didn't see much variation in our p-values. Every time, the p-value was zero, we had these very definitive results, and the sample statistic fell well outside of the randomization distribution. So I want to show you the effect sample size has here: one, to give some more variation in the results and show what statements you should make when your p-values are not zero, and two, to give you some motivation and show you the importance of collecting more data.

To do this, we're going to imagine a situation where we have far less data than we actually have. Suppose, for example, that during this two-day event across February 14th and 15th, 2021, there was some sensor failure in the grid and we weren't able to record data every 15 minutes; maybe we only had a handful of data points in actuality. To simulate that hypothetical situation, let's insert some code here and alter reality: we're going to downsample the dataset.
Now, this is something that you shouldn't normally do in practice. Again, in practice you want to use all the data that's available to you. I'm doing this here just to show you how these results would change if we weren't fortunate enough to have this much data.

To do the downsampling, we're going to go back to genpiv and pivot the data. We've done this before, so I'll just copy and paste the pivot. Now that the values are paired up by timestamp, we can easily downsample: essentially, we just select some of the times at which, say, the sensor happened to be working. We can do this quite easily using a built-in pandas function called sample. By specifying n=20, we randomly sample 20 rows from the dataset. So instead of having 192 samples, we're going down to 20; far less data. Then we reconstitute the data in its original form: we unpivot, a.k.a. melt. We did that earlier too, so we can copy and paste it here. datetime is the index, so we reset it back to a column, and then melt into the final long form.

Let's go to Runtime, run everything up to this code, and then run this cell. Here we go: here's our new downsampled dataset, with datetime, fuel, and generation columns. Note that we have 40 rows, indexed 0 to 39, because our sample size is 20 and we have two groups. That's our new downsampled genpiv. Now let's see what effect this has on all our results. We have a slightly different wind mean, but it's basically the same, 5.15, and we'll redo the randomization distribution.
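The pivot, sample, melt round trip can be sketched like this. The frame and column names here (a genpiv-style layout with datetime, fuel, and generation columns) are assumptions standing in for the real dataset, and the generation values are synthetic:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the long-format generation data: one row per
# (datetime, fuel) pair, 192 timestamps at 15-minute intervals.
rng = np.random.default_rng(0)
times = pd.date_range("2021-02-14", periods=192, freq="15min")
genpiv = pd.DataFrame({
    "datetime": np.tile(times, 2),
    "fuel": ["Wind"] * 192 + ["Natural Gas"] * 192,
    "generation": rng.uniform(2, 8, size=384),
})

# Pivot so each timestamp is one row with a column per fuel type, then
# randomly keep only 20 timestamps (simulating the sensor dropout).
wide = genpiv.pivot(index="datetime", columns="fuel", values="generation")
down = wide.sample(n=20, random_state=42)

# Unpivot (melt) back to the original long format: reset datetime from
# the index to a column, then melt the fuel columns back into rows.
genpiv_down = down.reset_index().melt(
    id_vars="datetime", var_name="fuel", value_name="generation"
)

print(len(genpiv_down))  # 40 rows: 20 timestamps x 2 fuel types
```

Because `sample(n=20)` picks rows of the pivoted frame, both fuels are kept at exactly the same 20 timestamps, which is what lets the paired tests later still line up.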
This randomization distribution has gotten quite a bit wider now. It's still centered on the null value of 6.1 (this is for the single mean), but it now overlaps with our sample statistic. Our p-value is now 0.042. That's still less than 0.05, our default significance level, but we should change the statement, because our p-value is not zero anymore; it's 0.042. So we would edit that; I'll just delete it now. Therefore, we can still reject the null hypothesis, since 0.042 is less than 0.05, but only a little bit less. It's not dramatically less.

Okay, let's continue on. What about our test for a single proportion? Our sample statistic has gone up a bit, to 0.7. We can just redo all the code here. Again we've got overlap, and what's our p-value? 0.061. This does not meet the criterion for rejecting the null hypothesis, using the default significance level of 0.05. So we would change the statement: since the p-value of 0.061 is more than 0.05, we can no longer reject the null. We would say we fail to reject the null hypothesis that we met or exceeded the capacity half the time. Now, really important here: we do not accept the alternative. We are not saying that the alternative hypothesis, that this proportion is greater than half, is true. And we're also not accepting the null hypothesis; we're just failing to reject it, and we're not accepting the alternative.

Furthermore, continuing on to the two samples, we'll just recalculate our percent-deficit values here. Note that this is a much less detailed curve; you can see where the individual data points are, where the inflection of the curve changes. And we'll go ahead and run our tests here. Our difference of means is no longer 12.5 percent; it's 13.8.
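The single-mean randomization test described above can be sketched with NumPy. The sample values here are made up (the null value 6.1 is the capacity figure from the discussion), and the shift-and-resample scheme shown is one common way, the StatKey-style approach, to build a randomization distribution for a single mean; it's a sketch, not the exact code from the notebook:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical wind-generation sample of 20 values (GW); made up.
sample = rng.uniform(3, 8, size=20)
null_mean = 6.1          # null hypothesis: mean generation meets capacity
obs = sample.mean()      # observed sample statistic

# Shift the sample so its mean equals the null value, then resample with
# replacement many times to build the randomization distribution.
shifted = sample - obs + null_mean
n_iter = 10_000
rand_means = np.array([
    rng.choice(shifted, size=len(shifted), replace=True).mean()
    for _ in range(n_iter)
])

# One-sided p-value: how often a randomization mean falls at least as far
# below the null value as the observed mean did.
p_value = (rand_means <= obs).mean()
print(obs, p_value)
```

With only 20 points, the randomization distribution is wide, so the observed mean has to sit further from 6.1 to produce a small p-value; that's exactly the effect the video is demonstrating.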
So that changes a bit because we've altered our dataset. Again, we've got overlap here, and again our p-value is greater than 0.05, so we would again alter the statement to reflect that. This is no longer 0.0; 0.077 is more than the significance level of 0.05, so we fail to reject the null hypothesis that there's no difference. Okay? We fail to reject. We do not accept the null; we fail to reject it.

Now we're going down to the mean of differences. Running this again; note that the Python notebook makes it very handy to just rerun the code. We just go through and press play. We still have everything called genpiv; we just altered genpiv, so it makes it very easy to examine this hypothetical situation. This is interesting: when we do the mean of differences, our p-value is 0.038. So it's still low, still below 0.05, and we still reject the null hypothesis, but this is different from the difference of means that we did above.

Take a look at this randomization distribution: it runs from about -20 to +20. Note that we still have the same sample statistic, -13.898 and so on. When we go back up to the difference of means, we have that same sample statistic, but the randomization distribution is fatter, going out to maybe -22.5 and +22.5, and the p-value was 0.077, about twice as large as what we got with the mean of differences. As I said before, when comparing with the mean of differences, by constraining the randomization to stay within each time interval, we're adding information to the test. We're saying that the time makes a difference, that the time pairing is valuable, and in doing that we get a lower p-value. So, really interesting here: doing the difference of means, we fail to reject the null, but doing the mean of differences, we do reject the null. We get a very different outcome.

Just to round off the discussion of the effect of sample size, we'll redo our comparison of proportions.
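The unpaired-versus-paired contrast above can be sketched side by side: shuffling all values between the groups for the difference of means, versus flipping signs of within-timestamp differences for the mean of differences. The deficit values here are synthetic (built with a shared time-of-day effect so that pairing matters), and the sign-flip scheme is one standard way to randomize paired data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical paired percent-deficit values at 20 shared timestamps.
# A common time effect plus a consistent gas penalty; made-up numbers.
base = rng.uniform(20, 60, size=20)          # shared time-of-day effect
wind = base + rng.normal(0, 2, size=20)
gas = base + 5 + rng.normal(0, 2, size=20)   # gas deficit tends larger

n_iter = 10_000

# (1) Difference of means: ignore the pairing and shuffle all 40 values
# between the two groups on each iteration.
pooled = np.concatenate([wind, gas])
obs_diff = wind.mean() - gas.mean()
rand_diff = np.empty(n_iter)
for i in range(n_iter):
    perm = rng.permutation(pooled)
    rand_diff[i] = perm[:20].mean() - perm[20:].mean()
p_unpaired = (rand_diff <= obs_diff).mean()

# (2) Mean of differences: keep the pairing and randomly flip the sign
# of each within-timestamp difference (swap labels within a pair).
diffs = wind - gas
obs_mean_diff = diffs.mean()
signs = rng.choice([-1, 1], size=(n_iter, 20))
rand_mean_diff = (signs * diffs).mean(axis=1)
p_paired = (rand_mean_diff <= obs_mean_diff).mean()

print(p_unpaired, p_paired)
```

Because the shared time effect inflates the unpaired randomization distribution but cancels out of the paired differences, the paired p-value comes out smaller, mirroring the 0.077 versus 0.038 result in the video.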
We run our code here. The proportion for wind, 0.7, agrees with what we had in the single-proportion test, and the difference is 0.3, with a much wider randomization distribution. The p-value is 0.077, which is more than 0.05, so here too we fail to reject the null.

So that's the effect of sample size: with a much smaller sample size, we do see these non-zero p-values, and in particular, some of the tests fail to reject the null because the p-value is larger than our default significance level of 0.05. In the two-sample comparisons, our difference of proportions fails to reject and our difference of means fails to reject, whereas our mean of differences still rejects. Really interesting there. Okay, so sample size is really important. A larger sample size is always better; more data is always better. Really important to bear that in mind. Okay, thank you.
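The comparison of proportions can be sketched the same way as the other two-sample tests: under the null of no difference, the fuel labels are exchangeable, so we shuffle the pooled met-capacity indicators between the groups. The 0/1 indicators here are made up, not the real grid data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical indicators for "met or exceeded capacity" at the 20
# sampled timestamps (made up; wind meets capacity more often).
wind_met = rng.random(20) < 0.7
gas_met = rng.random(20) < 0.4

obs = wind_met.mean() - gas_met.mean()   # observed difference in proportions

# Randomization: shuffle the pooled indicators between the two groups,
# recomputing the difference in proportions each time.
pooled = np.concatenate([wind_met, gas_met])
n_iter = 10_000
rand = np.empty(n_iter)
for i in range(n_iter):
    perm = rng.permutation(pooled)
    rand[i] = perm[:20].mean() - perm[20:].mean()

# One-sided p-value: shuffles at least as extreme as the observed difference.
p_value = (rand >= obs).mean()
print(obs, p_value)
```

With only 20 indicators per group, each proportion can move in steps of 0.05, which is why the randomization distribution for proportions looks so coarse and wide at this sample size.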