Hi, in this video I'm going to show you how to get confidence and prediction intervals for a linear regression model in Python. So let's get to it. The key function we're going to use is get_prediction, which comes from the statsmodels package that we imported last video when we wanted to get our R-squared. The command to do that is import statsmodels.api as sm. Recall that in the previous video, on the previous page, we ran this code to get this output and stored the fitted model in res. We defined an ordinary least squares model, our simple linear regression model, using this line here, and then fit the model to the data here. And this is the summary of the output. So this code needs to be run first in order to make the confidence and prediction intervals using the method I'm going to show you here. Let's start off with a simple example where we want to make a prediction. We call our object res, shorthand for results. The name res could be anything; it's user defined, and we defined it up in this line in the previous chunk. The key function here is res.get_prediction, and I'll store the result in an object called predictions. The arguments that you give it are the values at which you want to make the prediction. So I have to give it two values, let's say 1 and 625. Where am I getting these values from? The 1 corresponds to the intercept. In this model I added a constant here, and thus have this constant line here, so I need to give it a 1 first. This is basically a placeholder for the intercept, and it has to be 1. The second value is the value that's going to be substituted for the explanatory variable, in this case drowning. I'm picking 625 arbitrarily: if you go back up to the data, 625 is a bit higher than anything else we've seen in the data. So I'm saying, okay, let's predict outside of the data.
So then, on our object predictions, I'm going to call summary_frame. And I'll give it an alpha value. This is my significance level. We'll use our default significance level of 0.05, which corresponds to a 95% confidence interval and a 95% prediction interval. So let's go ahead and run that. This is the output I get. This value, mean, is the prediction of my linear model, so this is the value that falls on the line of best fit at the drowning value of 625. mean_se is the standard error of that estimate. Here are the lower and upper bounds of my confidence interval for that mean, for that prediction. And here are my lower and upper bounds for the prediction interval. Note that it doesn't say prediction: it says obs_ci_lower and obs_ci_upper, so it doesn't align very well with the language used in the lesson. You just have to bear in mind that these last two values are the prediction interval. Okay, so that's how we can, at a very basic level, get the confidence and prediction intervals for one point, one x value. But oftentimes it's useful to get them for many values at once and to plot that graphically, so let me show you how to do that for many instances of x. So let's make a new code cell. Let's make a new data frame from scratch using the pandas DataFrame command. Here I'm going to make a new set of x values using the range command, and I want to span the range of drowning data that we have, so I'm just going to say 400 to 625. The third, optional argument I can give to range is a step size. So this is going to be the values from 400 to 625 in steps of one: 400, 401, 402, etc., all the way up to 625. And that'll be stored in this object new_df. The other thing I need to create is a column of ones that is the same length as what I currently have here. So I can make a new column called const, or constant.
And this will be a bunch of ones. So I'm going to have a 1 in square brackets, and when I have a list in square brackets followed by multiplication, it means replicate the value 1 this many times, not multiply the value 1 by that number. The number is new_df.shape[0], so I'm going to replicate the value 1 as many times as I have rows in new_df. Okay, let's take a look at what that looks like, just to show that this works as expected. We can look at the first five rows. There we go. So I have my x values and, once again, these ones are placeholders for the intercept. Now that I have those, I can use get_prediction over all of these values. So now I give it my column of ones, not just a single 1, and my column of x's, not just a single x value. We'll store the summary of these in a data frame using summary_frame. Again, I need to specify alpha, and we'll use our default alpha of 0.05. Let's see what we have now. So now we've got the predictions at x values 400, 401, 402, etc. It's not showing those x values here, but it's giving us the predictions, the confidence intervals and the prediction intervals. Okay, let's merge this with new_df, so we're going to overwrite new_df, which has the x values in it, using the concat command. I'm going to take our existing new_df and just append these predictions. I need to say axis equals 1 so it does this over the columns. And we'll see what this looks like here. There we go. So now I have the x values in there as well. Then we can start with our initial data frame that has the data in it, and we'll just start by plotting those points. Our x is drowning and our y is nuclear. And let's see what that looks like. Okay, so there's our scatter plot, just the data points.
We can use geom_smooth to conveniently make not only our line of best fit but also our confidence interval. We can give it the aesthetic, and then we'll give it a line color. We'll say our method is lm, for linear model. We'll say se equals True to use the standard error method, level 0.95 for a 95% confidence interval, and we'll fill it with red as well. There we go. So in red we have our line of best fit and our 95% confidence interval. The thing is, this doesn't give us our prediction interval, so we had to calculate that manually. We don't have a convenient built-in function in ggplot to make that prediction interval. But what we can do is reference our new_df, which has the prediction interval information in it. So we just have to plug in this new data frame here. We're not borrowing from the data frame that we put in ggplot; we have to give it this new data frame, and our aesthetic is now a little bit different. Our x is the x column in that data frame, and our y is obs_ci_lower, the lower limit of the prediction interval. Let's give this a color to differentiate it from the red: we'll make it blue, and let's say linetype equals dashed just to differentiate it a little more. We also have to, of course, give it the upper bound too, so we'll just copy and paste this line and change lower to upper. And there we go. There's our 95% prediction interval in blue. You'll note that it extends a bit beyond what we have in red here. That is because of this line here, where I said the range is 400 to 625, so the prediction interval goes from 400 to 625. When I use the geom_smooth command, it automatically limits itself to the limits of the data. But suppose I wanted to extend the line of best fit and/or the 95% confidence interval to the same limits as the prediction interval.
Instead of using geom_smooth, I can just use geom_line with new_df, because all that information is contained there too. I just need to give it the right y's, change the colors how I want, and so on and so forth. Okay. So that is how to do prediction and confidence intervals in Python. Thank you.