In this video, I'm going to finish up this series covering NGBoost. I want to keep doing probabilistic forecasting, and I'll probably use some different libraries, but this is going to be the last video on NGBoost. I went ahead and finished the script for this series. In this video, as opposed to going line by line like I did in the previous video, I just want to go through some of the updates that I've made.

The first thing I'm going to do is open up the file. To show some of the changes, I'm going to open up Fugitive with :G, go down to line six, type dv, and close this window. The screen on the left is what used to be there, and the screen on the right has some of the updates that I've made. Line 15 has some collapsed lines of code, so I'm going to open that up; it might look odd if it's unfamiliar. With that expanded, I want to touch on one main line, and that's line eight. In the previous video, all we had imported was the exponential distribution. In this updated script, I'm also importing the normal distribution. That's so I can compare the results of NGBoost with a naive forecasting method, and I'll talk about that a little more below. Most of the changes are just additions to the end of the script, so from about line 64 on is all new stuff.

I'm going to talk about some of these changes, so I'm going to close this up and start on line 64. You know what? Let me actually run this line by line: import the libraries and go chunk by chunk. All right. What just happened there is that the NGBoost regressor was trained. We also have a distribution object, and that's about where we left off in the last video.

Now for the new stuff. Right here is an objective function for distributional methods. This is CRPS, the continuous ranked probability score. I'm not going to go into the details here.
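As a rough sketch of what a sample-based CRPS and the scoring loop could look like: this is not the script from the video. The function name `crps_from_samples`, the scale values, and the actuals are all made up for illustration, and it uses only the Python standard library. It relies on the identity CRPS = E|X - y| - 0.5 E|X - X'|, which avoids the integral.

```python
import random


def crps_from_samples(y, samples):
    """Sample-based CRPS: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    xs = sorted(samples)
    n = len(xs)
    # Average distance between the draws and the observed value.
    term_obs = sum(abs(x - y) for x in xs) / n
    # Sum of |x_i - x_j| over all pairs, computed from the sorted draws
    # so the cost is O(n log n) instead of O(n^2).
    pair_sum = 2.0 * sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    term_spread = pair_sum / (n * n)
    return term_obs - 0.5 * term_spread


rng = random.Random(42)  # fixed seed so the draws are reproducible

# Hypothetical values: one predicted exponential scale and one actual
# per forecast point. In the video these come from the fitted model.
scales = [900.0, 1200.0, 1500.0]
actuals = [1000.0, 1100.0, 1700.0]

scores = []
for scale, y in zip(scales, actuals):
    # expovariate takes a rate, which is 1 / scale for the exponential.
    draws = [rng.expovariate(1.0 / scale) for _ in range(5000)]
    scores.append(crps_from_samples(y, draws))

mean_crps = sum(scores) / len(scores)
print(mean_crps)
```

The sorted-sample trick is one way to keep 5,000 draws per point cheap; a plain double loop over all pairs gives the same number, just more slowly.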
Most of the equations you see for CRPS have an integral, and so do certain implementations. This specific implementation is a faster version, so it won't look like the version that has the integral. It's fast and fairly straightforward.

The next thing is setting a random seed. What's happening in this for loop is that I need to make some draws from the exponential distribution using the values that are stored in this y distribution object. We're getting the param called scale, which is the scale of the exponential distribution, and we're pulling 5,000 draws from that distribution. So for each point in the future, we're going to have 5,000 possible scenarios. The rest of this, I guess line 77, is just grabbing the single actual value. Then we're passing the single actual value and the distribution to CRPS, and finally we append these CRPS scores together. So here is our array of CRPS scores, and then we're calculating the mean. The mean here was 3,783.

We could have just finished here and been done with it, but I wanted to compare how this stacked up against a seasonal naive forecast, and that's what the rest of the script is: calculating the seasonal naive forecast. The idea is that, looking at this number on its own, I'm not sure if it's good or bad, slightly better than a benchmark or a lot better than a benchmark. So I wanted to see what we could do with a seasonal naive forecast.

The first thing is, what is a seasonal naive forecast, specifically as it relates to distributions? There's this nice online textbook called Forecasting: Principles and Practice, now in its third edition, by Rob Hyndman. He has a section in his book on distributional forecasts, and he relates this back to some of the benchmark methods. He has a table here with the benchmark methods and the standard deviation that you should use for each. So first of all, seasonal naive.
A seasonal naive forecast uses the most recent available week of observations and repeats those values forward. So next Monday is going to look like your last Monday, and all your Fridays are going to look like last Friday. This is a pretty good method, and we saw in the last video that lags are effective for forecasting a highly seasonal time series.

The thing with going distributional, though, is that you might think just using the standard deviation is sufficient. The catch is that the further into the future we go, the more our uncertainty should increase. So we scale the standard deviation by the square root of k plus 1. That's all I'm going to cover here. That's all there is to the seasonal naive forecast, and this is going to help us get our distribution. You can pass this to a normal distribution. In his packages, he'll also do bootstrapped residuals, and I believe I've seen in his code that he uses a t distribution. So I can't say exactly what's right. You just have to test what works for your use case. In our situation, we went the simple route and just used a normal distribution.

To get a seasonal naive forecast, we're going to get the max train date, because what we really want is the most recent week. That's what these chunks of code do, from line 87 down to 97. I'll talk a little about what we're seeing in line 97. We have a state ID, a day, lag sales, and a standard deviation for California, for Texas, and for Wisconsin. This is the last week in our training set, and that's it for this DF last data frame. I do want to make the point that this standard deviation isn't the one we want yet. It's a constant standard deviation; it's not increasing with time. So we're going to work on that, and that's what this DF S-naive data frame is. In this data frame, we have lag sales. We have standard deviation.
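For reference, the seasonal naive scaling rule from that textbook table can be written out as below. Here h is the forecast horizon, m is the seasonal period (7 if we assume daily data with a weekly pattern, which is an assumption on my part), and sigma hat is the residual standard deviation.

```latex
\hat{\sigma}_h = \hat{\sigma}\,\sqrt{k + 1},
\qquad
k = \left\lfloor \frac{h - 1}{m} \right\rfloor
```

With m = 7, every day in the first forecast week has k = 0, every day in the second week has k = 1, and so on.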
We have k, which is the same k term that we see in this formula, and that gives us what I called SD hat, which is this term here. If you look at the code, it looks just like the formula: standard deviation times the square root of k plus 1. And k increases as we move out each week. So for the first Monday the factor is just 1, for the second Monday it's the square root of 2, and for the third Monday it's the square root of 3. So our distribution gets wider as we go out.

OK, now it's time to pull our draws for the S-naive forecast. What we're doing here is taking our S-naive lag sales, passing in our S-naive standard deviation, and doing 5,000 draws, just like we did with NGBoost. We have to grab that first value for the CRPS score. CRPS takes the actuals and the distribution and returns a score, and we're saving that into a CRPS array.

Whoops. All right, it looks like VEC forecast is not defined. Let me fix that. Array CRPS naive is not defined. So you get to see some live editing here. Let's see. Here we go: we'll start with an empty array and fill it up. All right, here we get 3,412. What did we get for NGBoost? I can't remember, so I'll label these. This is for NGBoost, and this is for S-naive.

For CRPS, lower is better, so NGBoost is doing quite a bit worse than the seasonal naive forecast. I think there are a couple of reasons for this. I had noted in a comment that what's probably happening is that NGBoost needs some tuning. In fact, that's probably the most obvious thing. We didn't do any tuning. We just took the method straight out of the box, crossed our fingers, and hoped it worked. That's generally not what you do in real life. So to improve the performance of NGBoost, we'd probably want to look at hyperparameter tuning.
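Going back to the seasonal naive benchmark for a moment, here is a rough sketch of the scaling, the draws, and the scoring for a single series. It is not the script from the video: the sales numbers, the single standard deviation, the two-week horizon, and every name in it are made up for illustration, and it sticks to the standard library.

```python
import math
import random


def crps_from_samples(y, samples):
    """Sample-based CRPS: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    xs = sorted(samples)
    n = len(xs)
    term_obs = sum(abs(x - y) for x in xs) / n
    # Pairwise |x_i - x_j| sum from the sorted draws, O(n log n).
    pair_sum = 2.0 * sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return term_obs - 0.5 * pair_sum / (n * n)


M = 7          # seasonal period, assuming daily data with a weekly pattern
HORIZON = 14   # hypothetical: forecast two weeks ahead
N_DRAWS = 5000

# Hypothetical last week of training sales (the "lag sales") and a
# hypothetical residual standard deviation for one series.
last_week = [1000.0, 950.0, 980.0, 1020.0, 1200.0, 1500.0, 1400.0]
sd = 120.0

# Hypothetical actuals for the forecast window.
actuals = [x * 1.05 for x in last_week] * 2

rng = random.Random(42)
sd_hats = []
scores = []
for h in range(1, HORIZON + 1):
    k = (h - 1) // M                  # completed seasons before horizon h
    sd_hat = sd * math.sqrt(k + 1)    # uncertainty widens each week out
    point = last_week[(h - 1) % M]    # repeat the same weekday's value
    draws = [rng.gauss(point, sd_hat) for _ in range(N_DRAWS)]
    sd_hats.append(sd_hat)
    scores.append(crps_from_samples(actuals[h - 1], draws))

mean_crps_snaive = sum(scores) / len(scores)
print(mean_crps_snaive)
```

Scoring both models with the same function and the same number of draws is what makes the two mean CRPS values comparable.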
We could also get a little more clever with our cross-validation and how we feed that into the boosting model. I also just wanted to point out that it's always a great idea to run a benchmark and see where things fall relative to it. Sometimes, not always, those benchmarks can be tough to beat. But with enough data, enough features, and enough computation, you can usually do as well or better.

I was going to try to do some plotting here, but what I think would be fun is this: if you're interested in plotting these distributional forecasts, go ahead and submit something. I've tried to think of ways to do this, and the best thing I could think of is that if you want to add to the script or modify it in any way, just submit a pull request. So go to the XVZF GitHub where I store these scripts. If you've got a plot, or if you want to make a modification, maybe some hyperparameter tuning to make NGBoost beat the S-naive forecast, go ahead and make a pull request. That's all I have for this video. Thanks for watching.