In previous videos I talked about installing CmdStan and about generated quantities, so that was kind of a getting started with CmdStan. Here I'm actually fitting a model and generating predictions, so this should be somewhat similar to what you would see in the wild with CmdStan and an actual model.

Starting with the data: the y here is a time series. We're just going to pretend someone handed us a time series. It's the values 22, 25, all the way to the end, which is 29. We'll assume that 29 is the most recent date. n is the number of values in y, so that's 11. Location and scale are parameters to a prior distribution, and I'll talk about those in just a second. We can ignore them for now.

Going into the model, line 23: what I'm doing here is forecasting. f is a vector of values from 1 to n, so this is looking 11 days into the future, where each value is one step further out from today. f of 1 is tomorrow, f of 2 is two days from now, f of 3 is three days from now. If I look at the very inside, I'm using a normal random number generator, that's what rng stands for, where the mean is 3.367 and the sigma is a parameter that's fit above, multiplied by the square root of j. That already seems a little bit confusing, but I can break it down. 3.367 is log of 29.
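The forecasting step just described can be sketched outside of Stan. This is a minimal Python stand-in, not the model from the video: the function name forecast_draw and the value sigma = 0.2 are assumptions for illustration. In the video sigma is fitted by Stan and differs from draw to draw; here it is held fixed.

```python
import math
import random
import statistics

def forecast_draw(last_value, sigma, horizon, rng):
    """One simulated path: f[j] = exp(Normal(log(last_value), sigma * sqrt(j)))."""
    mu = math.log(last_value)  # log(29) is about 3.367
    return [math.exp(rng.gauss(mu, sigma * math.sqrt(j)))
            for j in range(1, horizon + 1)]

rng = random.Random(0)   # seeded so the run is repeatable
sigma = 0.2              # hypothetical value; the video fits sigma instead
draws = [forecast_draw(29.0, sigma, 11, rng) for _ in range(2000)]

# Compare the quartiles one step out and eleven steps out.
for j in (1, 11):
    step = [d[j - 1] for d in draws]
    p25, p50, p75 = statistics.quantiles(step, n=4)
    print(f"f[{j}]: p25={p25:.1f} p50={p50:.1f} p75={p75:.1f}")
```

Because every draw passes through exp, each simulated value is positive, and the gap between p25 and p75 is larger at step 11 than at step 1.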
So why log of 29? This model is a random walk forecast. A random walk forecast just takes the last value in the time series and assumes that all future values will be just like that last value. What we do on top of that is scale the variance for every step out. So for tomorrow the variance is fairly small, but 11 days from now the variance will be large. The idea is that we might have a good idea of what tomorrow will look like, but the further future is noisier. That's what sigma times the square root of j is supposed to do: j is the number of days out, and as we get further into the future the prediction intervals get wider. That's all there is to a random walk forecast. It sounds really simple, but in application the random walk forecast is sometimes the best forecast, so even though this is a simple model, it has its own merits and is worth at least being aware of.

Also, all of this is wrapped in an exponential. The idea is that any time I'm given positive data, I'll log it, model on the log data, generate forecasts in log terms, and then exponentiate. What this does is constrain the forecasts to always be positive. I just went with a simple log. It's usually a little bit safer to do log plus one, because sometimes you might have zeros in your time series, but that's not happening here.
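For reference, the log plus one variant mentioned above might look like this in Python. It is only a sketch: the helper names and the sample values are made up, and the video itself uses a plain log.

```python
import math

def to_model_scale(y):
    """log(1 + y): defined at y = 0, unlike log(y)."""
    return [math.log1p(v) for v in y]

def to_data_scale(z):
    """Inverse of log(1 + y), that is exp(z) - 1."""
    return [math.expm1(v) for v in z]

series = [0.0, 3.0, 29.0]   # made-up values, including a zero
restored = to_data_scale(to_model_scale(series))
print(restored)
```

One thing to keep in mind with this variant: the back-transform exp(z) - 1 is bounded below by -1 rather than 0, so a forecast drawn on the log plus one scale can come back slightly negative.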
There are no zeros, so I didn't do a log plus one. There are more advanced transformations too. If you wanted one, normally you would do something like a Box-Cox, but we're just going to keep it simple but still effective.

So this is a random walk forecast, and again, all a random walk forecast does is take the last day in the time series and generate a normal distribution around it, and for every step out we make the variance a little bit larger.

As far as the model block goes, this looks complicated, but all it's doing is trying to find reasonable values for sigma. Sigma comes from a normal distribution with a location and a scale. In the documentation, these values are normally hard-coded, and if you wanted to change them, that would require recompiling the model. So what I'm doing here is treating data.json as a config and defining the scale and location there. I don't see a reason not to do this. It's a good way to decouple the parameters so you don't have to recompile the code. So this is a normal with a mean of 0 and a scale of 0.5, and that's what we're using as the prior for sigma.

Then what the model is doing is using a normal distribution, where the mean is the previous value in the time series and the sigma is going to be fit, to try to predict the current value in the time series. So we're assuming sigma has a normal prior with a mean of 0 and a scale of 0.5, and all this for loop is saying is: we're going to use a normal distribution centered on the previous value in the time series to predict the current value, and we're going to simulate which sigmas work well for making that prediction. That's all there is to the model block. Again, the main idea is that we're just trying to find reasonable values for sigma to help make this
model happen. In terms of the transformed parameters, we're taking in the time series and logging it, so that's where y_log comes from. The main parameter is sigma; again, this is what's being used in the model. You'll see that its lower bound is set to zero. So where it looks like we're using a normal distribution on line 15, if we look up in the parameters we find out it's not really a normal distribution, it's a half-normal distribution. That just means we're taking the normal distribution, cutting it at zero, and keeping the positive values. So we're using the positive side of the normal distribution as the prior for sigma.

The data block: since we already went over data.json, this should be pretty straightforward. n is an integer, the number of values in the time series. y is a vector of length n. The scale is a real number, in our case 0.5. The location is also a real. We still define it here just in case there's some reason to change it. And that's all there is to this model.

Before the video I already compiled this, so I don't need to recompile it. We're just going to go ahead and run it. Here we're using the model executable with some of these keywords: sample, then output, giving the output a file name, which is o.csv, and the data file is called data.json.

Let's take a look at the output. Before I hit enter, I can talk a little bit about this string of commands. Some of this requires a little familiarity with how Stan outputs CSVs, but the main idea is that I'm taking out the comments and taking out all the parameters that Stan itself generates. Whenever Stan generates parameters, they're suffixed with underscore underscore, so there's a regular expression there to exclude those. I pipe that into another command saying, hey, I don't even really care what sigma is. I did define the sigma parameter to help fit the model, but right now
I'm not really concerned about inference. I just want the forecast, and all the forecast columns start with f dot, where f.1 is tomorrow, f.2 is the next day, f.3 is the day after that. I'm piping it to a command in Miller that takes just those forecasted values from the CSV, and then I do a little bit of reshaping so I can feed it to a command-line utility that generates box plots in the terminal and also gives a table with a five-number summary.

All reshape is doing is converting the data from wide to long format. I'm taking all the columns, which are all the forecasts, and converting them into one column of keys and one of values, where the keys are f.1, f.2, f.3, f.4 and so on, and the values are the predictions for those forecasts. And since we're using Stan and generating samples, every forecast step is going to have hundreds or thousands of predictions. That's kind of the beauty of using Stan: we don't get just one point forecast, we get a whole distribution of forecasts. So let's take a look at this.

All right, I'm going to ignore the box plots for now. The box plots take into account the max value, and the max value isn't really what we're interested in here.
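The post-processing described above (drop the comments, keep only the forecast columns, reshape wide to long, summarize per step) can be sketched in Python as well. This is a rough stand-in for the shell pipeline in the video, not a reproduction of it: the CSV text below is made up, and the function names are assumptions. As a simplification, it keeps only columns whose names start with f. rather than excluding the underscore-underscore columns and sigma in separate steps.

```python
import csv
import io
import statistics
from collections import defaultdict

# Made-up stand-in for a Stan CSV: '#' comment lines, internal columns
# ending in "__", the sigma parameter, and forecast columns f.1 and f.2.
RAW = """\
# model = example
lp__,accept_stat__,sigma,f.1,f.2
-1.0,0.9,0.20,28.0,31.0
-1.2,0.8,0.25,30.5,22.0
# Adaptation terminated
-0.9,1.0,0.18,27.0,35.0
-1.1,0.9,0.22,33.0,29.0
"""

def forecasts_long(text):
    """Drop comments and non-forecast columns, then reshape wide to long."""
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    rows = csv.DictReader(io.StringIO("\n".join(lines)))
    return [(key, float(val))
            for row in rows
            for key, val in row.items()
            if key.startswith("f.")]

def summarize(long_rows):
    """Five-number summary per key: (min, p25, p50, p75, max)."""
    groups = defaultdict(list)
    for key, val in long_rows:
        groups[key].append(val)
    out = {}
    for key, vals in groups.items():
        p25, p50, p75 = statistics.quantiles(vals, n=4, method="inclusive")
        out[key] = (min(vals), p25, p50, p75, max(vals))
    return out

long_rows = forecasts_long(RAW)
for key, stats in summarize(long_rows).items():
    print(key, stats)
```

Each row of long_rows is one (key, value) pair, which is the long format the box plot utility in the video expects.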
We're more interested in what's happening around the middle, so I'm going to be looking at the columns p25, p50, and p75. For p50, you'll see that the values pretty much always stay close to the last value in the time series, which is 29, all the way from the beginning to the end. What's interesting, though, is that for p25 the first value, f.1, is 24, and that goes all the way down to 15 by the end. For the upper bound, p75, we start at 34 and that gets very wide, up to 53. So as time goes on, our prediction intervals get wider. Again, the box plots aren't really showing that, because every once in a while we just happen to sample a really large number. But overall the prediction intervals are getting wider, and that's exactly what we want in a random walk forecast: we want the center to stay the same, but we want the uncertainty to grow over time.

One other nice thing is that the minimum values never go below zero. Again, that was the purpose of logging the time series, modeling on the logged series, and then running the forecasts through an exponential.

And that's it. That's all there is to this. CmdStan is a very effective tool for doing real models and real forecasts. Thanks for watching.