Hello everyone, welcome back to the session on ARIMA. Now we will discuss the moving average process, the second component of ARIMA. Remember, when we discussed the different moving average models for time series data, the simple moving average, the weighted moving average and the exponential moving average, the concept was different. There we used the older time series data, the actual data. But in the moving average process of ARIMA we will not use the actual data of the time series; we will use only the error terms. The moving average model uses the past forecast errors, the white noise terms, and weights on those past errors to forecast the future values of the series. This is the fundamental difference between the moving average process of ARIMA and the basic moving average models: here we do not use the past data, we use only the error terms. Now the question is how those error terms will be calculated and used in the moving average process of the ARIMA model. Let us understand that. We have completed the AR process. In the AR process we use the actual data, and we regress the data on its own past values; that is the basic autoregressive process. But here we do not do a regression; we use a simple weighted combination of the error terms. Now, how will we calculate the error terms? Let us understand with a basic example. Suppose you have time series data Y: Y1, Y2, Y3, Y4, Y5, Y6, Y7, Y8 and so on. We will not use this data directly to calculate the forecast of the MA process, whereas in the AR process, as we have just discussed, we used this data in the autoregressive model. So in this session let us focus on the MA calculation. First you calculate the average of the data.
Look at Y mean: take all the data and calculate the average, and you get Y mean. This mean will be used together with the error terms. And then what do you do? You can calculate the errors with whichever method you follow; any basic time series model can be used. Error 2, error 3, error 4, error 5, error 6, error 7, error 8 and so on: each error is the difference between a past forecast and the actual value for that period. These errors will be used to calculate the forecast of the MA process. Here, though, we will use a trick and find the moving average steps in an iterative process. Remember, we use only the error terms, and these errors are taken to be independent of each other and identically, normally distributed. So they are white noise. But the question is how we will use these errors in the formula. There are two ways. One is to calculate the errors with some earlier method: take the actual time series data, follow any method, and calculate the errors; then use those errors with the mean to calculate the forecast. You may consider only one previous error, that is one lag of error, or two previous errors, two lags of error, and take a weighted combination of them. You can take three previous errors with the mean; that is up to you. The optimum order of the MA process, the q value, we will decide later; we will discuss later how to select the order of an MA process.
But for the time being let us understand how the MA process works. Here we will use a different method, which is very interesting as well as more accurate. Let us see how the formula works for one lag, two lags or three lags. Suppose we would like to calculate the forecast with one lag. In that case the calculation is the mean value ȳ plus a weight, say γ, times the error: ŷ = ȳ + γ × error. This is your new forecast. For the next period the forecast is ȳ, as it is, plus γ times error 2, and so on. Suppose you want the forecast for y7: it is ȳ + γ × error 6. For y8 it is ȳ + γ × error 7. You drag the formula down, because you are making a one-lag forecast each time. So this is the forecast of the moving average process. Now suppose you would like to consider a lag of 2; how will you use the formula? You have to count two past errors in the weighted combination. The average ȳ you have already calculated. Now the forecast will be ȳ plus γ1 times one error plus γ2 times another error, so two past errors enter the forecast. The forecast therefore starts from the third period onward.
Effectively it starts from y3: the forecast is ȳ + γ1 × error 2 + γ2 × error 1. I write the immediate period first because the errors are counted from the most recent one backwards. So if you are at the 6th period, the forecast ŷ6 is ȳ + γ1 × error 5 + γ2 × error 4. In this way you can define the forecast model for two lags; the same logic applies. If you have to forecast y7, it is again ȳ plus the two weighted errors, and in that case you have to optimize γ1 and γ2. How the weights are optimized I will tell you later; we optimized weights with Solver in the earlier moving average models, and the same logic will be used here. With a single error, lag 1 only, there is only one γ value to optimize. Look at the combinations here; let me use another pen so that you can follow. These forecasts are the one-lag forecasts, q = 1. Now we are using q = 2, so a weighted combination of two errors is required and two weights will be optimized; for the time being we assume γ1 and γ2. Look at the formula when you drag it: what is ŷ7? It is ȳ plus γ1, the optimum weight, times the immediate past error, which is error 6, plus γ2 times the error one period older, error 5. Wherever you drag it the pattern holds: for y6, errors 5 and 4 have come; for y7, errors 6 and 5 are there.
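The lag pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name ma_forecast, the mean of 100 and the error values are hypothetical and are not taken from the TCS sheet.

```python
def ma_forecast(y_mean, weights, errors, t):
    """Forecast for period t: y_mean + sum over k of weights[k-1] * errors[t-k].

    weights[0] multiplies the immediate past error, weights[1] the one before it.
    errors maps a period number to the forecast error observed in that period.
    """
    q = len(weights)
    return y_mean + sum(weights[k - 1] * errors[t - k] for k in range(1, q + 1))


# Hypothetical numbers, for illustration only.
y_mean = 100.0
errors = {4: 2.0, 5: -1.0, 6: 3.0}
gamma = [0.5, 0.2]  # gamma_1 and gamma_2, assumed rather than optimized

y6_hat = ma_forecast(y_mean, gamma, errors, 6)  # uses error 5 and error 4
y7_hat = ma_forecast(y_mean, gamma, errors, 7)  # uses error 6 and error 5
print(y6_hat, y7_hat)
```

Passing a single weight, for example [0.5], gives the one-lag (q = 1) forecast with the same function.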
All of these are listed here; in Excel I will show you the calculation. When you drag it you find the general formula: ŷt = ȳ + γ1 × e(t−1) + γ2 × e(t−2), that is, the mean plus γ1 times the immediate past error plus γ2 times the error one period older. These are called lags of errors. Similarly, if you want the forecast with a combination of three errors, the forecast is ȳ plus γ1 times the immediate error, plus two more weights times the next two older errors. Suppose you want to forecast y10: the errors will be error 9, error 8 and error 7, and this weighted combination with the mean is your forecast for y10. You drag it. So this is the concept of the moving average process. Now remember one more point. In the AR process, of order 1 or 2 or whatever, you had the actual data y1, y2, y3, y4, and you took a combination through regression: for one lag, ŷt = α + β × y(t−1). Those are regression values, the intercept and the slope; α is an intercept. But here, and this is a very crucial point, you are not doing any regression. Here you are taking a weighted combination, and the γ values are the weights: a weighted combination of one lag, two lags, three lags and so on, depending on the order you select for the MA process, which we will discuss later. But remember, these are error terms, and the mean of the error terms is close to 0.
Therefore the mean term must be a large value, the level of the series, unlike the intercept of the AR process, and only then will you be able to make an actual forecast. Remember the TCS stock price example: the price is around 3500, while an error may be 23, maybe 20, minus 25, plus 25, that kind of thing. The errors alone cannot give you the forecast. So you need to add the weighted combination of one, two or three periods of error to the mean of the data, so that you can get an accurate forecast through the moving average process also. Remember, the AR process is different and the moving average process is different; you can use either of them depending on the data pattern and the requirement. So this is the MA process. Now let us understand it through an example. We will take the same TCS data and consider one lag of error for illustration, and you can extend the concept exactly as I have explained on this slide: if you consider a combination of two periods, you have to optimize two weights. But let us first understand the formula for a single weight, a single lag, q = 1. Generally people do not consider more than one or two lags, for the AR process or for the MA process, so the rest is just an extension. Let us understand the basic one-lag MA process. This is the formula: only one lag, this is the weight, and these are the errors. You can calculate these errors through any forecasting process, but as I mentioned, we will use a trick and calculate the errors and the forecast of the MA process together in an iterative process. Now, the first point: calculate the average of the data, the mean, ȳ. We had three months of data; take the average and store it.
Now you have to add the weighted error to this mean. How will we calculate the weight? Remember the Excel work in the earlier moving average models; we will use the same concept here. Look at the steps. Initially you assume a weight, say 0.5; this weight will be optimized later, and what the best weight is we will see through Solver. Now, what is the first forecast? Initially we assume that the residual, the error, is 0. If the error is 0, the initial forecast is the mean, 3541: ȳ + γ × error = ȳ + 0.5 × 0. This is the first forecast, and we put it in the next row, so it is the forecast for the next period. Then, using this forecast value, you calculate the next residual. Assuming 0 for the initial residual is sufficient to start the iterative process and carry the MA forecasting procedure through to the end. So initially you considered 0 and took the weighted combination, mean plus weight times error, to get the forecast. You write this forecast here, take the difference between the actual value and this forecast, and write that difference here. Now you have a new error, a residual. Again you apply the weighted combination formula, ȳ plus γ times this residual, and you get the forecast for the following period. Then you drag the process down. So it is an iterative process: error, forecast, error, forecast, with the weighted combination coming in at each step.
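The error, forecast, error, forecast loop described above can be sketched in Python as follows. It is a minimal sketch, not the Excel sheet itself: the function name ma1_iterative and the four data values are hypothetical, and the weight 0.5 is the assumed starting weight, not an optimized one.

```python
def ma1_iterative(actual, gamma):
    """One-lag MA forecasts built iteratively, starting from a residual of 0.

    The first period has no forecast; its residual is assumed to be 0.
    Each later forecast is mean + gamma * previous residual, and the new
    residual is actual - forecast.
    """
    y_mean = sum(actual) / len(actual)
    forecasts = [None]   # no forecast for the first period
    residuals = [0.0]    # assumed initial residual
    for y in actual[1:]:
        forecast = y_mean + gamma * residuals[-1]
        forecasts.append(forecast)
        residuals.append(y - forecast)
    # One step beyond the data, using the last residual.
    next_forecast = y_mean + gamma * residuals[-1]
    return y_mean, forecasts, residuals, next_forecast


# Hypothetical data, for illustration only.
actual = [10.0, 12.0, 11.0, 13.0]
y_mean, forecasts, residuals, next_forecast = ma1_iterative(actual, gamma=0.5)
print(y_mean, forecasts, residuals, next_forecast)
```

Because the first residual is 0, the first forecast is simply the mean, exactly as in the first forecast row described above.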
Now the question is: sir, how to get the forecast we have understood, but what is the best weight, so that we get the best forecast in future also? That we are going to discuss now. Drag the forecast down till the end and you get the whole forecast column; I will show you in Excel also. Now comes the optimization of the weight, which is the main part. How will we optimize the weight? Recall the RMSE calculation from the different time series models. Based on these errors, the residuals, you take the squares, then the mean squared error, and then the RMSE. Look at the steps I have mentioned here: take all the residuals, say 60 of them, square them, take the sum of squares, divide by 58 or however many data points there are, and then take the square root. That gives you the RMSE. It is the simple RMSE calculation; you can go back to that session and check. That RMSE you have to minimize, so that you get the best weight. That is it: you get the final weight, and the corresponding forecasts follow automatically, because the iterative search for the best combination is done by the software, by Solver itself. Let us understand how. You just have to optimize one cell, and the problem is a nonlinear one, because RMSE involves a square root. Let us see how it works. Here you can see the Excel sheet: we have the actual data, and we have taken the mean of the data; let me go back to the top so you can see the mean. So ȳ is there. The initial residual we have assumed to be 0.
The weight is here; initially keep the weight at whatever value, say 0.5. Now, what is the first forecast? It is the forecast for the second period, the immediate next period: the mean plus the weight times the residual. I can show you the calculation here also. It is the mean value plus the weight γ times the residual, the previous error, since we are using a one-lag combination. So the forecast is ŷt = ȳ + γ × ε(t−1); note that the error is at t − 1. That is the logic. For example, the forecast for y2 is ȳ + γ × error 1, and you drag this down; that is what we are doing here. So here is the forecast. You freeze the cell references for the mean and the weight and drag the formula, and the calculation is done for every row. Now that you have the forecast, what is the error for that period? The error is actual minus forecast. In the formula you will see references such as Table 7; that is because the TCS data was taken directly from the source site, so different table names, Table 6, Table 7, Table 16, appear in different illustrations. Do not be confused by that. So here you can see the forecast, and this is the forecast error. Now you have the new error, so you calculate the forecast for the next period: it is the mean plus the weight times the new residual.
That gives the forecast for the next period; again you calculate the residual, again you multiply it by the weight and add the mean, and you get the next forecast. We have dragged this process down and found the whole forecast; this is the forecast column. Now the question is the weight optimization: how will we optimize this weight? Look at the RMSE we have calculated: the mean of the squared errors, we had 61 data points, and then the square root. This is what we will minimize; the total error should be minimum for the chosen weight. So what is the weight, the γ value, that achieves this? Go to Data, as I discussed in detail in the error calculation session, and open Solver. You do not have to put any condition here; it is an unconstrained optimization. You can put a condition, but it does not change the output. Which cell do you want to optimize? The RMSE cell. Let me delete the old entry so that you can follow. This RMSE you have to minimize: the total error should be minimum, so it is a minimization problem, not a maximization problem. And which cell has to change? The γ value, the weight of the moving average model of order 1, one lag of error. You do not need to put any condition because it is unconstrained; you can put a range, say 0 to 1, if you want, but there is no need, and the system will find the best weight. Since it is only a weight on the error, you do not need to carry the constraint forward there. Remember it is a nonlinear problem. Just solve it; the weight can be positive or negative, anything.
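The same minimization can be sketched outside Excel. In the sketch below a plain grid search stands in for Solver, which is a swapped-in technique and not what the lecture uses. The function names ma1_rmse and best_gamma, the search range of −1 to 1, the grid size and the price values are all assumptions made for illustration.

```python
import math


def ma1_rmse(actual, gamma):
    """RMSE of the one-lag MA forecasts built iteratively from a residual of 0."""
    y_mean = sum(actual) / len(actual)
    prev_residual = 0.0          # assumed residual for the first period
    squared = []
    for y in actual[1:]:
        forecast = y_mean + gamma * prev_residual
        prev_residual = y - forecast
        squared.append(prev_residual ** 2)
    return math.sqrt(sum(squared) / len(squared))


def best_gamma(actual, lo=-1.0, hi=1.0, steps=2001):
    """Grid search for the weight with the smallest RMSE (a stand-in for Solver)."""
    candidates = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(candidates, key=lambda g: ma1_rmse(actual, g))


# Hypothetical prices, not the TCS data from the sheet.
actual = [3500.0, 3552.0, 3530.0, 3575.0, 3541.0, 3560.0, 3520.0, 3590.0]
gamma_star = best_gamma(actual)
print(gamma_star, ma1_rmse(actual, gamma_star), ma1_rmse(actual, 0.5))
```

A grid search only compares the weights it tries, so the result is accurate to the grid spacing; a numerical optimizer such as Solver refines the weight further.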
So we are not ticking that option; we just solve it. Done: we found 0.89 as the weight γ for this TCS data set, and here you can see the forecast values and the corresponding least RMSE. If we change the weight, we can see what happens. Put 0.2: now we have given 0.2 and we get a new set of forecasts. But if you want to optimize again, just select the cell and go to Data; the formula has already been set up, so just solve it, and again it comes back to 0.99. So this is the optimum weight for this particular data in the moving average process of ARIMA. This is the MA process; I believe it is clear to everybody. Now let us come back to the PPT. Let us understand the final prediction. The final prediction is the average value plus the weight times the previous error; you drag it, and you can calculate it for any period. This is the forecast. But the next question is: sir, how many error terms do we have to consider? Here you have considered order 1. If I would like order 2, then one more error term comes in with one more weight: γ1 times the immediate error, then γ2 times the error before it. How will you optimize that? The optimization works the same way: you add one more weight here and optimize both. In the RMSE minimization you select both variable cells, γ1 and γ2, so two parameters are optimized, and the forecast takes the weighted combination of the two older residuals. But the point is how to select the order, whether order 1 or order 2, what the best order is; that we will discuss through the ACF and the PACF. The weighted combination with two lags you can certainly use. Let me show you: suppose, with two lags, you have to forecast the next period, y(t+1).
It is 3541 plus γ1 times one error plus γ2 times the other; two errors have to be considered. Those two errors come in as γ1 × 51.25 plus γ2 × (−3.35). If those are periods 1, 2, 3, then this gives the forecast for period 4, so you get ŷ4. Now for the y5 forecast, the mean stays as it is, plus γ1 and γ2 times the errors; γ1 and γ2 are what you have to optimize here. So cells J3 and J4 are the two you optimize, because you are considering two past lags of error. For y5 you have to take these two errors: γ1 times the error 34 plus γ2 times the error, say, 51. Drag it down; it is the same logic, except that two errors are included because you have chosen an MA process of order 2, lag 2. The formula remains the same; only the optimum weights might change. You drag it and minimize the RMSE. It is the same formula extended by one more weight, so I do not want to discuss it further. What is important to discuss is the order selection: how big a q you should consider, one lag, two lags, three lags. In the AR process we discussed this selection in detail; here also you need to understand it. You need the ACF and the PACF, which we have discussed; you can refer to that session for what ACF is and what PACF is. ACF is nothing but the correlation of the actual data with its own lags; PACF is the partial correlation. If you draw the ACF graph, with the time lag on one axis and the ACF on the other, you know the graph: lag 1, lag 2, lag 3, lag 4, lag 5 and so on. And you can draw the PACF in the same way; you know how to calculate it. This is the partial autocorrelation: partial, meaning only the actual, direct impact of the older data on the current data. Suppose the time series is y1, y2, y3, y4, y5 and so on.
Then the impact of y3 on y5 is counted after removing the impact of y4; that is what partial means. ACF, by contrast, is the direct correlation. So ACF is, to some extent, not the right tool for the AR model, because the AR model is a real regression on your own older data. Remember, an AR model of order 2 is ŷt = α + β1 × y(t−1) + β2 × y(t−2). These are the actual older data of the time series, and you are combining them through regression, not through a simple weighted combination. Since you are doing a multiple regression, you need the actual impact of each lag, because all of them are treated as independent variables. Therefore you need the partial correlation, the actual correlation; otherwise multicollinearity will come in. We have discussed all of that. But in the MA process it goes the other way. Remember, in the AR process the ACF shows an exponential decay and the PACF has a clear cutoff: after the cutoff the values fall inside the band lines. So you can select the order, say p = 2, in the AR process, because the final p is selected from the PACF, which is the actual partial correlation and has a clear cutoff, while the ACF decays exponentially. In the moving average process it is the other way: the PACF has an exponential decay, but interestingly the ACF has a clear cutoff. The reason is that you do not need the PACF to find your final q; the final q can be selected from the ACF graph, because here you have the error terms in the MA model.
Since the errors are already independent, not related to each other, with mean close to 0, they are independent and identically distributed random variables. Therefore you do not need to go to the PACF; from the ACF you can directly read the pattern and select what your q will be. If you find two lags with clearly visible standing lines, positive or negative depending on your data set, then you select q = 2. If you find only one such lag, then you select q = 1. You do not need the PACF for this, but past experience with MA models says that the PACF will show an exponential decay, completely the other way round from AR, and the ACF will have a clear cutoff at 1 or 2. Only if this situation occurs do you select the MA model, with the error terms and the MA process I have explained so far. I believe it is clear to everybody. Now let me summarize the two processes with a couple of examples. We will discuss both the AR process and the MA process. Take the first example. In this correlogram I have put both the ACF and the PACF together in the same graph. The black bars represent the ACF and the white bars represent the PACF; just note that down. So the black standing lines represent the ACF, and here you can see they have an exponential decay. Look at the band lines here: they define the range, and once the values fall inside that range, that is the cutoff point. Your ACF has an exponential decay. But look at the PACF, the white one: there is only one positive standing line, only one lag, and after that the values are close to 0.
So there is a clear cutoff in the PACF. If the ACF has an exponential decay and the PACF has a clear cutoff, here with only one standing line, this is an AR model. If the correlogram looks like this, and you can draw it in software, as I discussed in the first session on ACF and PACF calculation, it will be an AR model of order 1, because you are getting just one standing line in the PACF. It can be negative also, as long as it is one clearly visible standing line beyond the band line. So yes, AR model of order 1. Now look at the second graph. Here again the black one is the ACF and the white one is the PACF. Look at the PACF: in an alternating manner, it decays exponentially. So the PACF has an exponential decay, but the ACF has a clear cutoff after two points. One of them is negative, and that can happen. The data could behave like that, coming in positive and then negative: for example a company's performance comes in positive this quarter and negative the next quarter, or an actor's one movie is a hit and the next movie is a flop, whatever example you take. Suppose the data pattern follows seasonality or some other such behaviour. In that case the correlation could be negative also; these are only correlations or partial correlations. And since it is the data's own past, you call it autocorrelation or partial autocorrelation. Now, in the ACF you can see two standing lines.
So the ACF has a clear cutoff after two lags, while the PACF has an exponential decay. Which model should you select in that case? As we have just discussed, you should select the MA model. For an MA model the ACF has a significant fall, suddenly there is a cutoff and it goes close to zero, but the PACF decays exponentially. Therefore for this type of data you select the MA model of order 2, because you can see two clearly visible standing lines in the ACF graph. So q = 2: MA model of order 2. Now come to the third graph. Here you can see all the values are negatively correlated; that does not matter, it can be so. In this case the ACF has a single standing line and then a cutoff, and the PACF has an exponential decay, so here also you need to select the MA model, because the error terms are saying that the MA process is more suitable for this data. The ACF has a single clearly visible standing line, on the negative side, and after that it is cut off, clearly close to 0. So you do not need more than one error, just one. The order is 1. What is the model in that case? It is ŷt = ȳ + γ × ε(t−1), that is it, because you have taken only one error. And what is the model for the order-2 case? It is ŷt = ȳ + γ1 × ε(t−1) + γ2 × ε(t−2); this way you define the model with two lags of error. And the first graph is an AR model, because only one standing line is visible in the PACF.
So it is an AR model of order 1: ŷt = α + β × y(t−1), a simple first-order regression. This one is an MA process with two error terms, because two standing lines are visible in the ACF graph and the PACF has an exponential decay. This one is an MA process of order 1, because only one standing line is visible in the ACF graph and the PACF has an exponential decay, while the ACF has a clear cutoff, a sudden fall. Beyond that lag there is no impact, so older errors will not affect your current forecast; one period of error, combined with the mean, is sufficient. Now look at the last case. Here the ACF has an exponential decay, but the PACF has a cutoff. Which model to select? Again you select the AR model, because the PACF has one clearly visible standing line and after that it is close to 0. We do not want to take more than one order, so it is an AR model of order 1. The forecast is α + β × y(t−1), a regression, because in the AR model you are doing a direct regression, an autoregression. That follows because the PACF has a clear cutoff and the ACF has an exponential decay. So this is the illustration of the correlogram with ACF and PACF. Now, for any data, if you see this type of graph, in Python or Excel or any other tool, I believe you will be able to tell whether it indicates an AR process or an MA process. This is the most important part: how to select the order of an AR model or an MA model. ARMA is not difficult; I will discuss that after the break. But you will find that most data follow either AR or MA.
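The reading of the correlogram described above can be tried out with a small Python sketch. Everything here is an assumption made for illustration: the function names, the simulated MA(1) series with weight 0.8, and the band of 1.96 divided by the square root of n, which is a common approximate 95% band and not a value given in the lecture. The PACF is computed with the Durbin-Levinson recursion.

```python
import math
import random


def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]


def pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    out, phi_prev = [1.0], []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
        phi_prev = phi
    return out


def lags_before_cutoff(values, n, z=1.96):
    """Count consecutive lags, from lag 1, whose bars stand outside the band."""
    band = z / math.sqrt(n)
    count = 0
    for v in values[1:]:
        if abs(v) <= band:
            break
        count += 1
    return count


# Simulated MA(1) data (hypothetical): y_t = 50 + e_t + 0.8 * e_(t-1)
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(2001)]
series = [50.0 + noise[t] + 0.8 * noise[t - 1] for t in range(1, 2001)]

rho = acf(series, 6)
phi = pacf(series, 6)
q_suggested = lags_before_cutoff(rho, len(series))
print([round(r, 3) for r in rho])
print([round(p, 3) for p in phi])
print(q_suggested)
```

For an MA series the count taken from the ACF suggests q; for an AR series the same count taken from the PACF suggests p. A real series is noisier than this simulation, so the plot should still be inspected by eye.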
So that is how to select the order of a model: p for the AR process, q for the MA process. For the AR process the lag is about the actual older data: how many older values you want to consider in the autoregressive model, after which you finalize and run the model. For the MA process it is how many previous errors you want to combine with the mean, as a weighted combination, to arrive at the forecast through the error terms. That is what has been illustrated here. I believe it is clear to everybody how to read the correlogram, the ACF and the PACF, for the AR process and the MA process, and how to select the order of an AR or MA model. Now let us take a break; after the break we will discuss the ARMA process.