Hello everyone, and welcome to the course on business forecasting. Today we will discuss the components of a time series and the measures of forecast accuracy. In machine learning and predictive analytics generally, when you analyze time series data you need to understand how the data behave: their pattern and their trend. A time series can contain several types of components. For example, there can be a trend, either an uptrend or a downtrend, and there can be seasonality; in India, the sales or product demand of most companies follow a seasonal pattern. So seasonality is one such component, and today we will go through the components that exist in time series data.
At the same time, when you make a prediction with any time series model, you need to measure the accuracy of your forecast: whether the prediction is good or not, and how it compares with another model, a competitor's model, or a friend's model. For that you need a measure of forecast accuracy. There are many such measures, such as MAD (mean absolute deviation), RMSE (root mean square error), the mean absolute percentage deviation, and the mean square error. We will study these measures along with the components of time series in this session.
This slide shows the types of models we are going to cover throughout this course on business forecasting. Many more aspects will come up in different sessions, but overall the techniques are qualitative methods, time series methods, causal or regression models, and simulation models. We have already discussed qualitative methods in the previous session.
Once the components of time series and the measures of accuracy are covered today, we will move on to the different types of time series models. So let us discuss the components of time series and the measures of accuracy that exist in the literature, which we need in order to judge which model performs best compared with the others. Apart from qualitative methods, the time series models, causal models, and simulation models will all be covered in this course.
Let us start with a short recap of the previous sessions. The qualitative aspects I have already discussed, so we are not going to discuss them today. In time series forecasting, the data are past data, or historical data. They can be quarterly, monthly, weekly, daily, or yearly. Suppose you want to do a population study and make a forecast, or you want to look at a company's sales on a yearly basis to see how its performance is going; in that case you may need yearly data. In general, though, in India and around the world, data are captured on a quarterly or monthly basis. In India especially, company earnings, profit growth, and sales are reported on a quarterly basis. We will come back to that when we reach seasonality analysis. The point here is that the data you collect can be of different types.
When the data are of different types, they may have a different nature or a different pattern. We need to study how these patterns can be classified, and that is what we are going to discuss. There can be a trend, and to some extent there can be a relationship with other parameters, as in regression analysis, where there is a causal relationship between independent variables and a dependent variable; you need to take that into account in your analysis and your forecast as well. For time series data, you need to see what pattern the data have shown in the past and how they may behave in the future, and that link needs to be captured well. What is the goal of time series analysis? The goal is to understand the data, remove the anomalies, clean the data, and make a better forecast in the context of the business or industry problem you have in mind.
Data can be cross-sectional or time series. With cross-sectional data, you collect data for different variables at the same time period. The time is fixed: suppose you are measuring in the month of January, or on a particular day. Say you are recording the temperature in different cities, such as Bombay, Delhi, Calcutta, Chennai, and Bangalore, all at the same time period. There may be one variable or many variables; in this example you can also see humidity and wind speed.
There are many more examples I could show you where there are many variables and you collect data on them from different locations at the same time period. With time series data, on the other hand, we focus on a single variable, and we collect data on that variable over a period of time: January, February, March, April, May, June, and so on, or the first quarter, second quarter, third quarter, fourth quarter, and then the fifth and sixth quarters, which means you have moved into the next year. Here is an example of time series data: profit (it could equally be demand) for different years, what the profit was in 2001, what it was in 2002, and so on. This is called time series data: for a single variable, you collect the data, study how it behaved in the past, and make a forecast for the future. You should understand which type of data you have: cross-sectional data, panel data (a mix of the two), or time series data. Today we will focus on time series data and the analysis of their behavior, that is, component analysis.
One more thing you should know for understanding the data: time series data can in general be classified into two categories. Data can also be classified in other ways, for example into primary and secondary data, which I discussed briefly in the data decision-making session, but let us focus on today's session.
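To make the difference between cross-sectional and time series data concrete, here is a minimal Python sketch of the two data shapes. All values and the names `cross_sectional`, `profit_by_year`, and `is_time_ordered` are made up for illustration; they are not the numbers on the slide.

```python
# Cross-sectional data: several variables, several cities, ONE fixed point in time.
# All readings below are invented for illustration.
cross_sectional = {
    "Bombay":    {"temperature_c": 31.0, "humidity_pct": 70, "wind_kmph": 12},
    "Delhi":     {"temperature_c": 14.0, "humidity_pct": 55, "wind_kmph": 8},
    "Calcutta":  {"temperature_c": 22.0, "humidity_pct": 65, "wind_kmph": 10},
    "Chennai":   {"temperature_c": 29.0, "humidity_pct": 75, "wind_kmph": 14},
    "Bangalore": {"temperature_c": 24.0, "humidity_pct": 60, "wind_kmph": 9},
}

# Time series data: ONE variable (profit) observed over successive years.
profit_by_year = {2001: 120, 2002: 135, 2003: 128, 2004: 150, 2005: 162}


def is_time_ordered(series):
    """Return True if the periods of a time series are in ascending order."""
    periods = list(series)
    return periods == sorted(periods)


cities = list(cross_sectional)
variables = sorted({name for row in cross_sectional.values() for name in row})

print("Cross-sectional:", len(cities), "cities x", len(variables), "variables, one time point")
print("Time series:", len(profit_by_year), "periods of one variable, ordered:",
      is_time_ordered(profit_by_year))
```

The point of the sketch is only the shape: in the first structure time is fixed and the rows are locations, while in the second structure the variable is fixed and the rows are time periods, so their order matters.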
Time series data can be stationary or non-stationary. What is stationary data? Stationary data are, to some extent, stable data: over a period of time the mean and variance of the data do not change; they remain steady. With such data it is easier to study the past behavior and easier to make a forecast for the future, and the forecast accuracy, which I will show you today, also tends to remain steady and quite high. With non-stationary data, the data fluctuate too much over time. If you take one block of the data and measure its mean and standard deviation, they may change in the next block.
Let me show you an example. Take this stationary data. If you take this block and calculate the mean and standard deviation, and then take this other block and calculate them again, they will remain almost the same. So this is stationary data, and it is easier to forecast. Now look at the non-stationary data. Calculate the mean of this block: suppose it falls here, with its corresponding standard deviation. Now take the next block and calculate the mean: suppose it comes here. You can see that the mean has shifted. So you do not know where the data will go next; it could go this way or that way, it could be anything.
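The block comparison just described can be tried out with a small sketch. The two series, the block size of four, and the helper name `block_stats` are assumptions made for illustration. This is only an informal check that mirrors the picture on the slide, not a formal statistical test of stationarity.

```python
from statistics import mean, pstdev


def block_stats(series, block_size):
    """Return (mean, standard deviation) for consecutive non-overlapping blocks."""
    stats = []
    for start in range(0, len(series) - block_size + 1, block_size):
        block = series[start:start + block_size]
        stats.append((mean(block), pstdev(block)))
    return stats


# Made-up data: one series hovering around 50, one drifting upward.
stationary = [50, 52, 49, 51, 50, 48, 51, 50, 49, 52, 50, 51]
non_stationary = [50, 52, 55, 57, 60, 63, 66, 70, 73, 77, 80, 85]

for name, series in [("stationary", stationary), ("non-stationary", non_stationary)]:
    print(name)
    for block_mean, block_std in block_stats(series, block_size=4):
        print(f"  block mean = {block_mean:.2f}, block std = {block_std:.2f}")
```

For the first series the block means stay close to each other; for the second, the mean shifts from block to block, which is the behavior described for non-stationary data.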
So what I am trying to say is that non-stationary data are, to some extent, unstable: there is a lot of irregularity, and randomness comes into the picture, so it is very difficult to make a forecast. Does that mean we cannot forecast non-stationary data? In practice, many examples fall under non-stationary data; even stock prices, to some extent, fall into this category. So how do we handle this type of data? There are different models, which we will discuss. There are ARIMA models and the differencing process; you can use the Dickey-Fuller test to check for stationarity, convert the non-stationary data into stationary data, and then bring in an AR or ARIMA model. There are also Holt-Winters models, which take care of trend and seasonality, and the decomposition method. We will discuss all these models, which can handle non-stationary data to some extent. Stationary data are quite popular in time series analysis because moving average and exponential smoothing methods are easy to apply, suit stationary data well, and make a good forecast. We will discuss both types of data, stationary and non-stationary, and the appropriate time series models or methods for each.
Now let us come to the components; I have already given the introduction. Time series data are in general classified into four categories. There may be more aspects; for example, the randomness or irregularity in the data may itself follow different patterns.
So in general we classify the data into four categories: trend, cyclical, seasonality, and randomness or irregularity. I will discuss all four with examples of how time series data behave over time and how we can forecast the future. All of this depends on how you calculate the measure of accuracy, or the error. If the data have a trend, you bring in the corresponding model, run it, and then look at the forecast and at the measure of accuracy for that trending data. This is how you make a forecast for the future.
Let us look at the time frame of the data. You can have one month of data, one year of data, ten years of data, or just a couple of days of data. Suppose you want to forecast a stock price for the next two or three days, or the next ten days, and you have the last ten days of data. With only ten days of data, you can consider it short-term data, and you want to forecast the next two, three, or seven days to study the trend. You can call that a short-term prediction: a forecast for one or two periods, or maybe five or six periods. Think also of intraday trading, where you want to forecast what could happen in the next one or two hours because you want to capture the swing in the stock price.
That is a very short-term prediction. There are different models, which we will study, depending on the period you want to forecast: two periods, three periods, short term, medium term, or long term. Suppose you have quarterly data. As I mentioned, in India data are generally reported on a quarterly basis by companies in every sector: chemicals, paints, automobiles, IT, services, financials, and consumption or FMCG. When you have quarterly data, you want to forecast at least one year, that is, four quarters. A prediction for four to five quarters, or maybe the next two years, can be considered a medium-term prediction.
There are also cases where you need a long-range forecast. Suppose you want to forecast Indian GDP, or population growth for 2030. That is a long-term prediction, and for it you also need historical data over a long duration. Suppose you want to predict the trend in the number of Indian passengers traveling by air. You cannot do this with one or two years of data; those years might include the COVID period, when people were not using airlines.
That randomness or irregularity you have to remove, and then you have to go further back into the past data. So you may need a longer period of data, and once you have it you can make a better prediction for a longer horizon, maybe the next five or six years: what the trend could be for different airlines or different airports.
In this way you can classify predictions into short-term, medium-term, and long-term. There is no specific bar or demarcation that says this period is short term and that one is long term. It depends on the context of the problem, the availability of historical data, and the requirement of the forecast. Remember the slide where I discussed around eight points on the selection of a model; that slide is very important, and understanding the time frame is one part of it. You may have data and want to forecast a longer period when the data do not support it. You can still make the forecast, but it may have a huge error and will not match reality. When a long-period forecast is not suitable for the existing data, you might think about predicting a shorter period instead.
Let me give one more example. Suppose you want to predict air quality. Say it is 3 o'clock now and you want a forecast for 4 o'clock or 5 o'clock. You have the data for 1 o'clock and 2 o'clock, and now you have to forecast 3 o'clock, 4 o'clock, or 5 o'clock.
You may have this type of short-term data and need a short-term prediction. So depending on the situation, you decide the forecast you need and collect the data it requires. Once you understand the requirement of the company, whether the forecast is short, medium, or long term, the next step is to understand the data pattern. You have fixed the time frame on both sides: the historical data available behind you and the forecast period ahead of you. Now you take the historical data provided to you, understand its behavior, and make the forecast.
Here are the four most popular, or most commonly defined, components of time series: trend, seasonality, randomness or irregularity, and cyclical. I am going to discuss all four today with examples. This slide is very important for understanding time series data and for selecting an appropriate time series model. If you know these four patterns well, you can select a suitable model.
First, trend. From the term itself you can understand that there may be an uptrend or a downtrend in the data. Suppose sales are going up every year, year on year or quarter on quarter. Then the data have a trend; they are not steady, not stationary. Suppose the data move like this over time: a stock price going up, or a company's sales or demand going up. That type of data is, to some extent, non-stationary.
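To see why a trend matters for the choice of method, here is a minimal sketch on made-up sales with a steady uptrend. It compares a three-period moving average with a simple "drift" forecast (last value plus the average past change). The drift forecast is only a stand-in chosen for brevity; it is not the trend model taught in this course, and the data and function names are assumptions. The error measure is the mean absolute deviation (MAD).

```python
def moving_average_forecast(history, window):
    """Forecast the next period as the average of the last `window` values."""
    return sum(history[-window:]) / window


def drift_forecast(history):
    """Forecast the next period as the last value plus the average past change."""
    average_change = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + average_change


# Made-up sales that rise by 10 units every period (a pure uptrend).
sales = [100, 110, 120, 130, 140, 150, 160, 170]

ma_errors, drift_errors = [], []
for t in range(4, len(sales)):
    history, actual = sales[:t], sales[t]
    ma_errors.append(abs(actual - moving_average_forecast(history, window=3)))
    drift_errors.append(abs(actual - drift_forecast(history)))

mad_moving_average = sum(ma_errors) / len(ma_errors)
mad_drift = sum(drift_errors) / len(drift_errors)

print(f"MAD, 3-period moving average: {mad_moving_average:.2f}")
print(f"MAD, drift forecast:          {mad_drift:.2f}")
```

On trending data the moving average always lags behind the latest values, so its error stays large, while a method that accounts for the trend tracks the series much more closely.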
Here, if you calculate the mean in this time period, it may be here, but if you move from this block to that block, the mean may be there. The mean has shifted. So we can say that these data are non-stationary, but they follow a trend. If you have this type of data and, after drawing the graph, you realize that it follows a specific trend, up or down, which model do you select? Suppose you know one such model, the Holt model. You bring in that model and use it on these data, and you will see a very good forecast compared with, say, the moving average model, which we will discuss in the next session. You will see the difference in the measure of accuracy between using an appropriate model and using a wrong one. That is why understanding the components of time series data is extremely important. That is all about trend; you can use common sense to understand it.
The second component is seasonality. In India, as I mentioned, the sales, demand, and production of most companies follow a seasonal pattern. Seasonality can be monthly or quarterly. In India you can also think in terms of the summer season, monsoon season, winter season, and festival season. We can also classify seasonality quarter by quarter: quarter one, quarter two, quarter three, quarter four. In India, for example, quarter one starts in April, when the financial year starts: April, May, June.
These three months are clubbed together and we call them the first quarter. Beverage and cold drink products have higher sales in these months because it is summer. Then take the next quarter, quarter two: July, August, September. In that quarter you may see agrochemical or fertilizer products having higher sales. But those sales may not last into the fourth quarter, because once the crops have been harvested, people may not buy agrochemicals or fertilizer. Similarly, cold drinks may not be bought in winter. So these products follow seasonality. If a seasonal pattern is present in the data, and you identify it through graphs or by analyzing the past data, you have to bring in the appropriate model for it. That is why seasonality is one of the important components of the data.
Here you can see the graph in sky blue. Every year, in a particular time period, these data follow seasonality. Look at this one year: at this time period there is a peak. Go to the next year and, at roughly the same time period, there are again high sales. So every year the data follow a specific pattern, and we call that seasonality: the pattern is known and it will be repeated. Although the line is a zigzag, it is easier to forecast because you know the pattern and it repeats every year. In a particular period sales may be high or low, but if you compare the same quarter across years, you will see whether sales are up or down.
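A repeating quarterly pattern like this can be quantified with a seasonal index. The sketch below uses one simple definition, the average of each quarter divided by the overall average, and then divides the data by the index to remove the seasonal effect. This definition, the made-up cold drink sales, and the function names are assumptions for illustration; other ways of computing a seasonal index exist.

```python
# Made-up quarterly cold drink sales; Q1 = Apr-Jun (summer), Q3 = Oct-Dec.
sales_by_year = {
    "year 1": [180, 120, 90, 110],
    "year 2": [190, 125, 95, 115],
    "year 3": [200, 130, 100, 120],
}


def seasonal_indices(data):
    """Average of each quarter across years, divided by the overall average."""
    years = list(data.values())
    n_quarters = len(years[0])
    overall_mean = sum(sum(year) for year in years) / (len(years) * n_quarters)
    return [
        (sum(year[q] for year in years) / len(years)) / overall_mean
        for q in range(n_quarters)
    ]


def deseasonalize(values, indices):
    """Remove the seasonal effect by dividing each quarter by its index."""
    return [value / index for value, index in zip(values, indices)]


indices = seasonal_indices(sales_by_year)
adjusted_year_3 = deseasonalize(sales_by_year["year 3"], indices)

for quarter, index in enumerate(indices, start=1):
    print(f"Q{quarter} seasonal index: {index:.3f}")
print("Year 3 deseasonalized:", [round(value, 1) for value in adjusted_year_3])
```

An index above 1 marks a quarter that is typically above average (summer for cold drinks), and an index below 1 marks a weak quarter. After dividing by the index, the zigzag flattens out and what remains is mostly the underlying level and trend.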
That pattern is followed every year, quarter by quarter, and it is called seasonality. How do you handle seasonality? You calculate the seasonal index and deseasonalize the data: you remove the seasonal factors by calculating a seasonal index, or weightage, for each period. Then you bring in the appropriate model, say the Winters model or the decomposition method, and you can make a better forecast. That is seasonality.
Now come to randomness. Look at this particular spike here: it is random. This spike is not present in the second year or the third year, but here it is. So we call it randomness, or irregularity. Let me give one more example. Suppose a war begins in the Middle East, between Israel and Palestine. Because of that the crude oil price may go up sharply, but after three or four months it may fall back. Suppose the crude oil price in India went up in that particular period of 2023. Next year it may not go up in that period, because the rise was caused by a specific event. You have to remove this irregularity before you use the data for your future forecast, because the spike does not exist in the other years and may not appear next year either. Only then can you make a better forecast; otherwise the outlier will drag your forecast up or down. So make sure of that.
Let me give one more example of randomness. What happened during corona? Once corona came, most restaurant, tourism, and hotel businesses closed, and startups were also heavily affected. Because of corona, for one or two years the tourism and hotel industries suffered and there were no sales. Normally their sales run like this; suddenly, for one or two years, they dropped like this; and now they have come back to normal. If you want to forecast the future, your forecast should follow the normal path. But if you include the irregular data from the one or two years of the COVID period in the past data, and you do not replace them with average data or, to some extent, interpolated values, your forecast may come down. The outliers, the low values, will pull the forecast down, and it will not be a correct forecast; the forecast error, which I am going to discuss shortly, will be too high. So you cannot use these random data as they are. You have to remove them or replace them with some average data; only then do you get a better forecast with the least RMSE, or error. This is called randomness, and it is heavily present in time series data.
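The effect of such irregular values on the forecast error can be shown with a small sketch. The hotel sales figures, the positions treated as COVID periods, and the later actual values are all made up. The forecast used here is deliberately the simplest possible one, the mean of the history, and the replacement is a linear interpolation between the nearest normal neighbors; both are assumptions for illustration, not the methods covered later in the course.

```python
from math import sqrt


def rmse(actuals, forecasts):
    """Root mean square error between actual values and forecasts."""
    squared_errors = [(a - f) ** 2 for a, f in zip(actuals, forecasts)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def interpolate_over(series, bad_positions):
    """Replace flagged values by linear interpolation between normal neighbors.

    Assumes every flagged position has a normal value on both sides.
    """
    bad = set(bad_positions)
    cleaned = list(series)
    for i in sorted(bad):
        left, right = i - 1, i + 1
        while left in bad:
            left -= 1
        while right in bad:
            right += 1
        weight = (i - left) / (right - left)
        cleaned[i] = series[left] + weight * (series[right] - series[left])
    return cleaned


def mean_forecast(history, horizon):
    """Forecast every future period as the mean of the history."""
    level = sum(history) / len(history)
    return [level] * horizon


hotel_sales = [100, 102, 98, 101, 20, 25, 99, 100]  # positions 4 and 5: COVID dip
covid_positions = [4, 5]
next_periods = [101, 99, 102]  # made-up actual sales after the history ends

cleaned_sales = interpolate_over(hotel_sales, covid_positions)
rmse_raw = rmse(next_periods, mean_forecast(hotel_sales, len(next_periods)))
rmse_cleaned = rmse(next_periods, mean_forecast(cleaned_sales, len(next_periods)))

print(f"RMSE with the irregular values kept:     {rmse_raw:.2f}")
print(f"RMSE with the irregular values replaced: {rmse_cleaned:.2f}")
```

With the dip left in, the history average is pulled far below the normal level and the error is large; once the two irregular values are replaced, the same simple forecast sits close to the actual sales.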
In practice, if you look at different contexts, industrial examples or real-life social examples, you will find that randomness exists in time series data, and you cannot always use such data without averaging them or treating them separately. So randomness has to be taken into account in forecasting. How to handle it and how to replace it we will discuss later. Only once the random or irregular values, the outliers, have been replaced, and the data cleaning and pre-processing are done, can you think about forecasting the future.
Before I go to the last component, look at this graph. These data include trend, seasonality, and randomness. Look at the sky-blue series: you can see it has seasonality. It also has an uptrend; look at the blue line. And it has irregularity, or randomness. So all three components I have discussed so far are present in these data, which is what happens in reality. You have to understand this type of data pattern and its components, and then make a forecast for the future. If you want better accuracy, you need this kind of component analysis, and you have to bring in the corresponding model. For example, the sky-blue series in this graph has seasonality as well as trend. Which model is most suitable? You have understood the data pattern and behavior and classified the components, so you realize that the data have seasonality and trend.
So you bring in the Winters method or the decomposition method, which will be most suitable. I said earlier that for trend data you may select the Holt model, but the Holt model is suitable only for data with trend. If the data have both seasonality and trend, you can still use the Holt model, but the error will be too high. So you bring in the Winters method or the decomposition method; we will discuss all of these later. In this way, through component analysis, you can understand the pattern and past behavior of the data and make a better forecast with higher accuracy.
Now let us come to the last one, the cyclical pattern. What is a cyclical pattern? A cyclical pattern occurs over a longer duration; it is not a matter of the short term, of one or two years. Suppose the government changes policy. As far as I know, the metal sector in India could not perform very well because of government policy and regulations, and the edible oil sector is sometimes entirely driven by government policy, so these sectors suffered. Take the airline industry as well: many airlines are shutting down, are loss-making, or are selling their entire business, because they are not performing well. That may have happened over the last decade. But if the government now brings out a PLI scheme, incentives, or subsidies, you may see the airline sector or the metal sector perform well. I recently read an article in which Jefferies, one of the brokerage and advisory firms, said that the metal sector will perform very well in India in the coming years. What does that mean?
It means that for the last decade, ten years, these sectors could not perform, and now they will perform because of government policy and government initiatives. Think about the defense sector. India never performed very well in defense; India used to import defense components from foreign countries. Now, because of Make in India, India is not only meeting the demand of the Indian army and defense forces but is also planning to export fighter jets and other defense components. What does that mean? The sector could not perform well, but now it is working, so it is cyclical. Maybe it will work for ten years, then it may go down again and some other sector will come forward.
Consider the China plus one policy. For the last two or three decades China dominated the manufacturing sector. But now, because of the China plus one policy, the war environment, and the US-China tussle, many companies are shifting from China to India. That is an opportunity for India. Chemical companies and semiconductor manufacturers are planning to set up their plants in India, so India will have an advantage for the next one or two decades. That is a cyclical pattern.
Take another example, the Russia-Ukraine war. Because of that war, the whole of Europe is facing an energy crisis. Maybe for two or three years they will suffer, but after that they will recover. You can consider this two- or three-year period a cyclical pattern; after it, they will come back to normal.
Suppose the government changes education policy or population control policy. You may then see that population growth is curbed, or, once the education policy changes, that rural areas benefit. So if the data follow one pattern over a longer period, then change over the next two, three, or five years or a decade, and then change again, that pattern is cyclical. If you see that the data hold a trend or pattern over a long period, then change and remain steady for the next few years or a decade, and then change again, that type of pattern is called a cyclical pattern. Such data do not fall under trend, seasonality, or randomness. You analyze them through the cyclical pattern and make the corresponding forecast.
So these are the four major categories of time series data: trend, seasonality, randomness, and cyclical. Next we will discuss the measures of accuracy. Let us take a five-minute break, and then we will discuss the measures of accuracy for time series data.