Okay, please remember to complete the register; it's in the chat. And always remember, if you have any questions or need any assistance, don't waste any time: send us an email. Those are our contact details. Send your email to ctntatatunisa.ac.za and copy me in on all the correspondence so that I am aware of what is happening.
Okay, so today we're going to deal with time series and forecasting, which is the last session. By the end of today's session, we should have covered almost all the basic skills related to your studies, especially STA 1502.
The only thing I realized is that there are a few things we haven't covered in these sessions, and I'm not sure how important they are. When we did ANOVA, there are some tests that you need to know, like the Bonferroni. I struggled to recall the name just now, and now I'm exposing myself on the YouTube recording. Yes, Bonferroni, that one. I know that in some years they used to ask questions on it, but I don't know whether this year you got questions related to those types of tests. I haven't covered them, and I don't think I will have time to cover them. They come from the analysis of variance; there are some sections there that I didn't cover because I didn't think it was necessary for us to go through them in detail.
The other thing that I also need to make you aware of is that we didn't cover the F test, especially the one we use for hypothesis testing on the variances.
I think it's fairly easy and straightforward to do, but if you have any challenges in answering questions relating to that, then we can look at it. Otherwise, before you write the exam, probably in the second semester, we will go through exam preparation sessions where we can get you to understand how the questions are asked and how to answer them. Also realize that, as we meet on a weekly basis, most of the exercises, activities, and examples that we use in the sessions come from your past exam papers. So I'm not coming up with new questions; these are questions that past students were asked in exams, assignments, or tutorials.
Okay, so let's start with this week's session so that we can get it done. Do you have any comments or questions before we start, relating to basic numeracy or statistics? Are there any questions? No questions from my side. Thank you. Okay, then we can continue with today's session.
So in today's session, like I said, we're going to look at time series, and later on we're going to look at how we do forecasting. To complete this section with ease, I'm not sure if we will use statistical tables, but for sure your formulas and your calculator are very important.
By the end of the session today, you should know about the different time series and forecasting models, especially the smoothing techniques, which use the moving average and exponential smoothing. You should also know more about trend and seasonal effects, in terms of trend analysis and seasonal analysis. And lastly, we're going to touch on the forecasting model, how we do forecasting. So in statistics, you should be able to forecast what the future will look like.
And this forecast needs to be based on historical data as well. There are different approaches you can use to do the forecasting. One is the qualitative forecasting method. The other is the quantitative forecasting method, which is the one that we're going to be using in your module.
The qualitative forecasting method is used when historical data is unavailable. The researcher then relies on past experience: they can say, in the past this is what was happening, so how can we use that to forecast what's going to happen in the future? This method is considered highly subjective, because the researcher himself or herself determines what the future should look like based on his or her previous experience. In a way it is also judgmental, because it's their own opinion.
The better way of doing forecasting is when you have quantitative data. Then you can base your forecast on the historical numerical values that you have, to tell you what the future should look like. Here we use two types of methods. We use time series, which we're going to go through in detail, and we can also use the causal method, where we include other variables in the mix and see what happens to the forecast value that we want to calculate. In your module, we're not going to concentrate on the causal one. We're going to look only at time series and use the past data to predict what the future will look like.
Let's assume that we are given numerical data, or that we are able to collect it. We know that this data needs to be collected or obtained at a regular interval.
For forecasting, if you collect data on a daily basis, it needs to be consistent. It should not have any missing data. If a value is missing, you will need to impute a value or fix the data, making that value either a zero or using the average to smooth it out. We're going to talk about that at a later stage. Your data needs to be collected at a regular time interval. The interval can be annual, quarterly, monthly, weekly, daily, hourly, or even in seconds. It doesn't really matter.
When I was consulting at ShopRite, we used to do sales analysis for Black Friday. We would forecast when we were going to run out of stock, or when the stores needed to make sure that they replenish. That word slipped my mind for a moment, so let me use the one that I know: restock, or replace things on the shelf, so that people can find the items and nothing is missing, especially when we were running campaigns. So we used time series methods as well to forecast when we were going to run out of stock, so that the stores could be prepared and make sure that there was enough stock on the shelf for customers to buy.
When it was competition time or Black Friday, we used hourly data. We would collect data in store at an hourly rate and analyze that. But for a competition that runs for the whole month, we would use weekly or daily data, depending on the type of product that we wanted to analyze. If it's a fast-moving product, then we would do a daily forecast. And some products do not sell so quickly. For example, if you put a microwave on sale, not everyone needs a microwave on a daily basis.
So it's not a fast-moving product. We would use that to determine the type of data that we would have. That is one scenario I can give you that relates real work experience to the material we're looking at.
So let's look at this example of ours. For the years 2000, 2001, 2002, and so on, we are given the sales per year. We know that in 2000 the sales in that store were 75.3, in 2001 they were 74.2, and so on. That is data with a yearly time interval. We can take this data and visualize it so that we can see the trends. To visualize it, we use what we call a time series plot, which puts the years on the horizontal axis and the rate or the sales on the vertical axis, your y axis. So a time series plot is a two-dimensional plot of your time series data.
If you look at this time series plot, you can also deduce from it what kind of pattern it shows. That is determined by the time series components you are looking for. You can have a trend component, which shows the overall movement. It is persistent, so the flow of your data will be consistent, and it is usually observed over a long period of time.
A seasonal component is when your data shows peaks at certain points, fluctuations that happen periodically. For example, when you sell winter clothes, you know when you will have a peak in sales of jerseys and when sales of those winter jerseys will be low. People don't buy jerseys in December or January; they buy them closer to winter. So you will see those peaks if you're using your sales data.
Usually you will be able to identify this fluctuation only if your data has at least 12 points, for example if you are tracking it monthly for the last 12 months. We're going to look at that in detail just now.
Then you have a cyclical component. This is when you have a repeating pattern. For example, in January it's down, in February it's up, in March it's down, and so on until December; you will see those peaks going up and down, and it can also happen over more than one year. If you are able to track that over two years and see similar patterns of ups and downs at the same periods, then you know that you have what we call a cyclical component in the time series.
Then you also get an irregular component, which is based on your residuals or erratic fluctuations. These happen here and there. When we looked at regression, we also spoke about errors, and the irregular component is based on those small errors that influence the fluctuations you see. So these are the types of time series components that you can have.
In more detail, a trend component will look like this, because the data is taken over a longer period of time. The trend might go up or it might go down. If it goes up, we say this is an upward trend. If it goes down, we say that is a downward trend. If I look at our original sales data, you can see that it started at the top and then goes down. So this is a downward trend, right? We can also have a non-linear trend, which can take the form of an exponential (I lost my pen), or it can take a curved shape, which is called a quadratic trend. Exponential and quadratic trends are both non-linear trends.
When you put your data on a scatter plot, you are also able to see the trend component.
With the seasonal component, the pattern occurs at short, regular intervals. You can observe it within a period of one year, and it often happens on a monthly or quarterly basis, depending on the type of data that you have collected. In terms of sales, you will be able to see when the fluctuations happen. It might be that in winter the sales go up. In spring, sales are normal. In summer, everyone wants to buy the new trends, those stomach-out tops and all that. In fall it's a little bit chilly and everybody wears the normal clothes that they have. Then winter comes and everybody wants to buy the jackets that are in trend. So sales will fluctuate over a period of time.
A cyclical component happens over the long term and is a wave-like pattern over a longer period. You cannot observe it in only one year; you need more than that. It occurs regularly but may vary in length. It might be that in June and July the peaks look almost the same, but in December it peaks only in December, the next month doesn't peak, and then it comes back again in May or June. So the peaks occur in different ways and with different lengths, meaning how long those peaks last. That is how you determine whether it is a cyclical, a seasonal, or a trend pattern. It is often measured peak to peak or trough to trough. So if I'm looking at the peaks, they need to occur at more or less equal intervals over that period of time, which can be about two years.
So let's look at this example. You can see that if this was one year, the peak runs from that month up to that month: one, two, three, four, five, six, seven, eight, nine, 10, 11, 12.
So this is one year, and then it picks up again, and you can see the cycle that happens as well.
Which one of the following components of a time series occurs over a short, repetitive calendar period and by definition has a duration of less than one year? Is it cyclical? Is it irregular? Is it trend? Is it exponential? Or is it seasonal? I'm going to give you one minute while I go find my pen so that I can write on the board. Do we have an answer? I think that one is seasonal. Let's go back to our definitions. Okay, seasonal: short term, regular intervals, observed within a year on a monthly or quarterly basis. Trend is taken over a long period of time. Cyclical is also long term, a wave-like pattern that can have different wavelengths. So you are right, the answer here is seasonal.
Which of the following is correct? Number one: the cyclical component occurs over a short time period, and as a result cyclical components are the most commonly occurring patterns in time series. Number two: the irregular component is what is left over when the other components of the time series (trend, seasonality, cyclical) have been accounted for. Seasonality is the underlying direction, upwards or downwards, and is the rate of change in the time series when allowance has been made for the other components. Trend is the short-term movement in the time series, and forecasting is not one of the aims of time series analysis. Which one is correct?
I know that number one is incorrect, number three is incorrect, and number four is also incorrect, because trend is over the long term, like cyclical, and forecasting is part of time series analysis. Number two would be the correct one.
Yes, because number two speaks to the residuals. The residuals are the amounts that are not accounted for after you have done all your analysis: those errors, those erratic values that you cannot explain. So the only correct answer here is number two; you are on the right track.
So now let's start looking at time series analysis, and we're going to expand and look at smoothing techniques, or smoothing methods. We know that when we look at a trend, it might be fluctuating, and all we want is to smooth it out. We want to smooth it so that almost all the points are as close to the line as possible and no longer scattered everywhere. That is what a smoothing technique does, and that is what we're going to be doing.
When we calculate the moving average, we want to get an overall impression of the pattern of movement over a period of time. We take averages of consecutive time series values for a chosen period length, and those averages give the smoothed series. Then we also have another smoothing technique. The first one was the moving average; the other technique that we can use to smooth out time series data is exponential smoothing. With exponential smoothing, we use a weighted moving average.
So let's first start with the moving average. How do we do it? You know that it's a smoothing technique, and we're going to use a series of arithmetic means over a period of time. So if I have data like one, two, three, four, five, we know what a mean is, right? The mean is adding all these values and dividing by how many there are.
But if I'm doing moving averages with two as my period length, then at this point I add these two values and find their average: 1 plus 2 is 3, divided by 2, which is 1.5. Then I go to the next one: 2 plus 3 is 5, divided by 2, which is 2.5. Then 3 plus 4 is 7, divided by 2, which is 3.5. And I do the same for the last two: 4 plus 5 is 9, divided by 2, which is 4.5. And I have smoothed out the data now. That is what the moving average is, and that is what I mean by a series of arithmetic means over a period of time.
It works the same way if they tell you to use three years: then you smooth it out using three years of values. So the result depends on the choice of the period length. In my example, I was using two as my period length. You can be told that it must be five years or three years or two years or 10 years. You must read the question carefully so that you can identify the period length.
The last moving average of that length can be extrapolated one period into the future for a short-term forecast. When we get to the last value, there is no value after it, so you can use the last moving average as the estimated value one period into the future.
Okay, so how do we do this in a practical sense? Here we have an example of a five-year moving average that we want to calculate. In the formula, we add all five years of data and divide by how many there are, because we're calculating an average. So we take year one plus year two plus year three plus year four plus year five, divided by five because there are five years, and that gives us the first five-year moving average. If I need to calculate the second moving average, I no longer take year one into consideration.
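As a minimal sketch of the two-period example with the values 1 to 5, the moving averages could be computed like this in Python. The function name `moving_averages` is only an illustrative choice, not something from the module.

```python
def moving_averages(values, period):
    """Return the simple moving averages of `values` for a given period length."""
    if period < 1 or period > len(values):
        raise ValueError("period must be between 1 and the number of values")
    averages = []
    # Slide a window of `period` consecutive values along the series.
    for start in range(len(values) - period + 1):
        window = values[start:start + period]
        averages.append(sum(window) / period)
    return averages


data = [1, 2, 3, 4, 5]
two_period = moving_averages(data, 2)
print(two_period)  # [1.5, 2.5, 3.5, 4.5]
```

Changing the second argument to 3 or 5 gives the three-period or five-period version, which is why the period length in the question matters.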
I start from year two and count five years, which takes me to year six. I add all of them and divide by five, and that gives the five-year moving average for those years.
So let's look at this example of ours. We have year one up to year 11, and so on, in this table, and we also have the sales. I can plot this as a time series plot, and I can see that there are some points where there is a fluctuation, it goes back to normal, and then it fluctuates again. So there is some seasonality in this example of ours.
So let's calculate the moving average of the same data. For a five-year moving average, we take the first five years, add them, and divide by how many there are. But when we write the value down, we don't write it next to the first year; we write it next to year number three, because year three is the middle year of the five. Well, not the average of the years; let's say the median of the years, which is three, our middle value. So when we get the moving average of these years, we put it next to year three, and we continue like that until we have all of them.
So the first moving average is year one plus year two plus year three, and so on up to year five, divided by five. Moving on, we can calculate all the other moving averages. For the one next to year four, we shift the window along by one year, take those values, and get the moving average. Oh, sorry: the 29.4 is for year one to year five. And now we are on year two to year six.
We take 40 plus 25 plus 27 plus 32 plus 48, divide by five, and write the average next to year four. That gives us 34.4. And we continue like that to complete the entire table. You move on and take the next five years, from year three to year seven. The middle year is five. The values from 25 up to 33 add up to 165, and divided by five that gives us 33. And you continue: we take year four up to year eight, and the median of those is six. The average of 27, 32, 48, 33, and 37 gives us 35.4. And that is your moving average.
You can go back to the original data and plot your new moving average series on top of it, so that you can see how smooth your trend looks when you use a five-year moving average. On top of the original series we can show our five-year moving average, where the fluctuations have now been taken out of the data. And that is how you do a moving average.
Let's see if we can do it. A share trader has a policy of selling a share if its price drops by more than one-sixth of its purchase price. In January, the investor bought shares in Neutron, an electronics company, for 90 cents per share. The table below shows the price of the Neutron share at the end of each month. So we have the months from January until July, and we also have the share price. The question asks: the three-period moving average of the share price in May is? So we need to calculate the three-month moving average.
Remember, with a three-period moving average we write the value next to the median, the middle month. So we calculate the moving average for that point: adding the three values and dividing by three gives us that value. To calculate for March, you start from February. You don't use all the values, because we're only doing a three-period moving average. That gives you the March value, and so on. You don't have to calculate all of them; I was just demonstrating.
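Going back to the five-year moving average table completed above, here is a rough sketch of the same calculation in Python, writing each average next to the middle year of its window. The sales values for years two to eight are the ones read out in the session; the year-one value of 23 is an assumption, inferred from the first average of 29.4, and the function name is just illustrative.

```python
def centred_moving_average(values, period):
    """Map each middle year (1-based) to the moving average of its window.

    Only odd period lengths are handled, so that a middle year exists.
    """
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd period lengths")
    half = period // 2
    result = {}
    for start in range(len(values) - period + 1):
        window = values[start:start + period]
        middle_year = start + half + 1  # years are counted from 1
        result[middle_year] = round(sum(window) / period, 1)
    return result


# Sales for years 1 to 8 (year 1 = 23 is inferred, see the note above).
sales = [23, 40, 25, 27, 32, 48, 33, 37]
five_year = centred_moving_average(sales, 5)
print(five_year)  # {3: 29.4, 4: 34.4, 5: 33.0, 6: 35.4}
```

Notice that years one, two, seven, and eight get no moving average, because there are not enough values on both sides of them.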
You can just calculate the May one; I was only showing you what you would do for March. So when we are in May, the three months around May include all those values. We say 80 plus 74 plus 76, divided by three, which equals 76.7. And that is how you answer the question. Easy, right?
May I ask something before you go on, please? Here it says three-period. What if it says four-period? How would you do it?
If it's four-period, you are going to use those four values, and you have to decide whether to write the result next to March or February, because the middle, the median, of the four falls in between the two, right? If these were numbers, you would say one, two, three, four, because those are the months. When you add them and divide by four, one plus two plus three plus four divided by four, you get 2.5, right? And what is 2.5? We apply maths. Maths says that if the digit after the decimal comma is greater than or equal to five, you add one to the value you are rounding to. So this is rounded up to three, which means you start writing here for a four-period average. The next one, which includes all of these, goes there, and so on down the table. That is how you work it out. Because these are months, when the middle falls between two of them, you choose the higher one.
Now let's look at exponential smoothing. We need to pay good attention to this one, because it's different from the moving average.
We know that it is used for smoothing, and we can also use it for short-term forecasting, especially when we're looking only one period into the future. We know that we're going to be using a weighted moving average, and the weights decline exponentially. The most recent observation is weighted most. So because the weights decline as we go back in time, the latest observation has the biggest weight. Let's look at an example.
We're going to use a coefficient, which is W. You must check which symbol your module uses. For the whole presentation I use W, which is our smoothing coefficient, or our weight. The researcher can decide what the value will be, whether it's 0.2 (20%) or 0.8 (80%) and so forth. So it is subjective: the researcher determines how much weight is allocated to the values. It needs to range between zero and one.
What we also need to be aware of is that the smaller the weight, the smoother the data or the trend will look; the larger the weight, the less smoothing you get. So you need to allocate a smaller value to get more smoothing. If you use a bigger weight like 0.8, you will still see some fluctuation, because the smoothing has less influence on the trend. That, in a nutshell, is what this says.
Okay. If the weight is close to zero, that is the choice for smoothing out unwanted cyclical and irregular components. So a W closer to zero, for example a weight of 0.01, smooths away the unwanted cyclical and irregular components in your series.
For smoothing, your weight could be something like 0.2, 0.35, or 0.4, but not much bigger, because when it becomes bigger you are no longer smoothing much; there will be less effect. If your weight is close to one, then you can use the smoothed values to do your forecasting. So a weight closer to one is appropriate for forecasting.
So let's look at the exponential smoothing model, and also at the reason behind this. We said a bigger weight means less smoothing, but when it's closer to one we can use it for forecasting. The reason is that when you forecast, you want to use values that are close to the original data, so that you don't bring other noise into the data set. When you use a smaller weight, I wouldn't fully trust the result for forecasting, because it will be too smooth and it might give us the wrong indication. But when the weight is bigger, closer to one, we can use it for forecasting, because the smoothed values stay close to the historical data that we used.
For exponential smoothing, we use this formula. E_1 is your first exponentially smoothed value, and at the beginning it is always equal to your first observation: E_1 = Y_1. After that, the smoothed value for period i is given by E_i = W × Y_i + (1 - W) × E_(i-1). So you apply your weight, whether it's 0.35 or 0.25 or 0.8, to the current observation Y_i, and you add 1 minus the weight times the previous exponentially smoothed value, where i represents 2, 3, 4, 5, 6, and so on. So what does that mean?
At the beginning, everything is the same: my first smoothed value is the same as my original value. Then for the second row I apply the formula. Say I have sales for year one, year two, and year three, and the sales were 20, 40, and 30. My E1, at the beginning, is the same as my sale, which is 20. For E2 I have to apply the second formula, with a weight of, say, 0.8.
I first started by multiplying the original 20 by the weight, but I was doing it all wrong. Remember, Y_i is the current observation, and here i is 2. Let's go back to what we said: the most recent observation is weighted most. Because of that, the most recent one, the value on row 2, which is 40, gets the weight of 0.8. The other term uses i minus 1, which is 2 minus 1, which is 1, so it goes back to E1, the original sale of 20. So E2 is 0.8 times 40 plus (1 minus 0.8) times 20, which is 36.
For year 3, it is 0.8 times 30 plus (1 minus 0.8) times the previous smoothed value, E2. And so forth. That is how you apply the exponential smoothing technique to the values.
So let's look at more examples. We have our time periods and our sales. I'm going to use this column as my forecast from the prior period, because I take the smoothed values and shift them down by one period. But I will use the exponentially smoothed values, because these are the values that we need. At the beginning of my exponential smoothing, remember, my E1 is the same as my Y1.
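The small three-year example above (sales of 20, 40, and 30 with W = 0.8) can be sketched in Python as follows. This is only a minimal illustration of the recursion E_1 = Y_1 and E_i = W × Y_i + (1 - W) × E_(i-1); the function name `exponential_smoothing` is an assumption for the sake of the example.

```python
def exponential_smoothing(values, w):
    """Return the exponentially smoothed series for a weight w between 0 and 1."""
    if not 0 <= w <= 1:
        raise ValueError("the weight w must lie between 0 and 1")
    # The first smoothed value is the first observation itself.
    smoothed = [float(values[0])]
    for y in values[1:]:
        previous = smoothed[-1]
        # Weight w on the current observation, (1 - w) on the previous smoothed value.
        smoothed.append(w * y + (1 - w) * previous)
    return smoothed


sales = [20, 40, 30]
smoothed = exponential_smoothing(sales, 0.8)
print([round(e, 2) for e in smoothed])  # [20.0, 36.0, 31.2]
```

With a smaller weight such as 0.2 the smoothed values move away from the first observation much more slowly, which is the "more smoothing" effect described earlier.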
So E1 is the same as the first sale, 23. For the next one, sorry, I forgot to mention: remember the weight will be given to you. Here my weight is 0.2. So it is 0.2 times 40, plus, going back to the previous smoothed value, 1 minus 0.2, which is 0.8 as you can see there, times 23. That gives us 26.4, and I can write it there. When I go to the next row, it is 0.2, my weight, times the original sale for that period, plus 0.8 times 26.4, which was my previous smoothed value. And I go on and on until I get to the end. These numbers are your exponentially smoothed values.
When we plotted our data, remember, it showed some fluctuation, and the red line is the smoothed one. The difference between exponential smoothing and the moving average, as you can see, is that the moving average starts at the period where the first window is centred, whereas exponential smoothing starts at the beginning and gives a value for every period. You can also see that it is an effective basis for forecasting, because it gives you the smoothed time series data.
So your question might come like this. The price of gold is used by some financial analysts as a barometer of investors' expectations of inflation, with the price of gold tending to increase as concerns about inflation increase. The table below shows the average annual price of gold, in thousands of rand, from 1990 to 1995. Calculate the exponentially smoothed series for the gold price time series for the year 1992, using a coefficient W of 0.8.
So we are given the weight and we can calculate the exponentially smoothed value. The challenge is that they asked for 1992, and 1992 is at this point.
You need to know what the smoothed value was at the point before it. So it means you first need to go and calculate your smoothed values from the start. We know that at the beginning it is the same as the original value, 384, right? Then for 1991 we apply the formula. The formula on the slide says E I minus 1 on the left-hand side. I think the formula is wrong there; it just needs to be E I. So E I equals W times Y I plus 1 minus W times E I minus 1. So let's do that. For 1991, our weighting is 0.8, and we multiply that with 362, plus 1 minus 0.8 times the previous value, which was 384. And what is the answer that you get? It's 366.4. That is the value that goes here, 366.4. Now we go to 1992, which is the one that we are asked for. It will be 0.8 times the original value, which is 344, plus 1 minus 0.8 times the previous exponentially smoothed value, which is 366.4. My value is 348.48. 348.48, which is option number two. And that's how you find your exponentially smoothed series. Any questions? Okay, then we can move to the next one.

Suppose that we calculate the three-period moving average of the following time series, and that is the time series. Using the exponential smoothing technique with a smoothing constant of 0.2, the forecasted value for time period four is what? So we are at this point; that's what they are looking for. It means we first need to calculate the other values. So let's start with number one. I wasn't going to write the formula because we know what it is, but let me write it in case you forgot what it looks like. So for number one, our weighting is 0.2, right? So that will be 0.2 times 10... Sorry, my bad. No, for number one it's just 10. E1 is 10, the same as the original. Only from number two do we apply that formula.
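The gold price exercise above can be checked step by step with a short Python sketch. It assumes only the three prices quoted in the session for 1990 to 1992 (384, 362 and 344) and the given weighting of 0.8; the variable names are illustrative.

```python
W = 0.8  # weighting given in the question

# Average annual gold prices quoted in the session for the first three years.
prices = {1990: 384, 1991: 362, 1992: 344}

smoothed = {}
previous = None
for year, price in prices.items():
    if previous is None:
        # First period: the smoothed value is the original value.
        smoothed[year] = float(price)
    else:
        # Later periods: W times the current price plus (1 - W) times
        # the previous smoothed value.
        smoothed[year] = W * price + (1 - W) * previous
    previous = smoothed[year]
    print(year, price, round(smoothed[year], 2))
```

The 1992 line depends on the 1991 smoothed value, which is why the earlier years have to be worked out first.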
So at number one, E1 will always be Y1. For number two we say 0.2 times 26 plus 1 minus 0.2 times 10. For number three it will be 0.2 times 15 plus 1 minus 0.2 times the value that you got there. Just put the building blocks in place. Oh, sorry Lizzie, may I interrupt? The formula there, isn't it... oh sorry, it is 1 minus 0.2. Okay, I see. Sorry. Maybe it's because it looks like a plus. Let me remove that and write it properly. And for number four it is 0.2 times 20 plus 1 minus 0.2 times the value that you got for number three. That is what you need to do. For number two, I have 13.2. Number two is 13.2. Number three gives a value of 13.56. And number four gives 14.848, which is 14.85, which is one of the options. Any questions? Are we happy?

In the last eight minutes, let's see if we can get through the last bit, on forecasting. There are three popular methods for trend forecasting. One is linear trend forecasting, which is almost exactly like regression. Then we have the nonlinear trend, which can be quadratic. And then we have the exponential trend. In terms of the linear trend, we dealt with regression last week, right? So you know how linear regression works. We can use the regression model to forecast a new value that we don't know, or to forecast a value in the future as well. And that is the formula that we use to estimate, where b0 is your intercept and b1 is your slope. If we know what our equation is, we can use it. Let's assume that we know the equation; I'm just going to make one up roughly. Let's say y hat equals 22 plus 2.2 x. I'm using a positive slope, so the trend is going up. Let's assume that there's nothing wrong with that.
So in order for me to use this formula, if I ask what the value will be when the period is six, I can say 22 plus 2.2 times six, and that gives me my new value, which is 35.2. So 35.2 is my new value, and that is forecasting as well. So you can use that to forecast. For example, if I have my sales trend, now I do have the actual linear trend equation, which is y hat equals 21.905 plus 9.5714, which is my slope, times x. And I can estimate y hat for a new value of x based on this information. Let's assume that my period is six there. So I'm just going to substitute x with six. That will be 9.5714 times six plus 21.905. Let me see if I calculated it on the slides; I did. So when you multiply by six and add 21.905, you get 79.33, and that is your new value for when the period is six, which will be in 2005. And you can do 2006 and 2007 and 2008; you can continue forecasting like that. And that is linear trend forecasting.

The trend equation y hat equals 1200 minus 35t has been fitted to a time series of industry worker days lost due to job-related injuries. If t equals one for 1991, the estimated number of worker days lost in 2008 is what? We know that t is one in 1991, so we can write out the table, 1991, 1992, 1993, up until we get to 2008. We can do that. Alternatively, to find what t is in 2008, we can say t is 2008 minus 1991, and I add one, because the count starts at one in 1991, not at zero. So 1991 is one, 1992 is two, and I need to get to the t for 2008, that's why I'm adding one. So my t will be 2008 minus 1991, which is 17, plus one, which is 18.
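The linear trend forecasts and the period count worked through here can be sketched in a few lines of Python. This is a minimal illustration; the function names `linear_trend_forecast` and `period_index` are assumptions, while the coefficients and years are the ones quoted in the session.

```python
def linear_trend_forecast(b0, b1, x):
    """Forecast from a fitted linear trend: y hat = b0 + b1 * x."""
    return b0 + b1 * x


def period_index(year, first_year, first_t=1):
    """Convert a calendar year to the period number t, where t = first_t in first_year."""
    return year - first_year + first_t


# Made-up equation from the session: y hat = 22 + 2.2x, at period six.
print(round(linear_trend_forecast(22, 2.2, 6), 1))

# Sales trend equation: y hat = 21.905 + 9.5714x, at period six.
print(round(linear_trend_forecast(21.905, 9.5714, 6), 2))

# Worker days lost: y hat = 1200 - 35t, with t = 1 in 1991.
t_2008 = period_index(2008, 1991)
print(t_2008, linear_trend_forecast(1200, -35, t_2008))
```

Counting the periods with a helper like `period_index` avoids the off-by-one mistake of using 2008 minus 1991 on its own.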
So I have my t; my t is 18. And I can then estimate y hat as 1200 minus 35 times 18. What is the answer? It's 570. And that gives us 570, which is option number five.

Another type of forecasting model that we can use is a nonlinear regression model in quadratic form, which states that y hat equals b0 plus b1 x plus b2 x squared. This is a multiple regression, but it has x to the power of two, x squared, which makes it a quadratic form. We can use this to estimate the new value, but we can also compare models using the adjusted r squared, which is the adjusted version of your coefficient of determination, and the standard error, to see if there is any improvement. So we use this formula to estimate the new value, and we test other things, like the adjusted r squared and the standard error, to see if the model has improved. And if it needs some adjustment, then you make the adjustment, check the values, and see if there is any improvement in your model. We can also try other functions that might fit our data better for our regression or for our forecasting.

The other type of model that we can use is the exponential trend, which is also one of the nonlinear trend models for forecasting. That model is given by this equation: y equals b0 times b1 to the power of x, multiplied by the error term. And because x is in the power, we can take the log of this so that we can bring x down. When we transform this equation, we create what we call a log-linear model. We call it log-linear because it takes the logarithm of this exponential function, and after taking the logarithm the model is linear in x.
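To make the log transformation concrete, here is a minimal Python sketch of fitting an exponential trend by ordinary least squares on the logarithm of y. Everything specific in it is an assumption for illustration: the session gives no data for this model, so the series below is hypothetical and generated exactly from y = 50 times 1.2 to the power of x, the function names are made up, and base-10 logarithms are used because the session does not say which base to take.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def fit_exponential_trend(xs, ys):
    """Fit log10(y hat) = b0 + b1 * x and return (b0, b1) on the log scale."""
    log_ys = [math.log10(y) for y in ys]
    return fit_line(xs, log_ys)


def forecast_exponential(b0, b1, x):
    """Undo the logarithm: y hat = 10 ** (b0 + b1 * x)."""
    return 10 ** (b0 + b1 * x)


# Hypothetical series (not from the session): exactly 50 * 1.2 ** x.
periods = [1, 2, 3, 4, 5]
values = [50 * 1.2 ** x for x in periods]

b0, b1 = fit_exponential_trend(periods, values)
print(round(10 ** b0, 4), round(10 ** b1, 4))       # recovers the 50 and the 1.2
print(round(forecast_exponential(b0, b1, 6), 2))    # forecast for period six
```

Because the fitted b0 and b1 are on the log scale, the forecast has to be converted back with the antilog before it is reported in the original units.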
We can take this and estimate the new value. In short, we can rewrite this whole log-linear model as: the log of y hat equals b0 plus b1 x, where b0 is the estimate of the log of beta zero and b1 is the estimate of the log of beta one. And we take this exponential trend forecasting equation and we can estimate a new value.

However, sometimes you have several models that you could use for forecasting, and it is best to know which one gives you the best fit, or the best forecast. To do that, you need some measures to show whether one model is better than the other and whether you should use it for your analysis or not. How do we choose the best-fitting model? We do that by looking at the residuals. So we perform a residual analysis and look at the patterns and the trends in the residuals. We can measure the magnitude of the residual errors using the squared differences. When we were doing ANOVA we used sums of squares as well, so we also use that when we do forecasting. We can also measure the magnitude of the residual errors using the absolute differences. This is very important, because they can ask you questions that say use the squared measure, or use the absolute measure. So there are two measures that we can use to determine which forecasting model is the best one to choose. And when we do forecasting we also want to choose the simplest model that does the job, and this is based on the principle of parsimony.

If we look at the analysis of the errors, you can see that graph number one shows random errors, so the errors are scattered everywhere. Graph number two shows a cyclical effect not accounted for; as you can see, it goes up and down, and that is a cyclical pattern.
Then you also have a trend not accounted for, in terms of the linear trend, and you can see that it's going downwards, so this is a negative trend. And you can also see seasonal effects not accounted for, where it fluctuates very frequently. So you can look at these graphs and get an idea, but that will not be enough. You need to use the measures, the absolute and the squared differences. They give you a better idea than just looking at the graphs, because when you look at the residual plots and ask which model best fits my data, or gives me the best prediction, you will not know from the plots alone. But if you use the absolute and the squared difference measures, then remember, you choose the model that gives you the smallest value. So your SSE or your MAD should be small; we choose the model with the smallest error.

The SSE, which is the sum of squared errors, is the sum of your observed Y value minus your estimated value, which is your forecasted value, squared. So you can see that you are taking the original values. Say I have a sales table with my periods here, one, two, three, and my sales values, 20, 30 and 40, and I estimated my y hat, my estimated sales, as 22, 28 and 35. Those are my estimated values; I'm not saying whether they are correct or not, but these are my estimates. So for this first model the SSE is 20 minus 22 squared, plus 30 minus 28 squared, plus 40 minus 35 squared. That is what the formula says. Now what if I use another set of estimates, let's say 20, 35 and 50? Then I say 20 minus 20 squared, plus 30 minus 35 squared, plus 40 minus 50 squared, and you add all of that.
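The two sums just set up can be worked out with a small Python sketch. The sales values and both sets of estimates are the ones mentioned in the session; the function name `sse` and the variable names are assumptions, and the totals are simply the arithmetic on those numbers, which the session does not work out itself.

```python
def sse(actual, forecast):
    """Sum of squared errors: the sum of (actual - forecast) squared."""
    return sum((y - f) ** 2 for y, f in zip(actual, forecast))


sales = [20, 30, 40]          # observed sales for periods one to three
estimates_a = [22, 28, 35]    # first set of estimated sales
estimates_b = [20, 35, 50]    # second set of estimated sales

sse_a = sse(sales, estimates_a)  # 4 + 4 + 25
sse_b = sse(sales, estimates_b)  # 0 + 25 + 100

# The model with the smaller SSE is the one to prefer.
better = "first" if sse_a < sse_b else "second"
print(sse_a, sse_b, better)  # prints: 33 125 first
```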
Based on these two models you can choose the one that has the lowest SSE. So for example, if this one gives you an SSE of 420 and this one gives you an SSE of 1002, then you are going to choose this set of estimates, because its SSE is the smallest. Sorry, I need to clear this. I didn't calculate those values, they are just made-up numbers, so don't hold me to them and say, oh, but you're giving us the wrong answers. I was just demonstrating.

So the MAD, which is the mean absolute deviation, is the sum of the absolute values of your observation minus the forecasted value, divided by n, and n is how many there are; if there were 20, you divide by 20. Both of these measures are sensitive to outliers, but the MAD is less sensitive to outliers, because it does not square the errors. The other thing is that the absolute value converts any negative value into a positive value, so all the values are made positive. Let's say I have 20 as my sale and 19 as my estimate. 20 minus 19 gives me a value of one. But when I have 30 as my sale and 45 as my estimate, 30 minus 45 equals minus 15, right? With the absolute value, the answer will just be positive 15 regardless. That is the meaning of the absolute value. And then we add all of them together as well.

So let's look at an example. I think we are close to the end of the session; we are left with 10 minutes. Consider the following calculated forecasts. Here they gave you the actual values, so this will be your Y t, and your forecasted value for t, and they ask: what is the value of the mean absolute deviation? We know the formula is the sum of the absolute value of your actual minus your forecast, divided by n, so it
means you are going to say, in this instance, the MAD will be equal to the absolute value of 57 minus 63, plus the absolute value of 60 minus 72, so no negative values here, plus the absolute value of 70 minus 86, plus the absolute value of 75 minus 71, plus the absolute value of 70 minus 60, and you divide everything by how many there are. There are five, so that is divided by five. We know that 57 minus 63 is minus six, and because it's absolute that will be six, plus 60 minus 72, which is 12, plus 70 minus 86, which is 16, plus 75 minus 71, which is four, plus 70 minus 60, which is 10, and divide everything by five. So your answer will be six plus 12 plus 16 plus four plus 10, and that is 48, divided by five. And what is 48 divided by five? That is 9.6, which is option four. You see how easy it is? Any questions? If there are no questions, thank you. Please make sure that you complete the register before you leave today's session.

Let's see if we can do the last exercise. We're given that the level at which commercial lending institutions set mortgage interest rates has a significant effect on the volume of buying, selling and construction of residential and commercial real estate. The data are presented below, with the period, the year, the interest rate, which is our actual value, and the forecast, which is our forecasted interest rate. The question is: what is the sum of squares for forecast error, the SSE? What is the formula? We know that this is the formula to use. So we take the actual minus the estimated value and square the answer. So we say 10.46 minus 10.81, and we square the answer, which is 0.1225. I'm going to write it here, 0.1225.

Excuse me, for the register, what's the topic? Is it numeracy skills or is it skills in statistics? Towards the end of the register there should be one where it says statistical inferences. Oh okay, thank you.

Okay, 10.86 minus 10.85, equals, square the answer, and we
get, is that right, is it zero? Do you also get a zero? 10.86 minus 10.85, equals, square the answer, equals... I don't get a zero. There are three zeros and a one: 0.0001. And go to the next one: 12.07 minus 10.9, equals, square the answer, equals 1.3689. I'm going to write all the values as I see them. And the next one: 9.97 minus 10.95, equals, take the square of the answer, equals 0.9604. And we go to the last one: 11.14 minus 10.99, equals, square the answer, equals 0.0225. I just want to double check the first one, because I'm not sure if I did square the answer: minus 10.81, equals, square the answer, equals... yes, I did square the answer. So now we can take all of these values and add them together. So 0.1225 plus 0.0001 plus 1.3689 plus 0.9604 plus the last one, 0.0225, equals, and the answer we get is 2.4744, which is option number three.

So let's look at the last question. I'm not going to do it; you can look at the question yourselves. It gives you two models, model number one and model number two, and they also give you the SSE for model number one and the SSE for model number two, and they ask which one of the two models is better. Based on what we know, remember, the model with the smaller SSE is the better one. This one has 4.95 and this one has 593.25, so between model number one and model number two you choose the one with the smaller SSE as the one that best fits.

And that concludes today's session. What have we learned? We now know about the different types of time series and forecasting models. In terms of the time series, we know about two smoothing techniques that we can use, which are the moving average and the exponential smoothing technique. We can also look at the data and identify the type of trend we have, whether it is a cyclical trend, whether it is a seasonal trend, whether it is a linear trend, or whether it is a
random or irregular one. And the last thing that we just did was to look at the three types of forecasting models that you can use: the first one is the linear model, the second one is the quadratic model, and the third one is the exponential model, with the last two, the quadratic and the exponential, both being non-linear. And remember, when we're looking for the best model, we can use the residual analysis or we can use the measures. When we look at the measures, there are two of them: the sum of squared errors, the SSE, and the mean absolute deviation, which is called the MAD. And to choose the best model, we pick the one for which those measures have the smallest values, the smallest errors possible.

And that concludes our sessions for 2022. Thank you. I will see you when the second semester starts, probably, or not, but you have my email address. If there is anything you need assistance with, remember to send an email to city and tat at unisa.ac.za. Let me go to the first slide, so that if you still don't know what the email is you can use it. That is the email address, city and tat at unisa.ac.za, and copy me, Elizabeth boy, which is e boy em at unisa.ac.za, and I shall be of assistance any time you need help. Thank you, and good luck with your studies.

Thank you, thank you so much, Lizzie. I will certainly contact you when we get stuck. No problem, remember, any time. Thank you. That's fine. Bye.
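As a recap of the two error measures from this session, the MAD and SSE exercises can be checked with a short Python sketch. The function names `mad` and `sse` are assumptions. The first actual interest rate is taken as 10.46, which is also an assumption: it is the reading that matches the squared error of 0.1225 and the total of 2.4744 quoted in the session.

```python
def mad(actual, forecast):
    """Mean absolute deviation: the average of |actual - forecast|."""
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


def sse(actual, forecast):
    """Sum of squared errors: the sum of (actual - forecast) squared."""
    return sum((y - f) ** 2 for y, f in zip(actual, forecast))


# MAD exercise: actual values and calculated forecasts from the session.
actual = [57, 60, 70, 75, 70]
forecast = [63, 72, 86, 71, 60]
print(mad(actual, forecast))  # (6 + 12 + 16 + 4 + 10) / 5

# SSE exercise: mortgage interest rates and their forecasts.
# The first actual value, 10.46, is assumed (see the note above).
rates = [10.46, 10.86, 12.07, 9.97, 11.14]
rate_forecasts = [10.81, 10.85, 10.90, 10.95, 10.99]
print(round(sse(rates, rate_forecasts), 4))
```

Whichever measure a question asks for, the model with the smaller value is the one to choose.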