me, by Professor Piyanka Burma, Professor Shoshmita Narayana, Professor Devabh Uttadas from Indian Institute of Management in Mumbai. So, in today's session we will talk about Analytics in Supply Chain Management. This is module 3 of the course, and this is lecture number 3. Before I proceed with the lecture, let us summarize what we discussed in the last two lectures of module 3, that is, analytics in supply chain management. First we talked about what analytics is, and then we discussed the characteristics of big data, the six Vs: volume, variety, velocity, variability, value and veracity. Then we discussed the types of analytics: descriptive, diagnostic, predictive and prescriptive. And finally, we gave examples of analytics from the supply chain management domain. We discussed examples from procurement, from manufacturing, from warehousing, from logistics and transportation, and from demand planning. These are the five main pillars of supply chain management, so we gave an example from each of these domains and discussed how analytics can be used.
In today's session we will take the domain of manufacturing and see how analytics can be used there, that is, how we can make manufacturing more efficient by applying analytics techniques. To do that we will take one case study and explain the use of analytics through it. So, let us see this case. The production of a particular manufacturing plant was intermittently stopped during the last few months due to machine breakdowns. Every month a machine was breaking down once or twice. Obviously, the production manager was not happy, because the customers would be impacted: they would not get the product on time.
So, obviously they will be unhappy and they may move to a different OEM. Therefore, a new plant manager was appointed, and she kept thinking about this issue: how can she make the process better, and how can machine breakdowns be minimized? The plant manager had recently completed a course in data science, where she learnt many analytical techniques: descriptive, diagnostic, predictive and prescriptive. She wondered whether she could develop a predictive modeling tool to predict when a machine will break down. If she can predict that a machine or a component will break down, she can take corrective action before it actually happens, and the production halts that occurred during past breakdowns can be avoided. Production will then be continuous, and the customers will be satisfied. So, the idea is to develop a predictive maintenance model.
She immediately gathered all the data from the past one year related to machine breakdowns. What were the main data points she gathered? She thought the parameters related to machine breakdown would be the following. First, the age of the machine, that is, how long the machine has been in operation. Second, the utilization of the machine: if the utilization is higher, there is obviously a chance that the machine will break down more frequently. Third, the parameter called MTBF, mean time between failures, which is the average time between two successive failures. Fourth, the unplanned downtime of each machine. Fifth, oil contamination, the NAS value.
Then she also took the data on overhauling schedule compliance, and the data on schedule lubrication compliance. These were the parameters she collected for each and every machine for the past one year.
So, let us see what the data looks like. She got the data of a thousand such instances. She had the age of each machine in years: for serial number 1, the age of the machine was 15. The utilization for serial number 1 is 90.9 percent, which means the machine was utilized 90.9 percent of the time. The MTBF value for that machine was 24.2, which means the average time between two successive failures was 24.2. She also collected the unplanned downtime for each of the thousand instances; for serial number 1 it is 1.71. The oil contamination value is 5 for serial number 1, and of course the values for all thousand instances are given in that column.
Then she got data on overhauling schedule compliance. For serial number 1, overhauling was done as per the schedule 99.1 percent of the time. For serial number 0 it is 95.6 percent, which means overhauling was done as per the schedule 95.6 percent of the time and was not done as per the schedule around 4.4 percent of the time. Serial number 2 is on the lower side, at 77.8 percent, which means that for this machine overhauling was not done as per the schedule more than 20 percent of the time. The last column is schedule lubrication compliance: for machine number 1 it is 95.8 percent, for machine number 2 it was 90.5 percent, and so on.
So, she collected these thousand instances, and with this data she had to develop a predictive model. These are the parameters which will impact the failure of the machine.
So, these parameters will determine whether a machine will fail or not. Along with them, she also collected data on machine failure: for each of the thousand instances she recorded whether the machine failed or not. For serial number 0, for example, the machine has 14 years of age, 89.8 percent utilization, an MTBF value of 8.7, unplanned downtime of 2.38, oil contamination of 7 NAS, overhauling schedule compliance of 95.6 percent and schedule lubrication compliance of 95.8 percent, and the machine actually failed. Similarly, for serial number 1 the machine did not fail, and for serial number 2 the machine failed. So, she had a thousand such instances, and for each instance she has age, utilization, MTBF, unplanned downtime, oil contamination, overhauling schedule compliance, schedule lubrication compliance, and one more column saying whether the machine failed or not. She collected not only the cases in which the machine failed but also the cases in which it did not fail.
This is the whole dataset she got, and with it she plans to develop a prediction model. The idea is this: suppose, just as an example, a machine is 11 years old, its utilization is 78 percent, unplanned downtime is 3, oil contamination is 8, overhauling schedule compliance is 92 and schedule lubrication compliance is 90 percent. Will this machine fail or not? If she develops a model, then given the age, utilization, MTBF value, unplanned downtime, oil contamination, overhauling schedule compliance and schedule lubrication compliance, the model should be able to predict whether the machine will fail or not. That is what she wanted.
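To make the layout of one instance concrete, here is a minimal Python sketch of a single row of the dataset. The class and field names are my own assumptions for illustration, not names from the slides; the values are the ones read out for serial number 0. The seven parameters are the inputs of the model, and the failure column is the label to be predicted.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class MachineRecord:
    """One of the thousand instances (field names are hypothetical)."""
    age_years: float
    utilization_pct: float
    mtbf: float                      # mean time between failures
    unplanned_downtime: float
    oil_contamination_nas: float
    overhaul_compliance_pct: float
    lubrication_compliance_pct: float
    failed: bool                     # the column the model has to predict


# Serial number 0, with the values quoted in the lecture.
serial_0 = MachineRecord(14, 89.8, 8.7, 2.38, 7, 95.6, 95.8, True)


def split(record):
    """Separate the seven input parameters from the failure label."""
    *features, label = astuple(record)
    return features, label


features, label = split(serial_0)
print(features, label)
```

A prediction model takes a list like `features` for a new machine and returns a guess for `label`.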
So, now the idea is to develop a model which is very easy to understand, because the workers on the shop floor have to interpret it. She therefore chose a classification tree, which is a type of decision tree model, and developed the model. The output of the prediction model is given in the next slide.
So, this is the output of the decision tree model. Let us see how to explain it, and we will make sure that you understand each and every step and can interpret the output easily. Oil contamination is one of the parameters we had. If the oil contamination is more than 5.5 and the MTBF value is more than 23.95, then there is a 71 percent probability that the machine will not fail. This is a decision tree structure, and this tree is called a classification tree. With these two parameters, oil contamination and MTBF, I am able to predict whether a machine will fail or not. Similarly, for another machine, if oil contamination is more than 5.5 and the mean time between two successive failures is less than or equal to 23.95, then 77 percent of the time the machine will fail.
Now, if I come to the left hand side: if oil contamination is less than or equal to 5.5 and utilization is more than 92.05 percent, then 88 percent of the time the machine will fail. And if oil contamination is less than or equal to 5.5 and utilization is less than or equal to 92.05, then there is an 83 percent probability that the machine will not fail.
So, there are only three parameters here. Although I had many parameters in the data, the classification tree output suggests that I need to check only oil contamination, utilization and the MTBF value. Using these three parameters I can predict whether a machine will fail or not.
I will not only be able to predict the outcome, I will also be able to say what the probability is that the machine will fail or will not fail, and in addition to the probability I will be able to say how much support I have for that prediction.
Let us take the example of the 71 percent leaf. If a machine has oil contamination more than 5.5 and an MTBF value more than 23.95, the model predicts that it will not fail with 71 percent probability. And what is the support? I have 9 percent support for that classification. Similarly, look at the 77 percent leaf: if the oil contamination is more than 5.5 and the MTBF value is less than or equal to 23.95, then there is a 77 percent probability that the machine will fail. So, I am not only predicting whether the machine will fail, I am also giving the probability, 77 percent, and the support, which is 70 percent.
Now, the 88 percent leaf says that if the oil contamination of a particular machine is less than or equal to 5.5 and its utilization is more than 92.05 percent, then there is an 88 percent probability that the machine will fail, and I have 5 percent support for that prediction. And the last one, the 83 percent leaf: if oil contamination is less than or equal to 5.5 and utilization is less than or equal to 92.05 percent, then there is an 83 percent probability that the machine will not fail, with 16 percent support.
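The four leaves can be put side by side with a little arithmetic. The sketch below assumes that "support" means the share of the thousand instances that land in a leaf, which is how tree plots usually report it; the lecture itself does not define the term, so treat that reading as an assumption. Under it, the four supports should add up to 100 percent, and the probability of each leaf weighted by its support gives a rough figure for how often the tree is right on the data it was built from. The slide values are rounded, so the numbers are approximate.

```python
# (rule, predicted class, probability in %, support in %) as read from the slide.
LEAVES = [
    ("oil <= 5.5 and utilization <= 92.05", "no failure", 83, 16),
    ("oil <= 5.5 and utilization > 92.05", "failure", 88, 5),
    ("oil > 5.5 and MTBF <= 23.95", "failure", 77, 70),
    ("oil > 5.5 and MTBF > 23.95", "no failure", 71, 9),
]
TOTAL_INSTANCES = 1000  # size of the dataset in the case study

# Assumption: support is the percentage of instances falling into the leaf.
total_support = sum(support for *_, support in LEAVES)
approx_counts = [TOTAL_INSTANCES * support // 100 for *_, support in LEAVES]

# Probability of each leaf weighted by its support (integer math, then scale).
weighted_accuracy = sum(prob * support for *_, prob, support in LEAVES) / 100

print("total support (%):", total_support)
print("approx. instances per leaf:", approx_counts)
print("support-weighted probability (%):", weighted_accuracy)
```

That the supports sum to exactly 100 is consistent with this reading of "support", though it does not prove it.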
So, if I summarize this whole classification tree model: I need to look at the last nodes of the tree, and using those last nodes and the three parameters, oil contamination, MTBF value and utilization, I am able to predict whether a particular machine will fail or not. Along with the prediction I am also able to give the probability, and in addition I get the support for the prediction. I hope you have understood the whole output.
So, let us see how this output is useful for prediction. For a given machine, the age is 11 years; the utilization is 78 percent, which means the machine was utilized 78 percent of the time; the MTBF value, the mean time between two successive failures, is 15; unplanned downtime is 3 percent; oil contamination is 8; overhauling schedule compliance is 92; and schedule lubrication compliance is 90. With this data, can you say whether the machine will fail or not? Can I use the decision tree model to predict that? Let us see.
What I need first is the oil contamination value, because oil contamination is the first parameter in the decision tree, the one that splits node 0. For this machine the oil contamination is 8, and 8 is more than 5.5, so I come to this node. Then I have to check the MTBF value, which is 15, so I am on the branch for less than or equal to 23.95. So, from node 0 I move to node 2, because for this instance oil contamination is 8, which is more than 5.5, and the MTBF value is less than or equal to 23.95.
So, I should follow this path: first I go from node 0 to node 2, and then from node 2 I come to node 5 for this specific machine. On this path the model says there is a 77 percent probability that the machine will fail. So, if I have a machine with these characteristics, 11 years of age, 78 percent utilization, MTBF value 15, unplanned downtime 3, oil contamination 8, overhauling schedule compliance 92 and schedule lubrication compliance 90, then there is a 77 percent probability that the machine will fail, and I have 70 percent support for that. So, this model is able to predict, for any given machine, whether it will fail or not with some probability. I am sure you have got the idea of how to interpret this model.
Now let us see how I can create business rules with their support, because the idea is that this model should be very simple and very easy to interpret, so that anyone on the shop floor can understand it, and if there is any red alarm they can tell their manager. So, what business rules can we generate from here? Let me go back to the last slide. The first thing I have to check is whether oil contamination is less than or equal to 5.5 or more than 5.5. Suppose oil contamination is less than or equal to 5.5, so I am here in this node. Then I check the utilization, and there are two branches: less than or equal to 92.05, or more than 92.05. If oil contamination is less than or equal to 5.5 and utilization is less than or equal to 92.05, I come to node 3, and in node 3 there is an 83 percent probability that the machine will not fail, with 16 percent support. If I go to the next slide, that is what you can see: oil contamination less than or equal to 5.5, utilization less than or equal to 92.05, and the prediction is classified as "No".
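The walk from node 0 to node 5 for the 11-year-old machine can be sketched in Python as follows. This is only an illustration of how the tree on the slide is read, not the code that produced it. The dictionary layout and key names are my own; node 1 is assumed to be the unnamed left child of node 0 (oil contamination less than or equal to 5.5), since the lecture names only nodes 0, 2, 3, 4, 5 and 6.

```python
# Internal nodes hold a split; leaves hold the prediction shown on the slide.
TREE = {
    0: {"feature": "oil_contamination", "threshold": 5.5, "left": 1, "right": 2},
    1: {"feature": "utilization", "threshold": 92.05, "left": 3, "right": 4},
    2: {"feature": "mtbf", "threshold": 23.95, "left": 5, "right": 6},
    3: {"prediction": "no failure", "probability": 83, "support": 16},
    4: {"prediction": "failure", "probability": 88, "support": 5},
    5: {"prediction": "failure", "probability": 77, "support": 70},
    6: {"prediction": "no failure", "probability": 71, "support": 9},
}


def trace(machine, tree=TREE):
    """Follow the splits from node 0 and return (visited node ids, leaf)."""
    node_id, path = 0, [0]
    while "feature" in tree[node_id]:
        node = tree[node_id]
        # "left" means value <= threshold, "right" means value > threshold.
        goes_left = machine[node["feature"]] <= node["threshold"]
        node_id = node["left"] if goes_left else node["right"]
        path.append(node_id)
    return path, tree[node_id]


# The example machine; only three of the seven values are actually consulted.
machine = {
    "age": 11, "utilization": 78, "mtbf": 15, "unplanned_downtime": 3,
    "oil_contamination": 8, "overhaul_compliance": 92,
    "lubrication_compliance": 90,
}

path, leaf = trace(machine)
print(path, leaf)
```

For this machine the trace visits nodes 0, 2 and 5, matching the path followed on the slide.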
For node 3, that means the machine will not fail; the accuracy is 83 percent and the support is 16 percent. If I take you back to the previous slide you will see the same thing: 83 percent probability that the machine will not fail, with 16 percent support.
In the same way I can reach node 4: oil contamination is less than or equal to 5.5 and utilization is more than 92.05 percent. If this combination holds, there is an 88 percent probability that the machine will fail, with 5 percent support. Now let us move to the next slide: oil contamination less than or equal to 5.5 and utilization more than 92.05, and the model predicts that the machine will fail. The classification accuracy is 88 percent and the support is 5 percent.
Now, node 5. How do I reach node 5? First I check the oil contamination: it is more than 5.5. Then the MTBF value is less than or equal to 23.95. If these two conditions are satisfied, I predict that the machine will fail, with 77 percent accuracy and 70 percent support, and that is what we have summarized as the node 5 rule. So, for machines with oil contamination more than 5.5 and an MTBF value less than or equal to 23.95, the model classifies the machine as one that will fail, with 77 percent accuracy and 70 percent support.
Now, the last node, which is node 6. How do I reach node 6? First I go to node 2, and from there to node 6. I have to check whether the oil contamination value is more than 5.5, and then I have to check the MTBF value. To reach node 6, oil contamination has to be more than 5.5 and the MTBF value has to be more than 23.95. For such a machine there is a 71 percent probability that the machine will not fail, with 9 percent support. The next slide summarizes this rule: oil contamination more than 5.5 and MTBF value more than 23.95, so I predict that the machine will not fail, with 71 percent classification accuracy and 9 percent support.
So, if I print this and put it on the shop floor, the worker has to look at only three parameters: oil contamination, utilization and MTBF value. If they continuously monitor these values, they will be able to say whether the machine will fail or not. It is very simple to interpret, and it predicts with a probability whether a machine will fail or not. That is the output of the model.
Let us summarize once again how to read the output and how to predict with it. I will give two instances, instance number 1 and instance number 2. Take any random machine on the shop floor: the age is 11 years, utilization is 78 percent, MTBF value is 15, unplanned downtime is 3 percent, oil contamination is 8 NAS, overhauling schedule compliance is 92 percent and schedule lubrication compliance is 90 percent. Can you say whether the machine will fail or not? Yes, you can. Let us check which node it falls in. First I need to check the oil contamination, which is 8 for the first machine.
So, it falls in either node 5 or node 6, since both satisfy oil contamination more than 5.5. Then I have to check the MTBF value. The MTBF value is 15, which is less than or equal to 23.95, so machine number 1 is in node 5. What does node 5 suggest? Node 5 suggests that the machine will fail, and the accuracy is 77 percent.
Now, let us take machine number 2, any random machine on the shop floor, with 4 years of age, 91 percent utilization, MTBF value 24, unplanned downtime 4 percent, oil contamination 5, overhauling schedule compliance 88 percent and schedule lubrication compliance 87 percent. I want to predict whether this machine will fail or not. First I check oil contamination, which is 5 here. That is less than or equal to 5.5, so the machine is in either node 3 or node 4. Then I check the utilization, which is 91 percent, and 91 is less than or equal to 92.05. So, I am in node 3, and node 3 suggests that the machine will not fail, with a classification accuracy of 83 percent.
So, that is how this tool can help you if you are on the shop floor: by looking at these three parameters you can say, with some probability, whether the machine will fail or not. It is a very nice prediction tool which the manager or the shop floor staff can use to check whether a machine will fail. This is the output of the model which we have shown here. Now, you must be wondering how we arrived at this model and how we derived it.
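The four business rules can also be written as a flat rule sheet, closer to what would be printed for the shop floor, and then applied to the two machines above. This is a hedged sketch: the rule tuples, key names and function name are illustrative assumptions, and the numbers are the ones quoted from the slides.

```python
# (node, condition, verdict, classification accuracy %, support %)
RULES = [
    ("Node 3", lambda m: m["oil"] <= 5.5 and m["utilization"] <= 92.05,
     "will not fail", 83, 16),
    ("Node 4", lambda m: m["oil"] <= 5.5 and m["utilization"] > 92.05,
     "will fail", 88, 5),
    ("Node 5", lambda m: m["oil"] > 5.5 and m["mtbf"] <= 23.95,
     "will fail", 77, 70),
    ("Node 6", lambda m: m["oil"] > 5.5 and m["mtbf"] > 23.95,
     "will not fail", 71, 9),
]


def apply_rules(machine):
    """Return the first (and only) rule whose condition the machine meets."""
    for node, condition, verdict, accuracy, support in RULES:
        if condition(machine):
            return node, verdict, accuracy, support
    raise ValueError("no rule matched; check the input values")


# Only the three parameters that the rules use are needed.
machine_1 = {"oil": 8, "utilization": 78, "mtbf": 15}
machine_2 = {"oil": 5, "utilization": 91, "mtbf": 24}

result_1 = apply_rules(machine_1)
result_2 = apply_rules(machine_2)
print("Machine 1:", result_1)
print("Machine 2:", result_2)
```

Because the four conditions never overlap and together cover every combination of values, each machine matches exactly one rule, so the order of the rules does not matter.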
As for the background of this model, and how you can develop it on your own: all of this we will do in the next class, where we will explain it from scratch and also do hands-on programming in Python. First we will understand the concept of a classification tree, and then we will use hands-on programming in Python to develop this model. We will develop exactly the same model, get exactly the same output, and explain all the steps. So, this completes today's session. In the next session we will do the hands-on work and also understand the concept of the classification tree thoroughly, so that you can apply the same concept and the same model in various contexts. Thank you, and I look forward to seeing you in the next class.