Welcome to session 14 on Quality Control and Improvement using Minitab, with Professor Indrajit Mukherjee from the Shailesh J. Mehta School of Management, IIT Bombay. In the previous session we discussed process capability, and we saw the Cp and Cpk indices. In this session we will discuss these capability indices further, along with some other measures that are also used on the production floor; personnel working in quality in industry will sometimes find these other measures reported as well. We will talk about those now, and we will also see what people do when the normality assumption fails. So this is the process capability output we saw earlier, and Minitab reports these values: the standard deviation (within) and the standard deviation (overall). The standard deviation within, as I mentioned, is calculated from R-bar/d2 for a given subgroup size, while the standard deviation overall is simply the sample standard deviation S, with the familiar expression S = sqrt(sum of (x minus x-bar) squared, divided by n minus 1). This overall standard deviation is sometimes used to calculate another pair of indices similar to Cp and Cpk. Here we have the Cp and Cpk indices for the ring dimension, which has a nominal value of 74 and a tolerance of plus or minus 0.035. Because all the points fell within the control limits, the process is stable, and so we can also calculate another index known as Pp, the process performance index, along with Ppk. These are known as performance indices, and their formulas are essentially the same.
So Pp equals (USL minus LSL) divided by 6 times the estimated standard deviation, where the estimate is now simply S; this S represents the long-term process standard deviation. Only the estimate of sigma changes. Similarly, like Cpk, we can calculate Ppk; here too the only change in the formulation is that R-bar/d2 is replaced by S. Minitab does this calculation automatically: PPL, PPU, and PPK are all calculated and reported. The process performance index came out to be around 1.14, while the Cp index was 1.17, somewhat higher. The distributions look much the same, and there is very little difference between the two values, because whenever the process is in control, the overall and within standard deviations do not differ much; in that case the Pp and Cp indices will reflect very similar values. So this is another index that is reported. Sometimes, when the data were not collected in subgroups for a control chart, what we do in industry is report the Pp and Ppk indices instead; certain scenarios call for that. Then, in the output, you also see "expected overall performance" and "expected within performance". Expected within is based on the Z calculation I described earlier: for one side, Z = (USL minus X-double-bar) divided by the sigma estimate, where the estimate comes from R-bar/d2; the expected overall part uses the overall estimate S instead.
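The within/overall distinction and the four indices can be sketched in a few lines of code. The subgroup data below are made-up illustration values, not the lecture's ring measurements; d2 = 2.326 is the standard control-chart constant for subgroups of size 5.

```python
import statistics

# Illustrative subgroups (NOT the lecture's data) for spec 74 +/- 0.035.
subgroups = [
    [74.012, 74.001, 73.998, 74.011, 73.992],
    [73.995, 74.003, 74.015, 73.989, 74.014],
    [74.008, 73.995, 74.009, 74.005, 74.004],
    [73.998, 74.000, 73.990, 74.007, 73.995],
    [73.994, 73.998, 73.994, 73.995, 73.990],
]
USL, LSL = 74.035, 73.965
d2 = 2.326  # control-chart constant for subgroup size n = 5

all_obs = [x for g in subgroups for x in g]
xbarbar = statistics.mean(all_obs)

# Within (short-term) estimate: sigma_hat = R-bar / d2
rbar = statistics.mean(max(g) - min(g) for g in subgroups)
sigma_within = rbar / d2

# Overall (long-term) estimate: sample standard deviation S
sigma_overall = statistics.stdev(all_obs)

def cap_indices(sigma):
    cp = (USL - LSL) / (6 * sigma)
    cpk = min((USL - xbarbar) / (3 * sigma), (xbarbar - LSL) / (3 * sigma))
    return cp, cpk

Cp, Cpk = cap_indices(sigma_within)   # capability (within)
Pp, Ppk = cap_indices(sigma_overall)  # performance (overall)
print(round(Cp, 2), round(Cpk, 2), round(Pp, 2), round(Ppk, 2))
```

For a stable process the two sigma estimates are close, so Cp/Cpk and Pp/Ppk land near each other, exactly as the lecture observes.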
So instead, the sigma estimate is based on S, the overall standard deviation — what you might call the process standard deviation, since given a set of data, the S expression is how we generally calculate it. This is also reported in Minitab: both expected overall performance and expected within performance appear, along with PPM values. The total PPM for the overall performance here is around 605, which we can call the long-term performance, while the within figure of about 502 is the short-term performance. If you take a small snapshot of the process using control charts, you get short-term capability measures; if instead you take the overall variability, irrespective of the control chart, you report the Pp/Ppk analysis. But people always prefer to establish a stable process and then report the capability. Here is the diagram, where you can see the short-term snapshot versus the long-term variability: the long-term spread corresponds to the S estimate, and the short-term spread is estimated from R-bar/d2. These two estimates of sigma are what differentiate short-term and long-term process performance. So when I report capability (Cp, Cpk), I am talking about short-term capability based on the control chart; when I report process performance (Pp, Ppk), it is based on the S estimate and represents long-term performance. We need to understand this and report them as short-term and long-term performance accordingly.
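The "expected performance" PPM idea — fallout beyond a limit implied by a one-sided Z value — can be checked with the standard normal CDF. The Z values used below are the ones quoted later in this session (Z.USL = 3.32, Z.LSL = 3.55); the resulting total is illustrative, not Minitab's exact figure.

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal

def ppm_beyond(z):
    """Expected parts per million beyond a limit sitting z sigmas away."""
    return (1 - nd.cdf(z)) * 1e6

# Combine the expected fallout beyond each specification limit.
total_ppm = ppm_beyond(3.32) + ppm_beyond(3.55)
print(round(total_ppm))
```

The same function applied to the within sigma estimate gives the "expected within" PPM, and to the overall estimate the "expected overall" PPM.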
So this is what we need to understand. Note that Cpk and Ppk consider both centering and variability: X-double-bar, the centre value, enters the formula, so accuracy and precision are both taken care of. Cp and Pp, on the other hand, consider only the variability — the voice of the customer against the voice of the process — and between the pairs only the estimate of sigma differs. We generally prefer to use Cp and Cpk, but sometimes we report Pp and Ppk as well, and that is the way Minitab reports it. So if you go to Minitab and run the analysis for the ring dimension: Stat > Quality Tools > Capability Analysis > Normal. Select Ring1 to Ring5 as the data, enter 73.965 as the lower specification and 74.035 as the upper, and in Options include the overall analysis as well, along with parts per million and the capability indices; then click OK. For estimation I use R-bar, with the subgroup size of 5. If you want to store the results you can — they would go into columns such as C15 — but I do not want to store anything at this point. And I am applying no transformation, since I am assuming the data already adhere to the normality assumption.
So, assuming that is fine, I click OK, and I get this type of diagram. The standard deviation overall is about 0.01, and the standard deviation within — within meaning R-bar/d2 — is shown alongside the overall S estimate. Based on these, the Ppk estimate is about 1.11, and the Cpk estimate comes out close to it; because the process is stable, both measures are more or less the same, which is what we expect. Only the formulation differs. Expected within and expected overall performance are shown; PPM is derived from Z under the normal distribution, and from Z we can calculate how much fallout lies outside the USL and LSL. Let me check whether I have missed anything: CPL and CPU are there for the two-sided specification, and similarly PPL and PPU are reported, from which the Ppk index is calculated. Similarly, we can look at the other data set: Quality Tools > Capability Analysis > Normal (we have already seen the Capability Sixpack, so I will not show it again). In this case select Container1 to Container5, and set the lower specification to 200, which was the specification mentioned; estimation remains the same, and in Options I have not given a target. We want both within and overall, so I click OK, and then OK again. Because this is a one-sided specification, only the lower index will be calculated, and the Ppk index will be the same as PPL, since only one measure is being computed — there is no upper side. That single measure, and hence the minimum, is only 0.67, which is what is reported; similarly, CPL will equal the Cpk value.
So what is expressed here is similar to the Cpk index we have already seen. This was the second case we are reporting, and it is one-sided: the lower specification is given, and anything below it is a rejection, so the value has to be higher than this limit. Generally, such one-sided specifications state a minimum — here a strength of at least 200 psi is required — and the question is how to calculate process capability for that. One thing to note: the normality assumption seems more or less satisfactory here, but we can always check it. So this is the overall analysis of the two data sets we have. Based on the Z calculation, another important concept comes in, which we can also see in Minitab: the sigma level and the Z benchmark. Minitab reports a value called Z.Bench. If you have understood the Z calculation, it works like this: for the normal distribution, Minitab first calculates the Z value to the upper specification. In the piston-ring example, X-double-bar is about 74-point-something and the USL is 74.035; since X-double-bar is known and sigma can be estimated from R-bar/d2, I can calculate a Z value for the upper side. Given that Z value, I can always find the probability of Z exceeding it, and from that probability I can calculate the PPM level — how many parts per million defective are coming out of the process beyond the upper limit. Similarly, on the lower side we can calculate the defects, and that is the parts-per-million figure Minitab reports.
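The one-sided Z calculation just described — Z = (USL minus X-double-bar) divided by the sigma estimate, then P(Z > z) for the fallout — can be sketched directly. The mean and sigma below are assumed values consistent with the 74 ± 0.035 ring specification, not the lecture's actual estimates.

```python
from statistics import NormalDist

# Assumed process estimates (illustration only) for spec USL = 74.035.
xbar, sigma_hat = 74.001, 0.010
USL = 74.035

z_usl = (USL - xbar) / sigma_hat               # Z to the upper limit
ppm_usl = (1 - NormalDist().cdf(z_usl)) * 1e6  # expected fallout beyond USL
print(round(z_usl, 2), round(ppm_usl))
```

The same arithmetic with the LSL gives the lower-side fallout; Minitab reports both.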
These will be written as Z.USL and Z.LSL, and Minitab will also report a Z benchmark: depending on which estimate of sigma is used, it reports a short-term Z.Bench and a long-term Z.Bench. What does the short-term Z benchmark mean? Minitab takes the nonconforming proportion on each side — say a on one side and b on the other — sums them up as a + b, places the whole of that in one tail, and then calculates the Z value corresponding to that combined tail. That is the Z benchmark, and this concept is used to define the sigma level in Minitab: the short-term Z.Bench is the sigma level of the process. We have options in Minitab to do this; let us see how. Close this, and go to Stat > Quality Tools > Capability Analysis > Normal again, with the normality assumption. We will do the ring data first: select the columns, enter 73.965 and 74.035 as the specifications, and in Options the target value is 74, as we mentioned. Here, instead of the capability stats, let us select Benchmark Z's, which is the sigma level; Minitab will present this as the sigma level of the process. This is the way people calculate sigma level: when we ask "how many sigma", it means, when I draw the distribution, how many standard deviations out the total rejection sits — that is essentially the Z benchmark.
So it is how many standard deviations from the centre line, X-double-bar, and that equals the sigma level. With Benchmark Z's (sigma level) selected, I click OK — I do not want to store anything, and the estimation settings are fine — click OK again, and Minitab reports these statistics. It reports Z.LSL, showing the fallout on that side, with a corresponding Z value of 3.55; Z.USL is 3.32; and when you convert all the defects into one direction, the Z.Bench value comes out to be 3.22 for the overall capability, with the within capability shown alongside. The two are more or less the same, because there is only a little shift in the mean and we are assuming the process is stable — that is why both values are so close. The potential (within) capability is the short-term capability. Generally, people note down the within Z.Bench, 3.29 here, and that is the corresponding sigma level reported — so the process is at around the 3-sigma level. You can also see that Cpk multiplied by 3 gives approximately the Z benchmark; you can cross-check that whatever outputs we got, Cpk times 3 comes out close to these values. It is only an approximation, I should say. The short-term Z.Bench — the one calculated using the R-bar/d2 estimate of sigma — is the sigma capability of the process. There is also an assumption, from Six Sigma principles, that the process mean can shift from one side to the other by up to 1.5 standard deviations, and that is added here.
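The a + b construction of Z.Bench described above can be reproduced directly: combine both tail proportions and invert the normal CDF. Using the Z.LSL and Z.USL values reported in this example (3.55 and 3.32), this lands on the quoted 3.22.

```python
from statistics import NormalDist

nd = NormalDist()
z_usl, z_lsl = 3.32, 3.55  # Z values reported in the example

# Sum the nonconforming tail proportions (a + b) ...
p_total = (1 - nd.cdf(z_usl)) + (1 - nd.cdf(z_lsl))
# ... place the total in one tail and read off the corresponding Z.
z_bench = nd.inv_cdf(1 - p_total)
print(round(z_bench, 2))  # → 3.22
```

Note Z.Bench is always a little below the smaller of the two one-sided Z values, since it carries both tails' fallout.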
So short-term capability is just long-term capability plus 1.5. If you cannot do control charting, you calculate the Z benchmark based on S, and if you add 1.5 to that you get the short-term process capability, which is what Six Sigma practitioners generally report. So the long-term Z.Bench can be converted to short-term, and that is the sigma level you can expect from the process — the standard against which we decide whether or not to improve. Cpk multiplied by 3 gives you the Z benchmark as an approximation, and Cpk = 2 is the standard when we talk about a sigma capability of 6. What can we expect from a six sigma process? The fallout — one minus the conforming percentage — comes to about 3.4 parts per million outside specification. So only about 3.4 defective parts per million come out of the process, which is a very, very low PPM level. That is the standard, and we can note it down. Now, sometimes we are not able to adhere to the assumption that the data come from a normal distribution. If the scenario is non-normal, how can we calculate the capabilities? I will show you a non-normal data set and how to test it and see. Although we have not yet covered much about hypothesis testing, I will give you a brief idea of which values to look at; later, when we discuss that topic, we will discuss how to interpret those values.
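The 1.5-sigma shift arithmetic above is easy to verify: a short-term Z.Bench of 6 corresponds to a long-term Z of 4.5, whose upper-tail area is the well-known 3.4 PPM.

```python
from statistics import NormalDist

# Six Sigma convention: short-term Z.Bench = long-term Z.Bench + 1.5.
z_short = 6.0
z_long = z_short - 1.5  # = 4.5

# Fallout beyond a limit 4.5 sigmas away, in parts per million.
ppm = (1 - NormalDist().cdf(z_long)) * 1e6
print(round(ppm, 1))  # → 3.4
```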
So let us take some data that is non-normal, see what options are used, and then — once I transform the data and also transform the specifications — I can calculate the capabilities. For non-normal scenarios, what happens is that the data are converted toward a normal distribution, the USL and LSL are converted with the same transformation, and the capability is then calculated on the transformed scale. There are different ways to transform data toward normality. I am not assuring you that it will work every time, but there are two options that can be explored: one is the Box-Cox power transformation, and the other is the Johnson transformation. Sometimes one will work and the other will not; the Box-Cox transformation requires the CTQ values to be positive, while the Johnson transformation can handle positive or negative values. I will show you a data set, the options for transforming it, and the interpretation. By transformation, what I am trying to say is deciding whether the CTQ y needs a log transformation, an inverse transformation, a square transformation, or a 1/sqrt(y) transformation. These will be revealed when I use the options: Box-Cox, and Johnson, which has three families of transformations; Minitab will give you the equation of the transformation that converts y into approximately normal values. That is possible, and we also have the option to calculate capability using these transformations.
So let us open the data file — the "transformation" file, perhaps — where we have some non-normal data. How do we check first? Energy consumption is the data set we have here, and we want to see whether it follows a normal distribution or not. In Stat we have options for this, and I will use a specific test; at this stage we do not want to go into hypothesis testing, we just want a guideline for judging whether the data are normal or non-normal. Let us say energy consumption is the CTQ the customer is interested in. I go to Stat > Basic Statistics > Normality Test, select the column, and there are three options: the Anderson-Darling test, the Ryan-Joiner test, and the Kolmogorov-Smirnov (KS) test. I am using the Anderson-Darling test — you can use the others also, but this is a robust one, and it is what researchers generally suggest. When you run the Anderson-Darling test, the data are plotted against a straight line; this was earlier known as a probability plot, which you may have heard of in a statistics course. Minitab calculates an Anderson-Darling statistic of 1.43, as you can see, and we are interested in the p-value concept here. Although we do not fully understand p-values at the present moment, we will elaborate on them when we discuss hypothesis testing, which will be used in design of experiments.
For now, we can remember that if the p-value is less than 0.05, the data are non-normal. That is the simplest way to interpret it: p < 0.05 indicates non-normal data. Here p is less than 0.05, so the data are non-normal. Can I convert this data so that it follows a normal distribution? There are two options we can explore in Minitab. One is under Control Charts: the Box-Cox transformation. I go to Box-Cox Transformation, point it at the energy-consumption column, give a subgroup size of 1, and in Options I can store the transformed data in C2 so I can check whether the data have been converted; the conversion can be stored using the optimal or the rounded lambda. Lambda means a power transformation, y to the power lambda, applied to the CTQ y — energy consumption here. If I store it in C2 and click OK, it gives an estimate: the optimal lambda is estimated at around minus 0.28, and the rounded value is minus 0.5. In common-sense terms, lambda = minus 0.5 means y to the power minus 0.5, which is 1/sqrt(y) — that is the transformation suggested by the Box-Cox procedure.
If lambda comes out to be 1, that means no transformation — y to the power 1 is y itself; lambda = 0 means a log transformation; and lambda = minus 1 means a 1/y transformation. These are the interpretations of lambda. In this calculation it has given us options to explore whether the data can be converted: the minus 0.5 conversion is done, and the transformed values are stored. So we have energy consumption and, let us say, energy consumption transformed. Let us see whether it is now normal — the transformation may work or may not. When I check, I see that this transformation has worked: now the p-value is not less than 0.05, and p > 0.05 indicates the data have been converted to approximate normality. So when I use the suggested lambda and apply the transformation stored in column C2, the data are converted to a normal distribution. Now, if I want to see capability and I give this CTQ a range — an upper limit and a lower limit — I can calculate it. Similarly, another option that can be explored is the Johnson transformation, available under Quality Tools > Johnson Transformation. Go there, point it at energy consumption, and it asks whether you want to store the result; since C2 is occupied now, let us store it in C7.
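The lambda interpretations above can be written out as a tiny helper. This uses the simple power form y**lambda (log when lambda is 0); Minitab's scaled Box-Cox formula differs by constants, but it suggests the same shape of transformation.

```python
import math

def boxcox(y, lam):
    """Simple power-form Box-Cox: y**lam, with the log case at lam = 0."""
    return math.log(y) if lam == 0 else y ** lam

y = 4.0
print(boxcox(y, 1))     # lambda = 1: no transformation
print(boxcox(y, 0))     # lambda = 0: log transformation
print(boxcox(y, -0.5))  # lambda = -0.5: 1/sqrt(y)
print(boxcox(y, -1))    # lambda = -1: 1/y
```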
So the transformed values are stored in C7. In the Johnson transformation options there is a p-value criterion, which you can ignore for the present and take as the default; click OK, and it gives you the transformation. The final transformation on y is the function shown at the bottom: 1.5227 plus 0.43 times the natural log of (x minus one quantity, divided by another) — replacing x with a data value gives the transformed value. The initial data set showed an Anderson-Darling value of 1.43 with p less than 0.05; after this transformation, the p-value is more than 0.05. So we take the p-value as our criterion: below 0.05 means non-normal, above 0.05 means the data have been converted to normality. Once the conversion is done, I can calculate the process capabilities. We can do this analysis for the sales and surface-finish data as well. For surface finish, let us try one more example and then stop. In this case I want to calculate the capability, and I know the data are non-normal, but let me just test it: Basic Statistics > Normality Test on the second column, surface finish. Here also you can see p is less than 0.05 — that is the criterion — so the data are non-normal. When the data are non-normal, what is to be done, and how do I calculate the capabilities?
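The fitted function quoted above (1.5227 + 0.43 ln(...)) has the general shape of a Johnson SB transform. The helper below shows that shape; the parameter values in the demonstration calls are made-up assumptions for illustration, not the ones Minitab actually fitted.

```python
import math

def johnson_sb(x, gamma, eta, eps, lam):
    """Johnson SB form: z = gamma + eta * ln((x - eps) / (lam + eps - x)),
    defined for eps < x < eps + lam."""
    return gamma + eta * math.log((x - eps) / (lam + eps - x))

# Monotone mapping of values inside (eps, eps + lam) onto the real line;
# gamma/eta echo the quoted 1.5227 and 0.43, eps/lam are assumed.
print(johnson_sb(2.0, 1.5227, 0.43, 0.0, 5.0))
print(johnson_sb(4.0, 1.5227, 0.43, 0.0, 5.0))
```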
One route is Quality Tools > Capability Analysis > Nonnormal, where I would select the surface-finish data and fit a distribution to it. But rather than that, what we can do is Quality Tools > Capability Analysis > Normal, and convert the data to normal. In this case, select surface finish, and specify limits — let us say from 2.5 to 3.5; these are arbitrary numbers I am putting in for the surface-finish data in C5. Then I suggest a transformation: under Transform, you have the options of no transformation, a Box-Cox transformation, or a Johnson transformation. I am using Box-Cox with the optimal lambda — whatever the Box-Cox procedure suggests — though I could also specify a lambda myself. In Options, I want both within and overall capability calculated; only the S calculation differs between them. I click OK, and it reminds me that the subgroup size has to be mentioned — it is 1 here — so I enter that and click OK. What you get is output on the transformed data, with the transformed USL and LSL; it is labelled "transformed data", as you can see.
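A rough sketch of what Minitab does at this step: apply the same power transformation to both the observations and the specification limits, then compute the indices on the transformed scale. The lambda of 0.5 (square root), the data, and the 2.5 / 3.5 limits below are illustrative assumptions, not the actual surface-finish values.

```python
import math
import statistics

# Illustrative data and limits (NOT the lecture's surface-finish set).
data = [2.9, 3.1, 2.8, 3.2, 3.0, 2.7, 3.3, 2.9, 3.1, 3.0]
LSL, USL = 2.5, 3.5

# Transform the data AND the spec limits with the same lambda = 0.5.
t = [math.sqrt(x) for x in data]
t_lsl, t_usl = math.sqrt(LSL), math.sqrt(USL)

# Overall indices on the transformed scale.
xbar = statistics.mean(t)
s = statistics.stdev(t)
Pp = (t_usl - t_lsl) / (6 * s)
Ppk = min((t_usl - xbar) / (3 * s), (xbar - t_lsl) / (3 * s))
print(round(Pp, 2), round(Ppk, 2))
```

Transforming the limits along with the data is the key point: the indices are only meaningful if both live on the same scale.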
So some transformation was used — a Box-Cox lambda of 0.5, which means a square-root transformation, y to the power 0.5, that is sqrt(y). The Pp and Ppk and Cp and Cpk indices were then calculated, and you see that the LSL and USL have also changed, because the square-root transformation was applied to them as well. Accordingly, the data were confirmed to be normal after the transformation, and with the transformed data I am calculating the Cp and Cpk indices. So whenever you have a non-normal scenario, you can use this type of transformation. We will continue from here with the two examples we saw earlier — the can example and the ring example — and start again from there; that will be the starting point, and then we will come to some other quality issues we can discuss in the next session. So, what we have done: process capability analysis, then the Z benchmark, and, for the non-normal case, the options available — that is what we tried to explore in this session. We will continue this lecture from here in our next session. Thank you for listening.