Welcome to session 18 of Quality Control and Improvement with Minitab. I am Professor Indrajit Mukherjee from Shailesh J Mehta School of Management, IIT Bombay. In the previous session we were talking about hypothesis testing, and in particular the one-sample Z test. We had taken one example, so let us briefly recap it and then go ahead. The variable was concentration in PPM, and the population sigma was known. Whenever the population sigma is known, we use the Z test. We wanted to test the null hypothesis mu equals 0.5 against the alternative mu greater than 0.5. The Z statistic uses x bar, which is derived from the observations, the given sigma, the total number of observations n, and the hypothesized mu of 0.5, so I can calculate the Z statistic. Then we see where this Z value falls. This is the 0.05 region in the upper tail, which is the rejection zone we have defined for the one-sided test, and if my Z statistic falls in that rejection zone, we go for the alternate hypothesis. In this example the data was actually non-normal, and we later transformed it. Initially we assumed the data was normal, and even then we could not reject the null hypothesis, because the Z value was around 0.52 and the corresponding p value was 0.301. Since p is greater than 0.05, I cannot reject the null here.
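For reference, the one-sample Z test recapped above can be written out as follows. The symbols $\mu_0$ (hypothesized mean), $z_c$ (calculated value) and $z_{\alpha}$ (upper-tail tabulated value) are notation introduced here for illustration; the lecture describes them only in words.

```latex
z_c = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}, \qquad
H_0:\ \mu = 0.5, \quad H_1:\ \mu > 0.5, \qquad
\text{reject } H_0 \text{ if } z_c > z_{\alpha}.
```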
So we go by the null hypothesis: we do not have enough evidence to say that mu is greater than 0.5. We have also seen that when the normality assumption does not hold, we can convert the data using a lambda transformation, the Box-Cox transformation. After the conversion we also converted sigma and applied the same Z test logic, and that too led to the same conclusion: we cannot reject the null. So that was the example we had taken. We can take some more examples and see what happens in other scenarios.
Before we go into that, we need a brief idea of what this p value is, which is used in hypothesis testing and in any other statistical analysis where we talk about a level of significance. In the normality test too we used the p value concept. So what is this p value? When I am doing hypothesis testing, I assume a level of significance alpha. Let us say that for a two-sided test I take 0.025 on each side. This is the mu location and this is the x bar location, and I observe where x bar falls: if it falls in one of the tail regions, I reject the null hypothesis. Minitab gives you an option to calculate the corresponding p value. I am using a z statistic here, so x bar can be converted into a calculated z value.
So x bar is converted onto the z axis, and this calculated value is z c. The mean becomes 0 when I convert to z, so this is the 0 point. I derive the z statistic and, with a level of significance of, say, 0.025 on this side, there is a corresponding tabulated z value. In earlier days what we used to do is compare the calculated value with the tabulated value, which depends on the alpha level of significance that I have defined and is read from the z table. If the calculated value is greater than the tabulated value, we reject the null hypothesis. That was the earlier concept. So I can do hypothesis testing in two ways. One is to use tables and compare the values. The other is the p value concept, which Minitab uses and which many people go by. In that case I do not fix the rejection region in advance, although I still have an alpha in mind. Nowadays, because of the availability of computational software, whenever I calculate a z value, with 0 here and the calculated value z c here, we can also calculate the probability, or the area, beyond z c, that is, the probability of Z greater than z c. This area can be calculated, and Minitab does it automatically for you.
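The table-based route described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not Minitab's procedure: the function names are made up here, and the calculated value of 0.52 is an assumed reading of the earlier concentration example.

```python
from statistics import NormalDist


def z_critical(alpha: float, two_sided: bool = False) -> float:
    """Tabulated z value that marks the start of the rejection region."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def reject_by_table(z_calc: float, alpha: float) -> bool:
    """Upper-tailed test: reject H0 when the calculated z exceeds the tabulated z."""
    return z_calc > z_critical(alpha)


z_calc = 0.52  # assumed calculated value for the concentration example
alpha = 0.05

print(f"one-sided tabulated z: {z_critical(alpha):.3f}")
print(f"two-sided tabulated z: {z_critical(alpha, two_sided=True):.3f}")
print("reject H0:", reject_by_table(z_calc, alpha))
```

Here `inv_cdf` plays the role of the printed z table: it returns the z value that leaves the chosen tail area to its right.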
So it gives you the smallest level of significance that leads to rejection of the null hypothesis. Earlier we assumed a fixed rejection region and a fixed acceptance region. Now we work in the reverse way: we calculate the alpha value, which is the observed level of significance. The x bar from the sample gives me a location z c, and this z c corresponds to an alpha value which is different from the alpha that we have assumed. So it is a calculated alpha for the given value of x bar. It answers the question: if this is the x bar, at what level of alpha can I reject the null hypothesis? That smallest level of significance is known as the p value. So if the p value is less than or equal to 0.05, we need to reject the null hypothesis and go by H 1, the alternate hypothesis. If p is more than 0.05, we go by the null, H naught; strictly speaking, we fail to reject it. This is the condition that is used. In earlier days we used to look up the tables and compare, because p value calculation is quite complicated for various distributions. Nowadays we have so many software packages, and the software immediately gives you the corresponding p value for the specific scenario, whether I am doing a one-sample z test or any other hypothesis test. So the p value, the actual level of significance at which I can reject the null hypothesis, will be given.
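As a rough sketch of this idea, the p value for an upper-tailed z test is simply the area to the right of the calculated z. The code below uses only the Python standard library; the function name and the calculated value of 0.52 are assumptions made for illustration (0.52 is chosen because it reproduces a p value of about 0.30, as quoted for the concentration example).

```python
from statistics import NormalDist


def p_value_upper(z_calc: float) -> float:
    """Area to the right of the calculated z under the standard normal curve."""
    return 1 - NormalDist().cdf(z_calc)


z_calc = 0.52  # assumed calculated value
p = p_value_upper(z_calc)
print(f"p value = {p:.3f}")

# The same p value settles the decision for any alpha the experimenter picks.
for alpha in (0.10, 0.05, 0.01):
    decision = "reject H0" if p <= alpha else "cannot reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

Because the p value is the smallest alpha at which the null would be rejected, one number replaces a separate table lookup for each alpha.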
So when you interpret the output, what you have to see is whether p is less than or equal to 0.05. If that is the condition, the null has to go and we go by the alternate hypothesis. If p is greater than 0.05, I cannot reject the null; we do not have evidence to reject it. That is the interpretation of the p value.
Let us take one more example. In this case the variance is unknown. A golfer is interested in selecting a golf club whose coefficient of restitution is more than a specified value. So my condition is that mu should be greater than that value, and only then will I accept the golf club. The golf club is what is used to hit the ball, so this is about the design of the club. The following table shows data from an experiment in which golf balls were fired at the golf club from a ball cannon and the coefficient of restitution was recorded. These are the sample observations. From here we can calculate the average and the standard deviation of the data; the population variance is not known. Remember that when we do hypothesis testing, we are testing the population: whether mu equals mu 0 or mu is not equal to mu 0. We are not testing whether x bar takes a certain value.
So hypothesis testing is all about the population, not about the sample. The sample observations lead to rejection or non-rejection of the null hypothesis. Here we get x bar from the data, and the condition is that we will select the golf club that has a coefficient of restitution of more than 0.82. One of the clubs was taken, and its observations are given here, so they have a mean and a standard deviation. When the population sigma is unknown, I can estimate it with S, the sample standard deviation, and replace sigma by S. In that case the test is known as the T test, and the test statistic is T equals x bar minus mu, divided by S over the square root of n. The calculated value is based on x bar, which comes from the data, and on mu, which is taken as 0.82, the condition that is given. The null takes the equality, mu equals 0.82, and the alternative is mu greater than 0.82. Instead of sigma I am taking S, divided by the square root of n. In sampling distributions also we have seen that the standard deviation of x bar is sigma by root n. Here, instead of sigma, we use S by root n, and the resulting statistic follows a T distribution. A calculated value of T is obtained and compared with the tabulated value. Whenever I am talking of the T distribution, the degrees of freedom are important, and they are n minus 1.
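In symbols, the T statistic described above can be written as follows. The notation $t_c$, $\mu_0$ and $t_{\alpha,\,n-1}$ is introduced here for illustration only.

```latex
t_c = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad
H_0:\ \mu = 0.82, \quad H_1:\ \mu > 0.82, \qquad
\text{reject } H_0 \text{ if } t_c > t_{\alpha,\, n-1}.
```

Under the null hypothesis, and with normally distributed data, $t_c$ follows a t distribution with $n - 1$ degrees of freedom.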
So here I have n observations, and n minus 1 is the degrees of freedom considered for the analysis. Let us say we assume a 5 percent level of significance, so alpha equals 0.05. Whenever I use a T test, the underlying assumption is that the data follows a normal distribution. That has to be checked, and only then can we go for the T test when the variance is unknown. So let us go to the Minitab file and see how this analysis is done. This is the coefficient of restitution data set. I go to Basic Statistics and check whether it follows a normal distribution, again using the Anderson-Darling test. I click OK, and the normality test gives a p value of around 0.873. The null hypothesis here is that the data is normal, and since p is more than 0.05 we cannot reject the null, so we treat the data as normal. When the data is normal, we can apply the one-sample T test. What you have to do is go to Stat, Basic Statistics, 1-Sample t. One-sample Z we did when sigma was known; one-sample T we are doing here. We select the coefficient of restitution column, choose to perform the hypothesis test, and enter 0.82 as the hypothesized mean. This is a one-sided test: mu greater than 0.82.
So we mention that in Options: the 95 percent confidence level is already there, and for the alternative I select mean greater than hypothesized mean, since that is the alternate hypothesis I am considering. Then in Graphs we can ask for a histogram and a box plot of the data set. If I click OK and OK again, we get the descriptive statistics: the mean and standard deviation are calculated, and a confidence interval is also given. If you go down, you can see the test statistic; I can paste it here for an enlarged view. It shows a T value of 2.72 and a p value of 0.008. When p is less than 0.05, we go for the alternate hypothesis, which means the population average is greater than 0.82. That is supported by this test. So with one sample we find that the p value is less than 0.05, and the null has to go. We will therefore select this golf club, because on the population we expect its average to be greater than 0.82. The alpha level of significance here is taken as 0.05. In certain situations alpha can be selected as 0.10, in which case we are considering a 90 percent confidence band as the acceptance zone. It depends on the experimenter what value of alpha is selected.
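The reported p value can be checked approximately from the T value of 2.72 alone. The sketch below uses only the Python standard library and integrates the t density numerically with Simpson's rule; this is not how Minitab computes it, the function names are made up here, and the 14 degrees of freedom assume the 15 observations mentioned elsewhere in the lecture.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t_calc: float, df: int, steps: int = 2000) -> float:
    """P(T > t_calc) for t_calc >= 0, via Simpson's rule on [0, t_calc]."""
    h = t_calc / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t_calc, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # the density is symmetric about 0


t_calc, df = 2.72, 14  # reported T value; df assumes n = 15
p = t_upper_tail(t_calc, df)
print(f"one-sided p value = {p:.4f}")
print("reject H0 at alpha = 0.05:", p <= 0.05)
```

The result should agree with the Minitab figure to about three decimal places, which is enough to confirm the decision to reject the null at the 5 percent level.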
Alpha is generally taken as 10 percent or 5 percent. If you want to be very sure and do not want to take any chance, you can take a 99 percent acceptance band. If the decision is very critical, we may consider the 99 percent band, in which case there is only a 1 percent possibility that my conclusion goes wrong, and alpha will be 0.01. So it depends on the scenario what alpha level we select. This data set shows that the normality assumption holds and the p value is less than alpha, so significance is reached and we can go for the alternate hypothesis.
Now, in the earlier case, the concentration data set, we saw that the data was non-normal and we transformed it. The conversion worked, so the scenario was favorable for us: the Box-Cox transformation worked. After transformation we cross-checked, and the conclusion we had drawn assuming normality also held for the transformed data. The second time, with the converted data, our analysis again showed that we cannot reject the null, because the p value is again more than 0.05. But in case the conversion does not work, say for a skewed distribution, there is another type of test, known as the non-parametric test, which uses the median as the reference and uses the concept of ranks. When you go to Stat, you have the Nonparametrics options.
Here you will find two options of interest: the one-sample Wilcoxon test and the one-sample sign test. Which one is to be preferred in which scenario? That we can highlight here. Suppose the data is non-normal and I want to avoid the distribution assumption. Then the non-parametric tests are the options. A non-parametric test does not assure as strong a decision, but it is the alternative we have in case everything else fails. We have this alternative, but we cannot be 100 percent sure that it will give the correct decision every time. As a guideline, the one-sample Wilcoxon test assumes a symmetric distribution, like the normal distribution. When asymmetry exists, the one-sample sign test can be used, since it does not assume symmetry of the distribution. So, looking at the distribution, I can choose either the Wilcoxon test or the sign test. How do we do that? We go back to the concentration data set, which is non-normal, and go to Stat, Nonparametrics, 1-Sample Wilcoxon, assuming the distribution is more or less symmetric. I select the non-normal concentration data. Here we are testing the median, not the mean; you have to remember this. The hypothesized median is 0.5, and the greater than condition remains the same.
So this is the condition we are testing, and the testing is done on the median: the null is median equals 0.5, against the alternative that the median is greater than 0.5. The Wilcoxon statistic is given as 679. Like the t statistic or the z statistic, this is the statistic that is calculated, and the corresponding p value is 0.442. Since 0.442 is more than 0.05, I cannot reject the null here. A similar conclusion was drawn when we converted the data, and without conversion also the conclusion was the same. But that may not happen everywhere: the conclusion based on the classical approach may not be the same as the one from the non-parametric approach, which does not assume a distribution.
Now, if I go to the coefficient of restitution data and again make no distribution assumption, we can also do the one-sample sign test. In the one-sample sign test I select the coefficient of restitution, keep the confidence level, set the test median to 0.5, and test whether the greater than condition is satisfied. I click OK. The sign test for the median is given, and if you go to the p value at the end, it indicates significance, so we have to reject the null that the median is 0.5. All 15 observations lie above 0.5, and the p value is calculated from that. It is around 0, and Minitab will report it as such.
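The sign test p value here can be worked out by hand, because under the null each observation is equally likely to fall above or below the hypothesized median. The sketch below is a minimal standard-library illustration with a made-up function name; it assumes no observation is exactly equal to the hypothesized median.

```python
from math import comb


def sign_test_upper_p(n_above: int, n_below: int) -> float:
    """Exact P(X >= n_above) for X ~ Binomial(n, 0.5); ties are assumed dropped."""
    n = n_above + n_below
    return sum(comb(n, k) for k in range(n_above, n + 1)) / 2 ** n


# All 15 coefficient of restitution values lie above the test median of 0.5.
p = sign_test_upper_p(n_above=15, n_below=0)

print(f"three decimals: {p:.3f}")
print(f"seven decimals: {p:.7f}")
```

The exact value is 0.5 raised to the power 15, which is small but not zero; rounding to three decimals is what makes it appear as 0.000.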
Whenever you see the Minitab output, and I am copying it and pasting it here to enlarge it, you will see that the p value is reported as 0.000. Minitab reports p values up to the third decimal place. It is not that the p value is exactly equal to 0; it only indicates that up to three decimal places it is 0, and beyond three decimal places Minitab does not show p values. It could be 0.00001, so we do not know. If you use some other software, let us say the R interface, you will find the p value reported to more decimal places, say up to five. That is possible in other software like R, which is freely available, and we can do this type of testing there also. So what we are showing here is that when I am not assuming any distribution, not even symmetry, we can go for this type of analysis, and we go for the conventional approach whenever I know that the data follows a normal distribution and we want to test whether the mean is equal to some value.
We can also do variance testing, that is, a one-sample variance test. In this data set the standard deviation has come out at about 0.02. Let us say I want to test whether the variance, or the standard deviation, is equal to some value.
If I go to Stat, Basic Statistics, there is an option, 1 Variance. Instead of the mean I am testing the variance here, for the coefficient of restitution. I want to perform the hypothesis test with a hypothesized value; let us say I want to see whether the standard deviation is equal to 0.03. The calculated standard deviation is about 0.0245, so let us test whether it is consistent with 0.03; hypothetically I am assuming 0.03. In Options we do not change the 95 percent confidence level, and I do a two-sided test on the standard deviation: whether it is equal to 0.03 or not equal to 0.03. I click OK, and I have a chi-square test for this. When we built the confidence interval for sigma, for the population variance, the chi-square distribution was used there also. Here too the chi-square distribution is used to define the test statistic, and it gives you the p value. So you have to look at the chi-square test statistic with the degrees of freedom that are given. I can copy this and paste it in Excel so that you can see it enlarged. The chi-square test statistic is 9.38, the degrees of freedom are 14, and the p value is 0.388. That indicates that we cannot reject the null hypothesis that sigma equals 0.03, which is reasonable because our sample value of 0.0245 is fairly close to 0.03.
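The chi-square figures above can be reproduced roughly with the standard library. This is a sketch under stated assumptions, not Minitab's implementation: the function names are invented here, the two-sided p value is taken as twice the smaller tail area (a common convention, assumed rather than confirmed for Minitab), and the closed-form tail formula used works only for even degrees of freedom, which happens to fit the 14 in this example.

```python
import math


def chi2_statistic(n: int, s: float, sigma0: float) -> float:
    """Test statistic (n - 1) * s^2 / sigma0^2 for a one-sample variance test."""
    return (n - 1) * s ** 2 / sigma0 ** 2


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """P(X > x) for a chi-square variable with EVEN df (closed-form sum)."""
    if df % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


def two_sided_p(x: float, df: int) -> float:
    """Two-sided p value taken as twice the smaller tail area (assumed convention)."""
    upper = chi2_upper_tail_even_df(x, df)
    return 2 * min(upper, 1 - upper)


# Statistic rebuilt from the rounded s = 0.0245; it lands a little below 9.38.
stat_from_rounded_s = chi2_statistic(n=15, s=0.0245, sigma0=0.03)
print(f"statistic from rounded s: {stat_from_rounded_s:.2f}")

# p value from the statistic and degrees of freedom reported in the output.
p = two_sided_p(9.38, 14)
print(f"two-sided p value: {p:.3f}")
```

The small gap between the rebuilt statistic and the reported 9.38 comes from using the standard deviation rounded to four decimals.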
So what we are saying is that the variance can also be tested, with the underlying assumption that the data follows a normal distribution. We have an option in Minitab for variance testing. For testing the mean we use the z test or the t test; similarly, when we test one sample variance, we have the chi-square distribution to define the test, and the p value indicates whether or not to reject the null hypothesis. This way we can do one-sample testing.
In quality we will often find that some load information is given. This example was taken earlier. Let us say I want to test whether this load is close to 15 or greater than that, and only then will we accept, as in the coefficient of restitution case. Here also you can check whether the data set follows a normal distribution; I am redoing this. I take the load data and do the Anderson-Darling test, and it indicates that the data can be treated as normal because the p value is 0.838. So there is no problem with the normality assumption, and I can go directly to the test. I go to Basic Statistics and choose the one-sample t test, since sigma is unknown here. I select the load column and perform the test of whether the mean is more than 15, entering 15 as the hypothesized mean. In Options I select mean greater than hypothesized mean. If I click OK, what I see is that the p value is 0.948. Here also there are options to display the data: there is a Graphs option, which we looked at earlier, and we can draw the histogram.
With these you can also see the distribution of the data. The normality assumption holds: statistically we cannot say it is non-normal, so we treat it as following a normal distribution. You will also see a box plot of the load with the hypothesized location marked. The hypothesized mean is at the cross sign, the red mark that you see here; this is the location of the null hypothesis. This is the average that we are getting, the x bar location, and this is the median, the 50th percentile, in the box plot. The x bar that we are getting is close to the hypothesized mean, and that is why the p value comes out much higher than 0.05; that is, the data give no evidence that the mean exceeds the hypothesized value. The conclusion is based not only on the mean that we calculate but also on the variance, the spread of the data set. These two values are used together to make the decision: the t statistic includes both the s and the x bar information. Based on both, we make a judgment on whether the load is greater than 15 or less than or equal to 15.
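To see why the spread matters as much as the mean, the small sketch below holds x bar, the hypothesized mean and n fixed and varies only s. All the numbers are hypothetical and are not the lecture's load data; the function name is also made up for illustration.

```python
import math


def t_statistic(xbar: float, mu0: float, s: float, n: int) -> float:
    """One-sample t statistic: distance of x bar from mu0 in standard-error units."""
    return (xbar - mu0) / (s / math.sqrt(n))


# Hypothetical numbers chosen only to show the effect of the spread.
mu0, xbar, n = 15.0, 15.5, 20
t_values = {s: t_statistic(xbar, mu0, s, n) for s in (0.5, 1.0, 3.0)}

for s, t in t_values.items():
    print(f"s = {s}: t = {t:.2f}")
```

The same gap between x bar and the hypothesized mean gives a large t when the data are tightly clustered and a small t when they are widely spread, so the decision can change even though the mean has not.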
So I have shown what to do when the data is normal, and what options we have if it is non-normal: the transformation option and the non-parametric tests. Among the non-parametric options, as we said, I can go directly to the one-sample sign test. In case you find symmetry in the distribution when you plot the data set, that is, the left-hand side and the right-hand side of the distribution look alike, as in a normal distribution, then the non-parametric option we can explore is the one-sample Wilcoxon test.
We have given some illustrations here. In our next session we will discuss two-sample t tests. This is the starting point of experimentation, and it is important. We have given a brief idea of confidence intervals and of hypothesis testing. I have not emphasized these topics much; for a thorough treatment you can see other, more extensive lectures on hypothesis testing and confidence intervals. But this is the basic idea required in this course; it is the foundation. The p value is important for us because we make judgments based on it, and we will do hypothesis testing everywhere. Most of the time experimentation is hypothesis testing: whether to go for this decision or that decision. So we are doing inferential statistics here, in quality also.
So some understanding of statistics is required to go ahead with the experimentation concepts that we will discuss, factorial design and all the other options that we have. Everywhere some statistical analysis has to be done, and based on that we have to make a decision. But always remember that it has to be a practical decision. It is not that the p value is less than 0.05 and so I immediately take a decision. The physical significance, the practicality of whatever I have concluded, has to be taken into consideration. Suppose you are comparing the computational time of two algorithms. The values may be very close and yet statistically significantly different, but practically the difference does not matter. So you have to interpret the p value very carefully when making a judgment. As an engineer, or as someone knowledgeable about the process, you have to ask whether this difference is really significant, or whether a larger difference is required before we call it significant. The value of p itself also matters for the judgment. If p is very close to 0.05 but just less than 0.05, do we go with that, or do we need more evidence? That we have to check when we are making a decision. So we will close here. In the next session we will discuss two-sample t tests, which are extensively used in quality. Thank you for listening.