Now we move on to the next problem, which is called goodness of fit. In this problem we want to test whether the distribution of the data is a particular distribution or not; so basically this is the problem of modeling of distributions. Roughly speaking, we have a sample: let X_1, X_2, …, X_n be a random sample from f(x), and we want to test whether f(x) = f_0(x) or not. If we wanted to test f(x) = f_0(x) against f(x) = f_1(x), where f_1 is different from f_0, then we would have the most powerful test using the Neyman-Pearson lemma. Here, however, the null hypothesis is specified completely but we are not able to specify the alternative, so we cannot apply the Neyman-Pearson lemma. So what do we do? We classify the data into k categories, say C_1, C_2, …, C_k, and we calculate the probability of X belonging to C_i when X has the distribution f; denote it by θ_i = P(X ∈ C_i), for i = 1, …, k. Now suppose this probability equals θ_{i0} when f is taken to be f_0. We make use of the fact that under the null hypothesis the probability of each category is completely specified, so we can frame this as a multinomial testing problem: our hypothesis testing problem is transformed to H_0: θ_i = θ_{i0}, i = 1, …, k, against the alternative that there is at least one inequality in the above statement. This chi-square test for goodness of fit that I am going to discuss is one of the oldest nonparametric tests; it was developed by Karl Pearson. In this test the problem of testing the full distribution has been cleverly transformed to checking k categories. So let f_i be the number of X_j's belonging to category C_i, for i = 1, …, k. Then (f_1, f_2, …, f_k) has a multinomial distribution with total number of observations n and with probabilities θ_1, θ_2, …, θ_k for the categories C_1, C_2, …, C_k respectively. The simplest thing we can do is to consider a likelihood ratio test for this problem, and for a likelihood ratio test we have to write down the likelihood function. Here the full parameter space is Ω = {(θ_1, …, θ_k) : θ_i ≥ 0, Σ θ_i = 1}, and the likelihood is L(θ) = [n!/(f_1! ⋯ f_k!)] Π_{i=1}^k θ_i^{f_i}. The factorial part is constant, so the maximization problem reduces to the product term only. To maximize L over the parameter space Ω we take logarithms: maximize Σ_{i=1}^k f_i log θ_i subject to the condition Σ θ_i = 1. We introduce a Lagrange multiplier, say λ. Differentiating with respect to each θ_i gives f_i/θ_i − λ = 0, that is, θ_i = f_i/λ for i = 1, …, k. The constraint gives the value of λ: 1 = Σ θ_i = Σ f_i/λ = n/λ, which means λ = n.
So θ̂_i = f_i/n, and this maximizes L(θ). What is the maximum value then? The supremum of L(θ) over θ ∈ Ω is L̂(Ω) = [n!/(f_1! ⋯ f_k!)] Π_{i=1}^k (f_i/n)^{f_i}, obtained by substituting θ_i = f_i/n; and L̂(ω_0), the supremum of L(θ) under H_0, is [n!/(f_1! ⋯ f_k!)] Π_{i=1}^k θ_{i0}^{f_i}. If we consider the likelihood ratio Λ = L̂(ω_0)/L̂(Ω), the factorial term cancels and we are left with Λ = Π_{i=1}^k (n θ_{i0}/f_i)^{f_i}. Then −2 log Λ = −2 Σ_{i=1}^k f_i [log(n θ_{i0}) − log f_i]; we do this in order to develop the distribution. Now expand log(n θ_{i0}) around f_i: log(n θ_{i0}) = log f_i + (n θ_{i0} − f_i)/f_i − (n θ_{i0} − f_i)²/(2 f_i²) + (n θ_{i0} − f_i)³/(3 f_i³) − ⋯, the first derivative of the logarithm at f_i being 1/f_i, the second −1/f_i², and so on. Substituting and adjusting the terms, the log f_i cancels and we get −2 log Λ = −2 Σ_i (n θ_{i0} − f_i) + Σ_i (n θ_{i0} − f_i)²/f_i − (2/3) Σ_i (n θ_{i0} − f_i)³/f_i² + ⋯. The first term is simply zero: Σ θ_i = 1, so Σ θ_{i0} = 1 as well, giving Σ n θ_{i0} = n, and Σ f_i = n too. Now f_i/n converges to θ_{i0} in probability under H_0, that is, f_i converges to n θ_{i0}, which we denote by E_i; therefore the higher-order terms can be neglected, and −2 log Λ is asymptotically equal to Q = Σ_{i=1}^k (n θ_{i0} − f_i)²/f_i = Σ_{i=1}^k (E_i − f_i)²/f_i. The f_i are called the observed frequencies and the E_i the expected frequencies of the ith class. We also have an alternative formula: expanding the square, Σ (E_i² + f_i² − 2 E_i f_i)/f_i = Σ E_i²/f_i + Σ f_i − 2 Σ E_i = Σ_{i=1}^k E_i²/f_i − n, since Σ f_i and Σ E_i are both equal to n. So this is the test statistic which comes from the likelihood ratio.
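To make the recipe concrete, here is a small sketch in Python; the sample, the cut points, and the hypothesized N(0, 1) null are my own hypothetical choices, not from the lecture. It bins the data, forms the observed frequencies f_i, the MLE θ̂_i = f_i/n, the expected frequencies E_i = n θ_{i0}, and compares −2 log Λ with Q:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=500)                      # hypothetical sample; H0: N(0, 1)

# Classify the data into k = 6 categories
edges = np.array([-np.inf, -1.0, -0.5, 0.0, 0.5, 1.0, np.inf])
f = np.histogram(x, bins=edges)[0]            # observed frequencies f_i
n = f.sum()
theta_hat = f / n                             # unrestricted MLE: theta_i = f_i / n
theta0 = np.diff(stats.norm.cdf(edges))       # theta_{i0} = F0(c_i) - F0(c_{i-1})
E = n * theta0                                # expected frequencies E_i = n * theta_{i0}

minus2loglam = 2 * np.sum(f * np.log(f / E))  # -2 log Lambda (assumes every f_i > 0)
Q = np.sum((E - f) ** 2 / f)                  # Q from the Taylor expansion
Q_alt = np.sum(E ** 2 / f) - n                # alternative formula; equals Q exactly
print(minus2loglam, Q, Q_alt)
print(stats.chi2.ppf(0.95, df=len(f) - 1))    # chi-square cutoff on k - 1 = 5 df
```

When the model fits, the first two values printed should be close, which is exactly the asymptotic equivalence derived above.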
In the likelihood ratio test we accept the null hypothesis when Λ = L̂(ω_0)/L̂(Ω) is close to 1 and reject it when Λ is small. Since we have taken −2 times the logarithm, the direction is reversed: we reject for large values of −2 log Λ. Another interpretation you can make from here: the statistic is built from the squared differences between the observed and the expected frequencies. If the true distribution is not f_0, these differences will be large, they will propagate through the sum, and the statistic will become large. So the value of −2 log Λ gives an indication of whether the null hypothesis is true or not. But that is only a rough indication; to get the real picture we need the distribution of the statistic. For that, note that (f_1, …, f_k) is multinomial, so the asymptotic distribution of the statistic is a sum of squares of normals: just as for two categories the binomial converges to the normal, here the frequencies converge to a (k − 1)-dimensional normal distribution, and taking the sum of squares we converge to a chi-square on k − 1 degrees of freedom. So the asymptotic distribution of this quantity, which we called Q, is χ²_{k−1}, and at significance level α we reject H_0 if Q ≥ χ²_{k−1,α}. This test is widely used in all applications for the modeling of statistical distributions and it is extremely useful. However, since it is asymptotic, certain assumptions are needed: for example, the expected frequency of each cell should be at least 5 for a good approximation; if that is not so, the test is not very good. Another test for goodness of fit was developed by Kolmogorov and Smirnov; it is called the Kolmogorov-Smirnov one-sample statistic. As before, we write our hypothesis testing problem as H_0: F(x) = F_0(x) for all x against H_1: F(x) ≠ F_0(x) for some x, and we have at our disposal a random sample from this population. We define the sample distribution function F_n(x), that is, the empirical distribution function of X_1, X_2, …, X_n, built from the order statistics, and we define D_n = sup_x |F_n(x) − F_0(x)|, the maximum absolute difference between the empirical distribution function and the assumed distribution function F_0. What is the idea behind this? The idea is the result about the empirical distribution function which we gave earlier: it is strongly consistent, and not only that, we actually proved that P(sup_x |F_n(x) − F(x)| > ε) → 0. Because of this, D_n is a very good indicator of the actual discrepancy between the assumed model and the sample: based on the sample we calculate the empirical distribution function, and if there is too much discrepancy then this value will be large. Based on this idea Kolmogorov and Smirnov defined the statistic D_n. Now, as with the chi-square goodness of fit, we need to discuss the distribution of D_n. At a fixed x we know the distribution of F_n(x), but since we are considering the supremum the problem becomes slightly different.
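As a quick numerical illustration of that consistency result (a sketch with hypothetical N(0, 1) data; the sample sizes are arbitrary choices of mine), the supremum discrepancy sup_x |F_n(x) − F(x)| visibly shrinks as n grows:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
for n in (50, 500, 5000):
    x = np.sort(rng.normal(size=n))
    u = stats.norm.cdf(x)                # F(X_(i)) evaluated at the order statistics
    i = np.arange(1, n + 1)
    # sup_x |F_n(x) - F(x)|, taken over each interval between order statistics
    sup_diff = np.max(np.maximum(i / n - u, u - (i - 1) / n))
    print(n, sup_diff)                   # decreases, roughly at rate 1/sqrt(n)
```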
So we further define two quantities: D_n⁺ = sup_x (F_n(x) − F(x)) and D_n⁻, where I reverse the difference, D_n⁻ = sup_x (F(x) − F_n(x)); then D_n is exactly the maximum of D_n⁺ and D_n⁻. That means I am taking the maximum positive difference and the maximum negative difference. Now we will try to analyze the distributions of D_n⁺ and D_n⁻ separately. So D_n⁺ is the supremum of F_n(x) − F(x) over all x. We have the ordered sample X_(1) ≤ X_(2) ≤ ⋯ ≤ X_(n), and I also take X_(0) = −∞ and X_(n+1) = +∞. We can then express D_n⁺ = max_{0≤i≤n} sup{F_n(x) − F(x) : X_(i) ≤ x < X_(i+1)}, putting a strict inequality on one side. That means I am considering the supremum over the intervals (−∞, X_(1)), then [X_(1), X_(2)), then [X_(2), X_(3)), and so on up to [X_(n), ∞). So I have divided the problem into looking at the difference on each of these intervals, and the advantage of this approach is that the empirical distribution function is constant there: F_n(x) = i/n on [X_(i), X_(i+1)). So basically I am looking at the difference of F(x) from i/n when x is in the interval [X_(i), X_(i+1)), and we do this over all i. Now F is a nondecreasing function because it is a CDF, and i/n is a fixed quantity on the interval; so the supremum of i/n − F(x) over the interval is attained where F is smallest, namely at X_(i). So this becomes D_n⁺ = max_{0≤i≤n} (i/n − F(X_(i))). But the F(X_(i)) are exactly the U_(i) that we have already seen: the order statistics from the uniform(0, 1) distribution. The term corresponding to i = 0 is 0, so D_n⁺ = max{0, max_{1≤i≤n} (i/n − U_(i))}. Now this is very interesting. I started with some sample, and I considered the difference between F_n(x), the empirical distribution function, and F(x); but this quantity has become free from the original distribution, because it is nothing but a function of uniform(0, 1) order statistics. Thus we have shown that D_n⁺ is distribution-free. As I mentioned at the beginning of this section on nonparametric methods, here we develop methods which are free from the original distributional assumptions: whatever the distribution is originally, it does not affect the procedure, and no distributional assumptions are required except, of course, continuity and the like. In a similar way I can consider D_n⁻ = sup_x (F(x) − F_n(x)), where I take the difference with the negative sign. This equals max_{0≤i≤n} sup{F(x) − F_n(x) : X_(i) ≤ x < X_(i+1)} = max_{0≤i≤n} (F(X_(i+1)) − i/n) = max{0, max_{1≤i≤n} (U_(i) − (i − 1)/n)}, where I have shifted the index by 1 because I run i from 1 to n: in place of i + 1 I write i, and then i/n becomes (i − 1)/n. So once again, as with D_n⁺, you see that D_n⁻ is also distribution-free.
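Here is a sketch of these order-statistic formulas in Python (hypothetical N(0, 1) data and null again); scipy.stats.kstest reports exactly D_n, so it serves as a cross-check:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = np.sort(rng.normal(size=50))             # hypothetical sample; H0: F0 = N(0, 1)
n = len(x)
u = stats.norm.cdf(x)                        # U_(i) = F0(X_(i))
i = np.arange(1, n + 1)

d_plus = max(0.0, np.max(i / n - u))         # D_n+ = max(0, max_i (i/n - U_(i)))
d_minus = max(0.0, np.max(u - (i - 1) / n))  # D_n- = max(0, max_i (U_(i) - (i-1)/n))
d_n = max(d_plus, d_minus)

print(d_n, stats.kstest(x, 'norm').statistic)  # the two values agree
```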
So the first thing is that we are able to derive the forms of D_n⁺ and D_n⁻ in terms of the order statistics from uniform(0, 1). The distribution of U_(i) is known; we derived it as a beta distribution. But the form coming out here involves the maximum over dependently distributed random variables, and then again the maximum of the two, so let me work out the distribution of D_n. One thing to note: the U_(i) take values between 0 and 1, and the quantities i/n and (i − 1)/n also lie between 0 and 1, so all these differences, and hence D_n itself, lie between 0 and 1. That means P(D_n ≤ d) = 0 if d < 0, and P(D_n > d) = 0 if d ≥ 1. Now let us consider P(D_n ≤ d) for d between 0 and 1, starting with a particular value, d = 1/(2n); why this particular value will become clear when we derive the expression. The reason is that the jumps of F_n are of size 1/n, so the differences in each interval are of this order. Let me first derive it: P(D_n ≤ 1/(2n)) = P(max{0, max_{1≤i≤n} (i/n − U_(i)), max_{1≤i≤n} (U_(i) − (i − 1)/n)} ≤ 1/(2n)). We split it: 0 ≤ 1/(2n) is always true, so this probability equals P(i/n − U_(i) ≤ 1/(2n) and U_(i) − (i − 1)/n ≤ 1/(2n) for all i = 1, …, n). Since the U_(i) are already ordered, these two statements can be combined: the first gives U_(i) ≥ i/n − 1/(2n) = (2i − 1)/(2n), and taking the other one to the other side gives U_(i) ≤ (i − 1)/n + 1/(2n) = (2i − 1)/(2n), for i = 1, …, n. This is interesting: both sides are the same. So this actually becomes P(U_(i) = (2i − 1)/(2n) for all i). That was the reason I considered d = 1/(2n): for this particular value we get an extremely simple expression. And certainly, since the U_(i) are continuous random variables, this probability is equal to 0.
So what we find here is that P(D_n ≤ 1/(2n)) = 0; that means D_n effectively has to start from 1/(2n), which was not clear from the original definition. We said D_n lies between 0 and 1, but now we see that even the event D_n ≤ 1/(2n) has probability 0. So we consider P(D_n < 1/(2n) + v) for some v > 0. Arguing as before, wherever I had 1/(2n) I now get 1/(2n) + v, so the lower bound picks up a −v and the upper bound a +v. Proceeding as above we get P(D_n < 1/(2n) + v) = P((2i − 1)/(2n) − v < U_(i) < (2i − 1)/(2n) + v, i = 1, …, n). Now this is a joint probability for the random variables U_(1), U_(2), …, U_(n), the order statistics from uniform(0, 1), whose joint pdf is known: it is n! on the region 0 < u_1 < u_2 < ⋯ < u_n < 1. So this is nothing but an n-fold integral of n! over the given region: for u_1 you integrate from 1/(2n) − v to 1/(2n) + v, for u_2 from the maximum of its lower limit and u_1, and so on, because we also have the ordering constraint u_1 < u_2 < ⋯ < u_n. For small n, say n = 2 or n = 3, these integrals can be evaluated explicitly. The upper 100α% points of the distribution of D_n, call them d_{n,α}, satisfying P(D_n > d_{n,α}) = α under H_0, have been tabulated (by Birnbaum and others) for various values of n and α. Kolmogorov also derived the asymptotic distribution: lim_{n→∞} P(D_n < v/√n) = 1 − 2 Σ_{i=1}^∞ (−1)^{i−1} e^{−2i²v²}. Now, if I consider a number c between 0 and 1, then P(D_n⁺ < c) = P(max_{1≤i≤n} (i/n − U_(i)) < c) = P(i/n − U_(i) < c, i = 1, …, n) = P(U_(i) > i/n − c, i = 1, …, n). Once again this can be evaluated in terms of the joint distribution of U_(1), …, U_(n). Similarly, consider P(D_n⁻ < c): the event is U_(i) < c + (i − 1)/n for i = 1, …, n, which is the same as 1 − U_(i) > (n − i + 1)/n − c for i = 1, …, n. Now if U_1, U_2, …, U_n are i.i.d. uniform(0, 1), then 1 − U_1, 1 − U_2, …, 1 − U_n are also i.i.d. uniform(0, 1); that is, if I set V_i = 1 − U_i, the V_i are i.i.d. uniform(0, 1), and for the order statistics 1 − U_(i) = V_(n−i+1). So the event can be written as V_(n−i+1) > (n − i + 1)/n − c for i = 1, …, n, which, writing j = n − i + 1, is V_(j) > j/n − c for j = 1, …, n. Since V_(1), …, V_(n) are themselves the order statistics of an i.i.d. uniform(0, 1) sample, this is exactly the event we had for D_n⁺, only with U replaced by V.
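These distributional facts are easy to check by simulation. The sketch below (my own choice of n, replication count, and evaluation point v) uses the distribution-free representation through uniform order statistics and compares the Monte Carlo tail probability of √n·D_n with the Kolmogorov series, which SciPy exposes as scipy.special.kolmogorov:

```python
import numpy as np
from scipy.special import kolmogorov     # 2 * sum_i (-1)^(i-1) * exp(-2 i^2 v^2)

rng = np.random.default_rng(3)
n, reps, v = 100, 20000, 1.0
i = np.arange(1, n + 1)

u = np.sort(rng.uniform(size=(reps, n)), axis=1)         # uniform order statistics
d_plus = np.maximum(0.0, (i / n - u).max(axis=1))
d_minus = np.maximum(0.0, (u - (i - 1) / n).max(axis=1))
d_n = np.maximum(d_plus, d_minus)

print((np.sqrt(n) * d_n > v).mean())     # Monte Carlo estimate of P(sqrt(n) D_n > v)
print(kolmogorov(v))                     # asymptotic tail probability, about 0.27
```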
So D_n⁺ and D_n⁻ have the same distribution, and one can directly use D_n⁺ or D_n⁻ for the testing problem H_0: F(x) = F_0(x) for all x against H_1: F(x) ≠ F_0(x) for some x. The asymptotic distribution of D_n⁺ has also been worked out: lim_{n→∞} P(D_n⁺ < z/√n) = 1 − e^{−2z²}. (*) If I define U = 4n(D_n⁺)², then P(U ≤ u) = P(4n(D_n⁺)² ≤ u) = P(D_n⁺ ≤ √u/(2√n)), and applying formula (*) with z = √u/2, the limit becomes 1 − e^{−u/2}. So the limiting pdf of U is (1/2)e^{−u/2}, a negative exponential distribution, which can also be described as the chi-square distribution on 2 degrees of freedom. Of course, since it is a negative exponential distribution, its percentage points can be easily calculated, and we can express the test in terms of U as well; asymptotic tests and asymptotic confidence intervals can be obtained from it. From (*) we can choose the cutoff d⁺_{n,α} through 1 − α = P(√n D_n⁺ < z) = P(4n(D_n⁺)² < 4z²), that is, 4z² = χ²_{2,1−α}, and this is easily calculated. One can also use D_n to find a confidence band for F(x): P(D_n ≤ d_{n,α}) = 1 − α, and we write this as P(sup_x |F_n(x) − F(x)| ≤ d_{n,α}) = 1 − α, which is equivalent to saying that F_n(x) − d_{n,α} ≤ F(x) ≤ F_n(x) + d_{n,α} for all x, with probability 1 − α. So a 100(1 − α)% confidence band for F(x) is of the form [max(0, F_n(x) − d_{n,α}), min(1, F_n(x) + d_{n,α})].
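A sketch of this band in Python (hypothetical data and level; I take d_{n,α} from scipy.stats.kstwo, which in newer SciPy releases implements the exact two-sided one-sample D_n distribution, so treat that dependency as an assumption):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = np.sort(rng.normal(size=40))        # hypothetical sample
n, alpha = len(x), 0.05

d = stats.kstwo.ppf(1 - alpha, n)       # d_{n,alpha}: P(D_n > d) = alpha
F_n = np.arange(1, n + 1) / n           # empirical CDF at the order statistics

lower = np.clip(F_n - d, 0.0, 1.0)      # max(0, F_n(x) - d_{n,alpha})
upper = np.clip(F_n + d, 0.0, 1.0)      # min(1, F_n(x) + d_{n,alpha})
# [lower[i], upper[i]] is the 95% simultaneous band for F at x = x_(i)
```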
So we have seen that this Kolmogorov-Smirnov test actually uses more of the data than the chi-square test for goodness of fit developed by Karl Pearson. In the Karl Pearson test we essentially reduced it to a category problem: we considered k classes out of the full distribution, and the test is therefore sensitive to which categories you choose and how many categories you take. The Kolmogorov-Smirnov test is more robust in this respect; of course, it has some sensitivity issues in the tails of the distribution, and some modifications have been proposed for that, but that is beside the point. Essentially what we have seen is that the distribution of D_n is actually derivable, so this is, you can say, a much improved thing compared to the chi-square test for goodness of fit. The only thing is that the use of Kolmogorov-Smirnov is not that straightforward for people with little background in statistics, because they need to understand the tabulated distribution of D_n, that is, how its percentage points are calculated; whereas for the chi-square test the percentage points are simply those of the chi-square distribution, so with a little knowledge of distribution theory one can apply that test. So there is, you can say, a compromise: ease of application is there in the chi-square test, but the robustness is more in the Kolmogorov-Smirnov test. Next we consider single-sample location problems. In the beginning I gave some applications of two-sample functions based on F and G, where you have two samples and some heuristic tests for location and so on are built on them. Now I am going to use ranks and related quantities to derive a few tests for location problems. So first let us consider the one-sample problem. Let X_1, X_2, …, X_n be a random sample with CDF F(x). Let θ denote the median of F, and let F be strictly increasing and continuous at x = θ; that means we are assuming the median to be unique. We want to test H_0: θ = θ_0 against H_1: θ > θ_0, or H_2: θ < θ_0, or H_3: θ ≠ θ_0. These are the three types of alternative hypotheses we will consider, and you can compare this with the parametric testing problem: in the parametric one-sample problem we tested whether the mean is equal to, less than, greater than, or not equal to some μ_0, and we did that under the assumption of normality. Here no distributional assumption is made except that F is continuous and strictly increasing at θ, so that the median is uniquely defined. Now we want to test about some value θ_0. Whatever the value θ_0, since it is known, if we shift the observations X_i to X_i − θ_0, the median of the new distribution becomes 0 under H_0. So we do exactly that, and without loss of generality we take θ_0 to be 0, so that the problem becomes slightly simpler. Define ψ_i = 1 if X_i > 0 and ψ_i = 0 if X_i ≤ 0, and define S = Σ_{i=1}^n ψ_i. S is called the sign test statistic because it gives the number of positive X_i's; it tells exactly how many X_i are positive. In general P(X_i > 0) is some p, but under H_0, when 0 is the median, p = 1/2; so under H_0 the distribution of S is binomial(n, 1/2). We can therefore devise a simple test based on S, the sign test: reject H_0 if S ≥ k_α when the alternative is H_1; reject if S ≤ k_{1−α} when the alternative is H_2; and reject if S ≤ k_{1−α/2} or S ≥ k_{α/2} when the alternative is H_3.
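A minimal sketch of the sign test (the data and θ_0 below are hypothetical); scipy.stats.binomtest computes the exact binomial p-value, which is equivalent to comparing S with the cutoff k_α defined next:

```python
import numpy as np
from scipy import stats

theta0 = 0.0
x = np.array([0.8, -0.2, 1.5, 0.3, -0.7, 2.1, 0.9, 1.2, -0.1, 0.6])  # hypothetical

s = int(np.sum(x - theta0 > 0))   # sign statistic: number of positive X_i - theta0
n = len(x)

# alternative H1: theta > theta0, upper-tail test under S ~ Binomial(n, 1/2)
print(s, stats.binomtest(s, n, p=0.5, alternative='greater').pvalue)
```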
Here k_β is the smallest integer k such that P(S ≥ k) ≤ β under H_0. We have to take the smallest such k because the distribution is discrete, so we may not actually achieve equality: P(S ≥ k_α) need not be exactly α; that is why we choose the cutoff this way. So basically we are saying that k_β is the smallest k such that Σ_{x=k}^n C(n, x)(1/2)^n ≤ β, and this can easily be calculated from tables of the binomial distribution. Since the binomial(n, 1/2) distribution is symmetric, we have P(S ≥ k) = 1 − P(S < k) = P(S ≤ n − k), and therefore n − k_β = k_{1−β}. We can also calculate the power function of the sign test: when θ is the true median, P_θ(S ≥ k_α) = Σ_{x=k_α}^n C(n, x)(1 − F_θ(0))^x (F_θ(0))^{n−x}, where k_α is the smallest k such that P(S ≥ k) under θ = 0 is at most α, that is, (1/2)^n Σ_{x=k}^n C(n, x) ≤ α. Let us take an example: suppose F is normal(θ, σ²). Then F_θ(0) = P(X ≤ 0) = Φ(−θ/σ) = 1 − Φ(θ/σ), so F_0(0) = 1/2. Take α = 0.0384, n = 16, and σ² = 1; then we can see from the binomial tables that k_α = 12. If we consider the power function at θ = 1.04, then 1 − F_θ(0) = Φ(1.04) = 0.8508, and the power equals Σ_{x=12}^{16} C(16, x)(0.8508)^x (0.1492)^{16−x}, which is approximately 0.9211. If we consider the corresponding t-test, which rejects when √n X̄/s = 4X̄/s > t_{0.0384} ≈ 1.77, its power is about 0.9918. So certainly the power of the sign test is less than the power of the usual t-test that we already know; but that comparison is under the assumption of normality. If we actually have no knowledge about normality, then the t-test can fail miserably, because it will simply give a wrong answer, whereas the sign test remains valid. Another thing is that the sign test can also be used asymptotically: since S follows a binomial distribution, which asymptotically becomes normal, one can use the normal approximation. So in both cases the critical point and the power of the test can be easily calculated.
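The power numbers in this example can be reproduced directly; a sketch (constants taken from the example above, with the 1.77 cutoff treated as given and the second power computed from the normal approximation):

```python
from scipy import stats

n, k_alpha, theta, sigma = 16, 12, 1.04, 1.0
p = stats.norm.cdf(theta / sigma)              # 1 - F_theta(0) = Phi(1.04) = 0.8508

power_sign = stats.binom.sf(k_alpha - 1, n, p) # P(S >= 12) at p = 0.8508; about 0.9211
power_t = stats.norm.sf(1.77 - 4 * theta)      # normal-theory approximation; about 0.99
print(power_sign, power_t)

# check of the cutoff: P(S >= 12) under H0 (p = 1/2) is about 0.0384
print(stats.binom.sf(k_alpha - 1, n, 0.5))
```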
In the following lectures I will describe some other test statistics which are based on the order statistics. In the sign test the order statistics are not used; only the sign of each observation matters, so in that sense it is, you can say, an extremely simple test for the median of the distribution. Next we will define certain tests which are based on the actual values, the actual measurements: the Wilcoxon signed-rank test statistic, the Mann-Whitney statistic, and so on.