Okay, after seeing how to apply our Chi-square test, let us now start looking into another test, the Kolmogorov-Smirnov test. Notice that when we used the Chi-square test, we basically compared the observed frequencies f_i with the expected frequencies over k cells, i = 1 to k, but we actually had n sample points. So why compare only at these k points, why not compare using all n points? That is what we will do in the Kolmogorov-Smirnov test: we are going to use all n points, comparing them through the empirical relative frequencies. What we are basically going to do is compute the deviation between the observed empirical CDF and the CDF under the null hypothesis. Under the null hypothesis my CDF F_0 is already given to me, and now I am going to compare it against my empirical CDF. When we apply the Kolmogorov-Smirnov test, we assume that the null hypothesis corresponds to a continuous distribution. Now let us understand how to compute this empirical distribution. Suppose we have a random sample X_1, ..., X_n; then its empirical CDF at a point x is the fraction of sample points taking value less than or equal to x. Formally, S_n(x) is the number of points taking value less than or equal to x, divided by n. The test statistic in the KS test is the maximum deviation between my empirical and my hypothesized CDFs: D_n = sup_x |S_n(x) - F_0(x)|. Since x ranges over the entire real line, we take the supremum of these differences over all x. Now the question here is: to compute this D_n statistic, do I need to know the underlying distribution of the sample? To compute D_n, yes, indeed I need to know what F_0 is; but does the distribution of D_n itself depend on the underlying distribution?
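The empirical CDF just defined can be sketched in a few lines of NumPy; the helper name `empirical_cdf` is my own, not from the lecture.

```python
import numpy as np

def empirical_cdf(sample):
    """Return S_n as a function: S_n(x) = (# of sample points <= x) / n."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = len(xs)
    # searchsorted with side="right" counts how many sorted points are <= t
    return lambda t: np.searchsorted(xs, t, side="right") / n
```

For the sample (0.1, 0.4, 0.7), S_n jumps by 1/3 at each observation, so for instance S_n(0.5) = 2/3.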
Whether it depends on the underlying distribution or not, we will come to that point; for now, whatever D_n is, it is a stochastic quantity, and for a given alpha let us denote by D_{n,alpha} the (1 - alpha)th quantile of its distribution. Now if I want to use this statistic to test the hypothesis that the sample has CDF F_0 at all points, against the alternative that it differs at at least one point, I can do so by comparing my D_n against this D_{n,alpha}: I accept the null hypothesis when D_n is small, that is, when D_n ≤ D_{n,alpha}, and I reject it when D_n is larger than D_{n,alpha}. This makes sense, because if my underlying samples are indeed following the null-hypothesis distribution, then S_n is expected to be close to F_0 at most points, whereas if they follow something else, S_n will differ from F_0 at at least one point, and then D_n will be large. Based on that intuition we can set up the test, and by setting my threshold D_{n,alpha} to be the (1 - alpha)th quantile, the probability of rejecting under the null is alpha; that is, I get an alpha-level test. But how do we compute the distribution of D_n? How is D_n distributed? We will come to that. Before that, notice that in this test we took the absolute difference, because we wanted the CDFs to be exactly equal at all points.
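As a minimal sketch of this decision rule (not the lecture's exact table values), one can approximate D_{n,alpha} by the standard asymptotic formula sqrt(-ln(alpha/2)/(2n)), which is reasonable for moderately large n; the function names here are hypothetical.

```python
import math

def ks_critical_value(n, alpha):
    """Asymptotic approximation to D_{n,alpha}, the (1 - alpha) quantile
    of the two-sided KS statistic; reasonable for moderately large n."""
    return math.sqrt(-math.log(alpha / 2.0) / (2.0 * n))

def ks_decision(dn, n, alpha=0.05):
    """Alpha-level rule from the lecture: reject H0 iff D_n > D_{n,alpha}."""
    return "reject H0" if dn > ks_critical_value(n, alpha) else "accept H0"
```

For example, at n = 100 and alpha = 0.05 the approximate threshold is about 0.136, so an observed D_n of 0.05 is accepted while 0.20 is rejected.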
This is called the two-sided KS test. When we do not take the absolute value, but take the difference between S_n and F_0, we define the statistic D_n^+ = sup_x (S_n(x) - F_0(x)); and when we take the difference between F_0 and S_n and take the supremum over all x, we denote it D_n^- = sup_x (F_0(x) - S_n(x)). If we now define D_{n,alpha}^+ and D_{n,alpha}^- to be the (1 - alpha)th quantiles of the distributions of D_n^+ and D_n^- respectively, we can think of one-sided tests. If we want to test that F(x) matches F_0(x) at all points, against the alternative that F(x) is larger than F_0(x) for at least some x, we accept the null hypothesis when D_n^+ ≤ D_{n,alpha}^+ and reject it otherwise. When we want to test the opposite, whether F(x) equals F_0(x) for all x against F(x) being smaller than F_0(x) for some x, we use D_n^- instead: accept H_0 if D_n^- ≤ D_{n,alpha}^- and reject otherwise. These are called one-sided tests, and we have two criteria depending on whether we suspect the actual distribution lies above or below the null hypothesis. Okay, to understand the distribution of D_n, we need to revisit a little the properties of the empirical distribution and also our order statistics. Say I have a random sample drawn from some underlying distribution. Recall that we denoted the order statistics as X_(1), X_(2), ..., X_(n), where X_(1) denotes the smallest value, X_(2) the second smallest, and so on.
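For the one-sided thresholds, Smirnov's limit theorem gives P(sqrt(n) D_n^+ > t) → exp(-2t²), so an asymptotic approximation to D_{n,alpha}^+ (and, by symmetry, D_{n,alpha}^-) can be sketched by inverting that tail; the helper name is again hypothetical.

```python
import math

def one_sided_critical(n, alpha):
    """Asymptotic (1 - alpha) quantile of D_n^+ (same for D_n^-),
    obtained by inverting the tail approximation
    P(D_n^+ > d) ~ exp(-2 * n * d**2)."""
    return math.sqrt(-math.log(alpha) / (2.0 * n))
```

Note that the one-sided threshold is smaller than the two-sided one at the same alpha, since only deviations in one direction count against the null.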
Now, instead of only looking at the first through nth order statistics, we can also define a 0th order statistic, taken to be minus infinity, and an (n+1)th order statistic, taken to be plus infinity. With this convention we can write our empirical distribution as a step function: S_n(x) = i/n whenever X_(i) ≤ x < X_(i+1), for i = 0, 1, ..., n. So S_n is 0 to the left of X_(1), jumps to 1/n on [X_(1), X_(2)), to 2/n on [X_(2), X_(3)), and so on, up to 1 on [X_(n), ∞). Notice that S_n(x) is a random quantity for any x, because it is defined in terms of the random sample. It so happens that, for each fixed x, S_n(x) converges almost surely to F(x), where F is the true CDF of the sample. Further, the expected value of S_n(x) is exactly F(x). So what we are basically saying is that S_n(x) is an unbiased and consistent estimator of the CDF of X. This S_n therefore provides good information about my null hypothesis when I have a large number of samples, and that is why I want to compare these two. Now, to understand the distribution, let us express this D_n using the properties of the order statistics. I know that D_n can be written in terms of D_n^+ and D_n^-, namely D_n = max(D_n^+, D_n^-).
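The claim that S_n(x) is unbiased and consistent for F(x) is easy to check by simulation, since S_n(x) is just an average of indicator variables; this is an illustrative sketch with an arbitrary seed and evaluation point of my choosing.

```python
import numpy as np

rng = np.random.default_rng(42)
x0, n = 0.3, 5000
sample = rng.uniform(size=n)      # true CDF here is F(x) = x on (0, 1)
sn_x0 = np.mean(sample <= x0)     # S_n(x0): fraction of points <= x0
# Each indicator 1{X_i <= x0} is Bernoulli(F(x0)), so E[S_n(x0)] = F(x0),
# and S_n(x0) -> F(x0) almost surely by the strong law of large numbers.
```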
Now, D_n^+ is simply sup_x (S_n(x) - F_0(x)), and I can split this supremum over the whole real line into a maximum over i from 0 to n of suprema over the ranges [X_(i), X_(i+1)). So this supremum over the entire line I have basically divided into n+1 regions, and I am now looking at the supremum in each of these regions. I know that for x between X_(i) and X_(i+1), S_n(x) is constant and equal to i/n; as we just said, S_n is 0 before X_(1), 1/n on the next interval, 2/n on the one after, and so on, jumping by equal amounts of 1/n at each order statistic. So, taking the supremum inside, D_n^+ = max over 0 ≤ i ≤ n of (i/n - inf of F_0(x) over X_(i) ≤ x < X_(i+1)). Because of the monotonicity of F_0, the infimum over each interval is attained at its left endpoint, so it is simply F_0(X_(i)), and therefore D_n^+ = max over 0 ≤ i ≤ n of (i/n - F_0(X_(i))). One can compute D_n^- similarly, and using D_n^+ and D_n^- I can write D_n = max(D_n^+, D_n^-). But notice that even after writing this somewhat longer expression, D_n^+, D_n^-, and D_n all still depend on F_0, the distribution of my null hypothesis. At this point it is not yet clear why the distribution of D_n is independent of the underlying distribution against which we are testing the samples.
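The reduction just derived turns the supremum over the whole line into a maximum over the n order statistics, which is how D_n is computed in practice. A sketch (with a hypothetical function name):

```python
import numpy as np

def ks_from_order_stats(sample, F0):
    """Compute D_n^+, D_n^-, and D_n via the order-statistics reduction:
    D_n^+ = max_i (i/n - F0(x_(i))),  D_n^- = max_i (F0(x_(i)) - (i-1)/n)."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = len(xs)
    u = F0(xs)                        # F0 evaluated at the order statistics
    i = np.arange(1, n + 1)
    d_plus = np.max(i / n - u)
    d_minus = np.max(u - (i - 1) / n)
    return d_plus, d_minus, max(d_plus, d_minus)
```

For the sample (0.1, 0.4, 0.7) tested against Uniform(0, 1), this gives D_n^+ = 0.3, D_n^- = 0.1, and hence D_n = 0.3.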
To see this, let us understand a little more about transformations of random variables. Suppose I have a random variable X with continuous CDF F; if I define a new random variable Y = F(X) by applying the transformation F to X, then Y has a uniform distribution on (0, 1). You can check this, it is like an exercise. Now I can apply this same transformation F to the rth order statistic: instead of simply taking my random variable X, I replace it by its rth order statistic and define U_(r) = F(X_(r)). This is referred to as the rth order statistic from the uniform distribution over (0, 1): notice that U_(r) again takes values between 0 and 1. One can explicitly derive the distribution of U_(r), and it is a Beta distribution, namely Beta(r, n - r + 1), with density f(u) = [n! / ((r-1)!(n-r)!)] u^{r-1} (1-u)^{n-r} on (0, 1). Now notice that this distribution of U_(r) is independent of F: it does not depend on the underlying CDF we started with, whatever that F was. Because of that, the distribution of the statistic we are interested in does not depend on F_0. Even though an explicit closed form of this distribution is not available, one can do numerical computations, obtain the tail probabilities, and compute D_{n,alpha} for all values of alpha. We can then apply these thresholds to get an alpha-level test, and because of that we can conclude that the KS test is a distribution-free test: for the statistic, you do not need to make any assumption about the underlying distribution.
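The probability integral transform behind this distribution-free property is easy to see empirically: pushing any continuous sample through its own CDF yields approximately uniform values. A sketch with an Exponential(1) sample (seed and sample size are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(size=10_000)   # CDF of Exponential(1): F(x) = 1 - e^{-x}
u = 1.0 - np.exp(-x)               # the transform U = F(X)
# U should behave like Uniform(0, 1): mean ~ 1/2, variance ~ 1/12,
# regardless of the distribution we started from.
```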
It is independent of the null hypothesis you are interested in, unlike the t test or the F test, where we have to explicitly assume that the statistic is t distributed or F distributed. That is why we call it a distribution-free test, and as I said, because the distribution of D_n does not depend on the hypothesized distribution, one can do extensive numerical simulations and get very good values of its tail probabilities. So now I hope how to apply the KS test is clear: all you need to do is compute your D_n if you want a two-sided test, and for an alpha-level test you compare it against D_{n,alpha}; if D_n is larger, you reject, and this gives an alpha-level test. As a quick example, suppose 20 observations were chosen uniformly at random over the interval (0, 1) and rounded to 4 significant digits, and you want to test the null hypothesis that the square roots of these numbers also follow the uniform distribution over (0, 1). The 20 samples are taken and arranged in increasing order, and now let us see how we can apply the KS test here. The computations are presented in table format: because we are interested in the square roots of the observed samples, we take the square roots of these values; say the observations are x and their square roots √x. We know that S_n increments in steps of 1/20 = 0.05, since n = 20 here.
So you will see that these S_n values increase in steps of 0.05, and the null hypothesis being the uniform distribution, we know that F_0(x) = x, a straight line, so the hypothesized CDF at each point is just the point itself. Now you take the differences between S_n and F_0 at these points, and what is interesting is the largest absolute value among them: that is the value of D_n. At significance level alpha = 0.10, from the tables you get D_{n,alpha} to be 0.352, and you will see that D_n is less than D_{n,alpha}. Because of that, the test suggests that you accept the null hypothesis. But is this correct? Notice that what we have been told is that X is uniform on (0, 1), and we have been asked to check whether √X is also uniform, which is not the case: if X is Uniform(0, 1), the CDF of √X is t², not t. Yet by applying the KS test we ended up accepting that √X is also uniformly distributed. Obviously this is not correct; here the number of samples is not good enough to make a decision, and as a rule of thumb one needs considerably more samples to get a fairly reliable answer, otherwise we may end up making wrong conclusions, okay.
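The example can be replayed numerically (with synthetic data, since the lecture's table of 20 values is not reproduced here): draw 20 uniforms, take square roots, and compute D_n against Uniform(0, 1). The true CDF of √X is t², so H_0 is false, yet with only n = 20 points the statistic can easily stay below the alpha = 0.10 threshold.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20
x = np.round(rng.uniform(size=n), 4)   # 20 uniforms, rounded to 4 digits
y = np.sort(np.sqrt(x))                # test whether sqrt(X) ~ Uniform(0, 1)

i = np.arange(1, n + 1)
d_plus = np.max(i / n - y)             # F0(t) = t for the uniform null
d_minus = np.max(y - (i - 1) / n)
dn = max(d_plus, d_minus)
# The population KS distance sup_t |t^2 - t| is only 0.25, which is close
# to typical critical values at n = 20, so small samples easily "accept".
```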