Yesterday, we discussed how to derive the asymptotic distributions of order statistics. We found the asymptotic distribution of the r-th order statistic under two conditions. One was when r is kept fixed but the sample size n tends to infinity. Under this condition, the r-th order statistic from the uniform distribution has asymptotically a gamma distribution, and therefore we can find, in terms of F, the asymptotic distribution of the r-th order statistic from any distribution. The second condition was that r tends to infinity and n tends to infinity, but r/n tends to p; that means we are fixing the position in a fixed proportion, for example the median, a quartile, etcetera. In that case, considering first the uniform distribution, U_(r) is asymptotically normal with mean p and variance p(1 - p)/n, where r/n tends to p as r and n tend to infinity. This is the result that we had proved. Now, let me apply the following result: if the asymptotic distribution of a certain sequence of random variables is known, then we can consider a function of it. So, we have the following result. Let T be asymptotically normal with mean mu and variance sigma square, and of course we assume that sigma square tends to 0 as n tends to infinity; this is an additional assumption. Let G be a differentiable function such that G'(mu) is not 0. Then G(T) is asymptotically normal with mean G(mu) and variance sigma square into (G'(mu)) square. Now, we have derived the asymptotic distribution of U_(r), and we use the relation X_(r) = F_X^{-1}(U_(r)). If we use this, then when n tends to infinity and r tends to infinity such that r/n tends to p, the asymptotic distribution of X_(r) is normal with mean F_X^{-1}(p) and variance p(1 - p)/n into (1/f(F_X^{-1}(p))) square; here F_X^{-1} is the inverse of the capital-F CDF and f is the density.
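As a sanity check on this asymptotic normality result, here is a small simulation sketch; the exponential parent distribution, the sample size, and the number of replicates are my own illustrative choices, not from the lecture. It compares the empirical mean and variance of the sample median against the values that the asymptotic approximation predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 4000, 0.5, 1000            # illustrative choices
r = int(n * p)                          # r/n -> p

# Sample from Exp(1), whose quantile function is F^{-1}(p) = -log(1 - p)
samples = rng.exponential(1.0, size=(reps, n))
x_r = np.sort(samples, axis=1)[:, r - 1]   # r-th order statistic (1-based)

mu = -np.log(1 - p)                     # F^{-1}(p)
f_mu = np.exp(-mu)                      # density f evaluated at F^{-1}(p)
var = p * (1 - p) / (n * f_mu**2)       # asymptotic variance from the lecture

print(abs(x_r.mean() - mu))             # small: X_(r) centres at F^{-1}(p)
print(abs(x_r.var() - var) / var)       # small relative error in the variance
```

Swapping in any other continuous distribution with known F and f should give the same agreement, which is the point of the general result.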
Sometimes one additional simplification is used: if we write F_X^{-1}(p) = mu, then the mean becomes mu and the variance term involves (f(mu)) square; that is an additional approximation. So, this completes the discussion of the asymptotic distribution of order statistics, which we have derived under two conditions. Now, I discuss the next concept, that of quantiles. I already mentioned that in nonparametric statistics it is more convenient to work with positions on the distribution, because capital F is there; we are not assuming a functional form, but capital F is there. So, making inferences based on quantiles, positions, etcetera is much more convenient. So, let us look at the concept of quantiles. Suppose F is strictly increasing and k_p is a constant such that F(k_p) = p; then k_p is unique, and we call k_p the p-th quantile. In the case of a continuous distribution you can think of it like this: if the probability up to a point is p, then that point is k_p. In particular, k_{1/2} is the median; k_{1/4}, k_{1/2}, k_{3/4} are called quartiles; k_{1/10}, k_{2/10}, and so on are called deciles; k_{1/100}, k_{2/100}, etcetera are called percentiles. In general we deal with any type of quantile. Now suppose r = np if np is an integer, and r = [np] + 1 (the integral part of np, plus 1) if np is not an integer; then X_(r) is called the p-th sample quantile. For example, if you take p = 1/2, then r is n/2 or the integral part of n/2 plus 1, and X_(r) is the sample median. For the expectation, E(X_(r)) is approximately F_X^{-1}(r/(n + 1)), which tends to F_X^{-1}(p) = k_p as r tends to infinity and n tends to infinity such that r/n tends to p.
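The order-statistic rule for the p-th sample quantile can be sketched directly; the function name and the toy data below are my own, chosen only to illustrate the r = np versus r = [np] + 1 cases.

```python
import math

def pth_sample_quantile(data, p):
    """p-th sample quantile via the order-statistic rule from the lecture:
    r = n*p if n*p is an integer, else floor(n*p) + 1; return x_(r)."""
    x = sorted(data)
    n = len(x)
    np_ = n * p
    r = int(np_) if np_ == int(np_) else math.floor(np_) + 1
    return x[r - 1]                     # x_(r), with 1-based order statistics

data = [7, 1, 5, 3, 9, 2, 8, 6, 4, 10]       # n = 10
print(pth_sample_quantile(data, 0.5))        # np = 5 is an integer, r = 5
print(pth_sample_quantile(data, 0.25))       # np = 2.5, so r = 3
```

For p = 0.5 the rule picks x_(5) = 5, and for p = 0.25 it picks x_(3) = 3 from the sorted sample 1, 2, ..., 10.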
Similarly, the variance of X_(r) is approximately, using the first-order approximation that we derived yesterday, p(1 - p)/n into (1/f(k_p)) square, and this goes to 0 as n tends to infinity. So, X_(r) is asymptotically unbiased and its variance goes to 0; hence X_(r) is consistent for k_p. That means the p-th sample quantile is a consistent estimator of the p-th population quantile. So, we have proved one result here. When we have a known form of the distribution we generally consider the mean: for the population mean we take the sample mean, which we have proved to be an unbiased and consistent estimator under certain conditions; likewise for the population variance we take the sample variance, which is unbiased and consistent again under some mild conditions. Similarly, in the nonparametric case, when we consider quantiles, the corresponding sample quantile is a consistent and asymptotically unbiased estimator. So, we can make the statement: the p-th sample quantile is asymptotically unbiased and consistent for the p-th population quantile. So, we have something to start with. Now, we consider confidence intervals for population quantiles. In the parametric setting, when we consider the population mean we start with the sample mean, the reason being that it is unbiased and consistent; here, for the quantile, we have considered the sample quantile, which means a natural choice is to work with order statistics. So, we can pose the problem like this: we want to find r and s such that P(X_(r) < k_p < X_(s)) = 1 - alpha. If we assume that F is strictly increasing, then X_(r) < k_p < X_(s) is equivalent to U_(r) < p < U_(s), where we make the transformation U_(r) = F(X_(r)).
So, now let us consider this probability: P(U_(r) < p < U_(s)). We can write it as P(U_(r) < p) - P(U_(s) < p); whether we keep or drop equalities makes no difference, because these are continuous distributions. Now, the r-th order statistic from the uniform distribution has a Beta(r, n - r + 1) distribution, and similarly the s-th order statistic has a Beta(s, n - s + 1) distribution. So, the probability can be written as the integral from 0 to p of [1/B(r, n - r + 1)] x^{r-1} (1 - x)^{n-r} dx minus the integral from 0 to p of [1/B(s, n - s + 1)] x^{s-1} (1 - x)^{n-s} dx, where B denotes the beta function. We have to determine r and s; both of these are incomplete beta functions. So, we have to choose r and s such that s - r is minimum and, calling this quantity (*), we require (*) = 1 - alpha. One can use a formula or numerical integration for the incomplete beta function and calculate it. An alternative is to write these incomplete beta functions as binomial sums: the first becomes the sum over i from r to n of C(n, i) p^i (1 - p)^{n-i}, and the second becomes the sum over i from s to n of C(n, i) p^i (1 - p)^{n-i}, so the difference equals the sum over i from r to s - 1 of C(n, i) p^i (1 - p)^{n-i}. So, from the tables of the binomial distribution we have to find r and s such that s - r is minimum and this probability equals 1 - alpha. Of course, since we have now made it discrete, it is not necessary that we achieve this value exactly, so we may require the probability to be greater than or equal to 1 - alpha. In any case, the methodology is quite clear.
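The search for the shortest pair (r, s) described above can be sketched as a brute-force scan over the binomial sums; using scipy's binomial CDF here is my own convenience, and the n = 20 median example is illustrative.

```python
from scipy.stats import binom

def quantile_ci_orders(n, p, alpha):
    """Find order-statistic indices (r, s), r < s, minimising s - r subject to
    P(X_(r) < k_p < X_(s)) = sum_{i=r}^{s-1} C(n,i) p^i (1-p)^{n-i} >= 1 - alpha."""
    best = None
    for r in range(1, n):
        for s in range(r + 1, n + 1):
            cover = binom.cdf(s - 1, n, p) - binom.cdf(r - 1, n, p)
            if cover >= 1 - alpha:
                if best is None or s - r < best[1] - best[0]:
                    best = (r, s, cover)
                break                   # increasing s further only widens it
    return best

r, s, cover = quantile_ci_orders(n=20, p=0.5, alpha=0.05)
print(r, s, round(cover, 4))           # distribution-free CI for the median
```

As the lecture notes, the achieved coverage is generally strictly above 1 - alpha because the binomial sum is discrete.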
Sometimes one may think of obtaining a confidence interval based on X_(r) alone, where X_(r) is the p-th sample quantile; the distribution of X_(r) is known, so let me mention that point also. Of course, the distribution of X_(r) is not necessarily symmetric. Working on the uniform scale, so that the bounds a and b are shifts on the probability scale, consider an interval of the form U_(r) - a < p < U_(r) + b. After simplification, this says U_(r) < p + a and U_(r) > p - b, so we can write it as P(p - b < U_(r) < p + a). We have to choose a and b such that this equals 1 - alpha. The distribution of U_(r) is known, and therefore this is nothing but the integral from p - b to p + a of [1/B(r, n - r + 1)] x^{r-1} (1 - x)^{n-r} dx. So, basically you choose two values a and b. Note that the Beta(r, n - r + 1) distribution of U_(r) is symmetric about 1/2 only when r = n - r + 1, and in any case we cannot simply centre the interval, because k_p need not lie in the middle unless we are estimating the median. Therefore we take an arbitrary choice of a and b, and once again one can use tables of the incomplete beta function to calculate this value. So, let me move now to hypothesis testing for a quantile. We formulate a hypothesis, say H_0: k_p = k_{p0}, against an alternative; we may have k_p > k_{p0}, k_p < k_{p0}, or k_p not equal to k_{p0}, so three types of alternatives may be there.
Now, as we have seen, X_(r), where r is np or the integral part of np plus 1, is a consistent estimator for k_p. So, we can consider a critical region of the form X_(r) > k_{p0}, subject to the condition that the probability of this region under H_0 equals alpha. This is equivalent to P(U_(r) > p) = alpha; that means 1 minus the integral from 0 to p of [1/B(r, n - r + 1)] x^{r-1} (1 - x)^{n-r} dx equals alpha. So, one can find it out; this is a doable thing, but we will give an alternative formulation. Let us consider Y_i = X_i - k_{p0} for i = 1 to n. Now, if X_(r) > k_{p0}, then at least n - r + 1 of the Y_i's are positive, and Y_1, Y_2, ..., Y_n are i.i.d. Further define Z_i = 1 if Y_i > 0 and Z_i = 0 if Y_i <= 0, for i = 1 to n; then the Z_i are i.i.d., and under H_0 we have P(Z_i = 1) = P(Y_i > 0) = P(X_i > k_{p0}) = 1 - p, and of course P(Z_i = 0) = p. So, we choose alpha = P(sum of Z_i >= n - r + 1). This comes simply from the binomial: the sum over i from n - r + 1 to n of C(n, i) (1 - p)^i p^{n-i}. Of course, substituting j = n - i, this equals the sum over j from 0 to r - 1 of C(n, j) p^j (1 - p)^{n-j}. Either way one can obtain it. Alternatively, one may consider the critical region X_(r) > c and choose c such that this probability equals alpha; that is another way of looking at it.
Next we define what is known as a tolerance interval. What we have discussed so far is a confidence interval; now we talk about tolerance intervals. So, let me give the definition. A p-tolerance interval for the distribution F_X, with tolerance coefficient gamma (note that here I have the content p, and then I introduce another quantity, the tolerance coefficient, analogous to a confidence coefficient, which I call gamma), is a random interval (T_1(X), T_2(X)) such that P( P(T_1(X) <= X <= T_2(X)) >= p ) = gamma. Here X_1, X_2, ..., X_n and X all have the same CDF F_X, and the boldface X in T_1(X) and T_2(X) stands for the sample (X_1, X_2, ..., X_n). So, we want to find two statistics T_1 and T_2 such that the probability that X lies between them is at least p, with probability gamma; in the inner probability we take the distribution of X, and in the outer one the distribution of the sample (or we can think of it in the reverse order). Writing it in terms of the CDF, and assuming a continuous distribution, the condition becomes P( F(T_2(X)) - F(T_1(X)) >= p ) = gamma; let me call this condition (1). If we replace T_1 and T_2 by order statistics, say the r-th and s-th with r < s, then condition (1) simply reduces to P(U_(s) - U_(r) >= p) = gamma. Now, the joint distribution of the r-th and s-th order statistics from the uniform distribution is very well known.
So, one can use this. In terms of the joint distribution, the condition becomes the double integral of n! / [(r - 1)! (s - r - 1)! (n - s)!] times x^{r-1} (y - x)^{s-r-1} (1 - y)^{n-s} dx dy, integrating first with respect to x from 0 to y - p, and then with respect to y from p to 1; we want this to be equal to gamma. So, this is a bivariate integral; of course, one can also write down the distribution of U_(s) - U_(r) directly. In fact, I earlier derived the distribution of the range of the order statistics from the uniform distribution, which came out in closed form because we were able to evaluate the integrals. In this case also it can be done; let me demonstrate it, calling the condition above (2). We can determine (2) alternatively using the marginal distribution of U_(s) - U_(r). We have the joint pdf of U_(r) and U_(s); let me write it with arguments y_r and y_s, since I will be using u again in a different context: f(y_r, y_s) = n! / [(r - 1)! (s - r - 1)! (n - s)!] y_r^{r-1} (y_s - y_r)^{s-r-1} (1 - y_s)^{n-s}, for 0 < y_r < y_s < 1. In this I make the transformation u = y_s - y_r, and let v = y_s. So, the inverse transformation is y_r = v - u and y_s = v. Calculating the Jacobian: del y_r / del u = -1, del y_r / del v = 1, del y_s / del u = 0, del y_s / del v = 1, so the determinant equals -1 and the modulus of the Jacobian is 1.
So, the joint probability density function of u and v equals n! / [(r - 1)! (s - r - 1)! (n - s)!] (v - u)^{r-1} u^{s-r-1} (1 - v)^{n-s}, and the region 0 < y_r = v - u < v < 1 is equivalent to 0 < u < v < 1, since y_s > y_r, so u > 0, u < v, and v < 1. We ultimately need the distribution of u = y_s - y_r, so we integrate out v: f_U(u) = n! / [(r - 1)! (s - r - 1)! (n - s)!] u^{s-r-1} times the integral from u to 1 of (v - u)^{r-1} (1 - v)^{n-s} dv, keeping the factor u^{s-r-1} outside the integral. Here we make the transformation v = 1 - (1 - u) t, so dv = -(1 - u) dt; when v = u, t = 1, and when v = 1, t = 0, so the sign cancels on reversing the limits. Under this transformation v - u becomes (1 - u)(1 - t), so (v - u)^{r-1} = (1 - u)^{r-1} (1 - t)^{r-1}; also (1 - v)^{n-s} becomes (1 - u)^{n-s} t^{n-s}; and dv contributes a further factor (1 - u). So, f_U(u) = n! / [(r - 1)! (s - r - 1)! (n - s)!] u^{s-r-1} times the integral from 0 to 1 of (1 - u)^{r-1} (1 - t)^{r-1} (1 - u)^{n-s} t^{n-s} (1 - u) dt. Now let us collect the powers of (1 - u): we have (1 - u)^{r-1} times (1 - u)^{n-s} times (1 - u), together with u^{s-r-1}.
So, the combined power of (1 - u) is n - s + r. Then you have a beta integral: the integral from 0 to 1 of t^{n-s} (1 - t)^{r-1} dt = (n - s)! (r - 1)! / (n - s + r)!. The factors (n - s)! and (r - 1)! cancel, and you are left with f_U(u) = [1/B(s - r, n - s + r + 1)] u^{s-r-1} (1 - u)^{n-s+r}, which is also the pdf of U_(s-r). So, this is interesting: we have obtained the distribution of U_(s) - U_(r), and it is the same as the distribution of U_(s-r). That means, in sampling from the uniform distribution on the interval (0, 1), the distribution of the difference of two order statistics depends only on the difference of their indices. For example, if I look at U_(4) - U_(2), the difference of indices is 2, and the distribution of U_(4) - U_(2) is the same as that of U_(2). So, that is a very interesting phenomenon about order statistics from the uniform distribution. Now, look at the condition that the double integral must equal gamma: since U_(s) - U_(r) has a beta distribution, it simply becomes a condition on an incomplete beta function. Condition (2) reduces to the integral from p to 1 of [1/B(s - r, n - s + r + 1)] u^{s-r-1} (1 - u)^{n-s+r} du = gamma; equivalently, the integral from 0 to p equals 1 - gamma. So, this condition, you can see, is similar to that for obtaining a confidence interval. But these are called tolerance intervals, the reason being that we are requiring the probability content of the interval for X itself to be at least p, with probability gamma. So, it is a different thing from the usual confidence interval, but ultimately the solution comes out in similar terms.
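The incomplete-beta condition just derived can be evaluated directly; the function name and the n = 50 example below are my own illustrative choices, using scipy's beta survival function for the integral from p to 1.

```python
from scipy.stats import beta

def tolerance_coefficient(n, r, s, p):
    """gamma = P(U_(s) - U_(r) >= p), using the result derived in the lecture
    that U_(s) - U_(r) ~ Beta(s - r, n - s + r + 1); sf gives the p-to-1 tail."""
    return beta.sf(p, s - r, n - s + r + 1)

# Example: how confident are we that (X_(1), X_(n)) contains at least 90% of
# the distribution, for a sample of size n = 50?
gamma = tolerance_coefficient(n=50, r=1, s=50, p=0.9)
print(round(gamma, 4))
```

To design an interval one would instead fix p and gamma and search over (r, s), exactly as for the quantile confidence interval.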
And the role of p enters there, while gamma (or, you can say, 1 - gamma) plays the part of the corresponding confidence coefficient. Next we consider the concept of coverages. What are coverages? Consider F_X(X_(s)) - F_X(X_(r)) = U_(s) - U_(r). Of course, we showed this has the same distribution as U_(s-r), where r < s. Looked at distributionally, this is called an (s - r)-coverage; that means it is covering the probability from the r-th order statistic to the s-th order statistic. Let me define all the coverages. Take X_(0) = minus infinity, so correspondingly U_(0) = 0; you have the order statistics X_(1), X_(2), ..., X_(n), with corresponding U_(1), U_(2), ..., U_(n); and take X_(n+1) = plus infinity, so that the corresponding U_(n+1) = 1. Now define the first coverage C_1 = U_(1) - U_(0); in terms of the CDF this is simply F(X_(1)), because U_(0) = 0. Then C_2 = U_(2) - U_(1) = F(X_(2)) - F(X_(1)), and so on; C_n = U_(n) - U_(n-1), and C_{n+1} = U_(n+1) - U_(n) = 1 - U_(n). If you picture the CDF F with the points X_(1), X_(2), ..., X_(n) on the axis, then F(X_(1)) is the height C_1; F(X_(2)) - F(X_(1)) is the next increment, C_2; and so on up to X_(n), after which whatever height remains equals C_{n+1}. So, basically what we are saying is that we are covering the ordinate of the CDF; that is why these are called coverages, but they are based on the order statistics.
So, they are not independent; indeed C_1 + C_2 + ... + C_{n+1} = 1, so C_1, C_2, etcetera cannot be independent. Since these come from the order statistics of the uniform distribution, we know the moments; for example, E(U_(r)) = r/(n + 1). In general all these differences yield expectation 1/(n + 1); that is, E(C_i) = 1/(n + 1). If I consider C_{i+1} + ... + C_{i+r}, then we are considering the coverage from i to i + r, and E(C_{i+1} + ... + C_{i+r}) = r/(n + 1), for i = 0, 1, 2, ..., n - r. Let us also find the joint distribution of C_1, C_2, ..., C_n; call this vector C. These are transformations of the U_(i)'s, and we know the joint pdf of U_(1), U_(2), ..., U_(n): it equals n! on the region 0 < u_1 < u_2 < ... < u_n < 1. The transformation, let me call it (3), has inverse U_(1) = C_1, U_(2) = C_1 + C_2, and so on, up to U_(n) = C_1 + C_2 + ... + C_n. If I calculate the Jacobian, I get the matrix with rows 1 0 ... 0, then 1 1 0 ... 0, and so on up to 1 1 ... 1: a lower triangular matrix with diagonal entries unity, so its determinant equals 1. So, the joint pdf of C_1, C_2, ..., C_n is simply n! again; however, the range is now different. In place of the u_i's we have u_1 = c_1, u_2 = c_1 + c_2, etcetera, so the constraints become c_1 > 0, c_2 > 0, and basically all the c_i's are greater than 0, while at the same time the sum of the c_i's lies between 0 and 1.
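The moment facts above are easy to check by simulation; the sample size, seed, and number of replicates are my own illustrative choices. The sketch builds the coverages as successive differences of the padded uniform order statistics and verifies that they sum to one with E(C_i) = 1/(n + 1).

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 9, 20_000                     # illustrative choices

u = np.sort(rng.random((reps, n)), axis=1)
# Pad with u_(0) = 0 and u_(n+1) = 1, then take successive differences
padded = np.concatenate([np.zeros((reps, 1)), u, np.ones((reps, 1))], axis=1)
c = np.diff(padded, axis=1)             # coverages c_1, ..., c_{n+1}

print(c.shape)                          # (reps, n + 1)
print(c.sum(axis=1)[:3])                # each row of coverages sums to 1
print(c.mean(axis=0))                   # each mean close to 1/(n+1) = 0.1
```

The same differencing also gives the (i, i + r) coverages as partial row sums, with means r/(n + 1).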
Now, because of the symmetric way the c_i's appear in this density, if I consider C_{i_1} + ... + C_{i_r}, its distribution is the same as that of C_1 + C_2 + ... + C_r, because everything is based simply on differences; we have seen that U_(s) - U_(r) has the same distribution as U_(s-r), which means only the difference of the indices matters. So, wherever I start from, it makes no difference. So, this is the concept of coverages; let us look at one particular case. Suppose I take n = 2; then the joint density is f(c_1, c_2) = 2 on the region c_1 > 0, c_2 > 0, c_1 + c_2 < 1. Geometrically, the support is the triangle below the line c_1 + c_2 = 1 in the first quadrant, and the density value is 2 there. If we take n = 3, then n! = 6, and the region becomes c_1 + c_2 + c_3 < 1; that means we take the plane c_1 + c_2 + c_3 = 1 and we are below it in the first octant. So, that is the idea of the coverages. Coverages are useful because they tell us how much probability the underlying distribution F places between two successive order statistics, or between any pair of order statistics; that is useful information. And what is important here, as you can see, is that I started with an arbitrary F, but we end up dealing with the uniform distribution: the joint distribution of C_1, C_2, ..., C_n is free of the original distribution. So, this is what matters: even when we do not know the exact model, that is, capital F is not known and we only assume it is a continuous distribution, we are still able to say how much coverage there is, etcetera, without getting into the exact form.
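The distribution-free property can itself be checked empirically; the two parent distributions, seed, and sizes below are my own illustrative choices. The sketch simulates the first coverage C_1 = F(X_(1)) under two very different parents and finds the same mean 1/(n + 1) in both cases.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 5, 50_000                     # illustrative choices

def first_coverage(sampler, cdf):
    """Simulate C_1 = F(X_(1)) for samples drawn from a given distribution."""
    x1 = np.sort(sampler((reps, n)), axis=1)[:, 0]   # minimum of each sample
    return cdf(x1)

# Exponential(1) parent, with CDF F(x) = 1 - e^{-x}
c1_exp = first_coverage(lambda sh: rng.exponential(1.0, sh),
                        lambda x: 1 - np.exp(-x))
# Uniform(0, 1) parent, with CDF F(x) = x
c1_unif = first_coverage(lambda sh: rng.random(sh), lambda x: x)

print(c1_exp.mean(), c1_unif.mean())    # both close to 1/(n+1) = 1/6
```

The two empirical means agree because C_1 has the Beta(1, n) distribution whatever the continuous parent is.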
So, this is the advantage of distribution-free methods, or nonparametric methods: the conclusions are independent of the original distribution. Another concept that will be of much use, in fact one of paramount importance, is the empirical distribution function, or sample distribution function. As you can see, I have proposed estimators based on order statistics: the population quantile is estimated by the sample quantile, and as a consequence, for example, variability can be estimated by the sample range; in general we consider any position and have a corresponding estimate. Now, if we consider estimating the distribution itself based on order statistics, then we can define a function; that is what we call the empirical distribution function or sample distribution function. So, let X_1, X_2, ..., X_m be a random sample from a distribution F_X; I will write the sample size as m here, since n will appear in another role. Consider the order statistics X_(1), X_(2), ..., X_(m), and, based on the observed values, write the observed order statistics in lower case as x_(1), x_(2), ..., x_(m). Define F_m(x) = 0 if x < x_(1); F_m(x) = j/m if x_(j) <= x < x_(j+1), for j = 1, 2, ..., m - 1; and F_m(x) = 1 if x >= x_(m). We can picture this function by plotting: up to x_(1) the value is 0; between x_(1) and x_(2) the value is 1/m; then 2/m, and so on.
Between x_(2) and x_(3) the value is 2/m, and so on; between x_(m-1) and x_(m) it equals (m - 1)/m, and beyond x_(m) it equals 1. So, F_m is constant between two successive order statistics and has a jump at each of them; it is a step function. This is called the sample distribution function, or the empirical distribution function, of the order statistics. Certain basic properties you can see: F_m(x) tends to 0 as x tends to minus infinity; F_m(x) tends to 1 as x tends to plus infinity; F_m is a non-decreasing function lying between 0 and 1. Another thing you can observe is that F_m is continuous from the right at every point. So, if I consider a random variable Z taking the values x_(1), x_(2), ..., x_(m), each with probability 1/m, that is, P(Z = x_(i)) = 1/m for i = 1 to m, then the CDF of Z is exactly the function F_m. In the following lecture I will discuss further applications of the empirical distribution function: I will prove some results based on it, define certain additional properties, including the two-sample case, and derive some further results from there. That I will cover in the next lecture.
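The step-function definition translates directly into code; the helper name and the toy data are my own, and the sketch assumes distinct observations so that each jump is exactly 1/m, as in the lecture's definition. The checks exercise the right-continuity and limit properties just listed.

```python
import bisect

def make_ecdf(data):
    """Return the empirical distribution function F_m as a step function:
    F_m(x) = (number of observations <= x) / m, right-continuous, jumps 1/m."""
    xs = sorted(data)
    m = len(xs)
    return lambda x: bisect.bisect_right(xs, x) / m

F = make_ecdf([3.0, 1.0, 4.0, 1.5])     # m = 4
print(F(0.0))                           # 0.0: below x_(1)
print(F(1.0))                           # 0.25: the jump is attained AT x_(1)
print(F(2.0))                           # 0.5: two of four observations <= 2
print(F(10.0))                          # 1.0: at or beyond x_(m)
```

Using `bisect_right` rather than `bisect_left` is what makes F_m continuous from the right at every observation.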