Hello and welcome to this session on data structures and algorithms. In this session, we will discuss another sorting algorithm, a very useful and popular algorithm called quicksort. Quicksort is another instance of the divide-and-conquer approach to solving a problem. In particular, given a sequence S, quicksort hinges on choosing a pivot element x and, based on this pivot, identifying the set of elements of S that are less than x and the set of elements of S that are greater than x. So pick a random element x, then divide the n-element sequence to be sorted into two subsequences L and G such that L has the elements less than x and G has the elements greater than x. Optionally, if there are multiple elements equal to x, you might identify a band E comprising more than one element. For the rest of this discussion, we will assume that E has a single element, because whatever we discuss for a single element also holds for multiple elements. This comprises the divide part.

What follows is conquer. In conquer, you apply the same procedure recursively: within L you identify a pivot, split L into the elements less than that pivot and the elements greater than it, and conquer each of those partitions again. Ditto with G: you divide G into three parts. Finally, you resort to combine: after all the conquering is done, you merge the sorted arrays L, E and G.

So let us look at an example. Imagine you had to sort 9 7 2 6 8 1 4 and you randomly picked 6 as the pivot element. The divide identifies the elements less than 6, namely 2, 1 and 4, and you now invoke conquer through a recursive divide on this subarray. With 4 as the choice of pivot, all the remaining elements of this subarray (2 and 1) fall on the less-than side. There is nothing more to do there, just two elements, so we do not invoke conquer again. We do the same thing on the right-hand side, and combine happens at every level. So we have combined at the top level now, merging L, E and G; this is of course after several more invocations of conquer and combine for the G part. So this is the G that got combined, you have already combined all the subarrays in the left subtree to give you L, and finally you invoke a combine of L, E and G at the top level, where E is the singleton set {6}.

So here is the quicksort algorithm; we will discuss an in-place implementation of quicksort. Given an input sequence S, pick a random pivot position p and partition S into S1 and S2. Now, we would like to do this in place. What does this mean? How do you partition S based on p into S1 and S2? This will be the subject matter of a homework exercise. However, we will discuss a very simple implementation of partition(S, p) when p happens to be one of the extreme indices of S, and your task is to generalize it to a p that is not one of the extreme indices. Continuing: once you have identified S1 and S2 (and as I pointed out, we will do this in place, so S1 and S2 are really just index ranges into S, a begin index and an end index for each), the rest is the conquer: you invoke quicksort on S1 and quicksort on S2 recursively. You can also come up with a non-recursive version of quicksort. Since S1 and S2 are in place and already sorted, finally you just need to merge S1, the pivot and S2. This merge is the same as before.
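Before working out the in-place version, here is the plain divide/conquer/combine scheme from the example above as a minimal sketch in Python. It is not in place (it builds fresh lists for L, E and G), and the function and variable names are illustrative, not from the lecture; note it keeps the whole band E even when several elements equal the pivot.

```python
import random

def quicksort3(s):
    """Three-way quicksort sketch: divide around a random pivot x into
    L (< x), E (== x) and G (> x), conquer L and G recursively, and
    combine by concatenating the sorted pieces."""
    if len(s) <= 1:
        return s                        # nothing to divide
    x = random.choice(s)                # pick a random pivot element x
    less    = [e for e in s if e < x]   # L
    equal   = [e for e in s if e == x]  # E (may hold duplicates of x)
    greater = [e for e in s if e > x]   # G
    return quicksort3(less) + equal + quicksort3(greater)

print(quicksort3([9, 7, 2, 6, 8, 1, 4]))  # -> [1, 2, 4, 6, 7, 8, 9]
```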
You could also come up with a three-way merge; an exercise would be to work out this three-way merge. It is actually a very trivial extension of what we have seen. Merge has been discussed already, so let us look at partition, and consider the simpler case of partition when p is S.length, that is, when p is the index of the last element of S (indexing from 1). So let us discuss partition(S, p): p is the position with respect to which you want to partition, and you want to produce as output the two subsequences S1 and S2. We will assume that S begins at position l and ends at position p, and our goal is to have all of S1 on the left and all of S2 on the right, with the pivot taken to be this last element. Let us call its value v. So the first thing we do is set v to the value of S at position p, assuming p is the last position.

What do we do next? We now do some bookkeeping. We will partition S in such a way that, for a particular position i, S1 lies at or to the left of i and S2 to the right of i. How do I achieve that? Well, initially i = l - 1. It sits before the first element; it is basically a wall, because initially I have no clue what S1 is. So everything is tentatively in S2, which is of course not what we want to end with; as we scan, we will correct S1 and S2. Our next step is as follows: we iterate over the elements through an index j ranging from l to p - 1. We are of course not interested in scanning position p itself, since that would compare the pivot with itself. Now, you check whether the j-th element S[j] is less than or equal to v. As I pointed out, our goal is that for all k ≤ i we have S[k] ≤ v, and for k > i (among the positions scanned so far) we have S[k] > v; you can also make this a strict inequality. So if S[j] ≤ v, you update your i, since you have got hold of one more element that is at most v: set i = i + 1, immediately swap the entry at position i with the entry at position j, and move on. You keep doing this till you reach the end. So you keep advancing the iterator j, while the index i only gets updated when you find an element that is less than or equal to v; otherwise you are perfectly fine leaving that element where it is. Finally, as promised, you exchange the element at position i + 1, which is known to be greater than v, with the element at position p, which is v itself; this places the pivot between S1 and S2. A sketch of this routine in code follows below. Your homework is to deal with the case when p is not necessarily the last index: what about a p that is a random index?

So let us try and understand the running time of quicksort. What is the worst case? Well, the worst case is when the split requires most elements to be moved: when S[p] happens to be the unique minimum or the unique maximum element of S, you will basically need to move every other element. This case of a unique minimum means lots of swaps. And what do I mean by lots of swaps? For an array of size n, you will have to do n - 1 swaps.
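Here is a sketch of the partition routine just described, together with its recursive driver, in Python with 0-based indices; the function names are mine. For the homework case of an arbitrary pivot index, one common approach (offered as a hint, not necessarily the intended solution) is to first swap that element to the last position and then reuse this same routine.

```python
def partition(s, l, p):
    """Partition s[l..p] around the pivot value v = s[p].

    Maintains the invariant: s[l..i] <= v and s[i+1..j-1] > v.
    Returns the final resting index of the pivot.
    """
    v = s[p]                  # pivot value, taken from the last position
    i = l - 1                 # the "wall": nothing is known to be <= v yet
    for j in range(l, p):     # scan every element except the pivot itself
        if s[j] <= v:         # one more element belonging to S1
            i += 1
            s[i], s[j] = s[j], s[i]
    s[i + 1], s[p] = s[p], s[i + 1]   # put the pivot between S1 and S2
    return i + 1

def quicksort_inplace(s, l, p):
    if l < p:
        q = partition(s, l, p)
        quicksort_inplace(s, l, q - 1)   # conquer S1
        quicksort_inplace(s, q + 1, p)   # conquer S2

a = [9, 7, 2, 6, 8, 1, 4]
quicksort_inplace(a, 0, len(a) - 1)
print(a)  # -> [1, 2, 4, 6, 7, 8, 9]
```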
So, let us try and understand this. Say you split S and it turns out that one of S1 or S2 has size length(S) - 1 and the other has size 0, which means you have to move length(S) - 1 elements. And suppose this is the case for every split: one side gets every element except the pivot, because the pivot always happens to be the least or the greatest. This leads to a completely skewed binary tree. Let us work out the cost. At depth 0, the time is n, because you have to scan all the elements and do those swaps. At the next depth, you again scan n - 1 elements, then n - 2; at depth d you scan n - d elements, down to depth n - 1 where you have one element. So the total cost incurred is the sum over depths d = 0 to n - 1 of (n - d). This is very similar to the worst case of insertion sort: the runtime is proportional to this sum, and you can show that it is actually Θ(n^2).

How about the best case? What we will consider is a nearly best case, of which the true best case is an instance: a fixed-proportion split at every level. At the first level, n gets split into sets of size pn and (1 - p)n, where p is some fraction between 0 and 1. At the next level the pn part again splits, into p^2 n and p(1 - p)n, and so on, until at depth i you reach parts of size p^i n and p^(i-1) (1 - p)n. The true best case corresponds to the merge sort situation, p = 0.5, where at every level the array gets split into two subarrays of equal length. Now, what is the work done at every depth? Well, you anyway have to scan all the elements at every depth; you scan and merge them. So the merge cost, or even the split cost put together, at the first level is n. At the second level you again need to scan all the elements while merging, and that sums to pn + (1 - p)n = n. Summing at every level, the work at each depth is again n.

This sequence, split by a fixed proportion at each step, goes on to a depth d such that p^d n = 1 or (1 - p)^d n = 1. We will only concern ourselves with the extreme case, governed by min(p, 1 - p): termination is when min(p, 1 - p)^d · n = 1. What does this mean? Taking logs, d · log(min(p, 1 - p)) + log n = 0; we will stick with the p case. You can easily determine that d must be of the order of log n. So the amount of work done at each level is Θ(n), and we do this for Θ(log n) levels, so the total time required is Θ(n log n). The arithmetic is written out below.

Alternatively, one could solve the recurrence: the time required for n elements is upper bounded by the time required for the two partitions, T(pn) + T((1 - p)n), plus a linear partition-and-merge cost cn. Solving this recurrence, for instance with a recursion tree, again gives O(n log n). This also holds if the split proportion is only upper bounded by p: it may not happen that the proportion is always exactly p or 1 - p, but as long as there is such a bound on the proportion, this analysis holds.
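To keep the arithmetic straight, here is the worst-case sum and the fixed-proportion depth calculation from above written out, assuming, as in the lecture, that we stick with the p case, i.e. p ≤ 1/2:

```latex
% Worst case: at depth d we scan n - d elements, for d = 0, ..., n - 1.
\[
  T_{\mathrm{worst}}(n) \;=\; \sum_{d=0}^{n-1} (n - d)
  \;=\; n + (n-1) + \cdots + 1
  \;=\; \frac{n(n+1)}{2} \;=\; \Theta(n^2).
\]
% Fixed-proportion split (p <= 1/2): Theta(n) work per depth, and the
% recursion bottoms out at the depth d where p^d n = 1.
\[
  p^{d} n = 1
  \;\Longrightarrow\;
  d \log p + \log n = 0
  \;\Longrightarrow\;
  d = \frac{\log n}{\log(1/p)} = \Theta(\log n),
\]
\[
  \text{so}\quad
  T_{\mathrm{best}}(n) \;=\; \Theta(n) \cdot \Theta(\log n)
  \;=\; \Theta(n \log n).
\]
```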
How about the average case? What does the average case mean here? Well, it is not necessary that an upper bound on the split proportion exists. What if we have this situation: n gets split in proportion p into pn and (1 - p)n, but at the very next level the proportion is lost, and pn splits into a constant-size left array of k1 elements and the remainder on the right. Of special interest is k1 = 1; similarly, on the other side, take the special case k2 = 1. This means one element on the left and the remaining elements on the right, in both the left and right branches. It is possible that this kind of splitting persists for some number of levels, and yet eventually you go back to some fixed proportion again. So the average case is basically an interleaving of fixed-proportion splits and worst-case splits. This is a very loose way of stating what the average is.

One can get a bit more rigorous by considering the partition algorithm. Remember that in partition we performed swaps, and each swap was based on comparisons that were made. The actual cost incurred is measured by the number of comparisons of pairs (i, j) made across all calls to the partition subroutine, and this comparison is the most frequently executed of all the steps. What we can show is that the average number of such comparisons across all calls to partition is O(n log n). While you can refer to Section 7.4.2 of the second edition of CLRS, I am going to give an intuitive proof sketch.

Let us define a random variable X through indicators: X_ij is 1 if the i-th element got compared with the j-th element. So what am I talking about? Consider the subarray spanning positions i through j. The probability that i and j get compared, that is, that X_ij = 1, is inversely proportional to the length of this subarray. Why? Elements i and j get compared exactly when either i or j is the first pivot chosen from this subarray; they need not be compared at all if something in between gets chosen as a pivot first. So the probability that X_ij is 1 is the probability that i is chosen as pivot plus the probability that j is chosen as pivot, and when I say chosen as pivot, I mean chosen from the subarray S[i..j]. These are two mutually exclusive events, each with probability 1/(j - i + 1), so the total is 2/(j - i + 1).

Now let us study the expected value of the random variable X. First of all, X is defined as a sum over all values of i and, for each i, over j ranging from i + 1 to n, of X_ij. We are interested in the expected value of X, the expected number of comparisons over all possible pairs, with respect to all the random pivot choices. By linearity of expectation, this is nothing but the sum over i and over j of E[X_ij]. And what is E[X_ij]? It is 1 times the probability of X_ij being 1, plus 0 times the probability of it being 0, which contributes nothing. So E[X_ij] = 2 / (j - i + 1). Now it turns out that with some reordering and restructuring of the summation, we can simplify this expression; the calculation is written out below. The first thing I am going to do is substitute k for j - i.
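Summarizing the argument so far in symbols, using 1-based indices as in the lecture, with the substitution k = j - i already applied in the last step:

```latex
% Indicators: X_{ij} = 1 iff the i-th and j-th elements are ever compared,
% i.e. iff one of them is the first pivot drawn from S[i..j]
% (two mutually exclusive events among j - i + 1 equally likely choices).
\[
  \Pr[X_{ij} = 1] \;=\; \frac{1}{j-i+1} + \frac{1}{j-i+1}
                  \;=\; \frac{2}{j-i+1}.
\]
% Expected total number of comparisons, by linearity of expectation:
\[
  \mathbb{E}[X]
  \;=\; \mathbb{E}\!\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right]
  \;=\; \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}
  \;=\; \sum_{i=1}^{n-1} \sum_{k=1}^{n-i} \frac{2}{k+1}.
\]
```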
So, this is now a summation over i, with i still ranging from 1 to n - 1, but with k in place of j - i. Strictly, k ranges from 1 to n - i, but we will allow it to range all the way up to n - 1; this only adds positive terms, so we still have an upper bound, and the dependence of j on i has been eliminated. Fortunately for us, we had j - i in the denominator, so the summand becomes 2 / (k + 1). Now, decreasing the denominator only increases the expression, so I can upper bound this by the summation over i and k of 2 / k. The inner sum is a well-known expression, the harmonic series; we know that the sum of 2 / k over k is O(log n). So what we get is a summation over i of O(log n), and since this summand is independent of i, summing over the n - 1 values of i gives O(n log n), as claimed. I would encourage you to look at a more rigorous proof of this in the CLRS book. Thank you.