Okay, so we will continue from where we left off last time. We will go through it quickly, and if there are doubts please stop me and ask. You have seen the basic idea of merge sort: we take an array, divide it into two parts, sort each of the individual parts, and then merge them back. This is what we saw last time, and these are the three tasks: divide an array into two parts, sort the two parts, and merge them back. All of you have seen the lecture slides, so I am going to go through them a little fast; if you have doubts on any particular slide, please stop me.
So here is how you might write merge sort in C++. This is some simple code for getting the inputs and validating them. Here we have taken an array A of size 100, which means we will store at most 100 integers to sort. Then we read in the integers and call this function mergeSort(A, 0, n). A is of course the array, and 0 and n say that the part of the array of interest to us is between 0 and n minus 1: 0 is the starting index and n is the total number of elements.
Now how does mergeSort look? As precondition we require start to be less than n, and both to be within the array bounds. Remember, start is the starting index and I should be looking up to n minus 1, so start to n minus 1 is the part of the array of interest. As postcondition I want the part of the array from start to n minus 1 to be sorted, let us say in decreasing order. The first thing we check is the termination condition: if this part of the array has just one element, which means n minus 1 is equal to start, or in other words n is equal to start plus 1, then everything is sorted and we are done.
Otherwise we calculate the midpoint, mid, which is start plus n, divided by 2, and then we sort the two parts: the part of the array from start to mid minus 1, and the part from mid to n minus 1. How do we sort them? We recursively call mergeSort again, as we decided last time. Then we merge these two sorted subarrays. How do I specify the two sorted subarrays? I pass these arguments to mergeSortedSubArrays: the entire array of numbers, and the index ranges of the two sorted subarrays within it, start to mid minus 1 and mid to n minus 1. So instead of actually passing two arrays, I have passed one array and given the ranges of indices between which the two subarrays exist.
This is the most important function, mergeSortedSubArrays. As precondition we assume that A from start through mid minus 1 is already sorted in decreasing order, and similarly A from mid through n minus 1 is already sorted in decreasing order. As postcondition we require that A from start through n minus 1 should itself be completely sorted, in decreasing order of course. Within this function, if you remember from last time, we had this picture of two sorted subarrays with two arrows indicating which elements I am currently considering. Think of those two arrows as i and j: i is the running index for the first subarray and j is the running index for the second subarray. We had these two arrows pointing to two elements, one in each subarray, and we were checking which of them is greater and putting it in the sorted array. For the time being, let us say we are going to create a copy of A, call it tempA, in which we will put the sorted elements.
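Stepping back to the recursive part described above, the structure of mergeSort can be sketched roughly as follows. The function name and signature are guesses at what is on the slide, and std::inplace_merge with std::greater is only a stand-in for the lecture's own merge function, used here so that the sketch compiles on its own.

```cpp
#include <algorithm>   // std::inplace_merge
#include <cassert>
#include <functional>  // std::greater

// Sorts A[start .. n-1] in decreasing order.
// Precondition: 0 <= start < n, both within the array bounds.
void mergeSort(int A[], int start, int n) {
    assert(0 <= start && start < n);
    if (n == start + 1) return;   // termination: a single element is already sorted
    int mid = (start + n) / 2;    // first index of the second part
    mergeSort(A, start, mid);     // sorts A[start .. mid-1]
    mergeSort(A, mid, n);         // sorts A[mid .. n-1]
    // Stand-in (an assumption, not the lecture's code) for merging the two
    // sorted parts; std::greater keeps the result in decreasing order.
    std::inplace_merge(A + start, A + mid, A + n, std::greater<int>());
}
```

Calling mergeSort(A, 0, n) on an array holding n integers sorts all of them, which matches the call made from the main program.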
So there were these two subarrays and we were picking up elements and putting them there. Think of this as the two sorted subarrays, and this is tempA, the part where I will store the sorted elements. The merged and sorted subarray will be temporarily stored in tempA and finally copied back to A. And index is the position in that temporary array where the next sorted element should go. If you remember, there were two sorted arrays: I was comparing the two, putting one of them there, incrementing that index, again comparing the two, putting one of them there, incrementing the index, and so on.
So this is the basic merging loop. We initialize the two arrows i and j to the starting points of the two subarrays: i is initialized to start and j is initialized to mid, because the two subarrays are from start to mid minus 1 and from mid to n minus 1. Then I iterate as long as there is still some element left in one of the subarrays, that is, as long as either i is less than mid or j is less than n. I have not shown the increments of i and j over here, because if I am choosing from the first subarray then I should increment i, and if I am choosing from the second subarray I should increment j, so that incrementing will happen inside the body of the for loop. But in any case, in every iteration of the for loop the next sorted element goes to its rightful place, so index should increase in every iteration, because index is the position where the next sorted element should come.
So how does the body of the for loop look? There are two cases. One is when neither of the two subarrays has been fully seen yet, so we are in the middle of both subarrays. The other is when one of the subarrays has been fully seen. That is the easier case. What do we do? We just copy the elements from the other subarray, which is already sorted, into tempA. This can be implemented like this: if i is less than mid, it means the first subarray has not been fully seen yet, because the first subarray goes from start through mid minus 1, so I copy the next element from the first subarray to tempA and increment i. Otherwise it is the second subarray which is not yet fully seen. Remember, this is in the else part of this condition, so the condition 'i is less than mid and j is less than n' is false, which means either i has reached mid or j has reached n. If i is less than mid, then j must have reached n, which means the second subarray has been fully seen, so I copy from the first. And if i is not less than mid, then i is equal to mid, which means the first subarray has been fully seen, so I just copy from the second subarray. That is fairly straightforward.
Otherwise, if neither subarray is fully seen yet, what did we do in that picture we saw last time? We compared the elements at the two arrows, and whichever one was greater we took to the next sorted position. That is what we will do here. We check if A[j] is greater than A[i], where i and j are the two indices. If A[j] is greater than A[i], I copy A[j] to tempA[index], the next position in the sorted array, and increment j. Otherwise I copy A[i] to tempA[index] and increment i. Okay, so that's the whole story of the merging loop, and this is the most crucial step in merge sort.
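Putting the merging loop together, a minimal sketch might look like the following. The exact signature, the constant MAX_SIZE, and the choice to start index at start (so that tempA uses the same positions as A) are assumptions made for illustration; the copy back into A at the end is included only so that the function is complete.

```cpp
#include <cassert>

const int MAX_SIZE = 100;  // assumed: mirrors the array of at most 100 integers

// Precondition:  A[start .. mid-1] and A[mid .. n-1] are each sorted in decreasing order.
// Postcondition: A[start .. n-1] is sorted in decreasing order.
void mergeSortedSubArrays(int A[], int start, int mid, int n) {
    assert(0 <= start && start < mid && mid < n && n <= MAX_SIZE);
    int tempA[MAX_SIZE];      // temporary home for the merged elements
    int i = start;            // arrow into the first subarray
    int j = mid;              // arrow into the second subarray
    // index always advances; i or j advances inside the body.
    for (int index = start; i < mid || j < n; index++) {
        if (i < mid && j < n) {          // neither subarray fully seen yet
            if (A[j] > A[i]) { tempA[index] = A[j]; j++; }
            else             { tempA[index] = A[i]; i++; }
        } else if (i < mid) {            // second subarray fully seen
            tempA[index] = A[i]; i++;
        } else {                         // first subarray fully seen
            tempA[index] = A[j]; j++;
        }
    }
    // Copy the merged result back so that A itself ends up sorted.
    for (int k = start; k < n; k++) A[k] = tempA[k];
}
```

For example, with A holding 9, 5, 2, 8, 7, 1, 0 and the call mergeSortedSubArrays(A, 0, 3, 7), the two sorted parts 9, 5, 2 and 8, 7, 1, 0 are merged into 9, 8, 7, 5, 2, 1, 0.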
So after we have done all of this merging, the sorted array is actually in tempA. We now have to copy it back to A, because we finally want A to be sorted. That's what this copy step does, and I'm sure all of you can do that: copying one array to another between the indices start and n minus 1. Good.
Now, we did this analysis for selection sort; let's try to understand how we can do it for merge sort, counting the number of basic steps. I realized that there are several questions about what a basic step is, so let me spend a couple of minutes on that. Whenever we analyze the running time of a program or an algorithm, there is one part of the running time which depends on the input size, in this case the size of the array, n. Clearly the time taken to sort an array of size 2 is not going to be the same as the time taken to sort an array of size 2 million. So there is some dependence of the running time on the size of the input. But there are steps in the program which are executed whether the array is of size 2 or of size 2 million: for example, comparing two elements of an array, reading an array at a specified index, updating an array at a specified index. The question is how many times these steps are executed; that depends on the size of the array. So what we mean by a basic step is some computation which takes a certain fixed amount of time, independent of the size of the array, and then we count how many times these steps are repeated as a function of the size of the array. Is that clear? So we want
to get that unit. How many times that unit is repeated will be a function of the size of the array, but the unit which is repeated is what I am calling a basic step. That unit could be, for example, reading two specified array elements A[i] and A[j], comparing them (is A[i] greater than A[j]?), writing an element into tempA (tempA[index] assigned something), incrementing i, incrementing index. The time taken to do each of these steps once, reading an array element, updating an array element, comparing two array elements, is independent of how many elements there are in the array, and we repeat these operations several times depending on how many elements there are. Is this clear?
So basically we are trying to express the total time as some function of the size of the array times the time taken for a basic step. It's a slightly hand-waving, abstract notion, but the point to remember is that the time taken for a basic step should be independent of the size of the array, while how many times you take this step should be a function of the size of the array. The actual time taken for these steps really does not matter, because it's going to be a fixed time. It could be 50 nanoseconds, 100 nanoseconds, whatever, depending on the computer you're using, but it's some fixed amount of time, repeated some function of n number of times, and we want to find out what that function of n is.
Okay, so let's try to understand how many basic steps there will be in mergeSortedSubArrays. There are these two sorted subarrays, A
start through mid minus 1 and mid through n minus 1, each of size n over 2. As you have seen from the animation we showed last time, in each of the subarrays the arrow just keeps moving down, so it makes just one pass over the subarray. What happens as the arrow moves down? I read two elements, compare them, copy one of them to tempA, and then the arrow moves down. All of that I can think of as one basic step, which takes some fixed amount of time. But how many times the arrow moves down depends on the sizes of the two subarrays. The sizes of the two subarrays are n by 2, so there will be at most n by 2 basic steps to pass over the first subarray and at most n by 2 basic steps to pass over the second subarray, so at most n basic steps to get the merged sorted array in tempA. Then we have to copy tempA back to A, so we have to copy n elements, which is n basic steps. So a total of 2n basic steps in mergeSortedSubArrays. Fine.
So what is the number of basic steps in mergeSort? Let's say we do not know what this number is going to be as a function of n, but let that function of n be called T_n. T subscript n means the number of basic steps to merge sort an array of size n. If you look at the function we just wrote, what are the steps involved in merge sorting an array of size n? Computing the midpoint and checking the termination condition take some constant amount of time. The real amount of time is in the two recursive calls, which are themselves merge sorts on arrays of half the size, and in the final merge. So it is something like this: to sort an array of size n, I have to sort two subarrays of size n by 2, and then merge these two sorted subarrays, which we have just seen takes time, or rather a
number of basic steps, 2n. So T_n = T_{n/2} + T_{n/2} + 2n: this term is for the first recursive call, that's for the second recursive call, and that's for merging them. And of course when the array is of size 1 it just takes one basic step, since it's already sorted, so T_1 = 1. Now, there are several ways to solve this recurrence relation, and I'm sure most of you already know about them, but if you solve it you'll find that the solution is roughly the ceiling of 2n log_2 n + n. This is quite close to what we had promised in the earlier lecture: a sorting technique which takes a number of basic steps proportional to n times log n base 2. This is much better than n squared, which was the case for selection sort.
How much better is it? Here is a plot. The blue curve is how the number of basic steps for selection sort increases as the array size increases from 10 to 100. The orange curve is how the number of basic steps for merge sort increases over the same range. You can clearly see that the gap is increasing, and the gap will keep increasing as n increases. This is the difference between the growth of (n minus 1) times (n plus 2) by 2, which is for selection sort, and 2n log_2 n + n, which is for merge sort. So merge sort is going to be much faster than selection sort. Are there any doubts about this part of the lecture? Okay, good.
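To make the two step counts concrete, here is a small sketch that evaluates them. The function names are made up for illustration, and splitting an odd-sized array into parts of size n/2 (rounded down) and the remainder is an assumption about how the recurrence is applied when n is not a power of two.

```cpp
#include <cassert>

// Basic steps for merge sort under the lecture's counting:
// T(1) = 1, and T(n) = T(first part) + T(second part) + 2n otherwise.
long mergeSortSteps(long n) {
    assert(n >= 1);
    if (n == 1) return 1;              // a single element: one basic step
    long half = n / 2;                 // assumed split for odd n: floor and remainder
    return mergeSortSteps(half) + mergeSortSteps(n - half) + 2 * n;
}

// Selection sort count quoted in the lecture: (n - 1)(n + 2) / 2.
long selectionSortSteps(long n) {
    return (n - 1) * (n + 2) / 2;
}
```

When n is a power of two the recurrence gives exactly 2n log_2 n + n; for instance mergeSortSteps(8) is 56, which is 2 times 8 times 3, plus 8. At the right end of the plotted range, selectionSortSteps(100) is 5049, well above the merge sort count for the same size.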