Hello and welcome. In the last lecture, we looked at selection sort as a simple sorting technique. In this lecture, we are going to analyze the performance of selection sort.
So, here is a quick recap of the relevant topics from the last lecture. We looked at selection sort, the intuition behind it, and what a C++ implementation might look like. In this lecture, we are going to analyze how selection sort performs when sorting an array of size n. Specifically, we are going to define some "basic steps" in sorting, and we are going to count how many such basic steps selection sort requires to sort an array of size n.
So, let us quickly look at a little animation of how selection sort sorts an array of marks, the total marks in the quiz problem that we had seen earlier. In this case, selection sort finds the maximum marks in the array and puts it in its rightful position, swapping it with the number that was already in that position. So, 24 came down from there. Then we find the maximum number in the remaining part of the array and again do the swap. Then we find the maximum number in the remaining part once more; I could have chosen any one of the 24s, I chose one particular 24, and then the swap happens. Next, the maximum number in the remaining subproblem is chosen and taken to its rightful place, and the swap happens. Finally, this element is already in its rightful place, 17 is already in its rightful place, and we have obtained a sorted array by selection sort. So, what did our program in C++ look like?
Basically, we had a simple loop in which a variable called current top was incremented from 0 to n - 1. At any point of time, the understanding was that the part of the array a from index current top through index n - 1 was the unsorted array. In each iteration of the loop, I found the index of the maximum element of the array a between the indices current top and n - 1, and then I swapped the element at current top with the element at current max index.
This is how find index of max looks; we have seen this in an earlier lecture. Basically, we iterate from the starting position to the ending position in the given array, and if at any point of time I find an element which is greater than or equal to the maximum element seen so far, I update current max index. When this loop completes, I return current max index.
This is how swap looks; it is a very simple function. I want to swap the elements at index 1 and index 2, so I use a temporary variable to store the value of the element at index 1, copy the value of the element at index 2 to index 1, then copy the value of temp, which holds the original value of the element at index 1, to index 2, and finally I return.
So, if you look at these functions, you will realize that the basic steps in selection sort were something like this. We wanted to read two elements of the array a, compare them, and update current max index if necessary; this is what the basic step in find index of max was doing. And then, of course, the swap function swapped two specified elements of the array a. So, these are my basic steps in selection sort, and now we might want to ask: given an array of n integers, how many basic steps, as a function of n, are needed to sort this array by selection sort? That is what we are going to work out in today's lecture.
So, let us see how many basic steps selection sort would require. Here is our array, which has six elements; it is currently unsorted.
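
Before counting steps, the program recapped above can be sketched in C++ roughly as follows. This is only a sketch under stated assumptions, not the lecture's own code: the identifiers `findIndexOfMax`, `swapElements`, `selectionSort`, `currentTop` and `currentMaxIndex` are guessed spellings of the spoken names, `std::vector<int>` stands in for whatever array type the original program used, and the loop over positions after `start` is inferred from the comparison counts in the walkthrough.

```cpp
#include <cassert>
#include <vector>

// Index of the largest element among a[start], ..., a[n - 1].
// On ties the later index wins, because the comparison is ">=".
int findIndexOfMax(const std::vector<int>& a, int start, int n) {
    assert(0 <= start && start < n);
    int currentMaxIndex = start;
    for (int i = start + 1; i < n; ++i) {
        if (a[i] >= a[currentMaxIndex]) {
            currentMaxIndex = i;
        }
    }
    return currentMaxIndex;
}

// Exchange a[index1] and a[index2] through a temporary variable.
void swapElements(std::vector<int>& a, int index1, int index2) {
    int temp = a[index1];
    a[index1] = a[index2];
    a[index2] = temp;
}

// Sort a into decreasing order: in each pass the largest element of
// a[currentTop..n-1] is moved to position currentTop.
void selectionSort(std::vector<int>& a) {
    int n = static_cast<int>(a.size());
    for (int currentTop = 0; currentTop < n; ++currentTop) {
        int currentMaxIndex = findIndexOfMax(a, currentTop, n);
        swapElements(a, currentTop, currentMaxIndex);
    }
}
```

For instance, taking the six values that the comparisons in the walkthrough imply (24, 18, 17, 25, 27, 24), calling `selectionSort` on that vector leaves it as 27, 25, 24, 24, 18, 17, that is, sorted in decreasing order.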
So, I would start off with current top being 0, and then I would call find index of max with the array a; the relevant part of the array is from index 0 through 6 - 1, that is, 5. Now, when this function is called, what would happen? Current max index would get initialized here, i would get initialized here, and of course start is 0 when find index of max is called.
Then I compare 18 with 24. Because 18 is not greater than or equal to 24, current max index stays here, but i is incremented. 17 is not greater than or equal to 24, so current max index stays there and i is incremented. But 25 is greater than or equal to 24, so current max index is updated as i is incremented. 27 is greater than or equal to 25, so current max index is updated again. But 24 is not greater than or equal to 27, so current max index stays wherever it was.
So, if you noticed, what we did is we iterated through these elements, made some comparisons between elements of the array, and updated current max index whenever I found an element which was greater than or equal to the element at current max index. In this particular case, I started here, and then for each of the other elements I did one comparison and an update if necessary: once for a[1], once for a[2], and so on. So, I executed 5 basic steps: once when I was here, next when I was here, next when I was here, then when I was here, and finally when I was here. In general, if you had n elements in the array, it is easy to see that in this first iteration we would execute n - 1 basic steps.
In addition, I need to swap. Note that current max index was finally pointing here, and I need to swap that element with the one at current top.
So, that is one more basic step that we need, and in total I have now executed n - 1 basic steps inside find index of max and one basic step inside swap. If I go back to my program, what would happen next? I would simply increment current top and repeat the entire procedure.
So, let us do that. We increment current top, it becomes 1, and then I have to repeat the entire procedure. Now find index of max is called with the array a, the starting position 1, and the total number of elements 6. So, current max index and i are initialized here; start is of course 1, that is how I called find index of max. Once again, because 17 is not greater than or equal to 18, current max index stays there while i is incremented. But 25 is greater than or equal to 18, so current max index is updated while i is incremented. 24 is not greater than or equal to 25, so current max index stays there. The next 24 is also not greater than or equal to 25, so current max index stays there.
Once again, how many basic steps did I execute? I had to execute a basic step, which is comparing two array elements and updating current max index if necessary, when i was here, here, here and here. So, there were 4 basic steps. In general, if this array was of size n, in the second iteration of the main selection sort loop I would execute n - 2 basic steps. And of course, after that we need to swap. The swap consumes one more basic step, so now I have (n - 2) + 1 basic steps.
Now it is easy to see the pattern. When current top increments to 2, I would execute (n - 3) + 1 basic steps, and finally, when current top comes here, I would execute (n - (n - 1)) + 1 basic steps. And of course, when current top comes here, to the last position, I do not need to do anything else, because I now have only one element in the unsorted part of the array, and an unsorted array with one element is certainly a sorted array.
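
The per-pass counts in this walkthrough can be checked mechanically. The following is a minimal sketch, not the lecture's program: `countBasicStepsPerPass` is a hypothetical helper name, `std::vector` is an assumption, and the accounting (one basic step per comparison, one per swap, no work at the last position) simply follows the counting convention described above.

```cpp
#include <cassert>
#include <vector>

// Runs selection sort (decreasing order) on a copy of the input and returns,
// for each pass of the outer loop, the number of basic steps that pass used:
// one per comparison inside the scan, plus one for the swap.
std::vector<long long> countBasicStepsPerPass(std::vector<int> a) {
    int n = static_cast<int>(a.size());
    std::vector<long long> stepsPerPass;
    // The last position needs no pass: one leftover element is already sorted.
    for (int currentTop = 0; currentTop + 1 < n; ++currentTop) {
        long long steps = 0;
        int currentMaxIndex = currentTop;
        for (int i = currentTop + 1; i < n; ++i) {
            ++steps;  // read two elements, compare, maybe update currentMaxIndex
            if (a[i] >= a[currentMaxIndex]) {
                currentMaxIndex = i;
            }
        }
        int temp = a[currentTop];
        a[currentTop] = a[currentMaxIndex];
        a[currentMaxIndex] = temp;
        ++steps;  // the swap counts as one more basic step
        // (n - 1 - currentTop) comparisons plus 1 swap
        assert(steps == n - currentTop);
        stepsPerPass.push_back(steps);
    }
    return stepsPerPass;
}
```

For the six-element array implied by the walkthrough (24, 18, 17, 25, 27, 24), this returns 6, 5, 4, 3, 2: five comparisons plus a swap in the first pass, four comparisons plus a swap in the second, and so on, for 20 basic steps in all.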
So, overall, what is the total number of basic steps that I require to sort an array of n elements by selection sort? I just have to add up all the numbers that I got. If I do that addition, I get a summation like this: 1 (which is n - (n - 1)), plus 2, plus 3, and so on, up to (n - 2) plus (n - 1); and I am also adding 1, for the swap, n - 1 times, so that contributes an additional n - 1. Since 1 + 2 + ... + (n - 1) is n(n - 1)/2, the total turns out to be (n - 1)(n + 2)/2. The interesting point is that this quantity increases quadratically with n; there is an n squared term that you get if you expand this out.
Now, how bad is it? Here I have plotted the count of basic steps in selection sort versus the array size, for array sizes increasing from 10 to 100. The growth is as (n - 1)(n + 2)/2, as we just saw; these are the values of (n - 1)(n + 2)/2 for different values of n. You can clearly see that this growth is more than linear. It is a quadratic growth; you can sort of see the contours of a parabola over here. This is problematic because the number of basic steps needed by selection sort increases too fast with the array size, which basically means that for large values of n too many basic steps are needed. So, the program will run too slowly.
So, is selection sort really fast enough for our practical requirements? Well, in a real-world sorting scenario our requirement could be something like this: a query on an internet search engine for a popular topic, let us say Mahatma Gandhi or Independence Day India. If you query Google or Bing with such keywords, it is going to generate more than 1 million data items. In this case the data items are various web pages; each of them has a score, and you have to rank them in decreasing order of their scores.
So, you see, in the real world having 1 million, or actually many more than 1 million, data items is not very uncommon. We routinely need to sort data items that are this numerous, and it turns out that selection sort would be hopelessly slow for such applications.
Here is a simple calculation which shows why that is the case. If I take n to be 1 million, then (n - 1)(n + 2)/2, the number of basic steps required by selection sort, is roughly 5 × 10^11. Now, remember that a basic step involves reading two elements of an array, comparing them, and perhaps updating some other variable. So, there are actually memory reads and memory writes involved. Suppose each basic step takes 20 nanoseconds; this is quite optimistic given today's technology, given that a basic step involves possibly multiple memory reads and writes, a comparison, and so on. If you take this ballpark estimate and multiply it by 5 × 10^11, you will require 10^4 seconds to execute that many basic steps, and 10^4 seconds is roughly 2.78 hours.
Now, is that not completely unacceptable? If I do a query on the internet and it generates 1 million data items, and simply ranking them by their score is going to take around 3 hours, that is hopelessly bad. So, can we do much better? Fortunately, yes, we can do much, much better. In fact, we can sort an array of size n using approximately n log₂ n basic steps. For n equal to 1 million that is about 2 × 10^7 basic steps, so with the same notion of a basic step taking around 20 nanoseconds, a million elements can be sorted in no more than a few seconds. How we sort this much faster, using approximately n log₂ n basic steps, will be the topic of the next few lectures.
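
The arithmetic behind this comparison can be reproduced with a few lines of C++. This is a rough sketch under the lecture's own assumption of 20 nanoseconds per basic step; the function names `selectionSortSteps`, `nLogNSteps` and `secondsFor` are hypothetical, introduced only for this illustration, and the n log₂ n count is the approximate figure quoted above rather than the cost of any particular algorithm.

```cpp
#include <cassert>
#include <cmath>

// Basic steps used by selection sort on n elements: (n - 1)(n + 2) / 2.
double selectionSortSteps(double n) {
    return (n - 1.0) * (n + 2.0) / 2.0;
}

// Approximate basic steps for a faster sort: n * log2(n).
double nLogNSteps(double n) {
    assert(n >= 1.0);  // log2 is only meaningful here for n >= 1
    return n * std::log2(n);
}

// Convert a step count to seconds, assuming a fixed cost per basic step.
double secondsFor(double steps, double nanosecondsPerStep) {
    return steps * nanosecondsPerStep * 1e-9;
}
```

With n equal to 1 million and 20 nanoseconds per step, `secondsFor(selectionSortSteps(1e6), 20.0)` comes to about 10^4 seconds, or roughly 2.78 hours, while `secondsFor(nLogNSteps(1e6), 20.0)` comes to about 0.4 seconds.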
In summary, in this lecture we looked at the analysis of how selection sort performs. We counted the basic steps that selection sort requires, and we saw that this count grows quadratically with the size of the array. This motivates the study of faster sorting techniques, where the number of basic steps required does not grow as fast as quadratically with the size of the array. Thank you.