Hello and welcome to the NPTEL course on an Introduction to Programming through C++. I am Abhiram Ranade and this lecture is about arrays and recursion. The reading for it is chapter 16 of the text. Recursion is very useful for designing algorithms for data stored in arrays, just as we saw for data stored elsewhere. We are going to talk about two very useful recursive algorithms that work on arrays: the binary search algorithm and a sorting algorithm called merge sort. Let me begin by discussing the notion of searching an array. The input is an array A of length n, say of integers, together with a value x, often called the key, which is also an integer, and we want to know whether x is present in A. We may also want to know at what position it is present, but for simplicity let us say we just want to know whether x is present in A: if it is present we should return true, otherwise we should return false. Now this is not a new problem; we already did this in the marks display problem. There we were given a roll number and we went through our array of roll numbers to find whether that roll number was present somewhere. So this is just a simpler version of that, and I am doing it primarily to jog your memory about what we have already done, but then we are going to do something more interesting with it. The natural algorithm, which you have already seen, is to scan through the array, going through every element starting at the beginning, and return true if you find x; if you get to the end without finding it, you return false. So here is the program.
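A minimal sketch of such a program, for reference. The function name linearSearch is an assumption made for illustration and is not taken from the lecture; the parameters follow the spoken description (the array A, its length n, and the key x).

```cpp
// Linear search: report whether x occurs anywhere in A[0..n-1].
// The first parameter could equally be declared as "int A[]";
// both forms pass the starting address of the array.
// (linearSearch is a hypothetical name chosen for this sketch.)
bool linearSearch(int *A, int n, int x) {
    for (int i = 0; i < n; i++) {
        if (A[i] == x) return true;   // found x at index i
    }
    return false;                     // reached the end without seeing x
}
```

A call such as linearSearch(A, n, x) examines up to n elements, which is the cost the lecture goes on to discuss.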
So as argument we take the array name A. I could have written this as int A followed by square brackets, but I can also write it as int star A. We start at index 0, n is assumed to be the length of the array, and we go through all the elements. If we ever find that x is one of the elements then we return true; otherwise we just continue going through the array until the end. If we get past the end of the loop, that means we have not seen x in the array at all, and therefore we return false. You have seen this a number of times, but as I said I am repeating it just to set the context. This is called linear search; I think we have used that term before. I just want to observe that this is somewhat time consuming. To decide whether x is present we may have to scan the entire array. We certainly have to scan the entire array if the element is not present, and if the element is present then on average we will scan half the array. So our time might be small if the element is present at the beginning and large if it is present towards the end, but on average we have to go through half the array. If the array is large, this can take quite some time. So let us now consider a different problem, that of searching in a sorted array. What is a sorted array? You know it already, but let me define it because we are going to use it here. When we talk about a sorted array there are two possible orders. The non-decreasing order is A[0] less than or equal to A[1], less than or equal to A[2], and so on until A[n-1]. So the elements are increasing, but equal elements are allowed; that is why this order is called non-decreasing. We could also have a non-increasing order, and that simply means A[0] is greater than or equal to A[1], and so on until A[n-1].
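The two orders just defined can be written as simple checks. This is only an illustrative sketch; the function names isNonDecreasing and isNonIncreasing are hypothetical and do not come from the lecture.

```cpp
// True if A[0] <= A[1] <= ... <= A[n-1]; equal neighbours are allowed.
// (Hypothetical helper, for illustration only.)
bool isNonDecreasing(const int *A, int n) {
    for (int i = 0; i + 1 < n; i++) {
        if (A[i] > A[i + 1]) return false;   // a strict decrease breaks the order
    }
    return true;
}

// True if A[0] >= A[1] >= ... >= A[n-1]; equal neighbours are allowed.
// (Hypothetical helper, for illustration only.)
bool isNonIncreasing(const int *A, int n) {
    for (int i = 0; i + 1 < n; i++) {
        if (A[i] < A[i + 1]) return false;   // a strict increase breaks the order
    }
    return true;
}
```

An array whose elements are all equal satisfies both checks, which is why the names say "non-decreasing" and "non-increasing" rather than "increasing" and "decreasing".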
So now we want to search in such an array, non-increasing or non-decreasing; of course we know which of the two it is. The interesting question is whether the sortedness helps us in the searching, and if so how much. Naturally, if it helps us a lot then we may decide to keep our array sorted, because then our searching might happen very fast. So let us consider the problem: we are searching for x in a non-decreasing sorted array A[0] through A[n-1]. Now here is the key idea. We are going to compare x not with A[0], as we did in linear search; our very first comparison is going to be with the middle element of the array. Let me draw a picture. This is my array A, with index 0 at one end and index n-1 at the other. In linear search we compared x to A[0], then to A[1], then to A[2], and so on. The new idea is that we compare it to the middle element. What is the middle element? It is the element with index n/2. Now this is not exactly the middle element; if n is even it is slightly to the right of the middle, but I am talking about an approximate middle, and of course the division is integer division so we get an integer. So x is compared to A[n/2]. What if x turns out to be smaller than this element; what do we know? If x is smaller than A[n/2] then x is also smaller than all the elements to its right. Why? Because the array is sorted, so those elements cannot be smaller than A[n/2]; they can only be equal or larger. Therefore x can be present only to the left of A[n/2], that is, in A[0] through A[n/2 - 1], and our subsequent search can be restricted to that region. So we search only the first half; again I am using "half" approximately. What if x is bigger than or equal to A[n/2]? In that case we know that if x is present at all, it is present in the region from n/2 onwards, including index n/2 itself.
x could be present in the first half as well, if there are equal elements, but never mind that. If x turns out to be bigger than or equal to A[n/2], then we know that if it is present at all it is present in the second half, and therefore we can ignore the first half and restrict ourselves to the second half. So x, if present, will be present in A[n/2] through A[n-1]. It may also be present in the first half, but I am going to ignore that, and in the rest of the algorithm we will only search the second half. Now how do we search the halves? This is the beautiful answer: we are going to recurse. I have really told you the entire algorithm. I will tell you the exact function in a minute, but I have told you all the ideas. Let me make one observation: I have done only one comparison, but whichever way that comparison turns out, I have eliminated half the elements. So in one comparison I have reduced my problem to half the size. Say the array has 1000 elements. If I do linear search I may do 1000 comparisons, but even if I switched to linear search after this one comparison I would do only about 500 more. That is not what I am going to do in any case; I am going to recurse, so the savings will be enormous. Let us develop this idea. What is our plan? We are going to write a function Bsearch. It will search a region of an array instead of the entire array. Of course the region could be the entire array, but in general it is some region, and the region is specified using two numbers: the starting index S and the length L of the region. Maybe I should draw a picture here as well. This is my array again, from index 0 to index n-1, and let us say S is some starting index. The length of the region that I am going to search is L, so there are L elements in it. So what is the index of the last element of the region?
This is S plus L minus 1. I could have specified the starting index and the ending index, but I can give the same information by specifying S and L. Both are really equivalent; we just happen to use L here. So Bsearch is going to be given S and L, and of course x, which is the element to be searched for, and it is going to search the array, but not the full array; it is going to search only within this region. Now if L is 1, what happens? What is the region? If L is 1 then the first and last elements of the region become the same element, so the region we are talking about is really a single-element region. Searching a region of length 1 is very easy: we just check whether A[S] is equal to x. If it is we return true, otherwise we return false, as simple as that. Otherwise what do we do? Just as we compared x to the middle element of the whole array earlier, we are now going to compare x to the middle element of this region. What is the middle element? It has index S plus L over 2. Remember that earlier the middle index was n over 2 and the region started at 0, so it was 0 plus n over 2; now it is S plus L over 2. I am going to compare x with this element, and again the idea is that I will eliminate either the part to its left or the part from it onwards. So the middle element is A[S + L/2], and the algorithm is called binary search because the size of the region to be searched gets roughly halved at each step. So here is the code. Bsearch takes the starting address of the array A; the region is given by the starting index S and the length L; and x is the element that I want to search for. So we are going to search for x in A[S] through A[S+L-1], as we just said.
So if L is equal to 1 we check whether A[S] is equal to x. What does that mean? We are returning the result of that condition: if the condition is true we return true, otherwise we return false. Then we are going to look at the middle element. We set H to L/2, which is half the length of the region, and we check whether x is less than A[S+H]. In the picture this is the element at index S plus L/2, which is the same as S plus H. So we are checking whether x is less than this element or not. If x is less than this element, which region do we have to search? We have to search the region starting at S and going to S+H-1. That is what we are searching, and indeed if I ask to search the region (S, H), where H is the length, that is exactly S through S+H-1. This has to be written carefully, because you do not want to miss out one of these minus ones or write an unnecessary one. Otherwise, what do we have to search? We have to search the other part, starting at the middle element and going all the way to the end of the region. So that region starts at S+H and has length L-H. Why does this give the right region? The final index of a region is its starting index plus its length minus 1. Here that is (S+H) + (L-H) - 1, which is S+L-1, exactly the last index of the region I wanted. So I have written code that is exactly analogous to the algorithm that I described at the very beginning.
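The function just described can be sketched as follows. The slide code is not part of this transcript, so this is a reconstruction from the spoken description, not necessarily the exact code shown in the lecture; the spelling Bsearch and the parameter order (A, S, L, x) are assumptions.

```cpp
// Binary search in a non-decreasing array.
// Searches for x in the region A[S..S+L-1].
// Assumed precondition: L >= 1 and the region lies inside the array.
// (Reconstructed from the description; name and signature are assumptions.)
bool Bsearch(int *A, int S, int L, int x) {
    if (L == 1) return A[S] == x;      // base case: single-element region
    int H = L / 2;                     // integer division: half the length
    if (x < A[S + H])
        return Bsearch(A, S, H, x);            // first half:  A[S..S+H-1]
    else
        return Bsearch(A, S + H, L - H, x);    // second half: A[S+H..S+L-1]
}
```

The two recursive calls cover the original region without overlap: the first has length H and the second has length L - H, so the lengths add up to L.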
And what is the main program? Say it is given some input like this. This is a sorted array, and in this sorted array I want to search for 11. So my call is Bsearch(A, 0, 8, 11). Why 0 and 8? Because I want to start at index 0, and the length that I want to search through is the entire array, which is 8. And x is 11; that is the value, the key, that I am searching for. This will print 1 or 0 depending upon whether 11 is present or not present in the array. In this case 11 is not present, so it should print 0. Now this is our function and it is a recursive function, and we said earlier that recursive functions have a certain format. So we should check whether that format is there in this case as well. One question we said we should ask is: is there a base case, and is the correct answer being returned for the base case? Let us see. Is there a base case here? Yes, L equal to 1 is certainly a base case. This is the case when the program returns without recursing further. In this case does it return the correct answer? If L is 1, that means we are searching a region of length 1 which starts at S, so A[S] is the only element in the region. If we want to know whether x is present we should check whether that only element equals x, and indeed we are doing that correctly. So that takes care of the first question. Then you have to check: are the recursive calls valid? Let us go back to the code, where we are making the recursive calls. As I said earlier, we have to be sure that the indices are given correctly. For example, this L minus H is crucial, and at the very beginning, in the call from the main program, you should certainly not give a length bigger than 8. Similarly you should make sure that the arguments of the recursive calls fall within the range that we want, because these arguments actually define the region that you want searched.
So we did this check, and yes, these are the correct regions that we want searched, no matter what the arguments S and L are. So that is also done. Now here is an important question. When we are recursing, the recursive calls should be solving a smaller problem. If they solve a problem of the same size then we are not making progress, and we are likely to run into infinite recursion. So we should check this; it is perhaps the most important check. There is a natural notion of problem size here, and it is captured quite nicely by the argument L: we are searching in a region of length L. So what we need to be sure about is whether our recursive calls are searching in a smaller region. If we keep on doing that, then eventually we will get to a region of size 1, certainly not below 1, and then we will hit the base case. So we have to make sure that the first call is searching a smaller region, and the second call as well. For the first call, what we want to know is whether H is smaller than L. What is H? H is L/2. Is L/2 smaller than L? This is integer division, so L/2 is truncated, and L/2 is always going to be smaller than L. So this call searches a smaller region than what we started with. Well, we have to be careful: L/2 is smaller than L provided L is bigger than 0. We also have to make sure that we are never asking to search a zero-sized region. When would that happen? It would happen if L is 1, because then H is 0. If we ran this for L equal to 1 and reached the recursive call, we would be asking to search a zero-sized region. But that does not happen, because if L is 1 we return right at the base case, so the recursive call is never reached. It is reached only if L is bigger than 1, and therefore this is always a proper call. So a proper call is one in which this length is always larger than 0.
But now we have also checked that it is smaller than L. Earlier I said that this call is proper, but that actually requires the argument that H is bigger than 0. And H will be bigger than 0, because if we start with L equal to 1 then we return at the base case, so we reach the recursive calls only if L is bigger than 1, and therefore H will always be bigger than 0. Now is the first call always going to be valid? Can H be equal to L? Well, L/2 is always smaller than L when L is bigger than 0, so H cannot be equal to L. So the length H will always be smaller than L, and it will always be greater than 0 because we only get here when L is bigger than 1. So let us run through the basic checks. We said that in a recursive program there has to be a base case, and a correct answer must be returned for the base case. Clearly, in this program, if the size of the region being searched becomes 1 then we return directly. So we have a base case, and are we returning the correct answer for it? Yes, we are. The next question is: are the recursive calls valid, do they stay within the region? This requires us to check whether the arguments are valid. In the first call, S is a valid index because we started off with S as a valid index, so that is not a problem. The argument that could potentially be invalid is the length H. What do we want of it? We want it to be always bigger than 0. Will it always be bigger than 0? We are setting H equal to L/2, so if L were 1 there would be a danger that this length becomes 0. But note that this will never happen, because we check whether L is equal to 1 and immediately return in that case.
So if we come to the recursive calls we know that H is always going to be at least 1, and therefore the length will never be 0. So what we have checked is that the first call is a valid call, in the sense that its region stays within the original region. Now what about the index S+H in the second call; is this going to be a reasonable index? What do we know about the value of H? H is L/2, so the index is S plus L/2. When we come here L is at least 2, so H is at least 1 and therefore this index is bigger than S. But can it be too big; can it go past the end of the region? It will not, because S plus L/2 is always within the region: the region goes from S to S+L-1, and you can check that L/2 is at most L-1 when L is at least 2, so this index is always within the region and is a perfectly fine index as well. What about the length L minus H in the second call? H is L/2 and H is at least 1, so L minus H is certainly smaller than L. It is also strictly larger than 0, because H is smaller than L. So the length is shrinking but it will not go down to 0, and therefore this is also going to be a valid call. The next thing is: are the recursive calls solving smaller problems? Again let us go back to the code. For the first call you have to argue that H is smaller than L. Why is that? H is L/2, and L/2 is always smaller than L, so the first call has a smaller region. For the second call, is L minus H always going to be smaller than L? It is smaller if H is always bigger than or equal to 1. Is that going to be true? Well, H is L/2 and L is at least 2 at this point.
So therefore H is going to be at least 1, and therefore L minus H is strictly smaller than L. So yes, both of our calls deal with smaller regions. The only reason to go over this so carefully is that truncation introduces some surprises. We are familiar with ordinary division, which is exact; integer division sometimes does things we do not expect, and therefore it is better to check all these things. The final question is: if the answers to the recursive calls are correct, will that produce a correct answer for the top-level call? Again, look at the code. If the answer from the second call is correct, then it is the answer for the region S+H through S+L-1, and that is exactly the region we wanted to search because x was greater than or equal to A[S+H]. Similarly, the first call searches exactly the region which we wanted to search if x was less than A[S+H]. So if these answers are correct then we get the correct answer at the top level. So we have run our basic checks and they seem to be okay, so now we can run the program. All right, so what did we discuss? We discussed the search problem. We said that if the array is not sorted we do a linear search, and if the array is sorted we do a binary search. Next we are going to see how this program executes, and then we will also analyze it. So we will take a quick break.
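The checks argued in this lecture can also be written down as assertions. This is only a sketch under assumptions: the names splitIsProper and BsearchChecked are hypothetical, and the search function is a reconstruction from the spoken description with assertions added, not the code shown in the lecture.

```cpp
#include <cassert>

// True if splitting a region of length L into lengths H = L/2 and L - H
// gives two regions that are both non-empty and strictly smaller than L.
// (Hypothetical helper, for illustration only.)
bool splitIsProper(int L) {
    int H = L / 2;                       // integer (truncating) division
    return H >= 1 && H < L && (L - H) >= 1 && (L - H) < L;
}

// Binary search over A[S..S+L-1] in a non-decreasing array, with the
// validity checks from the lecture written as assertions.
// (Hypothetical name; reconstructed from the description.)
bool BsearchChecked(const int *A, int S, int L, int x) {
    assert(L >= 1);                      // never asked to search an empty region
    if (L == 1) return A[S] == x;        // base case
    int H = L / 2;
    assert(splitIsProper(L));            // holds because L >= 2 at this point
    if (x < A[S + H]) return BsearchChecked(A, S, H, x);
    return BsearchChecked(A, S + H, L - H, x);
}
```

Note that splitIsProper(1) is false, since H would be 0; this is exactly why the base case must return before the split is made.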