So, as I mentioned in the morning and in the previous lecture, today we are going to discuss some additional problems and their solutions, including one particular problem from the quiz, question 4. That problem could be solved in multiple ways, and many people could not get the logic right, so we will discuss it. More specifically, we are going to look at more examples of searching in arrays. We will begin with a load balancing problem, in which two arrays containing different values have to be balanced so that their sums are equal. We pose this problem in the context of a concocted real-life situation. We will then look at the search problem in an array, primarily to revisit our linear search algorithm: given an array with several elements, you want to find out whether a specific element is present or not. You can scan the entire array and keep comparing each element with the given one. If you find it, you have found it; otherwise it is not there, which is okay. But if you have hundreds of thousands of elements and you do a sequential search, you may have to examine all n elements of the array. Can you do it faster? That is the issue. Yes, we can, and we will introduce this notion of faster search through another problem in the analog domain. You remember we had visited the issue of finding a real root of a function, and we had seen the Newton-Raphson method. Today we shall look at another method, called the midpoint search method, and using it as the logical basis we will apply what we learn to a faster search in arrays. Finally, as I said, we will visit question 4 of the quiz.
So, here is the load balancing problem. Imagine that you have two trucks, truck A and truck B, each containing some packages, and the loads are not even. One of the trucks carries more weight than the other.
You would like to balance the load, but the stipulation is that you may do so only by exchanging one package from each truck. So you take one package from truck A and one package from truck B; the package from truck B goes into truck A, and the package from truck A goes into truck B. Can you find a pair such that, when it is exchanged, you get exactly the same load in both trucks? We assume for simplicity, without any loss of generality, that all weights are integer kilograms. So, is it possible to swap exactly one pair of packages? Mathematically, how do we represent this data? Since we have to keep track of individual weights, it is better to retain the values of all individual weights. For example, we can read the weights into two arrays: the weights of the packages in truck A in one array and those of truck B in the other. We assume tentatively that there are at most 100 packages in any truck. This is what we wish to do. We have one truck in which the sum of all the weights is, say, sum 1. We have another truck in which the sum of all the weights is sum 2. Obviously, sum 1 is not equal to sum 2 if the trucks are not balanced. We wish to take one package weighing x kilograms out of the first truck and put it in the second truck, and simultaneously take another package weighing, say, y kilograms out of the second truck and put it in the first. What happens to the sums of the loads? The sum of the loads in the first truck will now be sum 1 minus x plus y, because you have removed x and added y. Similarly, the sum of the weights in the second truck will be sum 2 minus y plus x, and we wish these two to be the same. Of course, we do not know in advance what x and y are. We do not even know whether balancing is possible; there may not be any pair that will actually balance the trucks. But we wish to find out. Mathematically, we state this problem in terms of elements of two arrays.
So, there may exist pairs of elements x and y, with x belonging to array A and y belonging to array B, such that if they are swapped, the sum of the elements of the new array A is the same as the sum of the elements of the new array B. How can we solve this problem? We do not know anything about the weights; remember, they are not sorted. They are given to us in the arbitrary order in which they were entered. So what we can do is take one weight, say the 0th element of array A, and see whether there is an appropriate element in the other array which, when swapped with that 0th element, will balance the loads. What should that weight be? That depends on the weight we are looking at. But if this weight is x, then we already know what y should be from the formula we have, which first requires that we calculate the sums of the existing elements of both arrays. Having got sum 1 and sum 2, we find the difference, and having found the difference, we can find the value of y for any given x. That is the approach here.
This is the program for balancing the load. The input is the values of m and n, which are the numbers of elements in the two arrays, followed by the weights of the individual packages. Packages are arbitrarily numbered 0 to m minus 1 and 0 to n minus 1. We expect the output to be the weights of one package from each truck such that their exchange will leave both trucks with the same load. Please remember that there could be more than one such pair; in fact, we do not know in advance. There may be several packages having exactly the same weight, and all of them could qualify. Here is the initial part of the program. This is straightforward. I have the loads in truck 1 and the loads in truck 2 represented by two arrays, T1 and T2, which hold package weights in kilograms. I have sum 1 and sum 2, which will represent the sums of the two sets of packages, and x and y as temporary locations.
So, x and y are weights which can potentially be exchanged. I additionally define some temporary locations: i and j; diff, which will hold the difference between the two sums; and found count. I want to know exactly how many pairs I am able to find, so I keep that in found count. This is plain old stuff where you read all the data. You read the number of packages in the first truck and then all the weights in the first truck. Then you read the number of packages in the second truck and all the weights in the second truck. That is it. Here there is no validation of input values; we are presuming all input values are correct. But suppose you are writing a professional program representing weights in trucks. What is the minimal check that you should do when you read the input? Sorry? Yes. No value should be negative, because there is no notion of a package weighing negative kilograms. Secondly, you should also check against the upper bound of the array. If you have provided for up to a hundred package weights to be stored, then m and n should not exceed 100. These are the two standard checks that you should perform.
So, this is the main algorithm. It finds the two sums: for sum 1 we iterate over all the elements of the array T1 and add them up, and we find the other sum the same way. Once we know these two sums, we print them out and find the absolute difference. We take the absolute value of sum 1 minus sum 2 because we do not know which truck has the larger load and which has the lesser. Now we do a funny thing. We calculate the remainder of diff modulo 2, and we say that if it is not equal to 0, the difference of the sums is not even, so no pair of packages can be found. Is that correct? What lets us conclude this? This is an important conclusion, but what is the reason for it? Do you all agree with this?
What will happen if it is odd? Here is the reason. If x is an element of array A and y is an element of array B, we need to find x and y to be swapped. The formula is sum 1 - x + y = sum 2 - y + x, and this reduces to 2y = sum 2 - sum 1 + 2x, or y = (sum 2 - sum 1)/2 + x. In fact, look at the difference sum 2 minus sum 1. What will it be equal to? Take sum 1 to the right-hand side and x and y to the left: you get sum 2 - sum 1 = 2(y - x). Since y minus x is an integer, 2 times that has to be an even integer. So the difference has to be even. If the difference is odd, there cannot be any x and y values which satisfy the equation. Having done that, if we look at any element T1[i], the i-th element in the first truck, then the corresponding element can be found as T2[j], provided it satisfies this equation: T2[j] = (sum 2 - sum 1)/2 + T1[i]. Note that this needs the signed difference sum 2 minus sum 1, not the absolute value; the absolute value is enough only for the even-or-odd test. If this equation is satisfied, we have a matching pair. This makes the algorithm simpler. I run an iteration for i equal to 0 to m minus 1 and find the value of y which is a possible replacement. The possible replacement can only be (sum 2 - sum 1)/2 + T1[i], because I am looking at the i-th element of the first truck. I now have to search for y in the entire second array. So I set up a separate nested iteration to go over the elements of the second array, for j equal to 0 to n minus 1, and I check whether T2[j] is equal to y. If it is, then I say I have found one. As I said, there could be more, so I do not stop, but I note that I have found one: I increment found count and immediately output the pair. Namely, the T1[i] which I started with in the outer iteration can be exchanged with T2[j], so I print these two. If necessary, I can also print i and j, for example if the packages are numbered.
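The nested search described above can be sketched in C roughly as follows. This is only a sketch under stated assumptions, not the program shown in the lecture: the function name find_balancing_pairs and its parameter names are made up for illustration, the weights are assumed to be already read into the two arrays, and the signed difference sum2 - sum1 is used to compute y.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (name and signature are assumptions).
   Prints every pair (t1[i], t2[j]) whose exchange equalises the two
   loads and returns how many such pairs were found. */
int find_balancing_pairs(const int t1[], int m, const int t2[], int n)
{
    assert(m >= 0 && n >= 0);

    int sum1 = 0, sum2 = 0, found_count = 0;
    for (int i = 0; i < m; i++) sum1 += t1[i];
    for (int j = 0; j < n; j++) sum2 += t2[j];

    int diff = sum2 - sum1;       /* signed difference, needed to compute y */
    if (diff % 2 != 0)            /* odd difference: no integer pair can work */
        return 0;

    for (int i = 0; i < m; i++) {             /* outer loop fixes x = t1[i] */
        int y = diff / 2 + t1[i];             /* weight required from truck 2 */
        for (int j = 0; j < n; j++) {         /* inner loop searches truck 2 */
            if (t2[j] == y) {
                found_count++;
                printf("Exchange package %d (%d kg) of truck 1 with "
                       "package %d (%d kg) of truck 2\n", i, t1[i], j, t2[j]);
            }
        }
    }
    return found_count;
}
```

For example, with truck 1 holding 1, 2 and 3 kg and truck 2 holding 4, 5 and 3 kg, the sums are 6 and 12, and the function reports two pairs, (1, 4) and (2, 5); either exchange leaves 9 kg in each truck. A return value of 0 corresponds to the "balancing not possible" case.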
I finish this iteration. The inner iteration will give me as many y's as are present in the second truck which match this T1[i]. Then I go to the next value of i, i equal to 1, take the next package, do the same thing, and so on. In case I come out of all these iterations and have not found anything, my found count will remain 0. So I just check: if found count is 0, I say sorry, balancing not possible. Otherwise I do not print this message; I would already have printed a message for each pair that I found. Once again, you have to remember that once you fix the logic in your mind, namely that for each package in truck A you are going to search all packages in truck B, you automatically know that you have to run two iterations, an outer iteration and an inner iteration. The outer iteration fixes one package at a time: the zeroth, then the first, then the second, then the third. And for that one, you have to find the match. So again, you have to be careful with the indices that you use. Here is sample load data for the two trucks. It is given here so that you can hand execute your program. This is often called a dry run. Before you run the program on the machine, you run it by hand with some sample values to verify that, prima facie, your logic is right. For more complex logic, a simple dry run of this kind with one or two data sets may still not tell you whether your program is correct. But at least if there is a glaring mistake of logic, you will be able to detect it very quickly. So you should always remember to hand execute your program with some data.
Here is another example of searching for a value in an array. We have visited this problem earlier. Assume that the marks of all CS11 students in some exam are listed roll-number-wise, and these are the roll numbers. The important point to note is that roll numbers are not necessarily sequential.
There are some intermediate roll numbers which are absent, because somebody did not join or whatever. So you have 1001, 1002, 1003, 1004, then 1006, then 1008; two roll numbers are missing. And these are the marks. There are 500 roll numbers and 500 marks. Given a roll number, you want to find the marks scored by that student. The given roll number hopefully exists somewhere inside the first array. First you read both arrays, roll numbers and marks, once and keep them in memory. Then you start looking for the given roll number. You check the 0th element: is it the given roll number? The first element: is it the given roll number? You will hopefully find it somewhere. When you find it, you announce that the marks obtained are the corresponding element of the second array, where the marks are stored. Is it possible that you never find the given roll number? Why should that happen? After all, I won't give any arbitrary roll number. If some student says, tell me my marks, I will ask that person for the roll number, that person will hopefully tell me the correct roll number, and I will enter it. So why should there be a possibility that a roll number is not found in an array of roll numbers belonging to that class? Sorry, exceed what range? I am giving a roll number, an actual value stored in the array. The size of the array is something I should verify anyway: if I have n students and n is 474, then I would be entering 474 pairs of roll number and marks. Yes, that's right. There could be a manual error. Somebody may have wrongly typed one or more of the roll numbers, or even marks, in the original array itself. Human mistakes are possible. In that case, I exist as a roll number, but my data does not exist in that array because my roll number was wrongly typed by someone. That is one possible error. The other possible error arises even when the data has been verified.
Invariably, by the way, such critical data as exam roll numbers and marks would be verified manually at least twice, so there is very little chance that it is erroneous. You prepare a data file and go through everything assiduously, with somebody else checking. But what may happen is that, while entering the given roll number, I enter it wrongly, and the number I typed does not exist. So there is a possibility that a given roll number may not be found. There is a third possibility: the student with the given roll number did not appear for the exam, so the data does not contain that person's roll number. And suddenly the dean or the head of the department asks, how many marks did this fellow score? I will then tell him that no such fellow exists in my class, or at least that he did not appear here. Unless, that is, you find a mechanism for representing absent students by a value which is not a valid mark but is a valid number, such as negative marks or whatever.
Anyway, the logic here is very simple and straightforward. Given a roll number, I need to find the marks scored by that student. So I declare these two arrays. N students is a location which will hold the number of students. Then there are given roll, found marks, position, and i. What are these for? We have seen a short one. First of all, I assume that I don't find the student. If I find the student, position will indicate the index in the array at which I found that student. So obviously, if I have not found that student, position should have a value outside the range 0 to n students minus 1, which is the valid range. I arbitrarily set it to minus 1. Please note that here I will not find multiple students for a given roll number; the relationship is expected to be unique. Anyway, I set up an iteration from 0 to n students minus 1, and in that iteration I check every time whether the i-th element of roll is equal to given roll.
If it is, I have found the student. I announce that I have found the student: marks for given roll are these. And I set position equal to i. I also do a thing called break. You all remember what break does? Break takes you out of the innermost iteration. Please remember, break and continue statements are not concerned with the if statements surrounding them. They are concerned with the loop statements surrounding them. The innermost loop here is this for loop. So whatever the value of i, the moment I come here, having found the fellow, I break out of the loop. When I come out, I do not know whether I came out after locating the student, because of the break statement, or after completing the entire iteration without finding the student. In either case, I arrive here. And that is why there is another check. If position is minus 1, then I know I did not find that fellow, and I simply announce not found. Why only that? Because if I had found the student, I would already have printed the marks inside the loop. Of course, you can implement this logic in multiple ways, and there could be cleaner ways of doing it than this. This is just one algorithm which will work.
The point is not this algorithm. I think all of you understand this simple searching for a value. The point is, can I do it faster? Say there are 600 students. How many comparisons will this particular logic require on average? On average, half the number of elements, because when a roll number is given, it could be any one amongst the 600, and each has an equal chance of being asked for. So I will do about 300 comparisons on average. For some roll number I'll be lucky and find it in the 0th position. For some, I will have to do the donkey work of going through all 600 elements. But on average, half the elements.
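The linear search walked through above can be sketched in C as follows. It is a minimal illustration rather than the exact program from the lecture: the function name find_roll_linear is an assumption, and instead of printing inside the loop it simply returns the position, with -1 standing for "not found".

```c
#include <assert.h>

/* Hypothetical helper (name is an assumption).
   Returns the index of given_roll in roll[0 .. n_students-1],
   or -1 if the roll number is not present. */
int find_roll_linear(const int roll[], int n_students, int given_roll)
{
    assert(n_students >= 0);

    int position = -1;            /* assume "not found" until proven otherwise */
    for (int i = 0; i < n_students; i++) {
        if (roll[i] == given_roll) {
            position = i;
            break;                /* leaves the for loop, not just the if */
        }
    }
    return position;              /* -1 here means the whole array was scanned */
}
```

A caller would then look up marks[position] when the result is not -1, and report "not found" otherwise.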
So if there are n elements, the number of comparisons I will ordinarily have to do, on average, is n by 2, which is about 300 if n is 600. That is not very large. But what if the number of elements were 6,000? I'll have to do 3,000 comparisons. Okay, the machines are fast enough. What if the number of elements is 6 million? Then I'll have to examine 3 million elements on average for every query. What if it is 6 billion? Then I have a serious problem. So obviously I must perpetually look for an algorithm which will solve the problem at least an order of magnitude faster. If some algorithm takes n squared iterations, for example, we call it an order n squared algorithm. An algorithm which solves the problem by examining only n elements is called an order n algorithm. For large n, any order n algorithm will be better than an order n squared algorithm. What can be better than order n? Because this one is an order n algorithm: in one scan of the entire array, I solve the problem. We shall see that. But before that, we prepare the background for the mechanism. So here is the problem. The algorithm scans all n elements once. Can we do it faster? That's the question.
To see how it can be done faster, we move away from elements of an array, which are essentially a set of discrete values, to the real line, where we were searching for roots of functions. You remember we had seen that problem. Here is a method of finding a root, known as the bisection method or midpoint search method. I don't know whether you have seen this kind of algorithm. Here you are given a function f of x and two values of x, one at which the function is positive and another at which the function is negative. In the sample drawn here, the function value at this point is positive and the function value at this point is negative.
If you have a continuous function for which two points are known, one where it is positive and one where it is negative, then it should be obvious that somewhere in between the function will cross the x-axis, and wherever it crosses the x-axis is a root. In this particular sample, we have shown the desired root to be here. Now look at what we do. We find the midpoint of these two values, high and low. Let's say this is the midpoint. We have absolutely no clue whether the midpoint will be the root, but the midpoint divides the entire range from low to high into two halves. Now we can evaluate the function at the midpoint. It so happens that the function value there is still positive. If the function value was positive at high and is positive at the midpoint, whereas it is definitely negative at low, then obviously a root must lie in this region, between low and the midpoint. We are presuming that there is one root in the range, but even if there are multiple roots, and there could be other roots on the other side, there is guaranteed to be one root on this side. We know that. What have we achieved in one shot, just by taking the arithmetic mean of high and low and finding the midpoint? We have reduced the space in which we are searching by half. Suppose the width was, say, 200 units. Now I have to search only in 100 units. What do I do now? I have this midpoint. Since I know that I now have to search within this range, I treat the midpoint as high. I simply shift high to this point and find another midpoint. Let's say I get it here. This is the new midpoint. Again I observe that at this midpoint the function value is positive. So I know that I now have to search in this range. Please note that every time the search range is becoming half, and half, and half again. At this point, of course, the midpoint will be somewhere here. The function value will be negative. I know I have overshot the root. I now have to search backwards.
So I will search in this range, iterating and appropriately shifting either the high value or the low value. Do you not agree that I will move very quickly to the root? Every time I am reducing the search space by half, and what does that mean? How fast is the reduction in the search space? Suppose the original width was W. Then in the first iteration itself I have reduced the search space to W by 2. Next time W by 4. Next time W by 8. Next time W by 16. This is a very fast reduction in the search space: the number of steps needed grows only logarithmically with the width, as compared to going point by point.
Okay, this is the algorithm basically. So what does it say? I start with some x low and x high values. I calculate f of x low times f of x high and ensure that it is less than zero. Why is that required? Otherwise the two values will not be on either side of the root. We want to ensure that they are on either side. Once I have these two values, I calculate the midpoint, x mid, and I calculate f at x mid. Now I check whether the absolute value of the function at the midpoint is greater than some small threshold value. Observe that we are finding a root on the real line, and the real line has infinitely many points in any interval. So however much I reduce the interval, there are still infinitely many points in it. But given that I can represent numbers only with finite precision, I know that beyond some point it won't make sense to continue. We have seen this earlier, when we decided on some threshold, I think 1.0 times 10 to the power minus 5 or minus 7 or some such value. If the function value is within that threshold of zero, we announce that we have found a root. If it is not, I have to continue the search. So that is what I will test.
A simple iteration: while the absolute value of f at x mid is greater than some threshold value, locate the next interval, which will be either this half or that half. Once you finish that, you announce that you have found the root. So how will you write the program? Let's write it simply. Let us say I have written a function, float f or something, which gives me the function value at any given x. I will not spend time writing that; I'll assume that function f is known. So this is my function f of x and this is the x-axis. First I read L and H. Then I do the basic validation, which is that the product f of L times f of H should be negative. If it is not negative, that means it is greater than or equal to zero, and I will say the method cannot be applied. You agree? There's no point in searching further. Now I have got past this, so I calculate the first midpoint. What is the midpoint? It is simply L plus H divided by 2. Now I have to search while... what is the condition I should put? This is the first condition: as long as the absolute function value at the midpoint is greater than the threshold, I have to keep iterating. What is the second condition? L should be less than H. If H and L cross over, then I have a problem. It means I have somehow missed the root completely. Can that happen? It can happen if my function is not continuous. The function has negative values on one side and positive values on the other, and there is a sharp switch-over, and at that point the function is not zero; it is either minus or plus. That is what is called a discontinuity. If I have a discontinuous function, sadly, I can't do anything about it. But anyway, I need to check that L remains less than H. Low should remain less than high. Now what do I have to do? I am inside the loop. Okay, that means I have not found the root yet.
So what it means is that I have come to this midpoint and I now have to decide whether I take L to this point or H to this point, right? Whether I have to search this side or that side. So I check the value of the function at the midpoint against the value of the function at the low point. Just check the sign again. If the sign of the product is negative, I have to search on one side; otherwise on the other side. So what is the if statement that I could write here? If f of mid multiplied by, let's say, f at the low point is less than zero. What does that mean? f of L and f of mid are of different signs, so high should go to mid. Earlier I was searching between low and high. Now I should search between low and mid. So what should I do? Not mid equal to something: high equal to mid. Remember, I am always searching between L and H. The midpoint is just the place to which I bring either L or H. Else, what should I do? L equal to mid. Simple, nothing else. For a well-behaved function, I will come out only when I have found a root to within the required accuracy. Yes? I have done this, but I am going back and doing exactly the same thing again. So in the next iteration, what am I supposed to do? Since I have moved H here, I now have to find the new midpoint, which I have not done yet. So after the else, I have to say mid is equal to L plus H divided by 2. You notice that this program is quite short, and the number of searches made is smaller still.
There are two things that I would like to point out from a numerical analysis perspective. First, this method can in some cases converge faster than the tangent method that we used for Newton-Raphson, because every iteration is guaranteed to reduce the search space by half. Second, the tangent method may not work for functions like the one I have shown here, where there are local maxima and local minima.
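Putting the pieces of the midpoint search program together, a C sketch might look like this. It is offered under stated assumptions, not as the lecture's exact code: the names bisect and sample_f, the use of double, the return-code convention and the cap of 200 iterations are all choices made here for illustration. The cap is an extra safeguard so that the loop always terminates even for a badly behaved function.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical sample function whose root we look for: f(x) = x*x - 2. */
double sample_f(double x)
{
    return x * x - 2.0;
}

/* Midpoint (bisection) search for a root of f between low and high.
   Returns 1 and stores the approximate root in *root on success.
   Returns 0 if f(low) and f(high) are not of opposite signs, or if
   |f(mid)| never drops to eps (for example, for a discontinuous f). */
int bisect(double (*f)(double), double low, double high, double eps, double *root)
{
    assert(low < high && eps > 0.0);

    if (f(low) * f(high) >= 0.0)      /* basic validation: signs must differ */
        return 0;

    double mid = (low + high) / 2.0;
    int iterations = 0;               /* safety cap, an addition of this sketch */
    while (fabs(f(mid)) > eps && low < high && iterations < 200) {
        if (f(low) * f(mid) < 0.0)
            high = mid;               /* root lies between low and mid */
        else
            low = mid;                /* root lies between mid and high */
        mid = (low + high) / 2.0;     /* new midpoint for the next pass */
        iterations++;
    }

    if (fabs(f(mid)) > eps)
        return 0;
    *root = mid;
    return 1;
}
```

Calling bisect(sample_f, 0.0, 2.0, 1e-7, &r) starts with function values of opposite sign at the two ends and closes in on the square root of 2; calling it with low 2.0 and high 3.0 fails the validation step, because the function is positive at both ends.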
What can happen with the tangent method is that it keeps you oscillating around such a point, and you may end up at a local minimum or local maximum rather than at the root. So that method may not converge properly. The midpoint search, on the other hand, doesn't care how many camel humps there are in the function. It concentrates solely on going to the root, so you will be able to find it. Of course, the difficult part is finding low and high properly. And the more difficult part is this: suppose you have a polynomial, let's say of degree n, which may have n real roots. Then you may not be able to find all n real roots. In the notes which I am preparing, I have given an example of a polynomial with sample values of low and high. Depending upon the values of low and high, you end up finding either this root or that root. And the number of iterations will also depend upon those choices. Yes, there are some questions somewhere. So is this okay? Now, yes. No, no, no. Any function, why should it matter? Any function which has a real root. Not necessarily, okay. The condition for this method to work is that there exists at least one real root between low and high, with the function values at low and high of opposite signs. If that happens, the method will find it. If there are multiple roots, it may find any one of them. No, it will always find one of them, because it doesn't matter how many roots there are. If one value is positive and one value is negative, you always continue the search by shifting high or low to the midpoint, and it is guaranteed that in the new search interval there is at least one root, because the signs at its ends are opposite. Then I will get out earlier itself, announcing that I can't apply this method. Yes, not L less than H. You are saying f of L is less than f of H, okay. L, because I am assuming L is the lower value on the x-axis and H is the higher value.
The function value, we are not sure whether it is positive here or negative here. No, there may not be any root. That's the point. That's the point he was making. Sorry, come again. No, no. Even for a continuous function, L less than H does not guarantee that f of L is less than f of H. They are of different signs, that's all. Not necessarily that f of L is lower than f of H. That is, we are not assuming that f of L is negative. f of L could be positive and f of H negative, and this method will still find the root in between. That's all we are saying. No, no, no, that's okay. But when do I stop this method? So, okay. In this particular case, it will work. I wanted to use this as a basis for doing a faster search, where an equivalent of the root may not exist. What would be the equivalent in the case of an array where I am searching for an explicit value? A given roll number that does not exist at all. If I apply a similar principle, which we shall see shortly, then if L and H cross over, I know the value does not exist. That is the only reason for this check. No, we are only assuming that, of the given L and H values, L is the lower of the two. That's all we are assuming: L is the lower value, H is the higher value. But notice that what we are checking is the product of the function values. So, for example, the same thing will work even if the function at L is positive and the function at H is negative. No change is required. I am only saying that I am ordering L and H based on their actual values on the x-axis as given to me. He's asking what fabs is. So, it's not "fabulous". Well, it is actually fabulous, because it finds the absolute value of a floating point value. abs is the absolute value for integers; fabs is the absolute value for floating point. So for a floating point x, whether it is positive or negative, the result is made positive. That's all. Here is a question.
I am tempted to repeat my earlier observation, but I will still answer this question. The question is which method is better, Newton-Raphson or the midpoint search, the half-interval search. He is pointing out that half-interval search requires two inputs. I have tried to enumerate the advantages of half-interval search. One advantage is that it avoids local minima and local maxima and concentrates on the root. The second advantage is that it will probably converge faster. Will it always work? That depends upon our ability to find two points at which the function values are positive and negative. It is not as hard a problem as it appears. Even for a very complex function, I can do it by trial and error. In fact, I can do a midpoint-search kind of thing: I take a very large negative value of x and find the function value, then a very large positive value and find the function value, and then sort of work inwards. But yes, if I fail to find the starting points, the algorithm will not work. That's the problem. The same thing could happen in Newton-Raphson: you may have a starting point from which you only diverge instead of converging. That depends on the nature of the function also. I don't think there is any general rule which says that one method is always better. A whole lot depends on the nature of the function. But what I originally wanted to say is that yes, these are important issues, issues of a subject called numerical analysis. What we are trying to understand is, given some numerical analysis method, how do we code it as an algorithm and write a program for it? So let's concentrate on the programming part, but that question remains important.
Now, all this was preparing the background for turning a linear search in an array into a faster search. So consider that the x-axis now has a sequence of points.
And let us say I am not evaluating a function, but I am just looking at these points and trying to find out whether a given point is there or not. Can I not consider this to be an array, where I have just turned that entire midpoint search diagram of the function on its head? My low is this point, my high is this point, and there are eight elements in the array. I am looking for a roll number, which I treat as a function of this index. So function of zero is this roll number, function of one is this roll number, function of two is this roll number. Instead of a function, it is merely an array. This assumes that roll numbers are entered in ascending order, because the points on the x-axis are arranged in ascending order, low to high. That is correct, that is the assumption. You will notice now the importance of sorted arrays. The technique that we are going to discuss cannot be applied to an array where the elements are not sorted. But if the elements are sorted, there is an amazing speed-up that you get while searching for a given element. So now you can almost guess what the logic would be. Low will be zero, and high will be the index of the last element that I have, in this case seven. If I am searching for a given roll number, I'll find the midpoint, which is zero plus seven by two, three point five. Maybe I'll round it up to four. I will look at the fourth element. Now the fourth element may be exactly what I'm looking for. If not, it will be either higher than what I'm looking for or lower than what I'm looking for. If it is higher than what I'm looking for, then I know which part of the array to search, the lower half. Otherwise, the second half. So first half or second half. If there were eight values in which I was searching, now I have to search only among four values. Next time I have to search among two values. Eight, four and two does not appear to be very large, but imagine that you are searching over, say, a thousand elements.
One thousand twenty four will reduce to how much? Five hundred and twelve next time. Five hundred and twelve will reduce to two fifty six. Then one twenty eight. Sixty four. Like that, it will reduce logarithmically. So instead of examining, on an average, of the order of n elements in an array, you will be comparing with only about log n elements. And at the end, whenever high and low cross over, you know you haven't found the fellow. If you have found it, you will get to know that. So this is called binary search. It assumes that the array is sorted. I have low, high and mid. I set low to zero and high to n minus one. I am of course presuming that I have read the array elements earlier. So there is a roll number array and a marks array. And I am given a roll number which I call given roll. I am searching for given roll in the array of roll numbers. Same problem that we saw earlier. And I want to print out the marks for that roll. So I start by calculating the midpoint as low plus high by two. I'll set the found flag to zero: I have not found him. I will iterate now while the roll number at mid is greater than, sorry, what should this be? And high is greater than low. So this is one condition I apply. I know that the higher roll numbers will be on the higher side of the index value and the lower roll numbers will be on the lower side. So if I find that roll at mid is greater than given roll and high is greater than low, I still have to search. I will now check whether roll at mid is equal to given roll. If it is, I will set the found flag to one and break. Otherwise I'll recalculate the midpoint. And how do I recalculate the midpoint? Exactly the same logic. If the midpoint roll number is greater than given roll, the given roll is smaller than this, which means I have to search in the lower half. Otherwise the given roll number is bigger than this, and I have to search in the later half.
I only set high to mid or low to mid, recalculate the midpoint, and go back again. I can put this condition in three different ways. I can alter the way I reassign mid, high or low. But eventually I will have to implement this logic. And if I implement this logic, my search will compress the search space. This time it is not going from one infinite number of points to another infinite number of points. It is going from a finite number of points to half of them in the next step, and to half of those in the step after, guaranteeing that I will finish this search in logarithmic time. And at the end, depending upon the found flag when I come out here, I can check whether I have found it or not. If I have found it, of course, I can print the marks. If I have not, I will not. There are small but significant efficiencies that can be added. If I put the condition here itself as not equal to given roll, that means at the midpoint the roll number is not the one I want. Then I have to search the upper half or the lower half. But is it necessary that when I move high or low, I should move it to the midpoint? Actually not, because the midpoint itself need not be considered any more in the search. That number is not there. That is my while condition, in which case I can move either low to mid minus one or high to mid plus one. Sorry, low to mid plus one or high to mid minus one. So I can eliminate the midpoint itself from my search. These are small tricks that can be done here. I am putting this up. I would suggest that you take some possible roll number and write down, for the first iteration, what is low, what is high, what is the midpoint, and whether you are required to do a second iteration, and confirm for yourself that the number of iterations you require is of the logarithmic order of the total number of elements. Yeah, the while condition should be this condition. What is wrong with this condition? If the... Correct, he has found a flaw.
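The search described above, including the refinement of moving low to mid plus one or high to mid minus one, can be sketched as follows. This is one possible way to write it, not the code on the slide: the function name `find_marks` and the sample roll numbers and marks are made up for illustration, and the loop condition used here (continue while low has not crossed high) is a common formulation rather than the slide's condition. Integer division rounds the midpoint down instead of up; either choice works.

```python
def find_marks(roll, marks, given_roll):
    """Binary search in a roll-number list sorted in ascending order.

    Returns (marks_or_None, number_of_comparisons_made).
    """
    low, high = 0, len(roll) - 1
    comparisons = 0
    while low <= high:              # low and high have not crossed over yet
        mid = (low + high) // 2
        comparisons += 1
        if roll[mid] == given_roll:
            return marks[mid], comparisons
        if roll[mid] > given_roll:
            high = mid - 1          # continue in the lower half, mid excluded
        else:
            low = mid + 1           # continue in the upper half, mid excluded
    return None, comparisons        # crossed over: the roll number is absent


# Hypothetical sample data: eight sorted roll numbers and their marks
roll = [101, 104, 107, 110, 115, 120, 126, 131]
marks = [55, 62, 71, 48, 90, 66, 77, 83]
for wanted in (101, 131, 112):      # first element, last element, absent
    print(wanted, find_marks(roll, marks, wanted))
```

Returning the comparison count makes it easy to confirm by hand that a list of 1024 elements never needs more than 11 probes.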
If the given roll is less than the midpoint roll. So what should this condition be? So I should just say... Actually we don't have time, but I wanted to go through that dry run as I mentioned. And when you go through that dry run, and the given roll number is in the wrong range, you will find that suddenly the algorithm does not behave correctly. This is precisely the reason why I say: always get into the habit of quickly checking by doing some hand iterations for one or two sample values. The typical sample values that you should test are the extreme cases: the given roll number is the 0th itself, the given roll number is the last itself, and third, the given roll number does not exist. If it works for these three, it will generally work for any other. Here is another question. Depending upon whether you are in slot five or slot 11, the question said: find a subsequence of the positive and negative numbers given in an array such that the sum of that subsequence is either minimum or maximum. So how will you solve this problem? Sorry, segmentation fault? It has nothing to do with searching for elements in an array. You are getting that when you are running which program? This program? Then somewhere an index is going beyond the total number of elements, or the size that you have declared for the array. Okay, what is a segmentation fault in technical terms? Whenever you run a program, some memory is allocated to you. If, for any reason, any instruction in your program attempts to access memory which is outside that bound, outside that segment, the system shouts, saying segmentation fault. Basically you are trying to write something into my area, for example. So if that fault occurs, you will be terminated permanently. I mean your program will be terminated. So a segmentation fault is a serious problem. Should examine the... Why would the compiler give an error? The compiler's job is to allocate storage based on the size that you have declared.
You have declared the size, say the roll number array size, to be 500. Now you are searching for an element whose index is 1274. The compiler would have no clue what you are going to do, what values you are going to use. The compiler is relatively dumb. It just says, do this, do this, set it up. Now you run your program. Oh my God. Oh, he's asking a very serious question. Doesn't the compiler check whether my program is correct or not? If it did, we wouldn't require any exams of any kind. In effect, if you write any trash algorithm, you are expecting the compiler to be wise about the algorithms that you use in your program. That is not yet possible. Maybe your generation or the next generation will write compiler systems which are thoroughly knowledgeable in various algorithms. Today it will just check the syntax. It will not check anything else. Logic is entirely your responsibility, and you will notice problems only when you run the program. So here is a sequence. Now, first of all, what do we mean by subsequences? Subsequences could be many. Let us consider the subsequences that appear in this particular list of numbers, starting with the 0th element. The first subsequence is 2, and that sum is 2 itself. The next subsequence is 2 and minus 3. And what is the total? Minus 1. The next subsequence is 2, minus 3, minus 4, and the sum is minus 5. The next subsequence is 2, minus 3, minus 4, 2. What is the sum now? Minus 3. What is the next sum? 4 or minus 4? You agree that, starting with the 0th element, these are all the subsequences. Just to make things unique, let me change one of the values. Let me change this to minus 2, for example. If I change it to minus 2, this value will be how much? Minus 10. And this value will be minus 7. So I have these sums: minus 1, minus 5, minus 3, minus 9, minus 4, minus 1, 0, minus 8, minus 10, minus 7. Which is the minimum sum? So this, minus 10, is the minimum sum. And this minimum sum comes out of what sequence?
So what is the start index and what is the end index? There are two start indices. One is this subsequence, minus 8 and minus 2. But because I was getting a value 0 in the sum, even if I start counting from this point and go right up to this point, I will get the same sum. You agree? Sorry? Yes or no? Start after? Oh. So if I start with the 0th point, then the minimum sum is minus 10. But you see, identification of this sequence, when the numbers are not at all ordered, is not straightforward. One logic that can be implemented is: I start with the 0th element, go through all the subsequences that start there, and find their sums. So 0; 0, 1; 0, 1, 2; 0 to 3; 0 to 4; 0 to 5; and so on. I get a value minus 10, which is actually the sum from the 0th element to this point. But is this the shortest subsequence? Not necessarily, because starting with minus 8 and minus 2 I also get the same thing. So having done this work, I can find out what is the minimum sum that I get if the starting point is the 0th element. I can now say, let me start with the first element and find out which subsequence has the minimum sum. Do the same thing with the second element, the same thing with the third element, and so on. So I have an order n square algorithm essentially. And in each case, I will find one minimal sum. Now I have to find the minimum amongst all these minimal sums. Whatever is that minimum, and whatever are the corresponding start index and end index, that is my answer. So this is not a very straightforward logic. Technically, I should have put that problem as question five and shifted question five to question four, because normally I attempt to put slightly more difficult questions towards the end. But that's okay. I have only glanced through some of the answers. Tomorrow we are doing the entire correction work, so we'll know how many people got the logic right and how many did not. Not a very straightforward problem.
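The order n square logic just described, trying every start index and every end index and keeping the smallest sum, might be sketched like this. The function name `min_sum_brute_force` is hypothetical. The sample values are an assumption: they were reconstructed from the running sums quoted in the lecture (2, minus 1, minus 5, minus 3, and so on) and may not match the slide exactly.

```python
def min_sum_brute_force(a):
    """Try every start index and every end index; keep the smallest sum.

    Returns (min_sum, start, end) with inclusive indices.
    Uses about n*n/2 additions for n elements.
    """
    best_sum, best_start, best_end = a[0], 0, 0
    for start in range(len(a)):
        running = 0
        for end in range(start, len(a)):
            running += a[end]               # sum of a[start..end]
            if running < best_sum:          # strict: the earliest minimum wins ties
                best_sum, best_start, best_end = running, start, end
    return best_sum, best_start, best_end


# Values reconstructed from the running sums quoted in the lecture (an assumption)
a = [2, -3, -4, 2, -6, 5, 3, 1, -8, -2, 3]
print(min_sum_brute_force(a))   # prints (-12, 1, 9)
```

For these reconstructed values, restricting the start to the 0th element gives minus 10, but allowing any start gives minus 12, starting at index 1, which illustrates why all starting points have to be tried.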
But what I wanted to indicate by such problems is this: if at all we have said so far that you can have multiple ways of solving a problem, this is one problem where there could indeed be n different ways of solving it. About one algorithm: after we wrote our program, we were struggling ourselves to find the best solution. We knew that the best solution would not take order n square, so we then wrote to a person whom we call the local god of algorithms, my colleague, Professor Ajit Divan. He was writing extraordinary algorithms even when he was a B.Tech student, and he continues to do that even today. So this is the solution by Professor Divan. Actually the mail that he sent me said that this is a simple exercise in algorithms, and then gave this solution. I don't find it simple at all to imagine that this is the solution, but this is indeed the correct solution. So look at what you do. You calculate S i. The S i are not necessarily sums of all the elements from a given starting point. You accumulate S i based on this simple rule, and in the process you scan all the elements only once. So you run an iteration from zero to n minus one over the elements of the array A. Initially you set S zero to A zero. So whatever A zero is, two, five, minus seven, whatever, that is S zero. Now S i is calculated as A i if S i minus one is greater than or equal to zero. Otherwise it is calculated as A i plus S i minus one. Simply put, if the previous sum is positive, then adding the next element to it can only give a larger value than the next element by itself, not a smaller one. So there is no point in continuing with that summation. That is roughly the logic. Take the sample values that have been given here; I'll put up these slides in about an hour's time. You will find that at the end, when you come out, the S array contains values the minimum of which represents the minimum sum possible.
What is the end index for that minimum sum? That will be the index at which that value of S occurs. So if it occurs at i equal to eight, then the end index of that minimal subsequence is eight. But what is the start index? Where did this start? Did it start just at that point, two points earlier, or right at the beginning? That is not indicated by this logic. For that you will have to struggle a little to think how to maintain the start position for each of these sums that you actually compute. If you can maintain start positions for each of these sums, the end position is obviously the place where it occurs. So you can have another array: just as you have the array S, you have the start positions. That start position is initially set to zero, and subsequently, for every value of S, you keep track of where you started, until you restart S. The last place where you restarted S becomes the new start position. This way you will be able to get both start and end positions. The beauty of this logic is that it is an order n algorithm. It makes exactly one pass over all the elements and gets you the minimum sum. How will you solve the problem of finding the subsequence which has the maximum sum? It's simple: you can invert the logic and do exactly the same thing. So you can. Sorry? Yeah. No, no, it is not necessarily about finding the sum from the first term. The issue is: if such a sequence is given, what is the subsequence inside this sequence whose sum is minimum? It does not say that the subsequence must start with the zeroth term. It can start with any term. So, as I mentioned, in this particular case minus eight, minus two might have been the subsequence having the minimal sum. It does not so happen in the given example, but I can always construct examples with appropriate values and check where it is. In general there is no sanctity or assumption that a minimal sum subsequence will always start from the zeroth element. There is no such need, because the values are arbitrary. So is the logic understood?
First the logic of doing the twice nested loop, finding all the partial sums, then comparing later on which is the minimum, and keeping track of the start index there as well; and second, the much cuter algorithm. Fine. We are iterating over i from zero to n minus one. That's it. In one shot you will get all the S i's, and if you maintain the start index as well, then at the end the minimum of these can be found. In fact, since you are interested only in the minimum sum, you need not retain all the individual elements of S after you calculate them. You need to retain only the minimum and the current sum. Okay. So you can implement that logic. We'll be posting the sample answers on Moodle after the correction work is over on Monday. Okay. Thank you.
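The one-pass rule described above (S i is A i when the previous S is greater than or equal to zero, otherwise A i plus the previous S) can be sketched as follows, keeping only the current sum, the minimum so far, and the start positions rather than whole arrays. This is an illustrative sketch, not the sample answer to be posted: the function name `min_sum_one_pass` and the way the start index is tracked are assumptions, and the sample values were reconstructed from the running sums quoted in the lecture.

```python
def min_sum_one_pass(a):
    """Single pass over a; returns (min_sum, start, end), indices inclusive."""
    best_sum = cur_sum = a[0]          # S_0 = A_0
    best_start = best_end = cur_start = 0
    for i in range(1, len(a)):
        if cur_sum >= 0:
            cur_sum = a[i]             # S_i = A_i: restart the sum here
            cur_start = i
        else:
            cur_sum += a[i]            # S_i = A_i + S_(i-1): keep extending
        if cur_sum < best_sum:         # remember the minimum and where it lies
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_sum, best_start, best_end


# Values reconstructed from the running sums quoted in the lecture (an assumption)
a = [2, -3, -4, 2, -6, 5, 3, 1, -8, -2, 3]
print(min_sum_one_pass(a))                       # prints (-12, 1, 9)

# Maximum sum by inverting the logic: negate the values, then negate the result
neg_min, start, end = min_sum_one_pass([-x for x in a])
print(-neg_min, start, end)                      # prints 9 5 7
```

Negating the input is one simple way of inverting the logic for the maximum-sum version; flipping the comparisons inside the loop would work equally well.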