Hi and welcome to this next session on data structures and algorithms. In this session, we will continue our discussion on the running time of a program, and in particular, we will look at average and worst case complexity. We will look at something called asymptotic analysis, which is basically all pen-and-paper work, unlike the empirical evaluations that we talked about in our last discussion. So, going back to our search algorithm A: you might want to recall that algorithm A was basically a linear scan algorithm meant to find an element E in a sequence S. We will now start looking at analysis as function identification. We look at two different notions of analysis. One is the average, basically across all instances of the input. Considering all instances of the input subject to some constraint is also a very frequently occurring scenario; so we will look at all instances of S, for example, but with a fixed length of S. The other is the maximum, or worst case. This is again across all instances of S with the fixed length, but we are only interested in the worst case performance. For the first of the two, the average, you will of course need to know how likely each instance is. So, to calculate the average, you need a probability distribution over the inputs. Now, a probability distribution needs the specification of a random variable. First of all, we need to determine the probability of S having a successful search for an element E; this is the probability that E is found in S at all. The second is the distribution over the location of E, which is the probability that E is found in S at position i. So, let us take the average case for search algorithm A. We will assume that the probability of success is, say, P, and given that there is success, the conditional probability of element E being at index i is 1/N, and this holds for all i. So, what exactly is the probability?
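To fix the setting, here is a minimal sketch of what search algorithm A (the linear scan) might look like; the function name and the convention of returning -1 on failure are my own assumptions, since the lecture describes the algorithm only in words.

```python
# A sketch of search algorithm A: linear scan for element E in sequence S.
# Returns the index of E in S (a successful search at position i),
# or -1 for an unsuccessful search.
def linear_search(S, E):
    for i in range(len(S)):   # scan positions 0 .. N-1 in order
        if S[i] == E:
            return i          # found E at position i
    return -1                 # E does not occur anywhere in S
```

The cost of a call clearly depends on where (and whether) E occurs, which is exactly why the average and worst cases are analyzed separately below.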
Well, we can first of all assume some success probability, let us say P is half. What the specific value of P is will also depend on how likely your element E is given the kind of sequences that get generated, but P = 1/2 is a reasonable assumption. So, what is the average case? Well, you will need to sum, over every possible location i in the sequence, the time required for the search if the element E was at position i, weighted by the probability of having the element in that position, plus (1 − P) times the cost of an unsuccessful search over a sequence of length N. Recall that the times required for successful and unsuccessful search were respectively determined to be 4i + 5, where i was the location of the successful search, and 4N + 2 in the case of an unsuccessful search. Substituting these values, we get the average time to be 3N + 2.5. So, it takes time that is linear in the size of the input on average. How about the worst case? Well, the worst case is exactly when the element E is not found: you need to scan the entire list, and that is something we already computed to be 4N + 2. So, the average is 3N + 2.5 and the worst case is 4N + 2; the difference is not really very significant with a linear scan. How about an alternative search algorithm which shows some reasonable difference between the average and the worst case? Remember that we have discussed such an algorithm, an algorithm that we expect to be efficient: the binary search algorithm, which basically starts with the extreme ends of the sequence S, begin and end. It terminates when S is empty, which is when begin exceeds end. Otherwise, it checks if the midpoint of S holds the number. If it is found, it exits. If not, it is going to check for the possibility of finding the element E, the number, to the left of mid.
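The average-case arithmetic above can be verified mechanically. The little check below uses exact fractions, takes positions i = 0 to N − 1 as in the lecture's cost model, and confirms that with P = 1/2 the weighted sum of 4i + 5 (success) and 4N + 2 (failure) is exactly 3N + 2.5.

```python
from fractions import Fraction

N = 100
P = Fraction(1, 2)  # assumed success probability from the lecture

# Average = P * sum over i of (1/N)(4i + 5)  +  (1 - P)(4N + 2),
# with i ranging over positions 0 .. N-1.
avg = P * sum(Fraction(4 * i + 5, N) for i in range(N)) + (1 - P) * (4 * N + 2)

print(avg == 3 * N + Fraction(5, 2))  # matches the claimed 3N + 2.5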
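```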
It is going to probe if the number is likely in this section. If yes, then it launches a binary search on the left half. Similarly, for the right half, the query is whether num is greater than or equal to S[mid], and that is highlighted here: a binary search, a recursive call, on the right-hand side. Obviously, exactly one of these two needs to be invoked. So, what is the analysis? The time taken in one function call is as follows. Count the operations that get invoked: there is the computation of mid; there are a couple of operations where you test equalities or inequalities, which comprise the comparisons; and there are also some assignments. The math operations and the assignments can be counted separately. Found, for example, is an assignment. So, there are exactly three assignments/math operations. The remaining ones are comparisons. In fact, two of these comparisons will each incur a cost of two, because they also require you to compute the element at a particular location in the sequence; there is basically an array access, or sequence access, involved. So, overall the comparison cost is five. Of course, for the final call, you do not need to invoke any of these, and as a result, in the final call you just return false and skip the expensive computations. Sorry, this is where found is true; this is the final call. I am just going back. For the final call, you can avoid doing the second access, which checks if num is less than S[mid] and exits with found equal to true. Therefore, the final call costs just three comparison units: two for the check S[mid] equals num and one for begin greater than end. The final point we note here is that the typical function call, the recursive call we make here to binary search, does incur some additional cost, a cost that needs to be factored in when you have lots of recursions. The reason is that you have to store the state of the current program before you invoke the new call.
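The recursive algorithm B described above can be sketched as follows. This is a reconstruction from the lecture's description (begin/end bounds, a mid probe, recursion into exactly one half); it assumes S is sorted in ascending order, and the function name is my own.

```python
# A sketch of search algorithm B: recursive binary search for num in the
# sorted sequence S, restricted to the range S[begin .. end] (inclusive).
def binary_search(S, num, begin, end):
    if begin > end:                  # the range is empty: terminate, not found
        return False
    mid = (begin + end) // 2         # midpoint of the current range
    if S[mid] == num:                # found at the midpoint: exit with success
        return True
    if num < S[mid]:                 # num can only lie in the left half
        return binary_search(S, num, begin, mid - 1)
    else:                            # num >= S[mid]: recurse on the right half
        return binary_search(S, num, mid + 1, end)
```

Note that exactly one of the two recursive calls fires per invocation, which is why the analysis can follow a single root-to-leaf path in the recursion tree.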
And when you come back to the original program, you need to retrieve the old state. All of this gets stored in a stack, and that makes a function call more expensive. We have assigned an arbitrary additional cost C for function calls. A general note is that recursive calls can involve more overhead. As I already mentioned, this is because you need to save and later retrieve the parent program's state, and this is done through a stack. As an example, the factorial program can be implemented in two different ways. You can recall the factorial program. One was a recursive call: fact(n) is n times fact(n − 1). You can also have an equivalent iterative version of the program, where you iterate over indices i = 1 to n and assign to factorial the product of the factorial so far with the current element we iterate upon. This computes the same thing, but now you do not need a stack to maintain states and so on. So, wherever possible, it is advised to prefer iterative over recursive programs. Often recursive programs are compact, and for brevity you might write recursive programs, whereas while implementing you might go for the iterative versions. So, let us look at the worst case for algorithm B, and that is when the element E is not present in the array, which amounts to recursing all the way down until the search range is empty. The time required is basically C + 8 for each call except for the last one; in the last call you find that the range is empty and you exit. So, how many such calls? Well, first we can try and represent this whole search in the form of a tree. The first call is with range 0 to n − 1, but every recursive call reduces the search range by a factor of half. So, you look at n/2 and then n/4 and so on. Now, you might invoke the left or the right branch of this binary tree, but what you will effectively be doing is traversing a path in this tree.
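The two factorial variants mentioned above can be written out side by side. Both compute the same function; the recursive one pushes one stack frame per call, while the iterative one runs in a single frame.

```python
# Recursive factorial: fact(n) = n * fact(n - 1), with fact(1) = 1.
# Each call saves the caller's state on the call stack.
def fact_recursive(n):
    if n <= 1:
        return 1
    return n * fact_recursive(n - 1)

# Equivalent iterative factorial: no recursion, so no stack overhead.
def fact_iterative(n):
    factorial = 1
    for i in range(1, n + 1):      # i = 1 .. n
        factorial = factorial * i  # multiply the product so far by i
    return factorial
```

For deep inputs the iterative version is also safer in practice, since the recursive one is limited by the maximum stack (recursion) depth.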
The termination is when you hit a leaf node in the tree; termination is equivalent to the case when begin exceeds end in the recursive function. We know that the depth of such a tree, a balanced tree, is basically log to the base 2 of n; or you could also reason from the shrinking sizes of the subarrays that after log to the base 2 of n halvings you will basically have an array of size 1. So, the time required is (C + 8) times log to the base 2 of n. This is for all the calls except the last one; the last call has a cost of 6, which corresponds to the leaf. So, our recurrence relation is T(n) = T(n/2) + C + 8. How do you solve this? In the following lecture, we will talk of the Master theorem, and you can solve this using the Master theorem. Basically, the Master theorem gives you a template for solving recurrence relations in the asymptotic case; and when I say asymptotic case, I mean for reasonably large values of n. So, how do these two algorithms compare? We will assume that for algorithm B the solution to the recurrence relation is exactly what we found by analyzing the decision tree, and that happens to be (C + 8) log₂ n + 6. This is the worst case for binary search. The linear search has a worst case of 4n + 3. Which of these is faster? One might jump and conclude that log₂ n is smaller than n. But please remember that we have a whole bunch of other constants sitting here. So, assume that C = 10 and let us take small values of n such as n = 2. You will find that for n = 2, 4n + 3 = 11, which is actually less than 18 × log₂ 2 + 6 = 24, since log₂ 2 is 1. So, for 2, and in fact likewise for 3 and for 4, algorithm A actually is faster. In fact, this goes on for n up to 20. But for n greater than or equal to 21, algorithm B suddenly becomes faster. Now, the most important point to highlight here is that algorithm B becomes faster and remains so thereafter.
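The crossover claimed above is easy to check numerically. Using the lecture's assumed constant C = 10, the two worst-case costs are 4n + 3 for algorithm A and 18 log₂ n + 6 for algorithm B; the loop below finds the first n at which B beats A.

```python
import math

def worst_A(n):                       # linear search worst case
    return 4 * n + 3

def worst_B(n):                       # binary search worst case, with C = 10
    return 18 * math.log2(n) + 6

# Find the first n (starting from 2) at which algorithm B becomes faster.
crossover = next(n for n in range(2, 1000) if worst_B(n) < worst_A(n))
print(crossover)  # prints 21: B is faster from n = 21 onward
```

Because 4n + 3 grows linearly while 18 log₂ n + 6 grows only logarithmically, once B wins it wins for every larger n, which is the monotonicity discussed next.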
Which means for all values of n exceeding 20, you do not have to bother about algorithm A. So, there is a kind of monotonicity in terms of improved performance. And this actually is consistent with experimental observations for different values of n. In fact, this is the kind of analysis that is desirable: you get your insights and set your expectations before you even perform the experiment. Now, can we leverage this notion of monotonicity and define a notion of complexity? Yes, indeed we can. There is something called asymptotic analysis. Asymptotic analysis is really about exploiting, leveraging, this monotonicity; it is about analyzing and comparing running times only when the input size is large. That is, we focus on the analysis for n (more specifically, we use capital N) when N is very large. For small inputs, even a bad algorithm such as linear scan will perform better, and we did find that for n up to 20, a bad algorithm like linear scan did perform better. But believe me, nobody in practice uses linear scan for searching, especially if the array is sorted. The other thing is, we also want to get rid of unnecessary details. And we do that by focusing on the order of growth. Now, what does this mean? Well, this means that we do not want to make any serious distinction between 40n + 400 and 2n + 1. Now, this might be surprising for beginners. But the point is, for large values of n, we have other, more serious concerns. The way 40n grows with n is not the way n² grows with n; n² grows much faster than 40n. So, we want to make a big distinction between 2n + 1 on the one hand and 400 log n + 1000, or as I pointed out, n², on the other hand. The order of growth is really based on what we want to distinguish from what. We want to be able to say that 2n + 1 is slower than 400 log n + 1000 and that it is faster than n².
How do you say that? The intuition is as follows: the function n grows faster than log n. So, we are talking about the rate of growth. For those who are familiar with calculus, you could look at the derivative: the derivative dn/dn is 1, whereas the derivative of log n with respect to n is 1/n. So, we have some intuition, for continuous values, for real numbers for example, that n does grow faster than log n. Now, can we articulate this algorithmically? We try and build on this intuition. The intuition is that irrespective of the constants involved in the analysis, we know that after some large value of n, algorithm A will become slower than algorithm B. Here, "slower" is where we talk about the rate of growth. So, recall that T(n) for algorithm A was basically proportional to n, and we will give it a name: linear. For B, T(n) was proportional to log n, and we give it the name logarithmic. What we have done in this process is ignore constants. Constants do not matter; what matters is that one was linear and the other was logarithmic. But again, we need something more formal. So, before we look at the formal definition, let us just plot these different functions and convince ourselves that log n indeed grows much slower than n. Well, n we expect to be linear. What we have also shown here are other functions such as n log n, which grows faster than n but slower than n²; n² and n³ and so on grow really fast; and 2 to the power n is something we would never like to have. So, the running time of an algorithm, characterized this way, is called its time complexity. We have already motivated that we would like to look at time complexity in terms of the dominating terms, or what we call the orders, which basically brings us to the order of complexity of an algorithm. Let us discuss the order of complexity in some specific detail. One of the very well known and popular notions of order of complexity is the big O.
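In place of the plot, a quick numeric table makes the same point: for the sample sizes below, the functions discussed are strictly ordered as log n < n < n log n < n² < 2ⁿ.

```python
import math

# Sanity-check the growth ordering discussed above for a few sample sizes:
# log2(n)  <  n  <  n*log2(n)  <  n^2  <  2^n.
for n in (16, 64, 1024):
    values = (math.log2(n), n, n * math.log2(n), n ** 2, 2 ** n)
    assert list(values) == sorted(values)  # each function dominates the last
    print(n, values[:4])                   # 2^n omitted from print: it is huge
```

Of course, a finite table is only an illustration; the formal way to compare growth rates is the big O definition that follows.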
It is called the big O and denoted by this big calligraphic O here. We say that the time complexity of an algorithm with input size n is big O of f(n), where f(n) is a function, if there are positive constants c and n₀ such that T(n) ≤ c·f(n) whenever n ≥ n₀. That means, well, you might have had T(n) exceeding the function f(n) up to some point, but if you are guaranteed that beyond the value n₀, T(n) is consistently at most c·f(n), then you would say that T(n) is big O of f(n). Stated in words: you can find a point n₀ after which T(n) is smaller than the linearly scaled version of f(n). We have allowed some scaling factor c, because it is possible that without scaling, T(n) might go above f(n); we are allowed to suitably scale f(n) so that the order of growth is still respected. What you do not want is that for arbitrarily large values of N, T(n) overshoots c·f(n). The key here is that c must be independent of n: once you have determined c, you need to stick to it. The constant c helps ignore multiplicative constants, as I just pointed out, and n₀ helps you ignore any additive constants. The whole idea is to focus on the dominating term in n. Thank you.
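The definition can be illustrated with concrete witnesses. For example, T(n) = 2n + 1 is big O of n, because c = 3 and n₀ = 1 satisfy T(n) ≤ c·n for all n ≥ n₀ (2n + 1 ≤ 3n is equivalent to n ≥ 1). The finite loop below only illustrates the definition on a range; it is not a proof, which requires the inequality for all n ≥ n₀.

```python
# Witnesses for the big O definition: T(n) = 2n + 1 is O(n) via c = 3, n0 = 1.
def T(n):
    return 2 * n + 1

c, n0 = 3, 1
# Check T(n) <= c * n on a finite range n0 .. 9999 (an illustration only).
assert all(T(n) <= c * n for n in range(n0, 10_000))
print("2n + 1 is O(n) with witnesses c = 3, n0 = 1")
```

Note how c absorbs the multiplicative constant 2 and n₀ absorbs the additive constant 1, exactly as the definition intends.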