Good afternoon and welcome back. As I said earlier in the morning, I want to briefly discuss the efficiency of our programs. It is important in real life, and I will explain with some examples what can happen if we are not careful with our programs. We should tell our first-year students at an early stage that efficiency matters: in spite of the fact that computers are very fast, program execution time remains a parameter of importance, and it should have a bearing whenever we design algorithms. So we are going to discuss the efficiency of our programs and the notion of time complexity. We will take the example of estimating the value of pi, and in the lab this afternoon you will repeat a similar counting of execution times for the Fibonacci series. The assignment has already been made and I think it is on Moodle. I acknowledge, of course, some slides courtesy of my colleague Prasam Malinswani, who introduced the estimation of the value of pi in his course when he taught CS101. I have taken the basic theme and extended it to illustrate the notion of time complexity, or the efficiency of our programs. First of all, some comments on computational time. While the computer works very fast, it does take a finite amount of time to execute any instruction, so every computation takes some time. At one time this used to be in seconds; it then became milliseconds, then microseconds, and now there are machines which can execute instructions in a few nanoseconds. But there is still a non-zero, finite time required to execute any instruction.
The challenge is that if we are not careful in designing our program, we may force the computer to execute instructions unnecessarily, without adding value to our final solution, and that is what we are trying to look at. When we say that we want to design efficient algorithms, we mean that we want to achieve the same results, but we want to see whether we can do so in less time by writing the program in some particular fashion, or by identifying costly operations and replacing them with less costly ones. The opportunity arises naturally when we have iterative solutions, which is the case in most of the real-life problems that we encounter. Very roughly, the order of magnitude of the time required to execute a program is called its time complexity. We will introduce the notion of order of complexity towards the end in a very simple fashion, which is what I would like you to look at. But the important message that we want to convey to our students is that they should design their algorithms, the steps in their programs, such that the execution time is minimized. So we start with an example. This example is estimating the value of pi. As you know, pi is related to any circular shape. In particular, the area of a circle is given in terms of pi: pi r square is the area. If I consider a unit disk, that is, a disk whose radius is 1, then the area is exactly pi. If the area of the circle is pi, then the area of a quarter circle, which is shown here as a shaded region, is pi by 4. We could of course look at the full circle, but it is symmetric: the four quarter circles have the same area. So if we can estimate something about one quarter circle, it will apply to the other parts of the circle as well. That is common sense.
What we do is this. We have shown here a rectangle drawn on two radii which are 90 degrees apart, so what I have is actually a square. Since the disk is a unit disk, this is a unit square: the length is one and the height is one, a one by one square. And inside it I have a quarter circle whose radius is also one. How do I estimate the area of this quarter circle? If I could estimate the area, I could relate it to pi. The way I estimate it is that I take this square and assume this is the x axis and this is the y axis. On these axes I put up a series of points, a number of points here and a number of points here. Assume that I put n points in the x direction and n in the y direction, in a grid, so I will have n into n points across the square. I have tried to show this through the diagram here. I have effectively discretized the area which falls within this square. You will notice that if I have equidistant points, then some of the points will be outside the circle and some will be inside. If I have a very large number of points, then the total number of points is representative of the area of the square, whereas the count of points which lie within the circle represents the area of the quarter circle. This discretization makes estimating the area easy. Notice that this length is 1, so if I have divided it into n points, each point is 1 by n away from its neighbour. Consequently, the x coordinate of the first point, for i equal to 0, will be 0; then it will be 1 by n, 2 by n, 3 by n, 4 by n. Similarly, the y coordinate will be 0 by n, 1 by n, 2 by n, etc. In short, the coordinates of any point i, j are nothing but i by n and j by n. All that I want to do is count those points which fall within the circle. So how could I do it?
I think you can form the idea very clearly in your mind. I have n square points. I will go over each of those n square points, and for every point I will examine whether it is inside the circle or outside. The trick is, how do I determine whether a point is inside the circle or outside? It is very simple. Notice that the distance from the origin to any point on this arc is exactly one unit, because that is the radius of the circle. Further, for any point, if I drop a perpendicular from that point to the x axis, I get a right-angled triangle, and I can calculate the length of its longest side, the distance from the origin. What is that? This square plus this square. So for any grid point i, j, i square plus j square is the square of this length, measured in grid steps of 1 by n. For a point which is exactly on the circle, i square plus j square will be exactly equal to n square. If it is less than n square, the point is inside the circle; if it is more, the point is outside. The count of inside points divided by n square is then approximately pi by 4, which is the area of the quarter circle. Armed with this information, I can now estimate the value of pi by just keeping a count as I go over all the points. If I am looking at this point, I can determine that it is outside the circle; if I am looking at this point, I can determine that it is inside. I will just go over these n square points and I will get my estimate. The program is actually quite simple. Let us look at it. Since I am using cout and cin, as I told you, I am using namespace std and iostream. In a regular C program, as we already know, I will replace this by #include <stdio.h> and use scanf and printf statements, which we will introduce next week, after which our programs will use them.
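To make the inside-or-outside test concrete, here is a minimal sketch in C++. The function name insideQuarterCircle and the choice of long long are assumptions made for illustration, not something taken from the slide; points lying exactly on the arc are counted as inside here, which is also a choice.

```cpp
// Grid point (i, j) on an n-by-n grid stands for the coordinates (i/n, j/n).
// The point is within the unit quarter circle when (i/n)^2 + (j/n)^2 <= 1,
// which is the same as i*i + j*j <= n*n, so no division or square root is needed.
// Assumption: long long is used so the squares do not exceed the integer limits
// for moderately large n.
bool insideQuarterCircle(long long i, long long j, long long n) {
    return i * i + j * j <= n * n;
}
```

For example, on a grid with n equal to 5, the point (3, 4) lies exactly on the arc and is counted, while (4, 4) falls outside.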
Incidentally, in the program that has been sent to you for the lab assignment, my colleague Nagesh Karmari has converted the program from C++ to C, and we use those #defines to simplify the input and output statements. Continuing with the estimation of pi: notice that I have n, I have count, which is the count of points inside the circle, and I have variables i and j. I read the value of n, which means I have n by n points. The estimation is a doubly nested iteration. What do I have to do? I have to go over all n square points. What is the easiest way of doing that? I start an iteration with i equal to 1 and let it go up to n, so every statement inside it will be executed n times. More important, i will take all values from 1 to n. Effectively, I am covering all points on the x axis, that is, all points whose x coordinate is 1, 2, 3, 4, 5 up to n. Within that, I start another iteration for j, where I vary j from 1 to n again. So for every value of i, j takes the values 1, 2, 3, 4 up to n. Since this happens for every value of i, the totality covers all the points of my hypothetical discretization of the quarter circle. Inside the nested iteration I arrive with one specific value of i and j, whatever it may be. I now need to determine whether i square plus j square is less than or equal to n square. What is the property? If i square plus j square is equal to n square, the point is exactly on the boundary of the circle. If it is less than n square, the point is inside the circle; otherwise it is outside. So I look at i square plus j square and compare it with n square. If the condition is true, I know the point is inside the circle and I simply increment count. I have started with count equal to 0.
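The doubly nested iteration just described can be sketched as a small C++ function. This is an illustration under stated assumptions rather than the program on the slide: the name estimatePi is hypothetical, the loop is wrapped in a function instead of reading n with cin, and long long is used for the counters so that the squares stay within limits for moderately large n. The last line uses the relation from earlier, that the count divided by n square approximates pi by 4.

```cpp
// Estimate pi by counting grid points inside the unit quarter circle.
// Assumptions (not from the slide): function form, long long counters.
double estimatePi(long long n) {
    long long count = 0;                      // points found inside the circle
    for (long long i = 1; i <= n; ++i) {      // every x position 1..n
        for (long long j = 1; j <= n; ++j) {  // every y position 1..n
            if (i * i + j * j <= n * n) {     // inside or on the arc
                ++count;
            }
        }
    }
    // count / n^2 approximates pi/4; the 4.0 forces floating-point arithmetic.
    return 4.0 * count / (static_cast<double>(n) * n);
}
```

Calling estimatePi(2) examines only four points and returns 1.0, which shows how rough the estimate is for a tiny grid; larger values of n bring it closer to pi at the cost of n square loop passes.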
When I complete this nested iteration, for whatever value of n I have given, I will get a count which is representative of the area of the quarter circle. Since I know the area of the quarter circle is pi by 4, because we assumed a unit disk, I can estimate pi as 4 into count divided by n square, and I will just print pi out. So the program itself is extremely simple. Let us very quickly go over it again. I want to estimate pi. I define i, j, n and count. I read a value of n. What should n be? 2, 4, 200, 2000, 20,000, 2 million? Again I take you back to our discussion on numerical computation. As I increase n, I will get closer to the true value of pi, because my points will be very close together on the grid. If n is very small, my estimate of pi will be rather inaccurate; if n is very large, it will be accurate. We are discussing algorithmic efficiency, but it is important to understand that my first target is to get the correct value of pi, because ultimately computations are done to get correct results. Anyway, going over the program once again very quickly: I am setting up a double iteration, a nested iteration, in which I make the variable i go over all the values from 1 to n. For each such value of i, I make another variable j go from 1 to n, ensuring that every possible i, j pair between 1 and n gets examined. And what examination do I do? I simply check whether i square plus j square is less than or equal to n square. If it is, the point is within the circle, as I said, and I increment my count. Otherwise I forget it and simply go to the next value of j, and so on. Finally, when I come out, I calculate pi as 4 times count divided by n square, which is my estimate of pi. So, a very simple program. Notice that most of the computation is being done here. What kind of computation is it?
I am multiplying i by itself, I am multiplying j by itself, I am multiplying n by itself. Pretty heavy: three multiplications, and each of them is carried out inside the nested iteration. So the number of multiplications carried out will be 3 times n square. There is an addition here, there is of course one comparison, and if the comparison succeeds there is an increment to the count. So here is a quiz that I have. We will have enough time; let me shut this quiz for a moment, because some of you will be eagerly reading it, although it does not matter. It is based on whatever we have discussed, which is why I deliberately went over the simple program rather carefully, and I again point out to you that most of the computation is being done here: if i square plus j square is less than or equal to n square, I increase the count. The first option says the effect will be negligible because we have declared pi as float. Actually, the effect should be negligible, but not because we have declared pi as float: pi is only the recipient of the final value, and that is expected to be float. So we have to worry about what could happen inside the expression. Option (b) says very large, because the division in the final formula, count by n square, is of type integer divided by integer. Well, yes, it is integer divided by integer, but please read the full expression. The expression on the right-hand side says 4.0 multiplied by count divided by n square. You will recall that when we discussed precedence, we said that multiplication and division are at the same precedence, followed by addition and subtraction, and that operators at the same precedence are evaluated from left to right. Here the multiplication and the division are at the same precedence, so what should happen? Of course the brackets override the precedence, so n square will be calculated separately. So consider what will happen now: n square will be calculated, and it will be an integer value.
Count of course is an integer value. However, 4.0 is floating point, and the expression will be evaluated by our dumb computer strictly left to right. Therefore, when it attempts to multiply 4.0 by count, by the rules of the programming language it is count which will be converted into floating point before the multiplication is done, and the result will be a floating-point result. That floating-point result is then divided by an integer, and again the integer will be converted to floating point. So the result will still be floating point, and there is no problem of an integer being divided by an integer. Answer (c) would be the correct answer on some machines, because the values of some terms may be beyond the limits of an integer, though that is not necessarily apparent at this point in time. It is not only n square that may be beyond the limits; i square and j square, which you are also computing, could be beyond the limits too. In any case, we modify this program, and I call it program version 1. All that I have done is convert integer i, j and n to float i, j and n. Why did I do that? Because the behaviour of this program is found to be different on different machines, I have to admit that I do not know, and cannot guess, what the effect will be. So to be prudent and on the safer side, I simply replace the integer definitions by float. After all, I am doing computations to get a real value, and there is no harm if all participating variables also store real values. Please note that I might lose a little precision, but I will not lose the magnitude. In fact, I am also making count a long, to ensure that count can hold a value as large as n square. All of this, by the way, depends upon the value of n. If the value of n that I give is 100, I will have no problem whatsoever. If the value of n I give is, let us say, 1 million, that will obviously allow all kinds of problems.
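The two points in this quiz discussion, the conversion caused by 4.0 and the limits of an integer, can be seen in a small C++ sketch. The function names below are hypothetical and chosen only for illustration; the limit check uses std::numeric_limits from the standard library and does its arithmetic in long long so that the check itself stays within range.

```cpp
#include <limits>

// All operands are int, so the division truncates towards zero.
int ratioAllIntegers(int count, int nSquared) {
    return 4 * count / nSquared;
}

// As in the lecture's expression: 4.0 is floating point, so count is converted
// before the multiplication and nSquared is converted before the division.
double ratioWithFloatingFour(int count, int nSquared) {
    return 4.0 * count / nSquared;
}

// Reports whether n*n can be stored in an int on this machine.
// The answer depends on the size of int, which differs between machines.
bool squareFitsInInt(long long n) {
    return n * n <= static_cast<long long>(std::numeric_limits<int>::max());
}
```

With count equal to 785 and n square equal to 1000, the all-integer form gives 3 while the form with 4.0 gives 3.14. On a machine with 32-bit int, squareFitsInInt(100) holds but squareFitsInInt(1000000) does not, which is the kind of trouble a value of n like 1 million invites.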
So one has to be careful in taking a call on what could be the problem area. The rest of the program is the same; it is still doing the computation as I described, and I now know that this program should work reasonably correctly. I compile this program and execute it. These are the execution results I have shown. I have used c++ instead of cc merely because I am using cin and cout. The third version of this program, which we shall discuss shortly, has been mailed to you as part of the assignment; you can download it, edit it, bring it back to the first version if you want, and test things for yourself. When you compile this program, please note that I am using -o pi, which means the name of my executable is now pi. So instead of saying ./a.out I am saying ./pi, and it will execute the compiled, executable version that was created here, called pi. When I execute it, because of my input and output statements the program asks for the value of n. Let us say I give the value 1000. With 1000, I get the value of pi from this program as 3.13755. This is not very accurate as far as pi is concerned, but I seem to be getting there. I increase n from 1000 to 10,000. When I increase it ten times, the value is 3.14119. You can see that I am inching closer towards the correct value of pi. I may still not be happy, so I will execute this program with a still larger value, and a larger value, and so on. I can do this until the limits of the representation kick in, but I can now believe that before that I will probably get the right answer. Now I talk about determining computational efficiency. Let me go back to this slide for a moment and tell you what we have achieved here. We have confirmed that our program seems to work correctly. We have executed the program for n equal to 1000 and got some value. We have executed it for n equal to 10,000 and got some value.
And we now know that if we increase n beyond this, we will probably get a better and better value, closer to pi. Now we come back to the original question with which we started this session. How long will it take to do these computations? And thereafter, is the algorithm that we have written the best algorithm that we can write? Let us examine this. First, how do I determine computational efficiency? I wish to determine how long it takes to run the program. How do I measure the execution time? One dumb way is to use my wristwatch or a stopwatch. Whenever I press return after typing ./pi, which means the program is going to be executed, I also press a button on my stopwatch, and whenever I get the final result I press it again. The difference is the total time required to execute the program. Unfortunately, when a program executes on a modern operating system, the execution time that I measure with my stopwatch is not necessarily the time that the computer spent executing the instructions of my program only, and that is what I want to know. What would the stopwatch time indicate? It indicates the total time, which also includes the operating system overheads. My program has an input statement; I have to type in the value of n. Suppose I take one hour to type the value of n. The total time on my stopwatch will be one hour and a few seconds. Does it mean that computing pi takes that long? No. So the real time, as measured by my stopwatch, is not good enough. What then do I need to know? I may want to know the actual clock time that the program has taken, including my think time, let us say, while I was giving an input. But more importantly, I want the time that the computer's processor spent on executing the instructions of my program. How do I find that out? There is no easy way of doing it externally. Fortunately, all operating systems have some utility programs which permit us to measure this time.
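As an aside, the gap between stopwatch time and processor time can also be observed from inside a program. The following is a minimal C++ sketch, assuming the standard std::chrono::steady_clock for elapsed clock time and std::clock for processor time; the names Timing, measure and waitLikeAUser are hypothetical, and the 50 millisecond sleep merely stands in for the time a user spends typing an input. Note that on some platforms std::clock does not report pure processor time, so treat the numbers as indicative.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// Elapsed clock time and processor time for one piece of work, in seconds.
struct Timing {
    double wallSeconds;  // what a stopwatch would show
    double cpuSeconds;   // time the processor spent on this program
};

// Runs work() once and records both kinds of time around it.
template <typename Work>
Timing measure(Work work) {
    auto wallStart = std::chrono::steady_clock::now();
    std::clock_t cpuStart = std::clock();
    work();
    std::clock_t cpuEnd = std::clock();
    auto wallEnd = std::chrono::steady_clock::now();

    Timing t;
    t.wallSeconds = std::chrono::duration<double>(wallEnd - wallStart).count();
    t.cpuSeconds = static_cast<double>(cpuEnd - cpuStart) / CLOCKS_PER_SEC;
    return t;
}

// Stand-in for "think time": the program waits but does no computation.
void waitLikeAUser() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
}
```

Calling measure(waitLikeAUser) should typically show a wall time of at least 0.05 seconds with a processor time close to zero, the same effect as taking a long time to type the value of n.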
So there is some utility which internally keeps time while the operating system runs our program. The Unix operating system and its variants provide a utility called time; that is the command. Instead of saying ./a.out, if I say time ./a.out, then a.out will not execute independently; it will execute under the control of that timer. The timer does not cause any interference. All that it does is the equivalent of pressing a button. However, while I am able to measure only real time externally, the computer's utility knows at which points in time the processor is executing my program's instructions and at which points it is executing the overhead instructions of the operating system. So it can discriminate between the two and collect different counts internally. And it is able to present them to me: I have calculated the time taken by your program, these are the different components, and this is the time taken. All operating systems indeed provide such utilities, whether it is Microsoft, or HP-UX, or IBM AIX, or any other Unix. This utility, by the way, may be good enough to give us a judgment on the execution time required by individual programs. Going forward, I would like you to tell your students that while in this first course they might be writing small programs of 100, 200 or 500 lines, in real life extremely complex software is being written, for example an ERP package or a banking system. When I go to withdraw 500 rupees, some bank clerk enters something on the computer, and depending upon how long that computer and the backend server system take to do my transaction, I will have to wait that much longer.
Consequently, all banks, all financial institutions, or for that matter all companies which deploy computerized systems for running applications of this type, whether it is filling up a reservation form or collecting cash from the bank or whatever, want to know what the response time of the system is and what throughput they get from the system for a given load of users. How do you calculate the load? Well, I will say that I have a centralized system implemented in so many branches. Take State Bank of India. State Bank of India has 16,000 branches; even if you forget the subsidiaries, they have 10,000 branches, and they have almost 2 lakh employees. Imagine that in 10,000 branches just 2 people are simultaneously conducting transactions. Out of 20 staff members all are logged in, but only so many transactions are happening simultaneously, in which case there are 20,000 queries coming to the backend database. What is a query? It is actually a parameter input to some program. Like our estimation program, that program will have to execute and return the results, in terms of information about the account balance or whatever. So measuring the throughput of such large and complex systems, which are being used by a large number of people, is an important real-life requirement, and simple utilities like time will not work. There are systems which have been built merely to measure execution time under such circumstances. These are the two names which I have written here. One is called LoadRunner; it is a popular tool package from Hewlett-Packard. There is a similar, equally competent utility by IBM as well; I call it a utility, but it is actually a conglomeration of large programs. Most popular amongst people like you and me, of course, is a utility which came out rather recently compared to the other ones. It is called JMeter. It was actually designed to measure the performance of Java application servers, all the end-to-end servers where the web
is an essential ingredient and the users are sitting across the web. So JMeter is one such thing. I am merely mentioning these details as a matter of interest. We are not going to use these tools, I mean our students are not going to use them, but they will find this interesting, particularly in the context of what we are now going to observe as execution times. We will all be convinced that for any program we write which requires heavy computation, we should certainly, once in a while, apply the time command and see how long the program takes. For now we will just use the time command, under whose control we would like to run our program. This is the way we measure execution time. Observe that ordinarily I would have said ./pi to execute my program. Instead I use this command and say time followed by ./pi. All that it means in plain English is: look, Mr. Computer, run this program, but run it under the control of the time utility, so that at the end you not only give me my results but also tell me some statistics of the time taken for executing the different components related to my program. This is what it does. So let us look at this for n equal to ten thousand. I give the value ten thousand, and the computer says the value of pi is 3.14119. This is the output from my program. When my program ends, the time utility prints its statistics. The statistics use some very peculiar wording. It says real, 0 minutes 3.690 seconds; it says user, 0 minutes 1.076 seconds; and it says sys, 0 minutes 0.012 seconds. What do these different values signify? We have three of them, so just remember real, user and sys. Let us understand what real, user and sys mean to us. First of all, real time means the total clock time it takes the program to execute from start to finish. Roughly, then, the real time is equivalent to the stopwatch time that I would have measured if I had followed the process that I mentioned to you, that when I
press return on my keyboard to run the pi program, I also press a button on my stopwatch, and whenever I get the result I press the button again. Whatever time duration I get is real time, in the sense that it is the time shown by my clock or my watch. Therefore real time means the total clock time it takes the program to execute. This includes the time spent by us in giving the input value. That is why the stopwatch idea, or looking at the real time alone, will not really tell us how much time my program took to run. This is where the two other times, user time and sys time, are important. They are independent time counts; sys, incidentally, stands for system. The best way to explain this is to recall the discussion that we had in the morning. We were executing a function; we were doing some computations which we desired and had implemented by instructions in our program. But we know that whenever a computer moves from one program to a function or to something else, there are some overheads. Essentially, the time which was spent in executing the operating system's instructions, all the instructions of something other than my program, is given by sys. So we have on one side the clock time and on the other side the system time. The user time is the computing time taken by the computer to execute our program. Consequently, amongst the three times displayed to us by the time command, the value which is most important and pertinent to us, and which can be considered representative of the execution time of our program, is the user time. So, to summarize: user time is the computing time taken by the computer to execute the program, whereas sys time is the time which was spent by the supervising operating system. Real time, as we have already seen, is the total time, and it also includes the time taken by me for input or whatever. So we conclude that the time
complexity of our algorithm, or the efficiency of our algorithm, is captured or measured only by the user time. Consequently, when we run the command pi for different values of n, and we have already seen 1000 and 10,000, if we additionally issue the time command and execute our program under its control, then I will not only get my results but also see these usage statistics. This is exactly what I have done with the version of the program for estimating pi that we have seen. I execute it with the time command now, so let us see what happens. For 1000 I get a user time of 0 minutes 0.000 seconds. Since the time is reported only down to milliseconds, and for 1000 it takes trivially little time to finish the computations, I do not even see the computation time reflected here. Incidentally, the real time says 2.496 seconds, and for most of this time the machine was probably waiting for me to give input or whatever it is. But now look at the execution with 10,000 as the value of n. The command is the same, time ./pi. So time will initiate the process called pi. It will say: go ahead and execute, I will monitor your time, and whenever you finish let me know; I will stop counting and tell my masters how much time was required to execute the program. Notice that, compared with the value close to 0 earlier, for n equal to 10,000 the user time is 0 minutes 1.076 seconds. This is the time required to estimate the value of pi when my n is 10,000, that is, when I have discretized into that many points. Please note that the value I am getting is 3.14119. I am still not comfortable with it, so I will run the program once again, maybe with 20,000. Now I can see something. At n equal to 20,000 the user time is 0 minutes 4.304 seconds. So suddenly it has shot up. If you go to the previous slide, the value for n equal to 10,000 was 1.076 seconds, and the value for 20,000 is 4.304 seconds. Very clearly the time is not increasing linearly: n goes from 10,000 to 20,000, which is twice as much, but the time required to
execute is not just twice as much, it is more than that. I continue this experimentation and, let us say arbitrarily, run the program for n equal to 50,000. If I do that, the value of pi that the program returns is 3.14151. Pretty close; I know the real value of pi, as do all of you, so we know that we are getting closer. But look at the time. The real time is 0 minutes 29.742 seconds and the user time is 0 minutes 26.714 seconds. So, if you forget everything else: with 10,000 it took just about a second to execute my program, for 20,000 it took 4.304 seconds, and for 50,000 it took 26.714 seconds. It is very obvious that the number is not increasing linearly but very rapidly and nonlinearly. Is it possible to further reduce the execution time? We come back to the main theme of the second session, which is that I should be able to write the most efficient code for my program. I have written this program; it is running; it is giving results. Can I make it better? Can I make it more efficient? Well, we go through the code again and make these observations. The first observation is that the major computation happens during the evaluation of the if statement. We have already seen that, where we check whether i square plus j square is less than or equal to n square and increment count if the condition is met. This computation is done within a nested iteration, each level running n times, so n square times. And we notice that while the values of i and j are changing every time, the value of n is fixed. So if n is a fixed value, why am I calculating n square again and again and again? Is that essential? Let us go back to the program and just check once: if i square plus j square is less than or equal to n square, count is equal to count plus 1. This whole if statement is executed n square times, and therefore this computation is also executed n square times. The point I am trying to make is that if n is known at the time of entry itself, then n square can be computed
once and for all and kept in a variable, and wherever I require n square I will use that variable. That way I should, in theory, avoid a very large number of multiplications, which would otherwise occur for every possible pair of values of i and j. This is the modification that I can think of. How did I arrive at it? First I looked at the execution times. Then I found that the execution times are rather high for my comfort, so I am now looking to reduce them. I know where I am doing the computations; I have already identified that this is the line. Then I read the program carefully to see whether there is anything which is fixed or constant. I notice that i is changing and j is changing, but n is not changing. So here is the modified program, version 2. This is the only line I have added; the rest is the same. What is the line I have added? I am saying integer n2 is equal to n square, which means I am taking the computation of n square out of the loop. Now look at the iteration: for i equal to 1 to n, for j equal to 1 to n, if i square plus j square is less than or equal to n2. So n2 is not a computation; it is a variable with a fixed value, and that value is fixed here at the beginning, just outside the loop. Consequently, not the comparison between n square and i square plus j square, but the computation of n square itself, which would have happened n square times because I am running the loops for i equal to 1 to n and j equal to 1 to n, is removed. That much computation is removed: I calculate n square once and replace it by the variable n2. I have similar computations for i square and j square, but since those values vary I cannot do much about them. The computing time required for each, I would think, should be comparable, because these are integer multiplications and that was also an integer multiplication. Technically, then, since there are three multiplications here, equally heavy, and I have
removed one of them, I should actually see an advantage, and my program should run in at most two-thirds of the time it took before. Sadly, that does not happen. If I run this program, these are the execution results I get. Notice that I am running it for 20,000 and 50,000. Just to show you, let me go back a few slides to where we had seen this: for 50,000 the time it took then was 26.714 seconds. Now look at this execution: 26.898 seconds. What does it mean? No appreciable change. Well, the question is why there is no appreciable change. After all, I removed a very costly operation, a multiplication, from inside a doubly nested loop and pushed it outside. Here is another quiz: the execution time for the two versions is not appreciably different. Why is it not different? This is an interesting quiz. I am speculating about what the reason could be. Choice (a): a multiplication does not take a very long time; it is the division operation and the addition operation which are time consuming. Let me go back to the expression: i square plus j square less than or equal to n2, or earlier n square. This is where most of the computation is occurring. I am speculating that it is not the multiplication but the division and addition operations which may be taking longer; there is one addition here, and therefore that may be the problem. Choice (b): since i and j are varying, computing i square and j square each takes much longer than computing n square, and that is why replacing the n square computation by something else has no effect. That is, i into i and j into j actually take longer than n into n. Why? Because, I believe, i and j are variables, they are varying, so somehow the computer takes longer to compute those multiplications, whereas n is constant, so it does that computation faster. Choice (c) says our program somehow figures out that n is not changing; it knows that I read n
and nowhere else have I assigned a value to n, and therefore n is not changing. Thus what we did by hand, saying n2 is equal to n square and then using n2 wherever we had used n square, the computer seems to be able to do automatically: it calculates n square only once and uses that value repeatedly. So I am claiming that my computer is super intelligent enough to understand this. And of course the last choice, as usual: I do not know and also I cannot guess. So this is the case.
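For reference, the version 2 change discussed above, computing n square once outside the loops and comparing against the stored value, can be sketched in C++ as follows. This is an illustration under assumptions, not the program mailed for the assignment: the function name estimatePiHoisted is hypothetical, and long long is used so that n2 stays within limits for moderately large n. The variable name n2 follows the lecture.

```cpp
// Version 2 idea: n*n is computed once, before the loops, and reused.
// Assumptions (not from the slide): function form, long long types.
double estimatePiHoisted(long long n) {
    const long long n2 = n * n;               // computed once, outside both loops
    long long count = 0;
    for (long long i = 1; i <= n; ++i) {
        for (long long j = 1; j <= n; ++j) {
            if (i * i + j * j <= n2) {        // two multiplications remain per pass
                ++count;
            }
        }
    }
    return 4.0 * count / static_cast<double>(n2);
}
```

The result is the same as for the version that recomputes n square on every pass; whether the running time differs is exactly what the quiz asks you to think about.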