Two administrative things. One of them is we now have a room assignment for our midterm. They were going to put us in more than one room, but luckily a few people have dropped the class and we now all fit in 2050 VLSB. So that's where exams are going to be. The first one is two weeks from today; we'll talk more about that later.

Drop-in tutoring. I had a visit the other day from a student who went into the CSUA office, the one that's outside the third floor entrance, looking for drop-in tutoring. Apparently she had an unpleasant experience, which shouldn't have happened, but could have been avoided by me telling you earlier, although it isn't in the handout, where to go for drop-in tutoring. We have HKN, Eta Kappa Nu, which is the EECS honor society, and Upsilon Pi Epsilon, which is the Letters and Science CS major honor society, and both of those groups offer drop-in tutoring. They're right next to each other in 345 and 346 Soda Hall. So that's the place to go if you want help besides my office hours and your TA's office hours. Also, every once in a while I get an email from a student wanting to pay somebody to tutor them in this class. I don't have a supply of such people, but the honor societies would be a good place to look, if that's what you want. Although before you spend money on tutoring, you should make sure you're availing yourself of me and the TAs and so on. And you can ask questions on Piazza, which some of you are already doing, about the project, which is another administrative announcement: in case you missed it, you are this week working on Programming Project 1, which is in Volume 1 of the reader, the skinny one. Okay, that's it for administrative things.

The topic for this week is efficiency, which we hardly ever think about in this class, because we want you to get your program to work before you worry about how fast it is. 61B is very largely about efficiency; you'll spend a lot of time on that next semester.
But we do spend one week on it just to introduce you to the general ideas that people use in talking about the efficiency of programs. And every once in a while we'll actually have something to say about a way that you can organize your program to get big efficiency gains. Again, it's not our main focus. It's not our main topic. And it's not the main thing you should be worrying about. Right? Here it is.

Okay, so how do we measure the efficiency in time of some algorithm, that is, whether the program using this algorithm runs quickly or slowly? How do we measure that? And the obvious answer turns out not to be a very good one. The obvious answer is you look at your watch, you start the program, and you see how many seconds it takes, and that's your measure of how long the program takes to run. The problem with that as a measure is that the actual running time on a computer is affected by very many things other than the actual algorithm. It's affected, first of all, by how many years old your computer is. That's a very big factor. Another one is what else your computer is doing. And you may think it isn't doing anything, but that's not true. If you look at a list of the running processes, which all the operating systems let you do more or less easily, you will always find 20 or 30 things: programs running with names you never heard of, and you have no idea what they're doing. And the efficiency of your program is very much affected by what all those things are doing.

So to talk about the efficiency of an algorithm, we don't say "this is how many seconds it took." Instead, we actually look at the algorithm and count how many primitive, constant-time operations it does. So, for example, for our purposes, all arithmetic operations, like plus and minus, take constant time. That's actually not quite true if you're dealing with very large integers.
If you take the factorial of 500 or something, and then you try to multiply that by 3, that actually takes time proportional to the number of digits in the number, roughly. But we're not going to think about that, because mostly either you're using small enough numbers or you're using floating-point representation, which goes with powers of 10, and those things really are constant time.

So, here's an example up there. I have a procedure squares that you've seen before. It takes a sentence full of numbers. Oops, what did I just do? It takes a sentence full of numbers and returns another sentence full of numbers in which each number is the square of the corresponding number in the input. Okay? Very simple program. So what I want to do is measure the running time of this program. And I'm going to say, in squares, what are the primitive operations? There's empty?, constant time. There's if, deciding which branch to take, constant time. There's first, constant time. There's butfirst, constant time. Now, square here is not a primitive operation; it's defined up in the top line of the screen. But if you look at its definition, it just does one multiplication, so it too is constant time. So that's 1, 2, 3, 4, 5 things that are constant time.

And then this is something a little tricky, and I wouldn't expect you to know this, but sentence actually turns out to be pretty complicated in the way it's implemented, and it has different running times for different cases. But it so happens that in this program, the way we're using sentence, in which the first argument is a single word, namely the square of one number, and the second argument is a sentence, namely the result of the recursive call, this call to sentence takes constant time also. That's 1, 2, 3, 4, 5, 6 constant-time operations, plus the recursive call. Okay? The recursive call is on the butfirst of the sentence, which is a sentence that's one shorter. So how many recursive calls happen?
In this example where I gave it five numbers, how many calls to squares were there all together? Here's some fives and some sixes. Want to vote? Who says five? Who says six? Okay, a little more sixes. I say six also. There aren't six recursive calls, but there are six calls, because this one right here, maybe you weren't counting this one, is a call with a sentence of length five. So five, four, three, two, one, and zero. There's a recursive call with the empty sentence. Okay? But that call for the empty sentence only does two constant-time operations: it does the empty? and the if, and then it just returns the empty sentence, and that doesn't take any time at all. Okay?

So for squares of a sentence of length n, we do 6n + 2 constant-time operations. Right? Because the five calls with non-empty sentences have six operations plus a recursive call, and then the empty-sentence one only does two operations. So the running time of this program should be proportional to 6n + 2. Right? Yeah. Any questions so far? Great. Oh, there's a hand up. Yeah. What is n? I'm sorry, n is the length of the sentence that you give as input, the number of words. Are there questions? Yes. Yes, the two is for the if and the empty? when the argument is the empty sentence, in the very last recursive call. Okay?

Okay. We're going to see, by the way, in just a moment that this is sort of too much information, and we're going to get to that. Just a hint, coming attractions: we're going to end up saying, okay, the amount of time is proportional to the length of the input. And that turns out to be the best thing you can say about this. Okay? Now, here I have a function called sort, and what it does is it takes a sentence of numbers and it returns a sentence containing the same numbers but ordered from smallest to largest. Okay? So we're going to look at how long sort takes.
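The counting above can be made concrete with a rough Python transliteration of the Scheme squares procedure from the lecture. Using Python lists to stand in for sentences is my assumption here, not the course's notation:

```python
def squares(sent):
    """Square every number in a sentence, recursively.

    Mirrors the lecture's Scheme procedure: sent[0] plays the role
    of first, sent[1:] plays the role of butfirst."""
    if not sent:                  # empty? and if: the 2-operation base case
        return []
    # sentence(square(first sent), squares(butfirst sent)): 6 operations
    return [sent[0] ** 2] + squares(sent[1:])
```

Calling squares on a five-number sentence makes six calls in all, one for each non-empty suffix plus one for the empty sentence, which is where the 6n + 2 count comes from.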
Well, this is a little bit trickier, because sort has one, empty?; two, if; three, first; four, butfirst: constant-time operations. It has the recursive call to sort, but it also calls this helper function insert that's defined just below it. We're going to discover that insert is not constant time. So first let's look and see. Well, first let me explain the algorithm to you. This is called insertion sort. And it's what you do when you're playing cards and you have a pile of cards that have been dealt in front of you and you pick them up. If you're me, you pick them up in your left hand, because I'm left-handed; maybe you pick them up in your right hand. And then one at a time you transfer the cards from this hand to that hand. And so we start by taking any old card and moving it to the other hand. Then we take another card and we put it either before or after the one that's already there. And we take the next card, and there are three places it might go: at the beginning, inside the other two, or at the end, and so on. For each card, in whatever random order we got them in, we insert it into an already sorted hand.

So the trick about insert is that it takes one new number, and it has this argument sent, which has to be a sorted sentence of numbers. So sent is already in order, first to last, and we're inserting one new number into it. Okay, how long does this take? It's a cond, so it's a three-way choice. Here's the base case: sent is empty, in which case we return a one-word sentence containing the number. So this turns out to be constant time. If sent isn't empty, then we can compare the new number we're inserting with first of sent. We can say: is this new number smaller than first of sent, and therefore smaller than everything in sent, because we're assuming sent is already sorted? So we're comparing the new number with the smallest number in sent, which is also the first number. So this is two constant-time operations, first and less-than.
And if that's true, we do this sentence, which is constant time because it's adding one word at the front of the sentence. If we were adding the word at the back of the sentence, it would take longer, it turns out. Well, we'll get to that next week, why that is. And here's the else clause. And what the else clause does is it says: okay, this new number is not the smallest. First of sent is smaller or equal, doesn't matter. So my result should start with first of sent, because first of sent was the smallest number in sent, and it's also smaller than num, which is the new one we want to add. So therefore the first number in the result should be the first number in sent. And what do we combine that with? A recursive call in which we're trying to insert the same new number into butfirst of sent. So basically we're going to march down this sent until we hit a number that's bigger than the new one, and we're going to end up inserting it right there. Okay?

Let me actually show how this works, by saying trace sort, trace insert, and then doing this again. So it's a little tricky. Here we go. So I'm calling sort with the whole sentence. You can't see this, but it says: is the sentence empty? No, it's not. What sort does is it takes first of sent and uses that as num in the call to insert, and it also sorts butfirst of sent. So what we're doing here, we're looking at this sentence, 6 23 5 100 7. And before we can do anything else, before we can call insert, we have to sort the butfirst of the sentence. So first thing, we sort 23 5 100 7. And that starts by sorting 5 100 7. And that starts by sorting 100 7. And that starts by sorting 7. And that starts by sorting the empty sentence, right down here. So we call sort, boom, boom, boom, boom, boom. Sorting the empty sentence, that's the base case of sort; it returns an empty sentence. And now we can start inserting. So we want to sort this sentence.
So we end up calling insert with num is first of sent, which is 7, and sent is the sorted version of the empty sentence, which is also the empty sentence. So in that situation, we're in the base case of insert right away. We're inserting into an empty sentence, so it immediately returns the sentence of length 1 with a 7 in it. And because insert returns that, that's what sort returns. So the innermost sort here returns the sentence 7. And so that's this value here.

So 7 was just the first number that happened to be in the sentence, not necessarily the smallest. I'm sorry, the last number that was in the sentence; it goes right to left. So now I'm going to take the number before it, which is 100, and I'm going to try to insert 100 into this sentence. So insert says: is sent empty? No, it isn't. Okay, let me compare num, which is 100, with first of sent, which is 7. And it asks the question: is num smaller than first of sent? And the answer is no. Num is 100, first of sent is 7, so num is bigger. So therefore I'm going to do this recursive call to insert: I'm going to insert 100 into the empty sentence. That's the base case for insert, so it returns the sentence with just 100. And then we're back here, sentencing the 7, which is now the smallest number of the ones we've looked at, onto the front of the result from insert. So insert returns 7 100. So therefore this recursive call to sort returns 7 100.

And now we get the next number, which is 5. So I try to insert 5 into the sentence 7 100. This time we're lucky: 5 is smaller than 7. So without even trying, we know that 5 is smaller than 100, because we know that sent is in order, so 7 is the smallest number in it. So we can just stick 5 in the front. Insert right away returns this, and so does sort. And then we go to the next number, which is 23. We're trying to insert this here. This time we weren't so lucky: 23 is bigger than 5.
So I'm going to try to insert num into butfirst of sent, which is 7 100. 23 is still bigger than 7. So I'm going to try to insert 23 into the sentence with just 100. This time we are lucky. Insert sticks the 23 in front of the 100. Then the next call up sticks the 7 in front of that, and the next call up sticks the 5 in front of that. That's all the numbers we have, so the recursive call to sort returns this sorted sentence. Now we're all the way out to the beginning of the original sentence. So num is 6. We're trying to insert that into 5 7 23 100. 6 is bigger than 5, so we do a recursive call to insert. 6 is smaller than 7, so we just stick it in front. Then we stick the 5 in front, and that's what sort returns. That was pretty quick, but I think the algorithm is not that bad. Questions? Yeah.

Okay, good question. The question is: since sent is empty, why do I have to do this? Why don't I just return num? And if the overall input that I started with is more than one number, that would in fact work, because sentence is very forgiving; you can do sentence of a word and a word, not just a word and a sentence. But if I did it your way and I said sort a sentence with only one number in it, it wouldn't return a sentence. It would return just the number, because that's what insert would return. That's not so good, because somebody else who's calling sort expects to get a sentence, and even if it's a sentence of length one, they're going to be, you know, butfirsting down the sentence looking for a word there. So my overarching answer to this question is: domain and range. Always remember domain and range. So in particular, as we're writing insert, we say: what is the range of insert? That is to say, what kind of thing is insert supposed to return? The answer is, it's supposed to return a sentence. So that's true even in the base case. We have to make sure we're returning a sentence, and that's what this does. Okay?
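The whole trace above can be followed in a sketch of the same insertion sort in Python (again with lists standing in for sentences, my assumption, so the structure matches the Scheme on the screen):

```python
def insert(num, sent):
    """Insert num into the already-sorted sentence sent.
    The three-way cond from the lecture."""
    if not sent:                  # base case: return a one-word sentence
        return [num]
    if num < sent[0]:             # num is smaller than everything in sent
        return [num] + sent
    # otherwise keep first(sent) and insert num into butfirst(sent)
    return [sent[0]] + insert(num, sent[1:])

def sort(sent):
    """Insertion sort: sort the butfirst, then insert the first number."""
    if not sent:
        return []
    return insert(sent[0], sort(sent[1:]))
```

Running sort on the lecture's sentence 6 23 5 100 7 walks through exactly the calls traced above and returns 5 6 7 23 100.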
So it's a much better way to think about things than whether I could get away with shaving one instruction off of this program by having the base case return something that's not in the range. Don't think like that; you'll get all messed up. Wait until we get to trees: you'll die if you think that way. Think: I'm supposed to return a sentence, so I'm going to return a sentence in every one of these three cases. So you'll notice that in fact every one of these three cases calls the constructor procedure sentence to produce its result. Okay? Yeah.

Yeah, if we didn't have the se there, we'd just return the word, because that's what num is. Yeah. Here, we're talking about this. We've just checked and we've discovered that sent is empty. So we know that, and we're just making a one-word sentence. If I put sent right here, where the cursor is, before this close parenthesis, it would be okay: I would then combine num with an empty sentence. But it wouldn't help, because I know what happens when you combine one word with an empty sentence: you get a one-word sentence. So I'm just asking for that directly. It does create a sentence.

Are you asking whether it copies pointers, or changes something? Se is a function. It doesn't change any of its arguments. Okay? Functions. That's an important point. You programmed before, in high school or something, right? Yeah. We're doing functional programming. Never, ever, for the first month and a half of this course, are we going to write something that changes the value of anything. It's always going to be creating a new value. Okay? That new value might or might not share memory, and if you're worried about space efficiency, you know, the answer is it's none of your business how it does it under the hood. You just worry about the value of what you get. Okay? Yeah.

Okay, he wants me to insert values into an array. In functional programming, we do not change the value of anything.
We don't set up some data structure and then put things in it. We write a function that returns a value, period. Okay? People who programmed before in some other paradigm, you really have to try to think in functional programming terms and not try to solve this problem the way you were taught in high school or the way you were taught in community college or the way you were taught in E7. Okay? Don't think about what data structure should I build. We don't build data structures. Think about what is the value that this function is supposed to return, and make an expression whose value is that value, like this. This is an expression whose value is the thing that we want sort to return. Okay? Do that; don't think the other way. I'm not mad at you. It's a good question; everybody else wants to know that. Everybody who went to high school had the same thought in their mind. If you didn't go to high school, you're doing fine. Okay? All right.

So, the next problem: what's the running time of insert? Well, I see one, empty?; two, the cond; three, less-than; four, first; five, sentence. Sometimes we get five. Sometimes it's five, sentence; six, first; seven, butfirst, plus the recursive call. Okay? So the time for insert, our subproblem here, is either three, if the input sentence is empty; or five, if num is smallest; or something in between. The actual running time depends on how lucky we are. Maybe it's just five altogether, if num is smallest. Maybe it's seven plus five, if num is second smallest, or fourteen plus five if num is third smallest. The worst it can be is 7n + 5. Actually, probably seven times n minus one. Do I mean that? Seven times n minus one, right? No, 7n, because n is the number of things in sent, not including num. So this is the worst case. So when we're trying to figure out how long sort takes, what we're going to do is assume the worst case. That's the most conservative assumption.
We can also say that on average the number we're trying to insert is going to be halfway down, right? So we could say 7n/2 + 5, average case. So that's another way to think about it, to keep in the back of your mind. But generally speaking it's easier, and more helpful, to think about the worst case in estimating the running time of a program. Okay? So, given that, how long does sort take? Well, sort has one, two, three, four constant-time operations, plus a call to insert, plus a recursive call. And there are n recursive calls to sort, right? So the result for sort is going to be n times something, plus a constant of two. What's the something? Well, it's four constant things plus the worst case for insert, right? Okay? Which gives 7n² + 9n + 2. Everybody happy with this? Okay, somebody who's thumbs-down, ask a question. Yeah, no? What's the two? The two is the if and the empty? in the final base case, with an empty sentence. Any questions?

Oh, three? No. He wants it to be 7n + 3, not 7n + 5. Most of the time, almost all the time, the new number that we're inserting is going to be somewhere other than the very end. So when we get to a number bigger than it, we're going to do five steps. That's the second case of the cond, which is also a base case: there's no recursive call in the second cond clause of insert. Does that answer your question? Yes, the very worst case, you're right, would be that. But if you think about the big sentence that we're trying to sort, we're only going to get to that worst case maybe once, unless we're so unlucky that we get the numbers in inverse order, backwards order or something, okay? Yeah.

Ah, yes, good question. You get a gold star. This makes up for you wanting to put things in arrays. He wants to know: isn't it the case that each call to insert has a different-size sent, right?
It's not that we're going to make n calls to insert and each of those calls to insert has a sentence of length n to work with. We're going to insert into an empty sentence, and then insert into a one-number sentence, and then insert into a two-number sentence, up to inserting into an (n-1)-number sentence. So really, this should be 7n/2, and this should be 7n²/2, right? Because on average we're inserting into a sentence of length n/2, okay? Now, as I said, we're only doing one week of this; in 61B you'll get a much better treatment of this whole topic. I am hand-waving a lot by saying, okay, 7n/2. It's the right answer, but I have not proved that it's the right answer. Okay, next semester you'll actually do that. But yeah, it's actually 7n²/2 + 9n + 2 that's the right answer for how long this takes.

Okay, now that you know it's 7n²/2 instead of 7n², do you feel edified? Do you have a better understanding of this program? I don't. Because this division by 2 is the kind of change in running time that is going to be obsoleted the next time you buy a new computer, right? Next year's computer is going to run twice as fast. That would also divide the time by 2. So constant factors in running time are just not very interesting. Here's what's interesting about running time. If I want to take squares of a thousand numbers, it's going to take some amount of time, okay, whatever it is. If I want to take squares of 2,000 numbers, it's going to take twice as long, right? Yeah, everybody happy with that? Because it's proportional to n, basically. If we have three numbers, that plus 2 at the end matters. But once we have a thousand numbers, the plus 2 at the end is totally down in the noise. Basically, you double the length of the input, you double the amount of time it takes. The behavior of sort is very different.
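One way to see the quadratic behavior without any hand-waving is to count how many times insert's body runs. This is a sketch of mine, not code from the lecture, and the counter tracks recursive insert steps rather than the lecture's exact per-operation tally:

```python
def sort_counting(sent):
    """Insertion sort that counts invocations of insert's body.
    Returns a pair: (sorted sentence, step count)."""
    steps = 0

    def insert(num, s):
        nonlocal steps
        steps += 1                 # one unit of work per insert call
        if not s:
            return [num]
        if num < s[0]:
            return [num] + s
        return [s[0]] + insert(num, s[1:])

    def sort(s):
        if not s:
            return []
        return insert(s[0], sort(s[1:]))

    return sort(sent), steps
```

A descending input is the worst case here: every insert walks to the end, costing 1 + 2 + ... + n = n(n+1)/2 steps, which is exactly the n²/2 growth being discussed. An already-sorted input costs only n steps.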
If it takes a certain amount of time to sort a thousand numbers, how much more time will it take to sort 2,000 numbers? It'll take four times as long, right? Because there's an n² in there, and the square of 2,000 is four times the square of 1,000, not just two times, right? Now, there's also a linear term: it's something times n² plus something times n. But once we're up in the thousands, that doesn't matter, right? The difference between 4 million and 4 million plus 2,000 is nothing, right? In terms of the running time of your program, the question you want answered is: should I go out for dinner while I'm waiting for this result? Or is this result going to be computed in my lifetime, right? That's the kind of question you want to know. Not whether you can pin it down to one percent, or one tenth of one percent, in knowing exactly how fast it's going to be. That one tenth of one percent will be swallowed up by things like all those other programs running on the computer swapping you in and out of memory, and that will affect the running time too, okay?

So really, what we want to know about these programs is that squares takes linear time: double the input, double the running time. Sort takes quadratic time: double the input, quadruple the running time, okay? And that's way, way more important than the constant factors, because next year all the constant factors will be cut in half, and so the running time of the program will be cut in half. But that's not going to change the fact that doubling the size of the input quadruples the amount of time that it takes, okay? So that's really what we need to know about the algorithm, and we need a way of talking about algorithms that captures that essential point, okay? The difference between linear and quadratic. There is a notation for that. And I'm sorry, there's some math up on the board. If you're a math hater, this is as bad as it gets, okay? We're never going to do anything like this again.
Next semester, maybe. I'm defining a notation called Big Theta, but in order to define it, I first have to define a notation called Big O. You will see in a lot of books and papers something like: the running time as a function of something-or-other equals O(x²). But that's wrong. That notation is wrong. This equality sign, T(x) equals this thing, is wrong, because this thing is not a function. It's actually a set of functions. So what it should say, and what the careful papers do say, is T(x) is an element of O(x²), okay? And if you're really, really careful, nobody is this careful, but they should be, it will say T(x) is an element of O of (x maps to x²), because what goes in the parentheses after the O is a function, not a number. So written that way it really looks like a function.

Okay, so what does it mean? And here there are all these horrible symbols, which I am going to deconstruct for you. It says: there exist numbers N and k. By the way, N and k are both greater than zero; I forgot to put that in. Such that. One of the horrible things about mathematical notation is that if you see two vertical bars, it means absolute value, but sometimes you just see one vertical bar, and that means "such that." It's easy to get confused, but try not to. For all x greater than N. What's that about? Well, here are axes, x and f(x). This is what a linear function looks like, right? You double the input, you double the output: it's linear. Here's what a quadratic function looks like, okay? Once you pass this point, the quadratic function is always bigger, but prior to this point the linear one is bigger for a little while, right? Good. This says: I don't care about small values of x. Why don't I care about small values of x? Here's why.
If I have a thousand numbers to sort, given how fast computers are these days, the speed of the computer sorting the numbers is going to be limited by the speed at which I can type them in, realistically speaking. You know, we can figure out exactly how many nanoseconds it's going to take to do that sort, but it's going to be less than a billion of them, which is to say it's going to be less than one second. And if it's less than one second, it doesn't really matter whether it's less than half a second or not, right? It's fast enough. It's when we have a billion numbers to sort, and that takes a thousand, million, billion, trillion, quadrillion... that takes a quintillion operations, right? That's a billion squared, right? Did I get that right? I think so. Quintillion. That's a big number, all right? That's more than a second. That's when we have to worry about efficiency. So if for small values of the input size the "wrong" function is bigger, who cares? What we care about is large inputs, not small inputs, when we're talking about efficiency. So that's why there's this capital N. This is the biggest value for which the wrong function is on top, okay? And this notation says: who cares? Because for almost every imaginable problem, that cutover point is small enough that we're not talking about a significant amount of time. We're not talking about hours of runtime or days of runtime; we're talking about seconds.

Just a minute, yeah. The question was: in this class, does efficiency play a great role in grading projects? No. I think for most of the projects in this class, you'd have to work hard to make a program that's inefficient by this standard that I'm talking about, by a whole order of magnitude. If you actually did manage to do that, we might take off a little bit. We don't want you to worry about efficiency in doing the project.

Okay. The next thing to worry about is these absolute value signs. These are here to make the mathematicians happy.
If a function goes negative, it's how far away from zero it is that we're talking about, in terms of what family of functions it's in. Luckily, if you're a computer scientist, you can just read this as if the absolute value signs weren't there. Why? Because these functions that we're talking about are amounts of time, and no matter how hard you try, you're not going to write a program that solves a problem in negative time. Okay. So for our purposes, f and g are always both positive, and you can forget about the absolute value.

What about k? This is the part that says constant factors don't matter. So 7n², for our purposes, is the same thing as 7n²/2, because that factor of 2 is always a factor of exactly 2, no matter how big or small n is. Doubling the size of the input is still going to quadruple the running time; the over-2 doesn't change that. So, constant factors: if your constant factor is bad and you're trying to optimize your program to get rid of a constant factor, the way to do it is stall for a year and get a new computer. Constant factor, gone. What we care about is this sort of domination of one function by another, by more than a constant factor. So this k says the function f is less than or equal to some constant times the function g. Okay. So 3x², 4x², 1,000x² plus 20 million x: all still quadratic time. That x term, for really big x, which is what we care about, is down in the noise. For a polynomial function, it turns out it's the highest power of n, or x, whatever your variable is, that counts. Okay. Which do I do? I don't have time to do both.
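Written out, the Big-O definition being deconstructed here, and the Big-Theta definition it supports (the Theta line is the standard definition, stated here since the lecture runs out of time before finishing it), are:

```latex
f(x) \in O(g(x)) \iff \exists\, N > 0,\ \exists\, k > 0
  \ \text{such that}\ \forall\, x > N:\ |f(x)| \le k\,|g(x)|

f(x) \in \Theta(g(x)) \iff f(x) \in O(g(x)) \ \text{and}\ g(x) \in O(f(x))
```

So the N is the "I don't care about small inputs" part, the k is the "constant factors don't matter" part, and the absolute values can be ignored for running times, which are never negative.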
I'll tell you next time about the families of functions, but what I want to show you right now is from John Bentley, who wrote one of the greatest computer science books ever (not the greatest, our textbook is the greatest, but one of the greatest), called Programming Pearls, which I definitely recommend for when you're in 61B, if they don't require it. He decided to do a demonstration of this thing that I just said: that it's the order of magnitude, the exponent of the function, that matters, rather than the constant factor.

So what Bentley did was he took the Cray-1, which was at that time the very fastest computer in the world. You know, there were like five of them or something in the whole world, and they were used by the NSA to crack codes and by the Weather Bureau to compute the weather and stuff like that. And he wrote a function, in highly optimized compiled Fortran, to solve some problem that did it inefficiently, in time proportional to n cubed, but with a small constant factor. And then he took his RadioShack TRS-80, a little 8-bit microcomputer, terrible, slow, clunky, and he wrote a program in interpreted BASIC. So it had a huge constant factor, 19.5 million, but proportional to n rather than n cubed.

So here are the different values of n, and here's how long things took. For small values of n, the supercomputer with the small constant factor takes 3 microseconds versus 200 milliseconds. So that's a factor of tens of thousands better. Then 3 milliseconds versus 2 seconds: we're still a factor of a thousand better. But eventually it crosses over, and by n equals 10,000 the supercomputer with the super fast compiled program and the tiny constant factor takes almost an hour, against 3 minutes for this little dinky micro with a huge constant factor but time proportional to n. Because the difference between 10,000 and 10,000 cubed swamps even this constant factor of 19.5 million.
The last ones he didn't actually run, because you couldn't have a supercomputer to yourself for a month. But it would have taken a month on the supercomputer for something that took half an hour with the linear algorithm. And it would have taken a century for n equals a million, versus 5 hours. Now, 5 hours is a long time to wait for a result, but it sure beats a century. Okay, see you Friday.
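Bentley's table can be reproduced from the two cost functions as quoted. The 19.5 million constant is from the lecture; that both formulas are measured in nanoseconds is my assumption, chosen because it makes the quoted times come out right:

```python
def cray_ns(n):
    """Cray-1, compiled Fortran, cubic algorithm: ~3.0 * n^3 nanoseconds."""
    return 3.0 * n ** 3

def trs80_ns(n):
    """TRS-80, interpreted BASIC, linear algorithm: ~19.5e6 * n nanoseconds."""
    return 19.5e6 * n

# At n = 100 the supercomputer wins (3 ms vs about 2 s); by n = 10,000
# the TRS-80 wins (about 3 minutes vs about 50 minutes); and at
# n = 1,000,000 the cubic program would need roughly a century against
# about 5 hours for the linear one.
```

The crossover happens where 3n³ = 19.5e6 · n, somewhere in the low thousands, and past that point no constant factor can save the cubic algorithm.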