Okay. Last time, a couple of things I didn't quite get to. I left you with this picture of the unimportance of constant factors, even a constant factor of several million. By the time the running time gets to be significant at all, even that kind of constant factor is down in the noise. But I should tell you there's an exception, and that is if you happen to be in the video game business, because those guys are always absolutely pushing the limits of whatever this year's hardware can do. So they really do worry about constant factors. But for the rest of us, no.

The other thing was, I showed you this frightening formula that talks about one function basically being bounded by another function: that f is, for large enough values of x, always less than or equal to some constant factor times g. But I didn't get to talking about the big theta notation, which is actually the one we're most interested in, because big O notation says f is bounded by g, but it doesn't say how closely. Take the function f of x equals 3, and the function g of x equals x to the xth power. Well, f is big O of g, but that doesn't really tell you very much, because f is so much less than g. You really want to know about how long it is going to take, not how long it isn't going to take. So the theta notation says f is in big theta of g if f is big O of g and g is big O of f. That sounds at first glance like it might be impossible, because f can't be less than g and g less than f at the same time, but the reason it's possible is that constant factor. So essentially what big theta says is that if you draw a picture of g, say it looks like this, and you multiply it by one constant and then by another constant, somewhere in between lies f. So f might vary up and down a little, but basically f stays within this envelope, pretty close to g. And that's what we really want to know: big theta. Confusingly, very often in the literature they will say big O when they actually mean big theta, just to confuse you, but you're not going to do that.

Okay, and the other thing I didn't say about time complexity last time is that there are certain families of time complexities that come up a lot in computer science, and you're going to learn, again in 61B, more than you want to know about this, but I'd like to give you a little bit of a sense of what they are. So one of the common problems for computers is you have some huge collection of data and you want to find something in it. I have the phone book and I want to find somebody's phone number. I have the human genome and I want to find some strings of a's and t's and whatever the other two are. I have the worldwide web and I want to find whatever keywords you're searching for. Those are all search problems. And the most obvious way to do searching is you have all this data, and you look at the first thing and say, is this what we want? Nope; look at the next thing. Is this what we want? No; et cetera. That takes linear time: basically, on average, you have to go halfway down the whole data set to find the thing you're looking for. Clever ways take time proportional to the log of n, which is a huge advantage. The log of 1,000 is 3 to the base 10; the clever searching methods actually work in base 2, where the log of 1,000 is about 10, but either way it's tiny compared to 1,000. So it's a big difference. And typically the way those work is you get the data in some good order first. So when you're looking something up in the phone book... actually, you guys don't look things up in the phone book, do you? You look things up on the Internet.
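In code, by the way, that obvious sequential search looks something like this. This is a sketch in the word-and-sentence style of the course, not code shown in lecture; first, butfirst, and empty? are the sentence procedures from the class library.

    ;; A sketch of linear search: look at the first thing; if it's
    ;; not what we want, keep looking in the rest of the sentence.
    ;; On average this examines about half the words: Theta(n).
    (define (search target sent)
      (cond ((empty? sent) #f)                       ; ran out: not found
            ((equal? (first sent) target) target)    ; found it
            (else (search target (butfirst sent))))) ; try the rest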
Who's seen a phone book? Great. Let's say there's a power failure and you want to call up PG&E to complain about it, and you can't look up their number on the Internet. So you drag out the phone book from the closet and you want to look for PG&E. You don't start at page one and start scanning down for PG&E, right? You say, okay, I'm smarter than this. P, that's about halfway through the alphabet, so I'm going to open it up halfway. So in doing that, you're taking advantage of the fact that the phone book is not in a random order; it's in alphabetical order. And so you can get your searching down to log n. So that's the sort of pretty clever searching technique. And then if you're really clever, you can actually search for things in constant time using an even better data structure called a hash table, which you'll learn about next semester. So all of these things are this sort of searching family of running times.

And then these guys are for sorting, which is the other canonical problem to do with a computer: putting some data in order. We looked last time at insertion sort, which took theta of n squared time, and all the obvious ways to sort take n squared time. The clever ways to sort take n log n time; those are the ones that work by cutting the whole data set in half and sorting the two halves, but each of those halves you cut in half, et cetera. So that's where the log n comes from. And it is theoretically impossible to do better than n log n for the kind of sorting that works by comparing two values to see which one is smaller. N log n is the minimum.

You can do better than n log n in special cases, of which my favorite example is this. Let's say we have the name and address of everybody in Berkeley, which is 100,000 people, and we'd like to sort them by zip code. Well, the right way to do that is not any of these comparison-based sorts. You set up a dozen great big buckets on the floor, because that's how many zip codes there are in the city of Berkeley. Well, roughly a dozen. And you just take an address and put it in the right bucket. And you can do that in linear time over the 100,000 people. You get to do that because the key you're sorting on, the zip code, takes only a dozen possible values. If you were sorting on the complete address, so that there are as many distinct keys as there are people, then you would have to do one of these comparison-based sorts.

Okay, so those are the first two families. There are surprisingly few, but there are some problems that take n cubed time. The canonical example is multiplying matrices, square matrices, let's say. So for an n by n matrix, if you remember how to do that, there are n squared elements in the answer, and each of those is the sum of n products: everything in this row multiplied by everything in that column. So that's n cubed multiplications that you have to do to do the matrix multiplication. There is an extremely clever, complicated way to do matrix multiplication that is theta of n to the 2.81 or so; the exponent is log base 2 of 7, some ugly number, not anything meaningful like e. But basically it's n cubed. And then here's the important point: there's a huge gap. Somebody's going to tell me I'm wrong with some example or other, but to a first approximation, you never see n to the fourth problems or n to the fifth problems. After this we get to exponential time problems and factorial time problems and the occasional n to the n time problems. How many math haters in the room? A few.
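To see where that log n comes from in code: here's a sketch of binary search over a vector of strings, my reconstruction rather than anything shown in lecture. Each probe cuts the remaining range in half, so a thousand entries take about ten probes.

    ;; Binary search over a vector already in alphabetical order.
    ;; Each step halves the lo..hi range, so this is Theta(log n).
    (define (binary-search vec target)
      (let loop ((lo 0) (hi (- (vector-length vec) 1)))
        (if (> lo hi)
            #f                                        ; not in the vector
            (let ((mid (quotient (+ lo hi) 2)))
              (cond ((string=? (vector-ref vec mid) target) mid)
                    ((string<? (vector-ref vec mid) target)
                     (loop (+ mid 1) hi))             ; search the right half
                    (else (loop lo (- mid 1))))))))   ; search the left half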
You understand, right, that 2 to the n is way, way, way bigger than n cubed. It grows much, much faster. Even the math haters know that, I hope. These problems are called intractable, meaning, yeah, theoretically we can write a program to do these things, but in practice, for significant-size data sets, the program would never finish running. A linear time algorithm means double the input size, double the running time. Quadratic means double the input size, quadruple the running time. But 2 to the n means just add 1 to the problem size, go from, like, n equals a million to n equals a million plus 1, and that doubles the running time. (The arithmetic behind that is spelled out below.) So that means you don't get to a million. You get to, like, n equals 50, and if you can just barely do that, n equals 51 is going to be way too long, and by the time you get to n equals 100, forget it; you'll be dead before you get an answer. So for non-toy-size problems, these kinds of running times really do mean in practice that we can't do it. And for problems like this, one of the things you do is say, well, can I come up with an approximate solution to the problem that is good enough and can be computed in a reasonable amount of time? So all of that stuff is the analysis of algorithms that you'll learn about later on. And that's all I have to say about time efficiency. I want to go on now and talk about space efficiency.

Ah, okay, a question. The question was, would you ever deliberately design an algorithm to be intractable, say, for cryptography? And the answer is, it's not good enough for your algorithm to be intractable. If there's a better algorithm, your enemy is going to use it instead of your intractable one. So, no.

Okay, space efficiency. This is the thing in the book having to do with iteration versus recursion and tail calling and all that. And I want to start by saying that this matters much less than it did back when they wrote the book. Back then, computer memories were measured in kilobytes, as opposed to gigabytes or terabytes, and you really did have to pay careful attention to how big your program's space requirements were. It still matters a little bit these days, not because you're going to run out of memory. You really are not, unless you're an astronomer or you work for the federal government on the budget. But what does happen is that as your program's memory requirements get bigger, indirectly that slows the program down. What happens is your process is competing with other programs on the computer, and your operating system is moving things in and out of memory, and the higher the memory load, the more time you lose to your program not being in memory when you need it to be. So for that reason, people still worry about space efficiency somewhat.

Okay, so the other piece of background to this topic: when I was your age (I bet you never think you're ever going to say that to anybody, right?), when I was your age, LISP, the language of which Scheme is a dialect, had kind of a bad reputation for doing everything through procedure calling instead of having specific iterative mechanisms like the for loops and while loops of that language you learned in high school, because procedure calling was viewed as being very expensive, and therefore LISP was viewed as being unsuitable for practical problems.
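Here is that doubling arithmetic spelled out, as a note; it wasn't on the board:

    linear:       T(n) = n      =>  T(2n)  = 2n                  = 2 T(n)
    quadratic:    T(n) = n^2    =>  T(2n)  = (2n)^2 = 4n^2       = 4 T(n)
    exponential:  T(n) = 2^n    =>  T(n+1) = 2^(n+1) = 2 * 2^n   = 2 T(n)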
And the big top-level point here is that you can write a program using procedure calling, using recursion as the means of expression, the form in which you write the program, and it can still come out with the same efficiency that you would have gotten from a wired-in iterative construct, provided that your algorithm is basically iterative in the first place. So if it's something you would have written iteratively, you know, in some language with for loops, you can get iterative performance even though you didn't write it that way in Scheme.

So up here we have two different versions of a procedure count. What count does is tell you how many words are in a sentence, or how many letters are in a word. And we have here two different implementations, and they're both recursive procedures. In the first one, here's a recursive call to count inside count. And the second version of count has a helper function, iter, and here's the recursive call to iter. So there's a recursive call both ways. Nevertheless, even though these are both recursive procedures, that is, the form in which you write the program is recursive for both of them, the process generated by these two programs is very different. The first one generates a recursive process; the second one generates an iterative process.

So to understand all that, you have to remember that inside the computer there are a lot of little people. Every time you make a procedure call, you're hiring a little person to solve the problem. And there are lots of little people who know how to do the same procedure, and when you call one, you give it different input values to work with, in its arguments. So they're doing slightly different jobs, but all running the same algorithm.

So the first version works like this. So as not to take so much time or board space, instead of counting "I Want to Hold Your Hand," I'm going to count "she loves you." So I hire Charlotte; that's Charlotte's task. And Charlotte says, is the sentence empty? No, it's not. So I have to do plus one of count of butfirst of this, which is "loves you." So Charlotte hires Carl to do that. Carl says, is "loves you" empty? No, it's not. Meanwhile, Charlotte is sitting there waiting, twiddling her thumbs, waiting for Carl to do the job. So Carl says, "loves you" isn't empty, so I'm going to add one to count of butfirst of that, which is the sentence "you." And Carl hires Kathy to do count of "you." All right? Charlotte's waiting, Carl's waiting. Kathy says, is "you" empty? No, it's not. So I have to do plus one of count of the empty sentence, and she hires Charles to do count of the empty sentence. Everybody with me so far? Charles says, is that empty? Yes, it is; I return zero. Here, Kathy, here's my zero. Now Kathy adds one to that zero and gets one, and so Kathy hands a one to Carl. Carl, in order to do count of "loves you," adds one to the one from Kathy and says, okay, count of "loves you" is two. Charlotte is doing plus one of whatever she gets from Carl; that's a two, add one to it, three. Charlotte returns three to Alonzo, who prints it out at the Scheme prompt. Okay? So as we're doing this, really no work is being done on the way into this recursion. It's only when we get down to the empty sentence that we start with zero and start adding one, adding one, adding one on the way out.
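Here, for reference, is roughly what those two versions of count look like. This is a reconstruction in the course's word-and-sentence style, since the screen itself isn't in the transcript; the helper's parameter names are my guesses.

    ;; Version 1: generates a recursive process.  The plus-one
    ;; happens AFTER the recursive call returns, so each little
    ;; person has to wait for an answer.
    (define (count sent)
      (if (empty? sent)
          0
          (+ 1 (count (butfirst sent)))))

    ;; Version 2: generates an iterative process.  The helper iter
    ;; carries the count-so-far in result, and the recursive call
    ;; to iter is the very last thing iter has to do.
    (define (count sent)
      (define (iter wds result)
        (if (empty? wds)
            result
            (iter (butfirst wds) (+ result 1))))
      (iter sent 0))

These are alternatives, of course; you'd load only one of them at a time.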
And so each of these little people has to wait for the next one in line to return a value. Now, each of these little people has to remember what sent is and who hired them and all of that, and all of that information takes up room in the computer's memory. So when we have four little people all waiting for each other, that's four chunks of computer memory in use for this computation. So this isn't a question about the running time; it's a question about the memory use, which is linear in the size of the problem. Yes? Let me get to iterative processes and then I'll answer that, okay?

Okay. Yeah, Charles has zero words, actually; he's doing the empty sentence. But no, once the computation is finished, we're not using the memory anymore, if that's what you're asking. Okay. So each little person has pockets in which they keep information. Charles has a pocket in which is the empty sentence, and Kathy has a pocket in which is this one-word sentence, and so on. Now, you're right, in this simple example it doesn't matter so much what the sentence is. Once they've made the recursive call, all they really care about is waiting for the answer. And then, who asked me? That's the important thing they have to remember: who asked me for this answer. Charles has to remember to give the answer to Kathy, and Kathy has to remember to give the answer to Carl, and Carl has to remember to give the answer to Charlotte, and Charlotte has to remember to give the answer to Alonzo. So that's what's taking the memory that's important here. Does that answer your question? Oh, I see. Okay. She's asking about the difference between a one-word sentence and a word. These parentheses right here mean that we're looking at a sentence. They're just not the same thing at all. So, no.

Okay. All right, let's do the iterative process. Carol is given the task: count of "she loves you." So what Carol does is this: Carol hires Irving to do iter of "she loves you" and zero. Irving says, is the sentence empty? No, it's not. So I'm going to hire Isabelle to do iter of "loves you" and one. Isabelle says, is it empty? No, it's not. So I'm going to hire Ichabod to do iter of "you" and two. Ichabod says, is this empty? No, it's not. So I have to hire Irene to do iter of the empty sentence and three. Irene says, is this empty? Yes, it is; the answer is three. And that is the answer to the whole problem.

In this case, we're doing the work on the way in rather than on the way out. And because of that, because when Irene is finished the whole computation is finished, that is to say, because the recursive call to iter is the very last thing that the call above it has to do, Carol can say to Irving, please do iter of "she loves you" and zero, and by the way, when you're finished, don't give the answer to me; give it to Alonzo. And then Irving can say to Isabelle, please do iter of "loves you" and one, and by the way, when you're finished, don't give the answer to me; give it to Alonzo. And then Carol and Irving go off to Hawaii. Isabelle hires Ichabod and says, by the way, when you're done, give your answer to Alonzo, and Isabelle goes off to the Bahamas. Ichabod hires Irene; same thing. All right? So because all the work is done prior to the recursive call, the recursive call is the very last piece of work to do, and there's no need for anyone to remember who's waiting for intermediate answers. And so we don't need to use more and more memory as the size of the problem goes up.
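To see the difference in the shapes of the two processes, here is a hand trace of the pending work; this is my sketch, not interpreter output:

    ; Recursive process: the pending work piles up...
    (count '(she loves you))
    (+ 1 (count '(loves you)))
    (+ 1 (+ 1 (count '(you))))
    (+ 1 (+ 1 (+ 1 (count '()))))
    (+ 1 (+ 1 (+ 1 0)))    ; ...and collapses on the way out: 3

    ; Iterative process: nothing pending; result carries the answer.
    (iter '(she loves you) 0)
    (iter '(loves you) 1)
    (iter '(you) 2)
    (iter '() 3)           ; => 3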
We just need one little chunk of memory. Yeah? No, you're not adding three and zero together at the end; the one was added to zero in between Irving and Isabelle. Okay? So, in case that isn't clear, I should point out: right here is where the work is being done. And remember, Scheme uses applicative order evaluation, which means the argument expressions are evaluated before the procedure call is done. So that adding of one happens before the recursive call to iter.

Okay. A Scheme interpreter or compiler is capable of detecting this situation. If the recursive call is the last thing that has to be done, then Scheme will say, I'm going to treat this just as if you had said for or while or goto or however it is you do it in those other languages. So you wrote a recursive procedure call, but I'm not going to actually compile a procedure call at all. I'm just going to compile: change the value of wds, change the value of result, go back to the top. Remember I said, when we learned about recursion, don't think "go back," and you shouldn't think "go back," but it's okay for Scheme to think "go back" when it's in the kind of situation where it can. That's because Scheme doesn't make mistakes about it and you do.

Yeah? Yes, the question was, is Scheme doing something different than it would have done otherwise? Yes. It says, oh look, this is a tail call, so I'm not going to do a recursive evaluation; I'm just going to change the values of things and go back and do it again.

Yeah? Oh, I'm sorry. The question is, when you load it into the interpreter, how does it know which count you're calling? You wouldn't load a file into the interpreter with two definitions of count in it; there's only one definition of count. If you did do this, whichever one you defined second is the one that's there.

Why did I make iter an internal definition instead of a separate procedure? Because nobody's going to use this iter except count; it only makes sense for counting. So, for one thing, it's just cleaner, because it keeps the definition with the use. And also, if I had wanted to make it separate, in order not to get in trouble with something else called iter for some other little procedure, I would have had to call it count-iter or something, and that would have taken six extra keystrokes twice. So it's just easier. But it's not crucial; it's not a requirement for tail call elimination.

Yeah? Ah, no. He's saying, isn't it going to be less efficient because you're defining it over and over again? We will talk much later about how an interpreter works, but for now, just take my word for it: no.

Yes, you're next. Wouldn't you have to put the function call above the definition? No. Internal defines come before the actual body of the procedure. Externally, you could define them in either order, as long as they were both defined before you called them.

Yeah? Is there a downside to internal definitions? Yes, not an efficiency one, but a debugging one. You can't easily trace an internally defined procedure. You can't say "trace iter" at the Scheme prompt, because there is no iter at the Scheme prompt; it exists only inside that procedure. So it's a little bit easier to debug if you don't.

Yeah? So he's saying this call to count, the recursive call to count, is inside the expression plus one of count of butfirst of sent, and so after we get back from count, we still have to add one to the result. Whereas this call to iter, when we get back from iter here, there's nothing left to do.
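If it helps to see what the compiler effectively turns that tail call into, here is the same computation written as an explicit loop using Scheme's do form. This is my comparison, not code from the lecture:

    ;; Each trip around the loop: change wds, change result,
    ;; go back to the top.  Same iterative process, spelled out.
    (define (count sent)
      (do ((wds sent (butfirst wds))     ; wds loses one word per trip
           (result 0 (+ result 1)))      ; result counts up from 0
          ((empty? wds) result)))        ; when wds is empty, return result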
Yeah. Now, just in case anybody was thinking of making this mistake: it's not because this call to iter is on a line by itself. Lines don't count; a newline is the same as a space. It's got nothing to do with the notation. It has to do with what work still has to be done. In fact, here's something I'm surprised nobody's asked yet. This call to iter is inside a larger expression, namely this if. Right? So how can it be a tail call if it's inside an if? Who can tell me the answer to that? Yeah. Because if is a special form. And in particular, its evaluation rule is, it looks at its first argument, and then, depending on whether that's true or false, it evaluates either the second or the third argument, and that's the end. So yeah, it's the special-form-ness that makes this work. Okay. Let me make sure I haven't forgotten to say anything. Oh, there's lots I've forgotten to say, but not about this. So is it okay if we move on? Yeah. Good. I hope so. Better be, because I'm moving on.

Okay. For the benefit of those three math haters: Pascal's Triangle. Each number here is the sum of the two numbers above it. Pascal's Triangle is useful for a bunch of things, but I'm not going to tell you that, because either you already know or you don't care. But I am going to show you this procedure, Pascal, up at the top of the screen. Oh, I'm sorry, except for the numbers at the ends of each row, which are always one. So Pascal takes a row number and a column number, counted from zero in both cases. If column is equal to zero, that means we're looking at the left edge of the triangle, and the answer is one. If column is equal to row, that means we're looking at the right edge of the triangle, and the answer is one. Otherwise, add these two numbers from the previous row. So this procedure very straightforwardly says what I just said to you at the board about how to compute Pascal's Triangle. So if I do the particular example that I circled up on the board, that's row five, column two, counting from zero, and sure enough the answer is ten. It's not computing five times two; it just happens to come out that way. And just to prove it, let's try a bigger one. This is interesting. That took quite a while, didn't it? Why? Because this is a theta of two to the n algorithm. To get this number, I have to do these two numbers. To get the four, I have to do two numbers, and to get the six, I have to do two numbers. So that's four numbers on this row, eight numbers on this row, sixteen numbers on this row, et cetera.

Yes? Is there any way you can do this iteratively? No. Well, not with anything like this algorithm, because there are two recursive calls here, and they can't both be the last thing that has to be done, right? No. But we can make this run faster, by running this more complicated program here. What this does looks at first glance like it ought to be very inefficient, because it has this helper function, pascal-row. So pascal-row of six produces all the numbers on row six. And to get a single element of Pascal's Triangle, up at the top of the screen there, I compute the entire row, and then I pick out the value in the column I want from that row. So that sounds like it ought to be inefficient, because in order to compute this value here, what I really need is kind of a diamond shape of numbers that looks like this, and there are some more numbers down to the sides that I don't need. (Further down the triangle, the difference would be more evident.) So I'm actually computing twice as many numbers as I need to, roughly speaking. So it ought to take twice as long.
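Here is roughly what the two procedures might look like. This is a reconstruction; the names pascal-row, next-row, and new-pascal are my guesses rather than necessarily what was on the screen:

    ;; The straightforward version: Theta(2^n).
    (define (pascal row column)
      (cond ((= column 0) 1)                        ; left edge
            ((= column row) 1)                      ; right edge
            (else (+ (pascal (- row 1) (- column 1))
                     (pascal (- row 1) column)))))  ; the two numbers above

    ;; The row-at-a-time version: build all of row n from row n-1.
    (define (pascal-row n)
      (if (= n 0)
          '(1)
          (next-row (pascal-row (- n 1)))))

    ;; (next-row '(1 4 6 4 1))  =>  (1 5 10 10 5 1)
    (define (next-row row)
      (map + (cons 0 row) (append row '(0))))

    (define (new-pascal row column)       ; pick one value out of the row
      (list-ref (pascal-row row) column))

Computing the whole row instead of just the diamond is roughly twice the numbers you strictly need, so you'd expect roughly twice the time.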
But in fact, it takes no time. Why is that? Well, because the simple version is computing some of the numbers more than once. In order to get the ten, I need this four and this six. In order to get the four, I need this one and this three. In order to get the six, I need this three and this other three. And that just gets worse and worse and worse as you go down. So we're computing the same values of this function over and over again, and that's where most of the time is going in the simple version. Whereas by computing the entire row one, and then the entire row two, and then the entire row three, we compute more numbers than we need, but we only compute each number once. And it turns out to be the difference between exponential time and quadratic time.

So the moral of the story, and this is much more important than any of that stuff about space efficiency and tail calling: the moral is that you have to think hard about what's making your program inefficient and how to achieve a better order of growth. And it's the order of growth that matters. This code is much more complicated, so the constant factor, if you compare this to the original version, is going to be greater. But that is so swamped by the fact that it's quadratic versus exponential time that it doesn't matter.

Next point. For those of you who didn't hate high school math, you might remember this formula, which computes one element of Pascal's Triangle without looking at the row above it: row factorial, divided by column factorial times (row minus column) factorial, the binomial coefficient. You just take which row you're in and which column you're in, and take the factorials of those three numbers. How long does it take to compute a factorial? Theta of what? N. How long does it take to compute three different factorials? Theta of what? Still n. So if you know this formula, you can do even better than n squared; you can do it in linear time. So the moral of that story is: an ounce of mathematics is worth a pound of computer science.

Let me answer your question from before, the one I promised to answer, about whether you should prefer iterative processes because they're more efficient. My answer to that is no. And here's why; let's go back to these two examples. When I read the first version of count, I have no trouble at all understanding what it's doing; it's perfectly straightforward and obvious to me. When I read the second one, I have to think about it a little harder, and you're more likely to make mistakes writing it. So this is one of those things where, as I said, first get the program to work, and then worry about getting it to work efficiently. If I were you, at this stage in your life, I would write things as recursive processes rather than iterative processes almost all the time, because you'll get it right the first time, instead of not getting it right the first time, which matters a lot if that first time is on a midterm. And also remember that it's only space efficiency we're talking about; both of these versions of count take linear time. So in terms of how fast it is, there's not going to be a noticeable difference one way or the other. So go with recursion, but you should know what iteration is all about. See you Monday.