Okay, so two announcements. Today is the first feedback session on your final project. So we'll move back to Soda, and the first 10 teams will meet with me 5:15 to 6:15. There are some really exciting proposals. I look forward to talking with you. The next announcement: again, it's Tuesday, so the next project assignment is out. This one should be fun. It's a culmination of the parser part and the compilation part, so you will write a few little languages. Essentially, putting together everything you've built so far. What we'll do today is look at another family of statically typed languages. Last time we looked at Java: what static type checks you do during compilation of the program, and how you delay some of the checks to run time so that the type checker isn't too strict and doesn't reject programs that might actually very well be correct. And the Java type system is fine in that it gives you not only a sense of documentation about your programs, but also a lot of strong type properties, type checks at compile time. You know that certain things will not crash when you run the program. For example, what are the things that are guaranteed not to cause you trouble once you run the program, in a Java program versus, say, a Python program or a JavaScript program? One example: when you access an instance variable on an object, it is guaranteed that the object is going to have that instance variable. But you cannot guarantee in Java, for example, that the object is not null. That runtime error you can still have. What are some of the other things that are guaranteed not to happen? Buffer overflows, but those are guaranteed not to happen in Python as well, so in that sense Java doesn't give you any stricter guarantees; those checks are delayed till runtime in Java as well. Some other things that Java checks at compile time? Exactly. So the arity of functions, and the types of arguments to functions for that reason as well.
And also because it knows what types these are, the compilation can be better. The objects don't need to be hash tables, associative arrays in other words; they can be structs like in C, the way you learned, presumably, in 61B. But Java is not the end of the evolution of programming languages. And in fact, new statically typed programming languages like Scala, an evolution of Java, are inspired by ML, a language that uses static types but uses them differently. And so what we'll try to do today is essentially reinvent a little bit of ML. We sort of succeeded in reinventing parsing by looking at the fundamentals, what the grammars are, and building a parser in Prolog, and then reinventing CYK, and then optimizing it into Earley. So we did, from basic principles, quite a bit of the history of computer science. So today we'll do a little bit of ML. So the way we'll start is, well, actually, let me tell you what happens in Java: in Java you actually name the types of the parameters. But why would we do that if we can infer those types automatically, without these annotations? So let's start with this factorial program. It's essentially written in the 164 language. Could you actually look at that and see what the type of the function is: what arguments it accepts, what type of arguments it accepts, and what types of values it returns? It accepts some sort of a number. How do you know it's a number? So here is the argument, and we apply multiplication on that number. So there is another operation on n. So n definitely will be a type that supports multiplication and subtraction. So we know that much about the parameter of the function. What do we know about the return value? So that's another operation. So what we know about n is that it needs to support equality, multiplication, subtraction. So the return type must be either the type of one or the result of the multiplication. These are the return values.
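For reference, the factorial the lecture reasons about might look like this; this is a sketch in Python syntax standing in for the 164 language (the original carries no type annotations, which is exactly the point):

```python
# A dynamically typed factorial: nothing in the source says what
# type n is. The inference has to discover that n supports ==, *,
# and -, and that the return type matches the literal 1.
def fact(n):
    if n == 0:
        return 1
    else:
        return n * fact(n - 1)
```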
So the reasoning that you've done is correct. The argument could be anything that supports equality, subtraction, multiplication. And the return value is the type of one, and it must be something that supports multiplication. Reasoning about this manually is fine, but when the programs grow big, think 100,000 lines of code, that sort of reasoning might be problematic. So we want an algorithm for doing this type inference for us. So I'll erase all this reasoning. Let's now think how we would actually build a type inferencer, which simultaneously type checks the program. How would we build it for programs of that sort? So what are those programs? Imagine that these are 164 programs that support dynamic types. The values carry their types with them at runtime in a sort of tag, as is the case in our interpreter. But now I want to optimize them, or type check them for errors, at compile time. So what would that algorithm be? So, Elbon, I'll let you think a little about what algorithm you would use to do this type inference. And could we use tools that we have already developed in the course to figure out all these types? I know you know the answer. I would like others to participate. I'm sure they are almost ready to raise their hand. OK? So I like the algorithm because it is very simple and often it actually works. What you are saying is that I have a function which we denote this way. It's a function from some type that we need to fill into this box, into some type. The arrow signifies a function type: something that goes from one argument type to one return type. And you're proposing to go through all the types that we have in the system, and essentially all pairs, because we need to pick a type for the argument and for the return value, and enumerate all of them, see for which the function type checks, and we are done. So that in principle would work. It would probably even be reasonably efficient, but we want something more efficient than that.
But even efficiency aside, there is a problem with that. What could be the problem with this approach? One is enumeration of the set of types. Not just inefficient, but in fact impossible. The programmer can define their own types. And yeah, maybe you can discover them from the source code of the program, so maybe that would not really be a problem. But even if you could find all the user-defined types, even then there could be a problem. What obstacle do you see to this algorithm? So the algorithm, again, is to create a list of all our types, then create a list of pairs of all possible types, run them through the type checker and see which of them check out. What could go wrong? Is it guaranteed that the set of types we might see in the program is bounded by some number, or could we create an unbounded number of types? How about if we have a list of pairs, such that the first element of the pair is a list of pairs of ints, and the second element of the pair is a list of pairs of lists? See how you could create, in Python, essentially infinitely many different types. Because you can, in your program, use data structures of arbitrarily deep nesting. Then the type checker could not just enumerate all of them, because you would never know that you have enumerated enough. And it would probably not be efficient to try types that go, I don't know, 1,000 deep, because, of course, there are just too many of them. OK, so that's a good idea. Indeed, we'll use Prolog. And we can think of it as sort of solving some constraints over types. But what we'll do, well, maybe somebody can suggest now how to do it in Prolog. So the idea is that we'll take a program and translate it into a Prolog program, whose sole purpose is not to evaluate this, not to act as a factorial, but just to compute those types. And so that Prolog program will be our type inferencer. When you run it, it computes these types for us.
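The unbounded-nesting point can be made concrete with a short Python sketch (an illustration, not from the lecture): every additional level of nesting is a distinct static type, so the universe of types a program might use has no fixed bound and cannot be enumerated up front.

```python
# Each wrapping creates a new, distinct static type.
t0 = (1, 2)   # pair of ints
t1 = [t0]     # list of pairs of ints
t2 = [t1]     # list of (list of pairs of ints)
t3 = [t2]     # ...and so on, arbitrarily deep
```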
OK, so what you're essentially proposing is to turn this program into its abstract syntax tree, which we know how to do, and propagate some information up. Now, that could be done, of course, without any Prolog machinery; we just propagate this. Well, is it guaranteed that as we go bottom up on the abstract syntax tree, we always, when we reach a node, know the types of all the variables down there in the subtree? Not necessarily, right? Because sometimes you need to know the constraints from up there. For example, in this case, perhaps you know. But if I give you another program, which has some conditional, and here you return 1, and there you return n: if you look at this program, can you tell the type of n? Can you tell the type of n by looking at this red program? You cannot. Well, it really now depends how flexible we want to be. But the reasoning we'll use is that this expression here could return what value? An integer, right? And so we'll adopt a rule that both sides of this if must return the same type. And so you see, because this one is an int, the other one must be an int as well. So if you look at this if statement, or if expression: if this is 1 and this is n, then this is int. And this is also int, for the reason that the left side of the if is int. So this one here constrains this n as well. But if you go bottom up this tree, you see n, and at this point you don't know yet the type of n. It's only when these two subtrees merge together. Exactly. But we need to accept the fact that some information, some constraints essentially, will be flowing that way. We cannot compute everything going bottom up. We can merge them, but the computation of those constraints will happen up there in the if. Is that a problem? It's not a problem, except what you are computing by the bottom-up propagation is not actually the types of things, but a Prolog program, which, when you run it after you compute everything, will give you the types.
So let's go through an example here. Yes? Well, OK, that's an excellent question. What if the else does not return the same type? What if we have written a program in which one branch of the if returns 1 and the other returns something else, say a call to a function foo that might return a string, right? We don't know. So if foo indeed could return a string, then the type checker would say: I cannot type the program. I cannot prove that indeed both the then branch and the else branch return an int. And what will the type checker do? It will reject the program as one that is not type safe. So this is an example of how some programs that would be correct in the dynamically typed setting, Python, JavaScript, and would run correctly without an error, would be rejected by the type checker, because the type checker does not know everything about the program and it does more conservative, more myopic, local reasoning. And therefore, good programs are rejected. That's the price you pay for getting some guarantees at compile time, before running the program. Not always a good trade-off, but often, when you are putting the program on a spaceship that's going to fly to Mars, you might want to have a guarantee that something won't go wrong halfway. And yes, programmer productivity might be a cost here, but programs typically are more reliable when they are statically typed. So again, we make the assumption that the types of those must be the same, because they will flow into the same, you could say, port of the function, the return port. OK, so here we are. We know that we are going to take the program and translate it into a Prolog program which, when we run it, will give us the types out. Could you suggest an approach for doing this? So let's brainstorm here. I want us to understand it by discovering it from first principles. So the Prolog program will presumably have some variables, those things with capital letters, to which we assign values by running the Prolog program.
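Here is an illustrative Python program of the kind being described (hypothetical, not from the lecture): it runs fine dynamically because the string branch is never taken, yet a checker that insists both branches of a conditional have one type would reject it.

```python
# Dynamically this program never misbehaves: foo is only ever
# called with True, so only the int branch is taken. A conservative
# static checker still rejects it, because the two branches of the
# if have different types (int vs. str).
def foo(flag):
    if flag:
        return 1
    else:
        return "oops"

result = foo(True) + 41   # runs without error
```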
The Prolog program essentially solves for the values of those Prolog variables. What will be the values of those Prolog variables? Things like seven, things like a string, or something else? What will be the domain of those values for the Prolog variables? It will be types. So we'll invent variables whose solutions, whose values on the output of the Prolog program, will be types. So what variables are we going to invent for this function here? How many do we need at least? Let's try to write them down. So we'll have two, presumably; I call them I and O. This will be the type of n, the input, and this will be the type of the return value. Do we need some more? Or would it be more convenient for us to have more variables, just to make it easier to write that Prolog program? So how about if we did it this way? Here is our AST. What operations these are doesn't matter; this is an abstract view of an abstract syntax tree. If I want to invent these Prolog variables to keep values of types, what would be a convenient place to invent a new variable? I could have a variable at each node. So we could have variables, sort of T1, T2, T3, and these would denote the types of the values that those particular subtrees compute. That would be a clean, systematic way. Perhaps we would have too many variables, but who cares? So we have those variables, but let's now come and see what we really need to do. We need to compute the type of the parameter n and the return value of that factorial function. And we also need to check whether the types we infer all check out. We'll design it in such a way that we will only be able to infer types if the function is type safe, meaning it uses those values in a way that doesn't, say, add a floating point to a string. OK, so let's write first the axioms of our arithmetic, essentially the axioms of the operations of the language. We'll do it in Prolog. So what am I saying here?
I'm saying here that the type of the value 0 is int. The type of the value 1 is int. Really, if I wanted to do it in Prolog, I would have to do this for all literals that can appear in the program. So imagine that I have done it. So this is a Prolog fact that says constant 0 is of type int. You see a problem, that it would be really hard to do it for all floating point literals. But I could create these rules on demand as I encounter the literals in the program. What other axioms do you think we may need to add to do our type checking? We need to describe the semantics of the language with some Prolog facts. So we have a multiplication in that program. How would we describe multiplication? Presumably, we want to describe it once and for all. So you will have inputs. So you are saying we would say multiplication here, with int inputs. And we probably want to show also the return value somewhere. So what would we do? We would do outputs. It is a possible way to do it. Essentially, you are saying that multiplication has two arguments, and they are of type int. And it has one output, and that's an int. OK. Do you see a problem with that? There could be a little challenge here. All right. Exactly. So we have inputs. Imagine I add inputs for float, float, and presumably also outputs for float. Now you could couple this relation for inputs with that relation for outputs and conclude that int times int gives a float, which presumably is not the case. So let's try a different way of modeling the types of multiplication, and this is actually what we'll do. Multiplication either accepts two ints and generates an int, or all three, both arguments and the return value, are floats. And now we are coupling together the input arguments and the output argument. Subtraction is the same, and equality the same too. So these rules are specific to the language. They hold for all programs. They reflect the semantics of the language.
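In Python, the coupled signatures might be sketched as data, mirroring Prolog facts like mul(int, int, int) and mul(float, float, float); the names here are illustrative assumptions, not the lecturer's code:

```python
# Each operator maps to a list of allowed signatures; a signature
# couples the argument types with the result type, so int*int can
# never be paired with a float result.
OP_SIGNATURES = {
    "mul": [(("int", "int"), "int"), (("float", "float"), "float")],
    "sub": [(("int", "int"), "int"), (("float", "float"), "float")],
    "eq":  [(("int", "int"), "bool"), (("float", "float"), "bool")],
}

def result_type(op, arg_types):
    # Return the result type for the given argument types,
    # or None if no signature matches (a type error).
    for args, ret in OP_SIGNATURES[op]:
        if args == tuple(arg_types):
            return ret
    return None
```

With this coupling, result_type("mul", ["int", "int"]) gives "int", while the mixed case result_type("mul", ["int", "float"]) gives None and is rejected.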
So what do we do next? Now we need to take this program and translate it into rules that are specific to that program, and that is the program that we'll run, and it will perform the inference. So could we start with some rules? So I spread out these programs here; can you pinpoint some point in the program that will result in some Prolog rule? So let's start with the first rule. Does somebody want to give me the first Prolog rule? You could think of it as a type constraint induced by this program. So can you think of a type constraint generated by the program? So remember we used I and O for input and output. So the type of n is I. The type of the output, I'll put it here, is O. The type of n must be comparable to 0. So let's see if we can. The type I must be comparable. Well, do we write it like this? It's a good rule, but it's not quite correct. So can you spot a little inconsistency there? Can people see what's not quite correct with this rule? So presumably this rule, let's just say where it comes from: it comes from this comparison. And we are comparing indeed the type of n with the type of 0. But you are saying here 0, whereas we really want to say what? The type of 0. So we want to have a rule which says the type of 0 is, let's call it T0. So we'll bind this variable to the type of 0. And now we'll just say here we want to move this. See, if this was written in a really type safe language, it would presumably not happen. I'll deal with it. So in the meantime, think of what other rules we want. It's a shame that this is what you need to do. I'm sure that Windows 8 is going to solve the problem. OK, so here we are. So remember what we did. We said the type of 0, and this is that 0 here, is going to go into variable T0. And now for this operation, we are going to say comparable(I, T0), where I is the type of n. What other constraints do we need to generate from this program?
How about the fact that the value 1 is going to be returned from the function? Can we turn that into another rule? So we need to get the type of 1, all right? So we have agreed that the type of the return value of the function will be stored in the Prolog variable O. So that variable could be int, float, string, whatever that function happens to return. And so this means that T1 must be essentially the same as O, right? Must be the same type. Now, O is already a variable that stores that output type. Yeah, maybe I should call it TO for output, or R for return, but it's O for now. OK, what about some other constraint? We could write it that way. If that is preferable, we could say equal. And this is just the Prolog rule equal, which insists that these two are equal. It will unify them, in the term unification sense that you know. Well, how about this? Can we turn that into a type checking rule, a constraint that must hold over the types assigned in this function? So we'll need multiplication here, for sure, which has three arguments: one argument here, the other one here, and then the return value of that multiplication. So what are we going to put here? That should be easy. This will be I, the input type, the type of n. How about the second argument? Now it gets a little bit tricky, because we are calling a function whose type checking we have not yet performed. So what simplifying assumptions are we going to make, to make it possible to type check recursive programs to begin with? And indeed, this is what we'll do. We'll rely on the agreement we made that if there is a function, it will return the same type each time it is called. In JavaScript or Lua, a recursive function could return a different type at each level of the recursion. You could write it that way, and the program could still be quite correct. Not sure that it's good software engineering; might be horrible software engineering, but the program could be correct.
But here, we agree that whatever the factorial returns, all of its instances will return the same type. So the type of this is O, and the type of that is also O. All instances of fact are O. And what is going to go here? It would be another O, but I could be a little bit more systematic, perhaps, and invent a temporary variable, I don't know, call it TM, and now say equal(TM, O). Simply, I would be saying that the type of the value of the multiplication is TM, and it must be the same as O. Essentially, I'm saying that both branches of the if return the same type. Are we missing anything? This cannot possibly be all. It's a short program, but you know short programs can still generate a lot of constraints. So what might we be missing? Can you speak louder? Uh-huh, OK. So this one, perfect. So we need to say that the sub of I and T1 needs to return I. Perfect. Good. So we have that. What else? There must be at least one more. Yeah, we should probably do it. Ideally, every sub-expression would have its own variable, as we agreed on the AST, and then we would have these equalities binding them together. But let's forget that; that's cosmetics. But there is one constraint that we should really type check. Yeah, I think I'm doing it here. I'm saying that the type of this here is T1, and I'm making it equal to the output value. And the type of this is TM, and I'm making that equal to the output value as well. So this is where it happens. Here is where I say that the two branches of the if produce the same thing, because they must both be equal to O. If you wrote a syntax-directed translation that generates the Prolog program, which you can imagine doing now, it would look slightly different, because I'm oversimplifying since I'm doing it by hand. But there is one more expression that we are not checking. Does anybody see the expression? We are not checking something that, admittedly, would not crash your 164 interpreters.
But in a language that is somewhat more decent about not allowing any values anywhere, yes, we should check that this one here is a Boolean, presumably. Let's not write it down here. So here is the whole program. And essentially, what I have done: I looked at the function factorial, and I have written down the constraints, a little more compact, with fewer equalities than we've seen before. And look what I've done. I defined a rule, factorial, that in its body has all the constraints for the statements inside that factorial. And I'm using this term here; this is our notation for I arrow O, which is the type of a function. And I could do it for all the functions in the program. And I could refer to those functions from here. So if factorial called a function foo, then foo with some arguments, I don't know, say foo(A, O, T0), would appear among the constraints. So if fact calls foo, it would be just another constraint over there. So now let's see. Let's try to actually run it. So here is the program. Here are our axioms for the arithmetic of the language. Here is what I just typed. Now we want to ask the query. This query here has one variable for which we are asking for the type, and this is essentially the type of this factorial function. It will be a function. Sorry about that. Now I can ask. So what answer do we expect? Do we expect int, something else? Int to int, and it will be in this notation, right? The term fun(int, int) is our way of writing the syntax int -> int. Should there be more solutions? So try to look at that constraint system expressed in Prolog. Should it find more solutions for that function? So how could we now make the language a little bit more flexible and allow the factorial to go from floats to floats? Because in principle, why not define a factorial over floats? So what extensions do we make? Yes? So if we do that, now we are saying that the literal 0 can be both int and float. And now I'll rerun it. These are the two solutions here.
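What the Prolog engine does for us here can be sketched in a few lines of Python; this is a toy illustration of unification over the factorial constraints, under the simplifying assumption that uppercase strings are type variables (it is not the lecture's actual code):

```python
def resolve(t, subst):
    # Follow bindings until we reach a concrete type or an unbound variable.
    while t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    # Bind unbound variables; fail on a clash of concrete types.
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return True
    if a.isupper():
        subst[a] = b
        return True
    if b.isupper():
        subst[b] = a
        return True
    return False  # e.g. int vs. float: a type error

# Constraints read off the factorial body:
#   n == 0          -> I must unify with the type of 0, i.e. int
#   return 1        -> O must unify with the type of 1, i.e. int
#   n * fact(n - 1) -> the product's type TM must unify with O
subst = {}
ok = (unify("I", "int", subst) and
      unify("O", "int", subst) and
      unify("TM", "O", subst))
# ok is True, and I and O both resolve to int: fun(int, int).
```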
OK, so let's try to understand what we have built. Yes. OK, I love it. So the question, I'll paraphrase for others, is that I'm being inconsistent, because I'm not treating the factorial, which is a function, the same way as equality and multiplication and everything else. And indeed, I should, right? Because multiplication is really not an operator; it is a function. And I should do it. The rules would be a little bit bigger, because I would need to check, yes, that it is a function, and extract the input and output types. And if I wanted it to be proper, I would do that, except the constraint system would grow a little bit out of hand if I had to do it by hand. But absolutely true. OK, fantastic comment. Now, what can I do with this Prolog program? So imagine I am a type checker, and I see in my program that somebody calls factorial with an int. This is how the type checker essentially determines what the return value is: it's an int. I can also, when I see what the return value is, determine what the input value is. It can only be int. So this is now something you can use for optimizations. Because you know the type of the variable, you can create much better code than code in which the runtime needs to distinguish between integers and floats. So this is essentially it. Now, what I want to do is switch away from this little example, where we saw how the inference happens. And indeed, it should be more proper, and operators should be handled like functions, but this is somewhat pedantic at this point. So let's look at the language ML, which is the language that pioneered this sort of type system. This is the kind of type system that you will see in your programmer life. There will be less of the Java-style type systems, and more of this, partly because it does the inference for you and you don't need to type the types, you don't need to enter them manually, but also because it is a little bit better for extensibility of programs.
And Scala sort of mixes the Java type system with ML's type system. So first let's look at function definition. So this is essentially just a purely syntactic thing. But it is interesting that when you define a function in ML, you have these two cases. Here is the first case and here is the second case. And you can reason about it mathematically. You see, oh, factorial is a function which on the argument zero returns one, and on an argument that is different from zero returns this. So you can now see that these are the two bodies of the function, and this is a case expression that does pattern matching on the input. But this is a purely syntactic thing. The inference is just like before. When you run the compiler, the compiler does the type inference that we just performed. And when it is happy and it can infer types, meaning there is no type error inside the function, what comes back from the compiler is that. So can we now decode what the meaning of this is? Can we go word by word through this and try to understand the output of the type inference? So essentially, the output of the type inference is a check for you. You type in the function; you don't need to annotate any types. The compiler does the check, comes back to you, and you look at it and see, oh, it makes sense; this is actually what I intended. It may be that it doesn't type check. It could be that it type checks to something other than what you wanted, and then you have a hint that perhaps there is a bug in your program. And in fact, type checking in these languages is so strong that it may take you a long time to fix the types, fix the bugs that manifest themselves as type errors. But once the program type checks, it's almost always going to run. There are people who will swear by this, right? Yes, debugging the type errors may take quite a bit of time, but then the program just runs, since all the bugs have usually been found while fixing the type errors. So what does this keyword mean, do you think?
So fact would be sort of a variable, but it's not quite a variable, right? The fact that this is a value rather than a variable means what? That it's immutable. You cannot reassign it the way we happily reassign variables here. So it is a value rather than a variable. But it is a symbol that has a name. And here is the type of the value stored in it. And what is the value stored in fact? It's a function. This is what you see. It's a function from ints to ints. Pretty compact and convenient. So let's look at lists. This is the cons operator, the double colon. Nothing that you have not seen, really. And so this is the same as that. So there is a little syntactic sugar to simplify this notation into that notation. And really, as you said, it is just a function, a binary function, but it has infix syntax, so you write the double colon between its arguments. No magic here by now. So look at this function here. Try to understand what it does, and tell me the type that I need to put here. So we'll do a manual type inference. So who can give the type of that function? So sumOf is a function, so it will have the type of its parameter and its return type. So let's start with the return type. What is the return type? You could cover the second line and look at the first case of the function; that should already give you a clue, if not the whole answer. It would be an int. How do you know? Well, it's enough to look at this body, and you see that this function returns an int. So the second body had better agree with it, or it would be rejected by the type checker. Indeed, we see a plus here. So it looks like it could be an int. Now we look at h, since this needs to be an int; the whole thing needs to be an int, and this is a plus which works on ints. So this then is an int. This gives us a constraint on h. So let's do this reasoning again. The return type is an int; we know it from here. So this one is an int. This is a plus that works on ints, as opposed to a plus that works on floats.
What do we know from that? We know that the return value of this must be an int, but we already knew that. So what do we know about that? h must be an int. So what kind of list is that? What kind of elements are in the list? It's a list of integers. So we'll say it this way in ML: it's an int list. If we had a list of lists of ints, it would be an int list list. That's just the syntax. And notice how we did the reasoning. We knew the output type; that gave us the type of that, which gave us the type of h through the addition. That gave us the type of the head, and now we know it's a list of ints. But there is something we need to clarify. We only concluded that this particular head of this list is an int. What about the other elements inside the list? Yeah, we'll get to it soon. I'll give you a simpler rule: in ML, in order to make the type checking work, there is the agreement that you can only have homogeneous lists, lists where all the elements have the same type. So each instance of a list is not polymorphic; it needs to contain elements of the same type. And that's another price you pay for having static type checking: you now cannot create lists where this is a string and this is an int, which you could, of course, happily do in Python. Not that you should, but you could. Yeah, so there is another way to derive that constraint: we do have a constraint on t. It needs to be a list of ints. And because the function is recursive, it falls out that all elements need to be ints. Looks like you have a question. OK, so let's get to the empty list in a second. This empty list really must be an instance of the type list of ints. Indeed, it's an instance of all kinds of lists that you can create. All right, so how about this? What is the type of this function, which is your map function, very familiar, right?
It's a function that takes a list, takes a function, and applies this function to every element of the list, and returns a list with the function applied to every element. So just to be sure we know what we are talking about: when you have a, b, c, it returns f of a, f of b, f of c, packaged into a list. So what is the type signature of that map function? So we are making an assumption here, right, that f is what? From A to A. So under this assumption, under the assumption that f is a function from some type A to the same type A, then what would this be? That function is what map receives; this is the first argument, that's the function. The second argument would be a list of As, and the result is what? Also a list of As. Do you agree with that or not? So it is correct: map is of that type. But is this the most general type? So it is correct that if I give you a function from A to A here for f, then of course the list must be a list of As, where A is some type, and the result then is a list of As. In what way is this not the most general type that the compiler can infer, please? What could f be? It could go from, yes, from A to B. So now, indeed, we have the function; the type of f is as general as it can be. It cannot be more general than that: if it's a function, it needs to go from some A to some B. Now it must take a list of As, and what it produces is now a list of Bs, right? Excellent, OK. And it has two arguments, which is denoted by this star. And you can indeed use it in two different ways: you can give it a square root and apply it to a list of floats, or a reverse function and apply it to a list of lists. It will all work, OK? So these are really polymorphic types, in the sense that now we have a list with not a specific type inside, but a type parameter. It's written as quote A, which really is a way of saying this is some alpha, OK? So this is a polymorphic function, right? Cons is a polymorphic function. Now how would we do the type inference here?
Now we have type inference with types that have this special funny type operator: some kind of list. So how do we encode the "some kind" part? Can we encode it in Prolog and somehow do a type inference where the output of the type inference in our Prolog program will be a type with an alpha in it, OK? It looks like we have some proposals. So a Prolog variable, the unknown thing, would stand for the A type. So let's see if it will work for ours. Actually, I'm not sure it will work out, but let's imagine that multiplication can work on this. Can we predict what will happen now? Maybe this is a quick hack that is not going to work, but let's see. So this one, because of other constraints, didn't give us the unknown solution. We would need to encode the full list type, and I'll try to do it on the slides before I post them. But that's essentially the solution: just use Prolog unknowns for these alphas and betas, which are the parameters of the types. Here is a different version of map that I want to use to close the lecture. I was hoping to do more brainstorming, but it actually did go quite fast. So let's see how this map differs from the previous map. Look at this definition here. Just pay attention to the arguments, and let's go two slides back, or one slide back, and see what we do here. So what you see is that these parentheses are missing. They are here on this slide, but they are not in this definition. So the first conclusion could be that, well, these parentheses in the function syntax are optional, but that's not quite the conclusion that we want to draw. What is going on here? It's quite a deep concept that you can use to build domain-specific languages by embedding them a little bit better. So what's going on? So in fact, this is the proper ML definition. This is a function with two arguments.
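The proposal above, using Prolog unknowns for the alphas and betas, is essentially unification. Here is a minimal sketch in Python of what the Prolog engine does for us; the term representation (tuples for type constructors, quoted strings for type variables) is invented for this sketch:

```python
# Types are plain strings ("int"), tuples for constructors like
# ("list", t), and strings starting with "'" for type variables.
def walk(t, subst):
    # Follow variable bindings until we reach a non-variable
    # or an unbound variable.
    while isinstance(t, str) and t.startswith("'") and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if isinstance(a, str) and a.startswith("'"):
        return {**subst, a: b}          # bind the variable
    if isinstance(b, str) and b.startswith("'"):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):          # unify componentwise
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None                          # constructors clash: type error

# Unifying 'a list with int list binds 'a to int, just as a
# Prolog unknown would be instantiated.
s = unify(("list", "'a"), ("list", "int"), {})
```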
Sorry, this is the function map with two arguments, f and the list, and the second case is f with a list decomposed into two elements. So map is a function of two arguments here, whereas before, the function was a function of one argument, and the argument was what? It was a pair, right? So really those parentheses have nothing to do with grouping parameters for the function; they are just creating pairs. So let's see what the type signature is here. If you look at the type signature, now the type signature of map is very different than it was before. So compare this with what we have here. The type signature here is what? Map takes a pair where the first part is what? It's a function. The second half is a list of alpha, and the result is a list of beta. I actually don't need the parentheses here, okay? So this pair is the argument of the function, and here is the return value. If you look here, we see the function, but now it looks like it's a function of one argument, you supply f, and the result is some sort of a function. So what does that all mean? So maybe a simpler function that we could consider is this: imagine we have a classical function of three arguments, a, b, and c, and imagine that the types of these are capital A, capital B, and capital C, and you return a value of type D. What would the signature of f be in this ML notation? It would be A goes to B, goes to C, goes to D. So what that means is that when you call a function with multiple arguments, say three of them, the evaluation proceeds as follows. You actually apply the function to the first argument, and the result is another function; now this function is applied to the second argument, and the result of that is another function, which you apply to the third argument, and then you get the result.
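The evaluation order just described, applying one argument at a time, can be sketched in Python with nested functions; `f` here is a made-up three-argument example, not code from the slides:

```python
# A curried three-argument function: its ML type would read
# A -> B -> C -> D.  Each application peels off one argument and
# returns another function, until the last application yields a value.
def f(a):
    return lambda b: lambda c: (a, b, c)

g = f(1)          # type B -> C -> D: a is now fixed
h = g(2)          # type C -> D: a and b are fixed
result = h(3)     # finally a value of type D, here the tuple (1, 2, 3)
```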
So you could think of this application of arguments as plugging in the arguments one by one, and after each application of one argument, you obtain a function in which that value is, in a sense, hard-coded. This process is called currying. Let's see what we do here. We take map, and we give it just one argument, this function. What is the result? What is the type of this thing? Can we read it out from this? Okay, so this is the type of the square root, and it is real to real, and this would be the type of square-root-all, okay? So what is square-root-all? It is a function that goes from a list of reals to a list of reals. So by applying the first argument, just the function passed into map, we did obtain a map function, but it's a map function in which the mapping function, the function that will be applied to every element, is already hard-coded. And so now you give it the second argument, the actual list of reals, and it produces the right value. So what's interesting about that is that using this method, you can now create essentially custom operators for a DSL, if we wanted to define a DSL this way. So now we have an operator square-root-all that performs a square root on lists of reals. We do not need to mention map at all; this is all we need to create it, by essentially hard-coding the square root into it. But square-root-all is still a function, except some arguments have already been provided. So this seems like a convenient way to create new operators, so why don't we do it in our Lua kind of language? Seems like it would be handy to have. It turns out that you can create embedded languages for parsers in a really beautiful way by relying on this process of currying: creating functions of, say, three parameters, supplying one, and getting a function of two. And you can write expressions that look like grammars, but they really are just function applications; visually they look like context-free grammar production rules. So why didn't we add such a currying capability into Lua?
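A sketch of the square-root-all idea in Python, assuming a curried map; `curried_map` and `sqrt_all` are invented names standing in for the ML code on the slide:

```python
import math

# A curried map: supplying just the function yields a new list
# operator with that function hard-coded.
def curried_map(f):
    return lambda xs: [f(x) for x in xs]

# sqrt_all : real list -> real list, with the square root baked in.
sqrt_all = curried_map(math.sqrt)
```

Now `sqrt_all([1.0, 4.0, 9.0])` maps square root over the list, and map is never mentioned at the call site: we have effectively minted a new operator.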
Why isn't it in Python, for example? Are we relying on some particular property of the language to make this possible, or could we just add it into Lua? So can you do such currying in Python? Can you take a function of two arguments, supply just one of the two arguments, and have the result be a function? You can clearly create an operator, curry, which receives a function and gives you back a function that is essentially a curried version of the original. But you cannot do it in Python with the syntax that we have here, just calling the function with this one argument, right? You need to call a special operator to effectively wrap your original function with a wrapper that provides that currying. Do we need that? Okay, so why would it conflict? It's an interesting language design question. So if I understand correctly, you're saying that Python does or does not have function overloading? So Python does not have function overloading, typically, right? Because in a language like Java that is statically typed, you could have multiple functions with the same name, and then you choose among them based on the types of the arguments, right? And typically you make that choice at compile time, but since the language doesn't have static types, you cannot make that choice, so you don't have that overloading. Right, so the answer is that perhaps Python has something similar in the form of default parameters. Maybe you have a function foo of x and y, where y gets the default value one, right? That's what we have in mind. And now I can call foo with value two, and I do get the result back, right? But if I do this in ML, I get something different: I get a function that still needs to receive the second argument before it is evaluated. In Python, when I call foo with one argument, it automatically supplies the value for the second, executes the body of the function, and gives you the return value.
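The "special operator" mentioned above can be written in a few lines. This is one possible sketch of such a curry wrapper in Python, not the only design; note it assumes a fixed arity, which foreshadows the restriction discussed below:

```python
from functools import partial
import inspect

# Wrap a fixed-arity function so that calls with too few arguments
# return a partially applied function instead of raising an error.
def curry(f, arity=None):
    if arity is None:
        arity = len(inspect.signature(f).parameters)
    def wrapped(*args):
        if len(args) >= arity:
            return f(*args)          # all arguments present: evaluate
        # Otherwise hand back a curried version with these
        # arguments hard-coded.
        return curry(partial(f, *args), arity - len(args))
    return wrapped

add3 = curry(lambda a, b, c: a + b + c)
```

With this wrapper, `add3(1)(2)(3)`, `add3(1, 2)(3)`, and `add3(1, 2, 3)` all evaluate to the same result, which is exactly the ML calling convention, but only because we wrapped the function explicitly.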
So it's not the same thing, right? I see, I see. Okay, I'll tell you why I misunderstood: typically by overloading we mean having multiple functions with the same name, whereas I think what you meant was default values of parameters, okay? Yeah, so that would conflict, absolutely, all right. But even if you remove it, so if you remove default arguments from Python, could we then obtain currying? Because it would seem convenient to just obtain these partially evaluated functions that you can then pass around. The arguments to functions are evaluated just like in Lua, and it's passing arguments by value: you evaluate the expression, obtain a value, and then you pass it into the function. So immutability is sort of an orthogonal concept here; we have passing by value, right? So you can do it explicitly by wrapping a function in a lambda with one fewer argument. The question is, could we get this sort of syntax where we really just call the function with one fewer argument and the return value is this partially evaluated, curried version of the function? Currying doesn't have anything to do with food, by the way; it's named after Haskell Curry, a mathematician who worked on the foundations of functional programming languages. Uh-huh, okay. So you're saying let's add that feature into the interpreter. What would we need to do? So let's do this exercise. We would presumably need to restrict the ability to give a variable number of arguments, right? The arity would need to be fixed. So when you see a call, you read the arguments from that call, okay? And then you pass them into the evaluation of the function, and if you have all of them, you evaluate the function. If you have fewer of them, you return a partially evaluated function that needs to receive more arguments later, right? Is that all we need to do, or would it break down somewhere? So it seems like it could work, right? You could perhaps even play with the syntax and avoid these parentheses, right?
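A sketch of the interpreter change being discussed, in Python. `Closure` and `eval_call` are invented names, and the body and environment handling are simplified relative to a real interpreter; the point is only the arity check at the call site:

```python
# A function value that remembers its parameters, body, defining
# environment, and any arguments already supplied.
class Closure:
    def __init__(self, params, body, env, bound=()):
        self.params, self.body = params, body
        self.env, self.bound = env, tuple(bound)

def eval_call(fn, args, eval_body):
    supplied = fn.bound + tuple(args)
    if len(supplied) > len(fn.params):
        raise TypeError("too many arguments")
    if len(supplied) == len(fn.params):
        # All arguments present: bind them and evaluate the body.
        env = {**fn.env, **dict(zip(fn.params, supplied))}
        return eval_body(fn.body, env)
    # Fewer arguments: return a partially applied function value
    # with the supplied arguments hard-coded.
    return Closure(fn.params, fn.body, fn.env, supplied)
```

For example, calling a two-parameter closure with one argument yields another `Closure`; calling that with the remaining argument finally evaluates the body. This only works because the arity is fixed, as noted above.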
So as long as you make the agreement that we are strict about the number of arguments of the function, so you don't allow a variable number of arguments, then you could call a function with a partial list of arguments, obtain that function as a first-class value, pass it around, pass it into another map, and so on. And this way functional programming would become much more interesting, because the helper functions that you need to pass to maps and folds could be created from other things, from other maps, for example, by plugging functions into them. Uh-huh. Yeah, you would have one function, that's right, one map function. It could be polymorphic, right? This function map is polymorphic in that it works on various kinds of types. But you couldn't have another one with a smaller number of arguments. You cannot have that in Python either, I believe, right? So in that sense, we would not be losing anything. So that's all. I will try to add one more example into the slides before I post them, showing where currying is useful to create operators that make functional programming somewhat easier compared to wrapping these functions into lambdas, which adds a lot of fluffy syntax, okay? So that's it for today. I look forward to the design meetings later tonight, and what we'll do on Tuesday is look at how tools can automatically analyze programs for bugs, for information that can then be used for better optimization in the compiler.