It's 9 o'clock. Nice to see everyone here so bright and early this morning. Although, what's going on? There are more people on this side than on that side. That side didn't cheer. Bad side. Tell the people on your side that they're the bad side. It's all right, I guess I'll forgive you. Why are you all on this side? That's weird. You're farther away from the slides, which makes it even weirder. All right. So today we're going to finish up Hindley-Milner type inference. So what is the goal of Hindley-Milner type inference? What do we want it to do? What's our end game here? Check types. Are we checking, though? I mean, we are. Well, it's so that you don't have to write the types out yourself. Exactly. Yeah, so we are checking types, but we're also inferring types, right? That's the really cool thing: we're inferring, based on usage, what the types have to be in order for the program to type check. It's just like what you're doing in project four: you're not only doing type checking, you're inferring the types based on usage. If you see an integer plus a variable that doesn't have a type, then you know that variable must be an integer, right? This is a lot more complicated because we have functions and we have array accesses. We have a lot more stuff here, but the core concepts are the same. So that's the goal: we want to look at all the types in the program and understand what those types could possibly be. So we're going through each of the constructs in our language to see exactly what constraints it imposes on the types. And this is very similar to what you're doing in project four: we have the constraints of the language, and it's your job to go through and enforce those constraints, right?
And if you get to a point where you can't enforce those constraints, then you know you have a type error. Okay, so I believe we left off at function application. So we have a node, the apply node, which represents the function application. We're going to say that it has some type r, foo has some type f, and t1, t2, all the way to tk are the types of the parameters. So what constraints can we enforce on these types based on this function application? Do we know anything about, let's say, the type of x1? Do we have any constraints on that? I see some shaking, maybe nodding. Those are early-morning neck muscles. Yes, why? Why no? You're shaking your head now. Why no? Because I just thought no. Why? Tell me why. Because it hasn't been declared as anything yet. Foo's been declared, but not x1 or x2. Yes, right. So x1 and x2 are just parameters being passed into this function. At this point, they could be anything. We don't know exactly what those types have to be. It's not like we can say they have to be an int, like on an array access, right? So what do we know here? Do we know anything about, let's say, the type of foo? It's a function. Yeah, right. This seems kind of trivial, but that's actually the little piece of information that we're going to build upon. So f is a function. So what's the type of that function? A function of what? Of t? There aren't any plain t's, though. We have r, f, and t1 through tk. The return type is r, definitely. So we know that whatever foo returns must be the same as the type r of this apply, right? So we know foo is a function. How many parameters does foo take in? k parameters. Do we know the types of all those parameters? Not exactly, but we know they have to be t1, t2, and so on. They have to be the same as these, right?
So we know that foo, the type f, is a type of a function that takes in t1 through tk and returns type r, whatever r is, right? And so hopefully you can kind of see where we're going. We're going to build on these. So we're going to see that, okay, if this r apply as used to add to an integer, then we know that foo must return some integer, right? So it's not going to be the general type r. Now we get to function definition. So now here we're defining a function. So here we have our function. And so we're defining now these x1, x2, all the way to xk, right? These are parameters of the function foo, right? So we're going to give them types. We're going to call them, we'll give foo the type f, all the types x1, x2, xk. And we'll say that the body, right, is some expression of type e, right? So then what does this constrain the types that we have here? So first thing we'll say that the function definition doesn't return anything, right? There's no type there. It doesn't really make, you could do it like that. And I think some languages do, but we don't have to worry about that for right now, right? So these types e, f, t1, t2, tk, what's the relationship here? The type of e, f must return something of type e. And we also know just like before, exactly the number of parameters to f, right? We know that f, the type of f is a function that takes in t1, t2, all the way to tk and returns a type e. One of the cool things is, well, let's not get into that, but based on this definition, right? If we decide that e is, let's say, an integer and we see later that this function foo is being called where it expects a string to be returned, right? We can say that there's a type error that can't possibly happen. If expressions, so how's this different than a normal if statement that you're used to thinking about? What do I mean by expression versus statement? Say it again? Expression has to. Yes, yes. Expression has to, you can think of return something, right? 
It has some kind of return value, whereas a statement can just be a side effect that doesn't return anything, right? So here, the way to read this is: if this condition holds, then return expression one; otherwise return expression two. So the if expression actually returns something. We're going to write it like this: the if is a t4, the condition is a t1, expression one is a t2, and expression two is a t3. So what are the constraints on these four types? What do we know has to be true about them? t1 has to be a Boolean, yeah, exactly. What else? t2 and t3. Why? Because they're both returned, so they have to be the same; the same type has to be expected regardless of which branch is taken. Right, so you don't want your program to be typed differently depending on what path it takes through the program. If you go down one branch, it returns an int; if you go down the other branch, it returns a string. That's not what you want in the type system. So is that it? What else do we know? Exactly, yeah. So t1 has to be a Boolean, the condition has to be a Boolean, and then all of the other types have to be the same: whatever one branch returns is the same as what the other branch returns, which is the same as what the if expression returns. Any questions on this? Okay, so these are actually the only rules we're going to focus on when we talk about Hindley-Milner type inference. You'll be expected to be able to perform this algorithm to derive types from a bit of code, so you have to either be able to reason your way to these constraints or memorize them, whatever works for you. So we have all these constraints, and essentially what we want to do is propagate them through the program, right? This condition can be some arbitrary expression, right?
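As an aside, the constraint rules stated so far (function application, function definition, the if expression, and the array access rule mentioned earlier) can be written down compactly. The Python sketch below is only an illustration: the encoding of types as strings and tuples and the helper names (fn, array, apply_rule, def_rule, if_rule, index_rule) are assumptions made for this example, not something from the lecture. Each rule returns a list of pairs of types that must be equal.

```python
# Assumed encoding: "T1", "T2", ... are type variables, "int" and "bool" are
# base types, ("array", elem) is an array type, ("fn", params, ret) a function.

def fn(params, ret):
    return ("fn", tuple(params), ret)

def array(elem):
    return ("array", elem)

def apply_rule(r, f, arg_types):
    """foo(x1..xk) has type r, foo has type f, xi has type ti:
    f must be a function from (t1..tk) to r."""
    return [(f, fn(arg_types, r))]

def def_rule(f, param_types, e):
    """Defining foo(x1..xk) with a body of type e:
    f must be a function from (t1..tk) to e."""
    return [(f, fn(param_types, e))]

def if_rule(t4, t1, t2, t3):
    """The if has type t4, the condition t1, the two branches t2 and t3:
    the condition is a Boolean and both branches match the if itself."""
    return [(t1, "bool"), (t2, t4), (t3, t4)]

def index_rule(result, arr, idx):
    """arr[idx] has type result: arr is an array of result, idx is an int."""
    return [(arr, array(result)), (idx, "int")]

print(if_rule("T4", "T1", "T2", "T3"))
```

Propagating these equalities through the tree is the unification step discussed next.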
Then we need to say, well actually that condition, that node has to be a Boolean, which means whatever, if it's a function application, that means that that function must return a Boolean. And if there's any parameters in there that are also used in other places, so we need some way to propagate the types essentially throughout this tree. And so this is this idea of type unification. So you're gonna try to go through and bring together all the types and try to find the most general type for every type in the program. So the basic idea is actually really simple. You start at the top of the tree. So you're gonna have a parse tree, which is exactly what we've been looking at for those examples. And every time you see some construct with unconstrained types, right? Every time you see a new type, so it says, hey, this thing is something new, right? Then you create a new type. So you just give it a new T1, T2, T3, T4, right? Some new anonymous type, which is also what you're doing in project four, right? When you see a new implicit type, you don't know what the name of that type is, but you know it is something new and you know it's not necessarily the title of something else yet. So, and this is basically what you're doing. So you say, hey, if a construct or in a tree, in the tree, right? If we say that, ah, this node must be type T1, and then it also later by some other constraint has to be a type T2, you say, well then T1 and T2 must be the same type, right? These types have to be the same. And this is propagating these constraints, right? So you build these constraints, and you're gonna propagate them through the tree in order to decide which types are, what the types are for every place in the program. So let's look at an example. So here we have this function definition, right? So we have, so I'm gonna number the nodes. So we have the function definition at the top of the tree. We have foo, a, b, c on the left. 
On the right we have, so what's, this is gonna be a good test. So what's the, I don't know, I already said that. So it's, the first thing that's gonna happen, right? Is the apply. So what are we applying to what? What are gonna be the children of this apply? So what does this node represent in relation to the definition? The body, yes. The body of the function, right? So we're defining a function, the signature is on the left. On the right is the body of the function. So what's the, actually not the first, but the last thing that happens in this function. Calls function c with what as the argument to c. Was it a, a, b, yeah. So it's actually the array index operator, right? So the first parameter is the result of the array index operator. And what's the left child here? A, the right child is b. This tree, right? This is the parse trees we've been talking about, right? So this is first take a, use the array, reference the operator, sorry. Take a, use the array access operator using b as the argument here. Then that value is passed in to c. So we're calling function c and passing the result of this as the parameter there. And the result of that is gonna be the end result of whatever this function refers. So now we need to try to figure out what are all the types for every single node in this program. Well, not only node, right? We wanna know, okay, what are the types on all the nodes? Right, but what else do we care about? For defining a function, right? Do we care about the type of that function? What's the type of this function? Yeah, we don't know yet, but we know it's the type of foo, right? And what about the parameters of foo? Do we wanna know those types? So yeah, yes, it's part of the function type of foo, right? And maybe you can see they're also used here and here in the tree, right? So we should have types of those. So we're gonna create a table of all the types. What we're gonna do is we're gonna start at the top here. 
And we're gonna say, okay, at one, right? This is a function definition. So what does that mean about its two children types? The foo and the apply. So is this applying any constraints on the types of A, B and C? No, they can be anything. So we'll give them T1, T2, T3, right? So we see the parameters here, we see A, B, C, and we see, okay, we don't know exactly what those types are, so we're just gonna give them new types, right? They're unconstrained, they can be anything. So what do we know about foo? The foo must be something, right? But what does it take in as it's the types of its parameters according to the types that we've given out so far? Yeah, so it's gonna do, let's say, T1, T2, T3, and returns some type T4, right? So then the other constraint, so this is our type of foo, then what does this mean about the type of this node here? It has to be T4, right? So we know that whatever two returns, right, whatever that type of this two node is, is also going to be the return type of foo, right? And we've said that in our type system table here by saying that those two types are the same. So have we successfully propagated the constraints? Okay, so we just did node one, right? We propagated the constraints to all of its children, right? Now we need to recursively go to each of its children and apply constraints. We don't really go down the left, there's no children down there, and we've already applied all the types we need to here, right? So then we go to node two. So what does node two tell us about its children node three and node four? Must return the same thing as node two? What is this, the function to be called? So what are the constraints on a function application? Four must be the same. Ah, yes, okay. So we see this four, right? So the first kind of one of the things is, okay, there's no constraint as to this parameter, right? This is a parameter that we're passing into the function C. Right, so we'll give it some new type T5. We don't know what it is. 
We're gonna say it's some new type T5. And then what do we know about the return type? T4, so why? Yeah, so we know the apply has to be a T4, right? And we know that function C, or actually we're not even worried about the function C right now. We're just looking at this node three, because it could have children and a bunch of other stuff, right? So we're only looking at the nodes. So we say that three must be a function that takes in something of whatever type this parameter is, which we just gave the name T5, and it returns something, and what does the return type have to be? T4, the same as node two, exactly. So we're gonna infer just by looking at this that node three must be a function that takes in a T5 and returns a T4. Questions? Ah, we haven't got there yet. Yeah, so we're doing it step by step. Technically you can do this algorithm, I think, any way you want. You can start from the top and go down, or you can start from the leaves and go up. I like starting from the top and going down because I feel like that gives you a better anchor on how to actually do this. But yeah, part of the point is that this is just like FIRST and FOLLOW. It's a mechanical thing, and you can get tripped up if you try to look ahead and cheat to see what's going on. So that's why we didn't change the type of C to be this right now, right? But when we visit node three, we're gonna do that. Okay, so we just did node two, so now we do node three. So what does node three tell us? Does it have any children? Does it have any constraints in that sense? No. No, but what constraints does it have? Does it have constraints? Exactly. So we know that it's node three, which means it has this type, T5 returns T4, a function that takes in a T5 and returns a T4, right? We know that's the type of node three, but what's inside that node?
C. What's the type of C that we have? T3. So what does that mean about the relationship between these two types? T3 and T4? T3 and T4 are the same? You mean T3 and T5 are the same? What must the type of node three be? Node three? What does this type mean? It returns T4 and accepts a T5. It is a function that takes in a T5 and returns a T4, right? This describes a function, so it has to be a function type. So T3 is a function, that's right. We'll get into that, don't worry. No, no, right now T3 is simply a function, any function that takes in one type and returns another type, and those types are T5 and T4. Yes, yes. The constraint says that these two types must be the same thing, right? We know that T3 has to be T5 returns T4. So we know that the type of this parameter C must be a function that takes in some T5 and returns a T4. And what's that T4 in relation to this definition here? It's the return value of the function, yeah. So this means the third parameter of foo is a function that takes in some new type T5, we don't know exactly what it is yet, but it returns the same thing as whatever foo returns. So this constraint means that we're gonna replace all instances of T3 with the function that takes in a T5 and returns a T4. So what's the difference between these two types, T3 and that function that accepts T5 and returns T4? It gives you more information, exactly. So let's think about it like this. Are there any constraints on what T3 could be? No, there are absolutely no constraints, right? T3 could be anything. Are there constraints on what this function type can be? In the place of node three, can you use an int? Can you use a string? In the place of C here, with T3, can you use an int or a string? No, anything, yes.
This type T three is more general than this T five, right? This function that takes in T five and returns a T four. So this is why we do this unification. We're gonna replace T three because we're trying to find, so most general types for everything would be everything's completely unconstrained, right? Your function, foo. So this is the most general signature you can have for foo. Takes in any type, any type, any type, all completely unrelated to each other and returns some brand new type, right? But this is more constrained, right? This says that, well, the third parameter must be a function, right? So this is how we're moving from, we're assuming that everything is the most general type possible and based on its usage, we're gonna restrict that. So we say here, okay, we're gonna replace all T threes with T five goes to T four, or the function that takes in T five and returns a T four, right? Because between those two, that is the more specific function. So we're gonna actually replace, we're gonna completely get rid of T three. T three no longer exists. So we're gonna replace T three everywhere with this function that takes in a T five and returns a T four. We're gonna place it here and then we're also gonna replace it in our function definition. Does everybody see that we've now restricted the type of foo, right? We're subtly getting closer and closer more towards the more, the most general and most specific type that satisfies all the type system. I guess it's the most general type that still satisfies the type system. There's a whole bunch of formalism behind this that we're not going into, so. Okay, so we just did node three. Now we need to do node four. So what constraints does node four have? What do we talk about? The array access operator. Yes, so which one, so it's gonna be six, right? So what does that mean the type of node six has to be? Has to be an integer. I hate that I do these in opposite order somehow. 
And so what does this mean about node five? It has to be an array. An array of what? Type T5. Type T5, an array of whatever node four returns, right? Because this array access operator is gonna index into the array and return a single element of it. So the type of each element of that array must be the return type of the array access operation. So we know from this usage that node five is an array of T5, and we know that node six is an integer. So now we go to node five. What does this tell us? Yes, now we know that the type of A, which is T1, must be an array of T5. Why is node six an int? This constraint here, the array access operation. We have the constraint that anything used inside the brackets, like B here, must be an integer. So we're doing array access based on integers. Okay, so now we look at node five, right? We know A, which is T1, must be the same thing as array of T5. So which one of those is more specific? Array of T5. Array of T5, right. So now we're gonna replace T1 everywhere with array of T5. So now we know that up here foo is a function that takes in an array of T5, a T2, and a function that takes in a T5 and returns a T4, and the whole thing is a function that returns a T4. Then we look at six, right? Node six is an int, and we know it's B, so those types must be the same. Which is the more specific type? Int. So then we replace all T2s with int, and now we've gone through and propagated all the types in this program. So now we know that foo is a function that takes in as its first parameter an array of something, the second parameter is an integer, and the third parameter is a function that translates from the element type of that array to the return type of foo.
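The substitutions in this walkthrough can be replayed mechanically. What follows is a minimal sketch, not the lecture's own implementation: the tuple encoding of types and the names is_var, resolve, unify, and show are assumptions for illustration, and the occurs check that a full unifier needs is left out for brevity. The three constraints are the ones found at nodes three, five, and six.

```python
class TypeMismatch(Exception):
    pass

def is_var(t):
    # Type variables are strings such as "T1"; "int" and "bool" are base types.
    return isinstance(t, str) and t.startswith("T")

def resolve(t, subst):
    """Replace every bound type variable in t with what it stands for."""
    if is_var(t):
        return resolve(subst[t], subst) if t in subst else t
    if isinstance(t, tuple) and t[0] == "array":
        return ("array", resolve(t[1], subst))
    if isinstance(t, tuple) and t[0] == "fn":
        return ("fn", tuple(resolve(p, subst) for p in t[1]), resolve(t[2], subst))
    return t

def unify(a, b, subst):
    """Force a and b to be the same type, recording replacements in subst."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return
    if is_var(a):
        subst[a] = b  # a is unconstrained, so b is at least as specific
    elif is_var(b):
        subst[b] = a
    elif isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] == "array":
        unify(a[1], b[1], subst)
    elif (isinstance(a, tuple) and isinstance(b, tuple)
          and a[0] == b[0] == "fn" and len(a[1]) == len(b[1])):
        for p, q in zip(a[1], b[1]):
            unify(p, q, subst)
        unify(a[2], b[2], subst)
    else:
        raise TypeMismatch(f"cannot unify {a} with {b}")

def show(t):
    if isinstance(t, str):
        return t
    if t[0] == "array":
        return "array of " + show(t[1])
    return "(" + ", ".join(show(p) for p in t[1]) + ") -> " + show(t[2])

# Node 1: a is T1, b is T2, c is T3, and foo takes them and returns T4.
foo_type = ("fn", ("T1", "T2", "T3"), "T4")
constraints = [
    ("T3", ("fn", ("T5",), "T4")),  # node 3 is c, used as a function T5 -> T4
    ("T1", ("array", "T5")),        # node 5 is a, indexed to produce a T5
    ("T2", "int"),                  # node 6 is b, used as an array index
]
subst = {}
for left, right in constraints:
    unify(left, right, subst)

foo_final = show(resolve(foo_type, subst))
print(foo_final)  # (array of T5, int, (T5) -> T4) -> T4
```

T4 and T5 stay as variables in the result, which matches the conclusion above: foo works for any element type and any return type, as long as the third argument maps one to the other.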
So now that you have this, right, you can call this function foo by supplying any type of array, an integer, and then a function that translates from that type of array and returns some other type. So then let's go over questions before we get into some examples. For node one, we didn't write anything. Just curious if there has to be anything there. Let me figure that out for you while I'm still not. Yeah, we didn't write anything here because for a definition we don't return anything. We're never gonna return anything for a definition. I mean, different languages actually do it differently where like, I think a JavaScript, a function definition returns that function so you could define an anonymous function but assign it to a variable name. So yes, you could have this would return basically the type of foo, but we won't deal with that. Okay, let's look at what can happen when type inference, oh, let's just look at some examples. Okay, so I'm gonna find a function. You guys wanna keep going with foo? Go ahead and do function. All right, let's see. So I'm gonna do simple example. So can you draw the tree? So let's go to the top most node of this tree gonna be. My tree drawing is not the best today. All right, then the right child, the if, right? So then what's the left most child gonna be? The condition, what's the condition? And then the true branch. And then the false branch. Plus, let's move it down. Now we have this tree. So let's number all of our nodes, right? So, okay, we know the death isn't gonna have anything. No bars are really gonna return kind of, well, we do need to look at that. So we'll call this one, two, three, seven. So I have one of the types that I'm gonna need to define. Yeah, A, B, and bar, and one, basically. So we'll just do, I'll do one like this. We'll do A, B, two, three, try to draw. Everyone's a critic. That's the worst. Should have been the hardest. Okay, so we start at the top, right? We go to node one, node one. 
What does this tell us about the types here? Yeah, A is some new type, T1, right? We have no idea what that's gonna be. I'm just gonna write it like this, T1. And B is what? T2. Is it also a T1? Could be. Could be, could be. Exactly, right? So it may turn out that they have to be the same type, but at this point we have no idea whether they are the same or not, because the definition puts no constraints on them. Okay, and what's the type of, we're gonna put bar here. What's the type of bar, or node one? It takes T1 and T2 and returns T3, right? And then what else do we know? What do we know about node two? It's also a T3. It has to also be a T3. So are we done here? Propagated all the constraints? Yep. Then we go to node two. What do we know about this if expression? What constraints does this apply to its children? Remember, we're not thinking about A quite yet, right? We want to say node three, exactly. Node three has to be a Boolean. What about node four? Four and five have to be expressions. Have to be expressions? Yes, but what about the type? Four and five have to return the same type? So type T4. T4? Well, node two has to return a T3, so its children have to return a T3. Yes, right. If we gave them a T4, this would mean that the type of node two, which is T3, could be different from these other types. So T3. So now we visit node three. What does this say? A is what? A is T1. Boolean, right? A has to be a Boolean. Which one's more specific? Boolean. Boolean. All right, four. Quietly, while you guys are not paying attention, we'll change this to a one. Well, I'm sure there's some language where you could add another thing. No, that's not what I wanted to do. I mean, if this was C, you could have A be an int. Let's do, well, I wanted to do this.
Okay, the problem with these things is it's really, really difficult to just tell by looking at it if it's gonna type check or not, right? So if you have to play active, you have to go through the steps. Okay, so four, and we got a node four. So we just did three, we visit node four. What does node four tell us about the type systems? It tells us that T2 and T3 are the same. Yes, it tells us that T2 and T3 are the same, right? So are they both equally general? Yeah, yeah. But we still have to make sure that they're the same. So we're just gonna replace one with the other. So what's your favorite, T2 or T3? Yeah, T3, because there's more T3s still. From a writing standpoint. So we're gonna change it every T2 in here to T3. Now we visit node five, so what constraint, so what, so node five is an addition operator, so what constraint does that mean that we use? They're good, let's all review. Numeric type, we'll just say that they have to be the same. So what constraints then do we have here? Six and seven, you can do the same. You can do the same as what? T3. As T, as, yeah, as node five, which is T3. Exactly, so these have to be T3s? Yes, okay, great. Hopefully you're right about that. So we don't know yet, because we haven't visited that node yet. It could be anything. It could be arbitrarily long tree. Well, now we visit node six. So now what do we know about node six? All T3s must be Boolean. All T3s must be Boolean. So it says A, so node six is a T3, A is a Boolean. So this means everything must be a Boolean. Yeah, sometimes, just kind of. We visit node six, then we visit node seven. So what does node seven say? It's a Boolean. It's a Boolean. Type of node seven has to be a Boolean. What's the type here? A integer. Can we merge those two? Unify those? Yeah, you can. Dement it in your life. So we're using the same languages, ones and zeros for Boolean operators. True or false? One's and zeros are one, zero, and not zero are not good truth values. 
Doesn't C use zero ones? No, it uses zero and not zero. Which is kind of ridiculous. I thought it was greater than zero. I thought it was greater than zero, because if you put a negative one in, doesn't it stop? No. You have the check, it doesn't stop. No, no, no. Just text, if all the conditions are if zero, basically, I believe. A lot of the return values are negative one from a function, so if a function fails, so a lot of when you call functions, I'll say if return value is not equal to negative one, so they do that check in the call. So yeah, so then we get here, then what do we say? Type error. Type error. One is not a Boolean, therefore this program does not type check. What type error? Yeah, which type error is in the project? I'm just kidding, you don't have to do that right now. Okay, so that's an example of a type error. Let's look at something that's kind of cool. Okay, actually I'm gonna draw the tree instead of drawing the code, is that okay? So everybody? Yeah. Otherwise I just feel like I'm writing things twice. So the body of this, I want an if condition, and then I want a bracket operator, and I want a true branch I wanna apply. Okay, I have to come up with examples that you're not gonna just like, and we can still see the tree. Okay, I'm gonna kind of simplify things a little bit, just so we can see the tree, I'm gonna just write the types on the node as we go to them, right? So, can we still number them? Well, let's number them. Okay, so what do we know about this function? That's, what does that tell us about this? It must be a T4. It must be a T4, exactly. Okay, now we get to this if. What does this if tell us about the types of its children? Right, and six and nine? It must be the same. Same as what? Each other. What type is T4? It's a T4. Yeah. That's the important thing. Exactly. Okay, now we go here. So what does this tell us about the types of its children? It's four must be. 
Yeah, so four must be an array of Booleans, and five must be an integer, right? So now we actually know type T1 must be an array of Boolean, and type T2 must be an integer, right? You're not allowed to do this. I don't have much space. Anything? No, I can't. I don't know why. I wish it was infinite so I could move around, but someday. One of you guys should make that for me. Okay, so now node six. We just did four and five, right? So what do we know about six? What constraints does this apply to its children? It has to return a T4, but this is a function application, right? So what does this mean about node seven? It should be a function. It's gotta be a function. So what's the type of the first parameter that it takes in? We don't know, right? It's unconstrained. The parameters are unconstrained. So we'll give a new type to each of the parameters, and we'll say that C is a function that takes in what? A T5. And returns what? T4. So now when we visit seven, we know that C, so type T3, is a function that takes in a T5 and returns a T4. Then we go to eight. When we look at eight, what does this tell us? T5 must be an int. Exactly. So C is a function that takes in an integer and returns some type T4, which is also the return type of the function we're defining. Then we go to node nine. So what does node nine tell us? A. A is a T4. T4 is an array of Boolean. Which means T1 is equal to T4, and T1 is an array of Boolean. This means T4 is what? An array of Boolean. Yeah, so now what's the type of this function, putting it all together? C is a function of T5 that gives you T4. So C is a function that takes an integer and returns an array of Boolean. Questions? So does it type check? Yup. Yeah. Thanks. We're able to find, based on this crazy usage, the types of everything in this function definition. Of the...
So yeah, it takes an array of Booleans, an integer, and a function that takes an int and returns an array of Booleans, and the function itself returns an array of Booleans. Could this be even crazier, more nested? Sure. Is that something I would probably do? No. So young and optimistic. All right, any questions on this? There'll be a homework on this. I'll probably assign it after project four is due. That way you can... I mean, none of you are gonna start now anyway, so. All right, I'm just being realistic. Cool. All right, I'll stop this here.
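For review, both board examples can be run through a small top-down inferencer. This is a sketch under stated assumptions, not the course's reference solution. The source code of the two functions is reconstructed from the trees as described: bar(a, b) = if a then b else a + 1, and the last function as if a[b] then c(b) else a. The tuple encoding of trees and types, and the names visit, infer_def, and TypeMismatch, are made up for illustration. The plus rule follows the simplification used in class (both operands have the same type as the result), and the occurs check is omitted.

```python
import itertools

class TypeMismatch(Exception):
    pass

def is_var(t):
    return isinstance(t, str) and t.startswith("T")

def resolve(t, s):
    if is_var(t):
        return resolve(s[t], s) if t in s else t
    if isinstance(t, tuple) and t[0] == "array":
        return ("array", resolve(t[1], s))
    if isinstance(t, tuple) and t[0] == "fn":
        return ("fn", tuple(resolve(p, s) for p in t[1]), resolve(t[2], s))
    return t

def unify(a, b, s):
    a, b = resolve(a, s), resolve(b, s)
    if a == b:
        return
    if is_var(a):
        s[a] = b
    elif is_var(b):
        s[b] = a
    elif isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] == "array":
        unify(a[1], b[1], s)
    elif (isinstance(a, tuple) and isinstance(b, tuple)
          and a[0] == b[0] == "fn" and len(a[1]) == len(b[1])):
        for p, q in zip(a[1], b[1]):
            unify(p, q, s)
        unify(a[2], b[2], s)
    else:
        raise TypeMismatch(f"cannot unify {a} with {b}")

def visit(node, expected, env, s, fresh):
    """Top-down step: node must have type expected; constrain its children."""
    kind = node[0]
    if kind == "var":        # a parameter such as a, b, or c
        unify(env[node[1]], expected, s)
    elif kind == "lit":      # an integer literal
        unify("int", expected, s)
    elif kind == "index":    # arr[idx]
        visit(node[1], ("array", expected), env, s, fresh)
        visit(node[2], "int", env, s, fresh)
    elif kind == "apply":    # f(args...): each argument gets a new type
        arg_types = tuple(next(fresh) for _ in node[2])
        visit(node[1], ("fn", arg_types, expected), env, s, fresh)
        for arg, t in zip(node[2], arg_types):
            visit(arg, t, env, s, fresh)
    elif kind == "if":       # Boolean condition, branches match the if
        visit(node[1], "bool", env, s, fresh)
        visit(node[2], expected, env, s, fresh)
        visit(node[3], expected, env, s, fresh)
    elif kind == "plus":     # simplified rule: operands match the result
        visit(node[1], expected, env, s, fresh)
        visit(node[2], expected, env, s, fresh)
    else:
        raise ValueError(f"unknown node kind: {kind}")

def infer_def(params, body):
    fresh = (f"T{i}" for i in itertools.count(1))
    env = {p: next(fresh) for p in params}  # one new type per parameter
    ret = next(fresh)                       # and one for the body
    s = {}
    visit(body, ret, env, s, fresh)
    return resolve(("fn", tuple(env[p] for p in params), ret), s)

a, b, c = ("var", "a"), ("var", "b"), ("var", "c")

# Last example: if a[b] then c(b) else a
last_type = infer_def(["a", "b", "c"], ("if", ("index", a, b), ("apply", c, [b]), a))
print(last_type)

# Earlier example: bar(a, b) = if a then b else a + 1
try:
    infer_def(["a", "b"], ("if", a, b, ("plus", a, ("lit", 1))))
    bar_error = None
except TypeMismatch as err:
    bar_error = str(err)
print("bar:", bar_error)
```

The first call yields a function from (array of bool, int, int -> array of bool) to array of bool, and the second reports that int cannot be unified with bool, which is the type error found at node seven.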