Good morning everyone. Thanks for having me back here in Bangalore. Great to speak to you guys. I'm going to tell you today about parametricity: how we use types as documentation of code. This is apparently a bit of a contentious topic. I don't understand why; I'm going to teach you why it's just obviously true. So here we go. I look after a team in the state of Queensland, Australia. The Queensland Functional Programming Lab is my team. We generally write Haskell more than anything, but we don't necessarily write Haskell. We work on a lot of open source projects, and at the moment we're working on a banking project in Haskell. It's sponsored by the Queensland state government and CSIRO, a research organisation.

So first of all, why does parametricity work? One requirement is that you must first commit to functional programming. Here we are at a functional programming conference, but I'm just going to remind ourselves what that actually means. Here is a program written in some programming language; it doesn't matter which. We notice in this program that there are two sub-expressions that are equal, these two here. You can't quite see the highlighting there. And we say, let's factor that expression out, assign it to a value, and then substitute the expression with the value. In doing so, we ask: did the program just change? Who thinks the program just changed? Okay, I left a delay there because some people do think so. The program did change, and it changed because of this side effect here: the counter now gets incremented only once.

Here's another program, and again it's got two sub-expressions that are equal. We're going to factor them out and assign them to a value. Did the program just change? Who says yes this time? There are zero hands, and that's correct. This program is the same program, and the reason is that the expression we factored out is a call to a function, a genuine function.
It's a function. So functional programming is the act of writing only functions, where we can substitute expressions with values without changing the program. That's the definition of functional programming, and you must commit to that before parametricity works. Another way some people say it is: for given inputs, the outputs are the same, and nothing else happens. That's the important part, nothing else happens. Inputs come in, outputs come out, and nothing else. We call this property of expressions, where we can substitute them with a value without changing the program, referential transparency. I heard Michael mention it in the keynote; if you heard that word, that's what he meant. You can substitute a referentially transparent expression with a value and the program will remain unchanged.

So the definition is very simple, but there are some consequences. One of them is that, for example, you can't do i++ anymore. You can't do loops anymore. And, you know, where I work, I just take everyone's loops away, take all their tools away, and say, there you go, fixed. And now I have to substitute new tools. One of those tools is parametricity, and in my opinion it is one of the most important, most practical and useful consequences once you've committed to functional programming.

By the way, who was at the workshop yesterday? Okay, I don't know if you noticed, but we were using parametricity yesterday. Brian, Mark and I were asking people to write functions of a specific type, and the type guided you to the answer. So I'm going to be a bit more concrete about telling you how types can get you to answers. Types can tell us about what are called inhabitants, and the inhabitants of a type are the values of that type. For example, Boolean has two inhabitants, true and false. These are the inhabitants of Bool.
So I'm going to do a little bit of algebra, and I hope that's okay. Who's heard of sum types? All right, that's where we have an A or a B. A sum type corresponds to addition: it's the same as A plus B when we count the inhabitants. Or we can have an A and a B; that's a product type, and I think most people have heard of those. There we are doing multiplication: the inhabitants number A multiplied by B. And who here has written a function? Everyone. You've taken an A and you've returned a B. That is the same thing as B to the power of A; that's the number of inhabitants of that function type.

So for example, if I said to you, write me a function that takes a Bool and returns a Bool, well, Bool is two, so there are only four possible inhabitants that you could write. They are: return the argument, negate the argument, return true always, and return false always. There's nothing else you can do. That's two to the power of two, Bool to the power of Bool. Unit is one; it's just the starting point in the algebra. And Void, a data type that has no inhabitants, is zero.

So let's look at Bool. It's a True or it's a False. It's the same thing as carrying unit in each constructor, because unit means it has just one inhabitant, just one value in that constructor. This is the same thing as saying one plus one. Bool is two: unit plus unit, a sum type, an or, as we said. How about Maybe? The Maybe data type is a unit plus an a: Nothing carries unit, and Just carries an a. So it has one plus a inhabitants. If I instantiate a with Bool, which is two, then there are three values of type Maybe Bool: one plus two. Here's another simple one: Either Void a. It is again addition; it's one of those two things. The Left carries zero and the Right carries an a. Zero plus a is a. If I ask you to construct a value of type Either Void a, that's the same thing as saying construct a value that is an a.
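To make the counting concrete, the four inhabitants of Bool -> Bool mentioned above can be written out directly. This is a minimal sketch; the function names here are mine, chosen for illustration:

```haskell
-- The only four functions of type Bool -> Bool,
-- matching the count Bool^Bool = 2^2 = 4.
idB, notB, alwaysTrue, alwaysFalse :: Bool -> Bool
idB b         = b        -- return the argument
notB b        = not b    -- negate the argument
alwaysTrue  _ = True     -- return True always
alwaysFalse _ = False    -- return False always
```

Any other definition you manage to write at this type will be equal to one of these four.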
You can't call the constructor with a Void, because there are no inhabitants of Void. And if you do zero times a, a Void and an a, then you have zero: you're just never going to give me one of these things. To give me a Void and an a, first you need to give me a Void, which you can't. It's zero times a. I think this is just an interesting way to make the algebra concrete. If I said to you two times two, and two plus two, and two to the power of two, they're all four. All of these types are isomorphic. So if I said, give me a value of the type Either Bool Bool, you'll give me one of four values: Left True, Left False, Right True or Right False. So I can count the inhabitants just by doing some algebra.

Here's something a little bit more complicated, just to really drive it in. If I said to you, give me a value of the type Maybe of a function Bool to Maybe Bool, I have just asked you for one plus (one plus two) to the power of two possible inhabitants, which is ten. You're welcome to calculate them all out; there'll be ten of them, and there's nothing else you can do. And an even more complicated one: if I really had to give you something of that type, I'd calculate it and say, well, I have to give you one of 40 values. That's a very silly thing for me to ask you to do, and it's quite a nasty data type actually, isn't it? But I can count it, and I can say it's one of 40 values.

So what about a list of A? Algebraically, what is this thing? Well, we know that a list of A is either a nil, or it's a cons carrying a head and a tail: an A and another list. Another way of saying it is: I have zero A's, or I have one A, or I have two A's, and so on. That's what a list of A's is. I can simplify that to one plus A times the list of A.
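The list equation just described, List a = 1 + a * List a, corresponds directly to a data definition. A sketch, where List is a hand-rolled type rather than the built-in list:

```haskell
-- A list of a is nil (the 1) or a cons carrying an a and another
-- list (the a * List a): algebraically, List a = 1 + a * List a.
data List a = Nil | Cons a (List a)

-- Counting Bool lists by length matches the algebra:
-- 1 list of length 0, 2 of length 1, 4 of length 2, and so on.
len :: List a -> Int
len Nil        = 0
len (Cons _ t) = 1 + len t
```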
And if you carry that further, you'll find it's actually, this is from memory now, one over one minus A. Isn't that interesting? I said, give me a list of A, and I've just asked you to calculate one over one minus A; I've asked you to do an algebraic equation. I happen to think this is really interesting, though it's not what the subject of this talk is about: you can actually differentiate these data types. Take one plus A, that's Maybe, and its derivative with respect to A is one, if I remember rightly. That means unit is the derivative of Maybe. But that's another talk for another day; it's really interesting, or maybe out in the hall perhaps.

Okay, so parametricity. Why is it so important? What even is it? This is the abstract from the paper. The paper is called Theorems for Free! by Phil Wadler. And basically, Phil says: write down the definition of a polymorphic function (polymorphic is when we're using the a's and b's and so on), and tell me its type, but don't let me see the definition. Just tell me the type, and I will tell you a theorem that the function satisfies, from the type alone. The paper goes on to explain how you can do that trick. My goal today is to try and give you an intuition for this trick, to prepare you to read that paper and then carry forward and actually execute it in your work, because I do it all the time.

So I've got a contrived example here based on the algebra. If I said to you, write me a function, take an int, return an int, well, I've just asked you to write one of that many possible implementations, right? Int to the power of int. That's a lot. That's a large number. I have pretty much no idea what you're going to do. It's smaller than infinity, and you're not going to return the string ABC; that won't type-check. But we don't know anything more from the type other than that it's going to be one of that many implementations. There's no information conveyed anymore.
So this often happens. At least when I've worked on this kind of code, someone will go, oh, I'll look at the identifier name; now I know what it does. And we write some tests: I definitely know what it does. No, you don't. And then you get code like this. It's contrived, of course, but it passes all the tests. I mean, of course it's obvious that this is broken code, but there are more subtle examples out there. I want something more reliable than testing ten elements out of the possible (2^32) to the power of (2^32). That coverage is very small. I want wider coverage with less effort. So the reasoning is difficult, or the effort is large, because the types are monomorphic. That is to say, they're not a's and b's; they're ints, concrete types.

And I want to address a concern that I often hear. Sometimes people say to me, when you write those types that are all very polymorphic, they're hard to read. I want to address this by saying: quite the contrary, they're easier to read, because now I know what the function does. Okay? So I want to give you the skill, or at least an intuition, for being able to calculate that, and therefore agree with me that it's actually easier to read when you're polymorphic.

So here's another example. Take a list of int, return a list of int. What does it do? I don't know. Does it do that? What about that? I don't know. But here is an important one. I've written it in Java notation. Yesterday I asked who's written Java, and almost everyone put their hand up, so I assume that's still true, and you can read that code there. In fact, we went out for dinner last night, and it was revealed that I have a Java certification. Someone dobbed me in. I actually have three. I was with one of my colleagues, and we were talking about Java and having some fun, and someone goes, oh, Tony's got a Java certification.
It's like, oh, man, just drink beer and look away. So looking at this type, I'm going to make the claim that every element in the returned list appears in the input list. It must. Just by intuition, it must. Now, you might say, aha, I can think of counter-examples, like null: I can put null in the list, and so on. And I'm deliberately discarding those. So all the a's in that list appeared in the input. The reason is, I can't just construct a's out of nowhere. It's not true in C#. Who writes C#? C# has another sort of cheat code, which is that you can just call default and make an a out of nowhere. As far as my world is concerned, that doesn't exist. I'll explain why, but those things don't exist.

But importantly, I can look at this code, which I could have written, say, yesterday when I was feeling a bit tired, and come in today and say: I don't know what I did, but every element in the result appears in the input. I know that immediately from the type. Let's commit to this statement. Does anyone not quite feel that that's true? And that's okay. Is that a question? If I replace a with int? Yep, if I replaced a with int, then my claim falls flat. That's true. What I'm saying is, because I'm using a polymorphic a, my claim is true. You're right that if it were int, I could just put 99 in there. Yep.

Okay, so if I call not on the elements in that list, it will not compile, because they're a's. They're polymorphic a's, and if I call negate, the compiler is going to say you can't negate an a. If they were Bools, then you could, but I cannot just negate an a. The only thing I can do with those a's is shove them in the return list. Does that make sense? All right, I now believe I have your commitment. Okay, not quite: the result is a subset. The returned list only ever holds those a's.
All the a's in the returned result will come from the input. It might return no a's; it might return only one of them, or all of them, or double all of them, like putting two of each in the list or something. But they're all in the input list. No a's came out of nowhere. Yep. So suppose it returns the empty list: all a's in the result, of which there are none, appear in the input. Okay? All right.

This is in Haskell notation: given a list of a's, return a list of a's. There could be an alien somewhere on, I don't know, Jupiter, writing a function with this type, and when they do, all a's in the result will appear in the input. This is a really good property, and it took me no effort. I just wrote the type, and now I know this thing. These things we will call theorems. The paper calls them theorems for free, and I can reliably construct these theorems based on the type.

So here's another one: Maybe a to list of a. I'm making three claims here. First, all the a values in the returned list are the same value. They must be, looking only at the type. Second, if the argument is Nothing, that is, there are no a's, then the empty list is returned. So I already know what it does if Nothing comes in as the argument. And third, if the list is not empty, then all the a values were in the Just, the Just constructor of the Maybe. I know all these things by looking at the type, because it's polymorphic. It wouldn't be true with ints: if a were int, I'd just start returning 99 and 77 and whatever else. And I know it because it compiles. The type: it compiled; I know all these things. Isn't that great? I didn't have to do any work. I knew things for free.

So here's one. Just to explain this notation a little: this says b to f of a to f of b. These are the arguments: given a b, given an f of a, return an f of b.
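One function at the Maybe a to list of a type, satisfying all three claims, might look like this. A sketch, with a name of my choosing:

```haskell
-- Maybe a -> [a]: by parametricity, the result can only contain
-- the value stored in the Just, and Nothing must give [].
maybeToList' :: Maybe a -> [a]
maybeToList' Nothing  = []
maybeToList' (Just a) = [a]
```

Other inhabitants exist, for example returning the Just value twice, but every one of them satisfies the three claims above.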
And this bit here says you can do map on f; that's all you can do with f. Okay? So if I said to you, write me a function, and in fact, yesterday we did: write me a function b to f of a to f of b, where f has map, what are all the things you could do? If I gave you this type and said, hey, run away and write me this function, how many different answers would you come back with? Any guesses? I heard one, and that's correct. I know what this function does already by looking at the type. There's only one function in the whole universe that has this type. It's the one that maps on this f: it calls map on f, and when it maps on f, it needs to pass a function of type a to b to make the f of b. It takes the a, discards it, and replaces it with that b. That's all it can do.

If I said to you, write me a function that is int to list of bool to list of int, well, now it's just exploded: there are millions of things you could do. But because we're polymorphic, there's only one thing you can do. Isn't that amazing? Write the type, done.

And what about now? What Applicative does is it says you can sequence on f. That is to say, not only can you map an f, you can then run it again. What's happened here is I could do the same thing I did when it was only constrained by Functor, that is, map. But because it's now Applicative, I've opened up the possibility of taking this f, this result, and continually sequencing it. So the answer in this case is infinity: there are infinitely many possible answers. But the simplest one is the one that maps. Actually, there is one even simpler. Correct: I could just call pure b, because Applicative has a pure function on it as well. So there are some limitations. It doesn't always work. Wouldn't it be good if we could just write our types and that's it? There are some limitations.
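The unique Functor-constrained function, and the pure-based Applicative inhabitant just mentioned, can be sketched like so. The names are mine:

```haskell
-- With only a Functor constraint there is exactly one implementation:
-- map over the f, discarding each a and substituting the given b.
replaceAll :: Functor f => b -> f a -> f b
replaceAll b fa = fmap (const b) fa

-- With Applicative there is an even simpler-looking inhabitant, pure,
-- though infinitely many others now exist via repeated sequencing.
pureB :: Applicative f => b -> f a -> f b
pureB b _ = pure b
```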
So going back to the fact that I can discard null if I'm writing Java, or discard default if I'm writing C#: this paper tells us that we can reason about our programs as if we're reasoning in what's called a total language. That is to say, these other things don't exist. And we do this all the time. Like, how many possible things does this return? Well, two, right? And we throw away that one, the one that doesn't return. It's really three, but we are justified in throwing away that third option. That paper talks about that, and this is called fast and loose reasoning. So when I say null doesn't exist, I am discarding this thing; I'm reasoning fast and loose. In many languages there are a lot of things I have to throw away, such as all of these. That's a lot. If you use Java, which I have, I did in fact throw all of those things away. It can be done: you can write Java so you never have to use any of those things. But I do throw all of these away so that I can exploit parametricity.

So here's yet another example: list of int to list of int. It's monomorphic. What does it do? I don't know. Who knows? Maybe it just returns a lot of tens. I don't know. But this one here: all elements in the result appear in the input. There's a problem here, though, which is that that's not everything. It's not telling me everything about what this function does. It tells me a lot about what it doesn't do. But, you know, does it flip every second element? Or every fifth element? I don't know. So what do we do in these cases? We've exploited parametricity; we've got some really good theorems for free, but we don't know exactly what the function does all the time.

So, I mean, I've worked on lots of projects, and often I see these things. It goes: this function twiddles the database to twiddle out the twip top, or something like that.
I kind of go a bit blank as I'm reading it. It's just a blur; it says something like that. And I tend to look at these comments, actually, and use them to conclude: it definitely does the thing it says it doesn't do. You know the ones, like: this function doesn't touch the database. I immediately know it does. That's not parametricity; that's just the way programming generally works. But instead of writing these comments, or documentation, or whatever you want to call it, I want to write machine-checkable comments, where a machine makes sure they're true. If you did touch the database when you said you didn't, it won't pass the tests.

So let's look at this function. This is Haskell notation, and this is more a question for you guys; I want to hear your input. I have a function whose type is list of A to list of A, as discussed. We don't know what it does just yet. We know that every element in the result appears in the input. This might be a bit hard to read, but I can tell you what it says. It says: if I call this function, and then call it again, on any x, I always get back the same x. All right? And this one here says: if I have two lists x and y, and I append them and then call the function, that's the same as calling the function on y and appending that to calling the function on x. These are tests I've written, and a good test library will just start plugging in x's and y's and see if either is ever false.

So what does this function do? I know there are some of you who know the answer. Who feels like they've calculated, from these two stated propositions, what this function does? Well, does it always return the empty list? Someone suggests it always returns the empty list. No one else is brave enough.
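The two propositions can be written as QuickCheck-style properties, parameterised here by the function under test so nothing is given away yet; a property-testing library would generate the x's and y's. Plain functions, no library required:

```haskell
-- Proposition 1: calling f twice gives back the original list.
prop1 :: ([Int] -> [Int]) -> [Int] -> Bool
prop1 f x = f (f x) == x

-- Proposition 2: f applied to an append is the flipped append
-- of f applied to each piece.
prop2 :: ([Int] -> [Int]) -> [Int] -> [Int] -> Bool
prop2 f x y = f (x ++ y) == f y ++ f x
```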
If it always returned the empty list, then calling the function on the function of x would not give me back x, right? I could pass in a single-element list and it would return me nil, which violates the first proposition. So it doesn't always return the empty list; I can tell that from these two propositions. What this function does is reverse the list. This function must reverse the list, because these two propositions told me so. Okay? There is no function that satisfies these propositions, at that type, that is not reverse. It must reverse.

So people often say to me, do you write comments in your code? And no, I write propositions in my code. I think this is much more reliable, because if I touch the database in here, or whatever it is, the propositions stop being true, and I will learn about that very quickly. Basically, these propositions are what happens when parametricity has not got us all the way to the answer. In the previous example, which was b to f a to f b, we're already at the answer; there are no tests to write. Do you test your code? Not in that case. All the tests are written; I have a proof.

So here's another simpler example: flat map, or bind as it's sometimes called. Given a list of a and a function a to b, return a list of b. Well, if the input list is empty, so is the result. I'm making this claim immediately. If there are no a's, then no b's are coming back, and I know that straight away. I don't care what code you wrote; I already know this. And every b in the result came from an application of the function; it's the only way you're going to get any b's. This reasoning that I'm doing is parametricity, because those types are polymorphic. I now know a lot about what that function does. This wouldn't be true if a were a string and b were an int; then I get to return a list of ints, and I'm going to put in negative 99.
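A sketch of the flat map just described. The transcript reads the function argument as "a to b", but for flat map / bind on lists the function returns a list, so I'm assuming the usual bind shape here, with both claims holding either way:

```haskell
-- flatMap / bind for lists: an empty input gives an empty result,
-- and every b in the output comes from applying the function.
flatMapL :: [a] -> (a -> [b]) -> [b]
flatMapL xs f = concatMap f xs
```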
I'll put in all my favorite numbers there. But because it's polymorphic, I can use parametricity. And like I said, sometimes you don't have to write tests. Here's a very simple example: write me a function that takes an a and returns an a. Compare: write me a function that takes an int and returns an int. Oh, that's a lot of possible answers. If I sent you all away to write me that function, you'd all come back with different answers. But this one here, a in and a out, polymorphic: you're all going to come back with the same answer. It must return its argument, and that's all it can do.

There's a common objection I hear to this kind of claim, if you're not familiar with functional programming, and don't forget we've committed to that already. If you're not familiar, you might think something like: aha, I'm going to lock a thread, then return an a. But that has a different type. You can't do that in Haskell, at least when you're honest with your types. You've got to return an a, and that's all you can do. Where are you going to get it? It must come from the argument; there's nowhere else it can come from. And there's no test you can write for this function; they're all redundant. The type has already given me all the tests. Okay?

Another thing I often hear is that types are not tests. And that's true: types are not tests. But the types do substitute for the tests; there are now none to write. And this is the example that we've already seen. If we look at this code here, I've called g and passed in the string hi and then the list one, two, three, and it returned me the list of hi three times. I already know that the function did that by looking at the type. List, map, right? I can map across a list. What did it do?
Well, it mapped across the list of integers and replaced each one with that string hi. But I already knew it would do that by looking at the type. I don't need to run it and check. Sometimes it's helpful to run it and check, just to verify that you know what's going on, but that's not the same thing as knowing what the function does from the type. So the point I want to get across is that I can use types to reason about my code and know a lot about what it does. Sometimes I can know everything about what it does, and when I do, there are no tests to write. What tests are you going to write for this type? Are you going to write this one? Well, I already knew that; it's redundant. So when people say types are not tests, it's true that types are not tests, but this type substituted for all the possible tests you could ever have written. There is no need for them. I have a proof.

Sometimes tests are almost unnecessary, but not quite. How many functions of this type exist? Well, it's a to the power of a to the power of a, but I haven't told you how to do the algebra with a's yet. Just intuitively, how many functions of this type exist? If I sent you all away and said, write me a function of this type, how many different answers would I get back? Take a guess. I hear and see a few twos, and that's the correct answer. You must return me an a, and these are the arguments: you're either going to return the first one or the second one. So how many tests do we have to write? One, just to disambiguate. We're not sure yet which of the two functions I've written from the type alone, so I need one test to disambiguate: which one did I do? There we go. Now I know what it does. If I put in seven and eight and it returns seven, I know which of the two it is.

And this one is just to think about. Who can read that, by the way? It's okay if you can't. Okay, so I'm going to try and explain it.
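The a to a to a example from a moment ago can be written out as a quick sketch, with names of my choosing. The type permits exactly these two, and the single test with seven and eight tells them apart:

```haskell
-- The only two functions of type a -> a -> a:
firstArg, secondArg :: a -> a -> a
firstArg  x _ = x  -- return the first argument
secondArg _ y = y  -- return the second argument
```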
Basically, in this function, Applicative means that you can sequence the f: you can run the f, and then run it again. And Traversable means you can run a function on either side of the thing. So if you had, say, a list of pairs, a list of two things, you can run a function on both sides; you can map across both sides. And I guess the overall point is: when we look at types like this, the benefit is that it's very polymorphic, all a's and f's and so on, but how many inhabitants are there? How many values of this type exist? And the answer is infinity, because I can just keep sequencing the f. I can write the code, get the f, call the Applicative, and just keep sequencing it. But sometimes we actually do this reasoning: what's the most sensible answer? Which is very loose reasoning, but the sensible answer is the correct one as well. It's the one where you don't keep sequencing f. So it is true, and I'm making an admission here, that sometimes we do reason: well, of all the infinitely many inhabitants, which one is this? It's the sensible one. That's it.

Here's another example. Store is a data type. Who's heard of Store, by the way? It's pretty handy. Basically, instead of just carrying a value of type b around, you carry an a, and then a function a to b, so that you know how you got to the b. Does that make sense? Say you're carrying an int around. You go, well, how did I get to that int? You have a function like string to int, and a string, so that you know how you got to it. That's what Store does. If I said to you, write me a function of this type, like give me a b from one of these...
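A sketch of Store as just described, assuming this shape for the data type (an a paired with a function a to b) and a name of my choosing for the extraction function:

```haskell
-- Store: carry an a, plus a function a -> b recording
-- how you get to the b.
data Store a b = Store (a -> b) a

-- The only way to get a b out: apply the function to the a.
extract :: Store a b -> b
extract (Store f a) = f a
```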
Well, you take the a, you put it in the function, and you return the b. In fact, that's the only way you can get the b. All right. When I see functions of this type, which you might, I go: well, I know what it does. I look at the type; it has one inhabitant. I can calculate it, and that Theorems for Free paper tells you how to calculate that. So the point I want to make here addresses the thing people often come to me with: hey, this type, I find it really hard to read. I don't know what this says. I need help. And I want to say back: okay, let's sit down and talk about parametricity, because that's how we read these types. Once you've got that tool, these things become much easier to read. It's why we do these things.

So we're able to calculate what code does, sometimes using only types. Doesn't that excite anybody? Why don't I just write types and then go home? All the work is done. It's good; I really enjoy doing that. So basically, types document your code. I'm making this claim: types document your code. Nice, reliable, dense documentation. The best kind of documentation.

Any questions? Yeah. Can we do that one in the hallway? Is that okay? So the question was: if I have a type a to a, and I've said there's only one inhabitant of that type, what is the process by which we get to that one? I'd really like to take that one offline if I could; it requires a whiteboard, and questions keep coming as we do it. Anything else? Who wants to go and write some types? And that's it. Thanks, everyone.

Thanks so much, Tony. That was a great talk.