What we're really talking about here is taking something quite large and putting it in a very small box, where the box is System F omega. So called because of the nearly infinite amount of swearing you will do if you try to write any substantial program in it. It's not fun, but that's why we have compilers. And as compiler writers, our job is to take things that are quite big, like recursive data types, and fit them into these smaller systems that are much nicer to reason about, but maybe don't have all of the things that we're used to and would like. As much as possible, we want to resist the temptation to just add data types to System F omega and make it bigger and bake all that stuff in. Because then, as Phil has said repeatedly, we're no longer using that wonderfully ancient system that we know and love, but instead something newer and less trustworthy. Today I hope to convince you that, in fact, the box is big enough and we can fit a surprising amount in there. A lot of what I'm going to say builds on stuff that is not new and may well be known to many of you. But as far as we know, nobody has quite put all the pieces together in the way that we're doing now. And it certainly took us long enough to put them together. So I hope this will be interesting.

Just as a bit of a recap, and a bit of context: why are we actually doing this? Why do we care? Well, the thing we actually care about is Plutus Core. This is our language that actually runs on the blockchain, and that's the thing that matters. But for all intents and purposes, and this is the great thing, it just is System F omega. So you will not hear the words "Plutus Core" again in this presentation. You can effectively forget about it. And that's nice: we can think about the problem in more generality. System F omega, as it is normally presented, does not have data types.
So, a quick reminder for those of you who maybe don't spend all of your time thinking about the lambda calculus: System F omega is pretty much the simply typed lambda calculus with polymorphism and higher kinds. This means that we can also have things like lists, which we want. But we do want data types, and we want to be able to reason about them. Part of what we're trying to do in the long run is to have a system where people can write pretty much Haskell and have it turn into Plutus Core that runs on the blockchain. And Haskell has data types; that's part of what we like about Haskell. So we've got to do something with these things: we need some way to encode them. And Phil has already told you about that. There are a number of well-known ways of doing this. The most famous is the Church encoding, but that has some fairly serious efficiency problems, so we don't use it. Instead, we use what's called the Scott encoding.

The Scott encoding is based on one key insight: we can characterize a value of a data type by what we can do with it. And what's the thing that you can do with a data type value? You can pattern match on it. If we pull on this thread and keep taking seriously the idea that to be a value of a data type is to be able to be pattern matched on, then we end up with quite a nice encoding. So let's actually do that; let's go through in a bit more detail how we get there. Let's start with a simple pattern match. We have some value m of type Maybe a, in Haskell parlance, and we're going to pattern match on it. In the case where it's a Just, we're going to apply some function f to the thing inside it. In the case where it's Nothing, we're going to use some value g.
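As a concrete sketch of the pattern match just described, in Haskell (the particular choices of m, f, and g are mine, purely for illustration):

```haskell
-- The simple pattern match from the talk: apply some function f to the
-- contents in the Just case, fall back to some value g in the Nothing
-- case. The concrete m, f, and g below are illustrative choices.
m :: Maybe Int
m = Just 41

f :: Int -> Int
f x = x + 1

g :: Int
g = 0

result :: Int
result = case m of
  Just x  -> f x
  Nothing -> g
```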
And if you think about it for a bit, you can convince yourself that really any pattern match on a Maybe value is going to have to look something like this, for some suitable choice of f and g. And then we can just start abstracting. We can pull out the scrutinee, which is compiler jargon for the thing that you're pattern matching on. So we take some argument of type Maybe a. And then, well, we also need to be polymorphic over this a, so we're going to use System F's big lambda, which lets us bind type variables. Now we have some polymorphic function that takes a Maybe a. So far so good. Take the next step: pull out the implementations of the case branches, this f and g. What do they look like? f has got to take a thing of the type that's inside and give you some result type. g has got to, well, just be of that result type. And they've got to agree, because your case expression needs to have a single type, which is r in this case. And again, we want to be polymorphic over that, so we'll bind that as well.

At this point, we look at this and think: this looks very familiar. And it's really just an old friend. This is maybe from the Haskell Prelude, which has been around for a very long time. The arguments are in a different order, and it uses b instead of r, but it's the same function. So this is not an unusual way of thinking about data values. We're essentially saying: let's think about their destructors, the things that allow us to destroy them, get inside, and do something with what's inside. And so we might then think: let's go a bit further. What if that just was the type? So let's try to define Maybe as a type. It's a parameterized type, so we're going to take some a of kind star as the type parameter. And what's the type of that pattern matching function? Well, it was polymorphic over the result type, so we're going to have a forall here. And then, well, it took two functions to handle the two cases.
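The function these abstraction steps arrive at might be sketched in Haskell like this (the name match is mine; RankNTypes stands in for System F's explicit polymorphism):

```haskell
{-# LANGUAGE RankNTypes #-}

-- The pattern match with the scrutinee and both branches pulled out,
-- polymorphic over the element type a and the result type r.
match :: forall a r. Maybe a -> (a -> r) -> r -> r
match m f g = case m of
  Just x  -> f x
  Nothing -> g

-- Prelude's  maybe :: b -> (a -> b) -> Maybe a -> b  is the same
-- function with the arguments in a different order (and b for r).
```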
The Just branch is a to r, and the Nothing branch is r. And then at the end, what you get, your prize, is actually a value of that result type. OK, that seems to be going well. We keep pulling on the threads. What's next? Well, we need constructors. But actually, all of our choices are forced here; it just sort of comes out. You've got to, again, be polymorphic over a, and take an argument of type a, because, well, it's a constructor for Just: it takes an argument. You've got to bind the result type in order to match the type signature, and take the two branches in order to match the type signature. And then, well, we've got the Just case branch and an argument of type a, and there's really nothing to do except apply one to the other. And that does it.

If you look at this, you realize that what you've done is invert the control of pattern matching. Rather than sticking a tag into the piece of data for the consumer to look at and decide what to do, you hand both of your alternatives to the piece of data and let it choose for you. And how does it know how to choose? Well, when you constructed it, you told it how to choose. And this then does what you expect. There is no pattern matching anymore; there's just function application, because a data type value simply is a pattern matcher. So, using this curly-bracket syntax to indicate instantiating our polymorphic functions, we construct a just 1, we pattern match on it instantiated at integer, applying plus one. And I'm just going to tell you that that evaluates to the right thing; you can check it yourself with pen and paper if you don't believe me. So this is fine. This is great. We can do standard Haskell 98 data types with this. Wonderful. Everybody's happy. But what about recursive types? We really, really want recursive types. I don't know if you've ever tried programming without lists, but it sucks; you can't do anything.
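Carrying on in Haskell (the names MaybeS, justS, and nothingS are mine for the sketch; in System F omega the curly-bracket type instantiations are explicit, whereas GHC infers them):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Scott-encoded Maybe: a value simply *is* its own pattern matcher.
type MaybeS a = forall r. (a -> r) -> r -> r

-- The constructors: every choice is forced by the type.
justS :: a -> MaybeS a
justS x = \f g -> f x   -- hand the contents to the Just branch

nothingS :: MaybeS a
nothingS = \f g -> g    -- nothing to do but pick the Nothing branch

-- Pattern matching is now just function application: pass the two
-- branches directly to the value.
example :: Int
example = justS (1 :: Int) (+ 1) 0
```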
Anything interesting, where you've got a variable amount of data, is really not great without them. So we want to do that. But we can't just crank the handle. As Phil said, he had his Nat example; we have exactly the same thing. If you just write out your definition, well, bugger: you've used the thing that you're defining. You can't just do that. You have to have some way of tying the knot, some kind of recursion or fixed point combinator at the type level. Otherwise this is just defining some infinite thing that's not going to make you happy. Those of you who've read Types and Programming Languages are probably checking your watches and wondering why I'm wasting your time. Can't we just add a fixed point combinator to the type system, and then everything is fine? Define your types as fixed points. If we look at the Nat example, you just take your fixed point and feed it this type-level function, what we often call the pattern functor, which takes the recursively defined type as an argument and defines the type in terms of it. And then the fixed point unfolds that to infinity. And indeed, this is the approach we're going to take, or a similar approach. You have to add a little bit of complexity to handle it. If you recall Phil's slide, there were these mysterious wrap and unwrap terms in the term grammar, and these just witness the isomorphism between a fixed point and its one-step unrolling. If you remember this perhaps familiar Haskell data type, Fix, you have a constructor that takes you from an f (Fix f) into a Fix f, and pattern matching on it is unwrapping. But we can't use a recursive data type to define recursive data types, because we don't have them; that's the whole problem. So instead we have terms that fulfil this function for us. And again, so far, all fairly standard. In fact, we don't use this fixed point combinator; we use a variant of it, which is called iFix, short for indexed fix.
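The knot-tying just described might be sketched in Haskell as follows (my transliteration: Fix with its Wrap/unwrap pair mirrors the wrap and unwrap terms, NatF is the pattern functor for naturals, and IFix at the end is a Haskell rendering of the indexed variant, with lists as an example of what the extra index buys):

```haskell
{-# LANGUAGE PolyKinds, KindSignatures #-}

-- The familiar fixed point: Wrap takes you from f (Fix f) to Fix f,
-- and pattern matching on it (unwrap) goes back the other way.
newtype Fix f = Wrap { unwrap :: f (Fix f) }

-- The pattern functor for Nat: the recursive position becomes the
-- parameter r, so the definition no longer mentions itself.
data NatF r = ZeroF | SucF r

type Nat = Fix NatF

zero :: Nat
zero = Wrap ZeroF

suc :: Nat -> Nat
suc n = Wrap (SucF n)

-- Unwrap one layer at a time to count.
toInt :: Nat -> Int
toInt n = case unwrap n of
  ZeroF  -> 0
  SucF m -> 1 + toInt m

-- The indexed variant: the pattern functor f also receives one index
-- of an arbitrary kind k, so IFix f has kind k -> *.
newtype IFix f (a :: k) = IWrap { iUnwrap :: f (IFix f) a }

-- Lists via IFix, with k = *: here the index is the element type.
data ListF rec a = NilF | ConsF a (rec a)

type List = IFix ListF

nil :: List a
nil = IWrap NilF

cons :: a -> List a -> List a
cons x xs = IWrap (ConsF x xs)

toList :: List a -> [a]
toList l = case iUnwrap l of
  NilF       -> []
  ConsF x xs -> x : toList xs
```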
And this starts to answer one of the questions you can have about a fixed point combinator: what is its kind? What sort of things are you allowed to pass in as these pattern functors? The one here handles star to star. That's useful, that's nice, you can do things; you can do natural numbers with that. But you might want something a bit more complicated, and we'll need a bit of extra power in order to do mutually recursive data types, which we're going to come to. It also lets you define things like lists nicely, and do irregular data types. But it turns out that we can do enough with this variant. What this variant does is that instead of taking something of kind star, it takes something of kind k to star, for any k. So you can handle things which take one type argument of arbitrary kind, but that's it. And it turns out that this is enough to do everything we care about while still being fairly tractable from the theoretical point of view. So that's the basic story, and I'm now going to hand over to Roman, who's going to show you in a little more detail how we get to the encoding of mutually recursive data types using this machinery.

We know how to encode simple data types now. But we needed to encode something more complicated: mutually recursive data types, and also irregular data types. I won't touch the latter here, but we will see how to encode mutually recursive data types. When we stumbled upon this problem, we searched the literature, but we were unable to find a solution. There were solutions that required either extending our language, or that had some really bad complexity, factorial or something. So we needed to invent something. We will use this example: the tree-forest family. A tree has only one constructor, node; a node carries an a and a forest a. A forest a is essentially a list of trees. As a list, it has a constructor for the empty forest,
and a constructor that allows you to add a tree to a forest. That's an example of a tree. We will use this simple family and improve it in a few steps, in order to arrive at a representation that is System F omega compatible. There is a really well-known trick. It goes back to at least 1999, when Conor McBride's thesis was published. Let me read it: "Mutual definition can always be represented as a single inductive family of data types indexed by a finite type whose elements label the branches. We might define a family Parity : 2 -> Type, with Parity true containing the even numbers and Parity false the odd numbers."

Applying this trick to our case, we get the following. There is one huge TreeForest data type. It encodes both the tree and forest data types that we had previously. Here are all the constructors from the previous family. They have literally the same types, but now forests and trees are instances of the same data type. TreeForest is indexed by a TreeForestTag, which determines whether a TreeForest is actually a tree or a forest. So this is a well-known trick, and it allows us to get rid of mutual recursion. We still have this mutual keyword, but it's only there for convenience and readability: those two definitions can just be inlined into the types of the constructors; they are not necessary. So what we've achieved here is that we've got rid of mutual recursion.

This is the next step. We have the same TreeForest data type, but now it only has one constructor, treeForest. It captures the notion of recursion, but the actual contents of the constructor are determined by the TreeForestF type-level function. This type-level function matches on the tag in order to figure out whether we're constructing a tree or a forest. Since a tree has only one constructor, node, which carries an a and a forest a, we encode it as follows. This is the type of that constructor.
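McBride's trick, as applied to the tree-forest family, might look like this in Haskell with DataKinds and GADTs (my transliteration; the names are assumed, not from the talk's slides):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- The finite type whose elements label the branches of the family.
data TreeForestTag = TreeTag | ForestTag

-- One single data type indexed by the tag: the constructors keep
-- literally the same types, but trees and forests are now instances
-- of the same family.
data TreeForest (tag :: TreeForestTag) a where
  Node :: a -> TreeForest 'ForestTag a -> TreeForest 'TreeTag a
  Nil  :: TreeForest 'ForestTag a
  Cons :: TreeForest 'TreeTag a -> TreeForest 'ForestTag a
       -> TreeForest 'ForestTag a

-- The mutual definitions survive only as convenience aliases.
type Tree a   = TreeForest 'TreeTag a
type Forest a = TreeForest 'ForestTag a

-- Counting nodes, just to exercise the family.
sizeT :: Tree a -> Int
sizeT (Node _ f) = 1 + sizeF f

sizeF :: Forest a -> Int
sizeF Nil        = 0
sizeF (Cons t f) = sizeT t + sizeF f
```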
Forests have two constructors: nil, which does not carry any information, and cons, which carries a tree and a forest. This is known as a sum-of-products representation; it's also well known. The type aliases retain the same structure as before. So what we've achieved here is that we've separated recursion from the actual contents of the data type. And once recursion is separated, it can be abstracted. That's our central primitive, iFix, which Michael has shown you. We can compare it to the simple Fix that is very common; you'll find it in Haskell libraries, in Data.Fix or somewhere like that. With that Fix, you can easily tie the knot in simple data types, like natural numbers. But it doesn't quite work when you have complicated data types, like mutually recursive data types. Because in the Nat example there is only one recursive case, Nat; but when we encode tree and forest, we have two recursive cases, tree and forest. Yes, they are defined in terms of the same data type, but they are still distinct: they are indexed by different tags. And so we need to somehow push the tags in. And that's how we do it: we have this index x, and we parameterize the pattern functor not only by the recursive case, but also by this x, which f can use in order to determine what to do with the actual recursion.

Applying this to our case, we get the following. It's all basically the same as we had previously. We have the type aliases as before; they are just local now, although we have global ones as well. We have tags, and we match on tags in order to figure out what to store in the constructors. All of that stays the same. The only thing that changes is that we now use iFix to handle recursion for us. And note that we do not have the data keyword anywhere here: recursion is now handled by iFix, and we no longer need a special data construction mechanism. So if we have this iFix primitive, we don't need anything else; we can encode mutual recursion using just it. But what about System F omega?
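Putting the pieces together in Haskell (again my transliteration, with a Haskell analogue of iFix declared so the sketch is self-contained): the pattern functor TreeForestF refers to the family only through its rec parameter, and the indexed fixed point pushes the tags back in, giving both types from one knot:

```haskell
{-# LANGUAGE DataKinds, GADTs, PolyKinds, KindSignatures #-}

-- A Haskell analogue of iFix: f receives the recursive occurrence
-- (of kind k -> *) and one index of kind k.
newtype IFix f (a :: k) = IWrap { iUnwrap :: f (IFix f) a }

data TreeForestTag = TreeTag | ForestTag

-- The pattern functor: recursion only via rec, contents chosen by the
-- tag index. There is no self-reference anywhere.
data TreeForestF a (rec :: TreeForestTag -> *) (tag :: TreeForestTag) where
  NodeF :: a -> rec 'ForestTag -> TreeForestF a rec 'TreeTag
  NilF  :: TreeForestF a rec 'ForestTag
  ConsF :: rec 'TreeTag -> rec 'ForestTag -> TreeForestF a rec 'ForestTag

-- Both actual data types fall out by instantiating the tag.
type Tree a   = IFix (TreeForestF a) 'TreeTag
type Forest a = IFix (TreeForestF a) 'ForestTag

node :: a -> Forest a -> Tree a
node x f = IWrap (NodeF x f)

nil :: Forest a
nil = IWrap NilF

cons :: Tree a -> Forest a -> Forest a
cons t f = IWrap (ConsF t f)

sizeT :: Tree a -> Int
sizeT t = case iUnwrap t of
  NodeF _ f -> 1 + sizeF f

sizeF :: Forest a -> Int
sizeF f = case iUnwrap f of
  NilF      -> 0
  ConsF t g -> sizeT t + sizeF g
```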
We don't have data types at the type level in System F omega; we don't have pattern matching there. We need to somehow replace these things marked in red. Well, we can just inline the pattern matching. If you want data, just encode it; that's what we've been doing this entire presentation. So a tag is something that takes two types and returns one of them: the tree tag returns the first, and the forest tag returns the second. And here we just apply the tag to the encoded version of the constructor of tree, which is node, and to the constructors of forest, which are nil and cons. This iFix is well known too; you can find it on Hackage in some libraries, and similar tricks are used in the type theory field. But this simple thing is what we were unable to find in the literature; that's just something we came up with.

And now, a very fair and honest encoding in actual System F omega. No jokes, no Agda. We have the tag from the previous slide. It's literally the same: a function that takes two types and returns one of them. This is the unified pattern functor of the tree-forest family. Depending on how you instantiate the tag, you get either the pattern functor of tree or the pattern functor of forest. We have encoded here the product and sum types that we had on the previous slide, these ones; we don't have data in System F omega, so we need to encode them. Having this unified pattern functor, we can define the unified data type just as before. And then we get the actual data types by instantiating the tags as appropriate. And these are the constructors for the entire family. This is the usual Scott encoding; nothing interesting here. The only thing that is really specific to this case is how we handle wrap. Wrap always receives the same pattern functor, regardless of whether we're constructing a tree or a forest, because they're encoded in terms of the same data type. But it also receives tags, and those are, of course, different, because that is how we distinguish between a tree and a forest.
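In System F omega the tags live at the type level, as functions that pick one of two types (there is no data there to match on). Haskell has no type-level lambdas, so as a sketch, here is the term-level analogue of that Church-style encoding (the names Tag, treeTag, and forestTag are mine):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A tag takes two alternatives and returns one of them; at the type
-- level the same shape is applied to the two pattern-functor bodies.
type Tag = forall r. r -> r -> r

treeTag :: Tag
treeTag t f = t    -- the tree tag picks the first alternative

forestTag :: Tag
forestTag t f = f  -- the forest tag picks the second
```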
We use tags. A node is the constructor of trees, hence it receives the tree tag. And nil and cons are constructors of forests; that's why they receive the forest tag. This is how it actually looks when we print it out in full. In the Plutus code base we have two syntaxes: one is the classic one, which is old, and the other is new and readable. This is the readable one. So when you get errors, this is the swearing that Michael was speaking of.