Hi, I'm Gershom Bazerman. I work at S&P Capital IQ. I've been a Haskeller for some years; I help with infrastructure around Hackage, and I'm on the Haskell.org committee. This is a really good turnout, and I'm glad, because I'm talking about stuff that is pretty difficult. We're going to go over it slowly, and I hope we'll be able to set a pace where both people who know a lot and people who don't will be able to get something out of it. It's a really beautiful body of work, and I just hope this is going to be enough to introduce people to it. Alright, how many people here have ever tried to read "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire"? How many people think that they really get what's going on in it? Yeah. And how many people can name more than three morphisms? Cata, hylo, zygo, histo, dyna, right? I like dynamorphisms; they sound like the lost Transformers. So we learn, oh wow, there are all these different recursion patterns. But why? Why do we want to classify them? It turns out a lot of it has to do with something that gets lost when you only look at one of these papers: the school of research on calculating programs, on program derivation and transformation. In these notes I've tried to cite not necessarily the most recent articles, but articles that take you back a bit into the history, like Meertens' "Algorithmics" from the mid-1980s. This wasn't really research about Haskell, and it wasn't only research about functional programming. It was research about programming in general, and the dream that you could specify an algorithm, "algorithmics", in a very uniform way, as opposed to imperative pseudocode with for loops. And when you did so, the idea was that you could manipulate these programs, they called it calculating programs, and calculate with them just as one calculates with any algebraic formula. You know, you're taught that x squared plus 3x plus this equals that, and then you expand it out and put it into a normal form, and so forth. The idea is that you can work with programs the same way. And it turns out that when you start doing this and talking about the laws, a lot of category theory comes up. So I started reviewing the textbook on this stuff, "The Algebra of Programming" by Bird and de Moor, a wonderful book. I had, for some reason, thought that it was simpler than the papers. But it turns out that when they got the room to stretch out, instead of taking twice as long to make everything more clear, what they decided to do was put in all the category theory. That was how they arrived at these results, but all the reviewers had told them, you can't put this in the papers. So they said, now it's in the book. I'm not going to try to give a fully category-theoretic presentation, which means I'm not going to be able to get to the deeper stuff in the book. In fact, I'm only going to have one running example and one shorter example, and that isn't even going to get all the way into the meaty stuff. And I figure that if we do that right, it should take the whole two hours.
But it could be that I really miscalculated and we'll get through it really quickly, or it could be too much. We'll see. So we do need some sort of motto. To quote a line that, if I recall, the mathematician Ronnie Brown quotes, and it's originally from a musician: practice till the difficult becomes easy, the easy becomes habit, and the habit becomes beautiful. We're just going to be practicing with some simple concepts, and we're going to try to derive at least one algorithm. How many people are familiar with the maximum segment sum problem? Okay, this is a really beautiful problem, and we'll get to it in a bit; we've got to do some setup first. It's an example where you have an algorithm that appears like it's got to be about O of n cubed, and there's a linear algorithm for it, discovered back in the 70s. It's now become the hello world of this school of program transformation, because there's a way to take the one and turn it into the other, proceeding entirely by mechanical steps, just as one would simplify an ordinary equation. And hopefully I'm going to show you, by the end of this talk, how to do that. So, yeah, here we go: equational reasoning requires equations. And we're going to level up our equational reasoning by looking at equations more powerful than simple substitutions. Because we can all do some basic equational reasoning in Haskell, right? If you have a function, head of a list or something, you just substitute in the definition of the function, just as one substitutes in the lambda calculus, and you expand it out, and you can show that one thing equals another. You do that effectively just as the computer does when it's running the program: by substituting and expanding, substituting and expanding, you can actually demonstrate facts about your programs that way. But that's long-hand, right? That almost amounts to our equational reasoning being the same as the computer's execution of the program. We'd like to have some higher-level laws, the same way we can say that times distributes over plus, once and for all, without doing brute equational reasoning on particular numbers. And so, in order to start talking about this, the books introduce some category theory concepts, if you want the full general theory. And they introduce these pattern functors, and there's going to be a talk about those tomorrow. How many people recognize what's going on with these few lines here? How many people do not? Okay, so we're going to have to go over this at least a little and talk about what's going on. The general notion is that there's a large class of potentially infinite or recursive data structures, like lists and so forth, and what we can do with them is talk about everything about them just one step at a time. So hopefully many people are familiar with the theorem that foldr, or not just foldr, that various folds can be thought of as the, quote, universal property of a list. Look at the type signature of foldr, and suppose you swap the arguments around so that the list argument comes first.
foldr with the swapped arguments, so [a] -> (a -> b -> b) -> b -> b rather than (a -> b -> b) -> b -> [a] -> b, is a function that, given a list of a's, is now a function that takes a function of a -> b -> b and a value of b, and returns a b. And it is a theorem, which we're going to get to, that if I have a list and I partially apply foldr to that list, what I have left is equivalent to the original list in every respect. I have neither gained nor lost any information by doing this. And the way to see it, which we're going to get to, is to put in cons as your a -> b -> b and the empty list as your b. Then the fold will give you back a list, something you can see is clearly the same as the list you started with. And you can go in the other direction. So what these pattern functors are is a very general-purpose way of taking many recursive structures and breaking them down into step-by-step pieces that capture what's going on. You could start with an even simpler functor, but let's do the list: it's more interesting, and actually easier to grasp in a way, and it's the only one we're really going to talk about. Here we're really just going to talk about lists, because there's enough that's interesting about them. And when you're talking about lists, you're talking about streams and operations on data, and so you're talking about online streaming algorithms of all sorts, which is basically what the modern internet is built on. Right? Things where you've got a firehose of data coming at you, and you don't want your structures to build up bigger and bigger over time. You can't wait until it's all done; you incrementally need to keep producing as much as you can if you're going to keep up. And those are all built on these sorts of algorithms. So it's okay that we focus on lists, even though when you work in full generality you can handle many other sorts of structures as well. So here we go. We say that the pattern functor, or the signature functor, or the base functor, forget the word for a moment, for lists is this: ListF is either Nil, or Cons of an a and an r. We're going to have a list of a's, so we have the a there, right? And you can just substitute Int in for the a if you like; think of the elements as Ints. So we have Nil, and then Cons of an Int and some r. And that r can be anything; I put r for "remainder" there. So that's not a recursive structure, clearly. It's either the unit, or it's a pair of things. In fact, you could just call it Either () (a, r). But the important thing about this either-unit-pair is that if you keep nesting these one inside the other, right, so if I have Cons of, say, 1, and then Cons of other things, and I'm bad at writing on the chalkboard, so I apologize, then somewhere at the end I'm going to have a Nil. So I've written the list. Now, what's the type of that thing? Well, it's going to be a ListF Int of a ListF Int of a ListF Int of a ListF Int. It's going to reflect every single one of these constructors as we go down. And this fixed point of the functor is a way of collapsing that down: I'm saying it's an arbitrary nesting of this that can go infinitely deep, or cut off at any point.
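For readers following along, a minimal sketch of what's being described; the names ListF, NilF, ConsF, and Fix are my stand-ins for whatever the slides use:

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- One step of a list: either the end, or one element plus a remainder r.
data ListF a r = NilF | ConsF a r
  deriving (Show, Functor)

-- Tying the knot: an arbitrarily deep nesting of ListF steps.
newtype Fix f = Fix (f (Fix f))

-- Without Fix, the type grows with every constructor:
--   ConsF 1 (ConsF 2 NilF) :: ListF Int (ListF Int (ListF a r))
-- With Fix, it collapses:
example :: Fix (ListF Int)
example = Fix (ConsF 1 (Fix (ConsF 2 (Fix NilF))))
```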
Now, we don't really need to get into the fixed-point stuff. It's just a way to think about what's going on here: recursive structures are built up one step of a constructor at a time. The important thing is that we have an isomorphism. Here I've written LAlg for what we're going to call a list algebra. In this formulation, a list algebra is just our two arguments to foldr: a function a -> b -> b, which we call the, quote, cons function, and a b. If you think of a fold, one way to think of a fold on a list is: I write out the list as 1 colon 2 colon 3 colon 4 colon empty list, and the fold just replaces everywhere you see a colon with a function, and replaces the empty list with the sort of unit value of the fold. So you can think of it as a syntactic substitution, or as an interpretation of an algebra. If you think of cons as an algebraic operation, you can think of it as: I've written a formula, and now I fix the meaning of my cons operation, a particular interpretation of what it means to adjoin an element. So now, we're only going to look at one step of a fold at a time. We have the list algebra, and those are the two arguments we're going to pass to a fold; given a list and a list algebra, clearly I can produce the resultant fold. Now we're going to write a one-step fold, and that's going to take a list algebra and not a normal Haskell list but just one step of this either-unit-or-pair. In the first case, if you give me a Nil, I'll give you the b, the unit value. And if you give me a Cons, I'll apply my operation to the pair. So you can say this is the universal way to get something out of this structure; this is the eliminator, just like maybe is the eliminator for Maybe, or either is the eliminator for Either. This is the eliminator for that ListF structure. You give me a ListF, then you tell me what to do in the Nil case and what to do in the Cons case, and now, in a sense, you've told me everything that can be known about this very simple structure. And so everything you can do to one of these ListF's is going to be captured in a list algebra. That's a lot simpler to see, because we're not looking at the recursive structure deeply nested. There aren't many things you can do with unit and pair, and so this captures all of them: what you do in the unit case, and what you do in the pair case. So where we had a pair of constructors, the unit constructor and the pair constructor, we now have the dual of that, an algebra: a pair of functions, one per constructor. And this one-step fold takes us in one direction: it takes a list algebra to, if you put the parentheses in, a function. Very often you'll see something in these textbooks, and that's why I'm going through this, where they write capital F a arrow a, and they say that's an algebra. What I'm illustrating is that that's the same isomorphism here: take our ListF to be what they denote as capital F, and you have an isomorphism where a list algebra is equivalent to any function, any old function you can possibly write, from ListF a b to b.
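A minimal sketch of that algebra-and-eliminator pair; again, LAlg, nilCase, consCase, and foldStep are my own names, not necessarily the repo's:

```haskell
-- A list algebra: the two arguments to foldr,
-- one interpretation per constructor of ListF.
data LAlg a b = LAlg { nilCase :: b, consCase :: a -> b -> b }

-- The one-step fold: the eliminator for ListF, just like 'maybe' or 'either'.
foldStep :: LAlg a b -> ListF a b -> b
foldStep (LAlg n _) NilF        = n
foldStep (LAlg _ c) (ConsF a b) = c a b
```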
And we can write the isomorphism in the other direction, with mkListAlg right there. If I'm given one of these functions that maps out of a list step, I can just wrap up its partial applications into a pair and recover my algebra. And so this is a very important isomorphism, because you can see code as data and data as code, right? This LAlg guy here is a pair of a value and a function, so you can think of it as a data structure. The other guy, ListF a b arrow b, is just a pure function. And what we have now is a very simple theorem: this pure function can be represented as a pair of other things, and you can go back and forth freely between them. And you can ask, why am I going through all this? We're going through all this because all of these identities that look very trivial, it turns out you only need a couple of non-trivial ones among them for a whole equational proof system to emerge. And that's what's going to happen. Now, I don't want to stop for an exercise. Do people feel that they'd like to take a minute and convince themselves that what we've written is an isomorphism, or shall we keep going? All right. So here we observe that our pattern functor is indeed a functor in the classical sense: if I have a list step with one thing in the remainder position, I can map a function over that remainder position. And here we've shown what I talked about earlier: I can apply the components of a list algebra as arguments to foldr and recover my normal fold on lists. And this little algebra here is precisely the one such that, if I fold with it, I recover the list right back. So this shows why foldr is universal in that sense. So, yeah, here's a simple warm-up exercise. I've shown you how to write, as a pair of functions, the algebra that gives you back the list. Can you do that for sum and length? If you know how to do it, look at your neighbor and see if your neighbor knows how to do it, and if they don't, talk to them about it, and take a minute to do this. And if you're convinced that you know how to do it, raise your hands and tell me to keep going, because it'll get dense fast. Audience: one possibly stupid question, how do I construct a ListF that I can use to test this? It's just a type, right: a value of it is either the nil or a pair. So I'll give away the answer now, unless anyone wants to jump in and tell me: what is the pair algebra for sum? Anyone? Pardon? Yes, there we go. Doug has explained that this is the sum algebra: for the empty list, you return a zero, and if you were going to cons something on, you substitute a plus for the cons. Okay. And the length algebra is going to be something like lambda x y -> y + 1. It throws away one argument, and every time you see a cons, you just add one. That's what the length algebra is going to look like.
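Sketches of the inverse direction and the warm-up answers, using the names from above:

```haskell
-- The inverse direction: any one-step function repackaged as an algebra.
mkListAlg :: (ListF a b -> b) -> LAlg a b
mkListAlg f = LAlg (f NilF) (\a b -> f (ConsF a b))

-- The warm-up answers:
sumAlg :: LAlg Int Int
sumAlg = LAlg 0 (+)

lenAlg :: LAlg a Int
lenAlg = LAlg 0 (\_ n -> n + 1)

-- Recovering ordinary folds from the algebras:
-- ghci> foldr (consCase sumAlg) (nilCase sumAlg) [1,2,3]  ==>  6
-- ghci> foldr (consCase lenAlg) (nilCase lenAlg) "abc"    ==>  3
```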
So now we get to the part that immediately gets fairly tricky. I'm going to present some things that, in a sense, you're either going to have to believe or we'd have to chase the category diagrams, and I was planning on just asking you to trust me. This is why I needed to introduce how these list folds and these pattern functors work, because now we have a non-trivial statement, which is another way of stating the universal property. And what the universal property says is: suppose we have some function h that can be written as the fold of some algebra. I've got some pair algebra; I've got some function h. I can write sum any way I want, but I know that sum can be written as the fold of that sum algebra. Now, h being expressible as the fold of some algebra is the same thing, and this is a non-trivial theorem, as saying that h composed after folding that entirely boring one-step algebra we saw, the one that's just cons and nil, is equal to the one-step fold of the algebra composed after mapping h over the functor. And that's a really non-trivial, strange property. And the best way to see it, and if we're going to go at the proper pace you're not going to have a lot of time to play with this, is that you can take this code, paste it into your REPL, and just start playing with it, calling these functions, and seeing what happens. Audience: is that little a bound in the theorem? That little a is up here; I've given it right here. It's your trivial algebra; I'm just reusing the notation. So lfa is that trivial algebra: it takes a list step, a Cons of an element onto a normal Haskell list, and glues it back together. You can view it as one step of a fold reduction of a list. And we've stated the universal property by writing both sides of the equality below: h composed with lfa, and the one-step fold of the algebra composed with mapping h. And this is just showing that they have the same type signature, right? In both cases, it's something that takes the list functor, your one-step deconstruction of a list of a's into the first element consed onto the rest, and takes that to a b. Here we've verified that these two things, which don't even obviously have the same type signature, do in fact have the same type signature. And if you work through the symbols, you'll find that they are in fact equivalent in terms of the operations they perform. One way to verify this: we've said that h is equivalent to the fold of an algebra, right? So we substitute in from up1; in up3 we just substitute the fold in for h, to see what happens. And then it becomes much more obvious why it's an identity, because you can trace things through. So this is something that you might want to work through a bit and convince yourself of. It's something that's either mysterious or trivial, depending on how you're wired. And the beauty of this appears when you write it as a diagram.
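In code, the universal property just stated looks roughly like this, a sketch using the definitions above (lfa is the talk's "trivial algebra"; the instantiation at sum is mine):

```haskell
-- The "entirely boring" initial algebra: one ListF step glued back onto a list.
lfa :: ListF a [a] -> [a]
lfa NilF         = []
lfa (ConsF x xs) = x : xs

-- Take h to be a fold (here: sum, via sumAlg). The universal property says
-- these two composites agree on every one-step input:
uplhs, uprhs :: ListF Int [Int] -> Int
uplhs = foldr (consCase sumAlg) (nilCase sumAlg) . lfa           -- h . lfa
uprhs = foldStep sumAlg . fmap (foldr (consCase sumAlg) (nilCase sumAlg))
-- ghci> uplhs (ConsF 1 [2,3])  ==>  6
-- ghci> uprhs (ConsF 1 [2,3])  ==>  6
```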
I'm not going to draw it for you, but you can maybe do it later if you want. You can extend that diagram, apply that identity, and then say: what if I don't just have something like up here, where I have h composed with the fold of a, but instead I introduce a third guy, a g? Now I can take that identity and apply it to what happens when I introduce that third guy. And I get an incredibly non-trivial theorem, which is the heart of these program transformations: fold fusion. The claim is: suppose I have some function h, and I compose it after the one-step interpretation with some algebra f. And suppose I have some other g that goes the other way around, where I map h over everything first and then take the one-step interpretation with g. If those two are the same, that's my precondition: I've set up this non-trivial, sort of commuting, identity between a g and an h. Then I get the property that h . fold f equals fold g. And you can apply this in both directions. Sometimes I have the fold of a single algebra g, but I know I can do better by pulling some of the work out of the fold. So I take my fold of g, demonstrate this equational identity, and figure out what the h that I can pull out must be, and the f, which is the remainder. So I can split a fold. And going the other way, I can fuse two things, if I want to do the work as I go. I guess it's hard to tell if I'm moving too fast or too slow. And here we've written the two sides, again just to verify this claim, which I took from the book. They wrote it in symbolic notation, and I didn't know if I could make sense of it. So what I did, and this is what I hope I can illustrate, is I turned it into code so that you can type check it. So we wrote the two sides of it here. And you'll note I pulled a trick that I want to show people, something you can learn from in terms of how you play with this stuff in the REPL. In both cases, I claim I have some ambient h, f, and g in the environment. But how do I even get those? What I do is I just write my functions ff1 and ff2, and they take these ambient guys as arguments. Now, if you actually take the type of ff1, we see a problem: there's a free type variable. Because we've taken an argument that we're not using in this function; we're only using it in the other. And similarly for the type of ff2: the other guy is free. So now it's very hard to read; I can't just ask GHCi what the type is and have it guide my intuition. But we can use this trick: take the type of ff1 `asTypeOf` ff2. That's going to force those two signatures to unify. And asTypeOf has been in the Prelude forever, precisely for such purposes. So now I've asked GHCi to do the unification, and I can see what the type must be all together, and I can substitute in better variable names if I like. And that's what I've shown down here. So now we observe that it's a function that takes a b to a c, then two list algebras, then a ListF of a and b, and gives you back a c. So you do in fact see the fusion: it takes your ListF a b to a c, somehow using the two different list algebras together, and you have two equivalent ways of doing so.
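A reconstruction of that REPL trick, under the naming assumptions above (ff1, ff2 are the talk's names; the written-out signature is what the asTypeOf unification would reveal):

```haskell
-- Fold fusion, as two sides you can type check. If the precondition
--    h . foldStep f  ==  foldStep g . fmap h
-- holds, then the conclusion follows:
--    h . foldr (consCase f) (nilCase f)  ==  foldr (consCase g) (nilCase g)
ff1, ff2 :: (b -> c) -> LAlg a b -> LAlg a c -> ListF a b -> c
ff1 h f _ = h . foldStep f
ff2 h _ g = foldStep g . fmap h
-- Leave the signatures off and each side has a free type variable; instead ask
--   ghci> :t ff1 `asTypeOf` ff2
-- and GHCi unifies the two inferred types, revealing the shared signature.
```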
Those two equivalent ways are computationally no longer doing the same work; they might take different amounts of time. So here's an example that's sort of silly, but it shows the principle, and it's a slightly non-trivial exercise if people want to work through it. Look at this law that I've given you right here. Take h to be show, f to be the sum algebra, and write the unique function g such that it obeys this law. So show . sum of a list can, I claim, be written purely as a single-pass fold over the list. It's a really inefficient version, purely for demonstration purposes. But you can imagine going the other way. What if you've accidentally written a fold over a list where, at every step, because the data is coming from somewhere dynamic, you're reading and showing the input back and forth? I haven't done that in Haskell, but I've certainly done it in a dynamically typed language, where you accidentally keep passing things through a JSON encoding or something. You can now apply this theorem and say: well, I can unfuse this, split out that to-JSON step to the end, and work in my nice numerical representation in the middle. That's an application of this theorem. Now, of course, one doesn't necessarily demonstrate this fold-fusion precondition explicitly in order to do that. But that's how it always goes: you learn the equational rules, and then you apply them intuitively as well as equationally. So I don't know if I'm going to make people do the exercise, but I want to demonstrate it. Here we have show . sum, as we discussed, and this gives you the parameters for the exercise. If you want to do it, now or later, you just have to fill in what this undefined is here. And you can see what it must be: something that reads in your integer, does the operation on it, and then shows it back out again. And now here are some trivial identities that are actually rather complicated to work through operationally. Again, it's very hard for me to gauge with the audience whether you just want me to keep talking or whether you want to try to verify that these identities are in fact identities. You can prove them with the fold laws I've given: you can substitute mechanically, which would take forever, or you can apply the fold fusion law and prove them. Maybe people want to just pop over to the REPL enough to get the type signatures of those two things, to determine for yourselves that the types are equivalent; that might be worth doing. Again, these things seem trivial, but then you start to put them together and you'll be surprised. One note: when you read the Bird and de Moor book, they don't have a function like mkListAlg; they always represent algebras directly as functions, so the notation here is a bit clumsy by comparison. This is all in the GitHub repository, sorry if that wasn't clear; it's hard to read at this terminal size, but you can go to the LambdaConf repo and follow along. We'll sit with this a bit and then move on.
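For reference, the answer to the show-sum exercise comes out to something like this sketch (spoiler ahead; skip it if you want to try the exercise first):

```haskell
-- The fused algebra for show . sum: at each step, read the running result
-- back in, add, and show it out again. Wasteful, but a single pass.
showSumAlg :: LAlg Int String
showSumAlg = LAlg (show 0) (\x s -> show (x + read s))

-- ghci> foldr (consCase showSumAlg) (nilCase showSumAlg) [1,2,3]  ==>  "6"
-- ghci> (show . sum) [1,2,3]                                      ==>  "6"
```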
Or, on the other hand, I might have misjudged the audience for the remainder of the talk. So, yeah. Yeah? Audience: ListF Integer Integer isn't actually recursive? It's not actually recursive. ListF is one step of unfolding the recursion. It's ListF a r, so it's not a list of Ints; it's just a pair. It's either nothing, or it's a pair of an Int and some remainder, and you don't know what the remainder is. Right, so in the exercise it's ListF Integer Integer. Audience: so that's one step, where the head is an Integer and the tail is an Integer? Yeah. Audience: so I'm a little confused about what that actually means. It's something that takes two integers when you interpret it; that's it. You're adding two numbers. It's one step of a recursion. Audience: so it's more helpful to think about it in terms of pairs instead of cons? Yeah, exactly. And the whole point of this, in a way, is that you can take things that you normally would have to think of wholemeal, in terms of all your conses, and think of them piecewise, because at each step you only have available those two arguments to your combiner. So if you look at it, this g just goes one step at a time. We'll just wrap up now on this part. The challenge is: you need to write a function that, given a pair of an integer and a string, gives you back a string, and that string should be the result of adding the integer and the integer represented by the string. So you add the first one to the read of the second, and then you show the result. And what you can observe is that the excess sequence of reads and shows going back and forth can be fused away into this single operation. All right, so now we're actually going to get to another non-trivial theorem. In the case of lists, it's very obvious, and many of us have done this by hand in the past, certainly without as much ado as we're going to go through here. The banana-split theorem is a generalization of the statement that if I have two folds that produce a pair of results, they may always be written as a single fold. The famous example of this: if I want to take the mean of a list, I have one fold to take the sum and another fold to take the length, and I take the sum over the length. But I can write that, in fact, in one pass. Everyone here can imagine how one might write that in one pass; probably most of you have done it at some point. OK, good. So that's universally true. You can do this for any two folds, not just sum and length. And the only part that makes the banana-split theorem interesting, which is not what we're going to do here, is that you can do it not just over this functor, but over any of these pattern functors. You can do it over trees or anything else you like. As long as you have two traversals, you can always put them together. Now, the interesting part about it, beyond that fact, is that it's our first example of true program calculation. It's something that can be proved entirely using identities, using the laws we've given. And that's what I'm showing you here: I have several functions, bs1 through bs6. They are all of the same type signature. They take two list algebras, and they take a ListF of a and a list of a, which as we discussed is interchangeable with a list of a, and they give you back a pair of b and c.
So they are all functions that give you ways to go from a list of a's to a pair of b and c. And all of these functions not only have the same type signature, but I claim that they are the exact same function; they do the exact same thing. Furthermore, between any two adjacent ones, there is a single application of a single law, or in some cases two applications of the same law, that lets you observe equationally that those two functions must be the same. And that's what it means for this to be program calculation. So in this case, we start off with the naive thing, which is a fold and (&&&), and (&&&) is imported from Control.Arrow. Now, if you substitute in the function arrow for the arrow there, because we're only using it in that limited context, (&&&) is something that takes a b -> c and a b -> c', and returns a function b -> (c, c'). So it's a way of forking one argument off into two functions and returning the pair of the results. And this is the most naive way you would do this: you just apply the one fold and the other fold, forking off from the same input. Here is a slightly less naive way to do it, which is the result of a mechanical law that we call split fusion, and you can see exactly what it does. If I apply one function to something and then fork the result off, and these are all written point-free, they like to do it point-free so that the laws look more equational, so here I've got a pipeline of operations: the second stage of the pipeline is this pair of folds, and the first stage is a single list fold. Now we say, well, I can distribute the list fold into the splitting. If I apply it to the argument once and then split the result off to different places, why don't I just send the argument to both places, and do the same work twice, once inside each half of the pair? And that's what we did. So it doesn't look like a simplification, but it lets us get somewhere else. By the properties of fold, fold c1 composed with the trivial algebra is equivalent to the one-step fold of c1 composed with mapping fold c1 over the functor. And you're going to say, what? We introduced that up somewhere above as an equational property; it's just the universal property of fold, this guy right here. We're just applying that identity. Now, again, it looks unnatural that you would want to apply that identity, and this is the difficulty with this stuff: the reason you know to apply that identity is that you're Richard Bird. I can't give you a better answer. But it's also one of the only identities you've got lying around, so you might as well try to apply it. Audience: so knowing which law to apply is one thing, but I'm still not seeing what our goal is with these transformations; what should we be trying to reach? Well, let's walk through this one and we'll see. OK. Because, again, this is a very nice proof, and the difficulty when you're learning these is that you learn the techniques, but you don't learn how someone came up with precisely this nice proof. You just have to see a lot of examples, just like with proofs by induction or anything else; after a while, you get the hang of the kinds of steps you want to make. So we're just going to walk through one. I'm not going to talk about how to find these in general, because I can't do that either.
We're going to talk through one of the nice ones that someone else came up with. So here we go. By the universal property of fold, we can apply that transformation. Now, we can apply something called split expansion. Here, what we've done is introduce, well, you can always add whatever you want over here: if you're splitting something into a pair of results, you can always apply whatever function you want on one side, as long as you then throw it away. This side was the map of fold c1 in the first half. And so now, it can be the map of fold c1 and anything else at all over here; it doesn't matter, because we're just going to take the first component of the result. So you can expand out whatever nonsense you want there, because you're going to throw it away by calling first on it. And here we do the same thing, except you're going to throw it away by calling second on it, and you expand out on the other side. But now what we've done is introduce this parallelism: we're folding c1 and we're folding c2, and we've arranged things so that they're now in the same form, folding c1 and folding c2 in both cases again. Audience: is it fair to say that we're slowly and laboriously commuting the fold from the end of this pipeline forward? That's probably true. That's a good way to think of the goal, just like a goal you learn in algebra is to get all the x's on the same side. Whatever you're doing, it should eventually get them there; maybe one has to come back over at the end, to push it back. But in this case, the goal is that we want to pull this fold out of the pairing operation, and you want to arrange everything so that it's in a position where you can do so. So now we apply functor splitting, where you have the property that map f . map g equals map (f . g). Normally we use that to fuse maps. But in this case, we've got a map (f . g), and for algebraic reasons we're going to go in the opposite direction and have the map f and the map g separately. And in this case, you can see why we want to do it: here we're mapping first-of-something and here we're mapping second-of-something, so they're not the same function. But if you split each map in two, then this inner function is exactly the same function as this one. So we've rearranged things to share more work again, by getting the parallel structure on both sides of the tuple. And now we apply backwards split fusion. And you get this guy that looks pretty awkward, where we have fold c1 . map first and fold c2 . map second, and that entire structure is composed with mapping the pairing of fold c1 and fold c2. Now, this is not any more efficient than where we started. But it's an interesting form, because the form is: take this entire first thing to be one of those functions we would call h, and then you have h composed with a map. And then you apply the universal property of lists, because you have a function composed with a map. And that tells you, and it's not constructive, it doesn't tell you what the result must be, but it tells you that if you have a function composed with a map in this way, you can write the entire thing as a fold. And that is indeed what we then do. And now here we've gone through and determined what that fold must be in this case. And this is what we started with, just the fold of the one and the fold of the other, and this is the result.
So we fold the list algebra built from fold c1 . map first and fold c2 . map second. And there you go. That is a fully mechanical proof. And I hope you've seen the point here: it's a laborious process, and you need a flash of insight to write the proof, but you don't need to make an ad hoc heuristic argument as to why these two things are the same. You know, by equational laws, that these two things must necessarily be the same. In the case of an algorithm like this, it's easy to see, examining the algorithm, why it does the same thing. When you get to more complicated algorithms, it's no longer obvious that these algorithms should work; they look like magic. And one way to make them look not like magic is to start with an algorithm that you know must be correct, because it reads just like the specification. Then you apply a series of steps to arrive at something that looks like it couldn't possibly be true, and then you know it must be, because each of these steps makes sense. It's a way to approach the result from both ends. So, quick question: can anyone tell me the obvious way to fuse two folds, just off the top of their head? In general, if I give you two algebras, what's the way of writing the single two-in-one list algebra, in our explicit pair representation? How would I just write the single list algebra that's the fusion of both of them? Remember, a list algebra has two components: it's got a nil and a function, a b and an a -> b -> b. Now I've got a pair of them, an a -> b -> b and another one, and I've got two nils. How do I smush them together? If I'm just writing this without thinking about any of this theory: what would the nil be? What would the function be? Audience: just make a pair of the two nils? Yeah, exactly. Your nil needs to be a pair of the two nils. And then every time you get an argument, you take the first component of the accumulator and apply your first function, and the second component and apply your second function. So you just turn it into a carrier of pairs all the way through. Which again, and this is the sort of thing where once you draw the category diagram, it must be the case, it sort of commutes, you can sum up in a nice categorical slogan: the product of folds is the fold of products, in this case. And in a sense, category theory and natural transformations are about coming up with slogans like that which are true: you swap two words, and you show that the operation, the product or whatever it is, carries across. And that's the slogan here, in the category you like. So that's a non-trivial theorem.
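A sketch of that pair-carrying fusion, with the classic one-pass mean as the usage example (bananaSplit is my name for it):

```haskell
-- The obvious way to fuse two folds: carry a pair of accumulators through.
bananaSplit :: LAlg a b -> LAlg a c -> LAlg a (b, c)
bananaSplit (LAlg n1 f1) (LAlg n2 f2) =
  LAlg (n1, n2) (\a (b, c) -> (f1 a b, f2 a c))

-- The classic example: sum and length in a single pass, then divide.
mean :: [Double] -> Double
mean xs = s / fromIntegral n
  where
    (s, n) = foldr (consCase alg) (nilCase alg) xs
    alg    = bananaSplit (LAlg 0 (+)) (LAlg (0 :: Int) (\_ k -> k + 1))
```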
Now we're going to do maximum segment sum. Oh boy. This took me a long time to work through, but it's a fun one. There are about six papers I cite in the notes that examine it; everyone likes to examine it, and some people have written more than one paper on it, because it's such a nice little problem. So I'll tell you a story without going into too much detail. There was a problem at a job I had where we spent a long time trying to come up with a very efficient algorithm for it. We figured out how to write it in one pass, but only by hand, not methodically. And much later, I realized that the problem we had was a variant of maximum segment sum. If I had known that, I would have known right away that it could be written linearly, and rather than having to invent the algorithm, which we did eventually figure out, I could have taken the algorithm from a textbook. So it's a problem that actually occurs in practice in a lot of applications in various ways, and where, at least for me, I didn't know that the efficient algorithm for it had been a textbook algorithm for a long time. So the first thing here is I've written concatList instead of just concat, in case you happen to be on GHC 7.10, where concat is generalized beyond lists; this pins it down. We're running a bit long; we're not going to hit maximum segment sum on schedule unless I move a bit faster, so I will. concatList is a natural transformation. So, mapping a function over a concatenation: if I'm mapping something over the concatenation of a bunch of lists, I can instead map the mapping over the list of lists and then concatenate afterwards; I can push these things back and forth. We're introducing now list homomorphisms. A list homomorphism is a function h from list of a to a, for some particular a, such that you have this marvelous property here: I can divide up my list arbitrarily into as many sub-lists as I feel like, apply my list homomorphism to all those sub-lists, and then apply my list homomorphism to the list of results. And that's the same as if I had just applied the list homomorphism to the concatenation of all the lists. So it's something that you can split up however you like, and it always gives you the same answer. And sum is, again, the classical example of this, right? But there are much more interesting list homomorphisms as well. They come up in many areas, and all the MapReduce stuff is a result of studying them. People talk about monoids a lot, right, with MapReduce and all these gadgets. That study actually started with list homomorphisms, because monoids arise as the carriers for list homomorphisms. And in fact, map and reduce come from a fundamental law of list homomorphisms here. We define our monoidal operation, on the carrier, as just: give me an x and a y, and I apply the list homomorphism to the two-element list of them. That gives me a pairwise operation, and that pairwise operation is going to be the monoid. Now I have the theorem: if I have such a pairwise operation, defined right here, then I can take my list homomorphism and split it into a map component, which takes every element and applies the list homomorphism to the singleton list of it, and then a fold, and here it's written as a foldr, but it can really be any sort of fold, including a parallel fold, that combines the injected elements with my monoid operation. So this is a splitting of a list homomorphism into two phases, a map phase and a fold phase. And this says that list homomorphisms can be computed by MapReduce. This is the theorem which gave rise to all the ideas of MapReduce. So that's sort of neat. And the thing about MapReduce is it doesn't tell you how to solve interesting problems; it tells you how to solve obvious problems.
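A sketch of that factorization (mapReduce is my name; the talk presumably has its own):

```haskell
-- A list homomorphism h :: [a] -> a satisfies  h . concat == h . map h
-- (sum is the classic instance: sum . concat == sum . map sum).
-- The induced monoid operation and the map/fold factorization:
mapReduce :: ([a] -> a) -> [a] -> a
mapReduce h = foldr op unit . map inject
  where
    inject x = h [x]     -- the map phase: inject each element
    op x y   = h [x, y]  -- the induced pairwise (monoid) operation
    unit     = h []      -- its unit
-- The foldr here could just as well be any balanced, parallel reduction.
-- ghci> mapReduce sum [1..10] == sum [1..10]  ==>  True
```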
The interesting part, which I will not fully get to here, is: how do I take things that do not appear to be list homomorphisms and turn them into list homomorphisms? How do I take algorithms that appear to be inherently sequential and turn them into algorithms that, in fact, have a parallel component that was hiding? We use all of our tricks here to manipulate them and expose that latent parallelism. And I don't think I'm going to get to that; the parallel prefix sum example I have is at the end. But at a minimum, this machinery is what we need to even understand MSS. And if you look at Data.Monoid, you'll find many list homomorphisms there, because every monoid induces a list homomorphism, via mconcat or the like. Now, we're going to talk about an interesting fusion problem, and we're actually going to prove it. We're going to define a function called the tails algebra. It's a pair algebra like the others, and this pair algebra is going to be the fold that yields the function tails. You know what the function tails does in Haskell? It's Sonic the Hedgehog's little pal. Sorry. There we go, that's tails, right? It gives me all the tails of a list. And we also have inits. These are interesting functions, right? And we're going to give you some theorems about them. This tails algebra is a fold that produces that. For the nil case, we give you back the list containing just the empty list. And then for each additional step, we cons the new element onto the front of the most recent tail, and we also preserve everything we had before. So at each step we see one more element: we take the head tail we had, which is the list of everything so far, cons the new element onto it, and keep all the old tails too. And so you can write tails as a single-pass fold. Now we have our function scanr. Audience: why are there two clauses in the where clause instead of one? Oh, I'm sorry, yeah, you're right, those could be combined. Good catch. Do we know what a scan is? We'll just look at one: that's a scan. So I've taken the list one through five, and built up, in this case with a scan from the right coming back, my running sums. We can do a scanl too, and it should look much more familiar as a scan: the triangular numbers. I've got 0, and then 1, and then 1 plus 2, then 1 plus 2 plus 3, then 1 plus 2 plus 3 plus 4, and so forth. And so the claim is: tails is initial with regard to rightward scans, and this is just an exercise in using the word initial. If I have a list, and I apply tails to that list, and then I map some fold over the result, that gives me any sort of result that I would get from applying a scan. Tails plus map gives me scans. And that's how you can think of it: a scan, in a sense, builds up these successively longer pieces, the first element, then the first and second, then the first, second, and third, and in each of those cases crunches that list down. Now, it so happens a scan can do this more efficiently, by sharing work incrementally. But algebraically, I can think of a scan as a combination of tails and maps. And that's the theorem right there.
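A sketch of the tails algebra and the scans just shown (tailsAlg is my name for it):

```haskell
-- tails as a single-pass fold: cons the new element onto the most recent
-- tail, and keep everything we already had. (The head pattern is safe:
-- the accumulator always contains at least the empty tail.)
tailsAlg :: LAlg a [[a]]
tailsAlg = LAlg [[]] (\x tss@(ts : _) -> (x : ts) : tss)

tails' :: [a] -> [[a]]
tails' = foldr (consCase tailsAlg) (nilCase tailsAlg)
-- ghci> tails' [1,2,3]      ==>  [[1,2,3],[2,3],[3],[]]  (== Data.List.tails)

-- The scans; tails fusion says  scanr f e == map (foldr f e) . tails
-- ghci> scanr (+) 0 [1..5]  ==>  [15,14,12,9,5,0]
-- ghci> scanl (+) 0 [1..5]  ==>  [0,1,3,6,10,15]   -- the triangular numbers
```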
And we use our fusion law to prove it. We take the fusion law, and we substitute in: map of the fold of the algebra for h, the tails algebra for the f, and scanr for the g. And we observe that the two functions forming the precondition are equal. And that's easy, because they're not recursive; you can just expand them out. And by fusion, then, we observe that scanr of an algebra equals map of the fold of the algebra, composed with tails. And so that's a full proof, if you walk it through, of what we call tails fusion. And we do the exact same thing again, which I'm not going to walk you through, with inits. In this case, note we've written a really inefficient inits, because we're writing it in terms of a right fold; it should really be done in terms of a left fold. I'm doing it this way because, for one thing, in the program derivations we pass through this form but don't use it to actually run; and for another thing, I didn't want to have to write the other version. We make the same argument again, and we obtain an inits fusion, which gives us the analogous law: scanl of some function and unit element equals mapping the fold of that function and unit element over inits. And so these are powerful new tools. And you can see that we took only that one fold fusion law, and all these others are derived consequences. This is the nice thing about an axiomatic system: you have a few axioms and many theorems, many of which seem surprising. And now let's describe maximum segment sum. We've done all the preliminaries. Here's some test data: a sequence of integers. It doesn't need to be integers; for maximum segment sum it needs to be things that you can add, it could be doubles, whatever you want. You can generalize more, but then they need to have an operation that gives you a maximum, so you need good ordering properties, and they need some sort of addition that interacts correctly with that ordering. So you need the right structure on it. Now we define a function, segments, which is concat . map inits . tails. Now, that's a lot of segments, right? Those are all the segments of 1 through 5: I've got 1; 1,2; 1,2,3; 1,2,3,4; 1,2,3,4,5; 2; 2,3; 2,3,4; 2,3,4,5; and so on. So this is every possible contiguous subsequence of a list. That's a lot more lists than we started with; in total size it's on the order of n cubed. And that's the most obvious way to write segments: you take all the tails, and you take all the inits of all the tails, and between those you must cover every possible contiguous subsequence exactly. Maximum segment sum is this problem: we take all the segments, we sum them up, and we take the maximum among all of those. That is: I give you a list, and I say, pick any contiguous subsequence of the list, and tell me the one that gives you the highest possible sum. And that's the sort of question you might really want to answer if you're doing some sort of data analysis. Now, the interesting thing about this, right, is that of course these lists must contain negative as well as positive numbers. What would happen if the list were all positive numbers? What's the algorithm for maximum segment sum then? Sum, yes, it's a very good algorithm.
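The specification as a sketch, with hypothetical test data standing in for the talk's (which I can't recover exactly beyond its minus 100s):

```haskell
import Data.List (inits, tails)

-- Every contiguous segment, then the best sum.
-- Obviously correct, and obviously expensive (on the order of n^3 work).
segments :: [a] -> [[a]]
segments = concat . map inits . tails

mssSpec :: [Int] -> Int
mssSpec = maximum . map sum . segments

testData :: [Int]   -- hypothetical stand-in for the talk's test data
testData = [2, 3, -100, 4, 5, -100, 1]
-- ghci> mssSpec testData  ==>  9   (the segment [4,5])
```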
So you have negative numbers as well, so it could be that I'm going along here and then it goes negative, so negative that it wipes out all the good work I've done building up, and I should just start over from scratch and look at the segment between the two negatives. And that's why I've given you this test data with minus 100s in it, so that we can be sure we're not just solving the degenerate problem. So we're going to go through the same exercise as with the banana-split theorem, except using a bunch of different rules, including the ones we just introduced. So, first off, you believe me that this is a very good specification of the problem, right? If I tell you this is maximum segment sum, you believe me. I mean, you can run it a few times and QuickCheck it to test it, but this is what you want to start with. And what we're going to do is take it to the efficient algorithm through mechanical steps, all of which are necessarily correct. The first thing we'll do is just inline the definition of segments, right? That's the equation we use there; very simple. Then we apply the naturality of concat that we started with. So we have map sum . concat, and we replace it with concat . map (map sum). Now, that, of course, must work. Next, we've introduced list homomorphisms and their law, right? Just like sum is a list homomorphism, maximum is a list homomorphism. So we apply that, and instead of maximum . concat . map (map sum), we have maximum . map maximum . map (map sum). So we've already fused away a whole concatenation step, because all we really need to do is map the maximum. And there's another categorical slogan for you: the maximum of the concatenation is the maximum of the maximums, something like that. Maybe not quite as pithy as we'd like, but that's the thing. And now we apply map fusion: we take map maximum, map (map sum), and map inits, and smush them into a single map. And now we have another function that looks almost as elegant as the first one, but slightly different: maximum, dot, map of something else, not sum anymore, dot tails. So what we've done is take some of the work from segments that was in that middle element and pull it into this intermediate thing. And that looks exciting, that we have something-dot-tails, because we had a tails fusion theorem, didn't we? If I have a map of a fold composed with tails, I know how to fuse that away. So that looks good. So we have a map of this weird guy; can we turn it into something that looks like we can apply tails fusion to? That's going to be our next subgoal. The way we're going to attempt to accomplish that subgoal is to name this little guy here, maximum . map sum . inits: we're going to name him mps, for maximum prefix sum. And we can apply a particular piece of reasoning there. We apply inits fusion, because we have maximum . map sum . inits; by our inits fusion law, that's maximum . scanl (+) 0, right? That's already looking a bit cleaner. You buy that? With me? Now we expand out maximum, so we have foldr max 0 . scanl (+) 0. Now we apply fold-scan fusion. Here you'll notice that I believe the following is correct.
I slightly generalized the statement of the fold-scan fusion law from the form in the book to a more general one, because I think that form is an application of a more general law. I think this is correct; I haven't proved it; exercise for the reader. In the meantime, if you don't believe me, it certainly holds in this one case at least, and you can confirm that directly. The more general statement is: I believe that if h and g are list homomorphisms and they distribute in the following sense, that g x (h y z) equals h (g x y) (g x z), then you can always apply this fusion rule. Now in our case, one of them is max and one of them is plus: h is max and g is plus, and indeed x plus the max of y and z equals the max of x plus y and x plus z. And when we say list homomorphism here, the unit for max in our setting is 0; I'm somewhat omitting the unit elements, because they clutter the statement. So yeah, I think this is a general statement: if you have this sort of distributivity between two list homomorphisms, you can always fuse them in such a fashion. You certainly can do so by direct reasoning in the case of max and sum. A note I'm going to make, by the way: when you have these guys like max-and-plus, they go under the general name of tropical algebras, or tropical semirings, or idempotent semirings. And they appear a lot in optimization problems, and you can take advantage of a lot of cool properties. The idempotency, which we're not directly making use of here, though you often can, is that the max of x and x is x. So you can iterate taking the max with something as many times as you want, and fuse that away. So yeah, max-plus and min-plus algebras are a real thing, and they show up here. So here we go. We just apply that little result, and now we have a single fold, which is what we were looking for. And in fact, we take this guy, and we just rename that little lambda expression zmax. And so now mps, which was, recall, maximum . map sum . inits, has now, by a series of correct steps, been turned into a single fold: foldr zmax 0. And zmax is: given two elements, I add them, that's it; and then, if the result happens to be less than 0, I replace it by 0. So it's sort of zero-saturated arithmetic. And you can justify that heuristically. The whole point is: if we ever dip below 0, we know it's a bad prefix, and we might as well throw it out and start over, because once you dip below 0, it's better to just chop off that whole segment. And that's what we've encoded here, and we stumbled onto it by applying these laws. We substitute that back in: maximum . map (foldr zmax 0) . tails. Now we apply our tails fusion, as we so desired. We have maximum . scanr zmax 0. Which is what we were looking for: the linear-time algorithm, arrived at by a series of appropriate derivations. Here's an exercise: observe that while zmax is derived from two list homomorphisms, zmax itself does not give a list homomorphism. And I went from running behind to running ahead, because that's not quite enough of an exercise; it's hard to time these things, I'm sorry. Do people want to maybe stop and look at this and poke at it a bit?
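Here's the end result as a sketch to poke at, with the derivation chain recorded in comments (using the segments, mssSpec, and testData sketched above):

```haskell
-- The end of the derivation: one linear pass.
zmax :: Int -> Int -> Int
zmax x y = 0 `max` (x + y)   -- zero-saturated add: a bad (negative) run resets

mss :: [Int] -> Int
mss = maximum . scanr zmax 0

-- The chain we just walked:
--   mss = maximum . map sum . segments           -- the spec, ~n^3
--       = maximum . map (foldr zmax 0) . tails   -- inits + fold-scan fusion
--       = maximum . scanr zmax 0                 -- tails fusion: linear
-- ghci> mss testData == mssSpec testData  ==>  True
-- (or QuickCheck it: \xs -> mss xs == mssSpec xs)
```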
If I were you, I'd want to fire up my REPL and poke at some of these things to see if I believed them. Maybe take a few minutes to convince yourself that these steps make sense. And if you're positive they make sense, see if your neighbors think they make sense, and if they don't, help them.

How do you know that the transformations you're making are improving your efficiency?

The question is: how do you know that the transformations are improving efficiency? It's a great question. There's an entire other branch of theory devoted to that question, called improvement theory. And the difficulty here is exactly the point: the entire point of this formalism is that you want to preserve meaning while changing efficiency. So, by design, you've abstracted efficiency concerns out of your equality relation — and therefore it cannot be the right framework to reason about efficiency, for precisely the reasons that make it the right framework to reason about the correctness of program transformations. Improvement theory attempts to provide another layer on top of these things, one that lets you observe when a transformation actually is an improvement. Oh — and there's a story to tell on that. There's a particular transform called worker/wrapper, which there have been a lot of papers on, and one of the fun things about them is that the titles all sound like a certain album, on purpose — you know, "work it, wrap it", and so forth. And there's a paper applying improvement theory to the worker/wrapper transform. It's very good; it addresses exactly this problem. And the title is "Worker/Wrapper/Makes It/Faster". Oh, my goodness.

Without answering the question of how we know it's going faster: is it fair to say that if you see things like inits and tails — since they're well studied, and you have laws for them lying around — then expecting to fuse them is a reasonable heuristic?

Yeah, I think that's definitely the case. Whenever you see things like inits and tails — and that's the point: you see enough of these things, and it's not like you're going to go through the mechanical derivation each time, but you start to get a sense. Just like many of us, if we see map f . map g, we know we should be smushing those, right? You start to accumulate all these other identities you have lying around, and when you see the shapes, you know you should be able to fuse them. Another point, speaking of which: the entire framework of list fusion in GHC is, of course, another small aspect of this work. Because the beauty of this stuff is that these are non-trivial transformations — they require insight, they require a person sitting down and thinking hard — but you can carve out subsets of them. You can teach a compiler some smaller, decidable subset, and that's again a case where you want to show that this subset of transformations always improves things, or at least never makes them worse. And that's precisely why we only have some of these rules in the compiler, and not all of them: it's a hard problem in general.

That's interesting, that you can automate a lot of these.
Would it be possible — because maybe some of them work in 90% of cases, but you don't want to put them in the compiler because sometimes they don't — would it be possible to have some kind of macro, even though I know that doesn't quite apply to Haskell, some optional mechanism, so you could opt in to a bigger subset of these transformations?

Yeah — that's more or less the rewrite rule system, which is a bit ugly, but it lets you do exactly that. It's what lets you opt in to switch from what's called foldr/build fusion to what's called stream fusion, precisely because stream fusion handles certain cases better than foldr/build does. There's been a lot of research on that. And one point I'll make: I think, with this in hand, you should be able to go look at foldr/build fusion — just look at the source of the Haskell Prelude and all the foldr/build stuff — and apply these principles and recognize what's going on fairly quickly. It's all an application of putting things into a form where we can apply the fold fusion law.

Is there any way to observe that in the Haskell — in the REPL, or in the source code?

In the source code. Here's what that looks like for filter; this is a worthwhile digression. What happens is that the compiler applies rules in phases: the first phase's rules, then the next phase's, and so forth. So, early on — that's the [~1] annotation — for all p and xs, filter p xs gets rewritten to a build of a foldr of this filterFB thing. And that is precisely the trick: I've taken my function, written it as an explicit fold, and applied it to a build, which is the translation of a list into its universal form. And filterFB has its own rule, so that two stacked filterFBs fuse into a single one. And then, at the end: if I've gotten all the way to the late phase — the [1] annotation — and I haven't managed to fuse the fold away, I rewrite it back to my efficient filter. So that's how the rewrite rules work. And the whole point is that somewhere in there — I'm not going to pull it up now — there's a rule that says: given things in the form of folds and builds, the foldr of a build applies the build's function directly to the fold's operators. So you can get a chain of folds and builds, and then fuse a whole bunch of them away in non-obvious ways. And that's the example: foldr/build fusion is one of these subsets of these program derivations that has been shown to at least never make things worse, and that's why we can bake it in. But it's not baked into the compiler, right? It's user-level — it's just baked into the Prelude as these rewrite rules. So you can write your own rewrite rules to perform your own fusion. And if you don't use program calculation, you can write rewrite rules that do things that actually aren't the same: you can rewrite fold to error. I wouldn't do that. The compiler won't stop you; it doesn't check this. The filter rules I was describing are sketched below.
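For reference, the filter rules look roughly like this in GHC's base library. I'm quoting from memory of GHC.List and GHC.Base, so treat this as a sketch and check the actual source for the authoritative version (build is exported from GHC.Exts):

    -- The fold-shaped worker that filter is rewritten into.
    filterFB :: (a -> b -> b) -> (a -> Bool) -> a -> b -> b
    filterFB c p x r
      | p x       = x `c` r
      | otherwise = r

    {-# RULES
    -- Early (the [~1] annotation): expose filter as build/foldr.
    "filter"     [~1] forall p xs. filter p xs = build (\c n -> foldr (filterFB c p) n xs)
    -- Late (the [1] annotation): nothing fused, so restore filter.
    "filterList" [1]  forall p.    foldr (filterFB (:) p) [] = filter p
    -- Two stacked filters fuse into one predicate.
    "filterFB" forall c p q. filterFB (filterFB c p) q = filterFB c (\x -> q x && p x)
      #-}

    -- And the rule that collapses the chains, from GHC.Base:
    --   "fold/build" forall k z (g :: forall b. (a -> b -> b) -> b -> b).
    --                foldr k z (build g) = g k z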
Was the argument that foldr/build fusion at least doesn't make things worse made with improvement theory? Is there an actual proof along those lines?

I don't know whether it was done with improvement theory — I just know it as a result. I can't tell you whether they used a general framework or did it as a one-off proof. If you're interested in fusion, Duncan Coutts' PhD dissertation is on stream fusion; it's now, I think, the best source.

So, we'll try to do parallel prefix sum. This morning I realized that the derivation doesn't get as far as I'd thought, but it's still very much fun to do. Okay? How many people know what parallel prefix sum is? One — okay. So this is a surprising result if you haven't seen it. We've seen our scan function; this problem is also known as parallel scan, right? Scanning (+) gives me 1, 1+2, 1+2+3, 1+2+3+4, and so on. What's the complexity of this algorithm? It's O(n), right? I've got to visit every item in the list, and to produce the last result I must have visited every item before it. So it's O(n), and it's obviously sequential. The claim is: if you give me as many processors as I want, with free communication between them, I can do this in O(log n). To me this is the first non-trivial example of a computation that appears innately sequential, but that, when you hammer it with the right tools, turns out to be very parallel and subject to parallel decomposition. Now, I did the wrong thing here: I walked through a simple list-based derivation, and it turns out the interesting derivations for this — the ones that really get you where you want — need trees. I give a couple of citations, but I didn't want to introduce a bunch of tree machinery, so we're not going to get all the way there. We'll get to the basic notion, and the thing we derive won't be O(log n); it'll only be slightly parallel. But it'll show you how you could be more parallel. It's a proof of concept.

So: we had our list homomorphism, and we're going to revisit it, right? h xs `op` h ys = h (xs ++ ys). That's the list homomorphism: it distributes over concatenation.

The combination of the translations is the translation of the combinations?

Yes, that's absolutely right — and op has to be monoidal. So, I've told you we're going to write scan as a list homomorphism, because that's where the parallelism comes from. We need to find a function scanOp that obeys the property that scanOp of two scans is the scan of the concatenation of the lists. Now that's sort of weird, because how can the two halves not depend on each other, right? I have this list and that list — bear with my gesturing at the slide — and the guy all the way at the end depends on everything else. So how can I have an operation that decomposes these guys in parallel, so I can build up my log-n-depth tree of reductions? Well, we know one thing: whatever scanOp does, if one of its arguments is the scan of an empty list, it had better just return the other argument — there's nothing else it could do, or it's not going to be monoidal, right? So we can write the obvious case. Obvious case is obvious. Now what do we do in the other case? Here we go: we take the laws of scan and observe this fact — if we scan with some z over x cons xs, that's the same as taking the operation of x with the head of the scan of xs, and consing that onto the scan of xs, the scan of z over the rest of the list. That's the spec, written out below.
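As a concrete target, here's that specification in Haskell. scanOp is a hypothetical name for the function being sought; the properties are restatements of what's said above.

    -- Goal: a combiner for already-scanned halves such that, for a
    -- monoidal op with unit z,
    --   scanr op z (xs ++ ys) == scanOp (scanr op z xs) (scanr op z ys)
    scanOpSpec :: (Int -> Int -> Int) -> Int
               -> ([Int] -> [Int] -> [Int]) -> [Int] -> [Int] -> Bool
    scanOpSpec op z scanOp xs ys =
      scanOp (scanr op z xs) (scanr op z ys) == scanr op z (xs ++ ys)

    -- The obvious case: scanr op z [] = [z], so we need
    --   scanOp [z] qs == qs
    -- The one-step law (scanr's cons case):
    --   scanr op z (x:xs) == (x `op` head q) : q
    --     where q = scanr op z xs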
So I've given you a one-step breakdown, right? What we're saying is: to know how to do it for x cons xs, it suffices to know how to do it for xs and then push on the one element of work, the x. And by the way — we're doing scans in this direction, scanr, because scanr coincides with foldr, and foldr is what we've been working with all along; you can flip everything around if you prefer. So if I know how to do it on this component, and I know how to do it on that component, I can push the x from x cons xs over into one side, solve that side recursively, and solve the other side. Sorry — I'm not great at describing this; it's one of those things where you just look at the code and see what it's saying to you. And — ah, no, that should be a ++ there: scanr of op and z over the concatenation, right? These are the small laws. We're just trying to write down things we know must be true — to discover, by observation, equational properties of scan that tell us how it operates in some compositional fashion. We don't know exactly how we're going to get where we want to go, but we know we need to produce properties of this shape if we're going to get there. So this is how you start to think about these problems. Well: if I apply the scan to a list produced by concatenating two lists xs and ys, then I know the stuff that happens at the end must be independent — no matter what I stick on the front, the end is always going to come out the same, right? Remember, we're building up backwards. So I've got at least one portion that's independent, and that's starting to tell us something interesting: my data dependencies aren't everywhere, right? Some portions of the data depend on other portions, and some don't. And now, from this, we conclude — now we make an additional assumption. So far we've put no assumptions on op: we could be scanning plus, we could be scanning divided-by. Here we're going to say: if op is associative, as plus is, then our scan function gets to piggyback on the associativity of op, and we can use that to derive a decomposition property, which is this. We scan the first part, the xs, and take the init of it — the init is just there to drop the trailing unit element that gets in the way; it's a fiddly detail. Then we take the head of the scan of the second part, the ys — the head of the scan summarizes everything in the ys. And we map op-ing with that summary over everything we computed for the xs: that gives us our new xs part. And meanwhile, the scan of op over the ys is completely untouched, because, as we've argued, it's independent of whatever happens in the other side of the list. Here's my rendering of it.
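Here is a reconstruction of that law and the divide-and-conquer scan it yields — a sketch, not the talk's slides. scanOp and pscanr are hypothetical names, and z is assumed to be the unit of the associative op:

    -- For associative op with unit z:
    --   scanr op z (xs ++ ys)
    --     == map (`op` head q) (init (scanr op z xs)) ++ q
    --     where q = scanr op z ys
    -- (init drops the left scan's trailing z; head q summarizes ys.)
    scanOp :: (a -> a -> a) -> [a] -> [a] -> [a]
    scanOp op ps qs = map (`op` head qs) (init ps) ++ qs

    -- Divide and conquer: scan the two halves (in parallel, in
    -- principle), then combine them with scanOp.
    pscanr :: (a -> a -> a) -> a -> [a] -> [a]
    pscanr _  z []  = [z]
    pscanr op z [x] = [x `op` z, z]
    pscanr op z xs  = scanOp op (pscanr op z ls) (pscanr op z rs)
      where (ls, rs) = splitAt (length xs `div` 2) xs

    -- e.g. pscanr (+) 0 [1,2,3,4] == scanr (+) 0 [1,2,3,4]
    --                             == [10,9,7,4,0]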
And now we have something that really starts to look right: we have a ++ on one side and a ++ on the other side, so we're starting to get toward the decompositional property we wanted to see. And in fact, now we can write it — and that's about as far as I got. There it is: by that property, this is a monoidal operation, so you can write scanr as a map-reduce using scanOp, and you can parallelize it. Now, like I said, this isn't O(log n), because we've made a huge simplification, right? That map of op over the left half — what happens when you combine the last two branches of your tree? This branch of the tree is going to be length n/2, and that one length n/2, so you still need to perform n/2 work just to map that operation. So you do save some time, but you're only saving roughly a constant factor, because you've still got an n/2 in there somewhere. Now, you can cheat and say: well, that's a map, so let's just do the map in parallel too. Yeah — I told you, infinite processors and free communication. And that will save you. It turns out there are ways to do this properly, and all the good ways, in which you don't have this problem, are two-pass algorithms. And, oh yes, there's a fun point: if you do this with just binary sums, what you end up with are the adder circuits you find inside chips. This algorithm is actually in your hardware: when you add binary numbers, this is what happened — you have your tree of carries and adders. That's how deep this stuff runs. It's also a very good first algorithm for a lot of CUDA things, and there they do these two-pass, upsweep-downsweep versions, and they avoid doing this extra work. But they do so with the two passes in a non-trivial way — you're not applying the same rewrite rules all the way down — because the first pass doesn't give you what you want; it gives you only half of what you want, and the second pass has to back-patch the missing pieces. So it's hard to do in this formalism. I cite some papers that I thought had good ways of doing it, but I couldn't think of a good way to present it, and so I've left you with a fairly bad algorithm and, I hope, some interesting ideas about how to think about these things. That's where the other citation, Gorlatch, shows up. Let me at least count the cost of what we've got.
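A rough cost count for the divide-and-conquer sketch above — my arithmetic, not the talk's:

    -- Counting the maps performed by scanOp, level by level:
    --   top merge:    1 map over ~n/2 elements
    --   next level:   2 maps over ~n/4 each  = n/2
    --   ...           every level costs ~n/2
    --   total work:   (n/2) * log2 n = O(n log n)
    -- With sequential maps, the critical path is n/2 + n/4 + ...
    -- = O(n): only a constant-factor saving over plain scanr.
    -- With parallel maps ("infinite processors"), the depth drops
    -- to O(log n), at the price of O(n log n) total work.
    -- The two-pass upsweep/downsweep algorithms get O(n) work and
    -- O(log n) depth, which is why the GPU versions use them.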
That's about all I've got. We've got about a half hour left, so we have some time to see how much we can exercise this. But thanks for your time.

You've got a bunch of references here for further reading. What do you recommend first?

What would I recommend first? I don't know — I think, if you only really do one thing, take the time to get the Algebra of Programming book and work through it, start to finish. The sum of these materials should be enough that you can then approach "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire". Gibbons has a bunch more papers that I didn't cite. Malcolm's PhD dissertation is great. And that's one of the reasons I like citing the earlier stuff: like all good research, by the time you get about twenty years down the line, they don't bother to tell you why they're doing it anymore — they assume you've read the last twenty years of research. So I cite the early stuff precisely because it gives the motivation more clearly; later on, they just assume everyone already knows it. And of course people don't. I think this is very underappreciated stuff.

Is there an alternative source for the Bird book, besides the expensive print copy?

Not that I know of. Maybe the publisher has print-on-demand; you can get an e-book. It's still pricey.

The jump from MPS1 to MPS2, where you expand maximum to foldr max 0 — doesn't that go wrong on lists of all negative numbers? Because maximum of [-10, -3] is -3, right? But foldr max 0 gives 0.

This is a zero-biased maximum — you're right, it's not the Haskell definition of maximum, which is slightly different; I glossed over that. And you're absolutely right that it goes back to the specification of the problem: if I give you a list of all negative numbers, the maximum prefix sum should be zero, definitionally, because I count the size-zero prefix. And what's the sum of an empty list? It's certainly zero. Thank you.

I'm new to this type of thing. What's the state of the art of this field? I assume this was super simplified for us today — where is the research being focused in this area?

The person who's done the most interesting things that I've liked lately is Shin-Cheng Mu — I took this version of the maximum segment sum derivation from his blog. He does a lot of stuff with program derivation via Galois connections, which is really neat. I think what happened to a lot of this field is that it moved from this style — from asking "how do we derive our programs" — to asking questions about categorical semantics and so on. Those are more researchy and abstractly interesting, but it's harder to take that work and turn it back into "how do I make my MapReduce go fast". And I actually don't know whether there's much research in that direction anymore. In a sense, I think this is sort of the state of the art: people aren't making a lot of big advances outside of the worker/wrapper stuff, at least from what I've seen — I might be overlooking things. Instead, it's like they've given us all these tools, and now it's just a question of convincing people that it's worth going to all the effort. And it is. I mean, I spent some days preparing these notes, and at the end of it I've shown you a result that was already known — but I do hope I've managed to show why it's beautiful, and why it rewards practice. Because just like you should do a few more exercises when you learn how to add numbers as a kid, to cement the concepts, when you learn things like this, some of it is getting the concepts and looking at pretty diagrams, and some of it is doing these exercises one after another until you internalize the intuition behind them.

Early on, with the Fix and unfix stuff, you were applying those fixed-point constructions to Haskell data types directly — is something lost there?

So the question is: is there a fidelity loss when you go from these fixed-point-of-a-functor things to the Haskell things? And the answer is the usual answer. If you reason in a total setting, there's no fidelity loss. If you don't reason in a total setting, there is a fidelity loss — in both lazy and strict languages, for different reasons — whenever you're in a partial setting. So all of these transformations are true up to partiality; they're true in a total setting.
And if you want to reason in a partial setting, there are a few ways to do it, and there's a lot of research on those various ways. Related work here is also, of course, theorems for free — and if you want the free theorems, well, they're a small subset of these theorems. You get what you pay for.

I guess that's about it — though there's one more interesting element I'll mention, because it's in the Bird book. They introduce a different categorical setting in the book, and the most interesting question, when you do this work seriously, is how you internalize notions like "the maximum" — optimization — which you don't get from purely substitutional reasoning. They move from the sort of category you would tend to use, where the objects are types and the morphisms are functions, to relational categories, where what you have between objects are relations, and they use the framework of logical relations for this. Because you want to be able to talk about entire classes of solutions, and then, for maximization problems, thin them by theorems down to the maximum of such solutions. That means you don't want to be working one arrow at a time, but with entire classes of arrows at once. And it turns out that sort of reasoning also applies much more nicely in a partial setting, so a lot of those results were about making the math work in partial settings. But you have to use these scary categorical gadgets called allegories — that's where that work lives. None of it is really a simple concept.

Thanks again, everyone. I'm happy to talk to people more about this stuff, and to suggest further reading.