Hey everyone, I hope you're having a great conference. My name is Nathan Faubion, and welcome to Taming the Stack in PureScript.

Let's face it: PureScript kind of has a call stack problem, and this is actually pretty common in functional languages. What it comes down to is that PureScript lacks loops. Most languages have some sort of looping construct, but PureScript only uses functional recursion. In most runtimes, recursive calls, really any function calls, take up additional stack space; recursive calls aren't special in that regard. Which means, consequently, if we loop too many times with recursion, we'll get a stack overflow.

A common way around this is something called tail call optimization. Tail call optimization is kind of a misnomer, I feel, because is it really an optimization if it's necessary to write correct programs in a functional language? But the gist of tail call optimization is that it turns first-order self-recursion in tail position into a JavaScript loop. I say JavaScript because that's our most common backend. First-order just means that we only call the function recursively in a non-higher-order context: it's not passed to a function, it's not captured under a lambda that someone else might invoke dynamically, or anything like that. Self-recursion means that we only call a single function, ourself, recursively. And tail position means the recursive call happens as the last thing to evaluate in the function: in evaluating the function, the last thing to do is call ourselves recursively. TCO is specifically a static transformation: it only looks at local information for a particular recursive binding, and it doesn't involve any sort of global control flow analysis or anything complicated.
It's really straightforward to implement. And just to bring it up, the general form of this is called tail call elimination. Other runtimes that are not JavaScript, like Erlang or Scheme, will support dynamic tail call elimination, which means that any dynamic call in tail position will be optimized to not take up additional stack space. Actually, Safari supports this for JavaScript, but no other JavaScript runtimes do, unfortunately.

So to get an idea of what we're doing, I just want to look at a very basic, quick example. This is a common list data type; this is exactly what is in the core libraries. Let's look at a function, sum: all this does is pull out integers and add them together. It has one recursive call to itself, but it's not in tail position. It calls itself, but after that call is done, it has to take the result and add x to it. So this would not be optimized by the compiler.

However, we can do a simple transformation. This is what's called a worker/wrapper transformation: we have a worker, go, that is recursive, and a wrapper that invokes the worker. In this worker we take an additional argument, which is an accumulator, and we keep that state, and we only call ourselves in tail position. Whereas before we invoked the recursive function and then did something with the result, in this worker/wrapper transformation we take the result, add it to our running accumulator, and then recurse after that. This gives us a tail position, and since this is also first-order, this will get compiled and optimized into a tight JavaScript loop.
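As a sketch of what this looks like, here are both versions in TypeScript (since the JavaScript backend is what TCO ultimately targets); the list encoding and names are illustrative, not the compiler's actual output:

```typescript
// A cons list, mirroring the List data type from the talk.
type List<A> = { tag: "Nil" } | { tag: "Cons"; head: A; tail: List<A> };

function nil<A>(): List<A> {
  return { tag: "Nil" };
}
function cons<A>(head: A, tail: List<A>): List<A> {
  return { tag: "Cons", head, tail };
}

// Non-tail recursion: after sumNaive(xs.tail) returns we still add xs.head,
// so every element holds a stack frame.
function sumNaive(xs: List<number>): number {
  return xs.tag === "Nil" ? 0 : xs.head + sumNaive(xs.tail);
}

// Worker/wrapper with an accumulator: the recursive call is the last thing
// evaluated, so TCO can compile it into the loop written out below.
function sum(xs: List<number>): number {
  let acc = 0;   // the worker's accumulator argument
  let cur = xs;  // the worker's list argument
  while (cur.tag === "Cons") {
    acc = acc + cur.head;
    cur = cur.tail;
  }
  return acc;
}
```

Both agree on small inputs, but only the loop form survives a very long list.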
It doesn't have to eat additional stack space or allocate anything.

One thing that's actually interesting is that any first-order, non-tail-position recursion can be mechanically transformed into tail recursion, into this kind of form that we're looking for. As long as it's first-order, any of that recursion can be rewritten into tail recursion. And you might say, okay, if it's mechanical, can the compiler do that? The answer is: lots of compilers do. PureScript does not, but these are actually the steps I'm going to take you through, the transformations that many compilers do automatically for you. It just oftentimes has a trade-off: depending on the transformation, it might require doing a lot of additional allocation on the heap, which may make your algorithm a little bit slower. So if you need performance, and you know you have bounded input that isn't particularly large, you can go for a recursive implementation that is going to be a lot faster. But if you have unbounded input, then it might be a good idea to go ahead and eat the heap cost, the allocation cost. It might use more space, and it might be a little bit slower for access because it has to do more dereferencing, but it will be stack safe.

Our example here that we're going to try to transform is a little bit more complicated. We're going to look at a data type that's really similar to our list data type, a binary tree. And this isn't really that much different.
It's just that we have an additional recursive node. What makes this relevant is that our recursion actually forks: we have to recurse down two separate branches. If this were a balanced binary tree, this wouldn't be that big of a deal, because it would be logarithmic in depth, and so you wouldn't really have to worry about the stack in that case. But we're going to assume that this is non-balanced, so we have no idea: this may be totally left-associated or totally right-associated. So we have to be kind of careful.

We're going to look at an implementation of map for this data type, and it's pretty straightforward; this is actually what the compiler will derive for you. Unfortunately, this is not stack safe, because we have two separate recursive calls, and neither of them is in tail position. So what do we need to do to turn this into a tail-recursive implementation that the compiler will optimize into a loop?

First, let's look at where our recursion happens. Our next step will be to move all of our function arguments into bindings. This isn't strictly necessary, but it makes it very clear what our evaluation order is, and it really tells us that this is obviously not in tail position, because we have to call these functions, get the result, and do something with it afterwards. So this is definitely not in tail position, and it makes it a little clearer what's happening.

Next: this is a big jump, but it's not too complicated.
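To make the jump concrete, here is a TypeScript sketch of both the derived map and its continuation-passing version; the tree encoding and names are illustrative, not the compiler's output:

```typescript
// A binary tree with values at the tips (an illustrative encoding of the
// talk's data type).
type Tree<A> =
  | { tag: "Tip"; value: A }
  | { tag: "Branch"; left: Tree<A>; right: Tree<A> };

const tip = <A>(value: A): Tree<A> => ({ tag: "Tip", value });
const branch = <A>(left: Tree<A>, right: Tree<A>): Tree<A> =>
  ({ tag: "Branch", left, right });

// The derived map: neither recursive call is in tail position, because the
// Branch still has to be rebuilt after both calls return. A deep unbalanced
// tree overflows the stack here.
function mapNaive<A, B>(f: (a: A) => B, t: Tree<A>): Tree<B> {
  return t.tag === "Tip"
    ? tip(f(t.value))
    : branch(mapNaive(f, t.left), mapNaive(f, t.right));
}

// The CPS version: every call is now syntactically in tail position, and
// wherever the old code returned, it hands the result to `cont` instead,
// an accumulator of "code left to run". (In plain JavaScript this is still
// not stack safe, since JS lacks dynamic tail call elimination; it is a
// stepping stone in the derivation.)
function mapCps<A, B>(f: (a: A) => B, tree: Tree<A>): Tree<B> {
  function go(t: Tree<A>, cont: (tb: Tree<B>) => Tree<B>): Tree<B> {
    return t.tag === "Tip"
      ? cont(tip(f(t.value)))
      : go(t.left, lhs =>             // finish the left subtree first…
          go(t.right, rhs =>          // …then the right…
            cont(branch(lhs, rhs)))); // …then rebuild the Branch.
  }
  return go(tree, tb => tb);          // identity continuation: nothing left to do
}

// Collect tip values left-to-right, for checking results.
function tips<A>(t: Tree<A>): A[] {
  return t.tag === "Tip" ? [t.value] : [...tips(t.left), ...tips(t.right)];
}
```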
I'm going to walk through it. We're going to convert this into what's called continuation passing style, which is just callbacks; it makes our evaluation order explicit. It's very similar to the worker/wrapper transformation that we did before with our sum implementation. We have a wrapper here, we have a worker go here, and we have this extra argument, cont, the continuation. Really, all this is doing is taking those bindings that we pulled out and flipping them around: where before we had the binding on the left-hand side and the call on the right-hand side, now our call is on the left-hand side and our binding is on the right-hand side. Because it's a lambda, and because of the way we format it, this will end up looking like stair-stepping, but it's really just another way to look at the flow of evaluation. And anywhere in our old implementation where we were doing nothing but returning a value, we replace that with a call to our continuation.

One way to actually look at this is that it's the same sort of transformation as the accumulator one, where this continuation is just an accumulator, and we're literally accumulating a program: we're literally accumulating code to run. So we're traversing this, we're doing this algorithm, and our accumulator is just a new program to run. Really, really interesting; really elegant.

Our next step then is to lift these callbacks into explicit named bindings. You'll see this a lot in this kind of stuff: put it in an explicit binding. This makes everything clear about what's actually happening: what are all the variables involved? So you can see we make our closures explicit.
We have to make sure we capture all of them and pass them to the continuations; this makes it very obvious what our dependencies are.

Our next step then is to take these closures and turn them into a data type. So here we have our identity continuation, which is what gets it started; we have our left-hand-side continuation, which goes down the left-hand side of the tree; and we have our right-hand-side continuation, which goes down the right-hand side of the tree. We're just going to turn that into a sum type, where each constructor holds all the values that the closure captured. Again, our ContId is just kind of like the nil value that tells us it's done.

So then what we're going to do is take all of those closure bindings and turn them into a single function, eval, that cases on this data type and executes the code that was in those closure bodies. If we go back here, we see ContLhs: we have this let binding, this go, ContRhs, and that finishes it. And it's doing the same thing here: we have ContLhs, the let and the new go call, and then ContRhs. Now, instead of invoking our continuation explicitly, where it was a function, we instead turn it into a call to eval. So we have eval here, and this is a typo: it says next here, but it should be cont. You'll see up here, eval cont: instead of calling cont with Tip, we just call eval with the cont and our return value.

So you'll notice one interesting thing: eval is always in tail position. Eval is in tail position here and here; go is in tail position here and here. All of our calls are in tail position. We have a first-order algorithm where everything is in tail position. The only problem now is that it's not self-recursive.
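Sketched in TypeScript, this defunctionalization step looks roughly like the following; the constructor names are illustrative, and note that go and evalCont are still mutually recursive rather than a single loop:

```typescript
type Tree<A> =
  | { tag: "Tip"; value: A }
  | { tag: "Branch"; left: Tree<A>; right: Tree<A> };

const tip = <A>(value: A): Tree<A> => ({ tag: "Tip", value });
const branch = <A>(left: Tree<A>, right: Tree<A>): Tree<A> =>
  ({ tag: "Branch", left, right });

// Each CPS closure becomes a constructor carrying exactly what it captured.
type Cont<A, B> =
  | { tag: "ContId" }                                     // the "nil": all done
  | { tag: "ContLhs"; right: Tree<A>; next: Cont<A, B> }  // right subtree still pending
  | { tag: "ContRhs"; left: Tree<B>; next: Cont<A, B> };  // left result saved, branch pending

function mapDefun<A, B>(f: (a: A) => B, tree: Tree<A>): Tree<B> {
  // go walks the tree, pushing continuations as data.
  function go(t: Tree<A>, cont: Cont<A, B>): Tree<B> {
    return t.tag === "Tip"
      ? evalCont(cont, tip(f(t.value)))
      : go(t.left, { tag: "ContLhs", right: t.right, next: cont });
  }
  // evalCont runs the code that used to live in the closure bodies.
  // go and evalCont only ever call each other in tail position, but they
  // are mutually recursive, so this is not yet a single self-recursive loop.
  function evalCont(cont: Cont<A, B>, val: Tree<B>): Tree<B> {
    switch (cont.tag) {
      case "ContId":
        return val;
      case "ContLhs":
        return go(cont.right, { tag: "ContRhs", left: val, next: cont.next });
      case "ContRhs":
        return evalCont(cont.next, branch(cont.left, val));
    }
  }
  return go(tree, { tag: "ContId" });
}

function tips<A>(t: Tree<A>): A[] {
  return t.tag === "Tip" ? [t.value] : [...tips(t.left), ...tips(t.right)];
}
```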
We have a mutually recursive set of bindings. So we're going to do a similar transformation to what we did before with our continuations: we're going to turn our mutually recursive go and eval calls into a data type as well, kind of like how we turned our continuations into a data type. We have our MapGo with its arguments, the function that we're mapping, the binary tree, and our accumulator; and then we have our eval with our return value and the accumulator. And we're going to turn these into a case. So instead of having our separate bindings, again, we move this into data types and a case. If we look at this now, all of our calls to go are in tail position, and we only have a single self-recursive loop. So this will turn into a nice, stack-safe implementation of this algorithm.

And this is all very mechanical: you can apply these transformations to essentially any recursive algorithm that satisfies the criterion of being first-order. So if that's all you want out of this talk, that's fine, you can leave it at that; that's all you need to know to write basic stack-safe PureScript algorithms. But I'd like to keep going. There are a few things I want to explore with this that I think are pretty interesting.

One is just that instead of using raw tail recursion, with references to go in our worker, you can use the tailRec function, which is in the standard library. It's the same idea as before, but instead we're calling tailRec, and tailRec makes our worker correct by construction: we can't accidentally use go wrong, and it will always be stack safe, because we have to return an explicit data type at the end of each iteration. So this makes it a little bit easier to keep things straightforward. You don't have to do this, I almost never use tailRec, but it is nice to get that sort of guarantee.
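The idea behind tailRec can be sketched in a few lines of TypeScript; this is a rough port of the concept, not PureScript's actual Control.Monad.Rec.Class API:

```typescript
// The iteration result: Loop carries the next accumulator, Done the answer.
type Step<A, B> = { tag: "Loop"; value: A } | { tag: "Done"; value: B };

// tailRec is correct by construction: the only way to keep going is to
// return an explicit Loop, so this driver is the one and only loop.
function tailRec<A, B>(step: (a: A) => Step<A, B>, a: A): B {
  let s = step(a);
  while (s.tag === "Loop") s = step(s.value);
  return s.value;
}

// Example: sum 1..n stack-safely, with the whole state in the Loop payload.
function sumTo(n: number): number {
  type State = { n: number; acc: number };
  return tailRec<State, number>(
    (s): Step<State, number> =>
      s.n === 0
        ? { tag: "Done", value: s.acc }
        : { tag: "Loop", value: { n: s.n - 1, acc: s.acc + s.n } },
    { n, acc: 0 }
  );
}
```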
It might be a little bit slower; that's the one downside.

Now I'm going to look at something here called a greatest fixed point. We take our worker and put it in its own function, and we treat it like a state machine: it takes a state as input and transitions it to a new state. So we've got a MapCall, and every time you invoke the transition function, it transitions to the next state in MapCall. All we've essentially done is remove the Loop and Done wrappers: we've taken the worker and removed the extra wrappers, you'll see here those are just missing, and we just map each state to a new state.

We can then take that loop that we had before, the wrapper, and define this eval function, which will evaluate a MapCall and turn it into a b. So we call our stepper, and then we case on our identity continuation: if we know that there's nothing left to evaluate in the continuation, then we can just go ahead and return the result. I think this is actually a mistake, these arguments should be flipped, but it's the same idea. And if it's not our identity accumulator, then we just continue stepping.

This corresponds to what's called small-step semantics in programming languages, and this is great, because we have a very clear evaluation state that we can start and stop whenever we want. This is great for actually writing things like debuggers: we've essentially defined our own little language just for this map call, and if we wanted to, we could use our stepper function here and step through every single evaluation step of this function. If you wrote a little language, you could use this approach to essentially make your own little debugger for it. So it's super useful.

Now, notice that we had to match on our MapEval ContId specifically.
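The small-step idea, a pure transition function over an explicit machine state that you can single-step or run to completion, can be sketched generically in TypeScript; the names and the toy machine here are illustrative:

```typescript
// A machine state either takes one more step or has produced an answer.
type Small<S, B> = { tag: "More"; state: S } | { tag: "Stop"; value: B };

// Run to completion: the "eval" wrapper that keeps stepping until Stop.
function run<S, B>(step: (s: S) => Small<S, B>, s0: S): B {
  let cur: Small<S, B> = { tag: "More", state: s0 };
  while (cur.tag === "More") cur = step(cur.state);
  return cur.value;
}

// Or stop wherever we like: record every intermediate state, which is the
// hook a debugger for our little language would use.
function trace<S, B>(step: (s: S) => Small<S, B>, s0: S): { states: S[]; result: B } {
  const states: S[] = [s0];
  let cur = step(s0);
  while (cur.tag === "More") {
    states.push(cur.state);
    cur = step(cur.state);
  }
  return { states, result: cur.value };
}

// A tiny example machine: repeatedly halve a number, stopping at 1.
const halve = (n: number): Small<number, string> =>
  n <= 1 ? { tag: "Stop", value: "done" }
         : { tag: "More", state: Math.floor(n / 2) };
```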
That's the only thing here that's specific to our example. So I want to look at what this Step data type is, with its Done and Loop constructors. Step here is Step a b: we've got Loop a or Done b, and this is essentially just Either with more specific names to make it clear what's happening. Originally tailRec actually used just Either, but it was very easy to get them mixed up, which one is it, Left or Right?, and so it was changed to use explicit names, which makes it a lot easier to use. But in essence, it's really just Either.

So we're going to transform it to Either: we're not going to use Loop and Done, or the Step data type; we're going to use Either, and we're actually going to flip the meanings around. Done becomes Left, which kind of makes sense: if you think of Either as sort of an error condition, or a way to halt or short-circuit the computation, then Left is kind of like "that's done", there's nothing left to compute. And Right becomes our Loop constructor. We're just flipping the meaning.

And then we're going to use the purescript-fixed-points library, the fixed-point functors. Specifically, the greatest fixed point is the Nu data type, and we're going to use our Either b functor. So Nu (Either b): it returns a b, and we can write this fix function that operates just like tailRec; it will evaluate any fixed point over Either b and return a b. And it's the same idea: Left, it'll just return that; Right, keep going.
So it's essentially what we were doing in our eval function. We can then take our step function and compose it with a termination condition; that's really what Left, the Loop-or-Done, is: just a way to communicate that we need to terminate. So we can compose it with that termination condition and get the same result down here, MapCall a b to b, to evaluate it.

So now that we've looked at what the greatest fixed point is, let's look at what happens if we try to swap it out for a least fixed point. The greatest fixed point, this Nu (Either b), corresponds to an existentially quantified a and a tuple that has our step function and the state. The existential quantification over a just means that we can't see it from the outside: if you look at all the types, this a never shows up in any signatures, so it's totally abstract. We have some abstract state, and we have some stepper function that takes that state and gives us either our result back, or a new state to keep looping with. So in order to evaluate it, we have to keep looping until we find our Left constructor, which tells us to terminate. So we just pull that apart: unwrapping Nu f, we've got a to f of a, paired with an a value, an abstract a value.

One thing that's actually really interesting about existential quantification, and why a lot of functional languages don't really support it, is that existentials can be eliminated through what's called a closure isomorphism. That just means, essentially, that any existential has an encoding as a closure.
It's kind of where you get the saying that objects correspond to existentials: objects are a poor man's closures, or closures are a poor man's objects. In functional languages you don't often have to have existentials; you can encode them through closures. So we're going to look at what that isomorphism means. We're going to look at a simpler example, not our fixed-point newtype, but this CanShow, which is kind of a common one: I want to somehow represent an abstract data type that supports a show function, something that can turn it into a string. So we're going to look at a type CanShow: if it were existentially quantified, it would be exists a, with a function a to String and that a value. This existential can be encoded as a closure with a rank-2 universally quantified eliminator, which is just a really complicated way of saying we're going to CPS it: we're going to turn this data type into continuation passing style. But in order to preserve the abstractness of a, we have to make sure the continuation that consumes these values is polymorphic in the a value. This rank-2 universal quantification just means that whatever implementation goes in here must treat that a abstractly. And really, what that means, and this is our example here, is that our consumer, this callback, has some abstract a, and there's almost nothing we can do with it.
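TypeScript can express the rank-2 closure encoding directly, so here is an illustrative sketch of it (the names are mine, not from any library):

```typescript
// "exists a. (a, a -> string)" encoded as its rank-2 CPS eliminator: any
// consumer k must be polymorphic in a, so a stays abstract.
type CanShow = <R>(k: <A>(value: A, show: (a: A) => string) => R) => R;

// Packing hides the concrete type: after this, nobody can see A again.
function mkCanShow<A>(value: A, show: (a: A) => string): CanShow {
  return k => k(value, show);
}

// The only thing you can ever do with a CanShow is produce the string,
// which is why it is equivalent to a deferred string.
function runShow(cs: CanShow): string {
  return cs((value, show) => show(value));
}
```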
We can't add it to something, for example, because all we have is this function paired with it that turns it into a string. So having that a and a function that turns it into a string means it's equivalent to just having a string, and really, because of laziness and all that, it's equivalent to having a deferred string, kind of a lazy string. And that's kind of why in Haskell you'll see existential quantification treated as an anti-pattern. It's not so much that it's an anti-pattern; it's just that Haskell is really good at universal quantification, and you can just encode it with universal quantification and it'll probably be a lot easier to use.

So we're going to try to apply this transformation to our Nu type: we've got a tuple of a to f of a, and an a. Before, with our existential a, we could essentially just turn it into a Unit type, because we can't do anything except apply this step function to it. But the problem when we do that with this type is this: whereas before, in this implementation, the a in the function only appears as an argument, in this function it also appears in the return type. So it occurs in both positive and negative position of this type, which means you literally can't write this. That's because you have to use a type-level quantification for recursion, what's usually called Mu, and there is a data type for it that we'll get to. In PureScript and Haskell, any data declaration, data or newtype (not type aliases or type synonyms), actually gets this implicitly applied, and that's what lets you write recursive data types. So in order to write this Mu quantification explicitly, that's where we get this data type Mu f; this is what creates that recursion.
We've created a recursive data type here that we can apply our f to. We can actually factor out that unit arrow, the Function Unit functor, and get a somewhat simpler Mu data type, which is what exists in purescript-fixed-points. And then we can get our sort of delayed Mu, which we'll need for evaluation-order purposes, we don't want to be too strict, by composing that f with the Function Unit functor. So our delayed Mu is Compose (Function Unit) f.

So now let's look at whether we can write a fix function for this, and it's actually pretty easy. We're going to use delayed Mu and the same Either b functor. You'll see here we unroll: we're importing Data.Functor.Mu and Data.Functor.Compose, and unroll just takes that Mu wrapper off, so we're left with a Compose. We take that wrapper off too, and then we have our thunk; we can call our thunk, get our Left or our Right value, and it's the same sort of thing: Left just returns the result, or we just recurse on it.

So we've written our fix, and it's actually really simple to swap this in. There are a few things here: one is just that I've replaced Loop and Done with Right and Left; otherwise, the logic here is exactly the same as the worker we had written before. Our wrapper is just a little bit different: we've got our new fix function here, and the worker just has to have this extra roll and Compose, and that's to make the types work out. Then we have our delayed computation here, and this implementation is just as stack safe as the other one: this will run in constant stack. We've just bundled everything up into these thunks, this Function Unit.

So, can we kind of undo this transformation?
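Operationally, the delayed fixed point over Either b boils down to a chain of thunks that fix forces one layer at a time; here is an illustrative TypeScript sketch of that idea:

```typescript
type Either<L, R> = { tag: "Left"; value: L } | { tag: "Right"; value: R };

// Mu (Compose (Function Unit) (Either b)), unrolled: a thunk that yields
// either a final b (Left) or the next delayed layer (Right).
type DelayedRec<B> = () => Either<B, DelayedRec<B>>;

// fix forces exactly one layer per loop iteration, so the JS stack never
// grows; without the thunk (the Function Unit), the whole chain would be
// built eagerly and blow the stack before fix ever ran.
function fix<B>(r: DelayedRec<B>): B {
  let cur = r;
  for (;;) {
    const e = cur(); // force one layer
    if (e.tag === "Left") return e.value;
    cur = e.value;
  }
}

// Example: sum 1..n as a chain of delayed layers. The recursive call to go
// sits under a thunk and is returned in Right, so go itself never recurses
// on the JS stack.
function sumTo(n: number): number {
  const go = (k: number, acc: number): DelayedRec<number> =>
    () => (k === 0
      ? { tag: "Left", value: acc }
      : { tag: "Right", value: go(k - 1, acc + k) });
  return fix(go(n, 0));
}
```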
So we did all of these steps, defined all these data types, turned it into a tail-recursive function, but you see this function now isn't even really, I mean, it's recursive, but it's not tail recursive, because we're returning Right. It's still stack safe because of the way we're computing: we're using fix here, and delayed computation. The delayedness is crucial: if you remove that Function Unit, this will not work, and it will just continue to spin and explode the stack. So the delay is absolutely crucial.

But we can start to undo these transformations. Going back now to our mutually recursive go and eval: same thing, this is just as stack safe as the other one; this will run in constant stack. Can we go further? We can keep going: we can remove our continuation data types and just use named continuations here, and again, this is still stack safe. We can go even further and turn it back into our original continuation passing style here, and again, this is still stack safe. Instead of just identity, though, we have to use this done, which ends up being what our identity is: roll, Compose, const, Left. It's kind of extreme, but this is still stack safe, and in some ways this is a lot nicer.

Can we go even further? We had that direct style where it was just bindings, and we can, but first I want to look at what this data type is that we're working with. If we take this type alias, we're going to call it Rec, rather, and it's Nu (Compose (Function Unit) (Either b)). This is a very big, fancy-looking type. Let's inline it and get rid of all of the functoriness for now, and just turn it into a pretty straightforward recursive definition: we've got Rec b is equal to Rec (Unit -> Either b (Rec b)).
We've got Unit -> Either b (Rec b): so this is where we get that recursion; that's why we have Nu here, this recursive data type that recurses here. And I'm going to do kind of an interesting transformation: I'm going to take this unit arrow and distribute it over the branches of the Either. So instead of Unit -> Either b (Rec b), we have an Either where the unit arrow is under each constructor. This isn't necessarily a safe transformation in general, but for this particular type it's equally as expressive, so it's totally okay; we don't lose anything with this.

So we've got Unit -> b and Unit -> Rec b, and we're just going to rename those back to our nice Done and Loop. So we've got Done (Unit -> b) and Loop (Unit -> Rec b), and this Unit -> b is kind of unnecessary, the Unit here, because we can turn that into just Done b and Loop (Unit -> Rec b): if we want a delayed done value, most of the time we can just wrap it in a Rec. So it's not super necessary for our purposes. It might be necessary if you were doing additional composition with the Unit -> b after you're actually done with it, and it just preserves the laziness of the value. This is why in Haskell you don't ever have to think about these Units: everything just implicitly is a lazy value. But we're keeping this, and this type is actually really interesting.

So if we go ahead and turn this back into our functory name again here, Function Unit, Rec b, and abstract that away into an f, we've got this Rec f b: Done b, or Loop (f (Rec f b)).
Wow. And we can recover our Rec here with Rec (Function Unit), and this type here, Rec, is actually the naive definition of the free monad. So it wouldn't be a Faubion talk without something about free monads. Through this process we've essentially just derived it. It's naive because, well, it's the true essence of it, the minimal definition of Free. There are some properties of it that aren't great, but for our pedagogical purposes it's totally fine. And then our Rec type alias is what we call Trampoline: Free (Function Unit).

So instead of going through the process of writing the implementation for these, which could be an entire talk in itself, we're going to just import the free monad library. We have to define this suspend function; it doesn't exist exactly, there's a delay function, but we need this just to delay evaluation again. Delaying the evaluations is kind of subtle when you're dealing with this, so we're going to use this definition; it's also equivalent to bind with a pure unit. With the free monad library and this particular implementation, you don't even necessarily need the Function Unit functor, you can kind of get around it, but for our purposes we'll keep it here. We'll define this suspend function, and then, just using typical monad stuff with the Trampoline monad, we can recover our nice direct-style bindings: go f l binds to l, and so on, and we're just using pure at the end. So this is very nice, very minimal, very close to our original algorithm. It just has a little extra stuff.
It still has kind of the wrapper of runTrampoline, but it's pretty close, just adding a little bit of syntactic noise. Clearly we're in a DSL here, our stack-safe DSL, and we're back to something that is very, very close to our original definition, and this is totally stack safe as well.

So you may ask: why did we go through all that process of data types and all this kind of stuff, when you can just throw in these function calls? There are actually a lot of good reasons. Again, I brought up some of them already: this is like defining your own little language with small-step semantics, and you could use this for debugging and stepping through code. But for more practical considerations, it's that when you have a tail-recursive loop that's specialized to your language subset, it's just going to JIT a whole lot better on the JavaScript backend. The Trampoline monad actually lets you deal with the higher-order recursive case, so any recursion anywhere that binds through Trampoline becomes stack safe, which is very nice; but we don't really need that. We still have a first-order recursive algorithm, so if we go with the tail-recursive loop with explicit data types, the just-in-time compiler in JavaScript will hit that really hard, and it'll be a lot faster than the trampoline approach. The trampoline approach has a performance hit of anywhere from 25 to 50 times slower than a naive recursive algorithm; this is just in my experience of writing these functions. But in my experience, the other one, using the tail-recursive loop with explicit data constructors, is about six to eight times slower than our naive algorithm. So, big performance difference.
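To make the trampoline concrete, here is a minimal one in TypeScript, a free monad over the thunk functor with a stack-safe interpreter; this is an illustrative sketch, not purescript-free's actual representation or API:

```typescript
// A free monad over "() =>": Pure, a delayed step, or a bind node.
type Trampoline<A> =
  | { tag: "Pure"; value: A }
  | { tag: "Suspend"; thunk: () => Trampoline<A> }
  | { tag: "Bind"; sub: Trampoline<unknown>; k: (x: unknown) => Trampoline<A> };

const pure = <A>(value: A): Trampoline<A> => ({ tag: "Pure", value });

// suspend delays a computation (morally, binding a `pure unit`).
const suspend = <A>(thunk: () => Trampoline<A>): Trampoline<A> =>
  ({ tag: "Suspend", thunk });

function bind<A, B>(sub: Trampoline<A>, k: (a: A) => Trampoline<B>): Trampoline<B> {
  // The casts erase A; runTrampoline only ever feeds sub's results to k.
  return {
    tag: "Bind",
    sub: sub as Trampoline<unknown>,
    k: k as (x: unknown) => Trampoline<B>,
  };
}

// A stack-safe interpreter: pending continuations live in a heap array, so
// the JS call stack stays flat no matter how deeply the binds nest.
function runTrampoline<A>(t: Trampoline<A>): A {
  const conts: Array<(x: unknown) => Trampoline<unknown>> = [];
  let cur: Trampoline<unknown> = t;
  for (;;) {
    if (cur.tag === "Pure") {
      const k = conts.pop();
      if (k === undefined) return cur.value as A;
      cur = k(cur.value);
    } else if (cur.tag === "Suspend") {
      cur = cur.thunk();
    } else {
      conts.push(cur.k);
      cur = cur.sub;
    }
  }
}

// Direct-style recursion through the monad: the recursive call hides under
// suspend, so building and running this is stack safe.
function sumTo(n: number): Trampoline<number> {
  return n === 0
    ? pure(0)
    : bind(suspend(() => sumTo(n - 1)), s => pure(s + n));
}
```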
And so that's really kind of where it stands. Trampoline is obviously great: if you just want stack safety really quickly, without doing hardly any work, you can use Trampoline just like this. It'll give you something stack safe; it'll just be really slow. And I haven't benchmarked it, but I think it's promising to go the middle of the road for this first-order case with our least-fixed-point approach, the original one, where you're just using closures and a fix loop.

Thank you, that is stack safety in PureScript. There's a lot more to talk about around stack safety, there are other stack-safe monads, like Aff and stuff, but essentially they just use a free monad sort of encoding, and that's what gives them stack safety. So that's kind of the end of that story. All right, thanks everyone.

[Host] Nate isn't here, so I'm going to be taking the questions; I'll try to answer them as best as I can. That was a pretty great talk, full of a lot of stuff.

So: what is first-order versus higher-order recursion? I think what Nate meant by that was this. First-order would be a function calling another function recursively, directly. Higher-order recursion would involve passing a function that happens to be recursive into another function, and then that function calling the passed-in value, so there's no static information about what function is being called. That's the distinction, and that makes it a lot harder to optimize and make stack safe.