Yeah, so we're going to be discussing a topic that gets a lot of lip service, but I'm going to go at it a little more aggressively. Before we do that, we have to set the stage a bit, so we're going to do a little thought exercise. I want you all to sit back and imagine a world where we have this perfect programming environment. What is it? Well, you think of what you want, you press the button, and it's there. Terrific, right? Who would want that programming environment? It's guaranteed to give you exactly what you're thinking. Okay, there we go. Now, what if I told you it's always going to be two to ten times slower than what you need? Do you still want it? No, nobody does. What if I told you that it'll work, but nobody else will ever be able to work on it, or see it, or read it, or understand what's going on with it? Do you still want it? Maybe, maybe not. Well, job security, sure, but do you want to be the sole maintainer of that piece of code in perpetuity? Maybe that's not so great. So we've got this idea that a great programming environment would be terrific, but what are some things we actually need in order for that ideal to exist? The problem is that a lot of times we skip over some things we really have to care about. One of those is directness. That's the easy part of the dream: everybody wants to be able to just directly say what they want. But in addition to that, we need the result to be fast, we need a way of thinking about that speed, and it has to transmit to other people; we have to be able to communicate our ideas to other people. If we achieved this ideal vision, just imagine if we actually got that done: that kind of system is one that empowers people more than it controls them. It would enable a ton of people to communicate, at speed, directly.
That's a very powerful idea. Now, it's not going to happen. Okay, we know that. But I want to draw that dream back into focus, because I think a lot of people have forgotten about it. Another way of framing this is: can we do more of our work in the domain knowledge, the expert knowledge, the space where we're actually solving real-world problems, versus doing systems rumination, the same old thing we've always done, repeating all of this wasted effort just to build yet another architecture that isn't going to deliver the solutions? It's wasted effort. And so before we even begin to think about obesity, I want you to draw in and imagine what the real main thing is. Where is our goal? Think about what the real end goal of all of our programming effort would be if we could really imagine going forward. It's not the latest, greatest programming language. It's not APL, and hint: APL is going to be in this talk. It's not any of the things we usually think about. We're driving towards delivering something that helps empower people to communicate and solve problems and do better. And if we achieve something close to that vision, does that slowly make all of us obsolete?
Well, no. It just means we'll be able to spend our time actually doing more valuable work rather than wasting our effort. One of the reasons this works is that what we'll be focused on is the still-hard problems, but problems that are hard because they're essentially hard, not because they're accidentally complex due to the systems we've designed. So we would love to use a system like this. But we also need to remember that we're not just the consumers of our programming languages; we are the producers of the systems that we use. We're responsible for creating that vision, not just expecting somebody else to create it. We have to remember that we're in this together: the systems we build are going to affect everybody around us, affect how we interact with the code we wrote in the past, as well as how everybody else has to interact with us. We are all part of this group that's working on these things. So what does our current culture look like? Here's a tech stack taken from a company that is most certainly, definitely, absolutely not sponsoring this conference. If you actually look through this and think about it: how much wasted effort and complexity is in here? How much excess is sitting here, on all levels? The problem is that we talk about wanting clean code, we talk about wanting to write nicer code, we talk about all these things, but we aren't really digging in and taking the steps necessary to address programming obesity. So what is programming obesity? We have to define it. I'm going to define it as cascading systemic failures to simplify, and a tendency towards unsustainable waste in our systems. So let's think about what we mean by that. First, this is a systemic failure.
It's not just that one piece of code is too slow; it's a whole system that results in over-complexity. And it's not just that one thing uses too many resources or too much effort; it's a general trend towards too much waste and too much excess. And you start to see some people concerned about this at the extreme levels, where some people are worried enough about the energy consumption of their work that they have to start looking for other solutions, because they literally don't have the computing energy to do what they need to do. They actually need to make their systems less wasteful at an electrical level. But in general, most programmers ignore these things. They're looking for the easy, expedient solution: as long as it's fast enough for me and what we're doing here, right? So how can we look at this? What are some axes along which we can examine our complexity? Well, there's structural complexity: how complex your code is, how hard it is to use, all that kind of stuff. Then there's the actual runtime performance, the execution performance of the code. And how many here actually think our code is getting pretty high performance these days? So I don't have to argue that code is getting slower; that's not a hard sell. But there's another one: economy. Economy is how well your knowledge scales to a broad set of problems. If you learn how to use this one system, can you now do other things, or do you have to learn yet another system, and another, and another, in order to do anything?
Do you have to learn and become a master of 25 different languages, DSLs, and architectures just to solve 25 different problems? Or, on the opposite end, is your system so clean and nice and concise that in order to express anything you have to build giant architectures, because the base layer isn't expressive enough as it is, and you have to build up DSLs and architectures on top? If we're going to address something like this, unfortunately, that requires change. And for good reason, people tend to be fairly conservative, because change is uncomfortable, and it's uncomfortable for a good reason. If something isn't broken, we really don't need to fix it that much, right? So we don't want to change just for the sake of change: "Oh, try this new system, it's going to be better, I promise." So we have to think: what are our break points? Where do we draw the line and say, if I could only get this, that's a break point for me; that effort to change is worth it? So let's look at our three things. Structural complexity, which we'll call "simpler." "Faster," which is our execution performance, our system's behavior. And "economy," which you can think of as: how many documentation books do I have to read in order to solve my problem? Let's do this in percentages. So let's start with, say, 50%, which would be twice as fast, or twice as simple, something like that. Show of hands: if you could get your system to be 50% simpler, would you be willing to undergo uncomfortable change? Wow, okay, so we'll set it at 50% then. Simpler: we've got 50%. Now what about faster? 50% faster, yay or nay? Much more conservative. All right, 75% faster? 80%? 90%? 95%? All right.
So I get the feeling that people really wish their code was simpler, but they think it's fast enough. Let's remember that. I'll go up to 100, but people were mostly hitting a break point around 80 to 90 percent there. That represents a really interesting phenomenon: people feel like their code is fast enough, and they don't prioritize the performance of their machine much. Well, that's going to have an effect on us. So what about economy? If you had to learn half as much to do the same amount of stuff you're doing now, that's 50%. Would you take it? That's just about everybody. 75%? So you'd only have to know a fourth of what you know now in order to solve the same problems. Okay, so let's say 75%. Those are the numbers we're going to go with. Now let's imagine that system actually exists. What kind of impact would we have with it? Could we have a lot more impact, become more valuable, and deliver more value to our systems, our people, or the end domain we're working in? Yeah. And my initial challenge to you is: don't wait to actually try to get that. Stop giving up on it. The problem is that a lot of people have given up on this. They're willing to accept incremental improvement because they just assume it's not possible, it's too hard, it's not going to work, or you can get one of those properties but not all of them, something along those lines. And I'm saying to you: we can make progress on this, and not just little incremental progress. There is some really low-hanging fruit that we can tackle here. So where do we start? This goes back to Snowyman's comment: are we talking about fat or are we talking about sugar? Well, I'm American, and what problem do we have in America?
Every single piece of food is 50% sugar. Our bread: sugar. Our fruit: extra sugar. And that's not a joke; when we can our peaches, we put them in sugar sauce. Even our meat: sugar. Everything has some kind of sugar in it, and it causes us a lot of problems. So I'm going to make my claim here, and we're going to think about this for a second: generalized pointers are the refined sugar of programming. I'm not being lynched yet, so I think people aren't quite seeing the ramifications of this statement. Now, I'm not saying cut out all sugar, but if you put it in everything, you're going to have problems. So what's a generalized pointer? Anybody have an example? Objects, yes. What's another kind? Records. Implementations of GADTs. What else? C pointers. Are we getting a feel here? So let's think about what these basically are: they're essentially trees. So why am I saying we need to dump this, or at least cut back on our usage and start being healthier, cleaner? Well, we can start with a tree. A simple tree. Anybody have problems understanding the tree? Good, because I can explain this one. All right, so now: trees are often used to represent syntax trees, for instance, or all sorts of language constructs. So let's take an expression tree. That's something you would have done in university, and something I did with my compiler.
So here's an example. You could call it a record type declaration; it comes from the Nanopass framework that's in Racket and Chez Scheme. We've got an expression type with various constructors. One of them represents a binary operator, like a plus, a minus, a times: it's got two child expressions and a function that applies them together, and it's got some naming information. This is how a lot of people build their trees. So you can imagine how you would do this in your programming language. Who knows how they would implement this kind of record in their language? All right, I see a lot of confusion. So, Java: how would we do this? We'd have an Expression class that subtypes into each of these different pieces, and each subclass would have some unique field data. Is that clear for people? Okay. So one subclass would have a name, and then these child expressions, each of which would be another object of type Expression, or something like this. We might subclass those down even further, so we'd have a little hierarchy of classes. In Scheme, or Racket, or Clojure, this is our record system. That's pretty clear. And in Haskell it's basically just a record declaration, or a struct, or something like this. So now, before I move forward: anybody want to tell me how much memory these things use on their system? Anybody want to raise their hand if they know? Okay. That's the answer
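To make that pointer-based layout concrete, here's a rough Python sketch of the kind of subclassed expression record just described. The class and field names (Expr, Var, Lit, BinOp) are illustrative stand-ins, not the Nanopass framework's actual constructors.

```python
# A pointer-based expression tree in the style described above:
# a base type, subclasses with their own field data, and child
# nodes reached through references (generalized pointers).

class Expr:
    """Base class; subclasses carry their own field data."""

class Var(Expr):
    def __init__(self, name):
        self.name = name          # naming information

class Lit(Expr):
    def __init__(self, value):
        self.value = value        # a literal value

class BinOp(Expr):
    """A binary operator node: an op name plus two child expressions."""
    def __init__(self, op, left, right):
        self.op = op
        self.left = left          # each child is a separate heap allocation
        self.right = right

# (x + 3) * y -- every node here is its own allocation, linked by pointers
tree = BinOp('*', BinOp('+', Var('x'), Lit(3)), Var('y'))
```

Every `Var`, `Lit`, and `BinOp` call here is an independent allocation, which is exactly the property the rest of the talk takes aim at.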
I always get, with the exception of one group of hackers, and that's Kent Dybvig's Chez Scheme team. They would tell you exactly how much memory this stuff is going to use. But that's because they literally wrote the compiler, and they designed their compiler to be transparent in terms of memory usage. When I ask a Haskeller, or a Java programmer, or anybody else: how much memory does your system use? How much memory is this function going to use? What's its layout in memory? Where is it storing anything? There's no knowledge of this for most people. People assume they don't have to know how the memory is laid out because they've got garbage collection, right? So who likes garbage collection? You'd better raise your hand. Garbage collection is one of the greatest inventions in computer science, but it is also a danger. So let's look at how we might actually implement this in memory. Here we've got a tag, that's the type, and our field data, maybe the name or some other structural data. And then we've often got a linked list or a vector that stores all of the child nodes. So if your record system knows statically how many children a node has, it might store them as a vector; but in the earlier case, where a node can have any number of children, it might store those as a linked list. Everybody with me on that representation? Who's clear so far? Okay, so this is the typical, classic layout; just about everybody does it this way, or some fashion of it. So let's go back and recall this tree. Makes sense, right? Is anybody getting warm fuzzies from this layout? Where is this going to be stored in memory? How is it going to work? All of those lines are generalized pointers. All of those little independent boxes are independent allocations that occur every time you call a cons, or a make, or a constructor inside your system.
Now, those allocations are pretty fast, but what happens as your garbage collector is working? Now you've got all sorts of effects that start coming into play. So we've got a ton of complexity here. And is this actually what we need to represent that data? Well, it's how basically everybody does it nowadays. And I'm going to say this is wasteful. Horribly wasteful. And you will see just how wasteful shortly, because we're going to eliminate it. But what are we eliminating? We're not eliminating the tree. We're not eliminating the data structure. We're eliminating the expedient complexity that we put in there just because we thought it wasn't going to do any damage; it was good enough. But the problem is we've got cascading systemic failure. What is "good enough" for your system when you take it to the next level and add another layer, and another layer, and all of these abstraction barriers? Suddenly nobody has control over the system anymore, and nobody has the ability to solve performance problems when they come in, because the performance problems cascade through the system, and no one at any single stack layer can control for that. These performance problems layer on top of each other, and good enough times good enough times good enough is really slow. So we need to achieve real simplicity here. We need to fundamentally have our code represent something real and eliminate the accidental waste. And the way I think we should do this is holistic economy: we should focus on making our systems economical at a total level. Look at the big picture, not just the small picture. This is pretty abstract right now, so how did I do this in my real-world practice? What did I actually do? For my thesis, I asked: what's a real, hard problem we can work on? Can a compiler be GPU-hosted? Can it be fast? Can it be portable? And can it be simple, all at once?
Now, the first time I suggested this idea to my advisor, he, and everybody else, said that's impossible. And there's good reason for that, because no one had been able to accomplish it, and moreover, the people who had tried had failed miserably because of their system complexity. Nobody even had an idea of what it would look like to do tree transformations this way. How are we going to take that mess of spaghetti that is our tree, move it onto the GPU, make it data-parallel, and work on it? How are we going to make it fast, and moreover, how would we make it just as simple as the code you're writing today? Because other people who have done compiler research on GPUs have code that is often very ad hoc and very complex. It's very cool, really great research, some of the best stuff; EigenCFA, if you read that paper, is brilliant. But it's not something you can scale to your everyday programmer. So I decided to take a different approach, and you know me, I like bold motions and grand gestures and all that kind of stuff. I decided we're just going to drop everything. I'm going to eliminate everything. We're going to start from scratch, I'm going to give up everything I know about compilers, and we're going to try a whole different approach. And so there's a new hope in town, which is actually quite old: APL. I said I'm going to leverage what makes APL special. I'm going to leverage APL as a tool of thought and use it to drive my development of a compiler. What is that going to look like? What's going to happen? Well, it turns out we can now think about trees a little differently. So there's that same tree, with the depth on the right. And on the bottom you've got each of the tree node IDs, and the last row is the depth of each of those nodes, given as a vector. Node zero has depth zero. Node one has depth one. Node two, depth two. Node three:
that's at the same depth as node one, so it gets depth one. Node seven is also at depth one. Nodes eight, eleven, and fourteen are all at depth two. We call this a depth vector. And that depth vector is literally all you need to encode the structural information of that tree. It is entirely sufficient to represent the tree. Leveraging this, starting from this insight and building from it, I got that compiler. Yes: because every pair of children with different parents will have some node of greater or lesser depth between them. So the order matters: the nodes are all listed in depth-first, pre-order traversal order. Good question; I forgot to mention that. From that insight, things moved forward, and now there is a compiler, and you can read the thesis. It's GPU-hosted. It's performance-portable, which means the same code runs on the GPU and the CPU without change and is fast on both systems. It's very fast, and it's simple. How simple? How fast? Here we're comparing it, on the exact same transformations, in the most positive light we could, against various implementations of a Nanopass compiler. How much simpler is it? Well, we took a bunch of metrics of the complexity of the code, and the best outcome was 15 times simpler: that's the node count between the two ASTs. In lines of code between the two systems: a thousand and twelve lines of code for the typical implementation, 17 lines for the APL compiler. How much faster is it? Well, if you run it on the CPU, keep in mind this is the same code running on the Dyalog APL interpreter: an interpreted piece of code running against JIT-compiled and offline-compiled systems. The fastest that any of the competing systems could get was nine times slower, and the other system was basically 40 times slower. On the GPU, we get anywhere from 50 to 200 times speedup on this code. Now, how economical is it?
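The depth-vector encoding described above can be sketched in a few lines of Python. The nested-tuple input here is just an illustrative stand-in for any pointer-based tree; the point is the output, a flat vector of depths in pre-order.

```python
# Build a depth vector: walk the tree in depth-first pre-order and
# record each node's depth. As described in the talk, this vector
# alone is sufficient to encode the tree's structure.

def depth_vector(tree):
    """tree = (label, [children...]); returns node depths in pre-order."""
    depths = []
    def walk(node, d):
        _label, children = node
        depths.append(d)          # visit the node itself first (pre-order)
        for child in children:
            walk(child, d + 1)    # children sit one level deeper
    walk(tree, 0)
    return depths

# a small tree:  a -> (b, c -> (e), d)
t = ('a', [('b', []), ('c', [('e', [])]), ('d', [])])
print(depth_vector(t))  # [0, 1, 1, 2, 1]
```

As the Q&A point notes, the pre-order convention is what makes this work: any two children of different parents have a shallower node between them in the vector.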
Did I have to do some whiz-bang stuff? Well, take all of the names in the code and ask: which of these names are general-purpose, standard-library things that every program is going to have in it, or part of the language itself, versus extra stuff that was specific to this domain? If you take the ratio between those two and do the comparison, we're eight times more economical in that respect. And if we just compare raw names, we're about three times more economical in terms of unique names outside that space. So it looks a little like this. On all of these metrics, the lines of code, the number of tokens, the number of unique names, and the number of nodes in the AST, between the Nanopass compiler and Co-dfns, we're really much, much simpler. Not only that, we're faster. These are speedup graphs. Right around the third line, where you see about 16k nodes, that represents about a thousand lines of Scheme in terms of our compilation. The top row is the Racket code, the bottom row is Chez Scheme. Against Racket, there's basically no point at which Racket doesn't lose: even with essentially a thousand nodes, which is a tiny, tiny AST, we're winning. And on the GPU, we begin to outperform the CPU before a thousand lines of code in the compiler. Now, Chez Scheme is a little more efficient in its overheads, but it's still almost always slower, except at the thousand-node size; there were some startup overheads involved in that first line. After that, we're always faster on the CPU. Chez does better, but it only gets up to maybe 5,000 or 10,000 lines of code before the GPU starts to outperform it. Now, I've mentioned performance in terms of execution, but that's not the only thing
to be concerned about, right? How many people think four to eight gigabytes of memory is enough on their phone these days? One person? Wow, I embrace that attitude. But these days, how much memory are we sticking on our phones? How much do we expect our phones to actually have to work with? Well, let's look at the memory usage of these systems: Dyalog, Racket, and Chez Scheme at each of the benchmark sizes, as the trees go up in size on a log scale, two to the ten, two to the eleven, and so forth on up. So notice at the large size, that's about 16.6 million nodes, two to the twenty-four nodes give or take: Dyalog is using 63 megabytes of memory to store that AST. Racket is using one gigabyte. And Chez Scheme is using 1.4 gigabytes, as far as I could tell when investigating their systems. That's a big difference. But these are all just a bunch of numbers, so let's actually see what that feels like in real life. Let's load up Racket, shall we? Or who prefers Chez Scheme versus Racket? I think Racket is the thing most people are more familiar with. Oh, look, it's finally beginning to load. Oh, I've got to mirror it, don't I? Let's see if I can mirror this. You see it now. Now, this is the DrRacket IDE, still loading. Not a great start. So here's the code; let's load that into the system. You can take a coffee break now, because this isn't going to end well. And notice what I had to set my memory limit to just to load this piece of code: I had to set my memory limit to two gigabytes to load this code, and it won't load for minutes. That's because it's doing a lot of orchestration to set up the debugger so that it can do this kind of stuff. Okay, so let's be a little more fair to DrRacket; let's not do that. But first, let's try loading it in APL. Oh, all right, we're done. We're ready.
So instead, let's pull up the terminal version of Racket. All right, that pulls up faster. Let's see, can I get the font size larger here? So we'll load that in. How long is it going to take now? Something should happen. Come on now. Come on, we can do it. Okay, let's just go to Chez Scheme, huh? I think that was maybe a little too slow. Now, Chez Scheme is known in this space as one of the fastest systems out there. Actually, let's make sure we've got a competitive optimization level. Okay, that's better, right? So Chez Scheme is pretty fast at some of this stuff. So let's load in an AST, and let's time it. Now, this is just generating the AST that we want to run on. Look at those numbers: that's 8.8 seconds of CPU time, and notice how much of that time is spent collecting. All right, let's do the same thing. Actually, instead of that, let's generate all of our data, from the zero size all the way up to size 14, and let's see if I can time it as well; let me make a slight tweak. So it's generating this a bunch of times, and the time it took us to generate was 0.7 seconds. That's to generate not just one size, but all of the sizes: double the amount of work that they're doing. All right, so let's time the compiler, and this is the fastest one. While we're waiting, anybody have any initial questions they want to ask? You know, let's pull up the task manager. How much memory is Chez Scheme using right now? 8.7 gigabytes. We are running the compiler; we are compiling a program right now with 16.6 million nodes in it. The data? What do you mean? Yeah, so we have a sample program that we then replicate in space to mimic a full-blown program. It's the same computation, the same generation, that's going on. And what are we up to now? We're still up in the eights. All right, not bad.
Oh, nine. Oh, we almost got to 10. Some of these systems go up to 12 gigabytes to do this. Does this sound familiar to anybody? Who else has had a 12-gigabyte memory build? Anybody who uses Java? Don't lie to me. Android users: anybody have a build system for Android? How long does it take? How much memory does it require? How many threads do you need? How much air conditioning does your room need to support the fans? Okay, well, it's still going. Oh, we're up to 10 gigs. Well, maybe we can just run it here as well, because we're not trying to make it official; we're just trying to get a feel for it. So I'm going to run the two-to-the-fourteenth size here, and we'll see what we get, with both of these running in parallel. Okay, so we're done. We're running it again just to get a consistent timing. Okay, 8.8 seconds, with the other compiler running in the background. Oh, look, we got it: 142 seconds. Sometimes the numbers up on a graph don't quite feel real. I would argue this feels pretty real. And look at the memory usage between Dyalog and Chez Scheme right now: Dyalog is using 1.5 gigs; Chez Scheme is playing up in the 10-gig range, using almost all of it. It still won't go fast enough, because Chez Scheme isn't designed to optimize that structure. Chez Scheme is designed to optimize over pointers; it does a good job at that, it's pretty good at it. But the reason for this is that cascading effect. This is what's important here: it's not just about the raw performance. Doing one thing right leads to positive cascading effects across the whole board. By choosing the right data structures and giving up the use of generalized pointers in the system, we now have total control over our data layout. We have total control over our sizes, and we have it in a way that every APLer who's been around for even a little bit knows exactly how much memory that thing is going to use. It's transparent.
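To make those "positive cascading effects" a little more concrete: once the tree is just a flat depth vector, whole-tree queries become plain array operations with no pointer chasing. This NumPy sketch is my illustration of the style, not code from Co-dfns, and the particular queries shown (leaf detection, parent lookup) are assumptions chosen for demonstration.

```python
import numpy as np

# The tree as a flat depth vector, nodes in depth-first pre-order.
d = np.array([0, 1, 1, 2, 1])

# Leaves: a node is a leaf when the next node is not deeper than it;
# the last node is always a leaf. One vectorized comparison, no traversal.
leaf = np.append(d[1:] <= d[:-1], True)   # leaves at indices 1, 3, 4

# Parents: each node's parent is the nearest preceding node sitting one
# level shallower. (A plain loop here for clarity; the APL idioms do
# this sort of thing with scans over the whole vector at once.)
parent = np.zeros(len(d), dtype=int)
for i in range(1, len(d)):
    parent[i] = np.flatnonzero(d[:i] == d[i] - 1).max()
# parent == [0, 0, 0, 2, 0]
```

The data never leaves one contiguous array, which is what makes the memory use predictable and lets the hardware's vector units do the work.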
It's direct, and we know not just what the code is doing, but how it will run. We can reason about the asymptotic complexity almost mechanically; actually, very close to it. If you include some of the more recent research on types and theorem provers, asymptotic complexity is a computable thing when trees are encoded and written this way. And you get even more than that: you get the availability of your entire machine. So now I get to use my SIMD vector instructions on my trees. I get vectorization for free. I get efficient memory for free. I get serialization for free. I get compression, offline storage, easy communication, all for free, by giving up those generalized pointers and sticking to this unified data layout. I get control. And none of this actually resulted in more complex code; it's fundamentally simpler code. So you could try to implement some of this stuff in your own language, and it might be a really good idea. But my advice and challenge to you is: start by learning where this idea has really been matured the most, and that's in APL. Learn how APL does it, and why they do it the way they do, and get good at thinking about things that way. Play with it; we've got talks and workshops here that are going to explain that. And then maybe you stay with APL because you love it; that's my hope. But if you now have to go back and use one of these other programming languages and do something in it, you have a perspective, and you can make design decisions that start pushing towards simplicity, towards holistic economy, towards better performance, without giving up productive code; in fact, probably making your code easier to write, not harder, all while achieving some significant benefits. Now, is this a magic bullet? Not exactly. But this is low-hanging fruit that people can work with. Thank you, and I hope we have some time for questions. Yes, all right. Yes, I can. Actually, okay, I should say that the t-shirt store is open.
You can get this in hard copy. Yeah, we're going to blow this up here. So this is not a mock-up of the code; this is from my source file. This is the actual code that I work on. Hello, hello. Thank you for your interesting talk. I have a question: did you just write all those 17 lines in one go, or did you grow it incrementally? Yeah, so this would be fairly straightforward to write now, because I've written my thesis, which is essentially a handbook on how to do this. The problem was that this hadn't been done anywhere before, so the APL community didn't have a recipe or idioms for this, and the rest of the community didn't have idioms for it either. So I had to figure out what worked and what didn't. If you watch the history of the development of my research, it shows I did a lot of churning, trying to understand where to get the efficiencies. But now that they're there, you can sort of take it out of the box and use it. It's a recipe; it's very easy to do. But there was a lot of churning to figure that out. In fact, the GitHub repository has something like 4.5 million additions and 4.5 million deletions in it. Is it possible to shrink this compiler to 16 lines?
Yes, but I'm not code golfing here. And actually, so far every iteration of this compiler has become more featureful, more reliable, better, and smaller. I don't know if I can continue that, but it would be nice.

So I recognize a few APL primitives here that I think were added while you were doing the work, the Key operator and the At operator. How much did those simplify the compiler?

So I address Key and At directly in my thesis. Key is very powerful for brute-forcing certain things, and so it can help lead you to cleaner solutions. But there are three major traversal patterns when you're working with tree code in this style, and the most expensive one is Key. So if you have to use that traversal pattern, you do; but otherwise there are two traversal patterns before it that are significantly more efficient, if you can make your code work with them, and the vast majority of cases do work without the need for Key. You just want to have it there in case you need to do certain things and work with certain types of approaches. At is a similar case: you want to be careful with it, but it can be very, very powerful and fairly efficient. At is actually less of a worry than Key as far as efficiency goes.

Could I explain what I'm talking about, for those unfamiliar? Sure, sure. The compiler is a project funded by Dyalog to produce a compiler for APL that is itself hosted on the GPU, and that produces code hosted on the GPU or the CPU. The compiler, this code here, is written in a style that embraces an extreme version of APL style, which means we don't have any if statements; there is no looping; there are no conditionals.
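For readers unfamiliar with Key (⌸): it groups array cells by a key and applies a function to each group, which on a parent vector yields "the children of every node" in one operation. A rough Python analogue (my own illustration; the function name is hypothetical):

```python
from collections import defaultdict

def key_groups(keys, values):
    """Rough analogue of APL's Key operator: collect values by key,
    in first-seen key order."""
    groups = defaultdict(list)
    for k, v in zip(keys, values):
        groups[k].append(v)
    return dict(groups)

# Children of each node, read off a parent vector in one grouping pass.
parent = [0, 0, 0, 1, 1]
children = key_groups(parent, range(len(parent)))
# {0: [0, 1, 2], 1: [3, 4]}  (node 0 is its own parent in this encoding)
```

The grouping has to materialize a bucket per key, which is one way to see why this traversal pattern costs more than the flat whole-array passes.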
There's no branching, no pattern matching, no use of object-oriented programming, any of that. It is pure function composition with the raw APL primitives, with almost no abstraction on top of them. And so you'll actually notice here that there aren't any helper functions, except a couple of anonymous lambdas in the middle. Everybody can see that, right? But this is what I'm talking about with directness: to achieve this, I didn't actually have to build any abstraction layers on top of APL at all. It's a direct expression of the solution in raw, pure APL. It would be like using Scheme without ever touching any of the standard libraries or any of the extra stuff out there.

Any other questions? Yeah, I think somebody had their hand up at the back there as well.

So it would seem that it's hard for humans to parse APL. Come to the next talk, APL Training Wheels; we're going to explain that it's not hard at all. Okay, but just saying, beside that: this is a compiler for APL written in APL, yes? Right. But most humans deal with human languages, right? If you were writing a compiler in APL for a more generic language, let's say a Scheme, which would be something simpler to parse, write, and reason about, would that have the same kind of performance, and how hard would it be?

Actually, the Scheme version of this compiler would look almost identical, because the language I'm compiling is very close to a Scheme; it's extremely simple.
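As a toy illustration of what "no conditionals" means in practice (my own example, not code from the compiler): selection is done with boolean data instead of control flow.

```python
def clamp_negatives(xs):
    """Zero out negative elements without an if per element: multiply
    each element by its boolean mask (x >= 0), which is 1 for elements
    to keep and 0 for elements to drop."""
    return [x * (x >= 0) for x in xs]

clamp_negatives([3, -1, 4, -5])   # [3, 0, 4, 0]
```

In APL the same thing is a primitive multiply of the vector by its own comparison mask; the branch disappears into arithmetic over whole arrays.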
Uh, you would have to add one piece to handle closure creation in a way that's more Scheme-like, and you would have to plug in a garbage collector at runtime. And if you wanted high-performance output, you'd have to play a few more games for Scheme specifically. But the compiler itself does all the things that an untyped functional programming language needs, like lexical scope resolution and all of these other things. And these techniques are not specific to this compiler; they're completely general. That was one of the requirements of trying to do this compiler: not to have specialized techniques just for this type of problem. Any tree transformation can be done with these techniques. So if you're doing document processing, XML processing, DOM processing, JSON work, any of those are just tree manipulations where you're changing the structure of the tree; these techniques apply and can be used.

Second question: this generates some machine code at the end, which executes on the CPU? It's sort of a source-to-source compiler; it compiles to a version of C++ or other targets like that, using a runtime. What about tooling for debugging such a code base? So this was designed with the idea that you would use Dyalog's very good tooling to do your work, and then you would be able to dispatch to this; this would be like your -O3 optimization, for when you want it, or something like that. So Dyalog provides really good debugging for your APL code, which you can then plug into this right from the Dyalog environment.

Questions, thoughts, concerns? Yes. Don't be afraid, it's all right. If you think this is utterly bonkers, feel free to say it. If you think this is the greatest thing since sliced bread, I'd also like to hear that.

So, uh, I'm not aware of APL; I wasn't aware of it before this. Yes. So my question is: these operators symbolize some mathematical operations, yes? So why can't you use English words for them?
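Since any tree restructuring fits this style, here is a sketch of one such transformation on the flat layout: splicing out unwanted wrapper nodes (think deleting a redundant element from a DOM or JSON tree) by re-pointing every node to its nearest surviving ancestor. This is my own Python illustration of the general technique, not code from the compiler:

```python
def splice(parent, keep):
    """Re-point each node past deleted ancestors to its nearest kept
    ancestor. parent[i] is node i's parent (the root points to itself);
    keep[i] says whether node i survives. Assumes the root is kept.
    A real pass would also compact the arrays to drop deleted nodes."""
    new_parent = list(parent)
    for i in range(len(parent)):
        p = new_parent[i]
        while not keep[p]:        # climb past deleted ancestors
            p = parent[p]
        new_parent[i] = p
    return new_parent

# Deleting node 1 re-attaches its children (nodes 2 and 3) to the root.
splice([0, 0, 1, 1, 0], [True, False, True, True, True])
# [0, 0, 0, 0, 0]
```

The same splice works unchanged whether the tree came from an AST, an XML document, or a JSON value, which is the sense in which these techniques are completely general.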
Like, why do you need this syntax? So here is one of those compiler passes translated into English, using a Lisp-like notation. Is it more readable?

How do I write it? How about we do prime number generation? So I have a keyboard; Dyalog comes with a suite, and Linux has an APL keyboard layout built in. You enable it with xmodmap or anything else, and then you get an extra shift key that allows you to type the characters. Or you can use the language bar. So I guess I've got to show that; I like more space, but there is a language bar here that you can dock, and then you can hover over these symbols, and it shows you all of them, documents how they're used, and so forth. There are about 50 symbols, which correspond to roughly a hundred functions, and those hundred functions essentially do everything. And I'm not joking when I say that they do basically everything.

The reason I did my thesis research on tree transformations was that it was the one domain where I could not find prior research where the APL community had said, this is how you do it in APL, with a nice clean solution. Almost all the other domains in computer science have versions of this, except for certain, what I would call navel-gazing problems, where it's like: if we want to deal with type theory on this thing, how does it work? Well, you could write an implementation of those things in APL, but that's not a traditional, outward-facing computer science problem, if that makes sense.

I mean, the follow-up question would be: is there a reason for this terse syntax? Yes, there's absolutely a reason. So the question is, I assume you're asking: why would I ever write in this squiggly little form? And the response is: do you write math? And why do we write with math?
And we write with math because it's a very useful notation, and APL was designed around that same aesthetic: we want to have an efficient notation for solving problems. And when we think about problem solving for the human experience, not just running on the machine, but the human experience of code, it is symbols that come out ahead more often than not. So symbols are actually an aid to the experienced worker, versus the longer names; those just provide the beginner with more comfort. But see my other talks for why that comfort is bogus and overrated, and why readability is about experts. The anti-patterns talk will discuss that.

Two questions here. The first one is: what was the idea behind porting this to the GPU, what was the thought process behind that? And the second one: what challenges do you see in porting these kinds of logics to other language compilers?

For other compilers, it's mainly just engineering work: you have to actually take those transformations that you're doing in your other languages and translate them into this method.
There's no known way right now to automatically transform your recursive-descent programs or your structural-recursion algorithms into this form. So you have to manually say: I've got this transformation, and I'm going to write it this way. And you have to do that for every one of your compiler passes, or at least some of the passes in the middle.

As for why the GPU: the GPU has been a difficult target for a lot of people. The languages available for it are often hard to use, or they're brittle, or they're not high-level enough, or any of these manners of things. And so the Co-dfns compiler provides a high-level, cross-platform way of writing performant CPU and GPU code, and it provides the technology and the knowledge for solving a set of problems that are very common on the GPU, which people didn't really know how to do at scale, or with the same level of simplicity, before. So it was basically an open question, and therefore I wanted to go for it. It was also a question that people said was impossible and too hard, so that made me want it even more.

Sometimes, yes, that can happen, and it does happen, but it also depends on the experience of the APLer, and it happens much less than in other languages. Very often, optimization in APL is the recognition that something you're doing is a redundant, verbose way of solving the problem, and so you shrink it to something smaller and more direct. That's a common optimization pattern that you see. There are times when you do have to write more code, but I find that to be significantly less common. So, for example, previous versions of this compiler performed significantly worse because they were more complex. When you simplify the problem, you end up reducing the number of operations, you simplify the actual work involved, and if you maintain your asymptotic complexity correctly, it ends up being faster and simpler. Thank you very much.