Thank you all for coming. I know that it's a bit rainy, so I appreciate you taking the time to come all the way out to this building off to the side. My name is Chris Marusich, and I'm here to talk about G-expressions. I think some of you might be familiar with S-expressions. Can I get a quick show of hands to see who knows what an S-expression is? Okay, so most of you know. We'll talk about it in a moment.
First, I want you to know that this presentation is yours to take, modify, and copy however you like. I like to let people know that so they can remix it and use it elsewhere if they want. So again, I am Chris Marusich, and I came from Seattle, where I do software development professionally. I support free software and try to contribute in my spare time, and I have participated in the GNU Guix project off and on over the last few years. Everything that I say here is my own opinion, so it doesn't necessarily represent the opinion of my employer.
To understand G-expressions, I think it's first necessary to talk a little bit about what Guix does. So I'm going to give a very brief review, but it won't be as in-depth as some of the other talks you might have seen elsewhere at FOSDEM. Guix is about functional software deployment, and the reason that code staging matters will become clear as we talk about it. Guix is about building software, composing it together, and deploying it from one machine to another, that sort of thing. It's also a distribution that's built from scratch. You can install packages, remove packages, and reconfigure your entire system. It's great: it provides transactional installation of software, and you can roll back to previous versions of your system.
This is an example of what a package looks like. In the Scheme language, specifically Guile Scheme, a package is just a record, and it has fields. In particular, one of those fields is the inputs, which describes the other pieces of software that are required to build this package.
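Since the slide is not reproduced in this transcript, here is a rough sketch of what such a package record can look like. It is modelled on the GNU Hello example from the Guix manual rather than on the speaker's slide; the variable name `my-hello` is hypothetical, and the `(inputs (list gawk))` style assumes a reasonably recent Guix.

```scheme
;; A sketch only: requires a Guix checkout or installation on the
;; Guile load path.  `my-hello' is a hypothetical name.
(use-modules (guix packages)
             (guix download)
             (guix build-system gnu)
             (guix licenses)
             (gnu packages gawk))

(define my-hello
  (package                              ; a package is just a record...
    (name "hello")                      ; ...and these are its fields
    (version "2.10")
    (source (origin
              (method url-fetch)
              (uri (string-append "mirror://gnu/hello/hello-" version
                                  ".tar.gz"))
              (sha256
               (base32
                "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i"))))
    (build-system gnu-build-system)     ; configure, make, make install
    (inputs (list gawk))                ; other software needed for the build
    (synopsis "Hello, GNU world: An example GNU package")
    (description "Guess what GNU Hello prints!")
    (home-page "https://www.gnu.org/software/hello/")
    (license gpl3+)))
```

Because the record is an ordinary Scheme value, its fields can be read back with accessors such as `package-name` and `package-inputs`.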
Also, an operating system is just a record; it's a Scheme object. Again, it just has fields, and this is manipulated by other mechanisms in Guix to instantiate the operating system. We define everything, including the users and the services; for example, there's an OpenSSH service running here on port 2222. All of it is declaratively described in a single file, which is quite nice.
So I mentioned that Guix follows this functional model, but what is that? The functional model is basically the idea that you can treat building software as a function. There are many ways to define a function, and this is one particular way that will work for us. Anything that takes some inputs and spits out an output can be thought of as a function, right? So in mathematics, it might be a function of numbers. You can go outside of mathematics and have a function consisting of words, or you can think about software, and think about the process of building, for example, GNU Hello as a function, where the inputs are the compiler, the libraries, and the source code; the function itself is the usual procedure of configure, make, make install; and the output is the program that you can run, GNU Hello. So thinking about building software as executing a function turns out to be quite useful.
The output of this function is not just thrown randomly into the file system, as it would be in some other build systems. Instead, we store it in a place called the store, which is at /gnu/store, and we address it by the hash of all of its inputs. What this means is that if I build GNU Hello with the same libraries that I used before, I will get the same output path, and if I build it with a different set of inputs, like a different version of libc, then I will get a different output path, and these two different versions of GNU Hello will not conflict when I install them.
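The idea of addressing an output by the hash of its inputs can be illustrated with a deliberately simplified toy in plain Guile. This is not how Guix actually computes store paths; the procedure `toy-store-path` and the input names and version numbers are made up for illustration.

```scheme
;; Toy model only, NOT Guix's real algorithm: derive a path from a
;; package name and a list describing its inputs.
(define (toy-store-path name inputs)
  (string-append "/gnu/store/"
                 ;; Hash a textual rendering of the inputs and print
                 ;; the number in base 36 to get a short identifier.
                 (number->string (string-hash (object->string inputs)) 36)
                 "-" name))

;; Hypothetical input lists that differ only in the libc version.
(define with-old-libc
  (toy-store-path "hello-2.10" '("gcc" "glibc-2.28" "hello-2.10.tar.gz")))
(define with-new-libc
  (toy-store-path "hello-2.10" '("gcc" "glibc-2.29" "hello-2.10.tar.gz")))

(display with-old-libc) (newline)
(display with-new-libc) (newline)
```

Calling `toy-store-path` again with the same inputs yields the same path, while changing any input is expected to yield a different one, which is what lets two builds of the same program coexist.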
Addressing outputs by the hash of their inputs enables a whole bunch of really neat features that I'm not going to go into in detail, but at the very least you can see how this allows multiple versions to exist on the system, and their dependencies also follow this. So libc itself is also addressed in the store by the hash of all of its inputs. If these two different versions of GNU Hello happen to use the same libc, they will share that exact same version, so you get deduplication of dependencies automatically. It's very nice.
At a high level, when Guix builds software, we start with Guile Scheme, which is the package definition. Again, this is just a Guile Scheme object, a variable called clutter-gst, and it contains an object that you can manipulate using Scheme code. That gets converted into a lower-level representation, which we call a derivation, and then we ask the daemon, the Guix daemon, to execute that derivation. A derivation is like a package in that it has inputs, which are required to build it; it has an output, which is the built product; and it exactly specifies the steps that need to be taken to build it, like make and make install. The difference between this and a package is mainly that there are other factors, such as the platform for which you're building the package, which go into determining the output hash. So if I take that package and build it on an ARM machine, I'll get a different output path than if I build it on an x86 machine.
So a derivation is a lower-level representation of a package. We ask the daemon to build it, and this executes the build in a tightly isolated environment, very much like a chroot, with separate namespaces and unique UIDs, and the code that drives the build is Guile. It's Guile Scheme code. I'm not showing the script here, but this would be the build of one derivation, this would be another derivation; there are three derivations here, right?
And they're all running make and some other things to build software, and they're all being orchestrated by a Guile script. So the question is how you stage that code into the build side in order to run it. When they finish building, they'll store their output in the store, like I mentioned earlier. Generally we speak of the host-side code as being the code that lives up at the top, outside the scope of the build container, and the code that runs inside the build (not a container, but the isolated build environment) we refer to as the build-side code.
So now I want to talk about how you get the code from the host side to the build side: how do you generate that script that drives the process of building clutter-gst, for example? One way that you could do it is to use S-expressions. A lot of you are familiar with S-expressions. If there is anybody out there who is not, perhaps the viewers, perhaps somebody in the audience, an S-expression is essentially a list that can be evaluated by Scheme or Lisp or some similar language. And the very wonderful thing about S-expressions is that they can represent data and code at the same time. A lot of people who like Lisp like it because it has this property called homoiconicity, that is, the fact that you can use code and data interchangeably. This allows you to create an expression like the one that you see here, bind it to a variable, pass it around, and evaluate it later.
Also, in Scheme, you have this quote mark right there at the top. Sorry, didn't mean to do that. Right there. That is quotation. It means I literally want this list, this expression, so that I can pass it around. So once you've bound this expression to that variable, you can pass it around, and you want to get it into the build side and basically create a symlink pointing to the output path for coreutils. But you can't just embed the output path from the beginning, because that is not known until you build the derivation.
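For readers who have not met quotation before, the point about code and data can be tried in plain Guile without any Guix involved. The variable name `link-exp` and the directory string are placeholders chosen for this sketch, not taken from the slide.

```scheme
;; A quoted list is plain data: nothing is evaluated yet.
(define link-exp
  '(string-append "/some/dir" "/bin/ls"))

;; Because it is an ordinary list, it can be inspected like one.
(display (car link-exp)) (newline)      ; the symbol string-append
(display (length link-exp)) (newline)   ; 3 elements

;; Later, the very same list can be evaluated as code.
(define result
  (eval link-exp (interaction-environment)))
(display result) (newline)              ; the string "/some/dir/bin/ls"
```

The expression is built in one place and evaluated in another, which is the essence of staging code from the host side to the build side.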
In order to get that information on the build side, we translate the S-expression in some fashion into a form that looks it up at build time. The way we do that here is, instead of writing the strings, we have essentially a dictionary, right? It's an alist (an association list) that contains the inputs, and the associated value is the output path of coreutils. Likewise, the output variable here is arranged to be in a context where output refers to the output path. So I take this build expression, I pass it to a procedure that Guix provides, and it promises that when I run this build expression on the build side, it will be evaluated in an environment where this variable refers to something like that.
What goes into that, the build inputs, is determined by a separate variable that I provide. I have to specify that I want coreutils in another alist that I pass separately to the build-expression->derivation procedure. Guix will then look at the inputs, find a package called coreutils, translate that into the output path, and pass that value in the build inputs, and that's how I can reference the output path here in my build-side code.
So this works, and it's pretty good. But I have to manage the inputs manually and the build expression manually, and they're separate, so it's possible that I might forget to specify some of my inputs. If I forget to update my build expression when I update my inputs, the build expression will no longer work when I run it. So it's a bit fragile in that I have to update the inputs and also the build expression; I have to keep them in sync myself. And if I have two build expressions that I want to compose somehow, maybe I want to run one before the other, there are two different lists of inputs that I have to manage, and that pushes complexity onto the programmer. It would be nice if we didn't have to do that.
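The S-expression approach described above might look roughly like the following. This is a reconstruction under assumptions, not the speaker's slide: it needs a working Guix installation with a running daemon, the derivation name "ls-link" is made up, and the exact shape of the `#:inputs` argument should be checked against the Guix manual.

```scheme
;; Sketch of the older, S-expression based interface.
(use-modules (guix store)
             (guix derivations)
             (guix packages)
             (gnu packages base))       ; provides the coreutils package

;; Build-side code, written as a quoted S-expression.  %build-inputs
;; and %output only get their meaning on the build side.
(define build-exp
  '(let ((coreutils (assoc-ref %build-inputs "coreutils")))
     (mkdir %output)
     (symlink (string-append coreutils "/bin/ls")
              (string-append %output "/ls"))))

;; Host-side code.  The inputs are listed separately and must be kept
;; in sync with BUILD-EXP by hand: renaming "coreutils" in only one of
;; the two places breaks the build.
(define drv
  (with-store store
    (build-expression->derivation
     store "ls-link" build-exp
     #:inputs `(("coreutils" ,(package-derivation store coreutils))))))
```

The string "coreutils" appears twice, once in the expression and once in the input list, which is exactly the duplication the talk identifies as fragile.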
So, a quick summary of S-expressions. Some benefits of using them are that they're familiar and, because the inputs are separate from the expression, it's fairly easy to change the inputs without modifying the expression. If I have a custom package that I want to use in place of the default coreutils, it's very straightforward: you just update that value in the alist of inputs. But the downside is that, because I have to manage the inputs manually on my own, it pushes complexity onto me, the programmer, it makes it more likely that I will forget to update both places, and these expressions become less composable.
G-expressions are an attempt to make this better. The main intent is to get rid of those separate inputs so we don't have to keep track of them manually. The way this is done is by introducing a G-expression with the gexp form and then introducing references to the inputs that we would like via an ungexp form. When I do this, ungexp coreutils inside of a G-expression, Guix will remember that coreutils is an input. This list of inputs for the build expression is automatically associated with the G-expression. It is carried along with the G-expression, and I don't have to keep a separate list up to date. So I can just write this build expression, pass it to gexp->derivation, and I will get the same basic effect that I wanted. The coreutils package will be translated into its output path, and output is a special case that gets translated into code which accesses the output path via an environment variable which is always set in derivations that Guix builds.
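Written with the long-form `gexp` and `ungexp` forms, the same build might be sketched as follows. As before, this assumes a working Guix installation with a running daemon, and the derivation name "ls-link" is invented for the example.

```scheme
;; Sketch of the G-expression version, using the long-form syntax.
(use-modules (guix gexp)
             (guix store)
             (gnu packages base))       ; provides the coreutils package

;; No separate input list: referring to coreutils through ungexp is
;; what records it as an input of the expression.
(define build-exp
  (gexp (begin
          (mkdir (ungexp output))
          (symlink (string-append (ungexp coreutils) "/bin/ls")
                   (string-append (ungexp output) "/ls")))))

;; gexp->derivation is a monadic procedure, so it is run against an
;; open connection to the daemon.
(define drv
  (with-store store
    (run-with-store store
      (gexp->derivation "ls-link" build-exp))))
```

Compared with the S-expression version, there is only one place where coreutils is mentioned, so the expression and its inputs cannot drift apart.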
It's a little tedious, though, to write gexp and ungexp, so we like to shorten that by introducing reader syntax: #~ (hash tilde) for gexp and #$ (hash dollar sign) for ungexp. You can see that when we do this, it looks very much like a quasi-quoted S-expression, except that now the input coreutils is automatically added to the list of inputs in the G-expression and I don't have to manage it. This is quite nice because it makes G-expressions harder to write incorrectly. I can write a G-expression without maintaining the list, so it's harder to make it fragile. But they're also more composable. You can put not only packages into a G-expression via the ungexp form; you can also include other G-expressions, or other high-level objects that we'll talk about in a little bit.
G-expressions also give you the ability to import modules, like Guile modules, onto the build side. By default only the built-in modules are available, so if I run code like what you see here, it's not going to work. Guix provides this helper module, (guix build utils), which provides the procedure mkdir-p. If I want to make my empty directory with it, it's not going to work, because that module is not available for importation on the build side. But if I add a with-imported-modules form, this will arrange to make that module available for importation on the build side. In this way, modules that are not built into Guile by default can be imported from the host side into the build side, and Guix promises that it will set up the necessary arrangements to make them available there. This is how we end up using a lot of different libraries in Guix on the build side.
So in summary, some of the pros of using a G-expression are that it manages your dependencies for you in the build expression that you're passing around. I no longer have to maintain a list of inputs. That leads to less complex code, and it makes G-expressions much more composable than S-expressions used directly.
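The reader syntax, module importation, and composition described above can be combined in one short sketch. It assumes a working Guix installation; the variable names `make-dirs`, `link-ls`, and `build-exp` are invented here and are not from the slides.

```scheme
;; Sketch combining #~ / #$, with-imported-modules, and composition.
(use-modules (guix gexp)
             (gnu packages base))       ; provides the coreutils package

;; mkdir-p lives in (guix build utils), which is not built into Guile,
;; so it has to be imported into the build side explicitly.
(define make-dirs
  (with-imported-modules '((guix build utils))
    #~(begin
        (use-modules (guix build utils))
        (mkdir-p (string-append #$output "/bin")))))

;; A second, independent G-expression that refers to a package.
(define link-ls
  #~(symlink (string-append #$coreutils "/bin/ls")
             (string-append #$output "/bin/ls")))

;; Composition: ungexp one G-expression inside another.  The inputs
;; and imported modules of both parts travel with the result.
(define build-exp
  #~(begin
      #$make-dirs
      #$link-ls))
```

Passing `build-exp` to `gexp->derivation` would then be expected to build a directory containing the symlink, without any manually maintained input list.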
It does have some downsides. For example, the fact that the inputs are no longer separate makes it much more difficult to dynamically modify those inputs. Once I've said ungexp coreutils, I get the output path for that exact version of coreutils, and if I want to change that to a custom version, it's a bit more difficult. In practice that can be a pro or a con depending on what situation you're in. For the way that we use G-expressions in the GNU project, the GNU Guix project I should say, it's generally a good thing.
So moving on, I'd like to show you some examples of how we are actually using G-expressions in our project. This is an example showing how we would manage a daemon in GNU Guix. We manage daemons with the Shepherd, which is our init process, and for any daemon that you want to start and stop, you have to create a Shepherd service. This again is just a record in Scheme code that describes what to do. There's a start procedure and a stop procedure. Actually, these are fields, and the value of each field is a G-expression. So here there's code which will invoke make-forkexec-constructor, and it'll insert some command to start OpenSSH. The OpenSSH command in turn is another G-expression, which has some other ungexp forms, and so on and so forth, until eventually you get to the OpenSSH package and you get the output path of the package. In this way we're composing a lot of different expressions together, and it would be very hard to do that if we were just using S-expressions and had to manage the inputs separately.
Another example is booting the system. When we boot the system in Guix, that is to say the distribution, the GNU/Linux distribution that Guix provides, we build the initial RAM disk also using Guile, and we use G-expressions to stage the code that is necessary to build it. Sorry, that is necessary to run it. I'm not showing all the details, right?
The file systems variable and the Linux modules, their definitions are not shown, but they are objects which can be ungexp'd inside of a G-expression, and an appropriate representation will be substituted in their place. So, for example, the latter will expand to a directory that contains all of the modules.
One more example. When the system starts, you generally have to do various bookkeeping tasks. For example, the DHCP daemon likes to have its lease file exist before it will start. In order to create that lease file on a traditional system you might have a Bash script that creates it, but because we love GNU Guile in the Guix project, we do it with G-expressions. We stage the Scheme code, and again it's... yeah, okay, five minutes. And again you can see we're just using ungexp all over the place, getting package outputs and appending the path to the actual program, and this is how we run daemons in GNU Guix. It's how we make glue code to set up things like the leases file for the DHCP daemon.
There are many other uses that I haven't talked about, everything from running automated tests to deploying to other servers. It's possible to remotely evaluate a G-expression and transfer the whole transitive closure of its dependencies. Again, you don't have to manage that manually; it just happens through the magic of G-expressions.
I mentioned earlier that you can also insert other things besides packages in a G-expression. There is one example of that I want to talk about, but first I'm going to explain how we go from a package to an output. Oh, ten minutes. Well, we'll have more time for questions. So how do we go from coreutils to the output path? Generally speaking, we start with a high-level object like a package, we lower it to some lower-level representation, and then we expand that into the output that gets substituted. So concretely with the package there's a procedure for the package compiler.
There's a package compiler that has a procedure called lower, which will convert a package into a derivation, which, as I mentioned earlier, is that lower-level representation that describes exactly how to build it. Then, once you've got that lower-level representation, the derivation, there's another procedure associated with the compiler which will expand that derivation. In other words, it will just get the output path that has been calculated for the derivation.
So it's not just for packages. Suppose, for example, you want to get rid of this string-append, right? We've got this code up here; on the build side, it expands to string-append and then two strings. This works, this is nice, but if you want to avoid running string-append on the build side, it'd be kind of nice to just be able to say, I just want to run, you know, bin/hello, right? You can do that by using a so-called file-append object. A file-append object is something that you give a base, a package as a base, and the string suffix that you'd like to append to it, and it does the same thing that string-append would do, but when it's lowered to its replacement value, it will just result in this string. So this is a very simple example, but I hope it gives you the idea of how you can take anything that you want, really, insert it into a G-expression, and get an appropriate replacement value.
Another, slightly more complicated example: if I want to run Tor with a custom configuration, maybe I'll use my file-append object to get a replacement value that leads to Tor, and I'll also want to pass a local file that I have, containing my configuration, as a file argument to Tor. I can do that with a local-file object, which is basically in the same spirit. I'm just saying I want to take the contents of this file.
I'd like Guix to put it in the store, and then I'd like Guix to replace the reference with the path to that file in the store. And, you know, since Guix is a functional package manager, if I change the contents of this file, it would end up going to a different location with a different hash. But the point that I'm trying to make is that you can put any kind of object you want in an ungexp form, and as long as there's a gexp compiler that knows how to convert it into a lower-level form, you can put pretty much anything you want in and get an appropriate value back. It doesn't even really have to be associated with Guix's store. You could imagine having something that, I don't know, is a list of your favorite fruits and inserts them without consulting any kind of GNU Guix-specific stuff.
So G-expressions are a way to tie this code staging mechanism together with the deployment of software, and in the Guix project we use them quite frequently to glue together different pieces of the system that involve packages and configuration files, where we want to compose lots of expressions together to create a larger program. G-expressions make it very easy for us to do that, and in a lot of cases, in that sort of problem domain, it turns out that they are more ergonomic than using S-expressions directly.
I don't want you to get the wrong impression about G-expressions. I didn't invent them. They were made by Ludovic Courtès, who's also the guy that made GNU Guix, basically, and he has a great paper that describes in more detail some other wonderful features of G-expressions. For example, they can be used to access either natively compiled code or code that's compiled for the target architecture. They can also be made hygienic, so that, in the same way you can have hygienic macros, when you're manipulating this code with G-expressions you can maintain a similar kind of hygiene.
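The Tor example with file-append and local-file objects, described a little earlier, could be sketched like this. It assumes a working Guix installation and a configuration file named `torrc` sitting next to the source file; the name `run-tor` is invented, and the store paths in the comment are elided on purpose.

```scheme
;; Sketch of ungexp with objects that are not plain packages.
(use-modules (guix gexp)
             (gnu packages tor))        ; provides the tor package

(define run-tor
  ;; file-append is lowered to the package's output path with the
  ;; suffix already appended, so no string-append runs on the build
  ;; side.  local-file copies the (hypothetical) torrc file into the
  ;; store and is replaced by its store path.
  #~(execl #$(file-append tor "/bin/tor")
           "tor" "-f" #$(local-file "torrc")))

;; Once lowered, the staged code would look roughly like:
;;   (execl "/gnu/store/…-tor-…/bin/tor"
;;          "tor" "-f" "/gnu/store/…-torrc")
```

Editing `torrc` changes the content that gets copied into the store, so the reference in the staged code would then point at a different store path.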
Unfortunately, I don't have enough time to talk about those features, but if you are interested, his paper has a lot of detail, so I highly recommend that you read it if this sort of code staging mechanism is interesting to you. The manual for Guix also has a lot of great information, and the community in general is very welcoming, so I invite you to get involved, learn a little more about it, and talk with everybody. Thank you for your time. Do you have any questions? We have a few extra minutes.
Go ahead. Yes, in the back. I think the question is back in... I can go back to the slide that we had, perhaps, where we were importing modules and... It's this one, right? Oh, okay. So forget about the screen, then. Right. You say, I want to use these modules, and then again I have to say, with that module, do something. Right. So the question is: why do I have to say with-imported-modules (guix build utils), followed by code that then says use-modules (guix build utils), right? Okay. So (guix build utils) is... It mainly has to do with the built-in modules of GNU Guile, because it's not necessary to make those available through any special means. Every derivation is run with Guile, an exact version of Guile, and if you're just using a built-in module, you don't need any extra support for that. You can just say use-modules. If we coupled use-modules together with the with-imported-modules form, then you would have to import that built-in module from the host side, and that means that if I'm using one version of Guile and you're using a different version of Guile, it starts to be possible that we would pull in different versions of the built-in modules, which were already there to begin with, right? So it's mainly to improve the reproducibility of the build, and because it's not necessary for built-in modules, we didn't want to couple those two together, right?
Right, so the question is: it seems like G-expressions are kind of tightly coupled with Guix, and is it possible to remove them from the context of Guix and use them in other situations where code staging might be useful? Right. So I kind of mentioned this with that silly example earlier, but it is possible to do that. It's just that right now there is a lot of Guix-specific stuff kind of polluting the implementation, I guess. G-expressions by themselves can be used without any Guix-specific stuff. You can write a G-expression that is essentially just an S-expression and doesn't reference any packages, and you would be able to stage that expression without using any Guix machinery to do so.
Also, suppose there's a situation where you have high-level objects of some kind, maybe a graph of objects, not packages but something else, and you want to insert a reference to some aspect of one of them, right? You want to take some high-level object and make code that's going to have a reference to that thing, but you don't want the thing to be included directly; you want to have some representation of it substituted in the code. You can do that with G-expression compilers, and they don't have to use the store. It just happens to be the case that every problem in Guix involves the store. So you could do this; it would not take too much work to take the code and move it out. But as it's implemented right now, all of the APIs that involve using G-expressions, like gexp->derivation, assume that you're using the store. But did I kind of convey how it might work? You also have a question? We can talk about it after, yes.
How do they get combined? So the question is how G-expressions get combined, and whether there is a way to specify how they are combined together.
Just like how you would ungexp a package in the middle of a G-expression, you would ungexp another G-expression; that's how you combine them together. So if I have two G-expressions and I want to run one before the other, I would make a new G-expression that says open parenthesis, begin, and then I would ungexp the first one, and then I would ungexp the second one, right? Yeah. You can just write code that looks like an S-expression, but you start it with the gexp form, and then whenever you want to put a gexp in somewhere, you ungexp it. The idea is that it works basically the same as quasiquote and unquote. If you're dealing with S-expressions, you can insert a new S-expression in the middle of a quasi-quoted S-expression using a comma, right? And it's exactly the same with a G-expression.
I think we're out of time. Is that right, Piotr? Yeah. Okay. Well, thank you very much. I appreciate your time.