Hello, I'm Janneke. This talk is about GNU Mes and the ongoing effort to remove the binary seeds that we inject into our free software stack. Hopefully you will learn what this new bootstrapping hype is all about.

So to crack this chicken-and-egg problem, which bootstrapping is, I wrote GNU Mes. MesCC is a C compiler written in a subset of Guile Scheme, and it comes with Mes, which is a Scheme interpreter written in C. From the early days of computing we know that Lisp is a good way to make the jump from a low-level language into a high-level, elegant language.

So you may wonder: before Mes, how did we bootstrap our systems? Well, actually we don't. Well, that's not entirely true, because GNU Guix as well as NixOS actually use some kind of bootstrap. However, Ludovic Courtès, developer of GNU Guix, noticed that the bootstrap binaries that we bootstrap from are still very large, and that is a problem. So he suggested we remove those.

In the 80s already, Ken Thompson gave a talk and showed that we actually have a pretty big problem in computing. He called it the trusting trust attack. So what do we do when a researcher points out a problem that's possibly going to affect everyone and is very hard to solve? Usually we just ignore it, and we want to change that.

So with GNU Guix and also NixOS, we already have a bootstrapping story. NixOS and Guix are interesting from a bootstrapping experimentation and research point of view because the package dependencies in Guix form a directed acyclic graph, which means there are no bootstrapping loops. That's unlike most other software distributions, which are full of loops that you would need to break. The critical importance of this was noted by Bitcoin developer Carl Dong. He gave a talk about this at the Breaking Bitcoin conference last summer. I warmly recommend that talk. It will take only 18 minutes of your time. Carl Dong explains how, wishing to provide the community with trustable binary downloads, they have implemented Gitian.
That's a system that uses reproducible builds to do so.

So in computing, bootstrap is slang for doing something which is actually impossible to do. For example, say that you wrote the very first C compiler and you wrote it in C and you called it GCC. It is impossible to compile that very first C source code into a working GCC binary. So what do you do when you're confronted with an impossible task, but you know that something quite similar has been done before? You just ask grandma. So, grandma, tell me again, how did you make that yogurt? Well, son, you get some fresh milk, must be good milk, and you just take some leftover yogurt from yesterday.

Okay, so with that wisdom, we can now create our second GCC compiler. We take our GCC source and we compile it into a working binary. So while this looks like ordinary milk, it actually is a bit of software that has been carefully crafted. It's a masterpiece, a work of art. It's bug-free. The difficult parts have been peer reviewed, maybe we pair programmed some difficult bits, or if possible we even applied formal methods to prove that this second compiler of ours will be bug-free. Then we apply the recipe. And we even share our recipe so that others may reproduce the result that we got and produce the same second compiler. And lo and behold, we're reproducible. We got the same second compiler. As long as they follow exactly our recipe and they use exactly our first compiler, we're reproducible. And we're just as safe as our first compiler was, actually.

So what follows from this is that reproducibility is critical, but it's not enough. And even reproducibility with clean source code is not enough. So Carl went looking for something else and found it in Guix. He noticed that with Gitian, in order to provide trusted binary downloads for Bitcoin, when you start a Gitian build, it starts with downloading almost all of Ubuntu.
So in order to create a trustable binary download, we first download a lot of binaries that we have to trust. So yeah, that's not good.

So last year at FOSDEM, I presented the reduced binary seed bootstrap, which reduces the bootstrap seed by almost 50%. It removes GCC from our bootstrap seeds, our first compiler, right? So could we improve on that? Well, reducing the binary seed further, in our case of Guix, would mean removing bash, coreutils, bzip2, awk, grep, gzip, patch, sed, tar, xz, maybe a couple more. So that's why I'm very proud and excited that NLnet saw the importance of this project and decided to fund me to create this next step, which I'm presenting here: the Scheme-only bootstrap. Another reduction of the binary seed by about 50%.

One component of that new bootstrap is Gash and Gash Core Utils, which are an implementation of these critical binaries in Scheme. While we've been focusing quite narrowly on bootstrapping, it's our intention to provide a really rich shell scripting experience and bring that to Guile. So this is what the current bottom of the bootstrap graph now looks like. The only interesting binaries left here are a Scheme interpreter and a Scheme compiler, GNU Mes and GNU Guile. So that's the Scheme-only bootstrap.

So when Vagrant Cascadian got GNU Mes packaged for Debian and it went into unstable, at the Reproducible Builds Summit last month, he was wondering: is there anything we can do to give more trust to this new first compiler that we injected? And he thought, well, it would be nice if we could build Mes on different distributions and prove that we get the same binary. That tells us something. So when he suggested that, Dave Derry and Jelle van der Waa joined in and we actually did that. So is there any more we could do? Carl Dong thinks so.
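
The cross-distribution check described above comes down to comparing checksums of the binaries that each build produced. What follows is only a minimal sketch in Python of that comparison, not the procedure that was actually used at the summit. The distribution labels and the byte strings are hypothetical placeholders standing in for real build outputs, and the function names are made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a build output as a hex string."""
    return hashlib.sha256(data).hexdigest()


def group_by_digest(builds: dict[str, bytes]) -> dict[str, list[str]]:
    """Group build environments by the digest of the binary they produced."""
    groups: dict[str, list[str]] = {}
    for environment, binary in builds.items():
        groups.setdefault(sha256_hex(binary), []).append(environment)
    return groups


def is_reproducible(builds: dict[str, bytes]) -> bool:
    """True when every environment produced a bit-for-bit identical binary."""
    return len(group_by_digest(builds)) == 1


if __name__ == "__main__":
    # Hypothetical placeholders: in practice these bytes would be read
    # from the binary built on each distribution.
    builds = {
        "distro-a": b"\x7fELF placeholder build output",
        "distro-b": b"\x7fELF placeholder build output",
        "distro-c": b"\x7fELF placeholder build output",
    }
    for digest, environments in group_by_digest(builds).items():
        print(digest, sorted(environments))
    print("reproducible:", is_reproducible(builds))
```

Matching digests only show that the different toolchains agree on the result. They make a compromise of any single toolchain less likely to go unnoticed, but they do not by themselves prove that the binary is trustworthy.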
So stepping back a bit from this: given that we dislike downloading binaries from the internet and trusting them, why not stop doing so altogether? So that's what I'm proposing to do in the coming year, to create the full source bootstrap.

Then is there anything left? Well, we do it on the Guix system. So we have the Guix build daemon, we have the userland, and we have the Linux kernel. And at the Reproducible Builds Summit, Ludovic actually built a Guix package in the initial RAM disk. So with that effort, userland and the build daemon are removed from the picture. The next obvious target is Linux. The latest release of Mes, version 0.22 from this month, starts to run on the GNU Hurd. A microkernel could help there with reducing the trusted computing base.

So will we be having a boring life after that? Well, Jeremiah Orians has some ideas. And Mark Weaver has some nice suggestions. So we have something working. The next step is maybe other architectures and, well, making it more auditable.

So are we really doing this just to address the trusting trust attack? To be honest, I'm not sure. I think that the proper way to do computing is to always use source code and compile that into a binary. And I think the trusting trust attack is just a symptom of confusing a binary with actually compiling a program from source.

So I'm very grateful for all the help and support that I got in this project. Many people are helping by spreading the word or helping with code. So that's all, folks.

So the question is: I'm not very knowledgeable about hardware, but is there something we should do? Yes, if you're knowledgeable about hardware, please do something. Thank you.