Thank you. Hello. Welcome to my talk. It's about the reduced binary seed bootstrap for GNU Guix. I'm going to give an overview. First there's an introduction about what a binary seed bootstrap is, then how we did it at a high level, and what's in the future.
So the talk starts with GNU Mes. GNU Mes is a Scheme interpreter written in a simple subset of C, and a C compiler written in a pretty simple subset of Scheme. And it's called Mes because our Scheme is built on eval/apply, which is sometimes referred to as the Maxwell's Equations of Software.
Okay, so what's a compiler? The command that you see at the top is GCC. That's a C compiler. The compiler produces a program that you can run on your computer from your human-readable source code. So this is a program that humans are supposed to be able to read, and GCC will convert that to a program that is still readable for humans, but that is much harder to program in. It doesn't have to be this way, but often it's a two-stage process: the assembler then produces object code, which looks like this. The computer can read that, and then you need some stuff at the front to be able to execute it as a program. So obviously, you don't want to type in these codes when you want to have the computer greet you with "Hello, world". So we've covered Mes, and we've covered the compiler.
So what's a binary seed? A binary is a program that was not built from source, but that you run or want to run. A binary can also be a program that you have the source code to, and you might be thinking that you're running the program from source, but actually it's a previous version, a previous generation of the program that you have the source code for. You're actually using a binary that was created earlier. That's a very popular thing to do.
We do it with GCC, but nearly every language, even languages that are being created new today, like Perl 6, is written in the language itself. We may only hope that there's another way to create the first Perl 6 compiler.
So, a bootstrap seed. If we look at Guix, the current version, 0.16, uses 250 megabytes of opaque binaries to build the whole software distribution. That's a pretty small seed compared to the rest of the distribution, but it's still pretty large. If you look at more traditional distributions, you see a similar set, but it's almost two or three times larger. And this is a trusted set of binaries that we are all taking for granted.
So what's a bootstrap? A bootstrap is the thing that you see here, the straps on a boot, and to bootstrap yourself is to pull yourself up by your own bootstraps. So, traditionally, it's an impossible task. What does it mean to bootstrap in programming, say, a compiler or a kernel? You take the source of a program, you add something, and then you get the binary that you can run. And the thing is: what is that something? What do we do? We use an ancient recipe. You can think of the source code as milk: you add a bit of yogurt, and you create more yogurt. So all we have to do if we want to create the next version of GCC is just use GCC. And we're done.
Okay, so what does it mean to reduce the binary seed of a bootstrap? Well, one of the biggest things in the bootstrap is GCC and the tools it needs. So we simply replace GCC. This is the current reduced bootstrap graph that we will use in the upcoming version of Guix. I won't go into details here, but here is the first version of GCC, version 2.95.3, which is actually built from source by TinyCC. And Mes is one of the first binary steps we use. So Mes: we've had most of this.
So the reduced binary seed bootstrap: the thing that's new here is that the binary seed that we use has been reduced from 250 megabytes to about 130.
So why would we want to do it? Well, at least 35 or 40 years ago, people realized that if you are using a binary that you cannot inspect, it's a trust issue. Ken Thompson gave a great speech, also published as an article, and part of it is about why you don't want to start from a binary seed. You shouldn't do it if you care about security or privacy. And, okay, let me go on. There are more people who advise against using a binary seed and for bootstrapping instead. In the 60s and 70s, for example, we used to do it like that. It's very pragmatic if we don't use binary bot, if you want to support no harm there. Then I have a kind of joke; we'll get to that.
So security expert Bruce Schneier reminded us that the trusting trust attack, as Ken Thompson described it, is not just an interesting anecdote. It's becoming more relevant today. Peter Herman says something similar. He says enterprises should have processes to ensure they can trust the software. And he even says you still cannot trust any software today. And David A. Wheeler, who focused on diverse double-compiling to address bootstrap binaries, thinks this is a good idea.
You mentioned the Thompson attack, but I feel like if I didn't know what a Thompson attack was, I wouldn't know from this how you would use it to do something dangerous. What would somebody who is doing something malicious put in that binary?
Yeah, so the question is: how does a Thompson attack work? How could someone do something dangerous? Well, I don't have time to go into the technical details, but the interesting thing is that if you use an untrusted compiler, as Ken Thompson showed, you actually hand over all control of your computer.
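To make the mechanism behind that question concrete, here is a toy model of a Thompson-style attack. It is only a sketch under loud assumptions: "source code" is a plain string, a "binary" is a Python function, and the names `honest_compile`, `evil_compile`, `LOGIN_SOURCE` and `CLEAN_COMPILER_SOURCE` are hypothetical and not taken from the talk or from Thompson's article.

```python
# Toy model of a "trusting trust" attack. Everything here is hypothetical:
# source code is plain text and a "binary" is just a Python function.

LOGIN_SOURCE = "program login: accept password 'secret'"
CLEAN_COMPILER_SOURCE = "program compiler: translate faithfully"


def honest_compile(source):
    """Faithful compiler: the binary does exactly what the source says."""
    if source.startswith("program compiler"):
        # Compiling clean compiler source with a clean compiler
        # yields a clean compiler again.
        return honest_compile
    accepted = source.split("'")[1]  # the password named in the source
    return lambda password: password == accepted


def evil_compile(source):
    """Subverted compiler binary; its malice appears in no source file."""
    if source.startswith("program compiler"):
        # Recognise compiler source and re-insert the subverted
        # behaviour into the next compiler generation.
        return evil_compile
    binary = honest_compile(source)
    if source.startswith("program login"):
        # Recognise the login program and add a hidden second password.
        return lambda password: password == "backdoor" or binary(password)
    return binary


# Rebuild the compiler from clean, fully audited source code,
# but using the subverted binary to do the build.
next_compiler = evil_compile(CLEAN_COMPILER_SOURCE)
login = next_compiler(LOGIN_SOURCE)
print(login("secret"), login("backdoor"), login("guess"))
```

In this toy, both source strings are clean and the backdoor still survives a rebuild of the compiler, because the only thing that carries it forward is the binary used for the build. That is the reason the talk wants to shrink the set of binaries that have to be trusted.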
And at any point in time, it could do anything the writer of the compromised compiler wants. And they can put it into the maximum magnitude, that's a key part. It's very hard to detect if someone uses the Ken Thompson attack.
And James Comey, the former US FBI director, said in an interview, well, people ought to take responsibility for their computing. So people should, or could, put tape on their cameras and on their microphones. So paraphrasing him, I can say the FBI thinks we shouldn't just trust binaries; we should do this. And look at what the experts wrote: part of their statement says, well, let's say that big binary blobs of code are not trustworthy and using them is a hazard. And our goal is to reduce them to a minimum.
There was research in the 70s on high-level language computers, where you actually don't need a compiler, because they can directly run a language like Lisp, or they can send codes. But then that has to be put aside because then filers for children, so it won't. But if we look at it from the FBI, if it's in high-level language computing, if we go for another program, because then you can put stuff, I think from source, because there is no concept of compiler vehicles, right?
That's a long remark, more a remark than a question. Like, didn't we do much smarter things before, and maybe even have a high-level language in hardware? I think we're just reinventing the wheel over and over.
So the interesting thing is that the GPL version 3 allows you to distribute binaries. So the question is, can we distribute the GCC binaries at all? Well, you may be allowed to do that if GCC n-1, the one you compiled the program with, is a system library; I don't think so. Or if GCC n-1 is a general-purpose tool; some people would say it is, but you could also argue it's a very specific tool. It's a C compiler. You cannot compile Python. It's not general purpose.
But there's another exception, and that is: you can if GCC n-1 is a generally available free program, and then lawyers can find out what "program" means. But I assume that it's about the binary, because I cannot run some source code, right? So, is it legal for Debian to distribute GCC in Stretch? But maybe if it was legal to distribute the GCC in Jessie, then you can see where that goes. But luckily, or unluckily for some, the GPL version 2 is more lax in that respect.
Now, why would we do this reduced binary seed bootstrap? Well, for me inspiration was also a big part of it. I was inspired by this 500-byte hex0 assembler written by Jeremiah Orians. You just have to have a look at it. It's really great. It's 500 bytes, and it can reproduce itself, but also create slightly more interesting programs. So we might be able to bootstrap our system from 500 bytes, and we can hold it down.
So, let's say we have GCC, which looks like yogurt. So how do we remove the yogurt? Well, that's easy. We eat it, right? So, GCC tastes like yogurt, but is it really yogurt? Well, and here's where we go from bootstrapping to bootstripping. It turns out that the TinyCC developers put a lot of effort into making sure that TinyCC can compile GCC. And TinyCC is an order of magnitude smaller than GCC. But, yeah, is TinyCC yogurt? It's still a fairly big binary blob. Finally, there is Mes, and the Mes binary, which is a really small binary blob, which can compile TinyCC. So the problem has gotten smaller. I'm sad to say that Mes is still pretty big.
So here's how MesCC works. MesCC is just a Scheme script that can be run on Guile or on the Mes Scheme interpreter. And it's a C compiler. It compiles a similar Hello World program to something that looks suspiciously like assembly. And that's actually all that Mes and MesCC do for us. And then it will invoke a program from the stage0 project by Jeremiah Orians.
It's a program called the M1 macro assembler. So this is M1 macro code. And it's assembled to this hex2 code. And you can see that it's already pretty close to machine code. And then there's the linker stage, the hex2 linker, also part of the stage0 project, and I get my program. So this is how it works.
But of course, our ultimate goal is, well, as I presented it two years ago at Boston, what we are aiming for is the full source bootstrap. We won't stop before we bootstrap our systems from nothing but source.
So, a bit about the current state of stage0. When I started two years ago, stage0 was only an inspiration. But over the past two years, Mes has been implemented and I've been working very closely together with the stage0 project. And this is the current state of stage0 and how it comes together with Mes. So the lowest level has all been done. There's a hex0 monitor and a hex0 assembler, which produce a just slightly more capable hex1 assembler. And there's stage 1, the hex1 assembler and the hex2 assembler, which can handle labels and other interesting things. And just recently, a C compiler for a very primitive form of C has been written in M1 macro assembly. So we have a bootstrap C compiler, which really handles a C subset, and it can compile M2-Planet, which is written in a simple version of C.
Maybe you can see where this goes, because now here is Mes, which is currently a C program. We've been working for weeks now to get the C program Mes translated into M2. It's just a lot of work; it can be done, but we want to have a really readable program, because apart from having an all-source bootstrap path, one of the key things is that it's auditable. It's human-understandable. The red things are the stuff that is missing. We cannot bootstrap Mes yet, so there's a gap. And there's a thing with the C library, which is in there. But just imagine we have this done.
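The lowest step in that chain, turning human-written hex text into raw bytes, is small enough to sketch. The following is a minimal illustration in the spirit of a hex0-style assembler, not the stage0 implementation: the function name `hex0`, the choice of `#` and `;` as comment markers, and the rule that every non-hex character is skipped are assumptions made for this example.

```python
def hex0(text):
    """Turn commented hex text into raw bytes (hypothetical sketch)."""
    digits = []
    for line in text.splitlines():
        # Assumption: '#' and ';' start a comment that runs to end of line.
        for marker in "#;":
            line = line.split(marker, 1)[0]
        # Assumption: anything that is not a hex digit is simply ignored.
        digits.extend(ch for ch in line if ch in "0123456789abcdefABCDEF")
    if len(digits) % 2:
        raise ValueError("odd number of hex digits")
    # Each pair of hex digits becomes one output byte.
    return bytes(int(hi + lo, 16) for hi, lo in zip(digits[::2], digits[1::2]))


example = """
7F 45 4C 46      # the four magic bytes that start an ELF file
48 65 6C 6C 6F   ; the ASCII text "Hello"
"""
print(hex0(example))
```

Because a tool like this does nothing beyond mapping digit pairs to bytes, a person can check its output by hand, which is what makes it a plausible bottom rung for a chain whose later steps add labels (hex2), macros (M1) and eventually a C compiler.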
Then after we have Mes, there's still a bigger C library we need for TinyCC. It has been done, but it's marked in red because it hasn't been translated into M2 yet. And we need a bigger C library to be able to compile the rest of it. And there's this thing with bootstrap libraries. There are different paths how we can do this. I need to burn all this code in one big chain of commands in FORMAR. But we are currently prototyping it in Guix, which means you're running on a Linux kernel and you have to execute scripts. So in practice, we're looking at all kinds of binaries, the 130 megabytes that I showed you that we still have in the current reduced binary seed bootstrap. Yeah, that's a question to make right now. And from there, it's pretty much done. This is all scripted in Guix, ready to be merged into master. It works, and we have GCC as the second compiler, which builds the rest of the system.
Yeah, so what's the next stage? I said we're aiming for the stars, but we do it in very modest, simple steps. So one of the things we're working on is, as I mentioned, the Mes C to M2 translation. But another exciting thing, I think, is the Gash project, which is a Scheme implementation of all the utilities we need. You can think of it as BusyBox, but written in Scheme. So if we run it on Mes, you can see it as a source form of BusyBox that you can run. That's work in progress, and it's already on the way. So the first stepping stone will be the Scheme-only bootstrap, which will hopefully reduce our 130-megabyte binary seed by half; we aim for 60 megabytes. And then reducing the Mes part of the binary seed, by way of M2, will, well, hopefully bring us down by another factor of two. We'll see.
Things that have been happening. I said Gash; actually, there were two projects implementing a shell, or Bash, in Scheme.
Historically, there was the Geesh project by Timothy Sample and the Gash project by Rutger van Beusekom. And the projects merged, so we are ready to go forward, and we hope for a 0.1 release in the coming months.
I didn't do this alone. I did a lot of programming alone, but I had a lot of giant shoulders to stand on, and I had a lot of help on IRC and in real life from a lot of people, for which I'm really grateful. So thank you.
How could people help if they want to join in on this project?
One of the first things is to just come: send us a mail, look at the website, or visit us on IRC at #bootstrappable and just say hello. Just showing your support is already very important, because when I started this two years ago, there was a lot of discouragement going on: this is impossible to do. So it is possible, and support is very important. Well, there are all sorts of things that could be done. If you're a Guix developer, or you want to become a Guix developer, you can help get this reduced binary seed bootstrap into the mainline. There's an ARM port underway. This is for x86 32-bit only, although we can, and we do, bootstrap 64-bit from it at the moment. So if you know ARM, please come and help. Yeah, there are lots of things. What we especially need is documentation, all of it. And yeah, Gash: we have our BusyBox in Scheme. If you like hacking Scheme or learning Scheme, we have a simplistic implementation of port and of sed. We need more.
Why do you need ARM if you have GCC? You have a cross-compiler, so you don't need to do anything specific to ARM, right?
So the question is, why do we need ARM once we've built GCC? We can cross-compile. That is an option, but if you're running ARM, you really don't want to depend on stuff that was built on Intel or on Intel hardware. So I haven't talked about hardware, but the same story goes for hardware, so it's nicer if we...
Do you have funding, or something like going to the EU and saying: you have these bug bounties, we have a solution for all our systems being unscrupulous?
So the question was whether we have funding. No, we don't have funding. Interestingly, I applied for funding yesterday, thanks to Piotr and to AILGO. But chances are we won't get it, so if you're sitting on a big bag of money...
You don't have to get funding, but I think this is a valid project for visitors.
So the remark is: how do you apply for grants with the EU? Because it should be possible with privacy and security. The grant I applied for yesterday is indirectly from the EU. I did apply for grants in a previous life, working on LilyPond. I spent two times three months of my life on it. It has been wasted time until now, but if someone can help me with that, it would be much easier.
Maybe you should ask the FBI.
Well, they're watching, right?
If you run this on a CPU that has a MINIX instance running, as I imagine it's in there anyway, I'm sorry to say it that way, but what's the point?
Okay, there's a remark from the audience saying: you're running a compromised computer anyway, so why go through all that effort? You're totally right, so I'm only doing this to raise awareness and to make sure that people who are great at hardware address that problem. I cannot do that. I hear that people are working on that.
It's called POWER9.
Okay, so there's a suggestion to port Mes to POWER9 because it doesn't have a management engine. If you know POWER9, please come and help port it. The ARM port is well underway. It took two days of effort to do 80% of the work, so it's very doable.
Okay, I'm running to this present idea and I think that if you ask them, they'll say, okay, some people will help you port it. Also, about a year or a year and a half ago, someone from IBM came to us to talk back to them today and tell me what they did.