So, this talk is about mkosi-initrd, which is an alternative way to build initrds. The official title had "in Fedora" in it, but the Fedora part kind of shrank over time, so it's mkosi-initrd with a bit of a mention of Fedora at the end. Let me start with the justification, and for that I'll just refer to the talks on Friday. We had some excellent talks about confidential VMs, and confidential VMs can be summarized as: you are not allowed to run any code which has not been signed and verified to be the right thing, delivered by the vendor or the owner of the machine, whatever. So we need to have everything immutable, signed, and checked. This is certainly the case for confidential VMs, but it's also the case for other things. For example, you have a device somewhere on the edge and you just want it to do the thing it's supposed to do, and if somebody comes in and swaps the disk, this shouldn't allow them to do anything with the device. So if we start at this point, say that we may only run signed code, and work backwards from there: currently we have only one way to deliver kernels that are checked, which is to create a unified kernel image (UKI) that combines the kernel and the initrd, with the whole thing signed together. Well, okay, that's not the only way, but I would say it's the best way. And if we are supposed to sign the initrd, it must be built by the distro, because we need to build it before we can sign it, and the keys are in a vault somewhere. So local modifications are not possible; they're just not useful in this model. The current system that we have, which has been built around local modifications and doing things on the end machine, cannot be used anymore. So let's say we design a system from scratch.
What we would do, instead of the fairly complicated thing we have right now, is do things in the simplest way possible. That means we take upstream packages and distro packages, which already have our software in a form that is ready to be deployed anywhere, and just build an initrd out of that. This is what mkosi-initrd is. So, Martin, do you have a question? If anyone has questions, please raise a hand; if I'm talking nonsense, it's better to clear it up during the presentation. Right: in Fedora and RHEL we have dracut, but other distros have similar systems that take files from the host file system and do some magic to resolve which dependencies should be installed, so ldd on binaries and maybe some special hacks to figure things out. Essentially, when you're building an initrd, you are redoing the work that was already done by the packaging, just in a more hacky way. It's a duplication of the packaging layer: you have a dependency mechanism and conditionals and a lot of that stuff. And of course this takes time on each end machine. That's what happens while we are building the initrd, before we install it. After we have installed it, the initrd is very special: with dracut we have the initqueue and we have special code; for example, dracut has bash code that generates bash code. So things are complicated, and different than on the host system, and if you want to debug things you need to know both the thing you are debugging and the initrd environment; it's just complexity. The environment is set up in a slightly different way. And what I want to underline, because not everybody is aware of it, is that our initrds use systemd: we start systemd in the initrd, and it sets up the whole system in the way that it likes it to be set up.
So /proc and /dev are mounted in exactly the same way as on a real system, services are started in the same way, and so on. The difference is that the root of the file system is a temporary file system instead of a real file system, but the logic and the API are the same. But on top of this we have the systemd queue for jobs and we also have the dracut queue for jobs, and they interact and play with one another in some semi-defined way. Another issue I find with the system is that every distro does a different thing. So the alternative approach is to say: let's do less. Let's build images from distro packages. mkosi started as a script, a wrapper around package managers, initially written to test systemd. It builds images, and it's actually fairly convenient for building initrd archives, because it has support for different package managers, conditionalization, and things like that. But mkosi is not the important part; we could replace it with a different system and it would probably work just as well. Anyway, mkosi builds images from distro packages, so it's kind of like a fancy wrapper around chroot and dnf. For the upcoming mkosi release we have done a lot of work. systemd-repart has been taught to write stuff directly to the output file without using a loopback device, which allows for unprivileged operation; that is very nice if you're building images as a normal user, or building images in a container, because you don't need root privileges. mkosi has been converted to use dnf5, and a lot of work has been done on the way the configuration files work: you can now do profiles and conditional logic based on matching. For example, you can have a shared config in one file and the distro-specific subset in a separate file with a match section, and you end up with a limited amount of duplication.
All those things are part of mkosi 15, which will be released as soon as it's done, so maybe next week. So we have mkosi, and now I can say what mkosi-initrd is: it's just a few config files. The main thing is a 20-line file that lists the packages that should be installed in the initrd. So what are the benefits? This is a really, really minimal system. mkosi itself is, well, kind of complicated, but it's essentially a wrapper around dnf, so the heavy work is done by packages, and we use the package dependency resolution mechanism to figure out everything that should be installed. This means our packages have to be packaged well, but we do that work anyway, so that's okay. We let the existing tools handle 98% of the work. We are independent of the host: I want to build an mkosi image for Debian or Gentoo or Arch on my Fedora system, and this works without any problem, and the other way around too. This is not just a question of cleanliness, though it is kind of ugly to pull files from the host file system, because they might have been modified locally or something might be off. It's also important for build reproducibility: if you are supposed to sign something, and you take it from packages, you know that the package manager will verify the hash of each file it installs, and if you repeat the installation you expect a bit-for-bit identical result, so there's much less variation. The images can be reproducible and, fortunately or unfortunately, everyone gets the same image, which is a problem I will talk about later, and it makes sense to sign them. And we reuse systemd, so we get rid of those additional helpers, the bash stuff in the initrd.
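The "20-line file" mentioned above is an ordinary mkosi configuration. A minimal sketch of what such a config could look like, using the standard mkosi INI format; the package list here is illustrative, not the actual Fedora one:

```ini
# mkosi.conf — hypothetical minimal mkosi-initrd configuration
[Distribution]
Distribution=fedora
Release=39

[Output]
# initrds are cpio archives, not disk images
Format=cpio

[Content]
Packages=systemd
         udev
         util-linux
         kernel-modules-core
```

Adding something like sshd to the initrd is then just one more line under Packages= (e.g. openssh-server); dnf pulls in the dependencies.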
systemd does the setup, and the execution environment in the initrd is, well, not exactly, but very similar to the host environment: you open a shell in the initrd and it's like debugging a normal system. The root file system is different. Also, say you want to install sshd into the initrd: you specify an additional package on the list of packages to install, sshd pulls in its dependencies, and you don't really need to do anything else, because the packaging has already been done in such a way that you can pull sshd into any system and it gets started as part of the normal systemd transaction. Yes? [Question about reproducibility.] Reproducible in the sense that if we take the exact same set of RPMs and install them, we expect the same result, because we install them in a fixed order. It's a bit hand-wavy, because it's possible that it doesn't actually work; we haven't really tested this properly, because there are other problems, but in principle there's really no reason it shouldn't: the set of RPMs is fixed and the installation order is fixed, so we expect predictable results. Yes, it runs into the same problems as building RPMs reproducibly: you have to do some adjustments to fix timestamps and so on. Yes? [Question about size.] Well, maybe twice as big, and I'll get to this. It's a valid problem: the initrd is bigger, but it's not terrible. First of all, I don't want to give specific numbers, but you would expect the initrd built with dracut to be much smaller because it has a smaller set of files; actually that's not true, because almost everything now is delivered as a binary plus some libraries, and if you follow the package dependencies you get almost the exact same set of files, because you need the same libraries, and most of the space is used by the libraries, by code.
So there's very little difference in that regard. Then there is some stuff like the hardware database: I'm installing it because it's part of the RPM, while dracut does not install it, but I think that's a dracut bug. So there are little differences, and then the big difference is what you said: what to do about kernel modules. What I'm doing right now is just installing the kernel-modules-core RPM that has all the modules, so that's 30 megabytes or something like that. It will be possible to change this: I'm saying that we want to install everything from packages, but it's not like we cannot deviate; we already remove some files that we don't want, and if there's no other choice, we can figure out a mechanism to select a subset of modules. This is doable. Hmm? Okay. Another way to answer this is: the initrd is, let's say, less than 100 megabytes. For current machines this is actually tiny; you can have Go binaries that are a multiple of that size. So why is an initrd of this size a problem? It's a problem because we load it at boot, decompress it, and put it in memory. There's an idea to use initrds that are not compressed cpio archives but, for example, erofs with internal compression, so you get a block device and only decompress the stuff that you actually use. This needs some small kernel work, and then you sidestep the problem: you could have an initrd that is a gigabyte and you wouldn't really care, depending on how much memory you have. Copying 100 megabytes in memory takes a fraction of a second; it's the decompression that is the problem. Yes, and the kernel modules. Also, what works in initrds built this way is the stuff that is directly supported by packages.
So, for example, LVM is not a problem, normal installations with Btrfs are not a problem, and encrypted disks are not a problem. But iSCSI is currently implemented in such a way that there is some very complicated logic in dracut that does string matching and builds the configuration on the fly from some other stuff, and I don't want to repeat that. I want to change the package so that it supports running in the initrd natively, and let the package authors deal with that, rather than have special logic. Okay. So let's say we build the image in the distro infrastructure and everybody gets the same image. This is nice if it works, but it will not work for everybody, because people need to make some local modifications. The sidestep answer is to just build multiple initrds, and we will certainly do that. Currently in Fedora we build a unified kernel image for virtual machines, not with mkosi-initrd but with dracut, and it has an initrd that only supports booting in all types of cloud VMs: it has enough kernel drivers and enough software to work on virtualized hardware, but not on other things. We could, and probably will, do a bunch of variants, but we also need other ways to provide local modifications, and there's a bunch of answers. Over the last year, at least, a lot of work in the systemd project has been going into this functionality. It is needed if you want to have read-only, signed images that you run, for example, in the cloud, but it's the same problem for the initrd: you want an initrd that is signed and read-only, and then you need to extend it. So I want to talk about those approaches. First, credentials: we have a blob of data, stored somewhere, and we have multiple ways to store the credential. I talked about this during my talk in the morning, so I'm skipping over it, but the important part for confidentiality is that we can encrypt it.
So, we can encrypt the credential with a key file that is stored on a local disk protected with LUKS, so it's a secret; we can also encrypt it using the TPM. Either one, or both, and both is the best answer. Then we have a credential which we can actually place in the ESP, where it's public, so to speak, but it remains secret because you cannot decrypt it. The encryption also works as an authentication layer: if we try to decrypt something that was encrypted in the wrong way, with the wrong keys, the file will not be accepted. So this is one way to have local modifications: they're created on the running machine and then placed in the ESP. The second mechanism is configuration extensions, which is kind of a new concept, but it's a variant of something that existed before. We have system extensions and configuration extensions, and they're kind of the same thing: a discoverable disk image with partial content of the file system, which systemd will load at runtime and make visible in the file system using overlayfs. System extensions are for code, in /usr and /opt, and configuration extensions are for /etc. So you can drop those images, those extensions, into the right places, for example into the ESP, and they will appear in the initrd once systemd loads them. The discoverable disk images can contain dm-verity data, the dm-verity data is protected by a root hash, and this root hash can be signed, with the signature also embedded in the DDI. So systemd will tell the kernel to use the kernel keyring to authenticate the contents of the extensions before they are used. Yes, discoverable disk images. systemd-dissect will tell us whether the thing is an extension for system images, for initrds, or for portable services, depending on the metadata in the image.
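The credential flow described above can be sketched with systemd-creds. This is a sketch, not the exact workflow from the talk: the credential name and the ESP path are assumptions for illustration, and --with-key=host+tpm2 (host key plus TPM) needs root and TPM hardware.

```shell
# Encrypt a secret; host+tpm2 binds it to the local (LUKS-protected)
# host key AND the TPM, so it only decrypts on this machine.
echo -n "my secret value" | systemd-creds encrypt \
    --name=mycred --with-key=host+tpm2 - mycred.cred

# On a real system you would then drop the encrypted blob into the ESP,
# e.g. somewhere like /efi/loader/credentials/ (path is an assumption).

# Decryption fails for a tampered or foreign credential, so the
# encryption doubles as authentication:
systemd-creds decrypt --name=mycred mycred.cred -
```

The design point is that the credential can sit on a public, unauthenticated partition precisely because decryption both reveals and verifies it.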
And here's the example: this is a disk image, and it has a /usr partition, an ESP partition, dm-verity data as a separate partition, and a signature for the verity data, all as separate partitions in the discoverable disk image, with GPT as the "file system table". And we have yet another mechanism, which is add-ons. An add-on is a UKI-like binary that has a section, and the section contains kernel command-line parameters. This might sound crazy, and I think it is a bit crazy, but the reason is that, since it's UKI-like, it's a PE binary (the Microsoft executable format), and that means we can feed it to shim, and shim will tell us, using the Secure Boot infrastructure, whether it has a valid signature. So we are putting a text file inside a binary that is signed by some key, and this way we can verify the whole thing. Returning to the list from before: we have the variants, we have credentials, we have extensions that are checked by the kernel keyring, which ultimately also means the Secure Boot infrastructure, and we have the add-ons. Different options, and the addition of those options is what I think makes the whole project viable, because we will have ways to deal with the lack of flexibility in the initrd itself. The stuff I talked about before was more on the systemd side; on the mkosi-initrd side, work has been done to integrate with kernel-install. kernel-install gained a few plugins for different things related to UKIs and mkosi-initrd. We have a plugin to invoke mkosi-initrd to build an initrd when you call kernel-install, and we have a plugin, which for some reason I forgot to list here, to call ukify to generate an image. ukify gained support for config files, so if you put the right config file in the right place, this operation will create a signed image.
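The ukify config file mentioned above can be sketched like this, assuming the default location /etc/kernel/uki.conf; the key and certificate paths are placeholders, not real defaults:

```ini
# /etc/kernel/uki.conf — sketch; key/cert paths are placeholders
[UKI]
Cmdline=rhgb quiet
SecureBootPrivateKey=/etc/kernel/secure-boot.key.pem
SecureBootCertificate=/etc/kernel/secure-boot.cert.pem
```

With a signing key and certificate configured here, the UKI that ukify assembles from the kernel and initrd comes out Secure Boot-signed without extra steps.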
Then we have a plugin to copy the unified kernel image into the right place on the boot partition after everything is ready. This stuff is opt-in, so you need to specify some config, very simple config, in various places. Why? Because it's kind of experimental and we don't want to break systems yet. Also, bootctl kernel-identify is a helper that will tell you what the type of a kernel is, and bootctl kernel-inspect will print details about unified kernel images and so on. So, to wrap this up: in some ways, not much has happened in the mkosi-initrd project over the last few months, because there isn't really that much to do on the mkosi-initrd side; the work has been happening in other places. And I think we are getting closer to the point where it becomes useful for real users. In principle we have a change proposal for Fedora 39 to make mkosi-initrds available; I'm pretty sure that will not happen, but maybe for Fedora 40. There's some outlook on how to build such images in Koji. GRUB2 might get support for unified kernel images: patches are out there and being reviewed. There has been progress on making kernel modules easier to install: it's still a single package with a big set of modules, but at least it doesn't contain the kernel anymore, and this was done because of the introduction of unified kernel images for cloud VMs. I mentioned that mkosi works unprivileged, and there has been some work on integration. And links, and, well, questions, comments. [Question about how the initrd is created.] So, the initrd is a cpio archive, and we can actually build it without any privileges. The stuff I mentioned about running without privileges is about building a DDI: a system extension, for example, would be a DDI, and then we need a file system. The initrds don't have a file system, and they are built unprivileged without any issue.
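The opt-in configuration mentioned above would go into kernel-install's config file; a hypothetical /etc/kernel/install.conf, assuming the mkosi-initrd and ukify plugins are installed (layout= and initrd_generator= are standard kernel-install settings, but treat the exact generator names and the uki_generator= key as assumptions):

```ini
# /etc/kernel/install.conf — sketch, opt in to UKI generation
layout=uki
initrd_generator=mkosi-initrd
uki_generator=ukify
```

With this in place, a plain "kernel-install add" would build the initrd with mkosi-initrd, wrap it into a signed UKI via ukify, and copy the result to the boot partition.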
Building initrds locally using this mechanism is supposed to be a temporary step, in the sense that yes, we do it, and we also need it for development, and we will always need it for development, but for end users we want to build the initrds once and deliver them as package content. So basically, alongside the kernel package you have an initrd package, and you just get the initrd from there. Or, even better, you have a kernel package that contains a unified kernel image with the initrd already embedded. [Question about module selection.] I have to admit that was my idea in the beginning, but maybe it's actually not that useful. The fact that the modules have been split out from the kernel binary is nice, because they are both big and we don't want the kernel binary there at all. But the selection of specific modules could be done this way, or it could be done using some filtering mechanism; I'm leaning towards the second option now, but either way. Yes, that would be one option, yes. Okay, I forgot to repeat the question. The question was: if we have some special hardware or file system, and we want to deliver the code to open up the file system in the initrd as a system extension, how does that work? The way extensions work is that very early during boot, systemd starts a service, the service locates any extensions that are present and properly signed, and at some point they just appear in the file system. This happens in early boot. So if you have this extension, it would essentially be overlaid, and then the code inside it could be used at a later point during boot to mount the root file system or whatever storage. And there's a mechanism to match extensions to the running system or the image. Okay, so the question was: if we have multiple kernel versions, how do we match extensions to the right kernel version?
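As a sketch of what such a storage-driver system extension could contain (all names here are illustrative): the tree below becomes the content of the sysext DDI, and the extension-release file is what systemd checks, against the host's os-release, to decide whether the extension matches and may be overlaid.

```
driver-ext/                                   # tree packed into the sysext DDI
└── usr/
    ├── lib/
    │   ├── modules/<kver>/extra/mydrv.ko     # hypothetical storage driver
    │   └── extension-release.d/
    │       └── extension-release.driver-ext  # contains e.g. ID=fedora
    │                                         #              VERSION_ID=39
    └── bin/
        └── mydrv-setup                       # hypothetical helper tool
```

Because everything lives under /usr, the overlay leaves the signed base initrd untouched; the extension itself carries its own verity signature, as described above.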
And the answer is that for each kernel we have a specific place for extensions for that kernel, and we also have other places for extensions that apply to all kernels. So there is a mechanism to do the matching. [Question.] So the question was whether the distro has to care about the installation of the kernel, and whether it will be installed by the RPM packages, yes? I'm not sure; essentially it may mean copying a file a second time, so not a big issue, but... So, I'm out of time, so let me do a demo. Let me show the config for mkosi-initrd. Sorry, the mkosi config. This is my config file: it lists packages, and it says that the output format is cpio. Now we call mkosi, and since this is a demo, I'm doing it without network. Oh, it failed because the output exists already, so let me redo it with -f. We do the package installation, we do some setup, and then we remove some files so that the initrd is smaller. Here, these packages are needed for installation and then they get dropped. And we have an initrd somewhere here. It also comes with a manifest and a changelog so that we know what is inside, hashes of stuff. Okay, thank you.