Hi, okay, can you hear me all right? My name is Yuval Turgeman. I work for Red Hat, I'm part of the oVirt team, and today I'd like to talk a little bit about the reincarnation of oVirt Node, which is probably going to be a layered product of Fedora CoreOS.

I'm going to give a little background about what oVirt is, which probably everybody knows, and more specifically what oVirt Node is. Then we'll go through the evolution of oVirt Node: where we were three years ago, what we're doing today, and where we plan to go in the future, which is probably a CoreOS-based node. I'm going to explain how we built the CoreOS node with the standard CoreOS tools for building this image, and then discuss some of the challenges we had and how we solved them, and some open issues that we still have with the node itself, the OSTree, and all that.

So, what is oVirt? oVirt is the leading open source virtualization platform, and more specifically oVirt Node is the hypervisor that runs the VMs. An oVirt host is a hypervisor; you can just install the oVirt bits on any host, basically. But oVirt Node is something a little special: it's an image-based operating system. It has had a few different layouts from the past until today, but basically it's a minimal operating system with the oVirt bits installed on the image, not requiring anything external.

Like I said, there are different layouts. There was the legacy node, which we don't use anymore, but it was in oVirt 3.x. The next generation node is oVirt 4. And where we want to go next, what we're looking at, is the CoreOS-based node; I don't know which version that will be.

So, legacy node. I'm going to go really briefly over this. Basically it was a live CD: an ISO mounted read-only, with a live read-write layer and some persistent paths.
Some of the benefits first: it was image-based, right? So whatever you tested is what the customer or user got, and it was really kind of immutable. Some of the drawbacks were around the live read-write mount: you reboot the system and everything disappears. I see some people laughing here, but it's not something that you would expect from an installed system.

Okay, so like I said, it's image-based, it was immutable. Some other drawbacks were the maintenance: you know, having to create a custom installer and to whitelist files for persistence. And like I said, the live read-write behavior is not expected from a system that you install on your hard drive.

It looked something like this, if you can see it. There is the live read-write layer, and some persistent paths there: /var/log, the logging data, and the configuration stuff. Like I said, we're not using it anymore, so it's not really that relevant, but we're talking about the evolution of where we are right now.

Okay, so the next generation node, NG node, is what we're using today, and it's based on LVM. What we do is compose a squashfs using Lorax. We give it a kickstart, and we compose a squashfs with a minimal installation of Fedora, CentOS, or RHEL, with all the oVirt bits installed. We deliver it via yum or dnf, so when the user does a dnf update, it gets the entire image. Then the post-installation script takes the squashfs, extracts it, creates a new LV, and extracts it to that LV. The LV is read-only, and then we create a snapshot of it which is read-write, and this is what we have mounted as our root directory. We also have some state: /var is not something that is on the image itself; it's common to all deployments of the node.
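The compose-and-deliver flow just described can be sketched roughly like this. The kickstart name, flags, and packaging step are illustrative assumptions on my part, not the actual oVirt build scripts:

```shell
# Hypothetical sketch of the Node NG compose (names and flags are illustrative).
# livemedia-creator ships with Lorax; the kickstart describes a minimal
# Fedora/CentOS/RHEL install plus the oVirt packages.
livemedia-creator \
    --make-fsimage \
    --ks ovirt-node-ng-image.ks \
    --resultdir ./build \
    --project "oVirt Node NG" \
    --releasever 4

# Pack the result into a squashfs for delivery (simplified step):
mksquashfs ./build/fsimage.img ovirt-node-ng-image.squashfs

# The squashfs is then wrapped in an ordinary RPM and published in a yum/dnf
# repo, so "dnf update" on an installed node pulls the whole image and
# triggers the post-install logic described above.
```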
Eventually a new boot entry is created, and you can switch between two versions of running nodes. So let's say we booted the new upgrade and it didn't work: we can just roll back easily to the previous one.

The benefits are almost the same as before. It's image-based, so we know exactly what we're running, because it's all tested and nothing is added on top of it. We have A/B updates, so we can switch between different versions of the node. And like someone said, it just works, right?

The drawback is that making it just work is not really that easy; it can sometimes be very complicated. For example, some of the issues we have are with new z-streams or package changes, and with migrating /etc between the layers. Sometimes things in /etc change, there are symlinks and stuff like that, and we need to handle these specific cases with very delicate care, let's say.

Another thing is that we can't just execute the post-install scripts the way we would like. It's basically an upgrade, right? So we would like to run those post-installation scripts in upgrade mode, but we really can't, because we are creating a new layer and installing into it like a fresh installation, but RPM doesn't know that. So it's tricky, and there is a lot of maintenance that needs to be done.

Another issue we have with Node NG is that since we are using yum/dnf to deliver this squashfs, after we install it to the new layer we need to do some nasty hacks so that we don't install the image over and over again. What we do is take the image and just hack the RPM database: we register the package in the new layer's RPM DB, so that the next time you boot into the new layer and do a dnf update, it won't fetch the same update again.
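A rough sketch of that post-install dance, assuming a thin-pool volume group named `onn`; the LV names and steps are my simplification of what imgbased does, not its actual code:

```shell
# Illustrative sketch of the Node NG layering (not the exact imgbased logic).
NEW=ovirt-node-ng-4.3.1

# 1. Extract the squashfs delivered via dnf into a new thin LV.
lvcreate --thin --virtualsize 6G --name "$NEW" onn/pool0
mkfs.ext4 "/dev/onn/$NEW"
mount "/dev/onn/$NEW" /mnt/newlayer
unsquashfs -f -d /mnt/newlayer ovirt-node-ng-image.squashfs
umount /mnt/newlayer
lvchange --permission r "/dev/onn/$NEW"        # base layer stays read-only

# 2. Snapshot it read-write; the "+1" snapshot becomes the new root.
lvcreate --snapshot --name "${NEW}+1" "onn/$NEW"

# 3. The nasty part: register the image RPM in the new layer's RPM database
#    only (--justdb), so that "dnf update" after booting the new layer does
#    not try to fetch the very same image again.
mount "/dev/onn/${NEW}+1" /mnt/newlayer
rpm --root /mnt/newlayer -i --justdb ovirt-node-ng-image-update.rpm

# 4. Finally a boot entry is added for the snapshot; the previous layer is
#    kept, so rolling back is just booting the old entry.
```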
So those are the issues we have with it. It's working; I mentioned the image-based benefits and all that. All this work inside the post-install for Node NG is driven by a package called imgbased, which is a package that we maintain.

The layout of the system looks something like this. I have just one layer here, but as you can see this is minimal partitioning, right? It doesn't have all the NIST partitions, so it requires approximately 15 GB of disk space. If you look at the imgbased layout, the first LV is the read-only layer, or volume, and the one with the "+1" is the read-write snapshot, which is mounted on /. Then we have /var and /boot, which are common to all layers, or to all images.

So we were thinking about where we want to go with this, and of course Red Hat acquired CoreOS, so we thought: well, CoreOS, and actually Atomic as well, were trying to solve the same problems. They are image-based: creating the image on the server side and then delivering it to the user. So it's basically the same, and CoreOS is cool, so why not. rpm-ostree, which was kind of taken from Atomic, is a much more modern solution than what we are doing with imgbased, and it's much more generic. Over the years imgbased became tightly coupled to oVirt: for example, if there is something special we need to do while upgrading, like stopping VDSM, and VDSM is the basic component of oVirt Node, then that is integrated inside imgbased, which is not very nice, but that's the only solution we could come up with.
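For comparison, on an rpm-ostree based host the A/B behavior that imgbased implements by hand comes built in. These are standard rpm-ostree commands, not oVirt-specific tooling:

```shell
# Day-2 operations on an rpm-ostree host: staged A/B updates are the
# native model, no custom layering code needed.
rpm-ostree status        # show the booted and any pending deployments
rpm-ostree upgrade       # stage the new ostree commit as the next boot entry
systemctl reboot

# If the new deployment misbehaves, switch back to the previous one:
rpm-ostree rollback
systemctl reboot
```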
Another reason for using CoreOS is the large community, you know: CoreOS, Silverblue, and Atomic.

So I want to talk a little bit about how we build and install the oVirt Node CoreOS. Basically, we wanted to use the standard tooling from CoreOS. We could take rpm-ostree and compose something on the side, but that's not the right way to do it. There is a project called CoreOS Assembler, or COSA as they call it. It's a container: you pull it from Quay, and you give it a configuration repository. The configuration repository holds all the manifests and all the repo files that are needed to compose the OSTree commit. What we do is take the ovirt-release RPM, which holds all the repo files for oVirt, extract the repo files, and install them right into the configuration directory. Together with the manifest files and some other configuration from the base fedora-coreos-config, we compose our own rpm-ostree image: we produce the ostree commit, and with the bare-metal installer we install it just like we would install a normal Fedora CoreOS.

I have a screenshot here: this is what the installation ISO looks like when you boot it. We did some rebranding for oVirt. You give it the install device, the image URL for the metal image, and the Ignition URL. It's just the same as CoreOS; it's a normal CoreOS. Eventually we would add the host to oVirt, and it would look something like this. It's a little hard to see, but right there at the bottom it says "CoreOS preview". It's an old version, but this is the picture I had. And this is another screenshot after updating: when we compose the commit itself, we create our own commit reference, and this is what it looks like.

So, some of the challenges we had to face. We had some issues with kdump: since CoreOS installs with a special boot directory, kdump couldn't
find it. We sent patches to kdump to help them support rpm-ostree, sorry, CoreOS, correctly.

Then, managing the users and groups: when we clone the fedora-coreos-config, there are group and passwd files that you need to maintain on the server side before you compose, and that didn't really work that well for us. What we ended up doing is putting in an overlay file, which gets installed on the image itself, for systemd sysusers, and that configures all the users and groups that we need for the installation.

Some other issues we had were with Podman and Docker. I mean, everything was working great with Podman, until we moved it to our CI and everything broke, because we had to switch to Docker. But we solved that eventually.

And Ignition versioning. The issue with Ignition, well, the issue with CoreOS in general, is that they're moving very fast; they're closely aligned with the Fedora cadence. So if we are working on oVirt for, say, Fedora 30, they're already on Fedora 31, and trying to keep the versioning of Ignition, for example, aligned is kind of problematic. Another issue was RPM scriptlets; I mean, it's a mess, right, with permissions and so on. But we solved all of these.

Some of the open issues that we still have: non-standard paths. Because we can't just install into any path we like, we have to rebuild our VDSM, which has a non-standard path in it, and we need to create a hook for QEMU so we would be able to migrate VMs across different types of hosts. Another issue is kernel arguments: oVirt uses grubby, but we need to use rpm-ostree, because rpm-ostree controls the kernel arguments. And last but not least, and this is a big one (oh, I'm sorry, one slide back): the Ansible module for rpm-ostree. We are using Ansible to add our hosts, and Ansible is not working nicely with rpm-ostree at the moment. We are collaborating with the CoreOS team to resolve this issue.

That's about it. If you have any questions?

[Q&A] I tested it with shared storage. I don't
really know yet what the future holds. And no, it's not GA. It's working, right? I mean, we build it, but we don't build it nightly. It's very hard to follow; like I said, the CoreOS team is moving really fast and it's hard to keep up.

What state is it in? It's kind of, you know, patches are welcome.

No, we didn't look at that yet. One thing that I forgot to mention is that one of the reasons for doing this is that now, with the rise of KubeVirt and running VMs inside pods, people might want to move between oVirt and KubeVirt, and also to move from an oVirt Node CoreOS to a normal CoreOS, for running containers, or running VMs inside containers. That's kind of one of the motivations for this. Migrating from oVirt Node NG to Node CoreOS is something that I didn't look at yet. Good question.

Yes, we had a discussion about this with the CoreOS team. I'm hoping that we will; we just had a discussion yesterday with Mikhail. We were missing some stuff, like Ignition in CentOS, right?

Sorry, I didn't get the question. So, yes, it's kind of like... I mean, in large deployments you would use Kickstart anyway, right? So, something like that. Is that the question? Right, I mean, I wasn't following. Okay, yes, exactly. I would love to see the same mechanism here.

Okay, anything else? Nobody is asking about the NIST partitioning? Just kidding. Okay, thank you.