Hi everyone. Good to see you all still here at 4 o'clock, which I think is a great achievement after listening since the morning. Hopefully I won't bore you with my talk; I'll try to make it more lively with a demo so that you have some interaction, and feel free to ask questions. A little bit about me: I am Sini, I work on the Machine Config Operator project, which is part of OpenShift, and I'm at Red Hat. Today I will be talking about OpenShift OS customization as a bootable container. You might be wondering what exactly this is, but you have probably guessed a little bit what the OpenShift OS is.
So here we'll be talking about where exactly OpenShift 4 runs, because you need an OS. How many of you know what OS we run on OpenShift 4? Awesome. Can you tell? Yes, CoreOS. It's formally called RHEL CoreOS: RHEL because it is derived from RHEL, and CoreOS because it is based on the CoreOS technology, which internally uses rpm-ostree. We have upstream versions of this kind of operating system as Silverblue and Fedora CoreOS, probably many more in the future, and Kinoite as well, which is the KDE edition.
For this talk, our interest is RHEL CoreOS, which runs as part of OpenShift 4. It is image-based because it's based on the rpm-ostree technology. When we made OpenShift 4, it was opinionated: we didn't want users to SSH into a machine and make config changes by hand. We wanted to make it secure, so that everything can be updated automatically, including the operating system itself. That's why the message was: don't SSH in and make manual changes, use something else, and that's where the MCO comes into the picture.
What is the MCO? It is a core operator that runs as part of OpenShift. That's where you make all the config changes that you want on the OS, and it also takes care of your RHEL CoreOS updates automatically. A little bit more about the MCO: it is a core operator, as we already said, and it helps in performing the OS update. When you click upgrade in the console, or when you say oc adm upgrade, everything upgrades along with that, and the MCO takes care of upgrading the operating system itself from whatever version it was on to the next version, along with any other changes that should go to the OS. And where exactly does it get the content of the OS from? As you may know, OpenShift ships everything as container images, and similarly RHEL CoreOS is shipped in the form of a container image. There are also some extensions which are not part of core RHEL CoreOS itself but which you can install, and they all come together in the release payload that is part of the OpenShift release for any particular version.
So why are we having this talk? This talk is about the limitations. With all this design, everything works well: you can configure your machines with the changes you want on the node, and updates work well. It works for most people, but for some it doesn't. First, what if I want a custom agent or third-party software which is not part of the base CoreOS? How exactly can I get it? That's not something we support today. It's very difficult, and that's why some enablement cannot be done. The next is additional RHEL packages. RHEL has a lot of packages, but not everything is part of the core operating system.
For example, USBGuard or Libreswan are not part of the base operating system. If you want them on the node, the idea is definitely to containerize things, build a container and use that, but not everything can be containerized, and that's why people sometimes want it installed directly in the base OS. Getting an additional package into RHEL CoreOS is a long process. For example, we have USBGuard, and that's where extensions come into the picture. To get there we have a lot of conversations about whether it should be part of the base OS, because every additional package increases the size of the OS. So we have to be mindful and careful, and we have to test it and support it. It's a long process, it isn't guaranteed that a package gets in, and the use case has to be very reasonable.
The third one is performing hotfixes. For example, there is a new kernel with security fixes and you want to apply it on your cluster. You cannot really get it until Red Hat ships a new update: there has to be a new release, and you have to go through that before the update happens. So all of this takes time. These things do happen, but they take a while. You also need to know what a machine config is; that's what the MCO is based on.
So these are the things that led to what we wanted to do, which is called CoreOS layering. You have probably heard about CoreOS layering in other talks at DevConf before. Here we are focusing on OCP CoreOS layering: how we leverage CoreOS layering in OpenShift. OCP CoreOS layering is based on the layering technology where the whole RHCOS system has its root file system in a standard OCI container image.
You might be wondering what exactly a standard OCI container image is. Earlier I said that we were already shipping CoreOS in a container image, but there is a difference between this standard OCI container image and what we used to ship before. I will get to that a little later. With the OCI standard, we can use any standard tooling for managing this image. That's where the bootable container comes into the picture. With the OCI container image that we ship now, you can use any available container tooling to build and layer anything on top of it. It also works as the delivery and transport mechanism for updates: you say oc adm upgrade, there is a new image that we ship, and everything is taken care of automatically. Your system will be updated to the latest version of whatever is in the payload. This also takes care of the image content source policy. As you probably know from OpenShift, some people use mirroring, and they have all of that defined in registries.conf. All of that is taken care of and factored into this new format too.
Here is the comparison between how the OS image content looked before and now. When I say before, I mean from when we started OpenShift 4 until 4.12; 4.12 is where we implemented CoreOS layering in OpenShift. The image that we shipped in the release payload for the OS used to be called machine-os-content, and you cannot run it. I can show you machine-os-content. This is the image that we have for 4.13. If I say podman run... I'm not sure if it's the same one, so I will just cancel it. It's not really executable. You cannot run it or do any additional activities with it. It's not much fun.
How we used to do the OS update before was to extract that image. If you look, these are the contents of that image, and there is /srv/repo. This is where the OSTree content lives. We extracted the content and did an rpm-ostree rebase, which took the information from there and performed the update. So it was not really a nice, fully integrated update system. With the new format, we have this rhel-coreos image, available from 4.12 onward. Here you will see that it is very native to an OCI container image. We actually have a kernel here in /usr/lib/modules; we can check it later. You can do lots of things with it.
So how exactly does it work? Since this is a standard OCI container image, you can create a Containerfile, or Dockerfile, however you call it, and layer additional packages, additional files, or anything else you want. There is no real limitation here. For example, here is a use case for the hotfix I was talking about. Suppose a new security update has come out and a new build is not available yet, but you've got the fix, or it's still in CentOS Stream, which comes before it lands in RHEL. You can try it out and test it. Here, this is the FROM line, and this is the SHA of the image. This command is what we need for the kernel override. Usually we don't need it; we just do an rpm-ostree install. Here we override because we want to hotfix from CentOS Stream. So this is a simple Containerfile, nothing fancy, which is good. We don't need fancy everywhere.
For the demo, I will show what we were just looking at. I have a cluster, and it is running on GCP.
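As a rough illustration of the Containerfile described in the talk, the Python sketch below only renders the text of such a file; it does not build anything. The base image reference and the kernel RPM URL are placeholders (the real values were on the slide and are not reproduced in the transcript), and installing iotop by name assumes that a repository providing it is configured in the build. The function name render_containerfile is made up for this example.

```python
from typing import List


def render_containerfile(base_image: str,
                         override_rpms: List[str],
                         extra_packages: List[str]) -> str:
    """Return Containerfile text that replaces base packages and layers new ones.

    base_image     -- digest-pinned RHCOS image (placeholder in this sketch)
    override_rpms  -- RPM URLs/paths that replace packages already in the base
    extra_packages -- packages that are not in the base image at all
    """
    lines = [f"FROM {base_image}"]
    steps = []
    if override_rpms:
        # Replace packages shipped in the base OS (e.g. a hotfix kernel).
        steps.append("rpm-ostree override replace " + " ".join(override_rpms))
    if extra_packages:
        # Layer packages that the base OS does not ship.
        steps.append("rpm-ostree install " + " ".join(extra_packages))
    if steps:
        # Clean cached metadata and finalize the layer.
        steps += ["rpm-ostree cleanup -m", "ostree container commit"]
        lines.append("RUN " + " && \\\n    ".join(steps))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print(render_containerfile(
        "quay.io/example/rhel-coreos@sha256:" + "0" * 64,
        ["https://example.invalid/kernel-hotfix.rpm"],
        ["iotop"],
    ))
```

Keeping the override and the install in a single RUN step means the result is committed as one layer on top of the base image.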
I will show how we apply those changes in the layering model. First, let me see the cluster version. So I have a 4.13 cluster. Now I'm fetching this with the oc adm command, can you see? It's oc adm release info, and this is the image name, the rhel-coreos tag. This gives me the image URL that I need to use, which is where your base image starts from. I'm not typing everything here because it would definitely take time, so I have everything ready. This is the FROM line; I just copied it from there, so you can see it's the same. I'm using this as the base image. As we saw earlier in the presentation, this is the rpm-ostree override, which is used for replacing the kernel. I'm also layering another package here, from CentOS Stream, and this is iotop. So we are doing two operations: an override to hotfix something in the existing base OS, and layering a new package. You can do more, but for this demo I'm just doing this.
What you need to do next is build an image using regular container tooling. I will do podman build. It won't take long because I'm using the cached version I have locally. Once you have built it, you need to push it to a registry. It can be anywhere you want; for me, it's my personal quay.io user. Remember, this is where we push the container. Right now I will just go and apply this to my cluster, but in a real use case there is a lot more you can do: you can do the build in CI/CD, run some tests there, and only after that apply it to the cluster. So there is a lot of potential for testing, depending on the use case. And now, how exactly do you apply it to the cluster?
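The three commands from this part of the demo (look up the base image, build, push) can be sketched as a small Python wrapper, which is roughly what a CI/CD job would run. This is an illustration under assumptions: the target image name is a placeholder, the helper names are invented here, and the run helper defaults to a dry run that only returns the command line, so nothing is executed unless you ask for it.

```python
import shlex
import subprocess
from typing import List


def base_image_cmd(tag: str = "rhel-coreos") -> List[str]:
    """Command that prints the OS image pull spec from the cluster's release payload."""
    return ["oc", "adm", "release", "info", f"--image-for={tag}"]


def build_cmd(image: str, containerfile: str = "Containerfile",
              context: str = ".") -> List[str]:
    """Command that builds the custom OS image with regular container tooling."""
    return ["podman", "build", "-t", image, "-f", containerfile, context]


def push_cmd(image: str) -> List[str]:
    """Command that pushes the built image to a registry."""
    return ["podman", "push", image]


def run(cmd: List[str], dry_run: bool = True) -> str:
    """Return the command line (dry run) or execute it and return its stdout."""
    if dry_run:
        return shlex.join(cmd)
    done = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return done.stdout.strip()


if __name__ == "__main__":
    target = "quay.io/example/custom-rhcos:latest"  # hypothetical image name
    for cmd in (base_image_cmd(), build_cmd(target), push_cmd(target)):
        print(run(cmd))
```

In a pipeline, a test stage would naturally sit between the build and the push, as suggested in the talk.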
You have to create a machine config, and this is what it looks like. This is the CRD for MachineConfig, and here we are overriding the OS image URL. This is the URL with the SHA. We don't take a tag, because a tag can change and we don't want to deviate; that's why we use the SHA instead of a tag. You can get the SHA by saying skopeo inspect and... what was the image name? I'm sorry, that was the wrong one, just a second. This was the base image. I will say skopeo inspect. Is there something missing? Oh, yeah, you're right, absolutely. Thank you. We need to define this. Usually it will show up; it takes some time because of the internet. We'll have the SHA information in this output. Anyway, I was actually looking in the wrong place; I should look at the quay.io image, not the other one. Yes. We don't need to inspect the original image; we need to inspect the image that we created, and we need the SHA for that. And this was the format. I'm just showing how we need to use it. So this is how we generate the OS image URL.
Now I will apply it to the cluster. To apply it, you just say oc create. You may or may not know this, but this is applied through the MCO, the Machine Config Operator. It monitors the change, and a new config gets rendered. So this is the OS image URL override, generated six seconds ago, and this is the new rendered config. It was applied just for worker. As you've seen, this is applied to the role, so to the nodes that are in the worker pool. It won't apply to other pools; it's pool-specific. And one interesting thing, sorry, yes: here, this node shows SchedulingDisabled.
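The machine config from the demo can be approximated with the following Python sketch, which builds the manifest as a dictionary and prints it as JSON. The manifest name, repository, and digest are placeholders, and the helper names are invented for this example. Rejecting tag references locally is a choice made in this sketch to mirror the "SHA, not tag" advice; it is not a claim about how the MCO itself validates input.

```python
import json
import re

# A pull spec pinned by digest: <repository>@sha256:<64 hex characters>
_DIGEST_REF = re.compile(r"^[^\s@]+@sha256:[0-9a-f]{64}$")


def pin_by_digest(repository: str, skopeo_inspect_json: str) -> str:
    """Combine a repository with the "Digest" field of `skopeo inspect` output."""
    digest = json.loads(skopeo_inspect_json)["Digest"]
    return f"{repository}@{digest}"


def os_image_override(name: str, image_ref: str, role: str = "worker") -> dict:
    """Build a MachineConfig that overrides the OS image for one pool."""
    if not _DIGEST_REF.match(image_ref):
        raise ValueError("osImageURL should be pinned by sha256 digest, not a tag")
    return {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "kind": "MachineConfig",
        "metadata": {
            "name": name,
            # The role label decides which pool (here: worker) picks this up.
            "labels": {"machineconfiguration.openshift.io/role": role},
        },
        "spec": {"osImageURL": image_ref},
    }


if __name__ == "__main__":
    # Placeholder for the JSON printed by: skopeo inspect docker://<your image>
    inspected = json.dumps({"Digest": "sha256:" + "a" * 64})
    ref = pin_by_digest("quay.io/example/custom-rhcos", inspected)
    print(json.dumps(os_image_override("os-image-url-override", ref), indent=2))
```

Because JSON is accepted wherever a YAML manifest is, the printed output can be saved to a file and applied with oc create -f.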
The node shows SchedulingDisabled because, when we do this, we cordon the node and the drain operation starts, and then we apply the changes. The interesting thing I wanted to show is oc get node -o wide, which looks very weird because we have to increase the window size. But the interesting thing is the kernel version. Where is it? Here: we have kernel version 5.14.0-284. We just created a new image with the CentOS Stream kernel, and that was, let's check the Containerfile, 5.14.0-325. So once the node is updated, we should see version 5.14.0-325. It will take some time. Let me check oc get pods -o wide. This shows me all the pods that are running in the MCO namespace, and I am interested in this one, because this is where the update is happening. By default, the MCO applies the change to one node at a time. This is for safety. So let's give it some time.
Meanwhile, let's go and inspect the images that we have. cat Containerfile: this is the base image. I will run podman run -it --rm. Here we can run rpm -qa with kernel and iotop. So it's 5.14, and there is no iotop installed, because this is the base version. If I remove... I think it will be here. Yes, you can see there's no iotop in my base image, which comes from base RHCOS. Now I will check the custom image: podman run -it, rpm -q kernel iotop. You can see that here we have the newer kernel that we got from CentOS Stream and built into the image, and iotop. So we can go inside the container that we created from the standard OCI container image that we have now, and that's how it works.
Let me see if the update went through. You can see this node now actually has 5.14.0-325. So the custom image that we created got applied to this node, and the same will happen on the rest of the nodes. So this was our demo.
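The manual check at the end of the demo (watching the kernel column until every node shows the new version) could be scripted along these lines. This is a sketch under assumptions: it reads the JSON printed by oc get nodes -o json, takes the kernel release from status.nodeInfo.kernelVersion, and compares only the leading numeric fields. The node names and the full kernel strings in the sample are invented; only the 5.14.0-284 and 5.14.0-325 prefixes come from the talk.

```python
import json
import re
from typing import Dict, List, Tuple

# Leading numeric part of a kernel release, e.g. "5.14.0-325" in "5.14.0-325.el9.x86_64"
_KERNEL_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+)")


def kernel_key(release: str) -> Tuple[int, ...]:
    """Turn a kernel release string into a tuple of integers for comparison."""
    match = _KERNEL_RE.match(release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def kernels_by_node(nodes_json: str) -> Dict[str, str]:
    """Map node name to kernel release from `oc get nodes -o json` output."""
    items = json.loads(nodes_json)["items"]
    return {n["metadata"]["name"]: n["status"]["nodeInfo"]["kernelVersion"]
            for n in items}


def nodes_pending(nodes_json: str, expected: str) -> List[str]:
    """Names of nodes whose kernel is still older than the expected release."""
    want = kernel_key(expected)
    return sorted(name for name, kernel in kernels_by_node(nodes_json).items()
                  if kernel_key(kernel) < want)


if __name__ == "__main__":
    # Invented sample: one node already updated, one still on the base kernel.
    sample = json.dumps({"items": [
        {"metadata": {"name": "worker-a"},
         "status": {"nodeInfo": {"kernelVersion": "5.14.0-284.el9.x86_64"}}},
        {"metadata": {"name": "worker-b"},
         "status": {"nodeInfo": {"kernelVersion": "5.14.0-325.el9.x86_64"}}},
    ]})
    print(nodes_pending(sample, "5.14.0-325"))
```

Since the MCO updates one node at a time by default, the list of pending nodes would be expected to shrink gradually while the rollout progresses.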
So where exactly are we with this new layering model? As I said earlier, in OpenShift 4.12 we have this new image format for RHCOS, and the MCO knows how to understand and apply it. In 4.12 we support the hotfix model, where you can apply a hotfix obtained through the customer portal and so on. Off-cluster build went GA in 4.13. What is off-cluster build? It is what we just saw: you create a Dockerfile, make the changes you want, and take control of the OS update for the whole cluster. Whenever a new OpenShift update happens, the OS will not be updated with it; it is updated when the admin creates a new OS image with the new changes. So the MCO won't make any OS update while there is an override. You can go back by deleting the machine config that we created. Things then reset to the default mode, and from then on the MCO takes care of applying OS updates again.
From all this, the things to remember are: with this new model of off-cluster build and using the bootable container for OS updates, machine config is not deprecated. It will continue working the way it does. With off-cluster build, the admin takes control and is responsible for doing OS updates by creating their own image whenever a new update happens or whenever they want. For accessing RHEL packages, when you pull in a package that is part of RHEL, you need to build on an entitled host so that you can fetch that content from RHEL. And the RT kernel and extensions are not yet supported, because they conflict with the way applying the RT kernel or extensions is designed in OpenShift.
They conflict with the current model, so we need to work on that. So what's next? Next is in-cluster build. Off-cluster build is there, but not everyone wants to take control of building images themselves and maintaining the OS update. So we are working on in-cluster build, where the user defines what they want and the MCO creates a new build, puts it in the registry, and applies it. Everything will be a batteries-included kind of model, as we say. We also want better integration with the console so that there is a good user experience. Right now you have to do it yourself; hopefully with oc integration it will be a little better. We also want feedback. This is very new: it came in 4.12 and we are now at 4.13. So we are looking for more feedback from users who try it out, so that we know where exactly we need to improve. And also, yes, the RT kernel support that is missing: we really want to make it work in a supported way, because a lot of people want to use the RT kernel. This is something we'd like to support and give a better way forward for.
So this was my talk. These are some of the resources that we used. Some upstream links are there as well, which you can try out if you don't have an OCP cluster. And yes, thanks. A shout-out to Colin Walters, because we were going to give the talk together, but he couldn't come. So thanks to Colin as well for helping me with this presentation. And thank you all. If you have any questions, feel free to ask.
Yes? The question is, can I use other instructions, ADD or anything else that's supported by the Containerfile? Yes, you can do that.
The next question is, if I have multiple machine configs with an OS image URL override, will they conflict? What will happen?
The way the MCO works is that there is a particular ordering. If you have multiple machine configs, one of them will supersede the others: the latest will be taken. It's alphanumeric, like the way systemd handles name ordering; we use a similar pattern. During the rendering process, it will pick the latest one. The latest one in the sense of the machine config's name, so the name matters. Whichever name sorts last, the way the MCO handles it, is the one that will be picked up. No problem. Any more questions?
Sorry, I couldn't hear the question. Can someone repeat it? How does ostree handle the container image update? That's the question, and it's a very good one. I think I am not the right person, because the CoreOS team works on that, so it would be a good question for some of the people who worked on it. Internally, it's a new way of using it, so there are definitely technical details that I'm not aware of. That was how we did it before 4.12. After 4.12, with the native container image, rpm-ostree handles it itself. During an update we just say rpm-ostree update with the URL of the image in the registry, and ostree takes care of the update behind the scenes.
So how do you get the new base OS image for a custom OS image? That's the question. The way we handle it is with the oc adm command, oc adm release info. Here I'm just using the existing cluster, but there is also a way to provide the actual URL of the release that the OpenShift update comes from. And then I can just say the rhel-coreos tag. You have all the image information there for a particular release. Suppose you want to go from 4.12 to 4.13: you use the release payload that's available for 4.13, give it to oc adm, and you get the URL for the OS image to use. Does this answer your question? We have three more minutes.
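The name-ordering answer given above (when several machine configs override the OS image, the one whose name sorts last wins) can be illustrated with a few lines of Python. This is only an approximation of the idea: it uses plain string sorting and a simplified dictionary shape with name and osImageURL keys, both of which are assumptions of this sketch, and all names and image references are invented.

```python
from typing import Dict, List, Optional


def effective_os_image(machine_configs: List[Dict[str, Optional[str]]],
                       default_image: str) -> str:
    """Pick the OS image after merging machine configs in name order.

    Configs are visited in sorted name order, and each one that sets
    osImageURL replaces the previous value, so the last name wins.
    """
    chosen = default_image
    for mc in sorted(machine_configs, key=lambda mc: mc["name"]):
        url = mc.get("osImageURL")
        if url:
            chosen = url
    return chosen


if __name__ == "__main__":
    configs = [
        {"name": "99-worker-custom-b", "osImageURL": "quay.io/example/os@sha256:bbb"},
        {"name": "99-worker-custom-a", "osImageURL": "quay.io/example/os@sha256:aaa"},
        {"name": "99-worker-ssh", "osImageURL": None},  # does not touch the OS image
    ]
    print(effective_os_image(configs, "quay.io/example/default@sha256:000"))
```

With no overriding machine config left, the function falls back to the default image, which matches the behavior described in the talk when the override is deleted.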
That's a very good question. We did hit some issues, but yes, we do handle it. It comes up during scale-up: clusters that were born on 4.1 or 4.2 are very old, and we still do not update the boot image. So new nodes boot up from 4.1 and then we update them. That's how it works. So it works, but there is a workaround involved, because the old image doesn't have the new rpm-ostree capabilities. There is an additional systemd-run invocation that we do, and then we apply the update to the new OS image. So it works, but there is a workaround there. All the code is in the MCO, so I can point you to it later if you are interested in how it works. Any more questions? Seems not. So thank you all for coming.