Hi, welcome to our presentation. Today we want to introduce Yaook to you, a tool to fearlessly automate the installation and operation of an OpenStack cluster. First we want to show you what Yaook actually is, why we think we needed to build it, and how it works. After that we will show you an example. But first we want to introduce ourselves.

Hi, I'm Robert. I work for STACKIT, which is part of the Schwarz Group. You may know the Schwarz Group, if you are from Europe or the U.S., through its retail companies Lidl and Kaufland. We are part of the group's IT; there are many other companies in the group, but basically we are the IT provider for internal and external customers, for the Schwarz Group and others. Currently we run multiple OpenStack clusters with more than 700 hypervisors, about 40,000 physical CPU cores, and 12,000 VMs.

And I'm Stefan Hoffmann from Cloud and Heat Technologies. Our vision is to build a sustainable and sovereign cloud stack that is not only operated by Cloud and Heat itself; we also want to enable others to use it. The sustainable part we want to achieve with hot water cooling, which we support in our software and hardware stack. We reuse the waste heat, for example by integrating it into buildings. All of that needs to be monitored and operated, and so does the software stack above it; we use Yaook to roll that out. And of course we run OpenStack and, if customers are interested, Kubernetes on top of that.

But enough with the marketing. Let's get to what Yaook actually is. It is a tool to install and also operate OpenStack, and we want to show it to you a bit. The name stands for "yet another OpenStack on Kubernetes", and it contains three parts. Today we will only focus on the operator part, but somehow we also need to install the bare metal nodes, so we have automation for that as well. On top of that we run bare metal Kubernetes, and then we can deploy the Yaook operator. But you can run the Yaook operator on any Kubernetes cluster you want, so you don't need the stack below if you have some other automation for it.

The question is: why use Kubernetes for OpenStack? The idea is not new, we have already seen it a lot at the summit, but OpenStack is really stateful and Kubernetes likes to kill pods, so why do that? Still, you can define your workload quite nicely with Kubernetes: you declare how your infrastructure should look, and then there is some magic that makes it happen. You can use Kubernetes features like replicas: you can say you want not three Keystone APIs but maybe ten, and Kubernetes makes that happen for you. That's pretty nice. It also comes with networking, load balancing, liveness probes, and so on. And, like I said, you can easily test it: in your test environment you run a Kubernetes cluster based on virtual machines and can test everything quite nicely.

But how do we actually install OpenStack? We could use Helm charts, but no. We use another idea from Kubernetes: control loops. We define something and then have a control loop that regularly checks: do we actually have this state? So we define nearly everything in Yaook with custom resources, and the Yaook operator watches these custom resources and runs the control loop: do we have a user, for example? Do we have a configuration? And then it creates the deployment out of it.
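To make that concrete, here is a minimal sketch of what such a declarative custom resource could look like. This is purely illustrative: the API group, kind, and field names are assumptions, not the verified Yaook schema.

```yaml
# Hypothetical Yaook-style custom resource (field names are illustrative
# assumptions, not the real Yaook API).
apiVersion: yaook.cloud/v1
kind: KeystoneDeployment
metadata:
  name: keystone
  namespace: yaook
spec:
  targetRelease: yoga   # OpenStack release this deployment should track
  api:
    replicas: 10        # scale the Keystone API like any Kubernetes workload
  database:
    replicas: 3
```

The operator's reconcile loop then compares this desired state against what exists in the cluster and creates whatever is missing: the database, the users, the config, and finally the API deployment.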
This is an example of how a Glance deployment looks. You don't need to understand it completely, but as you can see, we define the database and API parts, like the replicas, and we can also define configuration. That is one thing we wanted to achieve: being able to configure every option that OpenStack allows you to configure. We also define the target release, as we support different releases.

The operator then goes through a number of steps to make this actually happen, and that looks like this. I won't go through it today, but it is a dependency graph: Yaook knows the dependencies of a resource. For example, it knows that if I want to have a config, I first need a database, and a database user. Once I have those, it can create the config, and I need the config to generate the API deployment. That is quite nice, because we get a defined order of all the things, and that is basically the core of the reconcile loop. The loop can check whether something in the definition has been updated and then update only the part that actually changed. And it won't continue if one step fails; for example, it won't create the API deployment if the configuration failed.

But what does "configuration failed" mean? We want to allow every option that is possible. For a Glance deployment that's quite easy: you have one Glance config. But for nova-compute nodes you may want different configurations for different nodes, and then you could end up with multiple values for the same option that contradict each other, and that's not good. Yaook catches that: we have a CUE-language-based validation that fails if different values are defined for the same thing. We use oslo.config to see which options actually exist and are supported, and to check that a value is defined as an integer if it needs to be, and not as a string or something like that. We also set some common defaults in Yaook, like the database connection: since we generated the database before, we already know the connection details and the user doesn't need to put them in.
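As a hedged illustration of what that validation catches: two config snippets that both apply to the same node but set the same oslo.config option to different values. The snippet structure below is an assumption for illustration, not the actual Yaook format.

```yaml
# Hypothetical pair of config snippets that would both match one compute
# node. A CUE-style validation rejects this, because the same oslo.config
# option would end up with two different values on that node.
novaConfig:
  - config:
      DEFAULT:
        cpu_allocation_ratio: 4.0
  - nodeSelectors:
      - matchLabels:
          yaook.cloud/hypervisor: "true"
    config:
      DEFAULT:
        cpu_allocation_ratio: 16.0   # conflicts with the 4.0 above
```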
But the really fancy thing that falls out of everything I explained before is day-two operations. Because we know what changed, we can easily update and react to it. I guess all of you have operated an OpenStack deployment at some point and know how hard it is to drain a gateway node; removing 500 routers from an L3 agent is not so easy. Or take nova-compute nodes: you want to live-migrate all VMs away from a hypervisor before you update it. With 10 hypervisors that's fine; with 500 or more it becomes time consuming, so you write some script for that too. Or you have Yaook, because Yaook does that for you: it knows what to check. Robert will show in detail later how we evict the nodes. This way we can easily update the config and roll out config changes or updates of the containers.

So, after you have learned a lot about Yaook in theory, I will show you a small practical example of how the lifecycle management of a compute node works in Yaook, just to make it more tangible how Yaook can benefit you. I will walk you through the whole process with some screenshots from k9s. Maybe you know k9s: it's a terminal tool for operating Kubernetes, with a nice UI.

I don't want to do a live demo, as I'm not that crazy, but I prepared a test cluster where I have already deployed Keystone, Glance, Neutron, and Nova with Yaook. The Nova deployment in that cluster looks something like this. I stripped most of it out and kept only the parts essential for this demo; for example, you don't see the section about the databases, and most of the config and the policies are missing. What you can see is, first, the target release: as Stefan already told you, we support multiple releases, and for some services also automatic upgrades between OpenStack releases. You can also see these two config sections. The green box is a config that applies to all compute nodes; in the Yaook context, a compute node is defined by a label in Kubernetes that looks something like yaook.cloud/hypervisor: "true", which tells all the Yaook operators that this node is a hypervisor. If you don't set a special node selector, this config is rolled out to all of these nodes. I just put a few example options in there, like the default availability zone. In the blue box you see a config that is selected via a node selector and is only applied to nodes that carry the hypervisor-type QEMU label; there I set some extra nova-compute options regarding the virtualization type. If I now label a node with both of these labels, I get the resulting config: the two sections merged together, extended in the next step by some Yaook defaults, OpenStack defaults, and for example the database configuration, and then rendered onto the node.

If I look at my cluster, there are currently five nodes, but I will only look at the first one, because that is my compute-01 and I want to deploy a compute node on it. If you look at it now, it is empty; there are no Yaook pods on it, because I have not yet labeled the node. To do that I can just use kubectl, or edit the node in k9s. I will add the two labels: hypervisor "true" to tell Yaook it's a hypervisor, and hypervisor-type QEMU to pull in my additional config.
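As a hedged sketch, the labeled node could then look like this; the label keys are taken phonetically from the talk and may not match the real Yaook label names exactly.

```yaml
# Roughly equivalent to:
#   kubectl label node compute-01 yaook.cloud/hypervisor=true \
#     yaook.cloud/hypervisor-type=qemu
# (label keys are assumptions based on the talk, not verified names)
apiVersion: v1
kind: Node
metadata:
  name: compute-01
  labels:
    yaook.cloud/hypervisor: "true"       # marks the node as a hypervisor
    yaook.cloud/hypervisor-type: "qemu"  # selects the extra QEMU config
```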
In the next step it is not the Nova operator that goes to work but the Neutron operator, because before you can install nova-compute you need an L2 agent, and that is deployed by the Neutron operator, which also reacts to a label. What you can see here is just another custom resource, this time created not by us but by Yaook itself. So Yaook also creates custom resources, and one operator in turn creates custom resources for the next, to define our whole environment. In it you can see, first, the name of the node where we want to deploy; then some configuration for the SSL encryption, because all internal communication in Yaook is encrypted; then references to other services, so that Yaook can configure them automatically (the keystone section of the config, for example, is taken straight from the already deployed Keystone deployment); then some L2-agent- or OVN-specific config, for example the southbound configuration; and at the end, again, the target release, because we support multiple OpenStack versions and have to pass that through to every service.

Once that is deployed, we see three new pods on our node: of course, to deploy an OVN agent we need ovs-vswitchd, the OVS database server, and the ovn-controller itself. And if you look at the first pod, it has two containers, because the metadata agent is in there as well, so the whole bundle is deployed automatically via Yaook. In the background, all the required Keystone users were already created, as each service of course has its own Keystone user, plus the config objects and so on. So now we have a working L2 agent on the node.

In the next step, after the OVN agent is deployed, the nova-compute operator goes to work and reads the nova-compute node custom resource, which was deployed by the Nova operator. It is quite analogous to the OVN agent we saw before: you have the target node name, the release and so on, and the Keystone reference, but also our config, which was rendered, as I showed you, from the two labels and added to this definition. On the node it ends up looking something like this: there is a new pod, the nova-compute pod, with three containers. The first is nova-compute itself, which just runs nova-compute; then there is a libvirt container for the libvirt functionality, and an SSH container to enable live migration. What is quite fancy: if we kill this pod, the VMs are not affected, because the libvirt process itself runs in the host namespace.
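A hedged sketch of what such a nova-compute pod could look like; the image names and the exact host-namespace settings are illustrative assumptions based on the talk, not the real Yaook manifests.

```yaml
# Hypothetical nova-compute pod as described above (not the real manifest).
apiVersion: v1
kind: Pod
metadata:
  name: nova-compute-compute-01
spec:
  hostNetwork: true    # VM traffic stays out of the Kubernetes overlay
  hostPID: true        # VM processes live in the host namespace, so
                       # deleting this pod does not kill running VMs
  containers:
    - name: nova-compute   # the nova-compute service itself
      image: registry.example.org/yaook/nova-compute:yoga  # placeholder
    - name: libvirt        # libvirtd managing QEMU on the host
      image: registry.example.org/yaook/libvirt:yoga       # placeholder
    - name: ssh            # SSH endpoint used for live migration
      image: registry.example.org/yaook/nova-ssh:yoga      # placeholder
```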
Next, there may be reasons why we want to change configs or recreate a node, and I will show you how that lifecycle looks. Some reasons for recreation: you want to make a config change in the Nova deployment, rolled out to the whole deployment or just to one compute node; or you relabel the node, maybe the label shows the node is now in a new aggregate, or it gets a new config, or you no longer want the QEMU config on it for some reason; or you want to update your nodes; or, what can always happen, your hypervisor simply crashes, in which case Yaook also starts an eviction of the node to get all the customer workload moved away. And of course you can always manually delete the compute node, which likewise triggers the eviction workflow.

If something changes, the operator detects it and marks the compute node for deletion. All the VMs on it are evicted, and once they are gone, the nova-compute node resource is deleted. The node is then empty and can be redeployed with a new nova-compute node resource, and everything starts from the beginning.

But what is this eviction of VMs? For that we implemented an eviction job which runs before the node is deleted; I brought a log of it here. At the beginning it checks whether the node is actually still alive: we ping it, we look into Kubernetes, and we look into OpenStack, to make sure we don't start a forced eviction of all the VMs while the node is in fact still alive, because if the node is still alive we can live-migrate the VMs away. In the next step the job looks at all the VMs running on the node and classifies them based on their state: active VMs can normally be live-migrated; shut-off VMs have to be offline-migrated; and VMs in, for example, an error state are not handled automatically. For those we just wait for somebody to have a look, so that we don't break anything; an error state can have so many causes that we don't want to build automation for it. In the last step the job runs the evictions and migrations themselves.

This is quite nice for a single node, but the real benefit comes when you have 500 nodes or more, because then you have a single point where you configure your nodes, your config, the versions, and everything, and Yaook takes care of the automatic rolling upgrade of the nodes. It takes one hypervisor after another, and you can also configure how many hypervisors from the same label set, say from the same aggregate, are updated at the same time: for a small aggregate maybe one node at a time, for a big one five or ten, to get through it in a nice and reasonably fast manner. That's all for the nova-compute example, and I will hand back to Stefan for a short overview of what is already in Yaook.

We support the OpenStack versions Queens, Train, and Yoga; Queens will be dropped soonish. We already have automatic OpenStack upgrades for Keystone and Glance, currently from Queens up to Yoga; for the other OpenStack services not yet, though if I remember right, Ironic is maybe also in. The storage back ends currently supported are Ceph and NetApp. Networking is a bit tricky: with the Yoga release we switched to the OVN setup; before that we used Neutron agents with OVS. And we are GitOps-ready: since everything in Yaook is a custom resource, you can simply place it in a repository and use tools like Flux or Argo CD, which sync the state of your repository with your Kubernetes cluster. That enables you to make one change in your Git repo that is automatically rolled out to the cluster, and then rolled out by Yaook to all your workload.
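For example, with Flux the sync could look something like the following; the repository name and path are placeholders, only the Kustomization API itself is standard Flux.

```yaml
# Sketch of a Flux v2 Kustomization that keeps Yaook custom resources in
# the cluster in sync with a Git repository (names are placeholders).
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: yaook-openstack
  namespace: flux-system
spec:
  interval: 10m          # re-check the repository every ten minutes
  sourceRef:
    kind: GitRepository
    name: cluster-config # Git source holding the Yaook CRs
  path: ./openstack      # directory containing the custom resources
  prune: true            # delete resources removed from the repo
```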
Yaook is not done; we are still developing it, and there are things we want to include. For example, automatic upgrades to the newest releases for all OpenStack services. We want to react faster to dying hypervisors: currently, if a hypervisor dies, we have to wait for Kubernetes to see that the node is down, which can take five minutes or more, and that's not so nice. We want to optimize the OVN setup: since we use it quite freshly, we see every day things we could improve, and it's nice to put those changes into Yaook as well. And maybe you found, or didn't find, some OpenStack service on the list that you would like to use; then you can either build it yourself or open an issue, and it could also be included.

With this whole Yaook thing, we wanted it to be really open source. We are from two companies, and having software owned by two companies is a bit tricky from the legal side, and it also doesn't feel really open source. So we founded a non-profit association, together with other companies as well... okay, I got stuck on the name now... for building sovereign digital infrastructures. I'm stuck at this point, but if you are more interested, at four in the evening in room 15 my colleague will introduce this association in more detail. As a short overview, which I guess you will also see there: we have the educational part, which is mostly about telling others why open source and open infrastructure are important and why software sovereignty matters for them; we want to tackle community building for projects like Yaook; and of course we want to support not only Yaook but also other projects in becoming open source. But more on that in the presentation later. Like I said, this whole thing is open source; you can check it out on GitLab, look through the documentation, or contact us on IRC. At this point, thanks for your attention. Do you have any questions?

Thanks for the presentation. My question is: you mentioned that the name Yaook means "yet another" deployment tool; can you elaborate on why you decided to create something new compared to the existing initiatives that were already there in the community? And the other question: you want to expand the development beyond your own two companies; do you have any plans to move the development to opendev.org, to host Yaook together with the rest of the OpenStack projects?

Did you all get the question? Okay. Maybe the second part first, hosting together with OpenStack. We were in discussions with the OpenInfra Foundation when we were deciding whether to build our own association or not, and at some point this got stuck somewhere; I don't actually know where, I guess the focus is a bit different. We wanted this project to have a home somewhere, so for now this is it, but it doesn't have to stay that way; I guess it can move later on if it feels right for both sides.

Regarding the first question: we of course looked at other deployment tools, like OpenStack-Helm and so on, but what we missed most was the day-two operations: keeping your cluster healthy when you don't touch it, because for the most part, if you don't touch a deployment, it breaks after some time. We also missed what I showed you today with the live migration, this whole lifecycle management for a compute node, so that you basically don't have to do these things manually. The operator approach with the control loop, at the point we started this project, we didn't find anywhere else in a way that fit our needs. Other projects started around the same time and were also presented at the summit, but for the most part I didn't find this approach, which is totally fine; they serve the needs of other users, so they built something else. Kolla-Ansible, for example, didn't give us this: it's nice for deploying an OpenStack cluster, maybe even easier if you only want an initial deployment, but changing something afterwards is hard.

Next question: how are you doing networking between Kubernetes and OpenStack, and can you, for example, schedule a normal pod on a hypervisor? Do you mean communication between VMs and workload on the Kubernetes cluster?
Yes, pods talking to VMs. No, that's currently out of scope, because we would rather say: if a customer wants a Kubernetes cluster, we run a Kubernetes cluster on top of OpenStack, for example in the customer's project, and then the customer can put a VM right next to it, which makes it much easier.

On how the networking is done: the Kubernetes part, the operators talking to each other and to their databases, uses Calico as a backend, and everything customer-facing uses OVN or OVS, whatever you decide on in the deployment, and that runs in the host namespace. The container basically utilizes every feature of the node itself instead of running in the Kubernetes overlay, so these two flows never touch each other. This has some implications: if you use OVS firewalling, for example, you cannot rely on iptables, because they are used by Calico and there is conflicting stuff in there; and if you run OVS with VXLAN at the same time as Calico with VXLAN, you need to use different VXLAN ports. So there are some caveats, but basically it is a completely separate flow. The decision was: we want to operate clouds with multiple tenants, and somehow connecting their underlying Kubernetes to the VMs wouldn't be so nice. But you can run whatever pod you want on a hypervisor: for monitoring we start exporters, and we also run lldpd in a container because we don't want to install it on the node. You can run everything you want, but keep in mind that these resources are subtracted from whatever you want to provide as VMs. You need to keep resource scheduling in mind, because the Kubernetes scheduler does not consider what OpenStack allocates, so you have to enforce it via cgroups or have a very good compute configuration.

Further questions? You mentioned the very popular term sovereignty, especially in Europe these days; can you elaborate a little on how you address this term with Yaook?

I think it's a bit hard to pin it on Yaook; I mentioned it more on the Cloud and Heat side, and I won't go into detail about my company right now. Yaook is mostly focused on installing OpenStack; how you then configure it or what you do with it is on the operator's side. I think our contribution with Yaook is that everyone can use it to install their own OpenStack cluster. The rest, to my understanding, but maybe we can discuss it later, is not really the focus; of course we don't want to block sovereignty with it, we want to support it. But I don't see anything more we could do within Yaook to make it much better; maybe my understanding is different from yours.

Okay, I was just curious, since you added it on your slides; so it was only related to the company and not really to this tool. Any further questions? If not, our time is also over. Thanks for your attention, and if you have further questions, you can reach out to us. Thank you.