Welcome to this OKD working group briefing on deploying OKD 4 on Azure, brought to you by Joseph Meyer, who has been working with the working group, testing one of the many platforms. He's going to walk us through his lessons learned deploying the OKD 4 beta release on Azure. So Joseph, why don't you introduce yourself and take us on a tour.

Hello, everybody. My name is Joseph Meyer. I'm a cloud architect at Rohde & Schwarz in Munich, Germany. My company is rather old; it was founded in 1933, and it's famous: almost every physicist or electronics engineer would know it, because we are one of the leaders in making measurement equipment: oscilloscopes, signal analyzers, spectrum analyzers, signal generators. We also make TV transmitters, where we are a market leader; it could be that in Canada, too, your TV signal comes from our transmitters. We also make body scanners, maybe you have seen one at an airport, and lots of RF high-frequency measurement and generation equipment. That's what we do.

For almost three years now I've been working in a team concerned with digital transformation at Rohde & Schwarz. Our job is to build what we call the Rohde & Schwarz cloud, which is based on OKD, or OpenShift, so we are responsible for making a great platform for our developers. We have been working with OKD since version 3. I don't remember exactly when we started, but the last version 3 release came out at the end of 2018, and since then we have been eagerly waiting for OKD 4.

Why do we wait so eagerly? Because we have more and more applications which require a more modern Kubernetes version. As you know, version 3.11, the last release in the OKD 3 line, ships Kubernetes 1.11, while OKD 4 starts with Kubernetes 1.17 as far as I know, and that's what a lot of applications require now. What we also like are the great ops features of OKD 4, with automatic updates not only for OKD but also for the underlying OS. We love that a lot, because currently we spend a lot of work patching hosts; sometimes those things don't work as expected and you have lots of troubleshooting. Everything was fine for its time on OKD 3, but OKD 4 is more advanced and we want to have it. The tight integration of tools like monitoring, and much more, is also very good. And the web UI: I'm a total fanboy of it, because I think it looks fantastic, and it's great not only for beginners but also for experienced Kubernetes developers; you find things very fast. Everything looks like it was designed for Kubernetes; I think it's the best web UI for Kubernetes in the world right now. The roadmap also looks very promising: the last time I looked, I saw a PowerPoint presentation with more than 100 slides filled with nice, cool features coming in the next releases. That's why we'd like to jump onto OKD 4, and that's also the reason I'm so engaged in getting it working: I want to see it in my company.

We have two locations where we use OpenShift or OKD: at Rohde & Schwarz on-premises in Munich, and on Azure. We follow a kind of hybrid approach: we build software on-premises and deploy it in the public cloud. That's why I'm working on the vSphere version of OKD 4 and also on the Azure version. The Azure version is kind of special, because it was hard to get running; several milestones had to be achieved first.
The first problem was that there was no Fedora CoreOS version which worked out of the box on Azure. There was a network bug which required a reboot, sometimes more than one, before the virtual machine got an internet connection on first boot. I think Azure is not the only platform affected, but it was my problem: at first I rebooted every VM by hand, with no idea whether the Kubernetes control plane was still running. It was not ideal. But since a few days ago there is a test version of Fedora CoreOS available where this bug seems to be fixed. I tried deploying VMs with it several times and it worked every time, so this problem seems to be solved; I'm very interested in getting it into a released version soon.

The second point on Azure is that the Fedora CoreOS image is still not available on the Azure marketplace. Normally, if you spawn a VM on Azure with Ubuntu or CentOS, you can use images maintained by third parties on the marketplace; it's very easy. In the case of Fedora CoreOS, because it's rather new, there is no marketplace image. The problem is how to get the image into an Azure resource group: the installer downloads a compressed VHD file from the internet, extracts it, and uploads it again to Azure. In my case, I do most of my OKD work on my internet connection at home, which is very slow, with download rates of maybe 5 megabits per second and an upload rate far less. If I used the installer without modifications, it timed out almost every time after half an hour, so I wasn't able to use it.

What made it more complex is that a few weeks ago the Fedora CoreOS image also had to be modified with QEMU. Christian told me today that this has been fixed for a few weeks now. I was surprised, because I built a workaround for it with lots of love over a long weekend. But I think the workaround is still necessary: I tried today to deploy OKD on Azure with the unmodified installer from GitHub and it didn't work; maybe I can show Christian later what the problem is.

So yes, I made a workaround for all these problems, because I wanted to test OKD on Azure and I didn't want lots of manual interaction. Everything I show you in the next three slides will no longer be necessary once Fedora CoreOS is available on the Azure marketplace; that's important to say. This modification of mine will never make it to master in the installer; it's just for trying out OKD on Azure today. Here is a little picture to explain the situation. If Fedora CoreOS were on the Azure marketplace, you could create an OKD cluster just by calling openshift-install create cluster. Because we don't have it on the marketplace, I made what Vadim called a hacky workaround. It is a hacky workaround, I'll be honest about that, but it helped me a lot. What does it do? Through the installer, it creates a helper VM on Azure. This helper VM downloads the Fedora CoreOS VHD file, extracts it, and uploads it to the storage account in Azure. I use Terraform, which is included in the OpenShift installer, to do that. My assumption was that the download and upload would be much faster from a virtual machine that is already on Azure, and that is indeed the case. For that I had to modify the installer. I also thought about different strategies to achieve the goal.
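Roughly, the job of that helper VM looks like the sketch below. This is a hedged reconstruction, not the literal script from my branch; the download URL, storage account name, and SAS token handling are placeholders.

```bash
# Sketch of the helper VM's work (URLs and names are placeholders)
curl -L -o azcopy.tar.gz https://aka.ms/downloadazcopy-v10-linux
tar xzf azcopy.tar.gz --strip-components=1

# Fetch and decompress the Fedora CoreOS Azure disk image
curl -L -o fedora-coreos.vhd.xz "$FCOS_AZURE_VHD_URL"
unxz fedora-coreos.vhd.xz

# Upload into the cluster's storage account; Azure images need a page blob
./azcopy copy fedora-coreos.vhd \
  "https://${STORAGE_ACCOUNT}.blob.core.windows.net/vhd/fedora-coreos.vhd${SAS_TOKEN}" \
  --blob-type PageBlob
```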
I also thought about manually uploading the extracted image to Azure blob storage, but the problem remains that the upload is very slow from my location. So I modified the installer instead, and that's what I want to show in the next three slides. These are the core commands which run in the helper VM. The list is very short because I removed all the QEMU stuff. In principle, the script on the helper VM downloads azcopy, a tool from Microsoft with which you can upload files to Azure blob storage; then it downloads the compressed FCOS VHD file, extracts it, and uploads it with azcopy to the blob storage. On Azure I get wonderful upload rates; I think it only takes a minute to do these tasks. Here at home I never got it done; it runs forever.

First, you have to install Go somewhere; nothing special about that. Next, and don't be frightened by the list here, it's also nothing special: you clone the OpenShift installer and check out the FCOS branch. There is still a wrong release image referenced in the installer; I think Vadim made a commit for that, but it hasn't landed yet, so I have to replace the 4.5 release by 4.4. Maybe tomorrow this won't be necessary anymore. Then I cherry-pick three commits from my repository where I've made the change to the installer, you build it, and you copy it somewhere you have access to; I copied openshift-install to /usr/bin. That's everything about the hack.

What comes next should be very familiar. First you have to create a service principal on Azure; there is very good documentation about that in the OpenShift documentation, and I have pasted the link into the slides. This is very important: a service principal is something like a machine service account, with some RBAC on it, which allows the OpenShift installer to create resources on Azure. You have to do this first. You get an application ID and a secret, and you need your tenant ID and your subscription ID; these are Azure terms, anyone used to Azure will know them. You will need them later.

The next step is to generate an SSH key pair with ssh-keygen. Afterwards, with openshift-install create install-config, I create a config file called install-config.yaml. I create it beforehand because I like to save it for later use, so I don't have to answer all the questions the installer asks every time. I copy the install-config away because of some installer weirdness I never understood: it deletes install-config.yaml. Why, I don't know, but it does delete it, so I save a copy and copy it back for the next test.

What's also important before we install the cluster: you have to set an environment variable which overrides the default Fedora CoreOS image with the one the Fedora CoreOS team published a few days ago, which contains the Azure network fix. You have to use that. If you don't, the VMs won't come up and you would have to restart them a few times, and I mean every VM, not only the bootstrap VM.

With all these preparations you can create the cluster; you should know this command. It takes around 10 minutes at first: the installer creates the helper VM I told you about, and a few minutes later the VHD file, around 8 GB, is in the storage account in Azure. After that the installer proceeds as usual: it creates an Azure VM image out of this VHD file, and all the VMs are created.
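Condensed, the preparation looks roughly like this. It's a sketch, not my exact script: the commit IDs and the image override URL are placeholders, and OPENSHIFT_INSTALL_OS_IMAGE_OVERRIDE is the variable I believe the installer honors for OS image overrides.

```bash
# Hedged sketch of the preparation steps (commit IDs and URLs are placeholders)
git clone https://github.com/openshift/installer && cd installer
git checkout fcos                                   # the Fedora CoreOS branch
# (at the time: replace the 4.5 release image reference with 4.4 if needed)
git cherry-pick <commit-1> <commit-2> <commit-3>    # my helper-VM workaround
hack/build.sh                                       # produces bin/openshift-install
sudo cp bin/openshift-install /usr/bin/

ssh-keygen -t rsa -f ~/.ssh/okd4_azure              # key pair for debugging access
openshift-install create install-config --dir=okd4
cp okd4/install-config.yaml install-config.yaml.bak # the installer deletes it later

# Point the installer at the FCOS build with the Azure network fix
export OPENSHIFT_INSTALL_OS_IMAGE_OVERRIDE="https://<fcos-test-build>.vhd"
openshift-install create cluster --dir=okd4
```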
After around 10 minutes, the bootstrap VM should start. Once the master VMs show up in the Azure portal, you can start watching your OKD 4 cluster come to life using the oc CLI. I think it's very nice that the installer creates all the resources completely automatically; you don't have to do anything manually. Normally, if everything works, you have your cluster and can start working with it. That's very important; it's what I wrote this installer modification for.

It should look like this: in the end there is an FCOS VHD file in the Azure blob storage container. I always had to use the blob type page blob; without that, Azure complains that images need a page blob, not a block blob. I saw in the sources that the FCOS branch and also the master branch have block blob in the source code. I don't understand why that works for the Red Hat CoreOS version of the OpenShift installer, but I always had to set the page blob type in the Terraform code. Maybe I misunderstand something, but for me it was required.

If you want to watch the installation, you have to get the oc CLI tool for version 4 of OKD. That's very important: don't use the old oc command. If you only want to see pods or CSRs, the old version would also work, but if you want to trigger upgrades on your running cluster, the version 3 oc won't work. So I get the oc for OKD 4, untar it, and then I set the KUBECONFIG environment variable to the path of a file the OpenShift installer created while creating the cluster; it's in auth/kubeconfig. That's important so oc can connect to the cluster, because the kubeconfig contains the certificates and also the cluster location; the domain name is included.

What I normally do afterwards is run a watch, so I see all pods in all namespaces every second. That helps me spot problems, pods that don't come up or that restart. In that situation I look at the pods' logs, and if something looks strange, I open a ticket. In the last few days and weeks that didn't happen; everything was running as expected.

You should also keep an eye on this: once the pods in the openshift-machine-api namespace are running, normally three worker VMs get created in Azure. After a while, CSRs for them are generated, and a few seconds later they normally join the cluster. If that happens, you are almost done. The problem, and this is a little spoiler, is that sometimes the VMs come up in Azure but don't do anything. It happened to me twice today that two worker VMs joined the cluster and the third one didn't. I was able to SSH into it; I saw that it had an internet connection, but podman images didn't show any images, and journalctl didn't show any great activity. It was just idling. I don't understand where this comes from; maybe it's a new effect. We'll see in the live demo whether it happens again.

Afterwards, after around half an hour, you should be able to see the web UI. Here is a screenshot from a cluster I made two days ago; maybe I can show it before we go into the live action. I will switch the presentation to my browser. Here it is. Something is degraded already; I think it's the samples operator. And here you can see we are on Azure; that's our domain name. If I go into the settings, I see the version I'm on. It's from two days ago; I think it's the beta version, I'm not sure.
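The oc setup and the watch I just mentioned, as commands; the install directory okd4 is my example name:

```bash
# Get the OKD 4 client, point it at the new cluster, and watch pods
tar xzf openshift-client-*.tar.gz           # contains oc and kubectl
export KUBECONFIG=$PWD/okd4/auth/kubeconfig
watch -n1 'oc get pods --all-namespaces'    # refresh every second
oc get csr                                  # worker CSRs show up here
```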
Here we have a degraded samples operator. It happens from time to time after an update that something is degraded. I don't know exactly what I can do against it, or whether it is normal, because upgrades, as far as I know, are not supported at the moment; testing of upgrades only began a few days ago. You can see that on the release page for OKD, for Origin. I love the visualization there. I think that's the reason why upgrades sometimes work and sometimes don't, but normally the first installation always works and everything is green.

This should be proof that OKD can run: I also deployed Argo CD as a test, and everything is running. Here it is. We use Argo CD and GitOps for configuration of the cluster and also for configuration and automatic deployments of our applications. Maybe I will blog about that, because we have had a few nice findings there.

Now I would like to try to create an Azure cluster. It will take a bit of time, so I will go through the slides in parallel so we don't lose too much. I open an Azure Cloud Shell here. Why do I have to do that? Because in my company the SSH port is blocked by the firewall, and my modification of the OpenShift installer runs a script over SSH, so it does not work from on-premises. That's why I'm in the public cloud, on an Ubuntu VM I prepared before. I don't want to show how I built the installer; the instructions in the presentation work, I have tried them several times, and you can try them on your own if you like.

I go into a folder. The oc tool is already downloaded here, and I already compiled the OpenShift installer with my modifications. Then I set my environment variable for the Fedora CoreOS image; if you forget it, you will have problems later. I made a script for that. Here it is: the environment variable still points to the Fedora CoreOS build with the Azure DNS network bug fix.

Now I create my install config; the rough shape of the resulting file is shown below. I have to answer a few questions. Here I am asked for a public key; it's useful for debugging, because if you have a problem you can SSH into VMs to troubleshoot, so I always provide one. Here I select Azure. I was already logged in before and entered my credentials; I'm pretty sure you know what to enter here: the information about the service principal you created with the help of the OpenShift documentation. Then I'm searching for a good location to deploy my cluster. Where is Switzerland? I like Switzerland; it's rather close. Here it is. The installer is already connected to Azure; you can see it, because it finds our DNS zone for our playground environment. I select it, give the cluster a name, and the pull secret is, I think, optional. Press Enter; that's it. The install-config.yaml was created by the installer with everything I entered. The credentials are stored in the home directory, in a hidden directory called .azure; they are not in the install-config.yaml.

Now I can create the cluster. I hope I did not forget anything; it's always brutal when you see at the very start that you forgot something, and if all the VMs are already spawned, that's always very sad. Now the infrastructure gets created by the installer in Azure. What you see here is that I forgot to switch off my debug output. As I said, my hack creates a helper VM, and now Terraform, which is part of the installer, creates a few resources and also starts this helper VM. Let's go to the Azure portal and wait until the OKD 4 resource group pops up.
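For orientation, the generated install-config.yaml has roughly this shape; all values here are invented examples, not my real cluster settings.

```yaml
# Rough shape of an Azure install-config.yaml (values are examples)
apiVersion: v1
baseDomain: playground.example.com
metadata:
  name: okd4-demo
platform:
  azure:
    baseDomainResourceGroupName: dns-zone-rg
    region: switzerlandnorth
pullSecret: '{"auths": {}}'
sshKey: ssh-rsa AAAA... user@host
```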
The resource group will pop up; it always takes a while. That does not mean the installer is doing nothing; the Azure UI always takes a few seconds, sometimes minutes, until it shows resources which were already created. We can see there are some older tries of creating an OKD 4 cluster; let's remember these two and three, and in a few seconds a new entry should pop up.

Maybe I can go through this first: what I normally do if things go wrong, which at this stage happens with the Azure deployment from time to time. These are the typical commands which will sometimes save your life; they are collected in the sketch below. If the master VMs don't come up, or stay in the creating state for a long time, you should enter the bootstrap VM. The username is always core, and the public key is the one you created before with ssh-keygen. The first thing I do is a curl on Google to check if I have an internet connection. With the earlier versions of Fedora CoreOS I got no answer from this command, so it was obvious I had no internet connection; after a reboot, Google worked. The next thing I do is sudo podman images, to see whether Fedora CoreOS has downloaded anything from the internet. After four or five images are downloaded, every VM normally restarts. I think it's because the images contain some tools which get written to the Fedora CoreOS partition; after the reboot these tools are permanently inside the VM. Also, crictl is not in the VM from the beginning; it needs the reboot, and after that it's in the system. With sudo journalctl --no-pager --follow you can see all system logs; if you have containers which make problems, you can look there. The command which shows the containers is crictl ps -a.

If it takes forever until the control plane comes up, you have a few options. The bootstrap VM already has a public IP; the master VMs are behind a load balancer. If you want to get into a master or worker VM, you have to give it a public IP and also open port 22 in the network security group. That's also a point where I think debugging could be improved: if the private key were on the bootstrap VM for debugging purposes, I could jump to each VM, masters and workers, from the bootstrap VM, because all VMs are in the same private network. But currently that's not possible. Maybe it has already changed; I have not tried in the last days. Normally the private key is not inside the bootstrap VM, which makes sense for security reasons, but for debugging it would be nice.

In the next step we can check with oc get csr whether all masters are already approved. That's the point where the control plane should be there, and with oc get pods you should see lots of containers being created. If there are not three masters after a reasonable time, you are lost; I don't know how to recover from that, so in that case I always destroy and recreate the cluster. I hope that doesn't happen now, during the installation running in parallel.

That's a good time to switch over. Let's see if there is a third. Yes, we have a third resource group in the Azure portal. I change to it; lots of resources are already created. Okay, we have only one virtual machine so far; let's have a look. It's still running. This one virtual machine is the helper VM I was talking about, and we have a storage account. The VM image is already created.
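In command form, my triage routine on the bootstrap VM looks roughly like this:

```bash
# Typical triage on the bootstrap VM (username is always "core")
ssh core@<bootstrap-public-ip>

curl -I https://www.google.com        # do we even have an internet connection?
sudo podman images                    # has anything been pulled yet?
sudo journalctl --no-pager --follow   # all system logs, live
sudo crictl ps -a                     # containers (crictl appears after the first reboot)
```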
That means at this point I could delete the helper VM; it's not automatically deleted, I saved myself that work. Let me show you the storage account instead. Here is our uploaded Fedora CoreOS VHD file, with the type page blob and the proper size. If an image gets created from it, that's a very good signal: it means the format of the VHD file is correct. If the format is not correct, you can't create the image and Azure will complain about it.

Let's refresh. We have the bootstrap VM and three masters. At this moment, what I normally do is go to the bootstrap VM. It's still creating. I'll open the serial console, where you can see what the VM is doing while it's running. Currently it's still creating; I have to wait a little bit. The master VMs should also be in the creating state. Maybe we have one which is a little bit faster than the others; then you could see that Ignition is waiting for its configuration. We have to wait a little; that's normal, VM creation always takes a moment. What is important: you should not use this address to SSH into it, because this is the IP address of the load balancer. Go into networking and take this IP address; this is the IP address of the virtual machine. We also see that the bootstrap VM already has port 22, the SSH port, open.

It takes its time. Now I try the serial console again; this time the bootstrap VM has started. I'll go back and forth to show you. That's how it looks when the VM has started: somewhere between the boot logs you see the Ignition code starting up, and here you will see: hi, I'm Fedora CoreOS, I have this IP address. This is a private IP address here in Azure.

Now I will show you how to get into the VM. I fell out of my VM here, so I enter my Ubuntu VM again and SSH into the bootstrap VM using its IP address. This is the point where, if you forgot to provide an SSH key, the VM won't let you in. I'm already inside. One thing I saw, and I don't know if it is severe or not (normally it isn't), is that I sometimes have failed units here, services which didn't come up.

I talked too much: this is the situation where the bootstrap VM restarts. As I said before, it installs a few tools from the images it downloaded from the internet, extracts the tools, and puts them on a second partition; I think that's the way rpm-ostree works. It reboots, and then you have the tools inside the VM. You have to wait for the restart; completely normal, nothing special about it. It's a good sign if the VM reboots. Here's a little check: if it pulls images, this list will grow, and it grows a lot. You can see how it grows; lots of images. There's also a command where you can watch the kubelet; nothing special, it should also be working. And a favorite command of mine here, because the API server is already answering.

As far as I understood, the bootstrap VM starts a fake control plane, and all the masters get their Ignition configs from the bootstrap VM. They install some of the software, reboot, and join the cluster. Once three master VMs have joined the cluster, the bootstrap VM stops working and gets deleted. That's the point where Kubernetes is running, the installation finalizes, the machine API operator starts new worker VMs, and you are almost finished. I love this moment: lots of pods are spawning; at first they are pending. The next thing: I don't want to disturb it, but I have to do something in the Cloud Shell, because if I do not, it will close.
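Two commands I find handy at this stage. I believe the bootstrap service unit is called bootkube.service, but treat the name as an assumption:

```bash
# On the bootstrap VM: follow the temporary ("fake") bootstrap control plane
journalctl -b -f -u bootkube.service

# From the workstation: block until bootstrapping is done
openshift-install wait-for bootstrap-complete --dir=okd4 --log-level=debug
```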
I will check the certificate signing requests; there are a few now. And here is the pods view: you will see pending pods like this until the three masters have joined the cluster. Be prepared that this can last a little bit. Of course, I do not see any worker CSRs yet. No worries about these errors, which are actually warnings from Terraform: they tell me that some resources are deprecated. There is already a version 2 of the Azure provider for Terraform; they have already tried it out and it looks very promising. This output only tells me that something is getting deprecated; it is not an error.

The installer says the API is up; it is waiting for the bootstrap to complete. Our CSRs come to life. For me (I don't know the internals of OKD very well) this is always the first indication that something is trying to join the cluster. It's a good signal, because it means that all master VMs came up; they should be visible now in oc get nodes. And it's true. Wait, they are not ready yet; that's OK. I hope I don't make you sick with my switching around here.

Now comes the moment I love, when everything here is in ContainerCreating. This counter goes up to 5, and then the pods will explode. I was wondering on the last try that I did not see any openshift-console operator; I don't know where it is, whether it comes later. The console didn't come up, and I was thinking this is a problem; maybe the openshift-console operator is not present, but I'm not sure. Now it is exploding; isn't that cool? The console operator is still not there.

If you want to see some information about the health of the cluster operators, you can do this. Now you see a list of operators: whether they are available, progressing, or degraded. Don't be worried if operators get degraded during the installation; they repair themselves. I think there are dependencies between operators, so don't get confused if a degraded status is set to true. What concerns me more is that there is still no console operator; I don't know what I missed here. Maybe it comes.

Now I will check whether the machine API pods are present. Yes, they are; this is very good. It means that the machine API controller will create your workers, and they should appear in the portal. The portal is a little slow. If the oc CLI blocks or sends you this message, don't be concerned: it can be the moment where the load balancer kicks the bootstrap VM out of the cluster, and sometimes this message is a result of that. Normally the nodes should become ready soon. I think this was the moment where the bootstrap VM left us. You sometimes also see API pods restart, and then you get strange messages, but don't be concerned about that; at least in my experience, it's normal.

Check again if the workers come up. Yes, they are. There is a little bit of tension, because it's not clear whether they all get into the running state, and if they are running, it's not clear whether they will ever join the cluster. As I said, a few days ago it worked, but I'm not sure if I was lucky or not. Wait a little bit. Creating does not always mean that the VM is not running yet: creating can mean that Ignition is waiting for its configuration. Only when Ignition has completed its configuration does the VM go into the running state.
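The checks I keep running during this phase, as commands:

```bash
oc get clusteroperators                    # AVAILABLE / PROGRESSING / DEGRADED per operator
oc get pods -n openshift-machine-api       # the controllers that create the workers
oc get machines -n openshift-machine-api   # worker machines being provisioned
oc get nodes -w                            # watch nodes register and go Ready
```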
If Ignition waits for its configuration file: the workers also need a configuration, which this time is served from the control plane, from the masters, by a service called the machine config server. As soon as they get this configuration, they configure themselves and go into the running state. We may have no chance to see anything in the serial console, because Azure needs boot diagnostics to be enabled; it's enabled on the masters and the bootstrap node, but for some reason not on the workers. Maybe it would be a good idea if I could configure that, just to be able to see what's going on. If it doesn't work, it's not bad.

Let's see if they are running. The first one is running. The second one, crossing my fingers... two running VMs. And there is the timeout of the Azure Cloud Shell again; it would be such a nice feature to not get thrown out every time. You can ask the Azure folks for that; I hope they see this, they are very helpful people and I love the Azure team. Only masters so far, but maybe we have a few workers soon; wait for them, it takes a little bit. Yes, all three are running. Maybe it works. This is now the normal situation I experience since I use the fixed Fedora CoreOS VM image, but there is still some effect hiding in there: as I explained, the VMs running doesn't mean that something is actually happening inside them.

Nice. I will make a little pause here (Diane, cut it out); we have to wait until the workers come. Okay, here was something: this always comes before something joins the cluster. There is always this node-bootstrapper, and here we have the first worker VM. A few words about that: if you are watching how many pods are running, you will see that at some point there are lots of pending pods. That's because pods have node constraints, node selectors; they are only allowed to run on certain nodes, and they get started when the workers come up.

I still see no OpenShift console. I'm a little bit concerned; I don't know why. Maybe it comes later. The second worker has joined the cluster. A little bit of a hiccup in the API server, but it should be running. The first worker is ready, the second one is getting ready, and the third one, I hope we see it soon. Full strike: we have three workers. That's nice. But I'm worried about the OpenShift console; I don't understand it, that's completely new. The last time I installed, it was there from the beginning of the setup; I saw the operator. Now I don't know where it is. No, it's not there. Maybe it comes when all workers are set up, but it's a little strange, because all the other operators were there from the beginning.

I love this picture. Christian and Diane, I love presentations that work and have a result, and I have a strong feeling that this presentation will have a running web UI at the end. Oh, it's not here; it's missing. Oh my God. Maybe it's missing from the release payload somehow; that would be weird, but it should be in that view as well. The operator is missing; I think that's the point. Can I check that somehow? It looks like the namespace doesn't even exist. I'm not sure; I don't have a quick fix. Everything else works, and I'm prepared: I have a running cluster here from before. I would say I make a little pause. I don't have a route here, but I can get the route with this command; you need the openshift-console namespace for that.
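The route lookup, plus the login data you'll need right after; the file paths assume the okd4 install directory from earlier:

```bash
oc get route console -n openshift-console \
  -o jsonpath='{.spec.host}{"\n"}'      # URL of the web console

cat okd4/auth/kubeadmin-password        # password for the kubeadmin user
```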
This is the URL where you can open the web UI of OKD 4. And I have prepared a login, because there is the problem with this installation, so I switch to a former installation of OKD 4. I enter the console URL and I'm asked for a first-time login, username and password. The username is kubeadmin. The password was generated by the installer at the very beginning of the installation process; it's in the auth directory, in the file kubeadmin-password. I take that and copy-paste it into my web console. Yes, and I'm in.

So this is the web console of OKD 4. We see here that there is one degraded operator; I think it's the OpenShift samples operator, and Christian told me that there is already a fix in preparation for that. Normally everything is green. If you upgrade a running cluster with the oc adm upgrade command (I'll try to find it; I think it's also in the documentation), then it can currently happen, in the beta, that you get degraded operators. But as I explained before, the project is in the phase where the team has just begun to test upgrades; it's normal that things sometimes get degraded. And that's it for the installation process.

I would switch back to the presentation; I have a few slides left. I'm happy to have it running on Azure; it works well. I also tried out storage, which I prepared in advance. Where is my storage? Storage classes: managed-premium. That's Azure Disk, the default storage class, as in lots of Kubernetes distributions. And I have a storage test here: I created a PVC, a persistent volume claim, and it is bound; it created a persistent volume (a sketch of such a PVC follows below). There is something special I don't fully understand, whether it is a wanted feature: if you create a PVC, it will only bind to a persistent volume once it is used. So I had to create a test pod which mounts this PVC; joseph-test is the name of my PVC. As soon as I started this pod, the persistent volume was created. You can also see it if I change to the resource group: this is my Azure disk, size one gigabyte. The requested size was a gigabyte, and you get a one-gigabyte premium SSD, for sure. I hope I can change that, because this is the most expensive disk type on Azure; maybe it's not so smart to have the most expensive disk as the default storage class, but that's a detail.

I'm back on the slides. I stopped, I think, on the slide about what to do if something goes wrong, where I showed you typical procedures. On the second slide I mentioned the effect I have stumbled over a few times in the last days: even if the VMs are started, are configured by Ignition, and have an internet connection, they don't, let's say, pull images from the internet. My impression is: I don't see anything in the logs; I don't see that some kubelet process is restarting all the time. Maybe there is a race condition, I don't know, but it seems that it does not run successfully. It doesn't start pods, and that means this VM is dead. I don't know a procedure to get it back to life, but I think this will get clarified in time.
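Back to the storage test for a moment: a PVC like the sketch below is all it takes. The name is mine, and the bind-only-when-used behavior I saw matches a storage class with volumeBindingMode WaitForFirstConsumer; that's my assumption about the cause, not something I verified in the sources.

```yaml
# Minimal PVC for the storage test (names are examples)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: joseph-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: managed-premium
```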
If the masters are affected by this effect, I think you have not much choice but to recreate the cluster: delete it and create a new one. If a worker is affected, I got it working once like this: I deleted the worker in the Azure portal, whereupon the machine API operator complains that it can't find the VM anymore. Then I deleted the Machine custom resource; that's the representation of a VM in OKD. I exported the configuration of a working Machine custom resource, created a new Machine from that, and a new worker VM was created by the operator; everything worked afterwards (the commands are sketched below). It's a kind of workaround if you stumble over this effect; it must be observed, and I don't know whether it's specific to Azure or not, I just want to mention it.

Finally, I'd like to recap what we talked about. Installation of OKD 4 is currently possible on Azure, even if it sometimes takes a few attempts; things get smoother and smoother, and I'm very optimistic about the future. Nothing here is unsolvable. The hack I implemented won't be necessary anymore once a Fedora CoreOS image is available on the Azure marketplace. But I would be sad if people had to wait until then; I asked in the community when it will be available, and I was told they can't give a schedule for that. But they have surprised us a lot in the last months; things can go very fast. Sometimes there are still effects during the installation that must be watched, but once OKD is installed, it seems to run stable. And for sure, it's a lot of fun to work with. I love it a lot, and my team always laughs at what a fanboy I am about this software, but it is such a carefully crafted piece of software. Hail to Red Hat and the community for that. Thanks to Christian, Dusty, and many others for your support; I'm always asking lots of questions in my tickets. Sometimes my tickets are not real problems but based on inexperience on my side, but people are always friendly and help if they can. It's a very nice community; I love it.

Here is a page where I tried to collect a lot of points, because I was asked by Diane what could be improved, and I got a few answers, not only about the Azure installation; that's just one item on the list. Let's start. Fedora CoreOS for sure should get onto the Azure marketplace, so that no hacks are necessary anymore. The biggest pain point in OKD 3 was this: we use image scanners which also report vulnerabilities on running containers. This software looks at the images behind running containers, and because the images for OKD 3 were rather old, we sometimes got lots of problem reports and had to patch the OKD images on our own, because there were known vulnerabilities with newer fixed packages available. This procedure is very bad for us, because nobody can guarantee in that moment that the containers based on these patched images still run properly. If it's possible (I don't know if it is), it would be great to have OKD 4 images which are rebuilt regularly, not only when there are code commits: even if an image has no code changes, a regular rebuild, where effectively a yum update runs, would ensure that you get hotfixes into the image. It would be absolutely great if there were regularly updated images.
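Sketched as commands, the worker-replacement workaround I described above looks roughly like this; the machine names are examples:

```bash
# List the Machine custom resources (the OKD representation of the VMs)
oc get machines -n openshift-machine-api

# Export a healthy worker's Machine as a template
# (strip status and unique fields like providerID before reuse)
oc get machine <healthy-worker> -n openshift-machine-api -o yaml > machine.yaml

# Delete the dead worker's Machine and create a replacement from the template
oc delete machine <dead-worker> -n openshift-machine-api
oc create -f machine.yaml
```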
This would be a big advantage over OKD 3, where we suffered a lot in the last months with custom-patched images. GlusterFS was the next point: it was very unstable in OKD 3, so we moved it off the cluster. But as far as I understand, OKD 4 comes with a completely new proposed storage solution, based I think on Ceph and Rook; that's very nice.

Then, our LDAP sync creates a lot of traffic on our Active Directory servers in OKD 3, because we have nested group queries. My colleagues asked me to mention that better caching algorithms for users and groups would reduce the load on the Active Directory servers; maybe something like that is already available in OKD 4 and we just don't know about it.

The logging stack in OKD 3 was rather old: Elasticsearch was, I think, version 5, while currently we have Elasticsearch 7 or 8. I don't understand why there is no newer version; maybe because it's decoupled from OKD, which would be okay. By the way, currently it's not possible to install the logging stack at all: the documentation has a section saying it should be taken from OperatorHub, as far as I remember, but there is currently no logging solution in OperatorHub. It seems it was filtered out in the process in which only the community operators were kept for OKD; formerly the Red Hat operators were also present, and now they are not anymore.

Next point: documentation about OpenShift's, or OKD's, internal architecture would be nice. As I understand it, the plan is that after OKD goes GA, the community is asked to work intensively on OKD, and at that point it's absolutely necessary, in my opinion, to have more architecture docs: let's say the purpose of the operators, and diagrams of how things play together. I think this is nothing new and it will come, I'm absolutely sure, but it's also necessary if you want to have the community with you in development.

More debugging features would be nice; during my presentation I mentioned a few. Maybe it's possible to have the private keys for masters and workers on the bootstrap VM, so that if things go wrong I don't have to give them a public IP first. If you do that over and over, it can waste a lot of your time. And I think that's it for the moment for what we would like to see improved in OKD.

Here is my last slide, with some links to valuable information sources. The most important information source, in my opinion, where I got the most feedback for development, is the Slack channel #openshift-dev in the Kubernetes workspace. The situation there is that I ask something, and it normally takes only a few minutes until someone answers. My colleagues, who always open GitHub issues and sometimes wonder why there is no response, were absolutely amazed about the response times on Slack. This is where the great community shows its power, because you always get valuable information, and there is always a nice atmosphere in the chat.

Then we have the OKD repository under the OpenShift organization. You should open your issues there, not in openshift-installer or the other repositories, because if the problems come from misunderstanding OKD, they can be solved there and we don't bother the developers. So please open your issues there. Then we have the Origin release page, where all new releases are announced and you see a very cool changelog. The presentation on that page is so completely different from what we were used to in former versions; it's absolutely great. I can absolutely encourage you to visit this page; I love it, I could watch it all night.
Next we have the OKD working group page, where you find information about the OKD working group. It meets bi-weekly, normally on, I think, Tuesday evenings German time; there is also a calendar for that, and if you follow the link you get the entry into your calendar automatically. We have the OpenShift documentation, and a few days ago the OKD preview documentation was also released, so you should visit that too. If you find errors, you can submit corrections; sorry, I forgot the link, but there is also a repository, openshift/openshift-docs I think, where you can open a pull request for your documentation bugs. Don't forget to put OKD in the title, so the documentation people can filter out what's specific to OKD and has to be separated from OpenShift.

And the last one: there are two repos where, in the current phase, most of the differences between OpenShift and OKD live. As far as I understood, these are the openshift-installer and the machine-config-operator. These two projects have a heavy dependency on Fedora CoreOS or Red Hat CoreOS, and that's why the OKD people are mostly working there when they do something specifically for OKD.

This is the end of my presentation about how to install OKD 4.4 on Azure. Thanks for your attention; I hope you learned a little bit, and maybe we'll see each other in the Slack channel. I would love to see you there. Goodbye.

Thank you so much, Joseph, for taking the time to do this. You're welcome. There's always something that goes wrong; I'm very curious to figure out what it is about the console operator that didn't work, but I don't have any tips or tricks for figuring that one out. Yes, that's a stress test.