My name is Nir Magnezi, and with me is Fred Rolland; we are both part of the HPLA group here at Red Hat. Today, we're going to talk about a new way to deploy OpenShift clusters at scale in an automated way. It's called zero-touch provisioning, or ZTP for short.

After the containers revolution, Kubernetes, which is the basis for OpenShift, became the de facto standard for infrastructure management. Having multiple clusters in your organization, spread across data centers, regions, and multiple cloud providers, has become a reality. And with that, as the number of clusters increases, the need for administrators to have robust and reliable tools to manage their fleets of clusters became more apparent. Open cluster management, also known as Red Hat Advanced Cluster Management, is an open source solution for managing multiple clusters. It offers solutions for challenges like policy enforcement, application management, and also cluster lifecycle management, where you may deploy, upgrade, or deprovision your clusters. Open cluster management is available for download as an operator from OperatorHub.

Now, the notion of a cluster being this gigantic thing with hundreds of nodes, a large footprint, and multi-tenancy still exists, and we do see that, but less and less. What we are starting to see is smaller clusters with new and more compact topologies, such as the three-masters, three-workers type of scenario. And as we get closer to the edge, it becomes a requirement to support more lightweight infrastructures, such as, for example in 5G, the far edge of the network with the distributed units. So with that, we got to a point where the entire deployment needs to occupy the absolute smallest possible footprint: just a single server. To achieve that, Red Hat engineers have been working to reduce the footprint of OpenShift so it fits into more constrained environments, by putting the control plane and worker capabilities into a single node.

OpenShift typically requires a temporary bootstrap machine, which is usually a separate machine, and, of course, a provisioning network; but edge deployments are often environments where there are no extra nodes to spare. However, for the use cases we're interested in, a new functionality provided by OpenShift, called bootstrap-in-place, eliminates the separate bootstrap node requirement for single-node deployments. So when installing single-node OpenShift, you only need the node that you wish to install onto. Yet there are a few things to keep in mind here. Single-node OpenShift, or SNO for short, is still in developer preview in OpenShift 4.8. It requires at least 8 CPU cores, 32 GB of RAM, and at least 120 GB of storage. It does not offer the option to add additional hosts after you've finished your installation. And as you can probably assume, single-node OpenShift is not highly available, which means you cannot expect zero downtime for your Kubernetes API.
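To make bootstrap-in-place more concrete, here is a minimal sketch of what a SNO install-config.yaml can look like with the bootstrapInPlace stanza; the domain, networks, disk, and secrets are illustrative placeholders, not values from the talk.

```yaml
apiVersion: v1
baseDomain: example.com        # placeholder domain
metadata:
  name: sno-cluster
networking:
  networkType: OVNKubernetes
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  serviceNetwork:
  - 172.30.0.0/16
  machineNetwork:
  - cidr: 192.0.2.0/24         # placeholder machine network
compute:
- name: worker
  replicas: 0                  # no separate workers; the single node does everything
controlPlane:
  name: master
  replicas: 1                  # a control plane of exactly one node
platform:
  none: {}
bootstrapInPlace:
  installationDisk: /dev/sda   # the node bootstraps onto its own disk; no extra machine
pullSecret: '<pull-secret-json>'
sshKey: '<ssh-public-key>'
```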
So how do we deliver single-node OpenShift to the edge, at large scale, and do that with good performance? First, let's define our requirements. The topology will be hub and spoke, where the hub is a cluster running the management application. We want the spoke clusters installed on bare metal, like, for example, a server at the base of an antenna in a 5G far edge environment. We should have minimal performance impact on the hub cluster when installing a spoke cluster. Also, we want to be able to automate the process to avoid errors, so a declarative API, in the form of Kubernetes CRDs, is required. It will allow us to use GitOps-oriented deployments. The customer environment will be disconnected, so no access to the internet. And we want to be able to install a thousand spoke clusters, import them into OCM, apply policies, and get the status of the policies, all of that in an acceptable timeframe.

In order to install clusters, OCM uses an open source project called Hive. Hive is an operator which runs as a service on top of Kubernetes. The Hive service can be used to provision and perform the initial configuration of OpenShift clusters. It supports several platforms, like AWS, Azure, GCP, OpenStack, and more. Hive also supports bare-metal provisioning as provided by the OpenShift installer. However, this feature requires a separate, pre-existing provisioning host to run the bootstrap node, and this host requires specific network configuration. Also, for each cluster install, Hive starts an installation pod taking about 800 MB of memory on the hub cluster, and it consumes some storage for running the installer. So with the additional node and the additional workloads on the hub for each cluster installation, this method cannot meet our scale requirements.

So how can we take that away from the hub? There is a way to install bare-metal clusters without the need for an additional bootstrap node. The assisted installer is a SaaS hosted on cloud.redhat.com that enables the user to easily install OpenShift on bare metal or VMs. It provides a UI where the user is guided through the process of providing the minimum input to create a discovery ISO. The user needs to boot the server with that ISO, where an agent will report the hardware and other sanity checks to the service. Then the installation can be kicked off by the user once all the preflight checks are done.

So, the good news is that the assisted installer supports SNO: there is no need for a bootstrap node, the installation runs on the node itself with bootstrap-in-place for SNO, and there is no need to run the installer on the hub cluster. The bad news is that the assisted installer is a SaaS, so nodes need access to the internet to communicate with it; it has a REST API and not a declarative API; and the user needs to boot the server himself.

All right, so we need to take the assisted installer from the cloud into the hub. How do we do that? We packaged the assisted installer as an operator, deployed on the hub cluster, without any UI. We created a Kubernetes API based on the API defined by Hive, so that the integration with OCM would be easy. And for booting the node, we can use existing capabilities from Metal³ (metal kubed): the bare-metal operator is capable of booting a host, given the URL of an ISO, by using the BMC, which is the baseboard management controller. Either way, if you want more information on the assisted installer, you can check out the talk we did about it earlier this year; a link is available on the last slide.

So now we have all the pieces: multicluster management on the hub cluster with OCM, a cluster provisioning API with Hive, agent-based installation with the assisted installer, and bare-metal ISO boot with Metal³ and the bare-metal operator. And all we need is just to connect everything together. So let's walk through the high-level flow. First, the assisted installer will generate an ISO according to parameters defined in the CRDs. Once the ISO is ready, the bare-metal operator will connect to the bare-metal machine via the BMC interface and boot it with the ISO; a sketch of the resource driving this step follows below.
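As an illustration of that step, here is a minimal sketch, assuming hypothetical names and addresses, of a Metal³ BareMetalHost and its BMC credentials Secret; the InfraEnv label shown is how the agent-based flow can associate a host with a discovery ISO, and every value here is a placeholder.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: sno-host-bmc-secret       # hypothetical name
  namespace: sno-cluster
type: Opaque
data:
  username: YWRtaW4=              # base64("admin") -- placeholder credentials
  password: cGFzc3dvcmQ=          # base64("password")
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: sno-host
  namespace: sno-cluster
  labels:
    infraenvs.agent-install.openshift.io: sno-cluster   # ties the host to an InfraEnv's discovery ISO
  annotations:
    inspect.metal3.io: disabled   # skip Ironic inspection; the assisted agent reports hardware instead
spec:
  online: true                    # power the host on once the image is available
  bootMACAddress: "00:11:22:33:44:55"   # placeholder MAC
  automatedCleaningMode: disabled
  bmc:
    address: redfish-virtualmedia://192.0.2.1/redfish/v1/Systems/1   # placeholder BMC endpoint
    credentialsName: sno-host-bmc-secret
    disableCertificateVerification: true
```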
Once booted, the assisted installer agent starts collecting hardware information and reports back to the assisted installer service. Once the required validations pass — checking, for example, that we have enough RAM and CPUs — the assisted installer kick-starts the installation on the spoke cluster. Once the spoke is installed, Hive reports through the CRDs to OCM that the cluster is ready to work with. OCM then imports the cluster, deploys its agent, called the klusterlet, applies policies if any are defined, and also configures Prometheus to report metrics.

So now, let's take a closer look at the APIs and CRDs we use for the ZTP flow. Note that some of them were already defined. ClusterDeployment and ClusterImageSet came from Hive; ClusterDeployment was enhanced to enable plugging in an external installer. Using ClusterDeployment gave us the ability to keep the existing open cluster management interface. BareMetalHost comes from the Metal³ project, and ManagedCluster and KlusterletAddonConfig came from open cluster management. The InfraEnv and the NMStateConfig are the resources needed to create a discovery image. The two are linked with a label selector that allows the InfraEnv to locate the relevant NMStateConfigs; those contain network configurations such as static IPs and more. Users may configure the InfraEnv with an SSH key to debug the host during the discovery phase. The BareMetalHost contains the BMC connection information for the target bare-metal machine; it will also load and boot the discovery image on that spoke. AgentClusterInstall specifies the cluster's configuration, such as networking, number of control planes, et cetera. The Agent contains hardware information about the target bare-metal machine; it is created automatically on the hub cluster once the discovery image on the machine is booted. You'll see that in the demo. One additional note about ManagedCluster and KlusterletAddonConfig: both are open cluster management CRDs. In order for the cluster to be managed by the hub, it needs to be imported and known; ManagedCluster provides that interface. KlusterletAddonConfig contains the list of services provided by the hub to be deployed on the managed cluster once it is imported to the hub.

Over the next few minutes, I'll demonstrate the deployment of an OpenShift cluster comprised of a single node. Such a low-footprint cluster is useful for many use cases, and you'll see how such a deployment is performed using the declarative API we mentioned previously in the session. I'll start by creating a ClusterImageSet and a pull secret, which are prerequisites for registering the cluster with the assisted installer, following up with creating a ClusterDeployment and an AgentClusterInstall to register the cluster. Now we can monitor the cluster events and the AgentClusterInstall conditions to get better visibility into the installation process. I'll follow up by creating the InfraEnv resource, which contains configurations relevant for generating the Red Hat CoreOS discovery image, such as the SSH key, ignition config, and more. Notice how this is immediately reflected in the cluster events. We now get an ISO download URL, which we could use to download the image and upload it to our server; but instead, the assisted installer will monitor that URL and, when it becomes available, update the bare-metal host. Then the bare-metal operator will automatically provision the host using that image.
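For reference, here is a condensed, hedged sketch of the kind of manifests created in that part of the demo; the resource names, networks, and release image are placeholders, not the exact values used in the recording.

```yaml
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: openshift-v4.8.0
spec:
  releaseImage: quay.io/openshift-release-dev/ocp-release:4.8.0-x86_64  # placeholder release
---
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: sno-cluster
  namespace: sno-cluster
spec:
  baseDomain: example.com
  clusterName: sno-cluster
  clusterInstallRef:                  # plugs the external (assisted) installer into Hive
    group: extensions.hive.openshift.io
    kind: AgentClusterInstall
    name: sno-cluster
    version: v1beta1
  platform:
    agentBareMetal:
      agentSelector:
        matchLabels:
          cluster-name: sno-cluster
  pullSecretRef:
    name: pull-secret
---
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
metadata:
  name: sno-cluster
  namespace: sno-cluster
spec:
  clusterDeploymentRef:
    name: sno-cluster
  imageSetRef:
    name: openshift-v4.8.0
  networking:
    clusterNetwork:
    - cidr: 10.128.0.0/14
      hostPrefix: 23
    serviceNetwork:
    - 172.30.0.0/16
    machineNetwork:
    - cidr: 192.0.2.0/24
  provisionRequirements:
    controlPlaneAgents: 1             # single node: one control-plane agent, zero workers
  sshPublicKey: "ssh-rsa AAAA..."     # placeholder key
---
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: sno-cluster
  namespace: sno-cluster
spec:
  clusterRef:
    name: sno-cluster
    namespace: sno-cluster
  pullSecretRef:
    name: pull-secret
  sshAuthorizedKey: "ssh-rsa AAAA..." # lets you SSH into the host during discovery
  nmStateConfigLabelSelector:
    matchLabels:
      site: sno-site-1                # selects the NMStateConfigs holding static IPs, etc.
```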
I'll speed up the recording so we can hop to the next step and see the host report to the assisted service. The assisted installer will approve the node for us, and the cluster will start preparing for installation. The installation has now started. I'll speed up the recording once more, and we'll see the installation steps take place. The installation workflow consists of several steps. The cluster — which in our case is a single node that also functions as our bootstrap machine — starts the installation by writing the CoreOS image and the cluster configuration to disk. Then the machine is rebooted and started from disk; it no longer depends on the discovery ISO. Lastly, we move to the finalizing phase, where we wait for the cluster to initialize the core OpenShift operators. The cluster installation is now complete. Let's validate that we are able to interact with the cluster by gaining CLI access using the admin kubeconfig available in a secret. Note that we can see our single node in a Ready state; it functions both as a master and a worker. That concludes the demo of a single cluster deployment.

All right, now that we've seen how to deploy one SNO cluster, are we ready to deploy a thousand SNO clusters? We used about 100 physical nodes and ran libvirt on top of them in order to spin up VMs and simulate the spoke clusters. Here is the result of one of the one-thousand-cluster runs. We used a 10-second interval for pacing, meaning that we create all the CRDs for one cluster, wait 10 seconds, and then start the next one. So it takes about three hours to create all of them, and about an additional hour to finish all of the installations. The blue line is initialized, meaning that the CRs for the cluster are created and applied. The red one is booted: the bare-metal operator booted the spoke machine with the discovery ISO. The green is discovered, meaning that the Agent CR was created on the hub and the hardware discovery phase is completed. The provisioning line, the one in purple, shows that the OCP installation is in progress. Completed means that OCP is installed, and managed means that OCM imported the cluster. The success rate is very high: out of a little more than 1,000 clusters, only about five clusters did not finish the installation successfully. A lot of performance improvements were done along the way to be able to achieve this milestone.

In an ideal world, we would just pull everything from the internet, but in the real world it is not that simple. Private data centers are mostly disconnected from the internet for security reasons. Here are the components that are required to achieve a disconnected setup of ZTP. We need an internal registry server where we mirror the OCP release and also all the OLM containers. We need an HTTP server to host our CoreOS live ISO and the rootFS; these are needed by the assisted installer to create the discovery ISO. And of course, we need some networking configuration, like firewall rules, and to make sure that DNS and DHCP are well configured. The exact steps are available at the link below.
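As one illustration of wiring those disconnected pieces together, here is a hedged sketch of an AgentServiceConfig that points the assisted service at a mirror registry and at internally hosted live ISO and rootFS images; the hostnames, version strings, and storage sizes are assumptions for the example.

```yaml
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
  name: agent                        # the operator watches this singleton resource
spec:
  databaseStorage:                   # PVCs for the assisted-service database and file store
    accessModes: [ReadWriteOnce]
    resources:
      requests:
        storage: 10Gi
  filesystemStorage:
    accessModes: [ReadWriteOnce]
    resources:
      requests:
        storage: 100Gi
  mirrorRegistryRef:
    name: mirror-registry-config     # ConfigMap carrying registries.conf and ca-bundle.crt
  osImages:                          # internally hosted CoreOS live ISO and rootFS
  - openshiftVersion: "4.8"
    version: "48.84.202107202156-0"  # placeholder RHCOS build
    url: "http://http-server.internal:8080/rhcos-live.x86_64.iso"
    rootFSUrl: "http://http-server.internal:8080/rhcos-live-rootfs.x86_64.img"
```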
Now that we have all the declarative ZTP APIs, we can use GitOps to deploy full sites and also use the power of OCM policies to apply configuration. The source of truth is a set of Git repos that contain a site plan. A site plan is a set of YAML files that contain all the different settings for the clusters that we want to deploy: parameters like cluster name, domain, and IP ranges; definitions for the hardware, like BMC credentials and static network definitions; and also the additional operators that you want installed. All of that, combined with Kustomize, produces the final output: a YAML definition of all the settings that we pass to OCM, which will apply them, create the cluster, and after that apply all the configuration via policies. So, for example, a 5G profile will include machine config settings for NTP and SCTP, the Performance Addon Operator for a real-time kernel and other features, SR-IOV if you want to interact with low-level network functions, and, in the end, PTP configuration is also required. Below is a link to a repo with all you need to use GitOps with ArgoCD and deploy ZTP with 5G profiles.
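To make the site-plan idea concrete, here is a minimal sketch, with an invented directory layout, of a per-site Kustomize overlay over shared CR templates like the ones shown earlier; the file and directory names are hypothetical.

```yaml
# sites/site-1/kustomization.yaml -- hypothetical layout: one overlay per site
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: site-1
resources:
- ../../base              # shared templates: ClusterDeployment, AgentClusterInstall, InfraEnv, BareMetalHost
patchesStrategicMerge:
- cluster-settings.yaml   # per-site values: cluster name, domain, IP ranges
- hardware.yaml           # per-site hardware: BMC address, MAC, static network config
```

Rendering each overlay (for example with `kustomize build sites/site-1`) yields the final YAML that is handed to OCM.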
Here are some of the features planned for the ZTP components. We want to support additional cluster topologies, like three masters, or three masters and more workers. Also, the user may want to add an additional worker node to an installed cluster as a day-2 operation; the flow would be very similar, only the starting point will be an existing cluster. Regarding late binding, the idea is that the persona that will be in charge of booting the hosts and maintaining the hardware inventory is not the same persona that will create the clusters. That way you can pre-boot the hosts and build pools of hosts that clusters can be built from. So with the late binding option, a new integration in OCM will allow the end user to select the hosts to create a new cluster. And finally, more improvements for scale support: what about 2,000 SNO nodes? Maybe more.

Here are some useful links for all the projects we mentioned: open cluster management, Hive, the assisted installer, Metal³. The ZTP repo provides a walkthrough that will guide you if you want to try ZTP. We also added a link to the assisted installer on cloud.redhat.com, and we encourage you to try it out. We also added some additional talks down below. This session was pre-recorded, but we will be available in the chat for any questions you may have right afterwards. Thank you for listening. You're welcome to reach out to us and share your feedback. Wishing you all successful installations. Thank you.

All right, thank you so much to — sorry, thank you so much to Fred and Nir for that talk. You guys can come on and answer questions. Yeah, I guess you can write in the chat too. Hey, Fred. Hi, any questions? It looks like we have a couple in the chat. David asked: the remote worker node will be asynchronously managed — will it be able to handle periods of disconnection?

Yes, so it would be handled at another layer, layer three, so it will be able to handle those kinds of disconnections. It's still work in progress for us, so I don't know all the details, but yes.

Okay, thank you. Someone asked: what is currently the place where most time is spent in the scale deployment? In other words, how much can it be sped up?

So, like I said, we did some scheduling between applying the different clusters — 10 seconds between each of them. Actually, one install will take about 40 minutes. And the bottleneck would be, I think, around etcd, so that the cluster doesn't get overloaded if we apply everything at the same time. So it really depends on how fast your storage is on the hub cluster, if you want etcd to be able to handle all these CRDs at the same time. So I guess that in the future, we will try to lower the stagger time between the configuration of the clusters and see what happens.

And then a question from a guest — I don't know who that is: what issues did you face when optimizing SNO for scale?

So SNO itself is just OpenShift with one server, so it's a different effort from what we did for the hub, for the hub side. For SNO itself, we defined that some operators need to behave differently when they know they are running on a single server. It has been a team effort to understand how the operators should behave, and what we want is to have minimal performance impact on the single node. On the hub side, there have been issues regarding caching: when you are working with controllers, caching all the CRDs in memory has a heavy performance impact. So we have been working on having the caches more filtered, and also on the performance of having the right disks for the hub cluster.

Thank you. David asked: is there a method for deploying SNO on a public cloud for testing, specifically from cloud.redhat.com?

Yes, SNO is available via the assisted installer in the cloud, the SaaS version of the assisted service. I will paste the link in the chat in a few minutes. It will be the same process: just configure some parameters, download the ISO, and you will need to boot your server yourself — and then you will be able to run SNO on your own server.

Okay, thank you. Yeah, if you want to post the link to the slides, that'd be awesome. I put them in the chat, also with the transcript of the talk. And if anyone would like to talk with Fred or Nir after this talk, you can feel free to meet them in the breakout room. Thank you for your time. Bye-bye.