In this session, I will introduce how to use Cluster API, Metal3, and the Cluster Autoscaler to scale on-premise bare metal Kubernetes clusters. Some of this content overlaps with the previous presentations, so please bear with me. My name is Xu Kun. I work for Fujitsu and currently focus on bare metal provisioning on Kubernetes. Today's contents: first some background, then how the three projects work, how to use them, and finally a demo.

First, the background. Nowadays we still need on-premise environments for reasons such as the need for full control of the entire environment, the need for a specific level of security and privacy, or compliance requirements. But we all know that managing an on-premise environment can be very tedious, because we need to do everything ourselves, so how can we reduce the operation cost? Since we are at a Kubernetes event, the solution I want to introduce today is the Kubernetes way: combining Cluster API, metal3.io, and the Cluster Autoscaler to automatically create and manage Kubernetes clusters provisioned on bare metal servers. Kubernetes, which we are all very familiar with, is a great tool for deploying applications, and the other three projects can be used to manage the infrastructure.

The first is Cluster API. It extends the Kubernetes API to create and manage Kubernetes clusters, and it is a Kubernetes community sub-project created by SIG Cluster Lifecycle. It runs on a Kubernetes cluster, and that cluster is called the management cluster. A cluster created and managed by Cluster API is called a workload cluster, and the infrastructure can be a cloud or bare metal. But no matter what infrastructure is used, there must be an infrastructure provider component that provisions machines and networks for our clusters. For bare metal, Metal3 can be that provider. It is a bare metal provisioning tool for Kubernetes; with it, we can install an OS onto a bare metal server in the Kubernetes way. The last one is the Cluster Autoscaler, which adjusts the size of a cluster according to the workload. When using Cluster API, it watches a workload cluster and sends a scale request to Cluster API when necessary. Combining the three projects, we can automatically create, manage, and scale bare metal Kubernetes clusters.

Before talking about the details of the three projects, there is a little bit of background knowledge, although I believe everyone is very familiar with it, which is the way Kubernetes works. Kubernetes uses objects to describe the state of the system and controllers to watch those objects and make the desired state happen. This is the fundamental mechanism of Kubernetes. We also know that Kubernetes provides an easy way to extend its API, which is the CRD, the custom resource definition. With CRDs, Kubernetes can understand new kinds of objects, and combined with custom controllers to control those new objects, we can make Kubernetes do what we want. The projects we are talking about today also use this mechanism.

Keeping this in mind, let's go further into how the three projects work. The first is Cluster API. Again, Cluster API extends the Kubernetes API to create and manage Kubernetes clusters. To do this job, Cluster API first defines two new CRDs, Cluster and Machine. A Cluster represents a workload cluster, and a Machine represents an infrastructure component hosting a Kubernetes node.
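To give you a rough idea of what these two kinds look like as custom resources, here is a minimal sketch; the API version, names, and values are only illustrative, not taken from the demo.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/18"]    # pod CIDR
    services:
      cidrBlocks: ["10.96.0.0/12"]      # service CIDR
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Machine
metadata:
  name: my-cluster-controlplane-0
spec:
  clusterName: my-cluster   # the workload cluster this machine belongs to
  version: v1.27.3          # Kubernetes version to run on this node
  bootstrap:
    configRef: {}           # reference to a provider-specific bootstrap config object (explained next)
  infrastructureRef: {}     # reference to a provider-specific infrastructure machine object (explained next)
```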
The component behind a Machine can be a virtual machine, a bare metal server, or even a container that works like a machine. With these two objects, we can describe the basic state of our cluster. Next, we need controllers to watch these objects, request resources from the infrastructure, and boot machines into Kubernetes nodes. To decouple from infrastructure-specific logic, Cluster API uses the idea of providers. There are three providers defined here: the infrastructure provider, the control plane provider, and the bootstrap provider. The infrastructure provider is responsible for requesting resources from the infrastructure, and metal3.io is an infrastructure provider. The bootstrap provider provides data to a machine for bootstrapping it into a Kubernetes node. The control plane provider is used to create a control plane for our cluster. Each provider needs to define some CRDs and implement the related controllers, so that the Cluster and Machine objects only need to interact with those provider CRDs and do not need to worry about any provider-specific logic. By the way, a blue arrow in the image means a reference to another object, and a black arrow means creating or modifying an object.

The Cluster object contains the in-cluster network configuration, like the pod CIDR and service CIDR, and also references to the infrastructure cluster and control plane objects. The infrastructure cluster object contains information about the infrastructure network the cluster runs on, plus some other configuration for the cluster; the example I use here is specific to metal3.io. The control plane object contains templates for the infrastructure machine and bootstrap config objects, and those templates will be used to create those objects. The infrastructure machine object represents a machine on a specific infrastructure. It contains the information the controller needs to request a machine from that infrastructure and then provision and bootstrap it into a Kubernetes node. The bootstrap part of that information actually comes from a secret generated from the bootstrap config object. The bootstrap config object contains the information about how to bootstrap a server into a Kubernetes node, typically through cloud-init. A secret storing that data will be generated and referenced by the infrastructure machine object. The last one is the Machine object. It contains references to the infrastructure machine object and the bootstrap config object, and it also knows the name of the cluster it belongs to.

There are many other CRDs defined in this architecture, but to create a Kubernetes control plane we mainly need three objects: the Cluster, the infrastructure cluster, and the control plane object. After those three objects are ready, the control plane provider will start to create the infrastructure machine object and the bootstrap config object, and then create a Machine object that references the previous two objects. After that, the machine controller will set an owner reference on the infrastructure machine and the bootstrap config object, to notify each provider that these objects are consumed by some machine. The bootstrap provider will then know it needs to generate bootstrap data for that machine, and the infrastructure machine controller will use that data, request the machine, and bootstrap it into a Kubernetes node.
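As a concrete illustration using the kubeadm-based reference implementation that I will mention next, a control plane object looks roughly like this. It is only a sketch: the version, names, and commands are placeholders, and field names can differ slightly between releases.

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: my-cluster
spec:
  replicas: 1
  version: v1.27.3                      # Kubernetes version for the control plane nodes
  machineTemplate:
    infrastructureRef:                  # template used to create the infrastructure machine objects
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: Metal3MachineTemplate
      name: my-cluster-controlplane
  kubeadmConfigSpec:                    # bootstrap template, rendered into cloud-init data
    preKubeadmCommands:
      - systemctl enable --now kubelet  # illustrative; a real manifest runs a longer list of commands and writes files
```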
Here are the sample implementations provided by the Cluster API community; they use kubeadm to bootstrap a machine. For creating worker nodes, Cluster API defines another two CRDs, MachineDeployment and MachineSet. Their logic is similar to the control plane provider, and the relationship between them is just like Pod, ReplicaSet, and Deployment in Kubernetes.

Next is metal3.io. Again, Metal3 is a bare metal provisioning tool for Kubernetes and also an infrastructure provider for Cluster API. There are two main components in Metal3: the Cluster API provider Metal3, or CAPM3, and the bare metal operator. CAPM3 is the infrastructure provider, and the bare metal operator is the bare metal provisioning tool. The CRDs defined by CAPM3 are the Metal3Cluster and the Metal3Machine, and the BareMetalHost object defined by the bare metal operator represents a bare metal server. It contains information such as the BMC details, boot mode, OS image, user data, and so on. The bare metal operator uses this data to access the bare metal servers, control their power, and do the provisioning work.

To use Metal3 along with Cluster API, we first need to create a BareMetalHost object for each bare metal server. These BareMetalHost objects are the server inventory, ready for provisioning Cluster API machines. Then we need to create the Cluster API objects that will generate machines for us; here I use a MachineSet as the example, and we also need to create the related Metal3MachineTemplate and Metal3DataTemplate objects. The Metal3MachineTemplate is used to generate the Metal3Machine, which is a consumer of a BareMetalHost object, and the Metal3DataTemplate is used to generate the network data and metadata used by cloud-init when provisioning our bare metal server. Like we talked about before, the MachineSet controller will use those templates to create a Metal3Machine and a KubeadmConfig object, and after that a Machine object will be created with references to those two objects. Then the machine controller sets the owner references, and each provider starts its work, which here means generating the secrets that store our bootstrap data. After all the secrets are ready, the Metal3Machine controller chooses a suitable BareMetalHost and sets part of its spec, including the image, the references for the user data, metadata, and network data, and also the consumer. Next, the bare metal operator detects those changes and starts the provisioning work using that data.

About provisioning: the bare metal operator uses the OpenStack Ironic project as its backend. There is an Ironic server running along with the bare metal operator, accepting requests from it. Ironic accesses the bare metal server through its BMC, changes the boot option, powers on the server, boots a RAM disk, and runs an agent called the Ironic Python Agent, or IPA. The IPA then talks with Ironic, inspects the hardware, and writes the data to the hard disk. This is the provisioning workflow; it shows how the image is written to the server when using PXE boot. After the image is written to the disk, the server reboots, and if we set the correct bootstrap data, a Kubernetes node will be initialized.
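Just to give you a feel for the server inventory, a BareMetalHost looks roughly like this. It is a minimal sketch where the addresses, MAC, and URLs are placeholders; when used with Cluster API, the image and user data fields are filled in by the Metal3Machine controller once the host is chosen.

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: node-0
spec:
  online: true                          # let the operator manage the power state
  bootMode: legacy                      # or UEFI, depending on the server
  bootMACAddress: "00:11:22:33:44:55"   # NIC used for PXE booting
  bmc:
    address: ipmi://192.168.111.1:6230  # BMC endpoint used to control the server
    credentialsName: node-0-bmc-secret  # Secret holding the BMC username and password
  # image and userData are normally set by CAPM3 when this host is consumed:
  # image:
  #   url: http://images.example.com/node-image.qcow2
  #   checksum: http://images.example.com/node-image.qcow2.md5sum
---
apiVersion: v1
kind: Secret
metadata:
  name: node-0-bmc-secret
type: Opaque
stringData:
  username: admin
  password: password
```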
Okay, the last one is the Cluster Autoscaler. It can adjust the size of a Kubernetes cluster according to its workload, and to implement this, the core mechanism it uses is the node group. A node group is a set of nodes that have the same capacity and the same set of labels. The Cluster Autoscaler defines an interface, also called node group, to control a node group. This interface has methods like increase size, delete nodes, and so on, and it is implemented by each cloud provider; Cluster API is one of them. The Cluster Autoscaler checks the pods deployed on the target cluster periodically. If there is a pod whose status is Pending because of a lack of resources, the Cluster Autoscaler will choose a node group and calculate whether, if that node group were scaled up, the new nodes could host the pending pods or not, and if so, how many new nodes are needed. After the calculation, the Cluster Autoscaler calls the increase size method to scale up that node group, requests are sent to the related cloud, and the new machines are created and booted into Kubernetes nodes. Scaling down follows similar logic: the Cluster Autoscaler checks the resource usage of each node to see if there are nodes that are not needed anymore, and if such nodes exist, it starts scaling down by calling the delete nodes method. Cluster API is one of the cloud providers, so when using Cluster API, a node group is actually a MachineSet or MachineDeployment object, and the scaling job is done by changing the replicas field in their spec. Because the autoscaler needs to not only check the pods running on the workload cluster but also modify a MachineSet or MachineDeployment on the management cluster, the Cluster Autoscaler needs access to both of the two clusters.

Okay, the next part is about how to use the three projects. The Cluster API community created a tool called clusterctl, which can deploy the related CRDs and controllers and also generate YAML files for objects like the Cluster. We can use the clusterctl init command to deploy the Cluster API core components and each provider. To deploy the bare metal operator and Ironic, the Metal3 community provides a script, deploy.sh. This script deploys the bare metal operator and Ironic in the cluster as pods. We can also deploy Ironic outside of the cluster using another script, run_local_ironic.sh. But to run either of them, there are some variables that need to be configured; here are a part of them. If using deploy.sh, these variables are defined through one of these two files, and for run_local_ironic.sh they are defined through environment variables, or we can use the default values defined in the script itself. After all the components are ready, we just need to create the objects: the Cluster, KubeadmControlPlane, MachineDeployment, KubeadmConfigTemplate, Metal3Cluster, Metal3MachineTemplate, Metal3DataTemplate, and also the BareMetalHosts. With all the needed objects created, the provisioning job starts automatically.

About the Cluster Autoscaler, it can be deployed using Helm; the chart can be downloaded from GitHub. When installing, we also need to set some specific values to specify Cluster API as the cloud provider, and to specify how the Cluster Autoscaler can access the workload cluster and the management cluster. The cluster API mode here defines how the Cluster Autoscaler accesses the clusters, and in this example I am using the kubeconfig-incluster mode, which means the autoscaler runs in the management cluster, uses a kubeconfig to talk to the workload cluster, and uses in-cluster authentication to access the management cluster. The kubeconfig for the workload cluster is specified through the cluster API kubeconfig secret value, which is the name of a secret stored in the management cluster. To control which MachineSets or MachineDeployments should be scaled, we need to add two annotations to those target objects; the Cluster Autoscaler will monitor any MachineSet or MachineDeployment that has both of these annotations.
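Putting those settings together, the Helm values and the annotations look roughly like this. This is just a sketch based on the upstream cluster-autoscaler chart; the secret name and the size limits are placeholders, not necessarily the values from my environment.

```yaml
# values.yaml for the cluster-autoscaler Helm chart
cloudProvider: clusterapi
clusterAPIMode: kubeconfig-incluster                # kubeconfig for the workload cluster, in-cluster for the management cluster
clusterAPIKubeconfigSecret: my-cluster-kubeconfig   # secret in the management cluster holding the workload kubeconfig
---
# annotations on the MachineDeployment (or MachineSet) that should be scaled
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: my-cluster-workers
  annotations:
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "3"
```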
Okay, finally, the demo part. In the demo, I will first create a one-master, one-worker cluster using two bare metal servers, and then scale up that cluster. This image shows the network of the entire environment. The management cluster needs access to the BMCs, to the provisioning network, which is the network Ironic runs in, and also to the bare metal network, which is the network the workload cluster's Kubernetes runs in.

Because I am using bare metal and provisioning a bare metal server is very time consuming, this is actually a video recording, and I have already created all the components. You can see here, in the capi-system namespace, there is the Cluster API controller manager, and there are the kubeadm control plane controller and the kubeadm bootstrap controller, and also the CAPM3 controller manager, the bare metal operator, and Ironic. I have created three BareMetalHosts; their state is Available. If I get the details of a BareMetalHost, you can find the hardware information in its status, for example the CPU and the storage.

Next we need to create our cluster and the control plane. The first is the Cluster object. It looks like this: it has the cluster network, the in-cluster network configuration, and the control plane reference and the infrastructure reference. Next is the KubeadmControlPlane object. It is a little long; you can see here we have an infrastructure reference to a Metal3MachineTemplate, which will be used to generate the Metal3Machine objects, and we also have a template for the bootstrap data, in cloud-init format. I have defined a lot of commands to run and also a lot of files to be created on the server. The next one is the Metal3MachineTemplate. In the template we have the data template, used to generate the network data and the metadata, and also the image information for the server. And here is the Metal3DataTemplate; it contains the information about the metadata and the network data. The last one is the Metal3Cluster. It only contains the control plane endpoint and a boolean value indicating that our cluster has no cloud provider.
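To give you a rough picture of the Metal3 objects I just walked through, they look something like this. Again it is only a sketch: the API version, names, image URLs, and endpoint address are placeholders rather than the exact values from the demo.

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
  name: my-cluster-controlplane
spec:
  template:
    spec:
      dataTemplate:
        name: my-cluster-controlplane-template   # Metal3DataTemplate that renders the metadata and network data
      image:
        url: http://172.22.0.1/images/node-image.qcow2
        checksum: http://172.22.0.1/images/node-image.qcow2.md5sum
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3Cluster
metadata:
  name: my-cluster
spec:
  controlPlaneEndpoint:
    host: 192.168.111.249   # API server endpoint on the bare metal network
    port: 6443
  noCloudProvider: true     # no external cloud provider for this cluster
```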
After creating all of those objects, we can get the cluster and see that its phase is Provisioning, because the controller is checking whether the infrastructure cluster is ready or not. If we get the KubeadmControlPlane object, there is nothing in its status yet, because the cluster is not provisioned. A few seconds later, because our Metal3Cluster, the infrastructure, is ready now, the cluster goes to the Provisioned phase, and we can see the KubeadmControlPlane has one replica, one updated, and one unavailable, so it is in the initializing phase. We can see a new Machine has been created and is in the Provisioning phase, and the Metal3Machine was also created, about 30 seconds ago. One of the BareMetalHosts has been chosen and it is provisioning. The power is on, and the consumer is our Metal3Machine object.

After almost 15 minutes in my environment, the OS image has been written to the bare metal host: you can see here that the state of the BareMetalHost is now Provisioned, which means all the data has been written to the server. But if we get the Metal3Machine, it says not ready; that is because the server has just rebooted, the cloud-init commands are running now, and Kubernetes is not up yet. After almost another two minutes, the Machine has a provider ID set and its phase is Running. The Metal3Machine has the same provider ID, and ready is true. The provider ID is generated by Metal3 automatically and copied to the Machine object, and once that provider ID is set, the machine is considered running and ready. We can see the control plane is now initialized and the API server is available. Now we can access our server and get the node. The kubeconfig for our workload cluster is also stored in the management cluster as a secret, my-cluster-kubeconfig, and we can use it to access our cluster. I have saved the kubeconfig, and I can get the nodes and also the pods.

The next step is to create a MachineDeployment to create a worker for our cluster. The MachineDeployment has the bootstrap template, used to create a bootstrap config, and a template used to create a Metal3Machine. It is similar to the control plane, so I will skip the details. After these objects are created, the MachineDeployment is in the ScalingUp phase, a new Machine is created, and a new Metal3Machine is created as well. The Machine object has references to the bootstrap config and the Metal3Machine object, another BareMetalHost has been chosen, and it is in the Provisioning phase. After another 20 minutes or so, the worker is ready: it has its provider ID set, the MachineDeployment is in the Running phase, and if we get the nodes, there are two nodes in our workload cluster.

Next I will use this script to create the autoscaler. Now the autoscaler is created, and you can see the pod is running. To trigger the scale-up, I need to make sure our MachineDeployment has the right annotations, which it does. Then, to trigger the scale-up, I create a deployment. The pods deployed by this deployment do nothing but request four CPU cores each. Because the worker has 20 CPUs, the pod is running. If I increase the replicas to six, there will be one pod in the Pending phase, and you can see here there is a pending pod in the workload cluster. Then I can see the MachineDeployment is scaling up again, a new Machine is created, and the last BareMetalHost is chosen for provisioning.

Okay, that's all. The last slide is some information about the community. Thank you all.