Hello everyone, welcome to this session. My name is Shu Kun-Sung. I work for Fujitsu as a software engineer focusing on bare metal management. Today, I will talk about how to manage on-premises infrastructure in a Kubernetes way, using two open source projects, Cluster API and Metal3.io. I will start with some background and then talk about the details of Cluster API and Metal3.io; the word Metal3 is pronounced "metal cubed." I will also show you a demo of how to use them. At the end, I will introduce some efforts we are making to enhance Metal3 by expanding its scope to also manage network devices.

Okay, let's start. Although the cloud is very popular when we consider creating something new or migrating our systems, there are still lots of systems that rely on on-premises environments for reasons such as security. Those on-premises environments are still important, but managing them can be tedious, so we need to reduce the operation costs through automation. For example, if we could increase the number of machines without any manual operation, just like using a public cloud, that would be very helpful. There are already many solutions for this task, I believe. The one I want to introduce today is the Kubernetes way. This solution combines three projects: Kubernetes, Cluster API, and Metal3.io.

Okay, let's start with what these three projects are. Kubernetes is a production-grade container orchestration tool. It automates the deployment and management of containerized applications and provides a declarative API. Users can use that API to declare the desired state of their applications, and Kubernetes will make sure that the actual state of the system matches the desired state at all times, and applications deployed by Kubernetes can be kept running with no downtime. Kubernetes has many other convenient features such as service discovery, load balancing, storage orchestration, and so on. It provides a very convenient way to manage containers.

But what about the cluster itself? Managing the life cycle of clusters can be tedious. You need to consider scaling, upgrading, and even migrating, and it becomes more complicated when different infrastructures are used. What if we could manage clusters just like we manage containers? Why not use Kubernetes to manage the cluster itself? Users would just need to declare the desired state of the cluster, and Kubernetes would handle everything for us. That motivation brings us Cluster API.

Cluster API is a Kubernetes community sub-project created by SIG Cluster Lifecycle. It extends Kubernetes and provides declarative, Kubernetes-style APIs for cluster management. With Cluster API, infrastructure such as clusters and machines can be treated as first-class Kubernetes objects. There are some new concepts in Cluster API; two of them are the management cluster and the workload cluster. A management cluster is a cluster that runs the Cluster API components and manages the life cycle of workload clusters. A workload cluster is a cluster created on user-specified infrastructure that runs user applications. With a management cluster running the Cluster API components, users, for example an infrastructure management team, can create and manage a workload cluster for each tenant by declaring the cluster's desired state in a YAML file and passing that file to Kubernetes, just like deploying an application. Cluster API can be used with different infrastructures, for example IaaS provided by cloud vendors such as AWS and so on.
But it does not directly talk to the infrastructure. Instead, it uses the idea of a provider to interact with the underlying infrastructure. The provider knows how to request resources, for example networks and machines, from the IaaS platform. We will talk about the details of providers later.

Today we are talking about on-premises environments. Unlike the cloud, such an environment needs to be managed by its owner. Although Cluster API can easily create and manage clusters, for on-premises we still need someone to manage the entire environment for us, someone who can provision the bare metal servers and respond to the requests from Cluster API. Metal3.io can be this someone. Metal3 is an open source bare metal provisioning tool that also uses Kubernetes-style APIs. With Metal3, bare metal servers can also be managed in a Kubernetes way. It has also implemented an infrastructure provider for Cluster API, so that the workload clusters managed by Cluster API can be created on bare metal servers.

So, with the help of Cluster API and Metal3.io, we can manage our on-premises bare metal infrastructure through Kubernetes, just like how we use Kubernetes to manage containerized applications. Next, I want to show you the details of how these two projects make all this cluster management possible. But before that, let me talk a little more about Kubernetes itself, because there is some background knowledge that will help us understand Cluster API and Metal3.io.

First, we are talking about managing Kubernetes clusters, but what is a Kubernetes cluster? A Kubernetes cluster consists of a control plane and at least one worker node, which runs the containerized applications. The control plane consists of mainly four components: kube-apiserver, kube-controller-manager, kube-scheduler, and etcd. The API server exposes the Kubernetes API, the controller manager runs controller processes, the scheduler watches pods and finds a node for unscheduled pods to run on, and etcd is a key-value store that stores all the cluster data. On a worker node, there are three components: kubelet, kube-proxy, and the container runtime. The kubelet creates pods and makes sure every container is healthy, and kube-proxy maintains network rules on the worker node. So creating a Kubernetes cluster actually means running the control plane components somewhere and running kubelet and kube-proxy on each worker node, and all the worker nodes should have a container runtime.

Next is how Kubernetes works. Kubernetes keeps the system at the desired state at all times. To do that, we first need a way to describe the system, and that is done using objects defined by Kubernetes. An object is a persistent entity in the Kubernetes system that represents some state of the cluster. For example, a Pod is an object representing the state of containers. A Pod has all the information for running a container, like which image to use, which configuration should be passed, and so on. Almost every Kubernetes object includes two fields: spec and status. The spec is the desired state defined by the user, and the status is the current state detected and updated by Kubernetes itself. With objects, we can describe the system. The next task is to make the desired state happen. Kubernetes does this through controllers. A controller is a process that watches at least one kind of Kubernetes object and runs an endless control loop called reconciliation.
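To make the spec and status split concrete, here is a minimal Pod manifest; the user writes the spec, and Kubernetes fills in and keeps updating the status (the image and names are only examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello
spec:                      # desired state, written by the user
  containers:
  - name: web
    image: nginx:1.25      # which container image to run
    ports:
    - containerPort: 80
# status:                  # current state, detected and updated by Kubernetes,
#   phase: Running         # e.g. the pod phase and assigned IP addresses
```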
In the reconciliation loop, a controller tries to move the current state of a Kubernetes object closer to its desired state. Objects and this reconciliation loop are fundamental to how almost everything works in Kubernetes.

The last thing, before we move to the details of Cluster API and Metal3, is the way to extend the Kubernetes API. Although there are already many kinds of objects and related controllers in Kubernetes, which enable it to manage everything related to containers, we may still expect Kubernetes to do more. We may need to extend its API to add a new kind of object to Kubernetes and reconcile it. Kubernetes provides an easy way to do this extension: the CRD, or Custom Resource Definition. As we can tell from its name, a CRD is a definition of a new resource created by the user. Here, a resource is an API endpoint that stores a collection of objects of a certain kind; for example, the pods resource contains a collection of Pod objects. A custom resource, then, is an API endpoint that stores objects of a new, user-defined kind, which means an extension of the Kubernetes API, and a CRD is one way to add custom resources. By creating a CRD, Kubernetes can understand the new kind of object, and users just need to create their own controller to handle the related reconciliation loop. Then Kubernetes can be used to manage the new objects. The CRD itself is also a Kubernetes object; you can list them with kubectl get crd to see how many new APIs have been added to a cluster. And as you have probably guessed, Cluster API and Metal3 use CRDs to extend Kubernetes so that it understands the API kinds related to cluster life cycle management and bare metal provisioning.

Okay, now we know how Kubernetes works and how it can be extended. Let's move to the next part: what Cluster API and Metal3 have done to implement cluster life cycle management and bare metal provisioning.

First, how Cluster API works. As we talked about before, Cluster API manages the life cycle of Kubernetes clusters. It adds, and relies on, several new CRDs and related controllers. First come the Cluster and Machine CRDs. The Cluster object represents a workload cluster that needs to be created and managed. It contains the network configuration for this cluster, like the pod CIDR and so on. The Machine object represents an infrastructure component hosting a Kubernetes node; that infrastructure component can be a virtual machine or a bare metal server, for example. Each Machine object has a reference to the Cluster it belongs to. Machine objects are immutable: once a Machine is created, it cannot be updated except for some metadata and its status. If a Machine needs to be updated, a new Machine should be created to replace the current one, and then the current one should be deleted. For this reason, there are some other CRDs defined to handle changes to Machines, for example the MachineDeployment object for handling updates and the MachineSet object for handling scaling, just like the Deployment and ReplicaSet that exist for Pods in Kubernetes.

With the help of the Machine and Cluster objects, we can describe the basic state of our cluster. Next, some controllers need to run reconciliation loops to create and manage the target cluster. Their job should include at least requesting resources like machines and networks from a specific infrastructure and booting a machine into a Kubernetes node.
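Before moving on to the providers, here is a rough sketch of what a CustomResourceDefinition looks like, since both Cluster API and Metal3 rely on this mechanism. The kind here is made up for illustration and is not one of the actual Cluster API or Metal3 CRDs:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com          # must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: Widget
    singular: widget
    plural: widgets
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:                      # desired state of a Widget, defined by the user
            type: object
            properties:
              size:
                type: integer
```

Once such a CRD is applied, the API server can store Widget objects, the new definition shows up in kubectl get crd, and a custom controller reconciles those objects, which is exactly the pattern Cluster API and Metal3 follow for their own kinds.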
Because different infrastructures usually expose different APIs to interact with, and there are also multiple ways to bootstrap a machine into a Kubernetes node, it is obviously not appropriate to let the Machine controller and the Cluster controller handle all of these tasks. So the Cluster API community chose the idea of providers. There are three kinds of providers defined: the infrastructure provider, the control plane provider, and the bootstrap provider. The infrastructure provider, which we talked about before, is responsible for requesting resources from the infrastructure. The bootstrap provider provides the means for a machine to become a Kubernetes node. And the control plane provider is used to create the control plane for the cluster.

For each provider, Cluster API defines some new CRDs that the provider should supply, together with the related controllers, so that the Cluster API core components, the Cluster and Machine objects, just need to interact with those objects and do not need to worry about any provider-specific details. The infrastructure cluster object contains the underlying network information for the cluster, the information the provider requested and got from the infrastructure, for example the address of the Kubernetes API endpoint. The infrastructure machine object has the information necessary to specify a machine, like the OS image, hardware spec, and so on. The control plane object represents the control plane of a cluster. And the bootstrap config object contains the configuration needed by the bootstrap provider to boot a machine into a Kubernetes node.

With all these providers and the Cluster API core components installed, to create a workload cluster, users just need to first create a Cluster object containing references to an infrastructure cluster object and a control plane object, then create that infrastructure cluster object and that control plane object. The infrastructure provider will then request the network resources for this cluster, and the control plane provider will create the related bootstrap config object and infrastructure machine object to initialize the control plane for this cluster. After the infrastructure machine object is created, the infrastructure provider will request a machine from the underlying infrastructure, and the bootstrap provider knows how to boot that machine into a Kubernetes node using the information described in the bootstrap config object. Then the user can create a Machine object for this cluster containing references to an infrastructure machine object and a bootstrap config object, and just like for the control plane, a new machine will be requested and booted into a Kubernetes worker node.

While the infrastructure provider needs to be implemented by the vendor who provides that infrastructure, for the other two providers the community has provided a simple implementation of each: the kubeadm control plane provider and the kubeadm bootstrap provider. Both of them use kubeadm to make a machine become a Kubernetes node, and the actual CRDs are KubeadmControlPlane and KubeadmConfig. As we have talked about before, Metal3.io can act as an infrastructure provider. Actually, there are two main components in Metal3.io: one is Cluster API Provider Metal3, and the other is the Bare Metal Operator. Cluster API Provider Metal3, which is usually abbreviated as CAPM3, is the Cluster API infrastructure provider part, and the Bare Metal Operator is used to manage and provision the bare metal servers.
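To make these relationships concrete, the core Cluster object of a Metal3-backed workload cluster is wired to the two providers roughly like this (a minimal sketch; the API versions and CIDRs depend on the installed releases and your environment):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: test1
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/18"]      # pod CIDR for the workload cluster
    services:
      cidrBlocks: ["10.96.0.0/12"]
  controlPlaneRef:                        # control plane provider object (kubeadm)
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: test1
  infrastructureRef:                      # infrastructure cluster object (CAPM3)
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: Metal3Cluster
    name: test1
```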
Okay, now let's talk about how Metal3.io works. So far we know that CAPM3 will respond to Cluster API requests: it will request machines for the Cluster API Machine objects or the control plane objects. And the Bare Metal Operator responds to those requests by provisioning a bare metal server to make it suitable to host a Kubernetes node.

So what has Metal3.io done to make this bare metal management happen? Like Cluster API, Metal3.io also uses the Kubernetes CRD feature to extend the Kubernetes API. CAPM3 provides several CRDs; among them are Metal3Cluster as the infrastructure cluster object and Metal3Machine as the infrastructure machine object, as Cluster API requires. And the Bare Metal Operator provides a CRD called BareMetalHost for bare metal management. A BareMetalHost object represents one bare metal server. It contains the information about that server, such as its BMC address, hardware details, and so on. Once a BareMetalHost is created, the Bare Metal Operator will try to manage that server through its BMC and inspect its hardware. After the inspection completes, the hardware details of the BMH, which is short for BareMetalHost, will be updated and its status will change to ready, which means the bare metal server is ready to be consumed. If a Metal3Machine object is created, CAPM3 will then try to find a suitable BareMetalHost from the currently ready BareMetalHosts to host this Metal3Machine. If CAPM3 finds such a BareMetalHost successfully, it will copy the OS image info and user data from the Metal3Machine to the BareMetalHost. The Bare Metal Operator will then notice that such data has been added to this BareMetalHost object, which means this BareMetalHost has been consumed by someone, so the Bare Metal Operator will start to provision the bare metal server.

Next is provisioning. Actually, the Bare Metal Operator does not directly interact with the bare metal servers. Instead, it uses other software as a backend; for now, it uses the OpenStack Ironic project to do the provisioning work. There is an Ironic server running in the management cluster, or anywhere else, as long as the Bare Metal Operator can access it. Ironic accepts requests from the Bare Metal Operator, like inspecting a bare metal server or provisioning it. Ironic uses PXE boot to do all these things. It will access the bare metal server, change the boot option to UEFI PXE boot, power on the server through its BMC (here BMC means Baseboard Management Controller), boot a RAM disk from one NIC of the server, and run an agent, called the Ironic Python Agent, on that RAM disk. For inspection, the Ironic Python Agent will inspect the hardware of the bare metal server and send the results back to the Ironic server side. For provisioning, the Ironic Python Agent will write the OS image and config drive to the bare metal server.

This image shows the details of the provisioning workflow. It starts with the Bare Metal Operator registering the node to Ironic. Ironic then powers on the bare metal server through its BMC, and the bare metal server goes through a DHCP query, downloads the Ironic Python Agent, runs the introspection, and sends the introspection report to Ironic, and Ironic updates the bare metal status to the Bare Metal Operator. When a node is to be provisioned, the Bare Metal Operator sends a deploy-node request to Ironic, and Ironic tells the Ironic Python Agent to download the image.
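The BareMetalHost object that drives this whole workflow is, roughly, the following (a minimal sketch; the BMC address, MAC address, and credentials are placeholders):

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: node-0
spec:
  online: true
  bootMode: UEFI
  bootMACAddress: "00:11:22:33:44:55"    # NIC used for PXE boot
  bmc:
    address: ipmi://192.168.111.1:6230   # how to reach the server's BMC
    credentialsName: node-0-bmc-secret   # Secret holding the BMC username/password
---
apiVersion: v1
kind: Secret
metadata:
  name: node-0-bmc-secret
type: Opaque
stringData:
  username: admin                        # placeholder credentials
  password: password
```

Once such an object exists, the Bare Metal Operator registers it with Ironic, inspection fills the hardware details into its status, and the host eventually reports itself as ready to be consumed.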
Back to the workflow: the Ironic Python Agent running on the bare metal server will then download the OS image and write it to disk. After that, it reports to Ironic that it is done, Ironic updates the bare metal status, and the server is rebooted. After rebooting, cloud-init completes, and the server either joins the cluster as a Kubernetes node or runs kubeadm init to initialize the control plane.

So how do we use these two projects? When using Cluster API and Metal3.io to manage a bare metal environment, the overall workflow could be: first, prepare the bare metal environment, including wiring and configuring the network. Next, create a management cluster; make sure Cluster API, CAPM3, the Bare Metal Operator, and Ironic are running correctly, create the other necessary services such as a DHCP server and an HTTP server, and prepare the related image files. Then create the BareMetalHost inventory and create all the related resources we need to create a cluster, such as the Cluster, Metal3Cluster, Machine, Metal3Machine, and so on. After that, the provisioning starts automatically: CAPM3 finds a suitable BareMetalHost for each Metal3Machine, the Bare Metal Operator starts to provision the bare metal server using the information described in the BareMetalHost, and the bare metal servers reboot and then join the cluster.

Next, I will show you a demo of using Cluster API and Metal3.io. In the demo, I will use two bare metal servers to create a cluster with one master and one worker. This image shows the network configuration of the whole environment. There are three networks. The management network is used to access the BMCs of all the bare metal servers. The provisioning network is for PXE boot and should be able to reach Ironic. The bare metal network is used as the default network for the bare metal servers after provisioning finishes; the target Kubernetes cluster lives in this network. All the networks are accessible from the management cluster.

Now let's start the demo. The management cluster has already been created and all the components we need are installed, and currently no BareMetalHost exists. So let's create one for our control plane. Okay, the BareMetalHost is created. Its state starts from registering and then inspecting; you can see here the state is now inspecting. In the spec, we set the BMC access info and the MAC address of the NIC used for PXE boot. After inspection, the BareMetalHost will become ready, and because of the time limitation we won't wait for that; we'll just skip ahead to see what happens after the inspection finishes. Okay, now the inspection has finished and the BMH state is available, which means it is ready to be consumed by some other component. You can see here that all the hardware details have been updated automatically: CPU information, vendor information, storage information, and so on.

Okay, now we have a BareMetalHost ready for use. Next, we need to create our Cluster and control plane objects. The Cluster object looks like this: it has the cluster network configuration, a reference to a KubeadmControlPlane object, and also a reference to a Metal3Cluster object. The Metal3Cluster has the information about the control plane endpoint that is provided to our cluster. Now let's create the Cluster and Metal3Cluster objects; the KubeadmControlPlane I will create later.
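The Metal3Cluster referenced from that Cluster is quite small; it mainly carries the control plane endpoint (a sketch with a placeholder address):

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3Cluster
metadata:
  name: test1
spec:
  controlPlaneEndpoint:        # where the workload cluster's API server will be reachable
    host: 192.168.111.249      # placeholder address on the bare metal network
    port: 6443
  noCloudProvider: true
```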
I'm using the scripts provided by the Metal3 community for testing purposes, and also because of the time limitation, let's just skip the creation part. Okay, the creation is complete. We have a Cluster created and we can check its status. Here you can see the phase is provisioned; that means we have an infrastructure cluster provided. No matter whether the control plane exists or not, if there is an infrastructure cluster provided, the Cluster is in the provisioned phase. We can also check our Metal3Cluster.

Next is the KubeadmControlPlane. This KubeadmControlPlane object is used to create our control plane, and the object looks like this. It first has a kubeadmConfigSpec, whose content is a template for generating KubeadmConfig objects. A KubeadmConfig object will then be used by a machine to turn that machine into a Kubernetes control plane node. You can see here in the template there are a lot of files defined; those files will be passed to the machine during the provisioning phase. There are also postKubeadmCommands and preKubeadmCommands defined, and of course the init configuration and join configuration, for kubeadm to init or join the cluster as a control plane. Finally, there is a machine template defined here. The machine template is used to generate Machine objects, and you can see it has a reference to an infrastructure object, which here is the Metal3MachineTemplate. That means each Machine will have an infrastructure machine, which is a Metal3Machine, and that Metal3Machine will be generated from this Metal3MachineTemplate. After the Metal3Machine is created, CAPM3 will automatically choose a ready BareMetalHost and set the image info on it, and the Bare Metal Operator will then start to provision that bare metal host. The Metal3MachineTemplate itself looks like this; it's quite simple, it just has a data template and the image info.

So I just use the scripts provided by the community to create this KubeadmControlPlane object and the Metal3MachineTemplate. You can see here the KubeadmControlPlane is created and it belongs to the cluster test1; the ready, updated, and available replicas are not there yet. And this is the detailed status of the KubeadmControlPlane object: it has some conditions to determine whether the control plane is ready or not, and also other fields like replicas and available replicas, the status of the control plane. We can also see that a new Machine has been created by the controller automatically, and the Machine is now in the provisioning phase; that means we are currently provisioning the bare metal server. A new KubeadmConfig object has been created too; the newly created machine will use this KubeadmConfig to boot into a Kubernetes master node. A Metal3Machine has also been created. You can see here in the Metal3Machine status that some addresses have been assigned; those addresses will be set on the bare metal server, so after provisioning finishes we can access the server using these addresses. Now the BareMetalHost is in the provisioning state, and we just need to wait for provisioning to finish, so let's skip this waiting time. Okay, provisioning has finished and the BareMetalHost is now in the provisioned state. During the waiting time, I also added a new BareMetalHost, because I want to add a new worker to the cluster later.
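For reference, the KubeadmControlPlane and Metal3MachineTemplate shown on screen have roughly the following shape. This is heavily abridged; the Kubernetes version, commands, and image URLs are placeholders rather than the exact values used in the demo:

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: test1
spec:
  replicas: 1
  version: v1.23.0                                  # placeholder Kubernetes version
  machineTemplate:
    infrastructureRef:                              # Machines get Metal3Machines from this template
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: Metal3MachineTemplate
      name: test1-controlplane
  kubeadmConfigSpec:                                # template for the generated KubeadmConfig objects
    files: []                                       # files written to the machine during provisioning
    preKubeadmCommands:
      - echo "runs before kubeadm"                  # placeholder
    postKubeadmCommands:
      - echo "runs after kubeadm"                   # placeholder
    initConfiguration: {}                           # kubeadm init settings for the first node
    joinConfiguration: {}                           # kubeadm join settings for additional nodes
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
  name: test1-controlplane
spec:
  template:
    spec:
      dataTemplate:
        name: test1-controlplane-template            # placeholder data template name
      image:
        url: http://172.22.0.1/images/node-image.qcow2              # placeholder OS image
        checksum: http://172.22.0.1/images/node-image.qcow2.md5sum  # placeholder checksum
```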
The Machine, as you can see here, is in the running phase and has a provider ID set in its spec. The related Metal3Machine has the same provider ID as the Machine, and it is in the ready state. We can access the bare metal server to see what happened with our cluster. Now we can see that the workload cluster has been created, and its node status is NotReady; that is because we have not installed a network add-on in the cluster yet.

Okay, we have a workload cluster. Now we can add a worker to it. Again I'm using the script created by the community. This script creates a MachineDeployment object for us, as well as the related KubeadmConfigTemplate object and Metal3MachineTemplate object. The MachineDeployment is used to create and manage the worker machines for us. You can see the replicas here is one; that means there will be only one Machine created by this MachineDeployment. The template means that the Machines generated by this MachineDeployment will use this spec. To generate such Machines, the MachineDeployment needs to specify a KubeadmConfigTemplate and also a Metal3MachineTemplate. The KubeadmConfigTemplate just defines the join configuration for the worker to join the cluster, and the Metal3MachineTemplate is just like the one for the control plane: it has some image info and a data template defined in it. After provisioning finishes, we can check that the MachineDeployment object is now in the running phase; it belongs to the test1 cluster, and it has replicas one, ready one, updated one, and unavailable zero. And of course the Machine is running and has a provider ID, the same provider ID as the Metal3Machine. The BareMetalHost is in the provisioned state, and we can access the master node to see our nodes. That's all for the demo.

The last part is about our efforts to enhance Metal3. Up to now, you may have noticed that the infrastructure provider should request resources from the infrastructure, including network resources, and there should be someone replying to those requests. But currently all the network configuration is done manually; we are replying to the requests ourselves. That means Metal3 lacks a way to manage the related network devices. So we are currently trying to add a new operator to the Metal3 community. This new operator aims to manage the network devices adjacent to a bare metal server. The operator should understand the network configuration set by the user, find out which device the bare metal host is connected to, and configure that device correctly. Then we would not need to do the network configuration manually; we would just write the needed configuration into some new Kubernetes objects and tell the Metal3Machine that it needs to apply such configuration. After CAPM3 chooses the bare metal host, the network operator can do the whole network configuration job for us, like putting the bare metal server into the provisioning network before provisioning starts, putting it into the target cluster network, and also taking it out of the provisioning network for isolation before it joins the target cluster.

Here is a use case. With such an operator, it becomes possible to automatically do things like change the network structure of the entire environment.
For example, let's say you have a bare metal environment with some clusters deployed, say one for production and one for testing. They stay in different VLANs, and sometimes the production cluster gets such a heavy workload that it needs to be scaled up. As you have a limited number of servers, the new nodes temporarily added to the production cluster may be nodes taken from the other cluster. In that case, all the network configuration changes can be done automatically by such a network operator.

To implement such an operator, we abstract three new kinds of CRDs: network device, device port, and network configuration. Each network device represents a physical device, and a device port is one port on a specific device, so every device port belongs to a network device. A network device and its related device ports should be created by the administrator of the actual device. A network configuration is a configuration applied to some device port, and it will be created by the user. A BareMetalHost will have references to device ports so it knows which ports the bare metal server is actually connected to. When creating a machine, the user can specify a network configuration for that Metal3Machine. CAPM3 will then find a suitable BareMetalHost which is able to apply that configuration. Then, because the BareMetalHost knows which ports it is connected to, it is able to tell those device ports to apply the target network configuration by modifying the spec of the device port. The network operator will detect such a change and start to configure the physical device through some backend. There are multiple types of network devices; for now, we are focusing on physical switches, and the actual CRDs are SwitchPort and SwitchPortConfiguration. The backend used is Network Runner, which uses Ansible to interact with the switch. This image shows more details inside the network operator related to the switch.

Next, I will show you a demo of how the network operator works. In this demo, I will create a new BareMetalHost and show you that the network configuration for this new BareMetalHost is changed automatically. This is the target BareMetalHost I will create. In its ports section, there are two MAC addresses, each with a configuration and a switch port defined. That means the NIC of the bare metal server which has this MAC address is connected to the switch port node-11, and another NIC is connected to the switch port node-10. The node-11 port is the one we will use for PXE boot, so after creating the BareMetalHost, the network configuration for this port will automatically be set to this configuration, which is the SwitchPortConfiguration pxe. We have already created the two SwitchPorts and the SwitchPortConfiguration here. The SwitchPortConfiguration pxe looks like this: its spec says untagged VLAN 16, which means the PXE network will use VLAN 16. We have also created the SwitchPort objects; they are all idle now because we are not using them currently, and for node-11 you can see the spec is empty, which means there is no configuration applied to this switch port. We have also created the Switch object itself. In the Switch object spec, it defines two ports in the ports field: the switch port node-10 is actually the physical port named 0/34 on the physical switch, and node-11 is actually the port 0/46.
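As a rough illustration only, the objects shown in this demo might be written something like the sketch below. These CRDs are still a proposal, so the API group, version, and field names here are assumptions based on what was just described, not the final schema:

```yaml
# Hypothetical shapes; the actual schemas in the Metal3 proposal may differ.
apiVersion: metal3.io/v1alpha1              # assumed API group/version
kind: SwitchPortConfiguration
metadata:
  name: pxe
spec:
  untaggedVLAN: 16                          # the provisioning (PXE) network uses VLAN 16
---
apiVersion: metal3.io/v1alpha1              # assumed API group/version
kind: Switch
metadata:
  name: switch-0
spec:
  ports:
    node-10:
      physicalPortName: "0/34"              # physical port on the switch
    node-11:
      physicalPortName: "0/46"              # port used for PXE boot in the demo
```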
So next I want to show you the current state of the physical switch. Here is the current configuration of interface 0/46: it is shut down, and you can see there is no VLAN setting on it. Now we create the BMH. The BMH needs to be inspected by Ironic, so we need to put the target port into VLAN 16. You can see here the BMH has started registering, and the network configuration applied to the switch port node-11 is automatically set to the PXE configuration. Yes, here: you can see the configuration has now been set to the SwitchPortConfiguration pxe, and the status shows untagged VLAN 16. After a little while, you can see the actual state on the switch is now "switchport access vlan 16". That's all about how the network operator works. Yes, our BMH is now inspecting. The last thing is some community resources. And that's all. Thank you.