Okay, I'm sorry for the wait. Good morning everyone, and thanks for being here. My name is Lingxian Kong, and as you can maybe hear I'm a bit under the weather, so my voice may sound strange; please bear with me. Today I'm going to share with you a new way to manage Kubernetes clusters inside a Kubernetes cluster with OpenStack Magnum.

For those of you who don't know us, Catalyst Cloud is an OpenStack-based public cloud, and we have deployed three regions across New Zealand. We also do private cloud business. We don't sell OpenStack as a product; rather, we let the customer run OpenStack while we manage the cloud infrastructure for them as if it were one of our own regions, using the same software, the same deployment tools, and the same group of people with the experience we have gained over the past five years running a public cloud based on OpenStack. Most importantly, pretty much everything we do at Catalyst Cloud uses only open source software. All the code we develop is contributed upstream to benefit organisations both in New Zealand and worldwide, so we don't keep anything to ourselves. Of course, we also acknowledge that we have gained a lot from the open source community.

Catalyst Cloud is also the first CNCF Certified Kubernetes service provider in New Zealand, which is very important for our customers. First, it guarantees application portability and consistency when interacting with any installation of Kubernetes. Second, to remain certified, all vendors need to support the latest versions and features of Kubernetes yearly or more frequently, so customers can be sure they always have access to the latest features the community is working so hard to deliver.

These are the features supported in our Kubernetes service at Catalyst Cloud. Because we are based on OpenStack, you can see many features are implemented through integration with other OpenStack services. For example, we use Keystone as the unified authentication and authorization API for both OpenStack and the customer's cluster. We use Octavia to implement Kubernetes services of the LoadBalancer type, and we also developed an Octavia-based ingress controller for Kubernetes. We leverage Magnum itself to implement auto-healing and auto-scaling. And of course, all the work we have done so far has been contributed back upstream.

So, back to Magnum. I believe most of you here today already know what Magnum is, so I will just give a brief introduction in case you are missing some important details or the latest updates. In the beginning, Magnum was designed as the container orchestration engine for OpenStack, supporting Kubernetes, Docker Swarm, and Apache Mesos. However, as Kubernetes has become the de facto standard container orchestration platform, most contributions to Magnum today relate only to Kubernetes. As part of OpenStack, Magnum provides a RESTful API and can run in multi-tenancy mode. Most importantly, Magnum touches almost every OpenStack service: it uses Heat for cloud infrastructure orchestration, it creates virtual machines in Nova, it creates Cinder volumes for etcd storage, and it uses Neutron for networking. Octavia is heavily used in the Magnum context. First, Octavia is used to create a load balancer running in front of the Kubernetes API.
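To make the Octavia integration concrete, here is a minimal sketch of how a LoadBalancer-type Service maps to an Octavia load balancer; the deployment name and image are just placeholders for illustration.

    # Example workload exposed as a LoadBalancer-type Service; with the OpenStack
    # cloud provider enabled, Kubernetes asks Octavia to create the load balancer.
    kubectl create deployment hello --image=nginx
    kubectl expose deployment hello --port=80 --type=LoadBalancer

    # The EXTERNAL-IP assigned to the Service comes from the Octavia load balancer,
    # which is also visible through the OpenStack CLI.
    kubectl get service hello
    openstack loadbalancer list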
It can also be used to implement LoadBalancer-type services, and lastly it can serve as the backend of the ingress controller for Kubernetes, with Barbican handling the certificates for SSL termination. As I mentioned previously, the clusters created by Magnum are CNCF certified. Currently Magnum supports Kubernetes versions from v1.11 to v1.16, and as far as I know v1.17 support is on the way. Magnum also supports some advanced features like auto-scaling, auto-healing, and rolling upgrades.

The Magnum architecture is pretty much the same as when it was created, but with some significant changes. First, some API resources have been deprecated or removed. If you search for Magnum on the internet today, you can still see concepts like Bay and BayModel, but they have both been replaced with cluster and cluster template. Additionally, as host operating systems designed specifically for containerized workloads, both Fedora Atomic and CoreOS are supported in Magnum. But as you know, the company CoreOS was acquired by Red Hat last year, and a new container operating system named Fedora CoreOS was introduced, which combines Atomic and CoreOS but will deprecate both in the near future. Lots of things have changed in Fedora CoreOS and it is not backward compatible, so the upstream Magnum team is working very hard on Fedora CoreOS support. I believe in the next release Fedora CoreOS will be officially supported in Magnum, and Atomic and CoreOS will become history.

In Magnum, in order to create a cluster, we need to create a cluster template first. In the template, in addition to some necessary parameters like the key pair, image, network, or flavor, the user can define other things, such as whether or not the cluster should be exposed to the public. This is important because, as we know, this year there were some critical security bugs in Kubernetes that only affected publicly facing clusters, so it is better for the customer to create a private cluster to avoid such vulnerabilities. The most flexible thing in the template is the label definition, but the label here is a different thing from a label in Kubernetes; we can call it a Magnum label. A Magnum label is a key-value pair in the template, and currently most of the advanced features can be configured through Magnum labels, such as auto-scaling, auto-healing, or even the configuration of the Kubernetes control plane services. We can also choose which add-on applications should be deployed during cluster installation, for example the dashboard, Helm, Prometheus, Grafana, and so on.

The last important thing we should know about the template is the cluster rolling upgrade. The rolling upgrade feature in Magnum is initiated by providing a new cluster template. For example, if you want to upgrade the Kubernetes version from v1.12 to v1.13, you need to create a new cluster template that keeps everything the same except the kube_tag label, and Magnum will trigger the rolling upgrade process automatically.

So here, please allow me to switch over, because I need to run a demo. Here I have an environment in which I need to create the cluster template first. In this OpenStack environment, as the admin user, I want to create a cluster template.
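As a rough sketch of what that looks like on the command line (the names, flavors, and label keys here are illustrative, and the exact labels available vary by Magnum release):

    # Create a template carrying the new Kubernetes version in the kube_tag label.
    openstack coe cluster template create k8s-v1.13 \
      --coe kubernetes \
      --image fedora-atomic-latest \
      --keypair mykey \
      --external-network public \
      --flavor m1.small --master-flavor m1.small \
      --network-driver calico \
      --labels kube_tag=v1.13.12,auto_healing_enabled=true,auto_scaling_enabled=true

    # Point an existing cluster at the new template to trigger the rolling upgrade.
    openstack coe cluster upgrade mycluster k8s-v1.13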
You can see from the command line that we have defined all the parameters, including the labels. We are going to create the cluster with Kubernetes v1.12, we don't want to enable the Keystone integration, and so on, so we can create the template. Okay, after we have the template, the next step is to create the cluster. Although the template defines the overall cluster configuration, the user still has the ability to make some adjustments, but in this demo, for simplicity, we just create the cluster without any configuration changes. Because we just created the template named seed-cluster, I will create the cluster using that template, and the cluster name is the same: seed-cluster. For testing purposes, we only create the cluster with one master and one worker. Because Magnum uses Heat for the cluster resource orchestration, we can get the stack ID from Heat and keep watching the resource creation process, so we know when the cluster creation is finished.

Okay, during the wait we can take a look at some of the existing problems in Magnum. The first problem is also the most significant one for the end user: the creation time. We just created a cluster with one master and one worker, right? In that environment, based on my testing, even a cluster with one master and one worker usually takes about 10 minutes, not to mention a highly available cluster, which consists of at least three masters and one worker. The reasons are manifold. First, when creating the cluster, Magnum creates virtual machines in Nova and load balancers in Octavia, and both take a long time. Then, inside the virtual machines, Magnum downloads the system container images one by one from Docker Hub, which also takes a long time. At the same time, Magnum creates, manages, and configures all the cloud resources needed by the cluster, for example networking and storage. Finally, Magnum creates the worker nodes, installs and configures the kubelet, and joins the worker nodes to the cluster. So you can see the whole process is quite time consuming.

The next problem is script management. If you are familiar with the Magnum implementation, you know there are a lot of shell scripts in the code repo. I have a link and could open it for you, but I won't during the demo; suffice it to say there are many scripts, each corresponding to a step of the cluster creation, which is very hard to maintain and prone to error. One of the potential solutions I can think of is to use kubeadm from the Kubernetes community, which could simplify the whole cluster creation process, because kubeadm is stable and mature enough even for production usage, and as far as I know the latest version of kubeadm now also supports creating highly available clusters.

As a public cloud provider, it's our responsibility to manage the customer's cluster. For some less experienced customers, we would even prefer that the control plane components and system add-ons be invisible to them, in case a customer damages the cluster accidentally. But in the current Magnum implementation, all the cloud resources created for the cluster live in the customer's tenant.
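For reference, the cluster creation part of the demo boils down to commands roughly like these (the cluster and template names match the demo; the rest is for illustration):

    # Create a small test cluster from the template we just defined.
    openstack coe cluster create seed-cluster \
      --cluster-template seed-cluster \
      --master-count 1 --node-count 1

    # Magnum drives the build through Heat, so we can watch the stack directly.
    openstack coe cluster show seed-cluster -c status -c stack_id
    openstack stack resource list <stack_id>   # substitute the stack_id printed above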
So the customer can make any configuration change they wish, which may bring trouble to the cloud provider, especially a public cloud provider like us. We also care about the resource consumption of our customers, which is important for their business success. But currently in Magnum, for a highly available cluster, which consists of at least three masters and one worker, the billing for the cloud resources created for the cluster is a big part of the customer's invoice, which is a big concern for potential customers and could be avoided.

As a result, in order to solve some of the problems we talked about, we are introducing a new driver in Magnum to manage Kubernetes in Kubernetes. In this deployment model there is a cluster called the seed cluster, which is created and managed by the cloud administrator and is transparent to the end users. The seed cluster is nothing else but a normal Magnum cluster; it's up to the cloud admin to decide how many nodes it should have in the beginning, but it still has the capability of auto-healing, auto-scaling, and rolling upgrades, so all the functionality that Magnum provides. All the tenant clusters are called customer clusters. All the Kubernetes control plane services for the customer clusters are deployed as standard vanilla pods in the seed cluster, so we can use Kubernetes features to manage those control plane components. For example, high availability of the control plane components can easily be achieved by using the built-in self-healing of Kubernetes: if one of the control plane pods goes down, the controller manager in the seed cluster detects it, and the pod is rescheduled and redeployed quickly without any human interaction.

There are many approaches to deploying the control plane components. We could use the operator pattern, which follows the Kubernetes principles and integrates with the resource management of the seed cluster. We could also use Helm charts, which give the cloud admin more control over the deployment and the configuration. And fundamentally, because any application running in a Kubernetes cluster can be deployed using kubectl apply with YAML files, I think that's the best way to do some quick evaluation.

The worker node deployment is different, because the worker nodes are actually virtual machines in the customer's tenant. We could still use the Magnum way to manage the worker node deployment, because Magnum uses Heat for resource orchestration, so we can reuse the existing Magnum code. It's also a good idea to use an operator, because as we know there are projects in the Kubernetes community that can do the same thing; there is a project named Machine Controller Manager that can create and manage virtual machines in OpenStack and other clouds. And if you are a fan of Ansible or Terraform, nothing stops you from using those automation tools to do the worker node deployment.

For example, this is a seed cluster consisting of one master and three workers. If a user wants to create a customer cluster, Magnum will create all the control plane components inside the seed cluster, create the worker nodes in the customer's tenant, and connect the worker nodes to the API server of that control plane. Similarly, for a second cluster it looks like this, and of course for a third cluster it looks the same.
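To give a feel for the quick-evaluation path mentioned above, a control plane deployed as plain pods might be bootstrapped roughly like this; the namespace placeholder and manifest file names are hypothetical, not the actual manifests shipped with the driver.

    # One namespace per customer cluster in the seed cluster, then apply the
    # control plane manifests into it (file names are illustrative).
    kubectl create namespace <customer-cluster-id>
    kubectl apply -n <customer-cluster-id> -f etcd.yaml
    kubectl apply -n <customer-cluster-id> -f kube-apiserver.yaml
    kubectl apply -n <customer-cluster-id> -f kube-controller-manager.yaml
    kubectl apply -n <customer-cluster-id> -f kube-scheduler.yaml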
So the workflow in this deployment model: the first step, sorry, I need to go back to the demo. The cloud admin user needs to create the seed cluster first, which we already did just now, so we can take a look. Here, the command is just an alias for openstack coe cluster. The creation is still in progress; let's take a look at which resources are being created. Please wait for a while. Seems like the session is stuck. If we check again, the cluster creation is successful, and here we have the seed cluster.

The next step is for the admin user to make some configuration changes in Magnum. Here I have a script. In this script, first I retrieve the kubeconfig file for the seed cluster and copy it to the Magnum configuration directory. The next step is to restart the Magnum services to pick up the new configuration. And the final step is to install cert-manager in the seed cluster, because cert-manager is an important component in this deployment model: it handles the certificate management in the seed cluster. So we run this script; it will take a few seconds. You can see cert-manager has been installed, and we can check it. Because in this session we are talking to the seed cluster, we can use the kubectl command directly; the command here is also just an alias for kubectl. OK, all the pods related to cert-manager have been created and are up and running.

The next step is, again as the cloud administrator, to create the cluster template for the customers. Here we are creating the second cluster template, named customer-cluster. There are two significant differences from the first one. The first difference is the server type: we are using container, whereas in the first one the default value is vm. And also, we are going to create a public cluster template, so the end users can use this template to create their customer clusters.

With all that done, the last step is that the customers can create their customer cluster based on the public cluster template. We go to another terminal session, where we are using the demo user. First, we can take a look at the cluster templates in the system, and we can see the customer-cluster template which we created just now as the admin user. Next, we are going to create our customer cluster: a cluster named customer-cluster, using this template. Now the cluster is being created. If we switch back to the admin user, because in that session we are still talking to the seed cluster, we can see there's a new namespace created in the seed cluster; the namespace name is the customer cluster ID. And if we take a look at all the pods in this namespace, we can see the control plane components for the customer cluster are all up and running. I think the customer cluster itself is still being created, so we can wait for a while.

While we're waiting, let's take a look at some advantages of this deployment model. The first is a unified API: as you saw from the demo, both the cloud admin and the end user use the same API to create a cluster, which means that from the OpenStack user's perspective, nothing has changed. The second is flexibility. Although in this demo the cloud admin user created the public cluster template, it's possible for end users to create their own cluster templates.
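That admin script boils down to something like the following; the kubeconfig directory, the Magnum service names, and the cert-manager manifest URL and version are my assumptions for illustration, not necessarily the exact ones used in the demo.

    # Fetch the seed cluster kubeconfig and make it available to Magnum.
    openstack coe cluster config seed-cluster --dir /etc/magnum

    # Restart the Magnum services so they pick up the new configuration
    # (service names vary by deployment).
    sudo systemctl restart magnum-api magnum-conductor

    # Install cert-manager into the seed cluster and verify its pods.
    kubectl --kubeconfig /etc/magnum/config apply \
      -f https://github.com/jetstack/cert-manager/releases/download/v0.11.0/cert-manager.yaml
    kubectl --kubeconfig /etc/magnum/config get pods -n cert-manager

    # Public template that customers build against; note --server-type container.
    openstack coe cluster template create customer-cluster \
      --coe kubernetes --server-type container --public \
      --image fedora-atomic-latest --keypair mykey \
      --external-network public --flavor m1.small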
And so they could have privileged access on both the master nodes and the worker nodes, but still without losing the capability of automated deployment and configuration. In this presentation we have introduced two concepts, the seed cluster and the customer cluster, but from Magnum's perspective they are both just Magnum clusters. If we go back to the terminal as the admin user, we can take a look, because the admin user can see all the clusters. As the admin user you can see two clusters without any difference, but actually in this deployment model one is the seed cluster and one is a customer cluster.

As I mentioned previously, management and maintenance are very easy in this deployment model, because all the control plane components are actually pods running in a Kubernetes cluster. For example, if we kill one of the control plane pods in the seed cluster, the controller manager in the seed cluster will quickly recreate a new one.

There's also the faster creation time. If we look at the seed cluster's creation time, we can see from the timestamps here that it took about 11 minutes to create a cluster with one master and one worker. For the customer cluster, I'm not sure if the creation is finished. OK, so we can take a look at the creation time: you can see from the timestamps that it only took about four minutes, so it's much faster.

The last advantage is enhanced security. If we go back to the terminal again as the admin user and get all the nodes, we can see there is one master and one worker. And if we get all the system pods (kcc here is also just an alias for kubectl), we can see the different control plane components, and also a pod called openstack-cloud-controller-manager, because we are using the cloud provider integration with the other OpenStack services. But if we switch to the customer cluster, we first need to get its kubeconfig file. Now we are talking to the customer cluster, and if we get the nodes, you can only see the worker node; the master node is invisible. And if we take a look at the control plane components, we can only see the pods running on the worker node. Which means it's the cloud provider's responsibility to do timely patches and upgrades to make sure the customer cluster is always well taken care of.

OK, I think most of you may be interested in the communication between the seed cluster and the customer cluster. When the user creates the customer cluster, a load balancer corresponding to the Kubernetes API service is actually created in the cloud admin's tenant, and it's connected to the customer's private network to make sure the worker nodes in the customer cluster can talk to the control plane services. So there are different load balancers created, corresponding to the different customer clusters. And if we associate a floating IP with the VIP port of the load balancer, the customer cluster can be exposed to the public.

There is still some work to do in the near future. For example, as you saw in the demo, we are installing cert-manager manually, but it would be better if Magnum could deploy it automatically. The second item is etcd performance tuning, because deploying etcd is not as easy and straightforward as a stateless application in Kubernetes, and etcd performance is critical to the cluster's operation.
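For anyone following along, the comparison in the demo corresponds roughly to these commands; the directory used for the customer kubeconfig is just an example.

    # From the seed cluster's kubeconfig: its own nodes and system pods,
    # including the openstack-cloud-controller-manager.
    kubectl get nodes
    kubectl get pods -n kube-system

    # Switch to the customer cluster and compare: only the worker node and the
    # pods scheduled on it are visible; the control plane stays hidden.
    openstack coe cluster config customer-cluster --dir ~/customer
    export KUBECONFIG=~/customer/config
    kubectl get nodes
    kubectl get pods --all-namespaces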
During our POC testing we have learned that there are a lot of options for deploying the etcd cluster. For example, we can run etcd as pods in the seed cluster, or etcd could be deployed as a dedicated component running outside of Kubernetes. Honestly, we don't have an answer yet; we will continue testing and will publish the results. There are also other improvements, like how to accelerate the worker node installation and how to make the control plane components easier to manage.

Before the end of this presentation, I want to especially say thank you to CloudLab, which is a platform where we can apply for cloud computing resources, either virtual machines or bare metal. CloudLab is a testbed that allows researchers to run experiments that cannot run in a traditional cloud, because they have special requirements such as control over and visibility into the system at the virtualization, storage, or network layer. Application and usage are free to anyone with a good reason, so everyone is encouraged to give it a try.

OK, this is what I wanted to share with you today. Because we were delayed for some time at the beginning, I'm not sure there's time for Q&A, but I will stay here for several minutes; if you have questions, feel free to come to me. Thank you.