Hi, everybody. My name is Indy, and I come from Huawei R&D USA. I'm very glad to be here to discuss multi-tenant Kubernetes clusters: how do we run a multi-tenant Kubernetes cluster on our OpenStack deployment?

This slide is the obligatory advertisement. As you can see, Huawei has more than 176,000 employees, so naturally we have a lot of internal development going on, and we are moving our internal platforms toward more agile, microservice-based practices. Our external customers have the same requirement: they want to move their practices toward agile as well. That is why we need to move to the container world, to keep up with this fast-changing environment.

So let's bring up the problem we want to solve. Business logic is increasingly broken up across distributed environments, and we are all transforming toward agile and DevOps practices, so that people can iterate quickly and fail fast. Most people are focused on containers: just use containers. But will that solve everything? There are some questions we need to answer first.

First, how should I operate all these containers and automate them in a reliable way? In a typical deployment we can easily have thousands of containers across hundreds of nodes, so that is a very legitimate question to answer before we can put this into a production environment. Second, how should I make them work with our current IT infrastructure? We have a production-ready OpenStack product called FusionSphere, and hundreds of customers are already using our cloud solution. If we want to tell them to run containers on top of our IT infrastructure, we need to answer this question too. And last, we already have networking and storage: OpenStack Neutron and Cinder, and even non-OpenStack storage is there. How should we make that work when we adopt containers? We had to answer these questions before we could move to the container world.

We decided to pick Kubernetes as our container orchestration engine, and there are a few reasons behind this. First, Kubernetes has a nice plug-in architecture, which makes orchestration and scheduling more extensible. For example, when a new container task is assigned, we need to decide which node to place it on. There are already a bunch of containers running on some nodes, so we may want to co-locate it with certain containers for efficiency, that is, affinity with existing running tasks; or, for high availability, we may want anti-affinity and place replicas on different nodes. Kubernetes supports this nicely, so we can customize plug-ins to fulfill our availability and scaling requirements. Kubernetes also supports multiple container formats and runtimes. I will come back to that later, because we made some changes to make containers more secure, so that is a very important feature for us. With the plug-in architecture you can also plug in a lot of networking and storage support. And last but not least, Kubernetes has a very active community: if you have questions, or need the project to adapt to new changes and user requirements, it happens really quickly. That is one of the reasons we picked Kubernetes.
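To make the affinity and anti-affinity idea concrete, here is a minimal sketch using the Kubernetes Python client to spread two replicas of a hypothetical "web" pod across nodes. This is an illustration only: the affinity API shown is the form that later stabilized upstream, not necessarily the scheduler extensions described in the talk, and the pod names and image are made up.

# Illustrative sketch only: scheduling two replicas of a hypothetical "web"
# pod onto different nodes using pod anti-affinity. The affinity API shown
# here is the one that later stabilized upstream; the talk predates it and
# Huawei's actual scheduler plug-ins are not shown here.
from kubernetes import client, config

config.load_kube_config()  # assumes a working kubeconfig
core = client.CoreV1Api()

anti_affinity = client.V1Affinity(
    pod_anti_affinity=client.V1PodAntiAffinity(
        required_during_scheduling_ignored_during_execution=[
            client.V1PodAffinityTerm(
                label_selector=client.V1LabelSelector(match_labels={"app": "web"}),
                # spread pods carrying the same label across nodes
                topology_key="kubernetes.io/hostname",
            )
        ]
    )
)

for i in range(2):
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name=f"web-{i}", labels={"app": "web"}),
        spec=client.V1PodSpec(
            affinity=anti_affinity,
            containers=[client.V1Container(name="web", image="nginx:alpine")],
        ),
    )
    core.create_namespaced_pod(namespace="default", body=pod)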
This is a slide I brought up at the Austin Summit: in our mind, OpenStack should work nicely with Kubernetes. On the left you can see that for deployment, OpenStack already supports this well: we have the Magnum and Murano projects, and you can use either of them to deploy a Kubernetes cluster. Keystone is a really nice component to serve as the IAM piece. For the container registry we have an open-source product called Dockyard, but I didn't put the name here, because Docker is trying to claim branding on anything "Dock"-related, so we are changing the name; I will bring it back once we find a new one. Then for storage and networking we can use our current OpenStack projects, Fuxi and Kuryr (I think Fuxi is integrated into Kuryr now). Those are the nice plug-ins that form the bridge between Kubernetes and OpenStack. On the right-hand side, we can use OpenStack projects to do lifecycle management for Kubernetes, like Murano and a newer project; I think there is a design summit session for that project, so if you are interested you can go there. Going north, you can either use the native Kubernetes API, or put an API gateway in front that integrates with Keystone for authentication and authorization, and then expose your own custom API on top of that.

This is Huawei's Cloud Container Engine. In the middle you can see the Kubernetes-based container orchestration engine; it is the core component. Around it we wrap our services, including some web applications, and the whole thing is named FusionStage. It works nicely with FusionSphere, which is our OpenStack-based IaaS product.

So let's see how this makes multi-tenancy work. At the bottom you can see cluster one through cluster N, each with a few nodes; that is the Kubernetes concept you already know. In v1 we said: if you want multiple tenants, with security and isolation, that's easy. You have a new tenant? Make a new cluster. If I have N tenants, I make N Kubernetes clusters. Apparently this has problems, right? It is not efficient: every time a tenant comes, you need to bring up a separate cluster and provision the cluster itself, which takes time. It works against the goal; we want containers because we want fast deployment, spinning things up and down very quickly. However, this v1 actually worked nicely for our public cloud customers, because those tenants don't come and go quickly. It's not like our internal dev environment, where tenants appear and disappear frequently; they are more like stable tenants, and we just need to scale the size of the cluster up and down.
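As a point of reference for the Keystone piece in this picture, here is a minimal sketch of obtaining a project-scoped session with keystoneauth1; this is the kind of token that the API gateway and the OpenStack-facing plug-ins need to carry when they talk to Neutron and Cinder. The auth URL, credentials, and project name are hypothetical.

# Minimal sketch: obtaining a project-scoped Keystone session that the rest
# of the stack (API gateway, Kuryr, Fuxi) can reuse when talking to Neutron
# and Cinder. The auth URL, credentials, and project name are hypothetical.
from keystoneauth1.identity import v3
from keystoneauth1 import session

auth = v3.Password(
    auth_url="https://keystone.example.com:5000/v3",
    username="demo",
    password="secret",
    project_name="container-tenant",
    user_domain_id="default",
    project_domain_id="default",
)
sess = session.Session(auth=auth)

# The scoped token is what has to travel with the request; losing it at the
# Kubernetes API server boundary is exactly the problem described later.
print(sess.get_token())
print(sess.get_project_id())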
Then, in v2, we thought: can we mix VMs and physical machines together? That is one of the solutions we considered. Initially we said: let's make separate pools. You have a VM pool and a physical machine, bare-metal pool; if you want a VM, you get a resource from the VM pool, and if you want a physical machine, you get it from the bare-metal pool. The idea is otherwise the same: a new tenant still gets a separate cluster. Then we thought: why not mix VMs and bare-metal machines together? For some services where we need efficiency and don't want a VM, we can eliminate the virtualization layer. There are also business issues: one of our users does not have a license to run their Oracle database on a VM, so it can only run on a physical machine. So we asked: can we make this work and put mixed resources together, one tenant, one cluster?

Then the next question was: do we really have to have a separate cluster for each tenant? Can we have one big cluster? We have already extended our orchestration engine beyond what the open-source community supports; I believe upstream supports 1,000 nodes, and we already support 5,000 nodes, so it is a huge cluster. Why not take advantage of that and let multiple tenants live in the same cluster, so you don't need to tear a cluster down and bring one up every time?

But you can see the point I highlight in the middle: there may be a security issue there, right? People ask: is the isolation you get from namespaces and cgroups enough? For a private cloud, some users don't care; they are just different internal users and don't require that much security. For our public cloud, however, external customers say that does not work; they need real isolation. So either you go back to solution one, where each tenant has its own cluster, or you give them another solution.
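Before getting to secure containers, here is a minimal sketch of what the "one big cluster, many tenants" direction looks like at the Kubernetes level: a namespace per tenant plus a resource quota, using the Kubernetes Python client. The tenant name and quota values are hypothetical, and namespaces plus cgroups are exactly the soft isolation being questioned above.

# Minimal sketch of the "one big cluster, many tenants" direction: one
# namespace per tenant plus a resource quota. Tenant name and quota values
# are hypothetical; namespaces and cgroups alone provide only the soft
# isolation questioned above, which is what motivates secure containers.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

tenant = "tenant-a"  # hypothetical; the talk maps Keystone projects to namespaces

core.create_namespace(
    client.V1Namespace(metadata=client.V1ObjectMeta(name=tenant))
)

core.create_namespaced_resource_quota(
    namespace=tenant,
    body=client.V1ResourceQuota(
        metadata=client.V1ObjectMeta(name="tenant-quota"),
        spec=client.V1ResourceQuotaSpec(
            hard={"pods": "100", "requests.cpu": "50", "requests.memory": "100Gi"}
        ),
    ),
)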
So now we think we need some secure containers. You can see there are two approaches to this, and we went with one of them. One approach, and I think VMware has a solution along these lines, is to make the VM smaller and smaller, thinner and thinner, until it gets close to a container. The other direction, which Intel is working on, is to let the container grow up a little bit to support VT-x and take advantage of hardware virtualization. We worked with them on a lightweight QEMU to make this work. We actually still call it a micro-VM, because we are trying to converge this micro-VM and the container into one thing: from the outside it still looks kind of like a VM, because it is QEMU, but we try to make it behave more like a container. From a performance perspective it is already very close to a container: it takes only about 300 milliseconds to bring up one micro-VM. We also modified our container OS to work with this modified QEMU, so we can bring up multiple containers inside the same micro-VM. Now it is very close: by leveraging hardware virtualization, the container's security is very close to a VM's, and our goal is to make it as safe as a VM. Then we can claim it is a secure container, and we don't need to worry: we can put two tenants' containers on the same node and it won't be a problem. For this we use network virtualization plus memory isolation with kernel same-page merging, and we are making all of that work. We also made it work with ARM, so it runs not only on x86 CPUs but on ARM as well.

Now let's see what's going on here. We bring it back to our first picture: I have OpenStack, you have your container orchestration engine, let's make them work together. You can see we have containers on physical machines and containers on virtual machines. They all talk to Keystone for authentication and authorization, and go through either the native API or a custom API to reach the container orchestration engine, so that tenancy is verified. With that, you also want to pass this identity information down to Cinder, Neutron, and the other OpenStack components, so the whole thing works as a single solution.

Let me walk through what I mean over the next few slides. First, this is the old Kubernetes, the 1.1 release. When we tried to make this work, we discovered a few things. Authentication and authorization run in the API server, as part of the Kubernetes master. However, when we use Kuryr to talk to Kubernetes, it runs on the minions, the worker nodes. So there is a break in the token and key passing: you authenticate with Keystone at the Kubernetes API server, but that token only lives in the API server. When Kuryr then tries to talk to Cinder or Neutron, the token is gone, so you would have to configure a fixed superuser role just to create a network or a volume. That doesn't work at all. We needed to pass this access key along, so we said: we need a plugin that passes the access key through Kubernetes, connecting the Kubernetes master and the minions. The roles are also different: in Keystone you have the role, user, and project concepts; in a Kubernetes cluster you have group, user, and namespace. So we said: let's map the namespace to the project and the group to the role, and then the same role model works for both Kubernetes and OpenStack.

Then came the Kubernetes 1.3 release. The official Keystone support was still under development, but it was released, and we found some things were still broken. Authentication against Keystone works through that plugin. However, authorization is still missing. When we picked Kubernetes, it was partly for its nice plugin mechanism, so we said: let's write our own. We wrote our own ABAC authorization plugin, and with that plugin we pass the access key down into Kubernetes, so Kuryr has the key to talk to Neutron and Fuxi has the key to talk to Cinder. There is another change as well, around groups: in this release Kubernetes introduced roles too, so there is a nice match. We no longer need the awkward mapping of group to role; we can map role to role, and the concepts line up. Perfect.
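As the Q&A later confirms, the upstream ABAC authorizer is file-based, with one JSON policy object per line. The sketch below shows what per-tenant policy entries of that shape look like; it illustrates the general mechanism, not Huawei's actual plugin, and the user, group, and namespace names are hypothetical.

# Illustrative sketch of upstream-style ABAC policy entries, one JSON object
# per line, scoping a user (or Keystone-derived group) to a single namespace.
# This shows the general shape of file-based ABAC rules; it is not Huawei's
# actual authorization plugin, and the user/namespace names are hypothetical.
import json

policies = [
    {
        "apiVersion": "abac.authorization.kubernetes.io/v1beta1",
        "kind": "Policy",
        "spec": {
            "user": "alice",              # hypothetical Keystone user
            "namespace": "tenant-a",      # namespace mapped to her project
            "resource": "*",
            "apiGroup": "*",
        },
    },
    {
        "apiVersion": "abac.authorization.kubernetes.io/v1beta1",
        "kind": "Policy",
        "spec": {
            "group": "tenant-b-admins",   # hypothetical group/role mapping
            "namespace": "tenant-b",
            "resource": "*",
            "apiGroup": "*",
        },
    },
]

# The API server would then be started with something like:
#   --authorization-mode=ABAC --authorization-policy-file=/etc/kubernetes/abac.jsonl
with open("abac.jsonl", "w") as f:
    for p in policies:
        f.write(json.dumps(p) + "\n")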
Then we said: OK, we have Keystone authentication; let's make it work with the other OpenStack components. With Kuryr, let's make Kuryr work with Neutron, so Kubernetes can use your already-authorized privileges when it works with the Neutron server. You can see two different deployments here. On the left side is a physical host, where you want your containers running directly on the physical host. On the right side, you have a VM running on a physical host, with a hypervisor underneath. On the left side it makes more sense: the Neutron server talks to the OVS, which talks to the Ethernet device on your physical host. In the VM case, however, I used a dotted line, because we basically have a nested OVS: Neutron talks to the OVS on the physical host, the host talks to the OVS inside the VM, and then you reach the container. That's two layers. It's not efficient, but it works; it's just awkward. Then we have a new solution: we don't need to go through that second OVS; let's use a trunk port. We have already verified that it works, we're doing the final touches, and whenever it's ready I will put it up here. With the trunk port reaching the container, we don't have these double OVS layers, which didn't make sense anyway.

For the storage part, we have Fuxi. Before that, you needed a volume mounted to a host directory, and then you mounted that directory into the container to make it something like a stateful container. With Cinder it's much easier: you just mount the volume to the host, and the container can use it directly. I'll show the details in the next couple of pages.

With this solution, we actually make it work even nicer with VMs: I can put containers in the same network as VMs. Pay attention to these VMs: these are user-facing VMs, not the management VMs that OpenStack runs underneath. So a user can say: I can have VMs, I can have containers, and I can mix these resources and make them work together. You can see here we have VM 1, VM 2, and containers 1 and 2, and they all live in the same network, a network Neutron created.

And this is a more detailed look at using Fuxi for storage. As I said, before this you needed a host-path volume mounted to a directory, and you needed that mounted into the file system to make it work. With the Fuxi plugin and the authenticated role, you simply mount the Cinder volume to the host, to the kubelet, so the kubelet understands it and can mount it into the container. You can easily use this volume to get a kind of stateful container here.
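As a rough sketch of the storage flow just described, here is what creating a Cinder volume with an authenticated Keystone session and referencing it from a pod could look like. The in-tree Kubernetes Cinder volume source is used here only as an approximation of what the Fuxi path achieves; the endpoint, credentials, volume size, and names are all hypothetical.

# Rough sketch of the stateful-container flow: create a Cinder volume with an
# authenticated Keystone session, then reference it from a pod. The in-tree
# Cinder volume source stands in for the Fuxi plugin described in the talk;
# names, sizes, and credentials are hypothetical.
from keystoneauth1 import session
from keystoneauth1.identity import v3
from cinderclient import client as cinder_client
from kubernetes import client as k8s, config

sess = session.Session(auth=v3.Password(
    auth_url="https://keystone.example.com:5000/v3",
    username="demo", password="secret", project_name="tenant-a",
    user_domain_id="default", project_domain_id="default"))

cinder = cinder_client.Client("3", session=sess)
vol = cinder.volumes.create(size=10, name="tenant-a-data")  # 10 GiB data volume

config.load_kube_config()
core = k8s.CoreV1Api()

pod = k8s.V1Pod(
    metadata=k8s.V1ObjectMeta(name="stateful-app", namespace="tenant-a"),
    spec=k8s.V1PodSpec(
        containers=[
            k8s.V1Container(
                name="app",
                image="nginx:alpine",
                volume_mounts=[k8s.V1VolumeMount(name="data", mount_path="/data")],
            )
        ],
        volumes=[
            k8s.V1Volume(
                name="data",
                cinder=k8s.V1CinderVolumeSource(volume_id=vol.id, fs_type="ext4"),
            )
        ],
    ),
)
core.create_namespaced_pod(namespace="tenant-a", body=pod)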
Yep, so that's what I wanted to talk about. Thank you, and I can answer some questions. The microphone is over there, I believe.

Hi, I'm Adam Young, Red Hat, Keystone core. I have some questions about the multi-tenancy aspect of what you're doing there. You showed the potential, and you had that slide with the security access issue on it. Did you end up implementing this such that the resources used for the Kubernetes cluster came from multiple projects (not tenants), projects within Keystone? And if so, how did you resolve the security issues there? The second question has to do with that ABAC plugin you hooked into Keystone: what were the attributes you needed to enforce on, and what is that access key you passed along?

Let's go back to the first one. We actually have this integrated already. So your question is whether this is actually working, as shown on this page, right?

No, how did you resolve the security issue? I assume you made it work; I was asking if you actually did it, and it sounds like you did. How did you resolve the security issue?

The issue is what I showed on the next page; it's called the secure container. Basically, the container has isolation, physical isolation; it's more than namespace and cgroup isolation.

I saw all that. Did you make it so that a user could not go into their corresponding project via the Nova APIs and access the virtual machines those containers are running on directly?

It's these two layers. You can see this micro-VM really runs one container per VM; basically, it's one container per micro-VM. And this micro-VM is really fast, not like the old VMs. You can use our modified version of Docker to trigger this micro-VM.

Does that VM show up as a resource within Nova?

No, it does not.

So it's running inside of a Nova VM, or a bare-metal deploy?

You can do both; it depends on how you deploy it.

So those are resources that the end user has access to via Nova, and thus they could get access to them and at least kill them, right?

What Nova can see, on the left side, is the one physical machine and also the one VM; you can kill that one, that one is visible. On the right-hand side, the external VM is what Nova can control; Nova doesn't see the internal micro-VM.

But that means your container could be running in a VM that's in my tenant.

That micro-VM talks to the Docker daemon, so basically it's in the Kubernetes world; it's controlled by the Kubernetes master.

So how do you prevent back-end access to the resources that the Kubernetes cluster is running on, so that I can't mess up your containers running in my VMs?

I'm sorry, I didn't get this question.

Let's say you and I both have projects that are contributing resources to the Kubernetes cluster, right? And I have VMs that your containers are running on, because Kubernetes placed them there, right?

I think you can think about it this way: we are converging this micro-VM and the container into one entity; it's more like the container we've been talking about. The Kubernetes cluster can put multiple of these micro-VMs, or secure containers, on the same host, whether that's a Nova host or a Nova VM. And those two containers, the two secure containers, have isolation similar to two VMs running on one Nova host.

But whose quota are those coming out of? Who owns those VMs?

Sorry, maybe I shouldn't say VMs; let's say secure containers. The secure container is controlled by the Kubernetes cluster.

But it still has to run on top of resources allocated from Nova, right?

Yes, Nova allocates the external VM, the actual VM.

And so is there a rule that says my containers only go on my VMs?

Yes, you can define that, basically. Let's see: these are two layers; one is scheduling, and one is resource allocation, right? With this solution there is only one pool of VMs: Nova creates, for example, 100 VMs, and that's the VM pool. Kubernetes then sees that it has 100 nodes. How is placement done on top of that? The secure container is placed by the Kubernetes master; it decides which node to put the container on.

So then why put them under multiple tenants? Why not have a Kubernetes cluster for each tenant, to keep the isolation there?

You mean... I think your question is about the node in the middle, right? Why put two tenants' containers on one node?

Right. And how do you ensure that you don't have bleed-over between two different paying customers, or two different people who don't trust each other, within the same Kubernetes cluster? Shouldn't your stuff then run on two separate sets of resources?
The thing is, we have only one cluster, but multiple tenants; that's our point. We don't need to redeploy the control cluster, I mean the Kubernetes cluster, each time. We have just that one, and it lives forever. If a new tenant comes, it shares the same pool of resources.

Why don't we skip to the second question, which is the ABAC plugin and what those keys were?

That key... we have two keys, more like AWS, where you have your access key and your secret key, right? We pass this secret key down with our plugin, from the API server into Kubernetes. Kubernetes uses it to talk to Kuryr, and Kuryr uses it to talk to Neutron or Cinder.

What are the attributes in it, the secret key I mean? Where does that come from?

The ABAC authorizer is file-based, right? You put your rules in a config file, and based on those rules we make the decision.

So ABAC is file-based, and all the rules live in an external file that people need to pull from, specific to your deployment?

Yeah, I believe so.

Hey, just a quick question. You said that you made some patches to QEMU, or maybe I missed that. Will you share some of that information?

Yeah, that will be open source. It's joint work between us and Intel; we are going to put this code into qemu-lite or something. It will be open source, I believe.

Thanks. Could you go to slide 15, please? Is that RBD, as in Ceph RBD, intended there? Or does it say RDB?

It says RDB.

So it's not Ceph RBD?

Actually, I'm not sure; I need to talk to my peers to answer that clearly.

OK, thank you.

Thank you.