Hello, everyone. My name is Iris. I'm a software engineer from Intel, and today I will deliver this topic together with Steve. Hello, Steve.

Hi, Iris, and hi, everyone. This is Steve, a cloud software engineer from Intel. I'm very glad to give this presentation together with Iris. Thank you.

Okay, let's first take a look at today's agenda. We will start with some background for our topic. Then we will explain what the multi-tenancy challenges are for service mesh, and go through several options you can use to achieve multi-tenancy in a service mesh. After that, we will go further and show how you can safeguard multi-tenancy in a service mesh with Intel SGX. At the end, we will demo one of our solutions and give a quick summary of all the solutions we have shared in this talk.

First, the background. As you all know, edge computing is very important now, especially in the 5G area. It provides functions like 5G slicing and packet processing for cloud-native functions and applications. In edge computing scenarios, resources are very constrained, and there will be multiple users running at the edge, so we need a way to make sure resources are isolated between tenants. Also, service mesh acts as an infrastructure layer at the edge and in the cloud: you can offload all the traffic management, security, and telemetry tasks into the sidecar, and there is a central control plane that manages all the configuration for the sidecars. So we need to take care of multi-tenancy in service mesh to support these edge and cloud scenarios.

If we want to support multi-tenancy in a service mesh, what are the challenges we need to address? The first is operational isolation: you need to make sure a service that belongs to one tenant doesn't influence other tenants. The next is config isolation.
For example, if you create a traffic management rule, you need to make sure it doesn't influence applications running under another tenant. The next part is traffic isolation: traffic targeted at one tenant must reach that tenant correctly and must not reach unrelated tenants' applications. Then there is identity isolation: when multiple users come into the service mesh applications, we need a clear way to distinguish between these tenants, so that the correct request is handled by the correct tenant's application. Finally, there is a lot of telemetry data in a service mesh, so keeping all of this data isolated between tenants is also a challenge.

If we want to support multi-tenancy in a service mesh, the straightforward option is to give every tenant their own mesh and their own cluster. That way, all the isolation is achieved natively, because the tenants run in different clusters. But as we mentioned, in edge computing scenarios resources are very limited, so it is often impossible to give every tenant their own mesh or their own cluster.

So we went further and asked: can we support multiple tenants in a single cluster but with multiple meshes? This picture shows that solution. You can see two tenants, tenant one and tenant two. Both of them have their own mesh, but they are running in the same cluster, cluster one. Because every mesh has its own root CA, the traffic between them is isolated. Also, Istio has a discovery selector that you can use to achieve service discovery isolation for multiple meshes in a single cluster. On the left side you can see that when you install Istio, you can specify which namespaces istiod should watch. In this example, istiod will only watch namespaces that carry the label marking the IT environment.
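That discovery selector setup might be sketched like this (a hedged illustration, not the slide's exact manifest; the label key and value are assumptions):

```yaml
# IstioOperator excerpt: istiod discovers only namespaces carrying this label,
# so each per-tenant control plane watches only its own tenant's namespaces.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    discoverySelectors:
      - matchLabels:
          env: it        # e.g. the "IT" department's namespaces
```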
The right part is the same kind of sample, for the accounting department. One thing we need to mention here: upstream Istio does not actually support multiple meshes in a single cluster, because in every namespace istiod will create a secret hosting the root certs for all the tenants in all the meshes. To support this solution, we have a patched version that respects the revision in your service mesh installation. If you are interested, you can find the link to the patched solution at the end of the presentation. But this solution also has an upgrade issue: if you want to upgrade Istio to a newer version in the future, it will cause problems.

So we thought further: can we support multiple tenants in a single cluster and a single mesh? For that, we need to solve the challenges we mentioned previously one by one. The first is service discovery isolation. Istio provides a custom resource called Sidecar, which you can use to achieve service discovery isolation. On the left side you can see a Sidecar custom resource. It allows applications running under namespace 1 to access applications running under namespace 1, namespace 2, and istio-system; all other namespaces cannot be accessed. The right side shows that even in a multi-tenant mesh, there might be some global services that you want all your tenants to access, for example a logging service. In that case you can define a globally accessible service: in the egress hosts part it is defined with a star-slash-star pattern, which makes it a global service.

For config isolation, you can use the exportTo field in VirtualService, DestinationRule, and ServiceEntry; that way you achieve config isolation. Next, for traffic isolation, Istio has a custom resource called AuthorizationPolicy.
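The Sidecar scoping and exportTo usage described above might look roughly like this (a hedged sketch; namespace and service names are illustrative, not taken from the slides):

```yaml
# Sidecar resource limiting what workloads in namespace-1 can discover.
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: namespace-1
spec:
  egress:
    - hosts:
        - "namespace-1/*"                          # its own namespace
        - "namespace-2/*"                          # an allowed peer namespace
        - "istio-system/*"                         # the mesh control plane
        - "*/logging.logging.svc.cluster.local"    # a shared "global" service
---
# Config isolation: exportTo limits which namespaces can see this rule.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews-route
  namespace: namespace-1
spec:
  hosts:
    - reviews.namespace-1.svc.cluster.local
  exportTo:
    - "."          # only visible inside namespace-1
  http:
    - route:
        - destination:
            host: reviews.namespace-1.svc.cluster.local
```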
The best practice here is to first deny all traffic in the mesh, and then gradually allow communication between namespaces. For example, here it allows applications running in namespace 1 to access applications running under namespace 4.

This picture demonstrates all the options we just mentioned. In it there is only one mesh, with two tenants: tenant 1, who owns the namespace team-1, and tenant 2, who owns the namespace team-2. Between the two namespaces, communication is blocked via the authorization policy. The mesh also provides config and discovery isolation through exportTo and the Sidecar resource.

But this solution has some obvious disadvantages. First, because you use authorization policies for traffic isolation, additional RBAC filters are added to your Envoy sidecars. Second, if a new tenant wants to onboard to your service mesh, or you deploy new workloads for your tenants, the authorization policies may need to be changed to cover all these new workloads. Third, there is no identity isolation: if multiple user requests come in, there is no good way to distinguish the different tenants' workloads. And last, all the applications running in the mesh, even those belonging to different tenants, share the same root cert. This is almost unacceptable for most tenants, because they don't want their root cert to be shared with other tenants.

To solve these problems, let's start with identity isolation. We have a solution called Authservice Configurator. It's an Intel project, and I've put the GitHub link here, so if you are interested you can click through and take a look. The basic idea behind this project is that starting from Istio 1.9, Istio supports external authorization, and there is a project called authservice.
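Both uses of AuthorizationPolicy touched on here can be sketched in YAML (a hedged illustration, not the talk's exact manifests; the names, namespaces, and the authservice port are assumptions):

```yaml
# Allow-nothing policy in the root namespace: an empty spec denies all
# mesh traffic until explicit ALLOW policies are added.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing
  namespace: istio-system
spec: {}
---
# Explicitly allow workloads in namespace-1 to reach namespace-4.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-from-namespace-1
  namespace: namespace-4
spec:
  action: ALLOW
  rules:
    - from:
        - source:
            namespaces: ["namespace-1"]
---
# For identity isolation: delegate authorization decisions to an external
# authorizer such as authservice, registered as an extension provider in
# meshConfig (provider name and port are illustrative), e.g.:
#
#   meshConfig:
#     extensionProviders:
#       - name: tenant-authservice
#         envoyExtAuthzGrpc:
#           service: authservice.istio-system.svc.cluster.local
#           port: 10003
#
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: ext-authz
  namespace: istio-system
spec:
  action: CUSTOM
  provider:
    name: tenant-authservice
  rules:
    - to:
        - operation:
            paths: ["/*"]
```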
It's an Istio ecosystem project. Using these two in combination, you can define different OIDC providers for different services, according to the service's URL or parameters. This also opens doors for us to support multi-tenancy. In our project, to support multi-tenancy better, we have defined a custom resource called Chain. Different tenant admins can define their own Chain CRs. There is a Chain webhook that validates each Chain CR, and after that an authservice controller monitors all these Chain CRs, combines them, and updates the chain information into the ConfigMap, which is picked up by authservice dynamically. This way, we can support multiple tenants dynamically as well.

Then, how do we solve the problem that every tenant shares the same root CA? I will hand over to Steve to take you through that journey. Hi, Steve.

Thanks, Iris. I will go on with option three, multi-tenancy in a single mesh and a single cluster. First, let me describe the image on the slide. There is one cluster with a single mesh and multiple tenants: tenant one with service A, and tenant two with service D. All of these services are under the control of Istio. In the middle of the image is the Kubernetes API server, and on the right side are some signer solutions for the tenants' certificates. The first choice is a signer operator, meaning you develop your own operator to do the signing work. The second choice is cert-manager, an open-source project with a complete ecosystem; you can just use it. The third choice is whatever else you can think of. All of these signer solutions need to watch the Kubernetes CSRs all the time. In the next step, I will give more details about this signing process.
First, the tenant's service sends its Istio CSR to istiod. Istiod contains the registration authority server, RA server for short. The RA server takes several actions: it validates the Istio CSR from the service, retrieves information from the Istio CSR, and then organizes all the necessary information to form a Kubernetes CSR. The necessary information includes things such as the service name and some environment variables, like the cert-signer-domain info, which is configured when installing Istio; I will give more details on that later. The signer solution will then sign the Kubernetes CSR whenever a new one appears, and istiod is notified once the Kubernetes CSR has been signed.

After all of these steps, the tenants' certificates are ready, so istiod can dispatch them to the specific service sidecars. That's the entire process. You can see there are some advantages: different root certs for different tenants can be supported, no additional filters are added, and no config changes are needed when new workloads are added.

In this next slide, you can see there are multiple clusters, and one tenant, for example tenant one or tenant two, can have multiple clusters and multiple meshes. All the services of one tenant can still communicate with each other, because all of them have been signed by the same signer, created by that tenant's signer operator. This is an extension of the scenario, and it can also benefit from our solution.

In the next slide, I will introduce the state of the art for multi-tenancy in service mesh. As we know, with multiple tenants in the service mesh there are many additional private keys, one set for each tenant's services. So how do we manage these private keys?
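A rough sketch of the pieces involved (hedged, and modeled on Istio's Kubernetes-CSR external CA integration; names are illustrative): the istiod environment that selects the signer domain, and the kind of Kubernetes CSR the RA then creates for a workload.

```yaml
# istiod environment (IstioOperator excerpt): enable the Kubernetes CSR
# external CA path and set the signer domain used to build signerNames.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    pilot:
      k8s:
        env:
          - name: EXTERNAL_CA
            value: ISTIOD_RA_KUBERNETES_API
          - name: CERT_SIGNER_DOMAIN
            value: clusterissuers.cert-manager.io
---
# The Kubernetes CSR istiod's RA creates for a workload in namespace "bar".
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: csr-httpbin-bar            # illustrative name
spec:
  # signerName = <CERT_SIGNER_DOMAIN>/<per-tenant signer>, so each tenant's
  # CSRs are routed to that tenant's own signer and get a distinct root.
  signerName: clusterissuers.cert-manager.io/bar
  request: "<base64-encoded PKCS#10 CSR from the workload sidecar>"
  usages:
    - digital signature
    - key encipherment
```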
I think it is not good practice to store these private keys in configuration files or in ConfigMaps and Secrets. So we introduce Intel Software Guard Extensions, SGX. SGX is a set of architecture enhancements designed to help protect application integrity and data confidentiality, and to withstand certain software and hardware attacks. It protects data in the following ways. First, it protects against software attacks even if the OS, drivers, BIOS, or hypervisor are compromised. Second, secrets remain protected even when an attacker has full control of the platform. Third, it prevents attacks such as memory bus snooping, memory tampering, and cold boot attacks against memory contents. Fourth, it provides hardware-based attestation capabilities to measure and verify valid code and data signatures.

In our solution, SGX provides a trusted computing enclave where data and applications are protected independent of the operating system or hardware configuration. Every tenant provisions its own private key into the enclave, and the enclave provides handles for the user to use that private key. As you know, SGX enclaves help provide more secure transfer between processes in the OS, so there is secure isolation between the confidential data and the clients that consume it. If you want to know more details, you can find them on the Intel homepage.

Next, I will do a demo. First, let me describe the setup: each tenant has one namespace, and each tenant runs two services, the httpbin service and the sleep service. I will share my screen for this demo. Before I run the demo script, we need a cluster; in our scenario, it is a single-node cluster with one master node.
You can see it's ready now. Then we need to make sure our signer solution is working well; in our scenario, we use cert-manager as the signer solution. First, we create the self-signed ClusterIssuers of cert-manager (ClusterIssuer is a cert-manager concept), one for each of the namespaces bar, foo, and istio-system. Bar and foo are our tenant namespaces.

Then we need to install Istio. Before we install it, let's go through the Istio configuration file. There are two relevant sections. The first is the environment: as I mentioned, there is the environment variable for the cert signer domain, and we set clusterissuers.cert-manager.io as its value. That means Istio will use cert-manager as the signer, together with an external CA. The second section is the overlays: we update the istiod-clusterrole-istio-system ClusterRole to grant istiod more privileges, so that the Kubernetes CSRs for the cert-manager signers can be approved.

Let's install Istio. Installation is done. Now let's deploy the workloads. As mentioned, every tenant has one namespace with two services, the httpbin service and the sleep service, and the same again in the foo namespace. Okay, everything is deployed now.

Now let's verify the network connectivity, first inside each namespace, starting with bar. As you can see, we call the httpbin service in bar from the sleep pod in bar, and we get a response from httpbin.bar; it is reachable. The same inside the foo namespace: we also get a response from httpbin.foo. We also support another way to do this connectivity verification: istioctl. You can use istioctl's experimental rootca-compare command, passing the two pods with their namespaces.
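The verification steps narrated here might look roughly like the following commands (a hedged reconstruction of the demo, not a verbatim capture; pod names are placeholders, ports follow the Istio sample apps, and the rootca-compare command's availability may vary by Istio version):

```shell
# Same tenant: call httpbin.bar from the sleep pod in bar (should succeed).
kubectl exec "$(kubectl get pod -l app=sleep -n bar \
    -o jsonpath='{.items[0].metadata.name}')" -n bar -c sleep -- \
  curl -s http://httpbin.bar:8000/ip

# Cross tenant: call httpbin.foo from the sleep pod in bar (should fail,
# since the two tenants' workloads are signed by different roots).
kubectl exec "$(kubectl get pod -l app=sleep -n bar \
    -o jsonpath='{.items[0].metadata.name}')" -n bar -c sleep -- \
  curl -s http://httpbin.foo:8000/ip

# Compare the root CAs trusted by two sidecars with istioctl.
istioctl experimental rootca-compare sleep-<pod-id>.bar httpbin-<pod-id>.foo
```

These commands assume the standard Istio sleep and httpbin samples deployed in the bar and foo namespaces, as in the demo.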
The result is that these services can reach each other; that is inside one namespace. Next is across namespaces. We check the network connectivity from sleep.bar to httpbin.foo, and the result is a connection error. We get the same error going from sleep.foo to httpbin.bar. Let's also look at the istioctl result: they are unavailable. So that's all the verification, inside one namespace and across namespaces.

We also made a summary of all of these solutions, with some comparisons between them. The most important one I want to highlight is the third one shown in the slide: the single mesh, single cluster option with multiple CAs, external authorization, and SGX. As we have mentioned, it supports different root certs for different tenants, adds no additional filters, and needs no config changes when new workloads are added. It also provides identity isolation through the external authorization, and uses SGX to make the keys more secure. The con is that additional custom components are needed.

I think that's all for today's presentation. Thank you. Thank you. Bye-bye.