Hello, everyone. Welcome to this session. My name is Iris Dean. I'm a cloud software engineer from Intel, and my focus is accelerating and securing the service mesh while unleashing the underlying hardware capabilities. I'm currently an Istio maintainer and serve on the Istio Steering Committee. Together with me is Fosila. So Fosila, would you please introduce yourself? Hello, everyone. I'm Fosila. I work as a cloud native developer at Ericsson Software Technology. My primary responsibility is prioritizing the 5G telco requirements in all the different open source communities. I'm a Steering Committee member and maintainer at Istio, and recently I became a CNCF ambassador in the latest 2023 group as well. So here is a quick overview of the agenda, whatever we are planning to cover today. Iris is a security maintainer at Istio, so she will give an overview of the certificates in the Istio service mesh, the different types of certificates, and the different plug-in mechanisms for them. She will also touch a little on confidential computing and her related work at Intel. After that I will dig a little deeper into certificate revocation lists and the OCSP stapling support in Istio. We will also talk about the extended TLS configurations that Istio supports. And since my experience is basically in 5G telco security, we will briefly go over 5G telco security, the different traffic scenarios there, and the certificate configurations for them as well. Now over to you, Iris, to get started. Thank you, Fosila. So let's first go over all the certificates in the Istio service mesh. As you know, security and mutual TLS are key functions of a service mesh. For most end users, when they land a service mesh in their environment, mutual TLS might be the first feature they want to use.
To achieve mutual TLS, each workload needs a certificate and a private key to do the TLS communication. As shown in this picture, with the green arrow here, this is the workload certificate. Also, at the mesh edge, in either the ingress gateway direction or the egress gateway direction, you need several different certificates as well. For example, you need them to do TLS termination in the ingress direction, and in the egress direction you also need certificates to do TLS origination. You might also expose multiple backend services through the gateway, in which case you might configure multiple gateways, each with its own certificate and private key. So this is another place in the service mesh where certificates are heavily used. Then, to achieve mutual TLS communication between workloads, you actually need a root of trust for your workloads. For this purpose you basically need a certificate authority (CA) server, and another key component in the istiod control plane is the registration authority (RA) server. Both of them have certificates. Typically the certificate for the RA server and the CA server is the same, but they can also be different. We will see more details on this later. So let's first take a look at the workload certificate. This is about how the Envoy proxy gets its certificate, so that when data plane traffic comes in, it can use that certificate for mutual TLS communication. When you onboard to the Istio service mesh, an Envoy proxy is injected alongside your workload. We call it a sidecar. When Envoy is bootstrapped, it gets a bootstrap configuration, and in this bootstrap configuration there is a default SDS gRPC server configuration. It is served on a fixed Unix domain socket, so the address is fixed.
It's under /var/run/secrets/workload-spiffe-uds/socket. This means that by default, as shown in this picture, the Istio agent itself acts as the SDS server and serves the Envoy proxy through this UDS socket. The Istio agent itself has two ways to get the private key and the cert and then deliver them through the SDS channel to the Envoy proxy. Option one is the default: if you install Istio using all the default options, this is what you get. In this case the Istio agent acts both as the SDS server and as a CA client. It generates the certificate signing request and sends it to the certificate authority server. We will see more details on this option later. For option two, you plug in your key and cert directly for the workload, I mean for the Istio agent, and it consumes the plugged-in key and cert. This is a very popular scenario, especially for 5G, and Fosila will show you more details on this option later. Then, last but not least, since Envoy sends its SDS request through this fixed UDS socket, it also opens a door for the external world to implement its own SDS server. For example, SPIRE is a very popular zero trust standard. It also uses this fixed UDS socket and implements the SDS protocol there, so it provides a SPIRE SDS server. In this way it can plug in and integrate with Istio to provide the workload certificate. In your case, if you have your own business requirements, you can also implement your own SDS server here. Just make sure you listen on this UDS socket, because it is fixed, and follow the SDS protocol, and then you can plug in your own SDS implementation.
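The plug-in point described above rests on one idea: whoever listens on the well-known Unix domain socket path becomes the secret provider for the proxy. The following is a minimal sketch of that rendezvous only. It assumes a temporary socket path instead of the real one, exchanges plain bytes instead of the gRPC-based SDS protocol, and uses hypothetical names (`serve_once`, `fetch`, the resource name `default`) chosen purely for illustration.

```python
import os
import socket
import tempfile
import threading

# Hypothetical stand-in for the fixed path the proxy is bootstrapped with.
# A temporary directory is used so the sketch runs without special privileges.
SOCKET_DIR = tempfile.mkdtemp()
SOCKET_PATH = os.path.join(SOCKET_DIR, "socket")


def serve_once(listener: socket.socket) -> None:
    """Accept one connection and answer with a placeholder payload."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024)
        # A real SDS server would answer a gRPC request with key material.
        conn.sendall(b"placeholder-secret-for:" + request)


def fetch(resource_name: bytes) -> bytes:
    """Client side: connect to the well-known path and ask for a resource."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(SOCKET_PATH)
        client.sendall(resource_name)
        return client.recv(1024)


# Whoever binds the path first is the provider the client will talk to.
listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
listener.bind(SOCKET_PATH)
listener.listen(1)
server = threading.Thread(target=serve_once, args=(listener,), daemon=True)
server.start()

reply = fetch(b"default")
server.join()

# Clean up the socket file and the temporary directory.
listener.close()
os.unlink(SOCKET_PATH)
os.rmdir(SOCKET_DIR)
print(reply.decode())
```

The client never needs to know which implementation is behind the socket, which is why the default agent, SPIRE, or a custom server can be swapped in without changing the proxy's bootstrap configuration.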
So this is a detailed look at option one, where the Istio agent acts as a CA client. The whole flow is like this: when Envoy bootstraps, it sends the SDS request to the Istio agent, because that is the SDS server. After the Istio agent gets the request, it generates the private key locally. It also generates a certificate signing request and sends it, over TLS and with a JWT token, to the registration authority server, which runs in the istiod control plane. The purpose of the registration authority server is to verify that the incoming certificate request is valid, to make sure only authorized workloads can get their cert signed. After it has done all the authentication and authorization work, and if the CSR comes from a valid workload, it approves it. After the CSR has been approved, the certificate authority server comes in, takes care of signing the incoming certificate signing request, and sends the signed cert back. Finally, it delivers the certificate to the Istio agent, and the Istio agent sends the certificate and the private key to the Envoy proxy. So this is the whole flow, and you can see that the Envoy proxy has finally got its signed cert back. Okay, next we will take a look at the certificate authority server certificate. There are multiple options here as well, so let's go through them one by one. Option one is also the default option for Istio. When you install Istio, by default it creates a self-signed CA server in the istiod control plane. It generates a private key locally in memory, and it also generates its certificate. It also stores all of this, the CA private key, root cert, cert chain, CA key, and CA cert, in a Kubernetes secret. It's called istio-ca-secret.
So in the future, when CSRs come in, it will sign the certificate requests using this CA private key and cert, and the result is finally delivered to the Envoy proxy. This is option one. It is the simplest and quickest way to onboard to Istio, but it also means you delegate the root of trust for all your workloads to istiod. So, for example, if your organization has its own key management system, this might not be a good option for you. Then option two: suppose you have the CA and the root cert generated outside, through another system. In this case you can upload them to a Kubernetes secret. It's called cacerts, and make sure to upload all the material: the root cert, the cert chain, the CA key, and the CA cert. After that you install Istio, and when istiod starts, it detects whether such a secret exists. If it exists, istiod will not do the self-signing anymore. It will use this root cert and private key directly to sign the certificates for all the workloads. So this is another option. Option three is a very useful option: the Kubernetes certificate signing request. This API has become very stable recently, which means we can rely heavily on it. In this case, the CA server function can be totally disabled in the istiod control plane, so only the registration authority server is up and running. It authorizes the incoming CSRs and makes sure they are all valid, and then it does some conversion work: it converts the incoming Istio CSR to a Kubernetes CSR, makes sure the Kubernetes CSR has been approved, and generates the related Kubernetes CSR resource in the Kubernetes cluster. So in this case, any third party that can consume Kubernetes certificate signing requests can handle the incoming request and sign the certificate for the related workloads.
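To make the conversion step in option three more concrete, here is a minimal sketch of what wrapping a workload CSR in a Kubernetes CertificateSigningRequest object could look like. The `certificates.k8s.io/v1` kind and its `request`, `signerName`, and `usages` fields are part of the Kubernetes API. Everything else is an assumption for illustration: the function name `to_kubernetes_csr`, the object name, the signer name `example.com/tenant1`, the placeholder CSR bytes, and the chosen usages. The talk does not say which values istiod actually sets.

```python
import base64
import json


def to_kubernetes_csr(name: str, csr_pem: bytes, signer_name: str) -> dict:
    """Wrap a PEM-encoded CSR in a CertificateSigningRequest manifest."""
    return {
        "apiVersion": "certificates.k8s.io/v1",
        "kind": "CertificateSigningRequest",
        "metadata": {"name": name},
        "spec": {
            # The API expects the PEM CSR as a base64-encoded string.
            "request": base64.b64encode(csr_pem).decode("ascii"),
            # The signer that is expected to issue the certificate.
            "signerName": signer_name,
            # Assumed usages for a mutual TLS workload certificate.
            "usages": [
                "digital signature",
                "key encipherment",
                "client auth",
                "server auth",
            ],
        },
    }


# Placeholder bytes, not a real CSR.
fake_csr_pem = b"-----BEGIN CERTIFICATE REQUEST-----\n...\n-----END CERTIFICATE REQUEST-----\n"
manifest = to_kubernetes_csr("workload-csr-example", fake_csr_pem, "example.com/tenant1")
print(json.dumps(manifest, indent=2))
```

Any controller that watches CertificateSigningRequest objects for its own signer name can then pick up the approved request and issue the certificate.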
Option three also opens a door for better multi-tenancy support. We will see a picture of that later. The fourth option is the external CA. In this case you define some environment variables when you start Istio, and the Istio agent then talks directly to the external CA server. It totally bypasses the istiod control plane and goes through the external CA server. This is very suitable if, for example, you already have a mature, established CA server in your organization: just leverage your existing asset and integrate it with Istio. As we said, option three actually opens a door to better multi-tenancy support. This is an example. In this diagram you can see two tenants, tenant one and tenant two. For each tenant, when the Istio agent generates the CSR, it can carry signer information. For example, for tenant one, all my workload certificates are signed by a signer like tenant one, and for tenant two, all my CSRs are signed by signer tenant two. Then, using the Kubernetes CSR integration method, you can generate different Kubernetes certificate signing requests. In the Kubernetes CSR resource there is a field called the signer. The Istio registration authority server can grab the signer information from the workload dynamically and fill it into the CSR. And finally, on the right side, you can have different signers for different tenants. In this example, I can have a signer operator here for one tenant, and I can leverage cert-manager to sign for the other workloads. In this way you can see that even in a single service mesh you can have multiple CAs. It also means that communication between workloads of different tenants will be broken, because they are not signed from the same root of trust. Then the last scenario that heavily uses certificates is the gateway. For the gateway there are also two options.
Option one is the default option. In this case istiod itself acts as the SDS server. As an end user, what you need to do is upload your private key and cert to these Kubernetes secrets. For example, there is a service credential secret one. Then you can define your own Istio Gateway custom resource, or you can leverage the Kubernetes Gateway API to define a gateway resource. In the gateway resource you define a credential, and this credential points to the secret. The istiod SDS server watches all such secrets, converts the secret to an SDS response, and sends the certificate and the private key back through the Istio agent and finally to the Envoy proxy, which in this case is the gateway. The other option is option two. Envoy can also listen on another UDS socket. It's called /var/run/secrets/credential-uds/socket. This means you can implement your own SDS server and serve on this UDS socket. In this way you can also provide your own SDS implementation to serve your ingress gateway or even your egress gateway. So this is something you can check out. Now, mutual TLS gives you a lot of benefits, but in the current upstream implementation, as I said, the cert and the private key are key components when you do TLS communication, especially the TLS handshake, right? Currently, all the private keys are stored in clear text in all three scenarios, like the CA key. In the bottom left, you can see that a simple kubectl command can very easily grab the CA private key. It also means an attacker can grab the private key and break your whole mesh traffic. And for mutual TLS, the private key sits in clear text in memory, which is also dangerous.
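The reason a single command is enough is that values in a Kubernetes Secret are base64-encoded, not encrypted, so anyone who can read the object can recover the original bytes. The sketch below shows this with a fake secret document. The field name `ca-key.pem`, the function name `read_secret_field`, and the key bytes are assumptions for illustration, and the exact command shown on the slide is not reproduced in the transcript.

```python
import base64
import json

# Fake key material standing in for a real private key.
FAKE_KEY = b"-----BEGIN FAKE PRIVATE KEY-----\nnot-a-real-key\n-----END FAKE PRIVATE KEY-----\n"

# Roughly the shape of a Secret object as returned in JSON form.
# The "ca-key.pem" field name is an assumption for this sketch.
secret_document = json.dumps({
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "istio-ca-secret", "namespace": "istio-system"},
    "data": {"ca-key.pem": base64.b64encode(FAKE_KEY).decode("ascii")},
})


def read_secret_field(document: str, field: str) -> bytes:
    """Decode one field of a Secret; base64 is an encoding, not encryption."""
    data = json.loads(document)["data"]
    return base64.b64decode(data[field])


recovered = read_secret_field(secret_document, "ca-key.pem")
print(recovered.decode())
```

No key or password is involved in the decoding step, which is the weakness the speaker is pointing at.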
So how do we solve this problem? At Intel, we have a solution called Intel Software Guard Extensions (SGX). It provides a process-based trusted execution environment for you. This technology is based entirely on the CPU itself, which means no additional hardware is needed, such as a hardware security module. You can rely purely on the CPU. This capability is available starting from our third-gen Intel Xeon processors. SGX provides you with a confidential memory region. In this picture you can see the application divided into two parts, the trusted part and the untrusted part. Only the trusted part can access this memory region. Other applications or software, even the operating system, BIOS, or firmware, are not allowed to access this memory region. In the service mesh scenario, we put all the private keys in the SGX enclave. In other words, this secured memory region is called the SGX enclave. We have three use cases here: we can store the private keys for the gateway, for the workload, and for the CA server in the SGX enclave. So if you use this solution and you go to the cluster and try to grab the CA private key, you get nothing, because all the private keys are in the SGX enclave. This solution has been fully upstreamed and open sourced, so if you are interested, make sure you check out these two repos. The first one, the HSM SDS server, works for the gateway and workload certificates. The second, the trusted certificate issuer, is actually a cert-manager issuer, so it can coexist with your cert-manager environment, but it makes sure your CA private key is guarded in the SGX enclave. So next I will hand over to Fosila to walk you through more details. Thank you, Iris.
Thanks for the great presentation so far on the service mesh security overview. There were a lot of options shown by Iris, so don't think that Istio is very complicated. It's more that all the complex things are made possible for you, so that all the different scenarios work as expected. Now let's take a look at some additional extended features supported in the Istio security architecture. Outside of a certificate's natural lifetime, there are two main ways a CA can take a non-expired certificate out of service and make it invalid. Those two mechanisms are the CRL, that is, the certificate revocation list, and OCSP, the Online Certificate Status Protocol. Let's first look at CRLs. A CRL, as you see here, is a list of revoked certificates. It is maintained by the CA at a distribution point, which can reside outside, and the peers are meant to request this list from the CA during the TLS handshake and verify whether the certificate is still valid. So it's basically a blacklist of certificates that the CA has revoked before their assigned expiration dates. This diagram gives you an overall picture of how the CRL implementation is done in Istio. At the ingress gateway, we want to check the revocation status of client certificates, so on the server side we check whether the client certificate is valid. Similarly, at the egress gateway, the revocation status of the received server certificate can be checked. The solution is not limited to the gateways. If the sidecars support TLS termination or origination, we can use this at the sidecar as well for external traffic, with the CRL check enabled in a similar fashion as on the gateways. This comes to us as a 3GPP 5G specification requirement for security.
Many other enterprise use cases may not need it, but in the 5G world we have the 3GPP standard with additional security requirements, and it states that the CRL status check should be supported for external certificate validation. So this is one such scenario. Istio also supports OCSP stapling. OCSP is a protocol used to check the revocation status of individual certificates. Previously we talked about the CRL, which is a list of revoked certificates, but here it applies to a single certificate rather than a list. A peer, upon receiving the certificate, communicates with the CA to check the status of that particular certificate. Istio does not implement full OCSP support, only OCSP stapling. With stapling, instead of each client having to request the revocation status of the certificate, the server that owns the certificate and serves the content asks the CA for its validity and sends its own status in the form of a signed, time-stamped add-on. So the server can present both the certificate and the staple in the response to the client. In Istio, what we have is OCSP stapling support at the ingress gateway. Now, Istio also allows you to configure additional TLS parameters in some scenarios, based on the requirements. It's a bit unstructured currently, so certain parameters are supported only on the ingress side and certain others only on the egress side, but the documentation clarifies that. For example, there is an option to configure the cipher suites, the ECDH curves, and even the signature schemes. If cipher suites are specified, the TLS listener will support only those specified ciphers, but this applies only to TLS 1.2. As you know, TLS 1.3 does not allow this configuration.
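The point that a cipher suite list restricts TLS 1.2 but not TLS 1.3 can be observed outside Istio as well. The sketch below uses Python's standard `ssl` module, which is backed by OpenSSL, purely as an analogy. It is not how Envoy or Istio is configured, and the chosen cipher name is just an example. Restricting the cipher list narrows what is offered for TLS 1.2 and below, while any TLS 1.3 suites enabled in the underlying library are left untouched.

```python
import ssl

# A server-side TLS context with library defaults.
context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

# Restrict the cipher list to a single TLS 1.2 suite (example choice).
context.set_ciphers("ECDHE-RSA-AES128-GCM-SHA256")

# Inspect what the context will still offer, grouped by protocol version.
enabled = context.get_ciphers()
tls12_and_below = [c["name"] for c in enabled if c["protocol"] != "TLSv1.3"]
tls13 = [c["name"] for c in enabled if c["protocol"] == "TLSv1.3"]

print("TLS 1.2 and below:", tls12_and_below)
print("TLS 1.3 (not affected by set_ciphers):", tls13)
```

On a typical build, the first list contains only the one configured suite, while the second still shows the library's TLS 1.3 suites.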
The ECDH curves can be configured similarly, but based on the requirements that have come in, we have ECDH curves support only for the mesh external configuration. Mesh external means traffic coming from outside the mesh into the mesh. Signature schemes can also be configured: Envoy supports configuring them, but there is no standard Istio API for it. We can, however, use the EnvoyFilter provided by Istio to configure the signature schemes. Now let's take a quick look at the overall 5G system architecture before I talk about the specific traffic scenarios. We don't have to go into details, but I'll just explain how this translates to the service mesh use cases. Many of these things were already covered by Iris, but it makes sense to explain a little how the service mesh applies in the 5G core. The 5G telco world uses the same elements as previous generations, like 4G. You have the user equipment, then the radio access network, which is the RAN, and then the 5G core network. In the figure here, the user plane has the network functions and elements involved in the transport of user data. That is drawn at the bottom, on the yellow background. Whatever is on the blue background consists of the network functions in the signaling plane, or control plane. The network functions are like the applications in a Kubernetes cluster. The ones on the blue background have HTTP/2 interfaces, and the Istio service mesh applies only in that area, the signaling plane, because whenever a service mesh is introduced there are some latency compromises to make, and those are not acceptable in the data plane for real-world traffic like the 5G scenario. So we use the service mesh only in the control plane.
At the same time, within the same blue background, there can be some services that are onboarded to the service mesh and others that are not. So in the same Kubernetes cluster you will have some applications with sidecars and some without. All these combinations lead to the different traffic scenarios that we will explain on this slide. Iris already explained how the certificates are configured within the mesh, and she also talked about the gateway certificates. So we already know how external-to-mesh traffic works and how mesh-to-external traffic works in this diagram. But as I said before, there can be cases where, within the cluster, we have a mesh external service, which is basically an application without a sidecar container, and it has to talk to another application or service with a sidecar. Overall this results in four different scenarios. There are applications within the cluster that have to talk to applications with a sidecar, and it doesn't make sense to go through an ingress gateway or egress gateway for applications within the same cluster. To solve such cases, we have to use sidecar-based TLS termination and origination. So this slide covers the four types of traffic I just talked about. It can be external to service mesh through an ingress gateway, service mesh to external through an egress gateway, service mesh to cluster internal without a gateway, where the sidecar directly terminates the TLS, or cluster internal to service mesh without an egress gateway, where the sidecar originates the TLS. So in total there are four cases, and the slide shows the different Istio custom resources that can be used to configure the certificates in each of those scenarios.
So yeah, that's all we had to discuss. If you know about the Istio user survey, we are inviting feedback from Istio users on the usability of the different features, so please scan the QR code on the left-hand side to give us feedback on Istio. The one on the right-hand side is for feedback about the session. And yes, if you have any questions, we can take them now. Thank you. Thank you. So I had a question. Certificates in a service mesh seem kind of like a choose-your-own-adventure book, if you understand that reference, in that there are lots of different ways to do them and seemingly lots of different ways to do them badly. Do you, in the infinite wisdom of the people on the stage, have recommendations on what you would suggest as the purest or simplest route to pursue certificates in a service mesh environment? Which option here is the simplest and easiest in Istio, or are you asking about all the different service meshes? Yes. Okay. So which scenario are you talking about, the CA part, the workload, or the gateway? Because it's different. Right. Oh. I mean... Because there are several places that have certificates, like the CA server, the gateway, and the workload, what are you specifically asking? Are you asking what's the best option for the CA, the workload, or the gateway? You know, it depends. It's hard to answer because every organization has a different situation. For example, if your organization already has a mature CA solution and a mature key management system, maybe the external CA option is the best for you, because you can reuse all your existing investment, and just integrating it with Istio is good enough. If that is not your case, I might suggest using the plug-in cert, but then you need to take care of the rotation yourself, because all the certificates are plugged in by you. You need to make sure the certificate and private key in the secret are up to date so that istiod can pick them up.
Then the other option: if you are already using cert-manager or that kind of ecosystem solution, maybe the Kubernetes CSR is a good choice for you, yeah. Thank you. Thank you. Any further questions? Yeah, sure. Hi. I was just wondering if you could go back to the slide about why the Kubernetes secret was insecure and the command you used to get the private key. Curious. And if you could just unpack that again. It was the one about why the Kubernetes secret for the Istio server was insecure. This one. So the question is, like, how you got the private key here? Right, yeah. I just wanted to take a picture of the slide. Oh, you just want to take a picture. Okay. Got you. Thank you.