All right. Hello, everyone. Thanks a lot for coming to our session. As you can see, the title of this talk is Bridging Trust Between Multicluster Meshes. In a number of ways, Jackie actually teed this up for us earlier, which is great. So if you were here for her session, that context will be super helpful for this one. Either way, if you didn't attend that session, we're hoping that what we cover will still give you a number of useful takeaways. Before we get started, I just want to mention that Ovidio and I actually met for the first time at KubeCon EU in Amsterdam. We happened to be staying at the same hotel, and he came up to the table where I was having breakfast. That was the first time we engaged, and we built a friendship from there. One of the first things we spoke about was Istio and SPIFFE. And now, several months down the line, here we are giving a talk together. So I'm going to let my friend introduce himself.
Hello. Hey, my name is Ovidio. I'm a senior container specialist solutions architect at AWS. And what Lukonde said is true, because I told him, look, I did this demo for a customer. And Lukonde said, yeah, I'm preparing something similar. So since Amsterdam, that has been our story, and what you're going to see right now is what we put together.
All right. Great. Yeah, the culmination of all of our engagements. So my name is Lukonde Mwila. You can also call me Luke. I'm a developer advocate for Kubernetes at AWS, part of the Amazon EKS team, and I'm also a CNCF ambassador. For any of you that are interested in connecting with me, please feel free to do so online via LinkedIn, but I'm also more than happy to chat with you after this session. If you have any questions or comments, I'm more than happy to hear those.
And for those of you that are interested in watching cloud native content, I encourage you to check out the AWS container-specific channel on YouTube called Containers from the Couch. You can also check out my YouTube channel; just search for me using my full name.
Right. So with that out of the way, Ovidio and I actually had this idea to kick off the session with a particular exercise that involved handing out balls of yarn. But we decided not to go with that because we thought it might get messy really quickly, even though that would actually highlight the kind of problem domain that we are dealing with in this session. So I'm just going to describe it. Imagine we had a number of balls of yarn and we handed those out to folks in the front row. Then we asked each individual that received a ball of yarn to hold onto a piece of it and pass it on to the next person, and to repeat this over and over until everyone in the room is holding some part of the yarn. Now, some of you might be wondering, what would be the point of that exercise? As great as it would be to come together to build the largest Istio sweater ever made at a KubeCon event, that wouldn't be the purpose. But that would be cool. The point would be to highlight what a number of environments look like today, and if your environment doesn't look like that, well, it might be headed in that direction. Specifically, if we got a bird's-eye view looking down and you saw all these strands across the room, a lot of environments look like that today because of all the interconnections, relationships, and dependencies between different microservices and subsystems. These kinds of environments are increasingly dependent on the network for service-to-service communication, and that creates what we like to call a web of complexity.
You have a combination of microservices, legacy systems, monolithic architectures, perhaps some cloud functions thrown in there. This web of complexity is distributed and very heterogeneous in nature, and it typically doesn't exist in one environment. You could have it in the public cloud, in the private cloud, and on premises as well. For some organizations, this may span different business units within a single org, but also different orgs altogether. The goal for these organizations is ultimately to enhance their software architecture and their digital products, to make them better and create a larger ecosystem. And that typically comes with more integrations that look like this. So this web of complexity is not going away any time soon. In fact, it's only going to get more distributed and heterogeneous in nature.
Now, given that this is highly dependent on the network and on service-to-service communication, there is a serious security implication. How do you establish trust in this modern web of complexity? Now more than ever, it's important to be able to validate a peer and to ensure that your services are communicating with the right components, and not only that, but to have secure interactions between them as well. Establishing trust in this large web of complexity has become very important, and increasingly so. Now, some of you might say, well, isn't the most straightforward path to simply take the existing security identity models associated with each of the independent applications or workloads and bridge them, integrating them together? Well, that's one approach, but I would argue that it's not so straightforward. In fact, it's highly likely that you may end up in one of the following pitfalls.
You have to be prepared for a number of ongoing meetings and exercises to try to find alignment between the different security identity models. These kinds of meetings will typically involve your platform, security, and application development teams, trying to make sure that they actually align before any implementation takes place, with a clear understanding of the threat modeling across the board. The outcomes of those meetings will typically be implemented by the application developers, who then have to continuously modify microservices or subsystems to ensure that they're catering to the different security identity models, and this juggling process is not so straightforward. It's hard enough to manage a single security identity model, because security in the software space is relatively complex. And as good as your application developers are, and I'm confident that they are, it's highly likely that there will be security misconfigurations. Covering all your bases in such an environment is difficult.
For those of you that don't have many services, maybe this seems relatively straightforward for now. But as your system continues to grow in complexity and size, as I mentioned before with the product enhancements and evolution of your software, this will become increasingly difficult. Because of that, a number of teams that I've spoken to choose to defer the matter altogether. They say, well, we're just going to kick the can down the road and we'll circle back to this at some point. But the problem with kicking this particular can down the road is that it gets bigger and heavier the more you kick it, because it's highly unlikely that your software architecture is going to slow down in its evolution. More services will be introduced with more integrations.
And with that increasing complexity and size, the requirements to secure it at large will also increase in complexity and size. So that brings us to the question: how do you secure your web of complexity? Or, how do you establish trust in this widely distributed and heterogeneous environment? Before we delve into the details of that, we want to highlight the fact that this domain of trust and security is essentially just one piece in the puzzle of application networking at large. A service mesh like Istio can help you accomplish this because it deals with application networking in the bigger context as well. And you can use Istio to build what is known as a domain of trust using the service mesh. This is similar to what Jackie covered earlier.
So I just want to walk through what you would get by default if you were to install Istio and get it running. Remember, the mesh is essentially functioning as a domain of trust for the workloads that exist inside of it. You have istiod, which will function as the certificate authority by default, and certificate signing requests will be sent from the workload component. I'm being rather vague when I say "the workload component"; you can see that there are different elements within it. There's the microservice, there's the Envoy proxy, and there's the Istio agent. So the certificate signing request is sent to istiod, which is responsible for signing it. It will validate the particular workload before it issues it a digital identity. And that digital identity comes in the form of a SPIFFE Verifiable Identity Document, or SVID. I'll elaborate on that briefly. The SVID that Istio issues is an X.509 certificate, and it is part of the SPIFFE specification.
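To make the identity inside that document a bit more concrete, here is a minimal Python sketch. It assumes the Istio convention for SPIFFE IDs, `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>`, which is carried in the URI SAN of the X.509 certificate. The names `WorkloadIdentity` and `parse_istio_spiffe_id`, and the example ID itself, are hypothetical and only there for illustration; this is not how istiod or Envoy is implemented.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class WorkloadIdentity:
    """The three parts of an Istio-style SPIFFE ID (illustrative type)."""
    trust_domain: str
    namespace: str
    service_account: str


def parse_istio_spiffe_id(spiffe_id: str) -> WorkloadIdentity:
    """Split spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    # Assumed path layout: /ns/<namespace>/sa/<service-account>
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "ns" or parts[2] != "sa":
        raise ValueError(f"unexpected path layout: {parsed.path!r}")
    return WorkloadIdentity(parsed.netloc, parts[1], parts[3])


# Hypothetical ID for a hello world workload in the default namespace.
identity = parse_istio_spiffe_id("spiffe://cluster.local/ns/default/sa/helloworld")
print(identity)
```

The trust domain is the part that matters for the rest of this talk: two workloads can only validate each other if each side holds the certificate bundle for the other's trust domain.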
SPIFFE, in more detail, is an open source framework that consists of different specifications for universal attestation and issuance of identities. Now, Istio only uses part of that specification, for the SPIFFE Verifiable Identity Documents that are issued to the different workloads inside of the mesh. That allows the workloads inside of the mesh to have a specific identity, as well as to validate their peers to enable mutual authentication. There are also some contexts where you can expand this domain of trust across different meshes, and this can be particularly useful for project isolation. You can then have what is known as mesh federation between your different Istio service meshes to establish trust between the workloads that exist in the distinct meshes, as depicted in this diagram. You can also move the root certificate authority up and out of istiod and have a different component functioning in that role, whether it's cert-manager, or SPIRE as was covered earlier, or ACM PCA as depicted in this particular diagram.
Now that probably begs the question for some of you: does that mean that for our web of complexity we have to move every single workload into Istio? That's not what we're saying. And even if that is your desire, it's highly unlikely that the roadmap will align for you to have every single workload in distinct meshes across the board in your organization. So the next question we have to ask is, how do you maintain a single security identity standard for workloads that are inside as well as outside of your Istio service mesh? Because within Istio we have a single identity standard, but we want to be able to extend that to the workloads that are outside of that particular mesh. If you think back to the web of complexity, I mentioned cloud functions and monoliths and other legacy systems that may exist in your larger architecture.
Those integrations may even span across organizations, and you might not have much of a say in another org adopting a service mesh. So we're going to walk through how you can use a platform-agnostic, interoperable approach using SPIRE. SPIRE is a production-ready implementation of SPIFFE. Now, what does this process actually look like for these workloads? A key objective here is to build a domain of trust. Like I said, Istio gives us a domain of trust within the mesh itself, but we're looking at building more of a universal domain of trust that goes beyond the walls of a service mesh. The process to build this domain of trust is to have every workload start from a position of no implicit trust whatsoever, so zero trust. From there, it first gets attested based on preconfigured criteria, a set of characteristics that you define, and only once it passes that attestation process can it receive a digital identity, accompanied by a digital document that proves that identity. Workloads can exist and participate in this domain of trust only once they've gone through this life cycle, and that enables secure communication with their peers inside of the mesh. And you can follow this process with the different workloads and infrastructure components because SPIRE is interoperable.
Now I want to explore this a little further with an analogy that I usually find helpful when speaking about SPIRE. A couple of years ago I was working at a consulting firm, and they issued me a company card. That company card gave me access to the Johannesburg office building, and it also confirmed my employment status and identity. Now, while I was working for that firm, I was consulting on their behalf at a financial institution.
Now, when I first tried to access the financial institution, I mean, I knew this wasn't going to work, but I thought I'd do it anyway: I tried to use the same company card from the consulting firm. Anyone ever tried using an access card somewhere and it just didn't work? A couple of people. I know some of you are just pretending; I can see a few hands going like that. All right. It's okay, no one's watching. The reason that happened is because these are essentially independent entities, right? They have distinct trust domains. Just because I'm trusted in one domain doesn't mean I'm trusted in the other. So I had to undergo an attestation process all over again with the financial institution, and once that was done, they issued me a company access card. If I happened to be engaging with multiple companies at that time, say three or four more, that's three or four more company cards. And if you're like me, sometimes the company card doesn't make it into the bag; you leave it at home. The point is that the juggling process can be difficult, and this is akin to what a number of microservices or workloads undergo when they have to juggle multiple security identity models. The application developers behind them have to continuously modify these workloads in order to cater to the different identity models and establish trust.
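The attest-first, issue-second life cycle described earlier can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not SPIRE's implementation: `RegistrationEntry` and `identities_for` are hypothetical names, and the selector strings only imitate the `k8s:ns:` and `k8s:sa:` style used by SPIRE's Kubernetes workload attestor.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class RegistrationEntry:
    """Preconfigured criteria: the identity to issue and what must be observed."""
    spiffe_id: str
    selectors: FrozenSet[str]


def identities_for(observed: Set[str], entries: List[RegistrationEntry]) -> List[str]:
    """Return every identity whose required selectors were all observed.

    An empty result is the zero-trust default: no attestation, no identity.
    """
    return [e.spiffe_id for e in entries if e.selectors <= observed]


# Hypothetical registration for a hello world workload.
entries = [
    RegistrationEntry(
        spiffe_id="spiffe://foo.example/ns/default/sa/helloworld",
        selectors=frozenset({"k8s:ns:default", "k8s:sa:helloworld"}),
    )
]

attested = identities_for({"k8s:ns:default", "k8s:sa:helloworld"}, entries)
rejected = identities_for({"k8s:ns:default", "k8s:sa:other"}, entries)
print(attested, rejected)
```

In the card analogy, the registration entry plays the role of the institution's own checklist: holding a card from somewhere else does not satisfy it.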
What's more desirable is a passport-like model. I have a single passport, I don't juggle multiple ones, and I use that single passport when I go to different foreign nations. Each foreign nation, similar to the companies, functions as an independent trust domain with an issuing authority, but there are bridges of trust between the foreign nations, including my country of origin. So first they validate the legitimacy of the document that I present, and only then do they use it for verification, as they check me to see, are you who you say you are? This is the same thing that we want to translate into the digital space, and SPIRE helps us accomplish that. We have an interoperable mechanism for first attesting workloads, and only after they've been attested do they receive a digital document that identifies them and allows them to have secure communication within the trust domain. But the next question is, how does this interoperable integration work in the context of SPIRE and Istio, so that you can have a universal mechanism across multiple meshes as well as domains beyond your Istio service mesh? I'm going to hand over to Ovidio to take it from here.
Thank you. So the whole story started at the beginning of the year, when a customer approached us because they had a use case: they wanted to have several Kubernetes clusters with different root domains, with the service mesh between them. Actually, the architecture was much more complex than this one. For each of the clusters they are using cert-manager as a root CA, and in this case SPIRE will act as an intermediate CA, not as a root CA. What they wanted to do is establish trust between one cluster, a management cluster, and several other Kubernetes clusters. Okay. That was a bit challenging, and I started to create a POC for that. All right. It ended up something like this.
So I have an EKS cluster, the one on the left, which was like the management cluster, and then I created another cluster that we will also see in our live demo. I tried to put the workloads there, with the federation between them, and it worked. And the customer said, yeah, okay, it's all good, but I don't want to use the east-west gateway, I want to use the ingress gateway. I want to access a virtual service through the ingress gateway, and I want the communication to be from my management cluster to all the other clusters and not also backwards. So the trust bundle will go one way, not both ways. All right. So in our demo you will see that I still have the east-west gateway at the beginning, but then I'm going to cut this access, and we're going to access the service through the virtual service and through the ingress gateway. Okay.
Now, I'm going to go briefly through the demo, and I'm also going to show you how I built it. First of all, for the whole infrastructure and for the EKS clusters, I used Terraform. What I did here, and I'm going to go through it briefly with you, is I created two VPCs, one for the foo cluster and another VPC for the bar cluster. All right. So I have two clusters in this scenario, and then in these modules here I also have the peering between the two, along with the associated routes that are necessary for the VPCs and the clusters to talk to each other. Now, after I have this, I'm going to go briefly through one of them. I just installed an EKS cluster in this case, with the Terraform AWS modules. Okay.
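The one-way trust bundle mentioned above can be modeled with a small Python sketch. It only illustrates the general idea that a trust domain can verify a peer if it holds that peer's bundle, so importing a bundle in one direction gives one-directional trust. The class `TrustDomain`, the domain names `foo.example` and `bar.example`, and the choice of which side imports which bundle are all assumptions for illustration; the talk does not spell out those details.

```python
from urllib.parse import urlparse


class TrustDomain:
    """A trust domain plus the set of domains whose bundles it holds."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.known_bundles = {name}  # a domain always holds its own bundle

    def import_bundle_from(self, other: "TrustDomain") -> None:
        """One-directional: only this domain learns to verify the other."""
        self.known_bundles.add(other.name)

    def can_verify(self, peer_spiffe_id: str) -> bool:
        """True if we hold a bundle for the peer's trust domain."""
        return urlparse(peer_spiffe_id).netloc in self.known_bundles


# Hypothetical trust domains for the two demo clusters.
foo = TrustDomain("foo.example")
bar = TrustDomain("bar.example")

# Assumed direction, for illustration only: bar receives foo's bundle.
bar.import_bundle_from(foo)

print(bar.can_verify("spiffe://foo.example/ns/default/sa/helloworld"))
print(foo.can_verify("spiffe://bar.example/ns/default/sa/helloworld"))
```

Exchanging bundles in both directions would simply be a second `import_bundle_from` call on the other side, which is the each-way federation the customer did not want.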
And together with this I installed the EKS Blueprints add-ons, because there are some necessary add-ons that need to be added, and also cert-manager, because, you remember, cert-manager was the root CA. After this, once cert-manager is installed, I also passed the self-signed CA for cert-manager, so every cluster would be like an independent trust domain. It's the same for the other cluster as well, so I'm not going to go through that one. After this cluster was installed, it was time for SPIRE to be installed. Now, a couple of weeks ago I saw that there is a Helm chart available, and I'm planning to add that Helm chart into these Terraform modules, so we won't have a separate installation for that, and then do it all together with Istio.
So in the SPIRE YAML for the first cluster I have everything defined. What I want to highlight for you are only a couple of things. First of all, in order for cert-manager to be the root CA, you need to add cert-manager to the ClusterRole, because it needs to be part of it. Second, in the ConfigMap of the SPIRE server, here I'm defining the bundle endpoint for the federation. The SPIRE server gets installed with some tolerations, only on specific nodes; nothing else gets installed on those nodes, only the SPIRE server. And then I'm telling SPIRE that the upstream authority is cert-manager. After this, there is a script here that will do everything for you. It will exchange the bundles as well, and everything is automated.
Then, for the Istio installation, I have two configuration YAMLs, one for each cluster, obviously. With the Istio operator I'm adding the sidecar injection webhook for SPIRE, and these are also part of the ingress gateway, and also of the east-west gateway in case you want that as well. You need to have istioctl in the path in this case, and this script here will get this automated for you. After SPIRE is integrated with the Helm chart in Terraform, Istio can be added as well, and everything will be deployed with Terraform only.
Now, this is the whole setup. In the examples here I'm using the hello world sample from Istio, nothing fancier than that. In hello world I'm just defining hello world v1 for one cluster, and you have to add the annotation for the federation. As part of that, the SPIRE sidecar will be added to it and an SVID will be issued to it. The same is happening for bar, and here you'll see that this is hello world v2.
Now let me show you the demo. It will not take long; the preparation was more complicated. All right, so we have two clusters here. I have some notes here, if you don't mind. If you can see from the back, here I have the SPIRE server and the SPIRE agents, because I have three nodes, then cert-manager, and hello world v1 on the first node, together with the Istio system. The same is happening for the second cluster, same setup, although what is different here is that there is a different version of the hello world application, which is hello world v2.
Now what I'm going to do is validate the east-west gateway traffic, which is what I did at the beginning with the customer. I have a command here; it's a long command. With this command, from cluster one, I'm going for the hello world service, and because I don't have any network policies, I have no restrictions between the clusters, so both services from both clusters should respond. All right, so it's responding with either v1 or v2, because I don't have any restrictions between them through the east-west gateway. Now what I'm going to do is scale down hello world v1 to zero. And I'm going to check: right now on cluster one I have none, and on cluster two I have v2 and the virtual service that I deployed earlier. Now I'm getting these variables for the gateway URL, nothing that you don't already know. And now I'm accessing the service through the ingress gateway and the virtual service. So from cluster one, I'm accessing the virtual service that is on cluster two, and it's running through the ingress gateway. Right. Now, this is the link to the demo repository. We still have one minute and 40 seconds, so if you have any questions, please.
I think we're actually over by a minute. Sorry. Thanks for your patience. For those who are interested in asking any questions, or who just want to chat further about the demonstration or the topic at large, please catch up with us. We're going to head to the back. Thank you so much.