All right, looks like we are finally live here. Thank you so much for joining us on this beautiful day in Amsterdam. By the way, there are still quite a few seats in the front, so feel free to come up and make yourself comfortable; I hate to see you standing. Today we're going to talk about operating a multi-tenancy service mesh with Argo CD in production.

My name is Lin Sun. I'm the director of open source at solo.io; it's a small company. How many of you have heard of our company? All right, wow, many of you. Thank you. I've been a long-time Istio maintainer, one of the founding members of the Istio community, and I've actually written two books about Istio. My most recent book is Istio Ambient Explained. How many of you have actually tried Istio before? Wow, many of you. Awesome. And how many of you have heard about Istio ambient? Wow, are you surprised? All right, awesome. If you want to learn a little more about ambient, we do have a book signing tonight with me and my co-author Christian Posta. And since I don't have my family traveling with me in Amsterdam, I thought I'd share our team dinner picture and the walk last night. I hope you're enjoying the food in Amsterdam, and the canals too. Now I'm going to pass to my co-speaker, Faseela.

Hello, everyone. I'm Faseela. I work as a cloud native developer at Ericsson Software Technology, which is the open source software development division of Ericsson. I primarily work from the Ericsson Eurolab office in Germany, and my role includes prioritizing telco cloud requirements in the open source communities, especially Istio and Envoy. I'm a maintainer in some of the working groups in Istio and Envoy proxy. Prior to Istio, I also served on the Technical Steering Committee of OpenDaylight and some of the other networking projects under the Linux Foundation, and I'm currently a Steering Committee member of Istio as well. I'm originally from India, and that's my family. I have an eight-year-old daughter who had an amazing time here last Sunday at the CNCF Kids Day; thanks to the CNCF for that. And now over to you, Lin, to get started with our presentation.

Awesome, thank you so much, Faseela. Can you guys all hear her fine, even in the back? OK, awesome. Thank you. So today we're going to give you an overview of service mesh. It sounds like you all know Istio and service mesh pretty well, so we're going to jump fast through that section. Then we're going to talk about the different multi-tenancy models in an Istio service mesh, then do a bit of a demo of operating multi-tenancy with Istio and the Argo projects, and then wrap up with best practices for multi-tenancy with an Istio service mesh.

So what is a service mesh? Do you all know what a service mesh is? Raise your hand. All right, looks like almost everybody. Essentially, it's a programmable framework that allows you to connect, secure, and observe your microservices without writing that code in your application. A service mesh architecture typically has a control plane: you interact with the control plane to set up your network policy and your security policy, and the control plane programs all the sidecars to do the policy enforcement. Within the Istio project, and across the service mesh industry, we've also seen some new innovation coming in without the sidecars.
So how many of you would be interested in running a service mesh without a sidecar? Wow, that's a lot of you. OK, so it sounds like we're on the right track with this evolution, right? Essentially, what we believe is that we want to slice the architecture into two layers: layer 3 and layer 4 as one layer, and layer 7 as the second layer. The reason we slice the architecture this way is that we keep layer 3/layer 4 very minimal, focused on security and networking, so that we can run a node-level proxy that handles all your policy enforcement and mutual TLS. In this case, all the co-located pods share the same proxy on that particular node. For layer 7 traffic, in the Istio project we introduce a waypoint proxy, which is dedicated to your tenant: that waypoint proxy serves either your namespace or your service account, and you are not sharing it with any other tenant. This is extremely important to the architecture for us, because we don't believe a shared proxy handles multi-tenancy inherently, or isolates noisy neighbors during layer 7 processing. Layer 7 processing is also very complicated, and you don't want one tenant's layer 7 processing to bring down the proxy and impact the other tenants.

How many of you are familiar with the CNCF service mesh landscape? I would expect every one of you. If you look at the landscape, it's very, very daunting: there are many, many service mesh projects out there. I would say the ones you've probably heard of most are the Istio service mesh and Linkerd. In case you haven't heard, Istio is a CNCF incubating project, and as a project we're working very hard to make it a graduated project at CNCF.

So now we're going to jump in to talk about multi-tenancy in an Istio service mesh, since both of us are on the Istio project and are inherently familiar with it. We're going to start with a single mesh with multiple tenants, multiple teams. This is the most widely deployed multi-tenancy model in the Istio project today, and it's also the default. So envision you have a single Istio mesh with multiple teams. You probably have different roles in your organization: you have the mesh admin, you have the operator, and you also have the individual teams producing their services. The teams are typically service producers or service consumers, depending on which role they are serving. And you probably won't deploy in just one Kubernetes cluster: typically your teams span multiple clusters, and some of your services may even run across different networks and on virtual machines.

In Istio, we'll use a really simple example with a couple of services, web-api, recommendation, and purchase-history, where web-api is exposed through the Istio ingress gateway so that it can be accessed by clients outside the cluster. Using this example, you would typically create an Istio Gateway resource and an Istio VirtualService resource to expose the web-api service through the gateway on a particular host and a particular port, roughly as sketched below. This is the default behavior of Istio, but there's one problem: the configuration you apply for web-api is visible to the entire mesh, which may not be exactly what you want, right? Because not everybody is going to consume the web-api service, and if you recall our diagram, the only consumer of the web-api service is the ingress gateway.
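For reference, a minimal sketch of that default exposure. This is not the talk's actual manifest: the resource names, host, and port are illustrative, and the real demo manifests live in the speakers' GitOps repo.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: web-api-gateway           # hypothetical name
  namespace: istio-ingress        # where the ingress gateway deployment runs
spec:
  selector:
    istio: ingressgateway         # binds to the default Istio ingress gateway pods
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "web-api.example.com"       # hypothetical external host
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: web-api
  namespace: web-api              # the producer's namespace
spec:
  hosts:
  - "web-api.example.com"
  gateways:
  - istio-ingress/web-api-gateway
  http:
  - route:
    - destination:
        host: web-api.web-api.svc.cluster.local
        port:
          number: 8080            # hypothetical service port
```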
So let's see how you can potentially do this more safely with Istio, to make sure the configurations from the web-api team don't impact other tenants. In this case, we're going to look at what you need to do in Istio from both the service consumer side and the service producer side. Just to make sure we're on the same page with terminology: when application A calls application B, application A is the service consumer and application B is the service producer. In our simple example here, the Istio ingress gateway is the consumer and web-api is the producer.

With that, let's talk about how to do this with Istio; a sketch of all four pieces follows below. The first thing you need to do is apply an authorization policy, making sure only the required services can access the web-api service: only the absolute minimum needed, which here is the Istio ingress gateway. The second thing you can do is add the exportTo annotation to your producer service, to say: I want to scope this service to be available only in the current namespace, along with the istio-ingress namespace, which is where the Istio ingress gateway resides. That's the service producer side.

On the service consumer side, you can apply exportTo on the route configuration, which is the VirtualService in Istio. So you can say: the scope of this particular VirtualService is applicable only to the istio-ingress namespace. In addition, you can apply the Sidecar resource on the consumer side to say: I'm restricting the ingress gateway's traffic so it can only go to the web-api namespace, in addition to its own namespace and the istio-system namespace, which is where the Istio control plane resides. That's how you can configure tenant isolation with one single mesh across multiple teams.
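A hedged sketch of those four pieces, assuming hypothetical namespace names (web-api for the producer, istio-ingress for the gateway) and a hypothetical gateway service account; adjust to your own install.

```yaml
# Producer side: allow only the ingress gateway to call web-api
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: web-api-allow-ingress
  namespace: web-api
spec:
  selector:
    matchLabels:
      app: web-api
  action: ALLOW
  rules:
  - from:
    - source:
        # hypothetical service account used by the ingress gateway pods
        principals: ["cluster.local/ns/istio-ingress/sa/istio-ingressgateway-service-account"]
---
# Producer side: scope the Service's visibility to its own namespace ('.')
# plus the istio-ingress namespace
apiVersion: v1
kind: Service
metadata:
  name: web-api
  namespace: web-api
  annotations:
    networking.istio.io/exportTo: ".,istio-ingress"
spec:
  selector:
    app: web-api
  ports:
  - port: 8080
---
# Consumer side: scope the route configuration (the VirtualService from the
# earlier sketch) to the istio-ingress namespace only
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: web-api
  namespace: web-api
spec:
  exportTo:
  - "istio-ingress"
  hosts:
  - "web-api.example.com"
  gateways:
  - istio-ingress/web-api-gateway
  http:
  - route:
    - destination:
        host: web-api.web-api.svc.cluster.local
---
# Consumer side: restrict the ingress gateway's egress to only what it needs
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: istio-ingress
spec:
  egress:
  - hosts:
    - "./*"             # its own namespace
    - "web-api/*"       # the producer namespace
    - "istio-system/*"  # the Istio control plane
```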
Now I'm going to pass to Faseela to talk about the next model.

OK, so I think it's now clear how the single control plane based model works, and I think that's the most widely used model these days. But for us, with the telco 5G core requirements, it was always the case that our platform comes as just a bare-minimum Kubernetes cluster, with no service mesh installed. Then we have our CNFs, which are basically applications, or network functions, that run on this cluster. Each CNF may be from a different vendor, and each vendor may have their own service mesh solution. In such cases, it was getting difficult to follow the single control plane based model. Maybe that will change in the future, if Istio, similar to Kubernetes, comes by default as a standard part of the cluster, so the scenario might be different later on. But as of now, we do require mechanisms to run multiple Istio control planes in a single cluster, and that's the reason we were spending effort on getting this working.

It's basically like this diagram: different istiods are running, and I've just classified them as user group one and user group two, or you can call them tenant one and tenant two. And you have different applications running in your cluster. It's a single cluster, where one set of app namespaces is managed by one control plane, and similarly, other applications are managed by a different control plane.

So to get this working, we started different efforts in Istio to formalize how to run this properly with multiple control planes. There were minor issues here and there, which we spent effort on and fixed one by one. Finally, this is now available as an experimental feature in Istio, which gets this up and running. A tenant can own one or more application namespaces, each tenant has their own unique control plane, and the resources are isolated using the concepts of revisions and discovery selectors. Lin already showed how things can be isolated using the exportTo configuration, and all of that was already explained, but we were focusing more on the discovery selector mechanism. I think that was also contributed by Solo a while back, but at that time it was not extended to custom resources. So what we basically did is extend the discovery selectors to also work for the custom resources in Istio. And now, with that, we no longer need any of the exportTo configuration for isolation. In fact, this is not restricted to multiple control planes: even all the examples explained before with a single control plane can be enhanced using this feature flag with discovery selectors. The concept of revisions is basically used for canary upgrades in Istio, that's where it's heavily used, but the same concept is being used for multiple control planes for multi-tenancy as well.

This diagram shows the configuration you need to get this all up and running; a sketch follows below. As you can see, you don't have to customize all your custom resources with exportTo and everything. It's just the initial IstioOperator configuration when you're bringing up your control plane: you just need to specify the discovery selectors and revisions. Then you can enable a PeerAuthentication policy per tenant system namespace so that cross-tenant communication gets disabled by default, because each control plane will have its own root CA. Once strict peer authentication is enabled, cross-user-group communication is disabled by default, and if you then want to enable it for some purpose, you have to use a gateway and explicitly enable that communication.
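A minimal sketch of that per-tenant control plane configuration, assuming a hypothetical tenant1 system namespace and a hypothetical tenant label on the application namespaces. The ENABLE_ENHANCED_RESOURCE_SCOPING pilot flag is, at the time of the talk, the experimental switch that extends discovery selectors to Istio custom resources.

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: tenant1-control-plane
  namespace: tenant1              # hypothetical per-tenant system namespace
spec:
  profile: minimal
  revision: tenant1               # unique revision per control plane
  meshConfig:
    discoverySelectors:           # istiod only watches namespaces matching these labels
    - matchLabels:
        tenant: tenant1           # hypothetical label applied to this tenant's namespaces
  values:
    global:
      istioNamespace: tenant1     # treat the tenant namespace as this mesh's system namespace
    pilot:
      env:
        # experimental flag extending discoverySelectors to Istio custom resources
        ENABLE_ENHANCED_RESOURCE_SCOPING: "true"
---
# Strict mTLS per tenant: each control plane has its own root CA, so
# cross-tenant calls fail by default
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: tenant1
spec:
  mtls:
    mode: STRICT
```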
So yes, for the demo we are using Argo CD. I think most of you will be familiar with it by now, because yesterday we had plenty of co-located events, Istio day and the Argo day as well, and Lin was giving a presentation there too, using Argo Rollouts. So in this session we are using Argo CD to deploy the multi-tenant applications, and we will show how it works first with the example Lin showed, and then later with the multiple control plane example. So maybe I'll just do the demo first and then we'll go to the subsequent slides. Fine? Yeah, sounds good. By the way, Faseela has been fasting, so her voice is very soft; she hasn't drunk any water since 4 a.m. today.

So for the first demo, what I have planned is this: Istio is already running in the cluster, and Argo CD is there as well. I have created an argocd namespace and deployed all the Argo workloads there. Now we're going to deploy the example Lin showed before as an application in this cluster; it's a local cluster. I have my Argo CD UI already open here, so I'm going to create a new application. You know that for Argo CD, the source of truth is a Git repository. I have created a GitOps repo with the multi-tenancy manifests: there is a single control plane folder and a multiple control planes folder, and the examples Lin showed initially are under the single control plane folder. In case something is not clear to anybody, it's all available publicly in my GitHub, so you can go back, refer to the README, and try it out for yourself.

So first, let's deploy the application. I'll just name it single-control-plane, and I'll set the sync policy to manual for the sake of this demo. Now we have to specify the repository URL from which we want to deploy the manifests. By the way, can you guys see well from the back? Is the size OK? Good? No? OK, maybe make it a little bigger, Faseela. Thank you. Now it asks for the folder from which we should deploy the manifests, so in this case I'll specify the single control plane folder. The destination is the local cluster, and I'm not specifying a namespace here because I already have the namespaces defined in my manifest files. And I do not need any of these remaining options; directory recursion is also not needed. So let's just create it.
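The UI steps above correspond roughly to an Application manifest like the following sketch. The repository URL is a placeholder, since the demo repo's URL isn't captured in the transcript; everything else mirrors what was clicked in the UI.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: single-control-plane
  namespace: argocd
spec:
  project: default
  source:
    # placeholder: the demo's GitOps repo URL isn't captured in the transcript
    repoURL: https://github.com/<user>/<gitops-repo>
    targetRevision: HEAD
    path: single-control-plane      # folder holding the single control plane manifests
  destination:
    server: https://kubernetes.default.svc   # the local cluster
    # no namespace set here: the manifests define their own namespaces
  # no syncPolicy block: sync remains manual, as in the demo
```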
Since I set the sync policy to manual, let me just do the initial sync. Sorry if the laptop is a bit slow, but yeah, now you can see the status here. Let's check my cluster: you can see that the three namespaces from the examples Lin showed are created, and all the workloads are created here as well. This particular application has a UI, which we can fetch here, and you can see that it says "RBAC: access denied". From her examples, you saw that when you have different namespaces, the best thing to do is to disable everything by default, and then, one by one, enable only the calls that are really needed and allow only those to go through. Currently I do not have the manifests with exportTo and the Sidecar configuration here. All I have is a basic deny-all authorization policy, which says do not allow any calls in any of these namespaces, and a default Sidecar configuration that disables all cross-namespace configuration.

Now what I'm going to do is commit the additional configuration from the example: the authorization policies, exportTo, and everything required for these workloads and services to communicate with each other. I would have committed all of this beforehand, but I thought it was better to show the Argo CD way of doing things. So now I have committed these additional resources; if you look, the authorization policies and everything have come in. And now if you go back to the Argo CD UI and do a refresh, you will see that the application is out of sync. If you look closer, you'll see there are several new things committed to the Git repository which are not in my cluster, and it says a sync is needed. So what we'll do is just do a sync, and now, yes, the new resources are created in the cluster and everything is synced properly.

And let's see, yes, now it works: this is just a very basic application which shows you the service invocation. You can see that web-api is calling the recommendation service, and the recommendation service is calling the purchase-history service. Now, with all these new resources created, I can show you one example of an authorization policy that says from which source what operation is allowed. Similarly, one by one, we explicitly allow calls to go through, and that makes the whole application work fine. Even though it might be different teams managing the different workloads, we make sure that by default everything is disabled, and then, securely, only what is needed is allowed.
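A hedged sketch of those two policy layers, with hypothetical namespace and service account names (the demo's actual manifests are in the public repo): a deny-all baseline per namespace, plus one explicit allow per required call.

```yaml
# Baseline: an AuthorizationPolicy with an empty spec matches every workload
# in its namespace and allows nothing, i.e. deny all
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: recommendation        # hypothetical; repeated in each app namespace
spec: {}
---
# Explicit allow: permit only web-api to call recommendation, and only GETs
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-web-api
  namespace: recommendation
spec:
  selector:
    matchLabels:
      app: recommendation
  action: ALLOW
  rules:
  - from:
    - source:
        # hypothetical service account for the web-api workload
        principals: ["cluster.local/ns/web-api/sa/web-api"]
    to:
    - operation:
        methods: ["GET"]
```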
So that was the first demo. Now, for the multiple control planes part, I first need to delete this application. It takes a little bit of time to get deleted, and once it's deleted, this time we're going to use the Argo CD UI itself to deploy both the control planes and the applications. Before that, since previously I had installed Istio directly, it's still here, so I'll just clean it up. You can see that I do not have any additional namespaces anymore; everything that was created for the first example is cleaned up. And now I'm going to create a new multiple control planes application. I hope we still have time.

So let's again take the Git repository URL. This time the path is different, the one for the multiple control planes, and I'm going to deploy it as two applications, because I want to make sure the dependency part is taken care of: we need the Istio control planes to be available before we deploy the applications. Yes, Argo CD has mechanisms to order things, but I do not have that in the demo yet, so let's just do it this way. So I'll first deploy the control planes, and I'll do the initial sync because the sync policy was manual. And now this is going to deploy two Istio control planes in the same cluster. Let's just go and check that they're created. What I have done in the deployment manifests is create two namespaces, tenant1 and tenant2, each a system namespace where istiod is deployed. So you can see that there are two istiods running in the same cluster, one in the tenant1 namespace and one in the tenant2 namespace.

We do not have any applications yet, so we can create the applications separately. I think it's just checking the health of the system, but maybe we can already start creating the application. Actually, we do not have to do this manually from the UI every time; Argo CD allows you to specify all of this in a YAML file and do it in one shot, but I felt it was better to use the UI like this and show the steps. So I'm going to use the applications folder now. It's the same application that was shown in the diagram before, where we had two tenants and three app namespaces: one app namespace managed by one tenant, and the two other app namespaces managed by the other tenant. So I'm just syncing the apps now, and you can see here that, yes, the three app namespaces are created, app-ns-1 to app-ns-3. For this demo, we are just using the very basic sleep and httpbin pods in each of these app namespaces.

Before checking that, we can use the istioctl proxy-status (ps) command to see which system namespace manages which application workload. You can see that tenant1 is managing the httpbin and sleep pods that are in app-ns-1, and tenant2's istiod is managing the applications in app-ns-2 and app-ns-3. So that's basically what we wanted: to isolate the application workloads from each other based on their tenants, and that's how the applications are deployed.

To check that it's working fine, I have it already in the README here; if you guys want, you can go back and read it. I'll just use this curl command from there. What I'm going to try is to communicate from the sleep pod in application namespace one to the httpbin pod in application namespace two. If you remember the initial model, app-ns-1 and app-ns-2 are under different tenants, so if you try to communicate, you will get "service unavailable", because tenant one's control plane doesn't know about app-ns-2. App-ns-2 and app-ns-3, on the other hand, are actually under the same tenant, so if you try to communicate from app-ns-3 to app-ns-2, that should go through, and it does.

So if you can see here, this all came from just the initial istiod installation: you just had your mesh config with discovery selectors and revisions, and by default you're getting all of this done, and none of your configuration is leaked to other namespaces unless the workloads are under the same tenant. So that's what we wanted to show in the demo. The demo material is all available in the public GitHub, so you can use that, and I have put the references in the slides towards the end as well.
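A sketch of that verification, assuming the hypothetical namespace names app-ns-1 through app-ns-3 and the standard Istio sleep and httpbin samples; the exact commands are in the repo's README.

```sh
# See which istiod manages which proxy; with per-tenant control planes you
# may need to point istioctl at a tenant's system namespace via -i/--istioNamespace
istioctl proxy-status -i tenant1

# Cross-tenant call: sleep in app-ns-1 (tenant 1) to httpbin in app-ns-2 (tenant 2).
# Expected to fail with 503: tenant 1's istiod doesn't know app-ns-2,
# and the two tenants have different root CAs.
kubectl exec deploy/sleep -n app-ns-1 -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://httpbin.app-ns-2:8000/get

# Same-tenant call: sleep in app-ns-3 to httpbin in app-ns-2 (both tenant 2).
# Expected to succeed with 200.
kubectl exec deploy/sleep -n app-ns-3 -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://httpbin.app-ns-2:8000/get
```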
Now, Lin will talk about the other deployment models that are possible with multi-tenancy.

Yeah, thank you, Faseela, that was a great demo. You did awesome; great to see it actually works. So just to summarize multi-tenancy with a single Istio service mesh: you can have multiple teams with a single root of trust. You can run multiple teams within a single cluster, and you can also run multiple teams across different clusters, different networks, and different virtual machines. The difference, though, is that if you do have to call services across different networks or clusters, you most likely have to go through a gateway, as this diagram shows for cross-network traffic.

You can also run multiple Istio control planes within a single cluster, as Faseela showed in her second demo. In this case, though, because a service in tenant A cannot call a service in tenant B directly, which is the 503 she showed, it has to go through a gateway. You could potentially run this multi-cluster too, where each cluster is its own tenant. The key difference here is that each cluster, each istiod, has its own root of trust, which is exactly why the communication is not allowed: even though there may be network connectivity, because they're on different roots of trust, it won't be allowed.

Just to really summarize the different models we've gone through: at Solo, most of our customers are using the first model, which is the native Istio support. It's battle-tested and deployed in most Istio production deployments: the single mesh, whether you're running a single cluster or multiple clusters. This model has a lot of advantages. The only thing is that there is the exportTo setting and there is the Sidecar resource, and you have to make sure you configure those properly. It's also a single root of trust, if that meets your requirements, which for a typical organization it probably does. So that's very important in this model, and it saves resources.

I believe Faseela's team is using multiple meshes, single or multi-cluster. OK, so I already explained the multiple meshes in a single cluster part. With all these recent enhancements, it's readily configurable using the revisions feature and the discovery selectors, and you get a unique root CA per tenant, so you have identity isolation. It aligns well with the default Istio model of a single mesh per Istio control plane, so things are easy to envision, and when you're doing the deployment, it's much harder to make mistakes, unlike the first case, where careful additional configuration is needed to make sure nothing is leaked across the namespaces; that problem is not going to happen here. And if there are cases where separate versions are needed, for example if there are multiple applications with different requirements for the service mesh, if they want separate versions or separate lifecycle management, that becomes possible with this model. But of course you're going to run multiple control planes, so that means more resource consumption compared to the single control plane. And yes, there are also multiple meshes across multiple clusters.

Yeah, one last thing we want to highlight is that you can actually run a mixed multi-tenancy model, right? It really depends on your requirements, particularly regarding the isolation and trust requirements for your organization. So if a single root of trust works well, you could potentially run that on some of your clusters, and then also run a different root of trust, as Faseela showed, in a separate cluster.

With that, I think we are pretty much running out of time. Sorry, it took a little bit long, but I think we may have time for one question. Does anyone have a question? And the references show everything that was done during the demo; this is also already documented in the Istio documentation, so please feel free to explore that documentation to understand it better. Yeah, thank you so much for joining us for this session. We will be here to answer questions for the next 10 to 15 minutes. Thank you all, and enjoy Amsterdam!