Have you ever wondered how to extend your service mesh for your specific business needs? Perhaps you need to add external authorization or advanced rate limiting, or maybe you want to add custom logic in your data path to perform dynamic routing or policy enforcement. I would like to talk about service mesh extensibility patterns to help you with that.

Hi, my name is Lin Sun. I am the Director of Open Source at Solo.io. I have been a long-time contributor to the Istio project. I'm a maintainer and currently also serve on the Technical Oversight Committee of Istio.

So before we get to service mesh extensibility, let's talk about what a service mesh is. As part of the journey to cloud native, organizations face many challenges managing microservices, such as connecting, securing, and observing these microservices. Why? Because microservices come with a lot of challenges when you go from one monolith to many, many, many services. These services need to connect to each other. You need to understand what happens if the network has a failure: how do you do retries and timeouts, how do you handle errors, and how do you observe what really goes on within these microservices so that you can tell which team may have a problem? And also, how do you actually secure the communication among these microservices so that your security team would actually be happy with the microservice framework you have?

So service mesh, in a nutshell, is really there to help you with these problems, so that you don't have to handle how to connect to other services, how to secure the communication, how to send telemetry data, and how to observe your services in your application container. The service mesh provides the proxy for you, which you can run alongside your application container.
And also, most importantly, the service mesh provides an API that allows you to declare your intention on how you want to configure these proxies, and then the service mesh automatically configures these proxies for you so that the proxy follows your intention, but in a language that the proxy can understand. In Istio, the API you interact with as a user is the Istio API, which is a much higher-level abstraction on top of the Envoy configuration. The Istio control plane essentially turns the custom resources you provide, following the Istio API contract, into the Envoy proxy configuration, which could easily be tens of thousands of lines of Envoy configuration with just a few microservices. So there's a strong reason you don't want to do that configuration yourself; you'd rather leverage a service mesh control plane to program the sidecars for you.

So today we're going to talk about what to do if the default control plane and data plane don't fit your business needs and you want to extend them. Let's talk about data plane extensibility first. At the bottom layer we have two applications, and these applications are part of the data plane. As you can see, when app one needs to talk to app two, the proxy is essentially the man in the middle that mediates the traffic, right? Each application container has its own proxy as a sidecar running within the same pod, using Kubernetes as an example.

So you might be wondering how that happens, right? How does traffic always go in and out of the proxy? Within Istio we have something called the init container, which sets up the iptables rules for the pod so that it programmatically configures how the traffic is intercepted and redirected in the pod.
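To make the earlier point about declaring intent through the Istio API more concrete, here is a minimal sketch of what such a custom resource might look like. It is not taken from the talk: the `reviews` host name is borrowed from the Bookinfo sample, and the timeout and retry values are illustrative assumptions.

```yaml
# Sketch only: a VirtualService that declares retry and timeout intent
# for calls to a service named "reviews" (name assumed from Bookinfo).
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
    timeout: 2s              # overall deadline per request (illustrative value)
    retries:
      attempts: 3            # illustrative value
      perTryTimeout: 500ms   # illustrative value
      retryOn: 5xx,connect-failure
```

A resource of roughly this size is the kind of input the control plane expands into the much larger Envoy configuration for every affected sidecar, which is why writing that Envoy configuration by hand is not attractive.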
In fact, the first customization point we're going to talk about is the init container. In the Istio project itself, we support using the init container to configure the iptables rules as the default configuration. But some organizations may find it's really against their security rules to require the NET_ADMIN privilege to deploy applications into the mesh, because the init container needs to configure the networking iptables rules, so it needs the NET_ADMIN privilege. In those cases we provide something called the Istio CNI plugin, so the organization can use the Istio CNI plugin to configure the iptables rules. It runs as a DaemonSet on your Kubernetes worker nodes, and it allows the DaemonSet to configure the pods as they are added to the mesh; it configures the iptables rules. What the init container does in those cases, instead of programmatically configuring the iptables rules, is simply validate that the iptables rules are good and then finish the initialization work so that the proxy and the application container can take effect.

The second extension point is really the proxy image. In Istio you can plug in your own image, so that's one way to extend. The second way to extend is to change the proxy config: we provide a ConfigMap that dictates what you want the default configuration template to be, and you could certainly customize that too. You may want to customize pilot-agent, or you may want to add some libraries onto the proxy image, or you may want to trim down the proxy image. You can provide your own image to do that before you convince upstream, or maybe upstream doesn't have an interest because it's for your own private scenario.
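The first two extension points described above can be sketched in configuration roughly as follows. This is a hedged illustration rather than the setup used in the talk: it assumes an installation driven by the IstioOperator API, and the workload name, labels, and image references (`registry.example.com/...`) are hypothetical placeholders.

```yaml
# Sketch 1 (assumes an IstioOperator-based install): enable the Istio CNI
# plugin so that iptables setup moves from the init container to a
# node-level DaemonSet.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    cni:
      enabled: true
---
# Sketch 2: override the sidecar proxy image for a single workload through
# a pod annotation. Workload name and both image references are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
spec:
  selector:
    matchLabels:
      app: web-api
  template:
    metadata:
      labels:
        app: web-api
      annotations:
        sidecar.istio.io/proxyImage: registry.example.com/mesh/proxyv2-custom:1.0
    spec:
      containers:
      - name: web-api
        image: registry.example.com/apps/web-api:1.0
```

The two documents are applied differently: the first is typically passed to `istioctl install -f`, while the second is an ordinary workload manifest applied with `kubectl apply -f`. They are shown together here only for brevity.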
The third customization we're going to talk about is the EnvoyFilter. The EnvoyFilter essentially allows you to customize the proxy configuration at the Envoy level, in the language that Envoy can understand. It's a very detailed API: you have to understand exactly what the Envoy configuration looks like and how you want to patch it. For example, one of the common scenarios we see our customers use it for is to enable access logs for the proxy; I'll show this in a little bit.

The fourth one we commonly see is using Wasm to provide customizations. The reason Wasm is really interesting here is that it runs at near-native speed and it's very easy to test: you can use a standalone, isolated VM to test your proxy and your proxy extension. It can do dynamic updates without an Envoy restart. It really eliminates the need to recompile and maintain your own build of Envoy, because you can use Wasm to simply build an extension on top of it. What I like most about Wasm is really being able to store the Wasm plug-ins in WebAssembly Hub and being able to consume other people's plug-ins. Think about Docker Hub: how many of you use Docker Hub? I'm sure it's a lot, right? Being able to reuse other people's work is fantastic. So Wasm really enables people to easily catalog and share their work and reuse work from other people.

So these are the four extensions. For the fifth extension point, one thing in this diagram we haven't discussed is the ingress gateway, right, or the egress gateway. Typically you would have an ingress gateway that mediates the traffic coming into the mesh, and you can also customize the ingress gateway using any of the custom logic we mentioned. Well, most of it: you don't need an init container for the ingress gateway. You could customize the proxy image, you could customize the gateway using an EnvoyFilter, and you could also use a WebAssembly module to customize the gateway. So the gateway is really different from the
sidecar proxy because it doesn't do automatic injection on demand, so you have that proxy configuration already up front, and you can just go to the Kubernetes YAML file to customize the gateway configuration. So let's call that the fifth: reusing some of these same techniques, but applied to the gateway.

So we talked about the data plane extension points; let's talk about the control plane extension points. Typically on the control plane side, one common extension point is to build an abstraction over the service mesh API. For example, at IstioCon, Salesforce mentioned they built their abstraction using Helm, so that is one way to build an abstraction. eBay also mentioned they have their own abstraction using their own custom resources. At Solo we also built an abstraction, which is our Gloo API, and that API is a role-based API so that you can focus on roles. So that's one type of extension on the control plane side. The other type of extension is that you can say, I don't want my control plane to serve as a CA; instead I want to plug in my own CA. You could certainly do that, so that your own CA would sign and distribute all the keys and certificates for all the workloads in the mesh. So these are the two common control plane extension points.

Let's go ahead and see a demo. This is my Kubernetes environment running on my laptop, cluster one, as you can see. I have a kind cluster, and in my cluster I have a bunch of stuff installed. I have Istio installed, first of all. I have Bookinfo installed; the only thing with Bookinfo is that I didn't install version three, so I have reviews version one and two. I also have a simple example called istioinaction, in the istioinaction namespace: it's the web API example, the recommendation and the history services, and also sleep. The interesting thing you can see is that these apps are already in the mesh, with 2/2, which means each has a sidecar next to it. So
what I'm going to do is walk you through each of the configurations we talked about, right? Let's take the web API, for example, and check out its configuration. We said the first extension point is the init container, right? So this is the init container used by Istio. As you can see, it configures a bunch of iptables rules, and it captures the incoming traffic and outgoing traffic. After all that configuration is done, the proxy takes its place, and then the application container.

One interesting thing on the proxy is that you can see the image is highly customizable, and a bunch of other configuration is also customizable. What's interesting in this scenario is that we actually enabled the configuration called holdApplicationUntilProxyStarts, set to true. Essentially this says you don't start the application until the proxy is ready; it's right here. This is important because some applications need to reach out to the network at startup, or maybe just for security purposes they don't want the application to do any work before the proxy is running.

Okay, so we talked about one and two; let's talk about the third one, which is using the EnvoyFilter, right? What we're going to do is apply an EnvoyFilter called web API logging audit, so I'll show you what that looks like. Essentially what this says is, I want to apply this only to the sidecar of the web API, and I want to change my logging format to add these fields. The interesting ones here, I would say, are the cert and also the X-Forwarded-Client-Cert header; those are not printed out by default. So let's apply this and generate some traffic to it. What we're going to do is generate some traffic to the web API in istioinaction from the sleep pod, in the sleep container. If you look at the sleep pod in the sleep namespace, you can see it's 1/1, so in
this case, if you look at the logs, they would not have anything interesting. You can see the cert is blank, because no client cert is presented, because the sleep pod doesn't have a sidecar. So let's try to generate the traffic from the sleep container within istioinaction, and you can see this time we actually have 2/2 in there. Let's look at the logs, and you can see this time the logs print out interesting data, right? The cert information we asked to print and the URI are all printed out for you. You can use this technique to check whether the traffic is indeed mutual TLS and what certs and URIs are used in the communication between the two microservices.

The fourth approach we're going to show is using WebAssembly. Let me bring my window over. I have a filter here called my filter. It's written in AssemblyScript, and the only thing I changed in this filter is on the response header side: I want to add hello world through this WebAssembly filter. Let's build the filter and also push it to my registry in WebAssembly Hub. You can see I'm running the wasme build command to build it, and then I'm pushing it to the hub, and you can see it pushed to the hub successfully. Let's quickly validate that in the hub. This is WebAssembly Hub, and you can see my filter, which is the Wasm module I just added 30 seconds ago.

Okay, so we have our Wasm module in WebAssembly Hub; let's go ahead and use it on reviews version one. Through the workload selector I can specify that this is only for reviews version one. Let's take a look at the EnvoyFilter that was generated: as you can see, reviews version one, Wasm. So the corresponding EnvoyFilter is generated for me automatically. Let's check how the filter works. What I'm going to do now is, from the product page, call the
reviews service. Notice we have version one and version two of reviews on the local cluster, so you're going to see it sometimes return hello world and sometimes not. So what we have done is build a Wasm plug-in, push the plug-in to the hub, and then, through the Wasm deployment configuration, bring it into our cluster and have it applied to the reviews version one service.

Next, as you can see, this is my cluster two on the right side, and at the bottom is my management cluster. Cluster two is pretty much the same as cluster one; the main difference is that it doesn't have reviews version one and version two, just to simply show traffic routing and shifting. And if you look at the bottom cluster, which is the management cluster, it has the management layer installed, so that I can deploy my abstracted API, which is the role-based Gloo API we talked about.

I'm going to exit out of the management layer now. What I want to do is apply the virtual destination CRD, which is the abstraction CRD we built on top of the Istio resources. You can see this says I'm creating a reviews.global to represent all the reviews services across the different clusters. The other thing I just created is a traffic policy. The traffic policy allows me to define the failover: if local fails, go ahead and shift the traffic to the virtual destination that we just defined, that is, go to global.

Okay, so now I'm going to go to cluster one. I'm visiting Bookinfo through cluster one's ingress gateway. As you can see, right now it's round-robining between the clusters, because I have versions one, two, and three all in place. But now, if I go here and shut down my reviews version one and two in the first cluster, let's see if Istio can handle that automatically. So let's go here, and now guess what: I'm expecting to only see version three because
version one and version two are down in the first cluster.

Okay, let's check out the configuration for these, because you might be wondering, what's the magic, right? I configured just two resources, the traffic policy and the virtual destination, and now we have a complicated and interesting failover scenario already in place. So what we're going to do is look at the UI to help you understand what's really going on. One thing I like about the UI is that it allows you to look at the debug configuration of Istio. If you go here, you can see this is cluster one and cluster two, and you can see how many virtual services and service entries are actually behind this, including an EnvoyFilter. We talked about EnvoyFilter: in this case we actually have an EnvoyFilter that applies to the gateway to handle ingress gateway traffic across multiple clusters on the SNI port 15443, to say that for traffic addressed to the .global name, change it to the cluster.local name so that the traffic can be forwarded from the gateway to the local service. So this really shows you the power of building an abstraction: your users don't have to create a lot of virtual services, service entries, and destination rules.

Let's wrap it up. As you can see, service mesh extensibility is super powerful. We went through different ways to extend the data plane, which can be applied to the sidecar proxy and also to the gateways. We also went through building an abstraction layer on top of the service mesh API, and plugging in your own CA for your service mesh. We would love to have you share your extensibility stories with us and see if any of these extension patterns fit your requirements. Thank you very much. I would love to hear any questions you have.