Welcome everyone, and thanks for joining us today. We are Hector and myself, Edgar Pardo, and today we are going to share our journey managing Kubernetes clusters at scale, particularly focusing on Kubernetes networking on those edge clusters. Both of us are platform engineers working for Roche.

First, a bit of an introduction to Roche for those of you who don't know it yet. It's a pharma company with more than 100,000 employees across the world, with two big divisions. Diagnostics is the division that builds the instruments that run blood analyses and all kinds of tests in laboratories, plus the devices that interact with those instruments: running software on top, processing data, running the algorithms that analyze it. The second division is Pharma, which develops the medicines that treat patients.

So how did things work at Roche in the past? Roche is present in thousands of laboratories across the world which, apart from hosting those instruments, also host the devices that interact with them. Every time a team needed to deploy a new algorithm or piece of software to interact with those instruments, say to process the data from a blood analysis, that team had to bring its own device to the laboratory. This meant yet another device, on top of which they developed that algorithm. And then, if needed, they would build their own logging and monitoring solution and whatever else the business required.
Then the problem was that we ended up with all those kinds of devices in the laboratory to interact with those instruments, making it super hard to automate anything: every device had different libraries and different languages, and for people in our role who wanted to automate everything it was kind of a nightmare.

This is where our team came in, moving to the second approach: a single platform on a single device where all software development teams inside Roche can deploy their software, a single way of deploying, a single platform to maintain. So basically we are an internal platform inside Roche whose goal is to be present in all the places where Roche has instruments and devices (laboratories, hospitals, doctor's offices and so on), managing clusters at scale on the edge.

But managing those clusters at the edge comes with many challenges, because as you can imagine the storage solution, backups and especially networking get complicated when you don't own the network. Those laboratories don't belong to Roche; we are just inside a private network that we do not manage, and we don't manage the firewall either, so everything is offloaded to the customer. On the other hand, we need to connect to many external services, because those edge clusters are not air-gapped. They need to interact with a regional Kubernetes cluster that we run on a cloud provider: AWS, and for China, AliCloud. That is where we deploy our internal API to interact with those clusters, the GitOps repositories from which the clusters pull all their configuration, the container registry for pulling images, and so on. So those clusters need to connect to many external services, and given that we were not managing the firewall, this ended up meaning that every time we had to deploy a new cluster in a laboratory in the field, we needed to ask for another firewall rule to allow the traffic. Again, it was becoming a nightmare to handle all of that.

So we ended up reserving a static IP range inside Roche for exposing all our external services, and we were leveraging Cloudflare for that so we could use every CDN feature: web application firewall, rate limiting and so on. In the layout you can see that whatever did not match our static range was not allowed by the firewall and was dropped; if it matched our static IP range, then we were allowed to connect to the regional clusters. But this was not covering all the use cases: the software development teams needed, for instance, to connect to a webhook for alerting, to send data reports to a SaaS platform, even to connect to cloud services we do not own. On AWS we do not own the IP range, so we were not covering those use cases and had only limited support for them. And again, if we go back to the scenario where we needed to ask every customer for a new rule on every firewall, given the amount of clusters we are managing, this was not scaling.

So this is the legacy solution we built. Some of the services, the ones with a common scenario, we were proxying through Cloudflare, as a kind of reverse proxy, since Cloudflare allowed us to do that. But some of the services the software development teams had were working at layer 4, TCP or UDP, and we were not able to support them, because Cloudflare doesn't support those through that proxying feature. There were also applications that needed to modify headers on the fly before forwarding to external services, and again this was not covered by Cloudflare. So we ended up placing a second reverse proxy
on our regional Kubernetes cluster, where we were doing the header modification and then forwarding the request. But again this came with a lot of manual work, because every time a new scenario or request came from the teams needing to connect to yet another external service, we had to study that case and check whether it was supported by Cloudflare or had to be offloaded to our regional reverse proxy.

So then we got in contact with the Cilium folks (and again, thank you). Cilium was becoming the de facto networking solution for Kubernetes by that time, so we approached them, explained our problem, and they came back with a very interesting idea: what if you could manage your network traffic declaratively with network policies, not only to secure the traffic but also to forward specific traffic in ways you can configure? For example, creating a WebSocket tunnel to a backend you already have access to from your edges, and encapsulating the original request inside it. That looked very interesting to us, so we started a PoC with Cilium, and we are here today giving this talk. It went pretty well.

But let's step back for a moment and revisit what Cilium Service Mesh is, for the people who are less familiar with it. Cilium Service Mesh is, put simply, a set of features for doing advanced networking. It relies heavily on eBPF, and eBPF is a framework that allows you to load and run programs in a secure and dynamic way inside the Linux kernel, basically extending the Linux kernel on the fly. For our solution we are using two features from Cilium Service Mesh: Cilium network policies and Cilium Envoy configs.

What are Cilium network policies? They are basically Kubernetes network policies on steroids: they work not only at layer 3 and layer 4 but also at layer 7, and this means that you can
filter traffic not only by IP and port; for example, you could block an HTTP request that is missing some HTTP header. What happens is that when you create a Cilium network policy, Cilium configures an eBPF program for that particular port, namespace or cluster, depending on how you scope the policy, and when traffic comes from user space this eBPF program executes the logic of the policy. If you want to allow the traffic, it forwards it to the Linux networking stack; if you want to deny it, it just blocks it; and if you want to redirect it to an Envoy proxy, it does that. We will come back to this in a moment, because it is one of the new features.

Cilium Envoy configs: first of all, Cilium runs one single Envoy proxy instance per Kubernetes cluster node, which means there are no sidecars. When you create a Cilium Envoy config, it basically creates an Envoy listener in this particular Envoy proxy instance. An Envoy listener can carry its own configuration and operate independently within the instance; put very simply, Envoy listeners are to Envoy what cgroups and namespaces are to Linux. There are two CRDs for this: the namespaced CiliumEnvoyConfig resource, and the CiliumClusterwideEnvoyConfig, which applies to the whole cluster.

Coming back to our new solution: Cilium Service Mesh was still very new at that time, so Isovalent had to develop new features to make this WebSocket tunneling possible. The first feature was enriching Cilium network policies with the capability of redirecting the traffic you select in the policy to an Envoy listener defined in a Cilium Envoy config. Additionally, a couple of Envoy filters had to be created for the WebSocket tunneling functionality.
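To make the "network policies on steroids" idea concrete, here is a minimal layer-7 CiliumNetworkPolicy sketch along the lines of the missing-header example above. The namespace, labels and header name are hypothetical, not something from our setup:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: require-auth-header
  namespace: demo            # hypothetical namespace
spec:
  endpointSelector:
    matchLabels:
      app: backend           # hypothetical app label
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend        # hypothetical client label
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:                # layer-7 rule: Cilium redirects this traffic
        - method: "GET"      # to its per-node Envoy proxy for filtering
          headers:
          - "X-Auth-Token"   # requests missing this header are denied
```

Traffic on that port which does not match the HTTP rule is rejected, which is exactly the "block a request missing a header" case mentioned above.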
One was the WebSocket client, which we use in our edge clusters; it is in charge of establishing the WebSocket and encapsulating the original request inside it. The other was the WebSocket server, which we run in our regional clusters; it does the WebSocket termination and then the de-encapsulation of the request.

These are the use cases we have today. The first one is edge-to-cloud connectivity: we have an edge cluster on customer premises, and, as Edgar mentioned before, a firewall that only allows traffic to the set of Roche IPs we have exposed in Cloudflare. For this particular case we use the WebSocket encapsulation tunnel. Then we have the customer proxy scenario, which is similar, except the customer doesn't have a firewall blocking the traffic; instead, due to some regulations, they have a forward proxy and require that all traffic leaving the customer premises goes through it. And then we have another scenario which is much more complex; we will not go through it today, but if you are curious we can talk after the talk.

Use case number one: edge-to-cloud connectivity. Let's say we have an application running in a pod in a specific namespace and we want to reach an S3 bucket. The edge cluster is behind the firewall, which blocks this access by default, so the traffic will not go through. For this scenario we leverage Cilium Service Mesh and the WebSocket tunneling feature. What does it look like? Let's go side by side. First, on the right side, you can see that we need to run the WebSocket termination somewhere, so we run this WebSocket termination proxy in our regional EKS clusters in Amazon, and we expose it through Cloudflare on our Roche static IP range, because the edge cluster already has access to this range. We are enforcing mTLS, and
again, if you want to see in detail how we are enforcing it, we can talk about that after the talk. Then on the left side, in the edge cluster, we have first of all a Cilium cluster-wide Envoy config; there we load the WebSocket client filter that does the WebSocket encapsulation, and we inject the mTLS client certificate so that Cloudflare can verify and allow the traffic. Finally, in the namespace of the pod running the application that needs to connect to the S3 bucket, we have the Cilium network policy that, for this particular traffic, redirects it to the Cilium Envoy config.

OK, so this is the Cilium network policy; let's look at it in detail. What we are basically saying in this policy is: for all traffic going to amazonaws.com, please don't send it to the Linux networking stack; instead, send it to this Envoy listener that is running in this Cilium cluster-wide Envoy config.

Then we have the Cilium Envoy config. For those in the room not familiar with Envoy: in Envoy you basically define Envoy listeners and Envoy clusters. In a listener, you can apply many filters to the incoming request; the cluster is where you define the upstream you are sending the request to. Here, in the Envoy listener part of the Cilium cluster-wide Envoy config, you can see that we are leveraging the WebSocket client Envoy filter developed by Isovalent, which establishes the WebSocket and encapsulates the original request inside the tunnel. Then in the Envoy cluster we inject the mTLS client certificate for validation with Cloudflare, and we specify the destination, which is the WebSocket termination endpoint.
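As a rough sketch, the edge-side redirect policy just walked through could look like the following. The field layout is based on our reading of the Cilium policy API, and all names, labels and the FQDN pattern are illustrative, so double-check against the Cilium documentation for your version:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: s3-via-websocket-tunnel
  namespace: demo                      # hypothetical namespace
spec:
  endpointSelector:
    matchLabels:
      app: uploader                    # hypothetical app label
  egress:
  - toFQDNs:
    - matchPattern: "*.amazonaws.com"  # traffic selected for tunneling
    toPorts:
    - ports:
      - port: "443"
        protocol: TCP
      listener:                        # instead of the Linux stack, hand the
        envoyConfig:                   # traffic to this Envoy listener
          kind: CiliumClusterwideEnvoyConfig
          name: websocket-client       # hypothetical CEC name
        name: websocket-tunnel-listener
```

Note that in practice a toFQDNs rule also needs a companion DNS egress rule (allowing lookups to kube-dns with a rules.dns section) so Cilium can learn which IPs sit behind the pattern.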
On the other side, the corresponding Envoy configuration is the WebSocket termination that we run in our EKS regional clusters; there we use, as you can see, the WebSocket server Envoy filter, which does the WebSocket termination and de-encapsulation of the original request.

So yeah, finally we migrated to Cilium; all our EKS clusters are running Cilium today. With that we increased network performance, we leveled up network security and observability with Hubble, which is a tool in the Cilium family, and in the end we brought the firewall closer to our workloads, managing this edge-to-cloud network traffic in a secure way, following a GitOps approach.

So now it's demo time. Edgar will walk us through a quick demo of scenario number two, the customer proxy, and you will see it in action.

Thanks, Hector. Before starting, I want to mention that we have prepared a publicly accessible repository with the whole demo; it contains a Makefile and a README with all the steps, so you can follow it yourself. And before starting with the CLI, let me explain the setup we have prepared, which covers the second scenario Hector was explaining: there is no firewall, but there is a forward proxy that the laboratory or hospital requires all traffic to go through before leaving the customer premises.

Basically, on the left side we have prepared an EC2 instance hosting a Kubernetes cluster, whose security groups only allow egress traffic to a second EC2 instance, the one running the forward proxy, which is just a Docker container. That second machine is the one with egress connectivity to the outside world, the 0.0.0.0/0 range. Going back to the first EC2 instance, where we have the cluster, we have prepared two namespaces. The first one, test-deny, doesn't have any Cilium network policy or Cilium Envoy config, so whenever it tries to reach roche.com it will be dropped, because there is no egress connectivity to that address. Then we have the second namespace, test-allow, where we will be
configuring a Cilium network policy plus the Cilium Envoy config we are going to redirect the traffic to, which will send it through the proxy IP and out to the internet.

It's very small, right? Let me zoom. Do you see now? OK. Here we have the cluster prepared for the demo, with the two namespaces we mentioned before, test-allow and test-deny. In each of them we have prepared a simple curl pod, just a simple image that lets us show whether or not there is connectivity to the outside world. At the moment there is no Cilium network policy deployed on the cluster, and no Cilium Envoy config either.

We have prepared a Makefile to make this easy to follow and quicker. So if we try to reach roche.com, we receive a timeout; that make command just execs into the pod and tests whether there is connectivity to roche.com. The same goes for make test-deny: this one will also be dropped, and for that namespace we don't even need to configure anything. Now there is a third command we have prepared, which applies the manifests; the command just connects to the machine and applies the Cilium Envoy config plus the Cilium network policy that will allow us to reach the internet. So if we run the test command again: yes, now we receive an HTTP 200 because of the policies.

So let me explain a bit what the policy looks like. Basically, as Hector was mentioning before, in the Cilium network policy we are creating an egress rule that says: hey, whatever matches roche.com, please do not send it to the Linux networking stack, but send it to the Cilium Envoy
config that we have prepared. If we then look at that Cilium Envoy config: again, as Hector mentioned before, for those of you not familiar with these files, you configure listeners, which is where you apply filters, and then the cluster, which is the upstream you send the request to. In this case we have prepared a listener with the TCP proxy Envoy filter, which is the one we are configuring here: with it we open a TCP tunnel over HTTP CONNECT and say "send the request to the original hostname". Then in the cluster, the upstream, we have the endpoint: forward it to the following address. This IP is exactly the IP of the proxy, the second EC2 instance we are forwarding the traffic to.

If we run it again, there we are, connected through the EC2 instance hosting the proxy. And (I think this one is small as well) here I'm just tailing the Docker logs to see it live. If we run make test-allow again and come back here, we can see by the timestamp, here in the message, that we are opening an HTTP CONNECT request to the original destination, which is the IP of roche.com. And that would be the demo. Going back here, this is the manifest I explained before, so nothing more to mention.

One last thing: Isovalent released a new lab last week about advanced layer-7 Envoy proxy features, and part of the features we have demoed today are included in that lab. I know Isovalent has put a lot of love into this lab, so I really recommend you do it; and if you do it during this week here at KubeCon, at their booth, I think you will get a golden badge sticker, which is very cool as well.
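For reference, the tunneling listener walked through in the demo is roughly this shape: Envoy's stock tcp_proxy network filter with a tunneling_config that wraps the stream in an HTTP CONNECT towards the forward proxy. All names and addresses below are placeholders rather than the exact demo manifests:

```yaml
apiVersion: cilium.io/v2
kind: CiliumClusterwideEnvoyConfig
metadata:
  name: forward-proxy-tunnel        # hypothetical name
spec:
  resources:
  - "@type": type.googleapis.com/envoy.config.listener.v3.Listener
    name: forward-proxy-listener
    filter_chains:
    - filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: forward_proxy
          cluster: customer-proxy
          tunneling_config:
            # wrap the TCP stream in an HTTP CONNECT request whose
            # target is the original destination
            hostname: "%REQUESTED_SERVER_NAME%:443"
  - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
    name: customer-proxy
    connect_timeout: 5s
    type: STATIC
    load_assignment:
      cluster_name: customer-proxy
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 10.0.2.15   # placeholder: the proxy EC2 instance IP
                port_value: 3128     # placeholder proxy port
```

The hostname field accepts Envoy substitution formatters; %REQUESTED_SERVER_NAME% relies on SNI, so for plain TCP traffic check what your Envoy version supports for recovering the original destination.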
And with this, that is everything we have for today. Thank you very much! You can give feedback here, and we will be around the whole week, so if you have any questions beyond today, just stop by, say hello, and it will be a pleasure to talk to you. And I think we have time for a few questions, if people in the room have any.

Q: Can you also connect the other way around, from your systems towards each node? Is there a tunnel in the other direction?

A: We have kind of a reverse proxy deployed on the edge. This has nothing to do with the CNI; it's a reverse proxy deployed as an agent running on the edge clusters, where we open a WebSocket connection, TCP forwarding, for reaching the edge clusters as well. So yes, we have an agent running on the edges to reach them in case of disaster or whenever we need to connect to them.

Q: No normal VPN connection?

A: No, no, we are doing TCP forwarding. For this solution, the service we need is part of the range of IPs we have in Cloudflare, so the edge cluster already has access to that to establish the reverse tunnel; the server side is deployed on our regional clusters, in the same static range. The connection is not opened from the regional cluster but the other way around: it's the edge that initiates the connectivity, and that's why we are able to reach it.

Q: Hi, thanks very much for the talk, really interesting. I'm curious about what's doing the translation, the wrapping of the TCP traffic in WebSocket. In both cases, is that done by Envoy, with the cloud side unwrapping it? And is it the same for the tunnel you just mentioned that goes the other way?

A: For the first scenario you mentioned, the filters that Cilium prepared for us do that translation. For the second one, do you mean the agent that is running on the edges?
Q: Yeah, the one you use to send requests from, you know, centrally.

A: No, for that one it's a different agent that has nothing to do with the Cilium CNI. It's just TCP forwarding: the edge opens the connectivity to the agent that also runs on the regional cluster, which is listening, and once the connection is established there is kind of a tunnel we send the traffic through, but there is no encapsulation there.

Any other questions? Yes, here in the second row.

Q: Thank you, great talk. When you terminate the TLS, can you give some insight into where you actually do the termination?

A: You mean for the enforcement of the mTLS? Basically, what we use there is Cloudflare Access. For creating the certificates we have a root PKI, a PKI secrets engine, in a separate Vault instance, and then for each regional cluster we run a dedicated Vault instance where we create an intermediate PKI. With cert-manager in the edge clusters we connect to that to issue the client certificates; these are renewed every seven days and injected into the Cilium Envoy configs when the WebSocket tunnel is established. Cloudflare Access then enforces the mTLS, and the traffic is validated there against the root CA that we have.

Q: Historically it has been quite manual work with certificates. Do you use cert-manager or something like that?

A: Yeah, we use cert-manager for issuing the certificates against the intermediate PKI, and the certificate is stored in the cilium-secrets namespace; from there the Cilium Envoy config fetches it and injects it.

Thank you very much.
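As a sketch of that issuance flow (cert-manager on the edge clusters requesting short-lived client certificates from the Vault-backed intermediate CA), a Certificate resource could look roughly like this; the issuer name, identity, secret name and durations are illustrative, not our exact values:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: websocket-client-mtls        # hypothetical name
  namespace: cilium-secrets          # where the Cilium Envoy config reads TLS secrets
spec:
  secretName: websocket-client-mtls  # secret the Envoy cluster config references
  duration: 168h                     # 7 days, matching the rotation mentioned above
  renewBefore: 24h                   # renew a day before expiry
  commonName: edge-cluster-01.example.internal   # hypothetical client identity
  usages:
  - client auth                      # mTLS client certificate
  issuerRef:
    name: vault-intermediate         # hypothetical Vault-backed issuer
    kind: ClusterIssuer
```

cert-manager then keeps the secret renewed on its own, and anything consuming the secret (here, the Envoy cluster's TLS context) picks up the rotated certificate.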
We have time for one last question.

Q: During the demo I believe I saw an SSH connection. Is that a personal preference, or is it possible to fully configure it purely through manifests?

A: Yes, the make commands we have there are just for the demo; we are connecting to the machine only to apply the manifests.

Q: But you ran kubectl get nodes and such, so you had access to the control plane. Why wouldn't you just use kubectl directly instead of SSH?

A: It's just personal preference; you could do it with a kubeconfig and kubectl directly.

OK, thanks. Thank you very much.