Hi, this is your host, Swapnil Bhartiya, and today we have with us once again Liz Rice, Chief Open Source Officer at Isovalent. Liz, it's great to have you on the show.

Thanks for having me again. Hi.

Today we are going to talk about Cilium Service Mesh. We have been covering Cilium regularly, but it's always a good idea for our audience to know the background of the company itself, so we can place the associated product and project. So tell us quickly a bit about Cilium. What do you folks do there?

Yeah, so Cilium has been around as a project since, I think, around 2016, providing eBPF-based networking capabilities in containerized environments, and these days that primarily means Kubernetes, although we do also support networking to external and traditional workloads as well. Over the years Cilium has been more and more widely adopted, and eBPF has become a more widely available platform that we can use. eBPF allows us to essentially program the kernel, and that capability is now available in the versions of the Linux kernel that most users are running in production. That's why we've seen a huge uptick in the number of users of eBPF-based tools, and it also means that Cilium adoption is going through the roof.

And when it comes to service mesh, we already had a lot of the features of a service mesh. When I first joined Isovalent (Isovalent is the company that the Cilium project originates from; it's a CNCF project, but a lot of the original developers of Cilium are at Isovalent), I was told, you know, we already have about 80% of a service mesh in Cilium. We already do load balancing. We already do encryption. We already do observability. So there was only a relatively small part remaining to take Cilium from supporting networking, observability, and security at the networking layer to also supporting the service-level abstractions that we have in Kubernetes. That's what we've done in the latest release of Cilium, 1.12, which came out just over a week ago, and we have a service mesh implementation in that release.

If you look at the CNCF landscape, there are so many logos that it becomes hard to even see what all is there. Sometimes projects overlap in capabilities, and sometimes there are gaps. Cilium is a very good example where you have these extra capabilities. So why did you feel the need to offer those capabilities when there are already other products? What value is it bringing to the ecosystem?

With Cilium as a networking project, one of the strengths that we have is our ability to reduce the length of the network path, because we can use eBPF to program networking endpoints and connect them together incredibly efficiently. We can use that in Kubernetes, and in networking in general, to create very efficient networking paths. And we were already using the Envoy proxy to provide some layer 7 functionality, things like layer 7 observability. We realized that we could use that proxy that we already had built into the Cilium agent to provide the layer 7 parts of service mesh capability. What we're able to do with Cilium Service Mesh is operate in a sidecarless mode. So rather than having to have a proxy built into every single one of your pods, we can have one proxy per node.
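For a flavor of what that layer 7 handling looks like in practice, here is a minimal sketch of a CiliumNetworkPolicy whose HTTP rules are enforced by Cilium's embedded Envoy proxy, applied with the Kubernetes Python client. The app labels, namespace, port, and path are hypothetical placeholders, not anything from this conversation.

```python
from kubernetes import client, config

# Load kubeconfig and talk to the cluster's API server.
config.load_kube_config()
api = client.CustomObjectsApi()

# A layer 7 policy: only allow GET /public from frontend pods to backend pods.
# The labels, port, and path here are made-up examples.
policy = {
    "apiVersion": "cilium.io/v2",
    "kind": "CiliumNetworkPolicy",
    "metadata": {"name": "allow-get-public"},
    "spec": {
        "endpointSelector": {"matchLabels": {"app": "backend"}},
        "ingress": [{
            "fromEndpoints": [{"matchLabels": {"app": "frontend"}}],
            "toPorts": [{
                "ports": [{"port": "8080", "protocol": "TCP"}],
                # HTTP-level rules like these are what the proxy enforces.
                "rules": {"http": [{"method": "GET", "path": "/public"}]},
            }],
        }],
    },
}

api.create_namespaced_custom_object(
    group="cilium.io", version="v2", namespace="default",
    plural="ciliumnetworkpolicies", body=policy,
)
```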
And that allows us to create much more efficient networking paths. In a sidecar model, any traffic that travels from one pod to another has to go from the application into the kernel, back out into user space to the proxy, back into the kernel, and then repeat that again in the destination pod. That creates at least four transitions between user space and kernel space. With Cilium, we can make that a much shorter path, only going to the proxy where necessary, and turn that number of transitions into potentially none. In a sidecar model, every single packet has to travel through the proxy, but we don't necessarily need to do that with Cilium Service Mesh.

That also comes with the benefit of not having to inject a sidecar into every pod, which makes it administratively less complex. When we launched Cilium Service Mesh as a beta, I was actually quite surprised by the extent to which people were excited about simply not having to administer the sidecar injection side of things: not having to worry about which order the containers inside the pod come up in, because that can create problems, it can create race conditions. So there was a lot of excitement as soon as we announced the ability to get service mesh capabilities without having to run sidecar containers. That's really the main difference. We do also support the sidecar model where people need it, but I think the performance gains and the simpler administration will lead a lot of people to use the sidecarless model and gain the benefits from that.

How have you seen the role of service mesh evolve along with the evolution of Kubernetes use cases? Because as the use cases grow, so do the things you're able to do, and security and observability are becoming very important topics today. So talk about what role service mesh is playing, and of course eBPF too, and how it has evolved to better serve customers and users.

Yeah, I do think service mesh is a really interesting category of product, because Kubernetes already has services as a native concept. Kubernetes doesn't need a service mesh for you to be able to connect your services to each other; the pods within different services can communicate. But a service mesh adds an extra layer, if you like, in order to provide, as you mentioned, observability and security in particular, plus some level of service discovery and things like retries. And it's really interesting if we think about that from an eBPF perspective, because so many of those things, security, observability, networking, all lend themselves really naturally to being implemented in eBPF, because we already have a networking stack in the kernel. It makes a lot of sense to observe what's happening in that networking stack, which we can do within the kernel using eBPF. Similarly for security: if we want to implement network policy, dropping packets within the kernel is a very straightforward way to do that. It's a very natural use of eBPF. And then of course there's using eBPF to provide the network connectivity, which is what the Cilium project has been doing all along. Those three elements are very natural for an eBPF implementation. And then we come to things like traffic management, canary rollouts, all those kinds of features that the Envoy proxy is very well set up to manage.
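As a rough illustration of what "observing the kernel with eBPF" means at the lowest level, here is a minimal sketch using BCC's Python bindings (assuming the bcc package is installed and you have root privileges) that attaches a tiny eBPF program to the execve syscall and prints a line every time a process is launched. This is a toy example for intuition, not anything Cilium itself ships.

```python
from bcc import BPF

# A tiny eBPF program, compiled and loaded into the kernel by BCC.
prog = r"""
int hello(void *ctx) {
    bpf_trace_printk("process executed\n");
    return 0;
}
"""

b = BPF(text=prog)
# Attach the program to the kernel function behind the execve syscall.
b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="hello")

print("Tracing execve... Ctrl-C to stop")
b.trace_print()
```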
And we can get the benefits of both worlds: eBPF for that low-level kernel accessibility, and the Envoy proxy for the layer 7 functionality that we expect from a service mesh.

Now let's really talk about eBPF. What does it mean for the kernel? The kernel community is a big, massive community in itself, and it has laid the foundations for other open source projects. But what will eBPF mean for the kernel itself, given that a lot of enterprise developers will now be accessing it? eBPF is already there. And I would also like to talk a bit about the limitations of eBPF, because of its reliance on the kernel itself. So let's talk about both aspects.

Yeah, so I think there's been a huge amount of excitement about eBPF over the last couple of years, and it is incredibly powerful. One of the things that I've realized over this time is how quickly you hit the limits of the basics. I myself have done talks and given workshops: here's Hello World, here's a basic load balancer, here's how to build things in eBPF. But you really do quite quickly hit a point where you need kernel knowledge. You're injecting your eBPF code into the kernel, and you can attach eBPF programs pretty much anywhere inside the kernel, but in order to do that effectively you have to know: where is the right place to attach that program? What are these data structures that I'm looking at? What effect will my eBPF program have on the kernel's behavior?

So I think that in reality, most enterprise users of eBPF aren't going to be writing their own eBPF programs in BPF bytecode, or in C, and compiling them and loading them directly into the kernel. Most enterprise users are going to be using the power of eBPF as unlocked by projects that create those BPF programs for them. And we've seen lots of examples of this in the CNCF landscape. Cilium is probably the most advanced, but there are things like Pixie, which provides a wide range of observability and some really nice visual representations of what's happening in your cluster. There's Falco, which uses BPF for security, observing syscalls. There are quite a few projects in and around the CNCF landscape that are going to help enterprises take advantage of eBPF as a platform. That's not to say nobody writes their own: there are plenty of giant companies like Meta and Google and Netflix who have eBPF engineers actually writing the BPF code themselves. So there are exceptions to prove every rule, but I think the majority of listeners are going to find the most benefit by taking advantage of some of these existing projects.

Since you touched upon security, and of course you have a great background in security as well, and you also do talks and education and awareness work: how much awareness do you see is already out there? Just like in an Avengers movie there was a blip of five years, because of COVID there has been a blip of two years in the Kubernetes and CNCF community where we have not seen each other, but we are going back to events. How much awareness have you seen? Do you still have to go out and tell folks about all these things, or are folks already there, and because of the talent crisis all they are looking for is tools to make things easier for them? So where are we?

It's a really great question.
And it has been interesting: the explosion of interest in eBPF has coincided with that period of time where we've all been forced to stay at home, or certainly travel a lot less, and conferences have been few and far between. I've been involved in eBPF Summit, on the first occasion as a speaker and then last year as a host, and it's very much a community virtual event. It's been really interesting to see how much interest there is, and the breadth of that interest, from academic-level, extremely detailed research right through to end users wanting to understand how they can take advantage of eBPF. There was this incredible level of excitement. And I think it really comes down to the power of being able to change the behavior of the kernel. It's a really revolutionary approach to infrastructure. Most of us don't need to worry too much about how the kernel behaves. But if we need to, if we have really low latency requirements, or we're pushing a lot of packets through a system, or we just have a complex distributed system, as most Kubernetes deployments are, then the benefits of eBPF, and the power of the things we can do to make the kernel behave more efficiently in those particular scenarios, are really, really exciting. And I think we're a long way from getting to the end of all the different use cases that we'll see eBPF used for.

Now I want to go back to the 1.12 release. Of course, Cilium Service Mesh is there, but what are the other core features that you are excited about?

Yeah, so one of the things that I think is incredibly powerful about Cilium is Cluster Mesh, which is the ability to have Cilium running your networking in multiple different Kubernetes clusters and have them share services amongst themselves. So, for example, maybe you have a back-end service and you would prefer to use back-end pods that are local in your cluster, but should they for some reason become unavailable, you can fall back to using back-end services on a remote cluster. There have been some really nice improvements in 1.12 around Cluster Mesh. And it's so easy to use: you literally just label a service to say how you want it to appear in other clusters. It almost feels like magic; it almost feels like it shouldn't be this easy to connect services across multiple clusters. So that's one of the reasons why I really like it.

Another thing that I think a lot of Cilium users in enterprises are going to really value is the ability to connect your Kubernetes workloads with external workloads. Things like having an egress gateway, so that you can have a predictable IP address: if your external workload sits behind a traditional firewall and needs to know what IP address your requests are going to come from, we can do that with the egress gateway. So it's that kind of support for enterprise requirements. Most people don't live in a greenfield Kubernetes world; they live in a world where they have to integrate with legacy workloads, potentially distributed around different regions. So I think those are probably some of the highlights in the major areas. There's a whole long, long list of individual features that have gone into 1.12, so it's pretty comprehensive.

Yeah, I have the list here, so I can see that.
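As a rough sketch of how simple that sharing of a service across clusters can be: with the Kubernetes Python client you can patch an annotation onto an existing Service, here assuming the `service.cilium.io/global` annotation described in Cilium's Cluster Mesh documentation. The exact annotation key can differ between Cilium versions, and the service name and namespace below are placeholders.

```python
from kubernetes import client, config

# Connect using the current kubeconfig context.
config.load_kube_config()
v1 = client.CoreV1Api()

# Mark an existing Service as global so Cluster Mesh can share it across
# connected clusters. The service name and namespace are illustrative.
patch = {
    "metadata": {
        "annotations": {"service.cilium.io/global": "true"}
    }
}
v1.patch_namespaced_service(name="backend", namespace="default", body=patch)
```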
So that's why I was asking what you are excited about now that 1.12 is here; it is an open source project, after all. But if I ask you what is in your pipeline, there are a lot of things you can share and a lot of things you cannot share. Talk about the problems you are excited about solving.

So one thing that I think is going to be really cool is the new approach that we will have to cryptographic authentication for services. People talk about mTLS in service mesh, and it's one of the features that you can offload from your application to the service mesh: you can say, my application just doesn't need to know or care, my service mesh will implement the authentication and encryption of traffic. The next-generation approach that we're going to be working on for this will let you use network-layer encryption, which we already support in Cilium (it's in the kernel, so it's super fast), and use certificates that might come from SPIFFE or cert-manager or whatever your third-party control plane is for managing those certificates. By separating the authentication and the encryption, you can use whatever certificate management control plane you want, while the data plane uses that network-layer encryption with those cryptographic identities. I think that's a really exciting improvement, and it will make for more efficient encrypted connections.

I guess the one other thing we might mention would be Tetragon. Tetragon is a standalone project within the Cilium family that uses eBPF, and a whole lot of the expertise that we have in Cilium, for generating security events, using those to detect potential attacks, and even being able to prevent attacks by killing the responsible process from within the kernel. We announced Tetragon just before KubeCon in Valencia. Tons of people were very excited about it, and that's another area where we're working really hard to make it more usable, to build up more examples, to go from our initial release to a really productized security tool.

Liz, thank you so much for taking the time today to talk about not only Cilium Service Mesh but also eBPF in general and the bigger challenges and problems around security and observability. Thanks for sharing those insights, and as usual I would love to have you back on the show, and hopefully we'll see each other at the upcoming KubeCon.

Thank you. Yes, fingers crossed we'll both get to be there, and everyone will get to be there in Detroit. That'll be excellent.

Yeah. All right. Great. Thanks for having me.