Okay, so let's start. Hello everyone, today we will talk about replacing traditional legacy iptables with eBPF in Kubernetes clusters using Cilium. We are from SUSE; I'm Michał, and this is Swami. And yeah, let's start with the question: what's wrong with traditional iptables? From the point of view of Kubernetes clusters, there are several things. First of all, iptables is a technology that is about 20 years old. It was designed mostly for simple IP address and port matching, which was a good approach for traditional server applications, for a time when we didn't have huge, highly available clusters. But in the era of Kubernetes clusters it's not enough, in our opinion. The other issue is that iptables is not really aware of L7 protocols, so you can't filter HTTP calls or, for example, particular database queries. It's all based on IP addresses and ports, layer 3 and layer 4. And the other thing is that iptables operates on the concept of chains and rules, and to add a rule to a chain you basically operate on a linked list, so every operation except insert is O(n). Searching through a chain of rules or modifying a particular rule is on average an O(n) operation. Despite that, Kubernetes currently uses iptables extensively. When you use Kubernetes with most CNI plugins, you use iptables mainly for two things: implementing Services with kube-proxy, and network policies for filtering traffic. And it sometimes ends up with so much iptables, and so much sadness.

But there is one technology which tries to address that: eBPF, which was already mentioned in previous presentations in this dev room. To briefly introduce it, it's a virtual machine in the kernel which allows you to write programs in a subset of the C language that filter and trace packets in the kernel. It can also be used to trace kernel function calls, but in the case of Cilium, and in the case of our talk, we focus more on the networking side of BPF. And Swami will talk about the details of the network stack.

So thanks, Michał. Michał explained the issues we are having with iptables. Everyone here is probably aware of the Linux network stack and how complex it is. The design of Linux has evolved over many years, the layers are pretty compact, and each layer talks to the other layers. Any packet that has to be processed has to pass through all of these layers, and we have the netfilter layer in between. So in order to get rid of the netfilter layer, we need to come up with a similar filtering capability using BPF. As the previous sessions also discussed, BPF has different hook points in the networking stack; you can see the different hook points BPF has, and using these hook points we can achieve functionality similar to what iptables has been providing for customers for years. This slide is a comparison between legacy iptables, the enhanced version of iptables based on nftables, a BPF filter in the host driver, and a BPF filter with hardware offload. You can see there is a substantial increase in performance when using BPF filters instead of iptables. And this picture gives you an overview of how BPF provides the filtering capabilities that iptables used to provide for networking.
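To make the chains-and-rules point from above concrete, here is a toy C sketch of why matching against a long chain is O(n). This is not iptables source code, just an invented illustration: every packet walks the rule list one by one until something matches, whereas a hash-map lookup (which BPF maps provide) is roughly constant time regardless of how many entries exist.

    #include <stdint.h>
    #include <stddef.h>

    /* Invented, simplified "rule" for illustration only. */
    struct rule {
        uint32_t dst_ip;
        uint16_t dst_port;
        int      verdict;   /* 1 = accept, 0 = drop */
        struct rule *next;  /* a chain is effectively a linked list */
    };

    /* iptables-style evaluation: every packet walks the whole chain
     * until a rule matches, so the cost grows with the number of rules. */
    int chain_verdict(const struct rule *head, uint32_t ip, uint16_t port)
    {
        for (const struct rule *r = head; r != NULL; r = r->next)
            if (r->dst_ip == ip && r->dst_port == port)
                return r->verdict;
        return 0; /* default policy: drop */
    }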
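And to make the hook-point idea concrete, here is a minimal sketch of the kind of program the talk is about: a small C program attached at the TC ingress hook that drops TCP traffic to one port and passes everything else. The port number, section name, and the "no IP options" simplification are all assumptions made for this example; real programs (and what Cilium generates) are far more elaborate.

    #include <linux/bpf.h>
    #include <linux/if_ether.h>
    #include <linux/ip.h>
    #include <linux/in.h>
    #include <linux/tcp.h>
    #include <linux/pkt_cls.h>

    /* Little-endian host assumed for this byte-order helper. */
    #define bpf_htons(x) __builtin_bswap16(x)

    __attribute__((section("classifier"), used))
    int drop_tcp_8080(struct __sk_buff *skb)
    {
        void *data     = (void *)(long)skb->data;
        void *data_end = (void *)(long)skb->data_end;

        /* Bounds checks are required by the in-kernel verifier. */
        struct ethhdr *eth = data;
        if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
            return TC_ACT_OK;

        struct iphdr *ip = (void *)(eth + 1);
        if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_TCP)
            return TC_ACT_OK;

        /* For brevity this assumes an IP header without options. */
        struct tcphdr *tcp = (void *)(ip + 1);
        if ((void *)(tcp + 1) > data_end)
            return TC_ACT_OK;

        /* Drop everything destined to TCP port 8080, let the rest pass. */
        if (tcp->dest == bpf_htons(8080))
            return TC_ACT_SHOT;

        return TC_ACT_OK;
    }

    char _license[] __attribute__((section("license"), used)) = "GPL";

Such a program would typically be compiled with clang targeting BPF and attached to an interface with the tc tool; the point is simply that the filtering logic is ordinary C running inside the kernel at a hook point.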
So in this picture we are seeing the five different chains that any packet currently traverses. Decisions are made based on where the packet needs to go: whether it's inbound or outbound, whether it's being ingressed or egressed. The loops you see here are the positions where the netfilter hooks sit. The routing decisions happen in the input, forward, and output chains, and NAT happens either in prerouting or in postrouting. If we want to achieve the same functionality with BPF filters, we plan to have the BPF code running at any one of the hook points we already mentioned; for this example, we apply the BPF code at the TC hook point. By doing this, we are going to achieve similar functionality, and I'm going to show you the picture here. The pink regions you're seeing, both pink boxes, are BPF programs running at the TC hook points, and the chains shown here are the ingress chain, the forward chain, and the output chain. We also have the NAT capabilities for prerouting and postrouting, but for simplicity I have taken these three chains to explain how we achieve it through BPF. Connection tracking is also involved. In all these cases we have a hook point on TC: when a packet enters, the hook point registers that a packet has arrived, and if a BPF program has been attached to handle it, it takes care of either the ingress or the egress chain and then applies the filtering rules based on what we have configured.

As we also saw in the previous session, BPF has the capability for one BPF program to jump to another BPF program, which is called a BPF tail call. We can achieve similar filtering capabilities with tail calls, where each eBPF program does partial filtering based on what it is responsible for: one can do header parsing, another can do an IP lookup. All of these chained together can provide the filtering capability that iptables provides, and all of this happens dynamically, without any intrusion and without reprogramming the kernel. That's the advantage of BPF programs. I'll hand it back to Michał to take over from here.

Yes, so here are examples of projects other than Cilium that are using BPF. There is a load balancer written by Facebook, which is open source, called Katran. The perf Linux utility already uses BPF for tracing kernel function calls. systemd has a basic firewall based on BPF, so you can define basic filtering rules for services. Suricata uses BPF extensively. Open vSwitch has an AF_XDP driver. AF_XDP is, let's say, an alternative to DPDK, although DPDK itself also supports it. In DPDK you normally expose the network device directly to user space, and DPDK has its own network driver to use that network card. But in the case of AF_XDP, you use the network drivers in the kernel, but you have direct memory access to the network card, and you can bypass the rest of the Linux kernel network abstraction you've seen in the previous slides and redirect the packet directly to user space.
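As a rough illustration of the tail-call idea Swami described, one program parsing headers and another doing the IP lookup, a sketch could look like the following. The map name, section names, and the division of work are invented for this example and are not Cilium's actual program layout; the parsing and policy-matching bodies are deliberately left as comments.

    #include <linux/bpf.h>
    #include <linux/pkt_cls.h>
    #include <bpf/bpf_helpers.h>

    /* Program array used to jump from one BPF program to the next,
     * roughly playing the role of "the next chain" in iptables terms. */
    struct {
        __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
        __uint(max_entries, 8);
        __type(key, __u32);
        __type(value, __u32);
    } jump_table SEC(".maps");

    #define SLOT_IP_LOOKUP 1

    SEC("classifier/ip_lookup")
    int ip_lookup(struct __sk_buff *skb)
    {
        /* Second stage: match addresses/ports against the configured
         * policy (omitted) and return TC_ACT_OK or TC_ACT_SHOT. */
        return TC_ACT_OK;
    }

    SEC("classifier/parse")
    int parse_headers(struct __sk_buff *skb)
    {
        /* First stage: parse Ethernet/IP headers (omitted), then hand
         * the packet over to the next stage. The loader is expected to
         * have placed the ip_lookup program into slot SLOT_IP_LOOKUP. */
        bpf_tail_call(skb, &jump_table, SLOT_IP_LOOKUP);

        /* Reached only if the tail call fails, e.g. the slot is empty. */
        return TC_ACT_OK;
    }

    char _license[] SEC("license") = "GPL";

Because the jump table is just a map, user space can swap stages in and out at runtime, which is the "no kernel reprogramming" property mentioned above.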
So it's similar to DPDK in the sense that it's a data path acceleration technology. You can actually use AF_XDP as a PMD driver in DPDK, but you are still using the network device drivers in the kernel. And the list of projects using BPF will keep growing over time. These are companies which are using BPF: Google, Red Hat, Netflix, SUSE. We are using it because in our Kubernetes distribution, SUSE CaaS Platform (Container as a Service), we use Cilium as the CNI plugin.

So we explained briefly what BPF is, and now we will talk more about Cilium itself and what kind of features it has. Cilium consists of several components. The main one is the agent, which runs on every node in the Kubernetes cluster and takes care of generating the BPF programs and loading them into the kernel. There are several other components for interacting with Cilium, like the CLI, plugins for different container runtimes, and the policy repository. And speaking about CNI itself, maybe it's too much to cover because we have five minutes left, but CNI is the specification used by Kubernetes for creating and deleting network interfaces; CNI plugins are responsible for creating the network interface and getting the IP address, and Cilium implements all of that. So basically, when you create a pod with kubectl, you of course first call the Kubernetes API server, the kubelet takes that request and calls the CRI, which can be dockershim, containerd, or CRI-O, and then usually the CRI implementation calls the CNI plugin to create the network interface and provision the networking for the pod. In the case of Cilium, Cilium has a CNI plugin which calls the Cilium agent to request the IP address, and then it calls the Cilium agent to actually create the BPF programs which will handle the filtering; and in case you are using Cilium to handle packet encapsulation between nodes, that is also handled by BPF programs which the Cilium agent creates. The communication with those BPF programs loaded into the kernel goes through BPF maps, which are exposed to user space. So after generating the BPF program, compiling it, and loading it into the kernel, the Cilium agent keeps in contact with the BPF program by using maps. And this is the more general overview of how BPF looks and how it works when we use it together with Cilium, but also, for example, if you use AF_XDP, which I mentioned, to do data plane acceleration for VMs and containers. And, Swami, it's your turn.

So here are the details about the CNI plugin. As Michał mentioned, this is how the CNI plugin gets involved in providing network access, IP address management, and all those things. These are the internals you can see when CNI is configured: each container has an internal interface and an LXC interface within the CNI, as well as a physical interface to the node, and the nodes are interconnected in a cluster. The networking modes and the policies will be covered by Michał.

Yes, so there are basically two networking modes in Cilium. You can use Cilium to actually encapsulate packets between nodes, and the traditional, default method of doing that is VXLAN. But in case you want to use BGP, or in case you are deploying your Kubernetes cluster in cloud environments like AWS or GKE, you can use direct routing, where Cilium doesn't route packets between nodes; you rather use something else to do that.
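As a hedged sketch of what "keeping in contact through maps" can look like from the user-space side, the snippet below uses libbpf to open a map that a loaded program pinned in the BPF filesystem, read an entry, and push an update back down. The pin path, key, and value layout are made up for this illustration; they are not the Cilium agent's actual map layout.

    #include <stdio.h>
    #include <linux/bpf.h>
    #include <bpf/bpf.h>

    int main(void)
    {
        /* A map that the kernel-side program pinned under bpffs at load
         * time. The path is an example, not one Cilium actually uses. */
        int map_fd = bpf_obj_get("/sys/fs/bpf/tc/globals/example_policy_map");
        if (map_fd < 0) {
            perror("bpf_obj_get");
            return 1;
        }

        __u32 key = 42;     /* e.g. an endpoint or identity ID */
        __u64 value = 0;

        /* Read state that the in-kernel program wrote... */
        if (bpf_map_lookup_elem(map_fd, &key, &value) == 0)
            printf("key %u -> %llu\n", key, (unsigned long long)value);

        /* ...or push new configuration down without reloading the program. */
        __u64 new_value = 1;
        bpf_map_update_elem(map_fd, &key, &new_value, BPF_ANY);
        return 0;
    }

The design point is that the map is the stable interface: the agent can regenerate and reload programs, or update policy state, while the datapath keeps running.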
And yeah, the most popular case of using the second mode is AWS, where you have ENI, and also the... So basically, the first method is more for bare-metal installations, and the second is for cloud providers, or for bare metal with BGP. In the case of the first option you have an overlay, a tunneling mode, with the additional network header related to VXLAN, which is handled by Cilium. In the case of the other one you just have source, destination, and payload, and that's it.

And let's talk about packet filtering now. Kubernetes already provides by itself an abstraction called network policies, which applies at layer 3 and layer 4. One of the forms of L3 filtering is label-based ingress filtering. So let's imagine we have two labels, two kinds of roles in the cluster: frontend and backend pods. You can allow only the frontend pods to contact the backend and deny everything else. There are also examples of egress filtering, where you restrict pods from connecting to the outside world. You also have L4 filtering for blocking or allowing connections only on particular ports, and filtering which is aware of HTTP, so you can control which HTTP endpoints can be connected to. And, yes, unfortunately we can't talk much about Envoy because our time is up; we couldn't fit it in. Do you have any questions?

[Audience] Because you are doing one...

But in the case of eBPF, the way Cilium has been implemented is a little bit different from what I showed, because I wanted to show a theoretical approach of how it can be handled, so that we get an understanding of what is involved. What Cilium does with eBPF is keep a map of, like, source IP, source port, and then the...

[Audience] I think there was nftables for that as well. So if the problem is scaling, then you can already do it with nftables.

No, it's not just the IP; that's what I think Michał mentioned. It's not just about IP and port. The advantage that we have with Cilium is that it is based on labels, and you can provide label-based filtering, which you cannot do with IP-based filtering. And also, in the case of kube-proxy: kube-proxy basically uses iptables, and we have seen a degradation in performance when there are a lot of iptables rules to handle.

[Audience] You don't have to add more iptables rules at this point. You have, as I said...

Yeah, so even by including ipset, or even by going with the new version of iptables, nftables, we have seen a degradation in performance. I mean... Thank you, guys.
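To illustrate the answer about label-based filtering in BPF terms, here is a very simplified, hypothetical sketch of the idea: pod labels are resolved (by the user-space agent) to numeric security identities, and the in-kernel program only does a map lookup on (identity, port). This is a toy model of the concept, not Cilium's real datapath; every name here, the fixed port, and the use of skb->mark to carry the identity are invented for illustration.

    #include <linux/bpf.h>
    #include <linux/pkt_cls.h>
    #include <bpf/bpf_helpers.h>

    struct policy_key {
        __u32 src_identity;   /* numeric identity derived from pod labels */
        __u16 dport;          /* destination port */
        __u16 pad;
    };

    struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 16384);
        __type(key, struct policy_key);
        __type(value, __u8);          /* presence in the map = allowed */
    } allowed SEC(".maps");

    static __always_inline int policy_allows(__u32 src_identity, __u16 dport)
    {
        struct policy_key key = {
            .src_identity = src_identity,
            .dport = dport,
        };
        return bpf_map_lookup_elem(&allowed, &key) != NULL;
    }

    SEC("classifier/policy")
    int enforce_policy(struct __sk_buff *skb)
    {
        /* In a real datapath the identity would be resolved from the
         * packet (e.g. from an encapsulation header or an IP-to-identity
         * map); skb->mark is just a stand-in for this sketch. */
        __u32 src_identity = skb->mark;
        __u16 dport = 80;   /* placeholder; parse the L4 header in real code */

        return policy_allows(src_identity, dport) ? TC_ACT_OK : TC_ACT_SHOT;
    }

    char _license[] SEC("license") = "GPL";

The user-space side would translate label selectors from network policies into (identity, port) entries in the map, which is why the rule count does not grow with the number of pod IP addresses the way an iptables chain does.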