Hello everyone, welcome to Cloud Native Live, where we dive into the code behind cloud native. I'm Itay Shakury, Director of Open Source at Aqua Security. I'm also a Cloud Native Ambassador and your host for today's show. So this is Cloud Native Live. Every Wednesday we bring a new set of presenters to showcase how to work with cloud native technologies. They will build things and break things, and they will answer your questions. This week we have Chris Tomkins from Tigera with us to talk about how to leverage eBPF with OpenShift and Project Calico. He will introduce himself and the technology in a second. Before that, just a quick reminder that this is an official livestream of the CNCF, and as such is subject to the CNCF Code of Conduct. So please don't add anything to the chat or questions that would be in violation of that Code of Conduct; basically, just be respectful of all your fellow participants and presenters. So Chris, how would you like to introduce yourself? Well, the first thing is I'm just really happy to hear you say that I'll break things, because that covers me if I break things now, so great. My name is Chris Tomkins, I'm a developer advocate at Tigera. Tigera developed Project Calico as part of the Project Calico community. I worked as a network engineer for many, many years and gradually got drawn towards large-scale automation technologies, and when we started to do Kubernetes I was deploying Calico and I really liked the product and believed in it, so I joined the team as a developer advocate. Cool, so do tell us about Project Calico. Yeah, I think that's a great place to start. Calico is an open source solution for networking and network security for containers, virtual machines and native host-based workloads.
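To make the "network security" part concrete: Calico enforces standard Kubernetes NetworkPolicy (alongside its own richer policy types). Here is a minimal sketch of the kind of policy it implements in the data plane; the names (demo-deny-ingress, app: echo-server) are illustrative, not from the talk.

```shell
# Write a minimal Kubernetes NetworkPolicy that Calico would enforce:
# only pods labeled role=frontend may reach app=echo-server pods on TCP 8080.
cat > /tmp/demo-policy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: demo-deny-ingress
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: echo-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: frontend
      ports:
        - protocol: TCP
          port: 8080
EOF
# On a live cluster you would apply it with:
#   kubectl apply -f /tmp/demo-policy.yaml
echo "manifest written"
```

Because it is plain NetworkPolicy, the same manifest works whichever Calico data plane is active.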
So the idea is that it provides a consistent experience and set of capabilities for all of those kinds of workloads, in public cloud or on-prem, from a tiny cluster all the way up to a multi-thousand-node cluster, and we can support environments like Kubernetes, OpenShift, Mirantis Kubernetes Engine and so on, and the idea is to present the same experience to developers and engineers. We offer a standard, best-practice security model, a high-performance data plane (a set of data planes, actually), incredible scalability, and it's a real-world, production-hardened product and open source setup that is heavily deployed already. It's good that you mentioned that it's also open source, right? Yes, so there is an enterprise product, but everything we do today will be in the open source product; the enterprise product adds on some features, well, lots of features, but primarily around observability. All right, that's cool. So the topic of today's show involves eBPF and OpenShift and Calico; do you want to set up the context for this? Yeah, definitely. Think of it as Calico allows you to implement your network policy and your networking across those environments that we discussed, and it uses some technology in the data plane of the nodes to actually do that work. Now, I did a separate talk, and I could talk for half an hour about the control plane and data plane model for networking, but I won't; suffice to say that the data plane is how we implement the actual network policy that enables the networking to happen. Now, Calico actually offers a choice of data planes. (The chat window has just disappeared on my browser, but hopefully it will come back.) So Calico offers a choice of data planes, and the idea behind that is that different people have different use cases and expectations for the data plane of their network environment.
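In newer Calico versions managed by the Tigera operator, that data plane choice is exposed as a field on the Installation custom resource (`spec.calicoNetwork.linuxDataplane`, with values such as `Iptables` or `BPF`). The demo later in this session uses the older, manual enablement steps, so treat this patch as a sketch and check the Calico docs for your version.

```shell
# Sketch: selecting the Linux data plane via the Tigera operator's
# Installation resource. The field path follows the operator API as
# documented by Calico; verify it against your installed version.
cat > /tmp/dataplane-patch.yaml <<'EOF'
spec:
  calicoNetwork:
    linuxDataplane: BPF
EOF
# On a live cluster you would apply it with something like:
#   kubectl patch installation.operator.tigera.io default \
#     --type merge --patch-file /tmp/dataplane-patch.yaml
echo "patch written"
```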
So in order to achieve high performance, you want to have the minimal amount of data plane code that allows you to implement the feature set that you actually need. Our main original data plane was implemented mainly with iptables, which is great; it's still in production for many users, the performance is good, and it's rock solid and battle-hardened, but we wanted to offer another data plane which has some advantages over that. So as well as the standard Linux iptables data plane and the Windows data plane that we offer, we have this third data plane, which is the Linux eBPF data plane, which is what we'll be focusing on today. And just to say as well, there is actually a fourth data plane in tech preview, which is a VPP (Vector Packet Processing) data plane; it's very interesting and I encourage people to look at it, but we won't talk about it anymore today. I just wanted to comment that eBPF is a really exciting technology, and I'm happy that we get a chance to talk about it today, because it's very relevant specifically now; I feel and see a lot of movement around it in the industry, especially from the networking perspective, of course. So shall we drill a little bit into what eBPF is all about?
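One practical note before drilling in: as discussed later in the session, the eBPF data plane has kernel version prerequisites (broadly, 5.3 or newer on generic distributions, with some distro kernels like RHEL 8.2's 4.18 also supported). A quick, hedged sketch for checking a node:

```shell
# Sketch: check whether this host's kernel meets the generic 5.3+ minimum
# for Calico's eBPF data plane. Some distro kernels below 5.3 (e.g. RHEL
# 8.2's 4.18) are also supported -- the Calico docs are authoritative.
required="5.3"
current="$(uname -r | cut -d- -f1)"
lowest="$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)"
if [ "$lowest" = "$required" ]; then
  echo "kernel $current: meets the generic $required+ requirement"
else
  echo "kernel $current: below $required, check distro-specific support"
fi
```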
Yeah, please. So as you said, it's a really interesting area. What eBPF basically is: in the old days, if you wanted to have some code that ran in the Linux kernel, you would need to either actually submit code into the kernel repos and get it approved, and obviously that's a really long and very difficult process, for good reason, or the other way would be to write a kernel module. What eBPF actually is, is a way to run, you can think of it as, virtual machines inside the kernel, and those virtual machines are extremely high performance and heavily secured, because they only have access to a very limited number of kernel functions, depending on where they are mounted in the kernel. So as well as being high performance and secure, these are small bits of code that live inside the kernel and operate inside the kernel under those restrictions. In the case of how Calico uses eBPF, what we're essentially doing is replacing the functionality that we previously implemented in iptables with eBPF inside the kernel, and the actual performance increases, obviously, but there are also some side effects. Later on, I don't have many slides, but I do have one diagram I'd like to show, and when I show that diagram I can explain why that functionality also improves with eBPF. I once heard a description of eBPF that I really liked; sorry, I don't remember who said it, so I apologize to the person who said it to me, but they described it as: as JavaScript is to the browser, eBPF is to the kernel. It's basically a mechanism, a vehicle, for us to extend the kernel and to put in our own code, which otherwise would be very dangerous and difficult. Yeah, exactly. And, as you know, I'm actually not a developer; the first part of my job is not as a developer, so I have to be careful not to overstep my knowledge, but as well as only being able to
make certain function calls and those kinds of things, eBPF programs are automatically limited in execution time, so they can't get into tight loops and they can't chew up the CPU, that kind of thing. So it's great, because there's a lot of protection there to prevent your program from accidentally misbehaving, and yet you get excellent performance. There's also this concept of an eBPF map, which is basically just a key-value store that eBPF programs are able to access, which allows them to exchange data with each other, to record flows, that kind of thing, or whatever you might want to do. Now, eBPF obviously has a lot of use cases; it's commonly used for tracing, for example tcpdump uses eBPF, but in the case of what Calico is using it for, it's using it to implement the network policy inside the kernel. All right, sounds good. And just the final piece there: we talked about Calico, we talked about eBPF, and then we are going to run all of these on top of OpenShift, right? I'm assuming most of the people know about OpenShift, but just to set the record straight for Kubernetes users, could you say a few words about OpenShift? Yeah, sure, and I'll be honest, OpenShift is the weakest part of my knowledge; it's a big beast. But the way I like to think of it, like you had a good analogy about eBPF, is that a Kubernetes cluster you can extend in so many interesting ways with different tools, for observability and for storage and for CNI and all these things, and you could spend a lot of both time and money trying to figure out what a good Kubernetes deployment model is for you. I like to think of OpenShift as a container application platform based on Kubernetes that takes a good set of additional tools, so as well as being a container orchestrator, it's an
enterprise Kubernetes distribution, and it has a validated set of integrations. It's still Kubernetes, it's still certified Kubernetes, and it's still open source, but it allows you to build a consistent, complete, enterprise-grade Kubernetes ecosystem, and then to run that wherever you choose; you're not limited to running it in one place. The demo I do today will be in AWS, but obviously you can run OpenShift in any environment. All right, I think that's a very good intro, so let's get to it. Yeah, great. There's a couple of things we can come back to if we end up with spare time, but I think it's a good idea to flip over to literally the couple of slides I have, and then aim to get to the demo. So, I think you can see my screen, there we go. This is a Red Hat slide, incidentally, which I've used, with the URL at the bottom there, but really it just helps to understand that OpenShift is a big suite of tools around Kubernetes, for observability and for developer services and so on. I won't drill into that, because it would be challenging for me to talk about everything on that slide meaningfully. But this slide is a really interesting one, and this is the one I said I would briefly mention, about where eBPF hooks in. This is the packet flow diagram, and you can see at the bottom of the screen this is courtesy of Jan Engelhardt; this diagram is on Wikipedia if you want to see it yourself. This is the packet flow through any Linux node, and you can see that the green part there is the layer 3 network layer, and that's where iptables and all of those kinds of things happen, so you have the mangle table and the nat table, and you can see that the packets flow through
all of that. The reason I mention all this is because that's also where kube-proxy is implemented, so when you run services on Kubernetes, kube-proxy implements those services, and it exists in that green section. However, down here you can see, hopefully it's just about readable, ingress and egress qdiscs, and these two points here are where we actually attach our eBPF code, which means the eBPF code can entirely sidestep the main packet flow. So that's why I wanted to highlight this diagram. As a result of that, the advantages of actually running eBPF in Calico are performance, but there's a secondary, really interesting benefit, which I'll demonstrate in the demo, which is source IP preservation. So if we have a look at this diagram, and we'll see this for real in a live demo in a moment, you can see that with a traditional Kubernetes cluster with kube-proxy running, if an external client comes in to a service pod, the first thing they do is hit a service on the Kubernetes node, and then the kube-proxy that's serving that service does a destination NAT and a source NAT, to replace the source IP with its own IP and the destination IP with a pod IP, and then the traffic gets forwarded on to the service pod, and then it comes back. But it's important to note that when you're running with kube-proxy, without the eBPF data plane, the traffic has to go back through the kube-proxy, and it's therefore destination- and source-NATted, and the side effect of that is that the service pod down here never actually sees the real source IP of the traffic; it sees the source IP of the kube-proxy. So one of the advantages is that once you turn on eBPF, you get a flow that looks like this: when the
external client comes in and talks to the service, the service is implemented as an eBPF program, not in kube-proxy, and that means the eBPF program can forward on the traffic, and the service pod on the destination node sees the real source IP of the external client, which might be good if you have an auditing use case, or if you need to restrict a particular IP block by region or by country or something like that. And then the final step shown here, number five, is that if the network allows it, the packet can actually be returned directly, without going back via the ingress node. So those benefits are: you get performance, lower latency, and you get source IP preservation. So yeah, it's pretty cool. Cool; is that because you're doing the network routing work at a lower level in the network stack? Yes, exactly. My colleague Shaun Crampton, I've done some sessions called Calico Live, which is just me and Shaun Crampton chatting informally, similar to what we're doing now, but he's one of the eBPF data plane developers for Calico, so anyone who wants a lot more depth on this, if you watch those sessions, frankly his understanding of the depths is much deeper than mine. But yes, that's exactly it: we're implementing it at a lower level, and therefore, by replacing kube-proxy, we can introduce other benefits. So I think we should go ahead and show it, right? Yeah, of course, let's do it. Cool, okay. Just to explain what I'm going to do: I wanted to create a demo that anyone who's watching can actually do themselves afterwards. In order to set up OpenShift we need a DNS name, a top-level DNS name, which obviously would cost money, but I was made aware by a
colleague of this website, freenom.com. I can't actually recommend it for production; I don't know what their production services are like, but if you want a quick top-level domain name, you can get one here totally for free. So if I come in here, I've registered this memorable domain; I mean, it's not going to replace Amazon anytime soon, but I have this domain and it's registered for a few months. The reason I show this is because we need a top-level domain in order to be able to get OpenShift set up. So the very first step, which I did before today because it takes 24 hours to propagate: you come in here, and you need to specify where the domain's DNS should be directed to, so we change it to these custom name servers. Let me show you where I got those from. I'm using AWS Route 53, and I set up a hosted zone; when you set up a hosted zone, I think it's create hosted zone, you specify the name that you want, which is the same name we just saw, and AWS will come back and give you the top-level name servers they want you to use. So all I did was take these four names here and put them across into this GUI. That's step one. Once we've done that, the next thing we need to do is actually get the OpenShift installer tool. So I'll come back here, and there are two OpenShift installer tools, so I will actually delete them and redownload them. No, I didn't want to do that, I wanted to do that. Okay, so the first thing to do is to download the OpenShift installer tools, and it's just a couple of wget commands here; you can see it's hitting mirror.openshift.com and we're getting the stable OpenShift installer
for Linux, and then the second one is the OpenShift client. This won't take long. In the meantime, I'll just remind everyone: if you have questions or comments, type them in the chat; I'll read them and Chris can answer. Yeah, thanks. Okay, cool, those are done, so we just extract out those two tar files, and you can see it gives us a readme file and a couple of binaries. I'm going to put our current working folder on the path, so we should be able to type oc; yeah, we can. I was just testing that my path was working, and you can see the OpenShift client is there, so that's fine. I'm going to switch now, and you and I talked about this briefly before we came on the call, to a terminal recording, just for a short time, and I'll explain why once it's actually playing. I'm going to use this tool called asciinema, which is amazing if you don't use it. Oops, I'm going to play the recording. Before you play it, could you just resize your window so that it doesn't reach all the way down, because your name... Yeah, gotcha, okay, absolutely, thanks, I forgot that would happen. How's that? That's perfect. Great, okay. Oh goodness, I'm sorry, I can deal with Kubernetes clusters but I can't operate my browser and my window; let me try again; if I go too near to the edge it tries to maximize. That's great; how's that? Perfect, a tiny bit off, right, there we go, cool. Okay, good. So let's start that playing now. This is now the recording, but you can see that this recording was actually done back in June. This is only the first part, so we will be doing a properly live demo; it's just this first part, and the reason is because, as you can see in the recording, I'm downloading those files that we just downloaded; the problem is that the OpenShift install tool
actually gives away quite a lot of information that we don't want to be sharing. You'll see in a moment that when I run it, it shows all of the DNS names registered in Route 53 for the AWS account you're currently logged into; it also shows the secret keys, which again I don't want to share. So I've actually edited this recording so that the keys you see in a minute are not real keys, and similarly the DNS. So here we go; now we're getting to the point where we're moving on. You run this tool, openshift-install create install-config, and we're not actually creating a cluster yet, we're creating an install config, the declarative definition of how we want our cluster to look. So we have to specify what SSH public key we want to use, we have to say where we want to run it, so we choose AWS, and, yeah, you can see it was a live demo because I made a mistake, you choose which region you want to run it in, so I'll choose us-west-2, and this is the point where, if I hadn't edited this video, it would be listing all the domains here; that's why I had to edit it. But you can see that we go down the list and choose the domain we're interested in, which is the test domain you saw before, and then we name the cluster, and then we put in the pull secret. Now, I'm going to pause there just to show you where that pull secret comes from. Anyone following along can search for Red Hat OpenShift developer sandbox, and if you come to this website, you can see that you can start a free 30-day trial of an OpenShift developer sandbox. When you sign up for that, you get this dashboard, and in this dashboard you get your pull secret, and you basically just need to copy this pull secret here, like so, and then paste it in. So someone asked if you
could just recap the last ten commands that you ran; I guess maybe you can do even better and share the script later on. Yeah, I can. I mean, this is a script that installs OpenShift; is it doing anything special? No, it's not, not yet; it will do, we will go into more interesting stuff, but I'd rather recap it here than share the recording, just in case there's anything hiding in that recording file that I don't want to share. So, to recap, all we've done so far is: we set up a free domain somewhere, we put that into AWS Route 53, then we downloaded the two packages, the OpenShift client and the OpenShift installer, which you can Google for, we expanded those using tar, then we put our current working directory on the path, and at that point we ran the OpenShift install tool. Using openshift-install create install-config, it creates an install config file, and these are all just an easy wizard. You can see it's finished, it's created an install config file, and the only edit we need to make to that file at the moment is this edit I've made here. If anyone's not familiar with sed, what we're actually doing is taking the install config that was created, searching for any occurrences of OpenShiftSDN, and replacing them with Calico. So we're telling OpenShift that when it builds the cluster, it should build it with Calico networking, not with OpenShift networking. Just to be clear, though, this is not going to enable Calico eBPF yet; this is just Calico's traditional iptables data plane. I'm just wondering why that's not continuing; did I accidentally press something? For some reason that seems to have stalled, but that's okay, I can actually pick it up from here. You can see
I've pressed Ctrl-C, so we're now back to an actual live demo. You can see that the next command you use is openshift-install create manifests, and I won't press enter now, because it won't work, because I did this already this morning; I knew this would take too long for the live demo. What that actually does is consume the install config and spit out the actual manifests which are going to be deployed, into a folder called cluster. So if we go into this folder called cluster, in here there'll be a lot of manifests, and then there's an extra step, documented on the docs.projectcalico.org website, which is to download some extra Project Calico manifests and drop them into this folder. I'll just do one example, but there are about 30 or so; it's a curl command like this, and we drop these manifests from projectcalico.org into the folder here. Now, when all of that's done, you run openshift-install create cluster, like so, and then it will take about 40 minutes, which you'll be pleased to hear we're not going to do now, but what happens is it then goes and builds the cluster, and once you've done that you will have a working Kubernetes cluster, which is what I have now. You can see that this cluster was built this morning in preparation for this. If we have a look now, you can see there's a lot of pods running on this cluster, and in the namespace calico-system we have a calico-node daemonset, so this cluster is now running Calico networking. It's ready to go, but calico-node is still running in iptables mode, not in eBPF mode, at this moment. I'm conscious of time, time always runs away, so I need to run on a bit. I'm going to demonstrate the lack of source IP preservation, so I have a
manifest called echo-server, and if I show you what's in there, this manifest makes a deployment called echo-server with just one replica, answering on port 8080, and it creates a service, and the service is type NLB (Network Load Balancer), so this is an AWS load balancer set to NLB. It's important that we use NLB, because without NLB we can't do the source IP preservation. So if I deploy that manifest, it's now created an echo-server and a service, echo-server-external, a service load balancer, and as well as having a cluster IP it has an external IP, and it's answering on port 8385 and redirecting to that pod. So if I grab that now, this won't work yet, it takes a moment before it's ready, but if I try to hit HTTP and then the new load balancer IP and then port 8385, yeah, it's not working yet, we need to wait a moment, but in a moment we will see that we're able to hit that pod. But when we look at the pod's logs, if you recall that diagram we looked at, we won't actually be able to see the real IP, my public IP, essentially. So while we're doing that, let's have a look: if we check my public IP now, you can see that my real public IP is this 31.31-something address. Let's see if it's ready yet; not yet. So while this is being created, maybe we can take a few questions. Yeah, sure. There is a question about support in the eBPF data plane for IPv6, if you know. I think it's supported, but I don't want to say 100 percent, because I feel like I might be wrong; let me see if I can answer that right now, one second. No, it doesn't; it's listed as a limitation, actually. Let me pull that up now; I'm glad I checked, so I didn't give the wrong information. If a listener wants to enable the eBPF data plane, you can go to docs.projectcalico.org, and this documentation is really good; you can see that there's some information here about the benefits
and there are some limitations, and one of those limitations is that it does not yet support IPv6. The question is actually whether you can share, or know about, an ETA for that. No, I know it's being considered, but I definitely don't know the ETA. If the listener wants to join our Calico users Slack, that's the best place to ask that question, and then I can go away and find out exactly what I'm allowed to say officially, and I'll deliver the best answer I can. Yeah, sounds good. Another question about the pros and cons of eBPF from a security perspective. I don't think there's a great deal of difference from a security perspective, because at the end of the day the same policy is being implemented regardless of which mode you use. I suppose you could say that the iptables data plane has been around for longer, and therefore perhaps you could make a case that it was more secure; however, the eBPF data plane requires a newer kernel, and requiring a newer kernel arguably has security benefits. So I don't think it's a big consideration, actually; I would say they're functionally equivalent, really. That's a good point. Actually, a question I also had: what are the requirements for installing the eBPF data plane, from a kernel or operating system perspective? Yeah, that's documented; let me give you the right information. You need a supported Linux distribution, which is either Ubuntu 20.04 or newer, or 18.04 if you have an updated kernel, or Red Hat Enterprise Linux 8.2, which has a kernel version of 4.18 or above, or another supported distribution which has a kernel of 5.3 or above. So actually, on that same URL I showed you a second ago, if you just search for "project calico enabling ebpf", you'll find all
the information, including the prerequisites around the kernel. There are a couple of other prerequisites as well, around mounting the BPF file system, so all of that is on there. All right, sounds good. Cool, I think we should push on, just to make sure I get through this in time, otherwise we may overrun. You can see that in the meantime the echo-server is now responding on the port, so if we have a look at the logs, there you go, you can see we're looking at the logs of the actual echo-server pod, and you can see that it didn't see the real public IP of the user. Just keep that in mind, because we haven't turned eBPF on yet; once we have, it should change. Now, the other part I was going to do at this point was to actually demo the performance, but to be honest we're getting low on time, so I suggest that, on the Tigera blog, there is a detailed blog post which includes the performance graphs, and people can go and have a look at that. Suffice to say that the eBPF data plane performance is significantly better, and if anyone wants to test it themselves, they can follow along and run their own tests. So let's actually turn on the eBPF data plane now. The first thing we need to do is switch the encapsulation for Calico from IP-in-IP, which is the default, to VXLAN, and we do that with calicoctl. What did I miss? Looks like I screwed up that command, one second. I think you're missing another... yeah, am I missing one at the end? Oh yeah, I'm missing it at the end, I see; well done, you spotted that quickly. There we go; you can see I ran two calicoctl commands here. There's an IPPool custom resource, and this pool is called default-ipv4-ippool, so what we did is we turned off IP-in-IP encapsulation and turned on VXLAN encapsulation. Next point: Calico needs to speak to the Kubernetes API, but the Kubernetes API is actually hosted via
kubet proxy so you can see the problem if we turn off kubet proxy how can calico talk to the api so we have to do something to help it talk to the api directly uh so we do is we use this oc command pretty cluster info and we can see this is where the control plane is actually running um so i've created a um a bit of yaml a config map um called kubernetes services end point in the tigera operator namespace and it has the host name which should be the same yeah that's the same and it has the service port so what we're basically doing is we haven't applied this yaml yet but let's do that now and while it's applying we can talk about what it actually does okay so it's created this kubernetes services endpoint config map which has this configuration and uh the um i'm just going to put it so if we wait one minute now and let kubernetes detect that new config map and then we restart the tigera operator it will actually tell calico to talk directly to the api rather than talking to the kubernetes kubet proxy service um we can see if we have questions uh while we wait yeah sure um yeah i'm just gonna read this one is preserving source ip just an example of uniform software defined general purpose networking rules or is source ip preservation the primary value of calico well there's two parts to that the first part is it's not the advantage of calico it's specifically of the ebpf data plane and actually the way i see it is actually that it's more of a side effect of the benefit of ebpf i think i wasn't in the room when they originally decided to make an ebpf data plane but i think the primary reason for making an ebpf data plane is the performance advantages um but then uh i showed that large diagram of the packet flow through a linux node and because kubet proxy is implemented in uh in the layer three uh green part of the diagram and um ebpf the ebpf hooks are at the start and the end of the flow in order to implement policy in ebpf you kind of also have to replace kubet 
That's my understanding, and if you're replacing kube-proxy anyway, then you can improve it, and I think that was the side effect. But source IP preservation is actually a really nice side effect. It's something that, surprisingly, wasn't there by default: being able to see the source IP of your user. Especially if you consider a compliance-type environment, you need to be able to accurately record your logs. So I would say it's a benefit rather than the main objective. Do you have another question before we move on? Just a follow-up from the same person: "The incumbent, or the alternative to the eBPF data plane in Calico, is based on iptables. Is that correct?" That's correct, yeah, exactly. If the viewer has time to watch it, there was a Kubernetes Security and Observability Summit earlier in the year, and I did a short talk, I think about 20 minutes, on why we offer multiple data planes and contrasting their advantages. Rather than talking specifically about eBPF, I said: here are the data planes that we have, and here are the advantages of each. I really believe that no single data plane is the perfect solution for all users, because they all have pros and cons, and that includes eBPF: it has high performance, but it has, for example, the kernel requirement, which obviously means it's immediately not possible in some environments. All right, so now that we've waited, we restart the Tigera operator: we're deleting the pod in the tigera-operator namespace, and it will immediately be recreated. What we've done there is tell the operator that we want Calico to talk directly to the Kubernetes API, not via kube-proxy. So now that we've done that, we can actually disable kube-proxy. Oh yeah, I remember now: there's an OpenShift operator for this.
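The operator restart mentioned above is just a pod deletion; a minimal sketch, assuming the operator pod carries the usual k8s-app=tigera-operator label:

```shell
# Delete the Tigera operator pod; its Deployment recreates it immediately,
# and on restart it picks up the new kubernetes-services-endpoint ConfigMap.
NS="tigera-operator"
kubectl delete pod -n "$NS" -l k8s-app=tigera-operator
```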
In that operator we can tell it that we want to turn off kube-proxy. So right now, if we have a look, you'll see that there should be five... no, six, excuse me... kube-proxies running, and now we patch the OpenShift operator to tell it that we want to turn off kube-proxy, and as soon as we do that, we'll see that they're terminating already. So now we're at the point where we're still not running the Calico eBPF data plane yet; we're still running the Calico iptables data plane, but we've turned off kube-proxy and we've got Calico talking directly to the API. So finally we can actually turn on eBPF, and again this is an operator command: we're merging this config, and we're saying that we want the Linux data plane to be BPF. The cool thing is that enabling eBPF mode shouldn't disrupt any existing connections: if you have live connections, they will continue to use the standard Linux data path until they time out, and when they time out they'll re-establish using the eBPF data plane, which is pretty cool. So let's prove that we are actually running it. First of all, if we look at all the pods in the calico-system namespace, you can see that the calico-node pods are starting to restart: one restarted 14 seconds ago, one 28 seconds ago, so it takes a moment. There's still one that hasn't restarted... still three that haven't restarted, actually, so we just need to wait a moment. Again, I'm tempted to carry on the demo, but I think we have to wait for all six, because you know that, just by bad luck, the workload we care about will be sitting on the wrong node. Of course it will. Of course, nothing has exploded on me yet, so I'm still expecting something to explode soon. But there it goes: we can actually see it terminating... nice... initializing... cool. OK, so they're all running now, so we're running the BPF data plane.
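The two operator changes described here can be sketched like this. The resource names follow the OpenShift and Tigera operator conventions, but treat the exact patches as an outline rather than the presenter's literal commands:

```shell
# 1. Tell the OpenShift cluster network operator to stop deploying kube-proxy.
oc patch networks.operator.openshift.io cluster --type merge \
  -p '{"spec":{"deployKubeProxy":false}}'

# 2. Tell the Tigera operator to switch Calico's Linux data plane to BPF;
#    this is the merge-patch mentioned in the talk.
DATAPLANE="BPF"
kubectl patch installation.operator.tigera.io default --type merge \
  -p "{\"spec\":{\"calicoNetwork\":{\"linuxDataplane\":\"$DATAPLANE\"}}}"
```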
Now, there are lots of different ways we can show that we're running the BPF data plane, but one of the quickest ways to prove it is to look at the logs for calico-node. So if we take any of these calico-node pods and grep their logs for the phrase "bpf", we should see... yeah, lots of stuff. It doesn't really matter exactly what we see here, but, here we are, "resync BPF routes", so you can see that it's doing BPF work. There is more that we can do; if we have time I might show you a little bit more, but let's first of all check that source IP preservation again. If we look at the services again, nothing's changed: this is still the same service, still running, uptime 16 minutes. Now, I'm never sure whether or not I need to recreate the service, so we're going to find out. If we curl it again... you can see that we've essentially proven that the BPF data plane is working. Remember, without the BPF data plane the service wouldn't be working at all now that kube-proxy is gone, so the fact that I got a response from my echo server proves that the BPF data plane is both enabled and working. And if we look at the logs again... yeah, there we go, you can see now that we have my public IP. Cool. Yeah, that's good, isn't it? Obviously that means your Apache logs or whatever are going to reflect that, and another use case might be understanding where your users are coming from, GeoIP, maybe, as well. So there are quite a lot of use cases for that. Cool. I think this is good timing actually, because since we skipped the performance part we can do two things. The first one is for me to show you the performance graphs, and, like I say, anyone can test this themselves using the same process. Oh, never mind, I thought I would find the blog post... I'll try to find that in a minute; I thought I would find it more easily.
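A quick sketch of that verification. The namespace and label follow the usual operator-installed layout, and the service address is a placeholder:

```shell
# Pick one calico-node pod and grep its logs for BPF activity.
NS="calico-system"
POD=$(kubectl get pods -n "$NS" -l k8s-app=calico-node -o name | head -n 1)
kubectl logs -n "$NS" "$POD" | grep -i bpf | tail -n 5

# Hit the echo server again; with kube-proxy gone, getting a response at all
# means the BPF data plane is handling the service.
curl "http://SERVICE-ADDRESS:8080/"   # placeholder address and port
```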
OK, never mind. Yeah, if you want to send them later, we can post them in the Slack channel for the show. Yeah, perfect, that's fine. Actually, let me see if I can grab it right now... yeah, here we go, brilliant. OK, actually it's fine, I can just show you this: this is a presentation on a similar topic. Here we go. So there is this caveat, and it's an important one: a single flow between two instances in AWS can do a maximum of only five gigabits. This is nothing to do with Calico; it's an AWS limitation, and this screen grab here is actually from the AWS documentation, just to show that this is a real thing. The reason I say that is because we're testing with a single flow. So this is OpenShift: this is the throughput, iptables is the blue, and more is better. For TCP it makes very little difference, but if you look at the UDP performance, it's nearly twice as much. You can also see that the CPU utilization is lower. I don't really like spending a long time on graphs anyway, and people can come back and validate this themselves, so I think, more interesting than that, let's go back to the CLI stuff; people can look up graphs anytime they like. Let me show you one more cool thing we can see, and then I think that should leave us with a couple of minutes for any more questions. One nice trick we can do: we know my IP now, so we create two variables. I created a variable called ebpf_interesting_ip, which is my public IP, and the other one is the interesting port, the port number we care about. Then we can run this for loop. It looks pretty funky when you first look at it, but it's not too complicated, so let me just break it down. (That was 5:55 p.m., my Google Home talking there, sorry.)
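Reconstructed from the description that follows, the loop looks something like this. The variable names come from the talk; the IP and port values are placeholders, and the exact calico-node flags are assumptions based on its BPF debugging subcommands:

```shell
# Client IP and service port we care about (placeholders).
EBPF_INTERESTING_IP="203.0.113.10"
EBPF_INTERESTING_PORT="8080"

# For every calico-node pod (one per Kubernetes node), dump the BPF
# connection-tracking table and filter for our IP and port.
for pod in $(kubectl get pods -n calico-system -l k8s-app=calico-node -o name); do
  echo "=== $pod ==="
  kubectl exec -n calico-system "$pod" -- calico-node -bpf conntrack dump \
    | grep "$EBPF_INTERESTING_IP" | grep "$EBPF_INTERESTING_PORT"
done
```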
You can see what this is actually doing: it's listing all the pods, it's grabbing the calico-node pods (there's one calico-node per Kubernetes node), then it's grabbing the names of those pods. Then it iterates over them, prints out the name of each calico-node, and kubectl execs onto it to dump the BPF connection-tracking table, grepping for my IP and the port we care about. So what this should do... there we go... is show us the flow through the cluster. Now, the flow that we did a moment ago has probably timed out already, so that's fine; we're expecting it to come back with nothing. OK, so you can see at the moment there's no record of my flow, but if I run that curl again and then run the command again, we should hopefully see... that's weird. What did I do wrong? Can't save you this time! No, no... that's weird, I've used that command plenty of times. It can't be timing out; it doesn't time out that quickly, even though I ran this and then a few seconds later ran it again. OK, well, to be honest this was just an add-on thing anyway; I was just going to show that you can see the flow. What we should have seen there is the flow coming in on the ingress node and then reaching the workload, and we could actually have seen that it is not NATted through the flow. I don't know why that isn't working, but I don't have time to troubleshoot it now. Do you have any... we're almost up on time already... are there any more questions? Yeah, so viewers, please ask any questions if you have any, and while you're thinking about your questions, maybe, Chris, I can ask you: where can people go next if they want to ask more questions and find out more about what we've learned today? Yeah, great.
So in terms of documentation, the best place to go is docs.projectcalico.org, which is the one I showed here. You can see there are quite a lot of high-quality resources on there, and if you search for eBPF you'll find quite a lot of information about what eBPF is, and actually quite a lot of detail about how it works. There are also deployment guides for deploying eBPF on clusters on other platforms, not just OpenShift. So that's great in terms of documentation. You mentioned you have a Slack? Yeah, maybe that's what I was going to say next: a great place to come is the Calico Users Slack. If you go to tigera.io there's a Project Calico community page... well, that's taking me back to the same place... yeah, if you search for "Calico Users Slack" you will find... oh, here we are, yeah. So it's this tigera.io Project Calico community page, and you'll find quite a lot of information here about how you can get involved. There are these certifications, which are free and which are brilliant; these are for the open source product. There are community meetings, and you can find our Calico Users Slack here as well. I'm on there all the time, and there are people on there, Shaun and others, who are deeply knowledgeable about our eBPF data plane, so any question you have, we should be able to answer. Wonderful. All right, so I think we're out of time and we don't have any further questions. Perfect. So yeah, thank you, Chris. One of our viewers said this was thought-provoking, and I can agree. So thank you for that, and thank you to all of our viewers. Just a reminder... yeah, sorry, I should say thank you to you too, thanks for taking the time. Yeah, my pleasure. And see you, everyone, next Wednesday, every Wednesday, on Cloud Native Live. Thanks again. Thank you, bye-bye.