So we now welcome Marino Vijay from Solo.io, and I must say that this talk is about Cilium, BGP, and espresso somehow. What's not to like? Please welcome Marino. So I will tell you that I will never be able to be this level of developer advocate — able to program, do BGP, and pull espresso at the same time like you're seeing in this image. So all credit to DALL·E for helping me make this image. Unfortunately, I don't know why it placed the keyboard right beside the espresso maker. So welcome to my talk. We're going to be talking about Cilium and BGP and how to get it up and running in a few minutes. But I want to take you through a little bit of backstory around why we got here, and at the same time I'm not going to be able to lay out all the different features that Cilium's BGP support offers today, because guess what? There's another talk later on by my peer and another individual from Isovalent around more BGP. So if BGP's your thing, go check that session out later on. My session is really about the testing aspects of how you build networks, how you take those networks and stage them, and at the same time integrate them into your production environments as well. My name is Marino — sorry, my name is Marino Vijay. I'm a developer advocate at Solo. I jump around the world talking about network technologies, networking in the world of cloud native, and some of my networking history as well. So I always start off with the OSI model, because I like to frame things in a way that allows people to realize that while we do have a physical layer and a network layer, we also have an application layer, and I've viewed these as separate layers altogether, even though you have seven layers in the OSI model.
The ones we really play with the most are the ones where we can touch the devices, where we can log into them, add some control plane logic, and decide how to make our applications make the best intelligent routing decisions that they can. There are a lot of different technologies that can achieve this. My focus today is specifically on what Cilium is doing in that space. So BGP — what is BGP? How many of you actually use BGP in production today? And I'm guessing you all are network engineers or platform engineers to a degree? Yes? You used to? Okay, all right. So what was the biggest challenge with BGP? Security, complexity, scale? There are so many different things that come up when you're trying to build out a network with BGP, but you realize that it's the most tunable protocol out there — one that allows you to connect various kinds of networks and integrate with other kinds of protocols as well. BGP itself has been heavily used in private networks as well as service provider networks, where service providers use it to help you extend your LAN or give you access to other networks that are part of your organization. At the same time, BGP is used on the Internet. Literally everything we access today is accessed because of BGP. It's a protocol that allows us to know about other networks. It's a dynamic routing protocol, and it sits alongside other ones that exist, but to be quite honest, when you operate cloud environments or containerized environments, it's likely that you're going to end up using a protocol like BGP, primarily because of how it interacts, how you can customize it, and how you can tune the variety of different attributes you can specify to influence your paths. Anyways, BGP comes in two flavors: iBGP and eBGP.
The difference between the two is peering within something called an autonomous system versus peering between autonomous systems, and the way updates occur for networks. In the world of BGP, when you have two autonomous systems talking to each other through an eBGP, or external Border Gateway Protocol, connection, effectively what's going on is that two separate entities are exchanging networks, but at speed, and they're doing this because they know that they're isolated domains. But when you have multiple routers, like you see in the diagram here — let me highlight it; let me use this little special highlighter that I got for us. When you look at these little routers here, especially in this little diagram, you see these links between routers inside of what we call an autonomous system. That, effectively, is an iBGP peering. So it's just internal networking to exchange routing information, which is effectively what BGP is providing us. Now, there are several considerations around what BGP is doing and how it exchanges networks. It operates over TCP, and because of that, it expects to have an underlying connection to what we call a neighbor. Neighbors are used to effectively form peer relationships so that they can exchange routes. So every time you see a link between these two routers, these circled boxes, that is what we call a neighbor, or peer, relationship. Now, BGP has made its way into a variety of different networks, as I mentioned previously, but it's also made its way into Cilium. So how does it work in Cilium? Way back, about a year ago, I attempted to do a live stream where I enabled the Cilium BGP control plane. Now, the Cilium control plane back then actually leveraged MetalLB to do advertisements of networks. That's come a long way, and a lot of that functionality is now brought into the BGP control plane.
So what the BGP control plane is doing, as part of those Cilium agents that you have running in your clusters, is this: they're the ones responsible for forming neighborships with upstream top-of-rack routers, or upstream routers that are part of your network. So you may have multiple nodes in your environment, and those nodes are the actual endpoints, or neighbors, that would interface — or neighbor up — with an upstream device. Now, BGP in Cilium supports a wide variety of functionality. You can advertise pod CIDRs, you can advertise service CIDRs, you can even create a load balancer IP pool, which can be advertised to the rest of your network. Now, why would you want to do this? Imagine you operated a private production network with no access to the outside world. You built your own Kubernetes clusters, you also had legacy databases, some bare metal systems, virtual machines — the whole gamut, right? So you have all of these brand new services running in your Kubernetes environment, and some of your legacy services need to tap into that. Well, there are easy ways to make this possible, but they're not dynamic. So we need something dynamic to facilitate this. This is where BGP comes in. So you have your upstream routers, your physical routers — they could be anything: Cisco, Juniper, they could even be virtual routers like FRR or Bird — and effectively they're the ones that are going to be configured to peer with your Cilium BGP control plane. Now, on the Cilium side, notice that every single node you have gets a unique pod CIDR, right? And when you have that unique pod CIDR per node, those pod CIDRs need to get advertised into BGP as well. So you have this upstream router — let's highlight it, right? There's a little green thing here that has networks A, B, and C. And then down below you have this Kubernetes environment, and you've got three nodes as well.
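As a rough sketch of what that load balancer IP pool looks like as a CRD — the pool name and CIDR here are made up, and the exact field names have varied between Cilium releases (the 1.14-era `v2alpha1` API used `cidrs`; newer versions changed the schema), so check the docs for your version:

```yaml
# Hypothetical pool: Cilium assigns LoadBalancer service IPs from this
# range, and the BGP control plane can then advertise them upstream.
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: legacy-access-pool
spec:
  cidrs:
  - cidr: 10.12.151.0/24
```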
Now, what happens is that when I turn on the BGP control plane process — and you would do this using a Helm installation or a Helm upgrade, where you enable the flag — you then also have to create something called a Cilium BGP peering policy, which is effectively telling your Kubernetes cluster that your nodes are going to peer with upstream devices, or upstream routers. So once you have that all established — and it is this specific CRD that needs to get deployed — you're going to start to see networks show up on your upstream router that originate from the Cilium side. So that means your legacy systems that exist well outside the world of Kubernetes can access any one of those services. But you also have some flexible models. I'm not going to get into the flexible deployment models, because Daniel and his co-speaker are going to get into those in a later session as well. So when I deploy a Cilium BGP peering policy, I have to specify a few key details. I have to specify my local ASN, or local autonomous system number — the one that I am a part of — and I also have to annotate my nodes and tell them what router ID they would have, because you need router IDs to establish BGP peers, or peering relationships, I should say. But you also need to determine who your neighbors are. So you have to specify who your neighbors are, and your neighbors need to be reachable — IP reachable. So if they're not directly connected, then you have to find a route to get to them; you might be using an underlying protocol or static routes to achieve that level of connectivity. And then finally, you also need to make sure you're specifying what you want to export, or what you want to advertise. These could be a pod CIDR or a service CIDR. So when you expose services in Kubernetes as well, those services get advertised — their IPs get advertised as /32s to the rest of your network.
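Enabling the control plane is a one-flag change. A sketch, assuming Cilium was installed from the official Helm chart into kube-system:

```shell
# Enable Cilium's BGP control plane via Helm (flag name per the
# Cilium 1.14-era docs); --reuse-values keeps your existing settings.
helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set bgpControlPlane.enabled=true

# Restart the agents so they pick up the new configuration.
kubectl -n kube-system rollout restart daemonset/cilium
```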
So you know how to get to those services from the outside world. All right, so let's take a quick look at that BGP peering policy. There are various kinds, but I'll explain this one in detail. Now, the very first section here, the node selector, is basically saying that any node with this label on it is going to be targeted to peer with our upstream router — it's going to be part of that peering relationship. The next thing we have is that we want to set export pod CIDR to true, because we just want to advertise our pod CIDRs. We're specifying our ASN, or autonomous system number, which is locally significant here. And then when we specify our neighbors, we have to specify that peer neighbor's autonomous system number along with its address. And then finally, we also want to make sure we match on labels for the services we want to advertise as well. So if we have service IPs that we need to relay to the rest of the network, we have to specify that here, and that'll pick up on any services in Kubernetes matching those labels. So with that, we're going to bridge some gaps with BGP. Let's get into our little demo environment — you can follow along if you want to hit up that QR code. I created a repo, and that repo will get you up and running. I actually decided to do something very silly last minute and implement something called Containerlab, which ended up breaking some of my demo — which is fine, because I still managed to make parts of it work. But the idea of using something like Containerlab is that it effectively allows you to consider how you would stage and test your networks, getting them ready for production. So how many of you actually tested your network configurations in a staging environment before you deployed to production? Not many of you. So you just did a copy, run, start, and then hoped for the best, eh? Okay, all right. Me too, don't worry.
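Putting those sections together, a policy along these lines is what I'm describing — treat this as a sketch against the 1.14-era `v2alpha1` CRD; the label keys, ASNs, and peer address are placeholders, not the exact values from my repo:

```yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: tor-peering
spec:
  # Only nodes carrying this label participate in the peering.
  nodeSelector:
    matchLabels:
      bgp-policy: enabled
  virtualRouters:
  - localASN: 65001          # locally significant ASN for the cluster side
    exportPodCIDR: true      # advertise each node's pod CIDR
    neighbors:
    - peerAddress: "172.20.20.2/32"   # upstream (FRR) router
      peerASN: 65000
      gracefulRestart:
        enabled: true
        restartTimeSeconds: 120
    # Advertise the IPs of services matching these labels.
    serviceSelector:
      matchLabels:
        advertise: bgp
```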
So one of the reasons you have to consider using a staging tool is that if you're trying to test things like security policies, or rate limiting, or QoS mechanisms, or even how the flow of traffic occurs, you might want to use something like Containerlab to test it all out. Because in Containerlab, not only can you bridge to a live Kubernetes environment that runs Cilium, you can also stand up virtual routers of various types. In my situation, I'm going to use FRR — free-range routing, right? You could use Bird, you could use Vyatta, you could use — you can think of anything: Cisco IOS, or, I don't know, IOS XRv, or I don't know what it's called anymore. Anyways, there's a whole bunch of virtual appliances; if you go to containerlab.dev, you'll be able to check them out. In my environment, I've got a kind cluster that is running Cilium. It's got two basic nginx pods. The Cilium BGP control plane process is already enabled, and all I want to do at this point is deploy my Cilium BGP peering policy, because then I can see that I can establish my communication and my peer relationships, and then do a simple test. If that all works, then everything else technically should work. I'll tell you something, though. What broke in my lab is that when you use Containerlab, it actually creates a new Docker bridge. That Docker bridge is called clab, and it's used for all the interfaces that you specify for any sort of endpoint or node inside of your little Containerlab topology to attach itself to. So you would also have to customize your configuration — and I'll show you one very shortly — to bridge it to, let's say, a kind network.
So let's just say you spun up a kind cluster: it creates a Docker network called kind, and then you would want to attach your Containerlab topology to that kind bridge, because that way you can interface with your kind cluster and you should be able to see traffic flowing. So that's exactly what I did, but there are also instances where you can just do point-to-point connections, so it's not a multi-access setup, right? And that might be great if you're trying to demonstrate routing functionality, because when you do get into multi-access setups, you have a massive broadcast domain, and routing can get a little interesting there. So let's get into the environment and take a look at what's going on. I do apologize, I'm going to have to do a side-by-side here, so please bear with me. And I will try to maximize — let me know if you can see that. Looks good? Looks good, cool. Nope, that's too big. All right. So let's launch the fancy UI called k9s and see what's going on here. So I already have a BGP peering policy deployed; I'm just going to delete it, because I want to verify on my end. I have the Cilium CLI installed on the system, so I should be able to run Cilium commands. If I run cilium bgp peers, nothing shows up — as expected, right? Now, if I go back to my k9s and type in services — if you see, I have a few services. Actually, where are the rest of my services? That looks weird. Oh yeah, there it is, okay. So if you see there, I've got a default nginx service that's been exposed. It's got a cluster IP, but it's also got an external IP, which you can't really see — if I make this a little bit wider, there you go, you can see it now. So that IP is what I'm trying to get to from my legacy network. I mean, I can't get to it now. I could expose it, and then, sure, it might be accessible.
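To make that bridging concrete, a Containerlab topology along these lines is what I mean — a sketch, not my exact file: the node names and image tag are made up, and the long `br-...` name stands in for whatever the real Linux bridge behind the Docker kind network is called on your host:

```yaml
# Hypothetical clab.yaml: two FRR routers plus a "bridge" node that
# attaches the topology to the kind cluster's existing Docker bridge.
name: espresso-bgp
topology:
  nodes:
    frr1:
      kind: linux
      image: frrouting/frr:v8.4.0
    frr2:
      kind: linux
      image: frrouting/frr:v8.4.0
    # For bridge nodes, the node name must be the actual Linux bridge
    # name on the host (e.g. br-3a2f1b9c4d5e), NOT the Docker network
    # name "kind" that docker network ls shows you.
    br-3a2f1b9c4d5e:
      kind: bridge
  links:
    # Point-to-point link between the two routers.
    - endpoints: ["frr1:eth1", "frr2:eth1"]
    # Attach frr2 to the kind network's bridge.
    - endpoints: ["frr2:eth2", "br-3a2f1b9c4d5e:veth-frr2"]
```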
But if the service I'm trying to access is in a very distant network, I have to create that routing capability. So let's also take a look at some other things here. I just want to show you that Cilium is running. So if I list pods, you can see that Cilium is functional and operating normally. If I went over on this side and did a cilium status, normally it would say that the BGP control plane is enabled — or maybe I must have misread that somewhere else, but that's fine. I'm running 1.14.3. That's a pretty recent version, so everything is stable and has a lot of additional functionality around BGP — more than what I'm actually going to show you, and that's why I said go check out Daniel's talk afterwards. Now, let's go ahead and take a look at a few things here. So if I do an ls, I've got a few files in this environment. I want to go take a look at my k8s example. So let's go to kubernetes, let's go to five, which is where my files are stored, do a quick ls to check some stuff, and let's take a look at that clab.yaml. This clab.yaml is the topology definition for how I've set this up. It's basically establishing all the linkage between the two routers that you saw there, as well as the kind cluster. The one thing that I wanted to point out — and this is something I had to really think about and hack away at — is that when you want to interface with an upstream bridge, you have to identify it and provide the bridge name. But if you do a docker network ls, that's not the bridge name that's going to show up, and it's the bridge name that this topology file is expecting. So I actually had to install brctl, the bridge control utility, get the actual name of the bridge, and then take that and drop it in here to make this work. I spent like an hour trying to figure this out, and it's just like, oh, it was right in front of me this entire time. So, something to note when you're deciding to test this out for yourself.
Now, if you look here, the links below are actually demonstrating how these are all connected. So I've got FRR router 1 with Ethernet interface 1 and FRR router 2 with Ethernet interface 1 directly attached. That should be a point-to-point link that's running over the clab bridge, and then you see the other interfaces as well. Let's go ahead and log into FRR 2. So, the way you would do that — how many of you are familiar with vtysh? Yeah, network engineers will know, because this is the way we log into our devices, right? So I'm going to show you how you log into an FRR router: docker exec, because I want to exec into the container, then the name of the actual container, and then run vtysh. If I ran bash instead, I'd get into the underlying operating system, and I normally wouldn't need that if I'm just working with the BGP control plane process. So now I'm on the router. If I do a show run, you can see my routing configuration, my BGP configuration for my peers. You can see something like graceful restart that's enabled. Why would I use graceful restart? Okay, quick sidebar. There are situations where maybe my control plane needs to be restarted, or it goes down, but I do not want to lose my routes — I do not want to lose my data path. Graceful restart is a mechanism to tell your peers: I'm going offline for a bit; don't discard my routes; when I come back online, I'll re-establish my peering. And there's a timer you would normally set to make that possible. So that's why it's set there. This might be great in situations where you're upgrading Cilium, or you might be upgrading the other end of the network, and your data path is still available to you. We'll run a show ip — or show bgp summary, I think that's the command. The Containerlab expert is sitting there; I can ask him. No — show ip bgp, there we go. Okay, so I've only learned about one network, which is directly connected to me.
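For reference, the relevant piece of an FRR configuration with graceful restart looks roughly like this — the ASNs, router ID, and neighbor address are placeholders, not my lab's exact values:

```
! Hypothetical frr.conf fragment for the upstream router
router bgp 65000
 bgp router-id 172.20.20.2
 ! Keep a restarting peer's routes in the data path instead of
 ! discarding them, for up to the advertised restart time.
 bgp graceful-restart
 bgp graceful-restart restart-time 120
 neighbor 172.20.20.10 remote-as 65001
 address-family ipv4 unicast
  neighbor 172.20.20.10 activate
 exit-address-family
```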
Even though I've got my routing configuration in place, I still need to deploy my BGP peering policy. So let's exit out of here and go ahead and deploy that: kubectl apply -f on the Cilium peering policy GR YAML — GR stands for graceful restart. So that's been applied. If I head over to k9s, and if I type cilium bgp — ooh, there it is, right? And actually, let's see it over here: if I run cilium bgp peers, I'm seeing peers now. But what's the one thing that tells me this is actually working? There's a column here called session. When I see that keyword, established, I know BGP's working. Let's go check and see what the FRR router is telling us. So if I do a show ip bgp — will you look at that? I've got neighbors and networks coming in. So that means I know my peering relationship is working. The one last test we want to make is: do I have connectivity? So if I go over to my containers — I have a, oh no, it's actually not here. I have a container here, and I have to remember what it's called: docker ps. Oh, it's called clab-espresso-bgp-pc1. So let's docker exec into that as well: exec -it, and let's just swap the vtysh and the frr1 for pc1 and bash. All right, so now I'm in the little PC that I've attached to that network. If I do an ifconfig, I'm attached to the same — ooh, this doesn't look healthy. Oh, might've been something that — here, I'm going to just do one quick test here, just to make sure that — okay, yeah, we're good. So I have an interface that's actually in the same network as those two FRR routers, but the thing is, it needs to know how to get to those other networks. The one thing, though, is that because the default gateway of this container is actually not the FRR router that it's facing, I have to manually create a static route. And this is all because I'm using Docker networking, right?
The reality is, if you did this in a real-world production or staging environment, you wouldn't have to worry about this, because you have so much more control over how you configure your networks and the TCP/IP stacks of your actual endpoints. In my case, let's just take a look at being able to ping the default gate — well, ping my local gateway, .20.23 — which was good. And the IP that I actually want to ping is my nginx IP, or the service IP. So the one I want to ping is 10.12.151.63. So I cheated a little bit; I'll show you something. I have a static route, because, again, this device doesn't learn about that network, since its gateway is not the directly facing FRR router. But if I do a ping — oh shoot. If I do a ping of 10.12.151.63 — demo gods and goddesses, please be with me — dang it, I knew something wouldn't work. This was working earlier this morning. Anyways, I could troubleshoot it, but the ping is supposed to go through. I'm confident that this is working well, because the fact that I'm learning routes from my peer neighbors tells me that BGP is up and running. My peers are exchanging networking information, and things should work. In my case, I might have had a missed route somewhere on the other end, and that's fine, and I'm okay with that, because if you run through the same Git repo that I showed you earlier on — through that QR code; I can go back to it — that repo doesn't use Containerlab, it just uses straight-up Docker networking, and everything works cleanly. So go check that out. All right, well, that was a bit of a failure. What I can do, however, is go to the FRR router and see if it can ping that service. Let's take a quick peek and see if that works, because remember that the FRR router is learning about that service IP, right? So I technically should be able to ping. Please don't let me down now. Oh my gosh, this is terrible.
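The static-route workaround on the PC container amounts to something like this — a sketch; the service CIDR and next-hop address are placeholders standing in for the directly attached FRR router, not my exact lab values:

```shell
# Inside the PC container: Docker set the default gateway to the Docker
# bridge, not the FRR router, so the service network needs an explicit
# route via the directly attached router's address.
ip route add 10.12.151.0/24 via 172.20.20.2

# Verify the route landed, then try the service IP.
ip route show 10.12.151.0/24
ping -c 3 10.12.151.63
```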
You know, the thing is, I only have a minute left, so I can't debug within that minute, but I don't know why. I'll have to go debug it later on, figure it out, and record a nice clean demo for y'all. Anyways, let me find out where my other browser is — it's hiding. Yeah, it could be something else; I don't know what it is. Anyways, all right. So, BGP has come a long way, both in the real world and in the world of cloud native. The fact that Cilium is actively driving more innovation around BGP functionality and how it can integrate with other networks means that this is going to be very powerful for your use cases as you're onboarding into Kubernetes and onboarding your legacy services into those networks as well. You can use Containerlab — although please don't try to do what I did, changing things last minute, because now you see how the ping fails. Come find me later on; I'll be at the Cilium booth, and I will try to get this working and hopefully show you. And then also, there are so many other use cases that get enabled because of this. So imagine you're running service proxies in your environment that also require that load balancer service — well, you can advertise those service proxies into the rest of your network as well. Come find me at the CiliumCon booth right outside — I'm at the Solo booth. Come check us out. I can answer any more questions you have regarding Cilium and BGP. Maybe we can try to get this working together and hack away, or I might have to find Duffie and try to get this working. Thank you, everyone, for coming to my talk. I hope you learned something today, and I hope you have an amazing KubeCon. Take care, everyone. Thank you very much, Marino. And we have time for two questions, if people have questions. Any questions? Yes. Actually, I mean, if you want — okay, yeah, go ahead. So iptables isn't blocking ICMP or anything like that.
I think I just missed a route somewhere else, because when I tested this out yesterday afternoon, I had to add a route somewhere inside of kind, across my nodes, to get back to that actual container, and that's probably what it was. But at the end of the day, if you follow that repo, it will get you to where you want to be. Any other questions? Yeah, I had one question about the graceful restart that you talked about closer to the beginning of the talk. Is there any communication on the GoBGP side about the actual availability and liveness of translation services? Things I've seen in the past include black-holing service translation because, for example, kube-proxy or eBPF was not configured correctly, but you were still advertising the service block. Is there any communication on the back end of that with the Cilium solution? So, Cilium's BGP support is operating at a lower level in terms of networking, so it's never going to be able to tap into what the services are doing when we're talking about BGP. This is where some of the other features in Cilium itself actually provide that. So Hubble, I think, offers that capability to understand when services are going offline, or how they've gone offline, and how long they've been offline for, and that's tied to, like, a TCP flow or a UDP flow, which you can then tie back to an actual service inside of your cluster or outside of it. But that isn't related or tied to graceful restart, because GR is more about the control plane activity — making sure that even though the control plane goes down, the data path will still function. Yeah, is it aware of the whole control plane, though? Or just the BGP part? So, the BGP control plane in Cilium is aware of its own control plane. It's not aware of anything further upstream unless it's doing some sort of multi-hop BGP.
So I didn't show that part of the demo, but there are multi-hop capabilities: you can peer with a router that might be several hops away, and graceful restart will still apply there. You could still maintain that, still capture that, and ensure that your systems and services stay alive. All right, thank you very much for your time. No worries. Any other questions? I actually have one related to the former one: when you have a service backed by multiple pods, are only the nodes where the pods are running announced over BGP, and then... Sorry, when you have a service... Like the nginx pods, for instance, in your example — imagine you have five pods, and in total you have maybe 100 nodes and five nginx pods. Are only the five nodes where these pods are running going to be announcing the service route? Yes, because the other nodes don't have that pod running, so there's no instance, and no reason for them to advertise those particular networks. It's only advertised where they are running. But you can also use your Cilium BGP peering policies to control which nodes are part of a peering relationship, and then the rest of the nodes in your cluster may not have that same peer policy.