Hello everyone, welcome, and thanks for coming. The last time I stood up in front of so many people was when I was 14 and playing a triangle in a brass band, so this means a lot to me. My name is Petr Horacek, and I've been working at Red Hat here in Brno for the last four years: three years on oVirt, a.k.a. Red Hat Virtualization, and a year ago I joined a brand new project called KubeVirt. KubeVirt is an add-on for Kubernetes that brings virtualization support on top of Kubernetes clusters and allows you to run virtual machines next to your containers. In this talk I will tell you about our journey of implementing an advanced networking solution on top of Kubernetes. By advanced networking I mean, for instance, physical device passthrough or access to multiple layer 2 networks from pods or virtual machines.

So in this talk I will tell you about the challenges and the solutions we came up with. I will start with cluster networking evolution: how the view on connecting our services changed with time. This is important to show you the ground we are building our solution on and why it is different from our previous projects. The biggest part of this talk will be about the so-called four pillars of networking, which is pretty much just a fancy name for the architecture we chose to follow.

Why should you care about this? I mean, Kubernetes networking is beautiful. It's simple to use and its architecture is pretty neat, as long as you are dealing with the default networking, which is usually an overlay connecting all the pods together. If you need something more advanced, like the mentioned physical device passthrough or multiple network connections, it can get a little messy. This is what we are dealing with. We worked with our community and with other projects to align on a set of solutions to these problems, and I think the result is pretty good. So I want to ask you: if you don't have a really good reason to choose a different approach and rewrite it completely, please don't, or at least use the same API, so we can all profit from your work. Finally, I will tell you about the tools we developed.

So, let's start with networking evolution. In the beginning, in the physical age, let's say you had these four physical machines with their physical NICs, and you connected them using physical switches and routers to create one interoperable network. Once that's done, you can run your services on top of it. But maybe you decide next week that you need an additional connection to another network. So what do you do? You go to your network administrator, maybe he's a grandpa, maybe he doesn't like you, and you beg him on your knees to provide you another switch, router connection and interface, and it can take some time and energy. Is anyone in here a network administrator, by the way? That's great, thank God.

Another problem connected to this approach is that if service A wants to access service B, service A needs to know the IP address of the second machine and the port on which service B runs in order to access it. And in case service B moves to another machine, service A must be notified that the IP address changed, and so on. So that's another problem connected with this approach. Then we got to the virtual age, and not much changed.
We still have our machines, switches and routers; the topology is pretty much the same. We just made a virtual machine from a machine, a virtual switch from a switch, and so on. The benefit is that if we decide we want another network connection, all we need to do is a few clicks to get our virtual network up and running. No need to deal with other people.

Then the cloud age came, and the topology is completely different, at least from the user's perspective. We still have our services, and they are running on some machines, but we don't really care about them. We just have a set of nodes and our services, and all these services are connected using an overlay network, or some other network that connects them all together. The difference is that every single service here has its own IP address, which doesn't change. The point is that service A can access service B just using this IP address. At least, this is the premise of Kubernetes networking.

That's all for Kubernetes networking, but then maybe your service needs high throughput or access to a storage network. How to do this is not defined by the Kubernetes network design; you need to somehow connect the service to an additional interface and to the storage network. Or maybe you want to use a private network to connect two of your services; again, you need to invent your own solution for that. Or you need physical device passthrough with SR-IOV; again, you need to do it yourself. It's not given to you. But we had to deal with these problems, because these use cases can be useful for containers, but they are really important for the virtual machines we are running.

So we came up with our four pillars of networking, and with our KubeVirt mantra, called the KubeVirt razor: if something can be used for pods, it should not be implemented just for VMs. You will understand it pretty soon, I hope. The first pillar is node network configuration; then we have logical network definition, smart scheduling, and the VM binding mechanism. I will go through all of them now.

So let's start with node network configuration. If all you need is the default Kubernetes network, you can probably configure it on day one and use Ansible or shell scripts to set up your overlay or whatnot, and you are done. However, if you want to use physical devices or additional networks, requirements change over the following weeks and months, and you need to configure those dynamically. Of course, you can again use Ansible or shell scripts, but that can be tedious. So what we created at Red Hat is something called nmstate and kubernetes-nmstate. For those of you who didn't attend the nmstate presentation on Friday: nmstate is a tool that allows you to declaratively define the desired network state on a host and then just apply it, compared to, for instance, NetworkManager, which has an imperative approach to it. kubernetes-nmstate uses nmstate on the cluster level. What does that mean? We created two new Kubernetes objects. The first of them is called NodeNetworkState, the second is NodeNetworkConfigurationPolicy. There is one NodeNetworkState per node in your cluster, and it allows you to check what the state of the network on the host is, and also to configure it.

And if you want to apply some general rules to configure the network on your cluster, you can use NodeNetworkConfigurationPolicy and say, for instance: on every NIC that is SR-IOV capable, enable eight virtual functions. What can it look like? You have this basic cluster of nodes and you add a new switch. If you want to expose this switch to your pods or VMs, you maybe need to create a bridge beneath them. With kubernetes-nmstate, you can use the NodeNetworkState status to see eth1 reported and check that it's really there, and then you can create a specification that says: take eth1 and create a bridge on top of it. I won't go into the details of this, but trust me: if you apply this, you get from the first state to the second just using kubectl and those objects. So, in some cases you need dynamic configuration of your cluster networking, and kubernetes-nmstate makes it easier for you.
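As a rough sketch, a policy that creates a bridge on top of eth1 could look something like the following; the exact API group and version differ between kubernetes-nmstate releases, and the names here are illustrative:

```yaml
apiVersion: nmstate.io/v1alpha1        # API version varies across releases
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br1-on-eth1                    # illustrative name
spec:
  desiredState:
    interfaces:
    - name: br1
      type: linux-bridge
      state: up
      bridge:
        port:
        - name: eth1                   # enslave the newly reported NIC to the bridge
```

The state reported for each node can then be inspected with something like `kubectl get nodenetworkstates -o yaml`.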
Then the second pillar: logical network definition. For a user, the logical network might be just a simple name for a certain connectivity, while for the administrator who defines these logical networks, it's a method for how to connect a pod to the desired network. This problem wasn't bugging only us, but many people across Kubernetes, so the Kubernetes Network Custom Resource Definition de facto standard was created; you maybe know it under the name Multus. A quick intro to Multus: by default, Kubernetes uses only one single network plugin. What Multus does is become this one single network plugin, and then it calls the default one, and maybe a second network, maybe a third network, and so on.

So what we do here is that when a user creates their pod, it is connected to the default network, and maybe they requested an additional network, the blue network in this case. They don't care how they are connected; they just want an eth1 interface connected to this network. Based on the logical network definition, they are connected to it using a bridge and maybe some VLAN, but that doesn't matter for the user; the user just requires the blue network. It can look like this: this is the definition of a NetworkAttachmentDefinition called blue-network, and it says: connect me to OVS bridge br1 and tag it with VLAN 100. Then the user just requests it using an annotation. So the second pillar, logical network definition, represents connectivity to a certain logical network, and with Multus and the de facto standard we use an object called NetworkAttachmentDefinition to do that.
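A minimal sketch of what such a definition and a pod requesting it might look like; the CNI config follows the ovs-cni plugin's schema, and the pod name and image are illustrative:

```yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: blue-network
  # optionally ties scheduling to a node resource, see smart scheduling
  # below; the resource name here is illustrative:
  # annotations:
  #   k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/br1
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "ovs",
      "bridge": "br1",
      "vlan": 100
    }'
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-blue                           # illustrative name
  annotations:
    k8s.v1.cni.cncf.io/networks: blue-network   # request the secondary network
spec:
  containers:
  - name: app
    image: alpine                               # illustrative image
    command: ["sleep", "infinity"]
```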
Now the third pillar is smart scheduling. Nodes in the cluster are not all the same: some of them may have additional network interfaces, some of them SR-IOV. You need to make sure that when a user requests connectivity to a network, their pod is scheduled on a node that actually has access to that network. Maybe you could do it using node labels and node selectors, but that would be tedious, and again, the user doesn't care about nodes; they want to get this or that connectivity. So, as part of the de facto standard, we have scheduling as well.

Let's say I'm back to my cluster with my three nodes, and I have a pod that has no special requirements; it can be scheduled on any of these three nodes. But now it requires the blue network, so the first node is out of the game: it doesn't have access to the blue switch. And then maybe the third node has so many pods connected to this network that the bandwidth is gone; we cannot schedule any more pods on there, so it's out of the game too, and we end up with this single node available. Smart scheduling makes sure this happens automatically for you. It's done using node resources, but we don't have enough time to get into that, so if you are interested, check out the slides later. There are two methods for doing it, both based on node resources. In short, you can either use extended resources, if you handle a kind of unlimited resource, or you can use device plugins, if you care about every single connection available on the node, for instance the virtual functions of your SR-IOV card. As I said, it's handled using the de facto standard and Multus, and there is an additional annotation in the NetworkAttachmentDefinition that says: I want pods to be scheduled on a node with this resource available, in case a pod requests this network. Okay, so to wrap up smart scheduling: Kubernetes provides enough tools to implement these scenarios, but it can be tedious to do it yourself, so Multus and the de facto standard glue together logical network definition and scheduling.

Finally, the fourth pillar, which is the only VM- and KubeVirt-specific one: the VM binding mechanism. In KubeVirt, a VM is just another process running in the cluster, and it's treated as any other pod would be; the previous three pillars we simply consume. A quick intro to pods and VMs: this is a pod, and it's just an isolated namespace, a network namespace in this case, with an eth0 that has connectivity to some outside network. A container, which is part of the pod, is a set of other namespaces, but we don't care about those since they are not network related. In this container we run our processes, and in the case of KubeVirt, this process happens to be a virtual machine. Unfortunately, or fortunately, a virtual machine has its own network namespace: it is isolated from the pod's eth0 and cannot see it, and it has its own eth0. So how do we connect these two and make sure that the virtual machine has access to the outside network?

In KubeVirt we have two basic mechanisms to do that. The first one uses a Linux bridge. In this case, we go back to the pod and create a bridge connected to eth0, which has access to the outside network. But we cannot keep the IP address on that interface; we need to move it inside the virtual machine. So we remove it from the pod's eth0, create a DHCP server that offers this IP address on the bridge, and then we connect the bridge to the VM. The VM runs a DHCP client and receives this address. And now, as you see, the VM is connected directly to the outside network, and it has the IP address which originally belonged to the pod. It became the pod, from the networking perspective. The issue connected with this approach is that if we run additional sidecar containers in the pod, they won't have network access, since the IP address is missing there. In the virtual machine definition, the request is simple: we request a pod network and we attach it to the VM using the bridge mechanism.

The second binding option is masquerade. In this case, we forward only specific port traffic to the VM; the rest stays in the pod, and that allows us to overcome the problem with sidecar containers: we can redirect certain traffic to the VM, and the rest can be handled by sidecars. So again, we have our disconnected machine. We connect it to a bridge, so now it has access to the pod's network namespace, but not the outside network. Again, we start a DHCP server, but this time we don't offer the IP address of the pod; instead, we offer some static IP address. It can be anything; it's just so the virtual machine will obtain an IP address. Finally, the difference here is that we use iptables to say: if there is traffic coming to the pod and the destination port is 80, it will be forwarded into the machine. And as I said, thanks to that, we can use sidecars. The virtual machine instance definition would look pretty much the same, except we specify the masquerade binding instead of a bridge.
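As a sketch, the interface part of a VirtualMachineInstance spec for the two bindings might look roughly like this; the field names follow the KubeVirt API, while the VMI name and forwarded port are illustrative:

```yaml
apiVersion: kubevirt.io/v1alpha3    # API version depends on the KubeVirt release
kind: VirtualMachineInstance
metadata:
  name: vmi-demo                    # illustrative name
spec:
  domain:
    devices:
      interfaces:
      - name: default
        bridge: {}                  # bridge binding: the pod's IP moves into the VM
        # the masquerade binding instead would be:
        #   masquerade: {}
        #   ports:
        #   - port: 80              # only this port is forwarded into the VM
  networks:
  - name: default
    pod: {}                         # the default (cluster) pod network
```

A secondary Multus network would add another entry under networks, something like multus: {networkName: blue-network}, with a matching bridge interface.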
Finally, for Multus and multiple network support in KubeVirt: we use the same mechanism for every secondary network that is passed to the pod. The secondary networks are requested the same way a pod would ask for them, and then we just consume what is given in the pod. The binding process is the same; there is just an additional interface and an additional network request defining the blue network, as we saw. To recapitulate: the virtual machine is just another process running inside the cluster, and based on the binding mechanism we choose, we get different capabilities and performance.

So, the takeaway from this presentation: the Kubernetes network approach changed the view on networking. Instead of thinking in terms of switches and routers, we think in terms of connectivity from service A to service B. And although it can be hard to run these types of workloads, like connecting to a physical interface from the pod, it is possible, and I hope you now have at least some idea of which tools to use and how it all works. And yeah, that's all. Thanks for your attention. If you have any questions or comments, this is the time.

So the question is whether it's possible to attach multiple IP addresses to a single pod. Not by default, I don't think it's possible by default, but you can do it if you configure... okay, let me rephrase: if you configure the networking yourself, or you configure the plugins that give you the networking. So in theory you could do it, but the question is why would you do it? Yeah, I mean, if you want to handle it yourself, it's definitely possible. Are you talking about the secondary connections or the primary network? I don't think it's possible, or that anyone does it, because there's no need for it; you just need the connectivity from A to B. Does that answer your question? Yes? Okay.

So the question was which binding mechanism is the fastest, with the best performance. The fastest option, which I didn't really talk about, would be passthrough, the SR-IOV binding, obviously. Then, and I don't have any data to support it, I think the bridge option would be the second one, and then you have the iptables (masquerade) option, when it comes to the secondary networks. Okay, if you didn't hear, just for the recording: as of now, multiple addresses are not supported in Kubernetes, but it's in the making. Any other questions? Okay, then thank you very much and bye.