Hello, everyone. Welcome to Cloud Native Live, where we dive into the code behind the cloud native. I am Itay Shakury, Director of Open Source at Aqua Security. I'm also a CNCF Cloud Native Ambassador, and I'll be hosting today's show. So this is Cloud Native Live. It's a weekly show. Every week we bring a new set of presenters to showcase how to work with cloud native technologies. They will build things, they will break things, and they will answer your questions. It's every Wednesday at 11 a.m. Eastern Time. And this week we have Dan Finneran. He's going to talk to us... Hi. He's going to talk to us about kube-vip. We'll hear about it shortly. Just a quick reminder that KubeCon + CloudNativeCon Europe has just ended and the videos are up on YouTube for your on-demand consumption, so you can go ahead and binge on that. And one more reminder before we get started: this is an official live stream of the CNCF and, as such, is subject to the CNCF Code of Conduct. So please do not add anything to the chat or questions that would be in violation of that Code of Conduct. Basically, just be respectful of each other and let's have fun. And with that, I'll hand it over to Dan to introduce himself.

Fantastic. Hi. Well, thank you very much for having me today. My name is Dan Finneran. I am part of the DevRel engineering team at Equinix Metal, and I spend a lot of my time focused on things like Kubernetes on bare metal, working on projects to facilitate getting operating systems and Kubernetes onto physical machines. And that's also what I'm here to talk about today: HA control planes and service load balancers inside Kubernetes in your on-prem environment, bare metal, edge, and things like that as well. The kube-vip project has come out of the experiences that I've had and the problems that I've faced. Prior to Equinix Metal, I was at Heptio, acquired by VMware, where I was helping customers get Kubernetes clusters deployed, scale them up, grow them, both on virtualized environments and bare metal. And prior to that, I was at Docker, where I was doing much the same — engineering work and helping customers. So I've been kind of all over the place in a cloud native way so far.

Nice. Sounds like fun. Yeah, absolutely. So why don't you tell us a little bit about what kube-vip is?

Yeah, absolutely. So kube-vip is a project that has evolved to fill some of the areas that I've found lacking, to a certain degree. I'll cover this in a little bit of detail later on, but as I've been working with customers to get Kubernetes clusters deployed, there's a lot of tooling out there that people can use. However, when we start looking at lifecycle management and automation, that's where I've generally found a number of issues where I think there are ways we could improve. So kube-vip — I think I open sourced it about a year or so ago. Initially it was just a few lines of Go to make my life easier. And as the months have gone by and people have found it to be of more use to them, it's made its way into things like some of the Cluster API providers, and there are quite a number of end users that have used it to stand up HA Kubernetes clusters.
So it provides functionality that means their clusters are highly available in the event of node failures and things like that. But as time has gone on, I've realized that the technologies I implemented for that particular scenario could be used for other scenarios as well. So not pivoting, but extending that functionality to things like service-type load balancers inside Kubernetes clusters. And with that, there's a bunch of additional technologies and techniques that were needed in order to do that. So I guess with that, I can start stepping through what it is that I'm going to talk to everybody about today, if that's okay.

Yeah, please do.

All right, excellent. So as mentioned, today we're going to be talking about kube-vip. It's a bit of a two-phase overview, in that it will step through how I got to kube-vip, it will talk about the technologies that underpin kube-vip itself, it will cover how we can use it for highly available control planes, and it will also cover how we can use it to expose our services to the outside world through service-type load balancers. So this is the agenda for what I'm going to be covering today. I've also got a Kubernetes cluster stood up in the background, so we can quickly expose some things and see VIPs appear and things like that. We're going to touch on the inception of it — what I was doing when I suddenly realized that maybe there's a better way of doing these sorts of things. The architecture section is going to touch on some of the core bits that power kube-vip. We'll have a quick overview of how it provides the highly available control plane. We'll talk about how it is used to provide service-type load balancers within Kubernetes. And then we'll briefly cover the roadmap, in terms of where people in the community want to take it next and some of the functionality that end users have been asking for. So if anybody has any questions around kube-vip or anything that I talk about as I'm going through, please raise those questions and we can delve into them as well.

So, the inception of kube-vip. As I mentioned earlier, I was doing a lot of work around bare metal Kubernetes clusters. There's a lot of different ways that one can get a bare metal Kubernetes cluster deployed, and a lot of tooling is normally involved in order to do that — DHCP, TFTP, et cetera. So I spent a lot of work in the periphery writing software to automate bare metal provisioning. This was, or still is, a project called Plunder, and the idea behind it really was to simplify getting bare metal Kubernetes clusters deployed. It was focused on automating all of the steps of getting an OS deployed and then getting Kubernetes stood up on top of all of that. And where I was at the time, there was a lot of conversation around Cluster API. For those that don't know, Cluster API is a project to standardize the way Kubernetes clusters get deployed. So you have things like a Cluster API provider for AWS, a Cluster API provider for Google Cloud, Cluster API vSphere, et cetera. And I had got most of the bits in place to get Kubernetes deployed on bare metal.
The next step, really, I thought, was maybe I can write a Cluster API provider for this project that I've written. And that's largely where I started to hit a number of problems, in that it's not very easy to automate. The Cluster API provider would stand up the nodes to a certain degree, but I started to realize that a lot of bits were missing and a lot of bits were hard to automate. So what am I talking about? Well, typically, from a high-level view, if you look at a Kubernetes cluster, you interact with the control plane, but effectively you don't tend to care too much about the control plane itself — it's mainly about firing things into it and having the workers that sit underneath it manage all of those workloads. Now, if you lose that control plane under any sort of circumstances, your workloads may continue to run, but at that point you can no longer do any sort of work with that cluster. You can't make any changes to it until either that control plane is fixed or you end up having to rebuild your entire cluster. So in order to get around that, obviously people want highly available control planes, so that in the event that you lose nodes, or you want to do lifecycle management or upgrades of various bits of the control plane, you have that capability without downtime and without losing the ability to interact with your worker nodes.

So typically, most architectures would look like this: you would have three control plane nodes as part of your highly available Kubernetes cluster, and then you would have a number of other nodes that sit atop that Kubernetes cluster, and their role in all of this is typically to provide highly available access to those control plane nodes that sit beneath them. Now, this is where I really started to realize that there are probably better ways of doing this. I mean, if we just look at this quick architecture diagram, we already have two additional nodes that are required just to sit there and provide that additional functionality. In a physical environment, two physical servers can be quite expensive. These additional nodes are costing money, burning electricity, and they're not really doing a great deal of work in order to provide that functionality. And then furthermore, if we start to look at what's inside this layer that provides the highly available access to the control plane, we need two things. We need a clustering technology that will provide this highly available control plane address, and that technology needs to be able to move that IP address around in the event that this HA layer changes for whatever reason. And then underneath that, we need the capability of load balancing traffic to the control plane nodes that sit beneath it. So if we start to think about that, there are two layers of additional tooling, and there's additional infrastructure that's actually required. That's a large amount of operational overhead. And it's not just the operational overhead of these machines — it's the operational overhead of those operating systems, and then there's the operational overhead of those technologies that need to sit within that layer as well. So you need all of that operational knowledge in terms of how the tooling works, how to architect it, install it, and design it. And then for each of those layers there's separate configuration as well.
So there are different configuration files for, perhaps, the clustering part of it that moves an IP address around, another set of configuration for the load balancing part of it, and things like that. So that incurs all of that sort of debt. And then there is also the lifecycle management of it: if I want to upgrade those various pieces or move things around, there's just lots of surface area at this point. And that was the thing I was hitting. I was, at this point, banging my head against my desk, like, there's just too much for me to automate. I have to provision all this additional infrastructure and then I have to manage all of these additional bits of tooling. So this is where the genesis of kube-vip comes from. It became apparent that there must be an easier way — that I could perhaps re-implement all of this in a much simpler way, in a more cloud native way, that sits more nicely with the Kubernetes cluster that I'm trying to provide this functionality to.

And then, taking it a little bit further: as I mentioned, once I had implemented some of these technologies, it also became apparent that the same sort of things could be exposed to other areas of a Kubernetes cluster. In typical on-premises environments, a lot of the technologies and functionality don't come out of the box. So as I deploy my pods and things like that, additional technologies are required to expose those pods to the outside world — typically, as I mentioned, through a Kubernetes Service of type LoadBalancer. And it became apparent that kube-vip already had those technologies in place. It was just a case of marrying up the capability of speaking Kubernetes Services with the technologies that kube-vip already had, at which point I already had all of the bits there to do that. So that's where I took kube-vip, to a certain degree, to the next level in exposing that functionality.

So, cloud users are usually accustomed to being able to easily provision a Kubernetes Service of type LoadBalancer and make it external, and the cloud machinery takes care of provisioning an actual load balancer in the cloud provider, redirecting the traffic, and everything else. So your goal is basically to bring this to the people who don't use a cloud provider. Is that fair?

Absolutely, yes. A lot of this was really down to the sort of people that I was fortunate to work with. They were looking to deploy Kubernetes into data centers that had no internet access. They wanted full management of everything, so running things in public clouds was not really an option for them. And they basically had a requirement to — well, I went to go work with this customer, they sat me down and said, we want this, we kind of want it by the end of the week, and then they just kind of disappeared. So I effectively was left with a week to build them a Kubernetes cluster and start to implement all of this functionality. And from that, I realized that I could automate various bits of it, but there are probably better ways of doing that. And in on-prem environments, a lot of the tooling that you get in the cloud isn't there by default. I'm going to talk a little bit about what a CCM is later on, but that is the secret sauce that makes a Kubernetes cluster able to speak to the infrastructure it's running on.
So, in AWS, when you request a load balancer, the AWS-specific CCM does the magic and you get an IP address from somewhere. You don't need to care where — things are just exposed to you. And it's a little bit more difficult when you want to do those things yourself. So I'm hoping that this project will be — well, it seems to be — making people's lives a little bit easier from that perspective.

Right, awesome. And just to clarify the relationship between kube-vip and the Plunder project: is it required to set up the cluster using Plunder, or is it a separate component?

It's an entirely separate project. It was just that when I was doing all the Plunder work, kube-vip and what it needed to do was all part of that. As of last week, kube-vip has been moved to its own organization. So there's now github.com/kube-vip, and the kube-vip project is in there. The kube-vip cloud controller lives in there as well. So there's a clear demarcation; the two are not dependent on each other at all.

Yeah, thanks for clarifying. No problem.

So, on to the technologies that actually power kube-vip, so it can do HA and expose things to the outside world. We'll quickly step through these. Originally, the grand plan for kube-vip to provide HA was that it would use a technology called Raft. This is a clustering algorithm that powers things inside Kubernetes: the etcd data store, which holds all the persistent data about what lives inside your Kubernetes cluster, uses this algorithm. And the main function of this algorithm is to select, or vote for, one of the members of the cluster to be the leader. So effectively, with the Raft algorithm, you join your nodes together and they start voting amongst one another. One of them is elected leader, and that leader can provide services or do whatever it needs to do as the leader. This algorithm was needed because, effectively, when I want to do HA, something needs to be in charge of providing the HA control plane address to the outside world. So I originally went with Raft. Unfortunately, as we started to work with some of the Cluster API providers, the lifecycle management just didn't really work. When we removed members from the cluster, sometimes the voting wouldn't work properly and nobody would be leader, at which point there was no access — there's no virtual IP, there's no cluster IP address, and you couldn't connect to the cluster. So, ultimately, it seemed like a good idea at the time, but it was a bad design decision.

However, luckily, the Kubernetes API provides an alternative. Surprisingly, it's called leader election. We can actually ask the Kubernetes API to choose a leader for us. And effectively, the way it works is that we can have this code running in a number of different pods, or in a number of different places, talking to the API, and they can all say: I want to hold this lease. Whoever holds the lease is the leader. The Kubernetes API will only allow one of those requests to succeed. So effectively, all of your nodes will try to get access to this lease, but only one of them can actually hold that lease at any one time. Once the leader election has occurred and one of the copies of the code has acquired the lease, it is now the leader. And from an HA perspective, that leader now exposes the control plane address and ensures that traffic can hit the Kubernetes cluster that's running.
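To make that concrete, here's roughly what that lease-grab looks like using the client-go leader election package that comes up in a moment. This is a minimal sketch with made-up names (the lock name, the empty callbacks), not kube-vip's actual code:

```go
package main

import (
	"context"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	// Runs in-cluster, so the pod's service account token is what talks to the API server.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	id, _ := os.Hostname() // this copy's identity in the election

	// The Lease object that every copy of the code tries to hold.
	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "example-vip-lock", Namespace: "kube-system"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: id},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:            lock,
		ReleaseOnCancel: true,
		LeaseDuration:   15 * time.Second, // how long a lease is valid before others may take it
		RenewDeadline:   10 * time.Second, // the leader must renew within this window
		RetryPeriod:     2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				// We hold the lease: this is where the VIP would be brought up and advertised.
			},
			OnStoppedLeading: func() {
				// We lost the lease: tear the VIP down so the new leader can take over.
			},
			OnNewLeader: func(identity string) {
				// Every participant is told who the current leader is.
			},
		},
	})
}
```

Whichever copy wins the lease runs its OnStartedLeading callback; everyone else just keeps retrying until the lease times out or is released.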
The lease can be lost in a number of ways. If that node were to go away, or if that node were to have issues — say it's under high load and getting quite laggy — then the lease can time out. One of the other nodes will acquire the lease in the meantime and take over the duties of being the leader in the cluster. So that technology was just there for us to make use of.

That's super cool. I mean, I'm just learning from this stream that Kubernetes has a built-in leader election system. It's a pretty tough challenge for anyone building distributed systems — we used to set up ZooKeepers and such just for this. So I just want to highlight this learning from you, that Kubernetes has a built-in mechanism for this.

Absolutely. The code behind it is actually all available in the client-go library. And it's beautifully simple, in that it effectively just requires a Kubernetes token to speak to the API endpoint. You can have this code running three times, and each copy of the code will try to get access to that lease, and the one that gets the lease is then told that it can go and do something. So it's just there, fantastically, for you to make use of.

I just want to add another comment for the people asking questions in the chat: we see your questions. If you could just clarify them a little bit, then I can pitch them to Dan. So, just a comment to the people online. Thank you very much.

So, a couple of networking technologies. This might sound like we're going completely off radar, but you need to understand these technologies so you know what's happening. There are two technologies we're going to focus on: one's called ARP and one's called BGP. And the reason these two technologies are important is that when the node that is doing the HA work is elected leader, it needs to inform the network that it is the node that has the cluster IP address. So there are two ways we can go about updating a network.

The first one is called ARP, the Address Resolution Protocol. Effectively, ARP is used to look up the physical bit of infrastructure behind an IP address. So for instance, in the diagram on the left, if I wanted to send traffic between two physical machines, traffic doesn't just go IP address to IP address — it needs to traverse a different layer. In this case it goes via the networking cards, the Ethernet that sits underneath. So ARP effectively allows us to look up the hardware address that is linked to an IP address. As mentioned, if I wanted to send traffic from .20 to .33, I would need to know its physical address in order for the two networking cards to send that traffic to one another. Now, why is this important? This is how we can let a network know that, if you need to send traffic to an IP address, this is the machine it needs to go to. So if we backtrack a little bit: when a node is elected leader, that node takes the cluster IP address. It then does an ARP broadcast, which tells the network that if it needs to send traffic to this particular Kubernetes control plane IP address, it should send it to this physical piece of infrastructure. So this is the way that a new machine, with an IP address now linked to a new bit of metal, can let the network know where to send traffic.
So this is what's called layer two — the data link layer — but it's effectively the linking of a logical IP address to the physical machine where traffic should actually be sent. And this is the most common way, I would say, to let a network know where traffic should be sent. The alternative, which we see in slightly larger networks, is a technology called BGP. BGP allows a device to publish to the network that traffic should be sent to it. Effectively, the devices participate in a thing called peering. What that means is that — if we look at the diagram on the right — the machine at .21 also has a secondary IP address, 10.0.2.5. It will peer with either a router or a top-of-rack switch in the network, and using BGP it will tell that piece of infrastructure that if somebody wants to get to this IP address, they should go through me; I am the route to that traffic. So we can see now that the laptop that wants to get to that IP address has been given that route. It now knows to route traffic through the machine that is advertising it through BGP peering. So again, when a node using BGP wants to advertise the control plane IP address through BGP peering, it can let the network know: to get to the control plane IP address, send traffic to me.

One additional benefit of BGP, however, is that we don't necessarily need to do the leader election. All nodes can participate in BGP peering. What happens when a client wants to get to the Kubernetes control plane IP address is that the traffic is sent to that router, and the router can actually load balance across the nodes that are doing the BGP advertising. So we also get some load balancing for free with BGP. So this is the other technology we can use so that the Kubernetes cluster can make the rest of the network aware of where to send traffic when you want to hit the control plane IP address.

So, a quick comparison. ARP is very simplistic and doesn't require anything special. BGP, however, requires specific infrastructure that supports BGP. ARP can be dangerous, in that a malicious person could start sending ARP updates which black-hole traffic. For instance, I could send an ARP update which says the gateway is actually this MAC address, which means that all of a sudden traffic is going to black-hole, and things like that. BGP, however, can have authentication, and you can impose restrictions on who can do what within that network. Also, some virtualization software can restrict gratuitous ARP. So in the event that the IP address moves to a different host and we need to tell the network that this is where traffic should now be sent, on things like VMware vSphere the vSwitches would need something like promiscuous mode enabled in order for that to work. So that's a quick comparison of the two.

So those are the two technologies we typically use to provide that highly available functionality: either ARP, to say we have moved our highly available IP address to this particular node, so traffic should go to this particular piece of infrastructure, or BGP peering, where we've told the network infrastructure to send traffic to us because we're healthy and we have that BGP link. Those are the two core technologies that power it.

So, how do you actually get kube-vip installed? As I mentioned, I did originally go down the Raft route, which would have allowed kube-vip to sit outside of the Kubernetes cluster.
However, making use of leader election means that we need to run kube-vip inside Kubernetes, so that it can actually speak to the Kubernetes API. So there are two methods we can use to get kube-vip deployed inside a Kubernetes cluster: either static pods, or DaemonSets. Both of them come with quirks that you need to be aware of in terms of how best to get it deployed, so I'll quickly step through them and we'll see if any questions pop up.

This is where I hit upon another strange scenario, in that I wanted to use kubeadm to stand up my Kubernetes cluster, and I wanted to tell kubeadm: deploy this Kubernetes cluster, and this is the control plane IP address, the highly available IP address you should use. However, there's an issue there. In order to get kube-vip deployed, I need a cluster running, so that I can do a kubectl apply, stand the kube-vip pods up, and have them do leader election and advertise that address to the outside world. But how can I deploy to a cluster before there is actually a cluster in place? What happens is that kubeadm init will fail its checks, because it tries to speak to that highly available IP address before giving me a cluster to use. And without that cluster, I can't deploy kube-vip. So I'm in a scenario where I can't get a highly available IP address because I can't stand the cluster up, because the highly available IP address doesn't exist yet. It turns out there's a way around this that we can cover.

This is how kubeadm init works. Effectively, kubeadm init generates a bunch of static manifests inside /etc/kubernetes/manifests. Then the kubelet — the process that manages pods on a host — starts up all of those components. So when you do a kubeadm init for the first time, that starts the motion of standing up your first control plane node: it will stand up the API server, the controller manager, the scheduler, et cetera. Before kubeadm init says everything is all good, it will also try to check that control plane IP address, the highly available IP address. This would fail, unfortunately, because we've not been able to speak to the cluster and apply the kube-vip manifests in order to stand it up. It turns out that the solution to this is relatively straightforward: kube-vip can actually generate its manifest for us and put it in the /etc/kubernetes/manifests directory. So now, with that manifest already there, when I do a kubeadm init with that control plane IP address, the kubelet stands everything up for us. It also stands kube-vip up at the same time, which means all the control plane components start and kube-vip starts next to all of those components. kube-vip starts, that highly available IP address is there, kubeadm init can see it, and everything completes correctly. So we have stood up a Kubernetes cluster with a highly available endpoint actually up and running. The next steps with that are to add in your additional control plane nodes along with those static pod manifests, at which point you have a highly available Kubernetes control plane.

With a DaemonSet, we have a much simpler deployment method; however, this isn't possible with kubeadm. This is more of a deployment approach that you can use with K3s.
So, K3s allows us to stand up a cluster without an HA endpoint to begin with. If we look at the second line of the command I'm showing there, we have the --tls-san flag — that's our HA endpoint. What this actually means is that K3s will come up with that IP address as part of its certificates, so when we stand up our HA control plane we won't get any certificate errors later. So we start our first node using K3s, and that stands up everything we need. We can then apply our DaemonSet that has kube-vip in it. kube-vip will start, it will do the leader election, and it will start advertising that 10.0.2.5 address. And then, as we join our additional control plane nodes, because it's a DaemonSet, kube-vip will just move and grow and deploy itself automatically to those nodes. So as we change our control plane — maybe delete node one, upgrade it with a newer version — kube-vip will just keep moving around and providing that HA functionality for us. So that's how you actually deploy it, in either a DaemonSet mode or a static pod manifest mode.

So what does it actually look like? How does it actually work? If we're using ARP, we can see here we have three control plane nodes, and we have ten workers that sit underneath them. In this example, node one has done the leader election through the API server and holds the lease, so at this point it has that HA 10.0.2.5 IP address. End users will connect to that with kubectl, do kubectl applies, et cetera, and deploy things onto those worker nodes. In the event that that node is removed, or has issues, or crashes, or whatever, the remaining two nodes will start doing the leader election. They will speak to the API server — the local API server — and one of those other two nodes will be given the lease. When it has the lease, it then does a gratuitous ARP and lets the rest of the network know that if you want to get to this HA IP address, send your traffic to this particular node — in this case, the hardware address of node three. And that's effectively how it moves around. When we bring node one back into service, it will do a leader election, it will find that node three is already the leader, and it will basically just sit and wait until there is a new leader election event.

With BGP, however, it looks a little bit different. We have our three nodes up and running, and they are all peering with a top-of-rack switch. So you can see on this diagram, they all have that 10.0.2.5 address, which is our HA control plane address. They don't expose that to the outside world directly — in order for the BGP technology to work, we bind that IP address to a localhost or internal address that isn't accessible on the network. But effectively, when all of these nodes are peering, if any device wants to connect to that control plane IP address, the traffic goes through that top-of-rack switch, that router, and it takes care of sending the traffic to any of the nodes that is part of that peering group. So all three control plane nodes are peering with the top-of-rack switch, and they are all saying to that top-of-rack switch: if you want to get to this IP address, send the traffic through me. And that's effectively how it does BGP HA. In the event that we lose any of those nodes, the BGP peering stops, at which point that path no longer exists at the top-of-rack switch.
Traffic will then just be sent to the remaining peers that are advertising that HA IP address. So that's effectively how the HA control plane looks to the networking topology. We don't necessarily need to use leader election with BGP — all nodes can keep advertising that IP address. It's only with ARP that only one of the nodes can say, send traffic to me. If you had all of the nodes saying send traffic to me, you would end up in a position where things break: traffic is half sent to one node, and then all of a sudden the physical device it's being sent to has changed, and connections break, and all sorts of things like that. So leader election is there to protect you from networking issues like that.

I don't believe there's any questions so far. Nope. How come it's different — just the last point you mentioned, that with ARP it was required to do leader election but with BGP it wasn't? Can you just explain that again, what's the difference?

Sure, absolutely. So with ARP we are effectively telling the network: to send traffic to this IP address, send it to this physical piece of hardware. If we had all three nodes advertising the same IP address but against different hardware addresses — that MAC address to IP lookup that ARP provides — we'd start sending traffic to node one, but then node two or node three would have told the network that they should be getting that traffic, at which point you're going to get broken connections and things like that. However, with BGP, once a connection is established, it lasts for the lifetime of that connection, and the router or the top-of-rack switch that supports BGP takes care of that connection for us, so we don't need to worry about it too much.

Okay, so it's because of the connection semantics, which don't exist with ARP. Yeah — ARP is layer two, the data link layer. It's the layer that identifies things almost at the physical layer, to a certain degree.

We have a quick question: is HAProxy running on all control nodes? It is not, no — we don't need to do that. The kube-vip nodes can either just send traffic directly to the local API server — so if you're the leader, you can send traffic directly to the local API server that's running there — but kube-vip also supports HTTP round-robin load balancing. So it will effectively send traffic to one of the other nodes. For instance, if the leader is node number one, traffic might hit that node but then be passed on to node two or node three. So it can also do that HTTP round-robin load balancing as well.

Cool. So we've talked a little bit about the HA part of it. That was the original goal for kube-vip. It became apparent that once I had those pieces implemented for the HA control plane, I could also use the same thing for Kubernetes Services. One thing to be aware of: there are two components that are actually required for that functionality. You mentioned it before — the CCM, or Cloud Controller Manager. This is normally specific to an infrastructure provider. And then, once the CCM has done its magic, we need something to provide the networking magic; in this instance, we're talking about kube-vip doing that. So, the CCM. The CCM is the secret sauce when running a Kubernetes cluster on, effectively, other people's infrastructure — also known as the public cloud.
And that cloud provider CCM — for instance, the CCM for AWS or Google Cloud, et cetera — is effectively a translation layer between a Kubernetes object and its counterpart within that infrastructure. So if I do a kubectl expose, it is the role of the CCM — for instance, in AWS — to request an Elastic IP address for you and update the Kubernetes object with that information. Same with Google Cloud or Azure or wherever you're doing those sorts of things. The CCM on your own infrastructure, however, needs to be very flexible, because most people's infrastructures are completely different. It needs to be quite configurable for different types of networking design and networking ranges. One of the things I'm looking at doing is being able to plug into things like existing IPAM or other infrastructure management tooling, so that when we request a load balancer service IP address, it can speak to other things in a person's infrastructure to get that information for us.

So how does it all hang together? Well, the CCM typically has one main role. For instance, I'm doing a kubectl expose of a deployment called nginx, and what we actually get from that, to begin with, is a Kubernetes Service, and it would look like this. We can see that one part of it hasn't been filled in yet: the load balancer IP address is blank to begin with. And effectively, it is the CCM's role to update that object with information that's specific to the infrastructure. So again, in AWS, the spec.loadBalancerIP would be updated to an Elastic IP address; the CCM's role there is to speak to the AWS API and populate the information required for that Service to make sense. If we think about our own infrastructure, kube-vip has its own CCM that we can give network ranges to. We can either give it CIDR ranges or a start range and an end range, and the CCM will use that to populate this spec for our environment. For instance, at home I have my CCM configured to hand out IP addresses from .200 to .220, so it has 20 addresses it can use. And if I do a kubectl expose, my local CCM will keep track of those and update the spec with one of them. One thing to be aware of here is that we're not using config maps or anything like that — we are sticking with the Kubernetes objects directly, so we don't need anything special here; it's tight coupling with the Kubernetes objects. The good thing here is that any CCM can replace the one that I have written. For instance, if we look at something like Equinix Metal: their CCM needs to speak to the Equinix Metal API, get me an Elastic IP address, and then populate this Kubernetes object with that information. That's all it needs to do, which means there's no tight coupling between kube-vip and any particular CCM. As long as something is updating this object, kube-vip can react accordingly.
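To make that "one main role" a bit more concrete, here's a rough Go sketch of a CCM-style loop, under a couple of stated assumptions: a hard-coded address pool stands in for the .200-.220 range, and it polls instead of using the watch/controller machinery a real CCM would use. None of these names come from the kube-vip cloud controller; it's purely illustrative.

```go
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// A hypothetical address pool, standing in for the .200-.220 range mentioned above.
var pool = []string{"192.168.0.200", "192.168.0.201", "192.168.0.202"}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	next := 0

	for {
		// Look at every Service in the cluster; a real CCM would watch rather than poll.
		svcs, err := client.CoreV1().Services(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
		if err != nil {
			panic(err)
		}
		for i := range svcs.Items {
			svc := &svcs.Items[i]
			// Only LoadBalancer Services whose address hasn't been filled in yet.
			if svc.Spec.Type != corev1.ServiceTypeLoadBalancer || svc.Spec.LoadBalancerIP != "" {
				continue
			}
			if next >= len(pool) {
				fmt.Println("address pool exhausted")
				continue
			}
			// The CCM's one job: fill in the blank and write the object back.
			svc.Spec.LoadBalancerIP = pool[next]
			next++
			if _, err := client.CoreV1().Services(svc.Namespace).Update(context.TODO(), svc, metav1.UpdateOptions{}); err != nil {
				fmt.Println("update failed:", err)
			}
		}
		time.Sleep(5 * time.Second)
	}
}
```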
So, there are two ways you may want to get kube-vip deployed on your worker fleet. You can either deploy it as a DaemonSet, so it will be everywhere, or, alternatively, you may want it to be a replica set, or tied to specific nodes. The reason being, you may want to ensure that traffic only comes into certain parts of your infrastructure. However, once you have some kube-vip pods deployed, they will take care of advertising these services.

So how does it actually work? Well, once you have your kube-vip pods deployed, they watch Services of type LoadBalancer. We can see here we've just done that kubectl expose. The kube-vip pods are all watching; they've seen that a new LoadBalancer Service has been created. Once the CCM — whichever CCM it is — has updated that load balancer IP address, at that point the kube-vip pod can go ahead and advertise that IP address to the network, to the outside world. Any end user coming in will then be able to send traffic into the Kubernetes cluster, and kube-proxy will take care of passing that traffic to the pods in that Service. And that's effectively the crux of it. It's the same technology that provides the HA functionality, turned on its head a little bit: it now does the exact same thing to provide access into a Kubernetes cluster for the IP addresses attached to Kubernetes Services.

One other thing is that it can also work in a hybrid mode. We saw a number of end users who wanted to have small Kubernetes clusters, or to just have traffic coming in through their control plane nodes. So what we can actually do is have HA control planes and service-type load balancers all sitting together. And in the event that we expose something, it does exactly the same thing: the pods that sit on the control plane nodes advertise this service IP address to the network as well, either through BGP or ARP — "I'm also exposing this, send that traffic to me" — and the traffic is then sent through kube-proxy internally to the services that are actually running.

One of the things that has been added recently — and this is mainly for edge environments — covers the case where I don't want to have IPAM, I don't want to have to worry about IP address ranges. What we can do here, if we look in the left-hand corner, is a kubectl expose where I'm specifying that the load balancer IP address is 0.0.0.0, which isn't really a valid IP address. What actually happens is that when kube-vip sees that that is the IP address given to that particular Service, the kube-vip pod itself — and this only works with ARP, because it needs one node to be the actual leader — does a DHCP request on the network that it's on. It gets an IP address from DHCP, and, as I mentioned, this is normally for an edge environment, and that address is then used as the service IP address for the Service that I'm exposing.
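Circling back to that watch loop for a second before the roadmap: conceptually, the kube-vip side of the service flow looks something like the following Go sketch. Again, the names and the simplified shape are illustrative, not the project's actual code; advertise() here stands in for the gratuitous ARP or BGP announcement.

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// advertise is a stand-in for the real work: binding the VIP and sending a
// gratuitous ARP, or announcing the route over BGP.
func advertise(ip, svc string) {
	fmt.Printf("advertising %s for service %s\n", ip, svc)
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Watch every Service in the cluster; the real thing would also handle
	// re-connects, deletions (withdrawing the address), and leader election.
	w, err := client.CoreV1().Services(metav1.NamespaceAll).Watch(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for event := range w.ResultChan() {
		svc, ok := event.Object.(*corev1.Service)
		if !ok {
			continue
		}
		// Only react once the CCM has filled in an address for a LoadBalancer Service.
		if event.Type == watch.Deleted || svc.Spec.Type != corev1.ServiceTypeLoadBalancer || svc.Spec.LoadBalancerIP == "" {
			continue
		}
		advertise(svc.Spec.LoadBalancerIP, svc.Namespace+"/"+svc.Name)
	}
}
```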
So, coming to the end: the roadmap. There are a lot of things being added recently. As somebody asked, there are no HAProxies required — as I mentioned, kube-vip collapses all of those different technologies into a single place, making it easy to manage. We're looking at improved control plane load balancing through things like IPVS, and at distributed ARP load balancing. As shown on some of the previous slides, only one node, using leader election, is allowed to do ARP broadcasts to the network, so after a while of doing kubectl exposes, that one leader node is exposing all of that traffic, and it can start to become a bit of a bottleneck on the network. We're looking at having, effectively, a leader election per service IP address, which would start to distribute those addresses across all of the kube-vip pods you've deployed. And then enhancing BGP, so being able to share those routes further afield in the network. There are also a lot of improvements coming in observability and monitoring, and then vastly improved documentation — a lot of help is needed there. I've written most of the documentation myself, so it's not very good, but I'm hoping to improve that soon. The final part of the roadmap, really, is that it's now been submitted to the CNCF as a sandbox project. So, fingers crossed, it gets accepted there. And I'm just grateful for all the support that I've had so far as well. So, yeah, thank you very much. That was a quick overview of all of the different technologies. I know there's a lot to cover — there are networking technologies, there are Kubernetes technologies, there's clustering — but thank you very much, everybody who stuck with me through that.

Yeah, and we can maybe address some questions. If you have any, please write them down now in the chat. So there was one question about whether this allows us, as opposed to using Raft directly, to have an even number of nodes in the cluster.

Yes. So Raft requires an odd number due to the voting algorithm. However, with leader election there is no odd or even requirement — it's effectively whichever node has managed to get the lease from the Kubernetes API.

Thanks. Another question about what kind of resources this requires from the control plane servers — or, I guess, the question is more about the controller that is kube-vip.

Sure, yeah. So kube-vip is actually very small; it requires barely any resources. I think the general cap is something like 100 MB on the kube-vip pod, but it tends to use way less than that. As I mentioned, there are a lot of technologies involved, but it's very simple in terms of how it all hangs together and the technologies it actually uses. It sits and watches, and it reacts accordingly. It's quite lightweight, and multi-architecture as well: if you want to run it on a Raspberry Pi, it's on ARM; if you want to run it on big metal x86 servers, the choice is yours.

Great, and yes, the recording is available on the CNCF YouTube channel. And I see that there are no more questions. Go check out kube-vip under github.com/kube-vip — right, this was in the previous slide. Thank you, Dan, so much. It was a fascinating deep dive into distributed computing, networking, and Kubernetes. And I'll see you again next week on Wednesday, 11 a.m. Eastern Time, every week, on CNCF Cloud Native Live. Thank you, everyone, and thank you, Dan, again.

Thank you, thank you very much. All right.