Welcome to the Cluster API deep dive session. First things first: Cluster API is a subproject of SIG Cluster Lifecycle. And today I'm here with Jason. Please introduce yourself.

Thank you, Marcel. My name is Jason DeTiberus, and I'm one of the maintainers of the Cluster API project. I'm currently at Equinix Metal, helping define cloud-native infrastructure management for Kubernetes clusters.

Thank you very much, Jason. And I'm Marcel Muller. I'm a platform engineer at Giant Swarm, and I'm also a Cluster API contributor; you can find me on Twitter here.

So let's get into it. First, the agenda. In the beginning I want to give a quick overview of why Cluster API exists and which problems we are trying to solve. Then we'll talk a bit about the controller managers and the CRDs that are currently provided by Cluster API, and how they work together. Then we'll go into a demo by Jason, where we look at a more real-world use case, then into the roadmap for the Cluster API project, and finally how you can get involved.

So let's jump right in: why Cluster API? I've narrowed it down to a few points here, though there is obviously a lot more to talk about if we go into depth. The first main point is that cluster lifecycle management is difficult. Cluster lifecycle management means not only the creation of Kubernetes clusters but the whole lifecycle: from creation through upgrades, growing and shrinking clusters, and finally deletion. That is a very difficult problem to tackle because it has many varieties and facets that need to be handled throughout the lifecycle of a Kubernetes cluster, as you might know. Cluster API really tries to encompass the whole of lifecycle management, not only creation or only some parts of it.

Additionally, Cluster API aims to provide building blocks for higher-order functionality. That means Cluster API tries to lay a level of groundwork that other technologies and components can build on top of. Features that could be built on top might be more automation for autoscaling, or better repair and upgrade functionality. Cluster API itself stays focused on the core functionality of cluster lifecycle management and on providing those building blocks for others.

Finally, a point I want to highlight is that Cluster API aims for a design centered around interchangeable components. The complete infrastructure part of Cluster API is interchangeable, so many different infrastructure providers can be supported: you can use one infrastructure provider or a different one, and Cluster API gives you a clean API that makes that swap possible and allows the community to implement new infrastructure providers. The same holds for bootstrapping: there can be different bootstrap providers that bootstrap nodes in different ways, and Cluster API aims to give a clean interface for implementing them. Interchangeability is really a central focus, and it also applies to control plane management and other parts of Cluster API.

From here, I want to quickly go over what Cluster API looks like and how it works.
First from a high level, and then we will go deeper into the individual components and controller managers.

From a very high level, on the left we have a user who supplies CRs (custom resources) conforming to the custom resource definitions that Cluster API defines. The user applies these to the management cluster. There is a CLI called clusterctl that makes generating these CRs easier, but in general you can do it with plain kubectl. Inside the management cluster there are controller managers, which in turn manage the lifecycle of the workload clusters we see on the right. So we have components running in the management cluster that the user interacts with, and those result in workload clusters managed by the management cluster. Pretty simple so far.

If we take a deeper look, we see the interchangeable components I mentioned before. In the management cluster we can have, for example, multiple infrastructure providers, which allow us to manage workload clusters on different infrastructure. Taking a few of the available infrastructure providers as examples, we could have one workload cluster on Azure, one on vSphere, and one on AWS. All of that is possible through this declarative approach and through the controller managers running inside the management cluster.

From here, let's take a deeper look at how the management cluster works internally, because so far we've treated it as a black box that does all the work for us. Cluster API currently defines four major controller managers, which handle the CRs, reconcile them, and then create and manage the lifecycle of the workload clusters. These four are the core controller manager, the bootstrap controller manager, the infrastructure controller manager, and the control plane controller manager.

Now we can take a deeper look into each individual controller manager to explain the structure behind Cluster API, starting with the core controller manager. Each block you see here is one CRD that is reconciled by the core controller manager, so each block is also managed by one controller running inside it. The first block, for example, is the Cluster block, a general definition of what a cluster is. It contains metadata about the workload cluster we are managing, for example its name, and it's stored in the CRD that is simply called Cluster; the CR is the concrete instance of it. We also have some CRDs centered around machines, and here we see an interesting pattern that mirrors how Kubernetes itself works: there is a MachineDeployment, a MachineSet, and a Machine. You should very easily be able to tell that this is analogous to Kubernetes, where you have a Deployment, a ReplicaSet, and a Pod; conceptually it's very much a one-to-one relation. And the relationship here is the same: a MachineDeployment owns a MachineSet, and a MachineSet contains many Machines.
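To make the user-supplied CRs concrete before we go further, here is a minimal sketch of a Cluster CR as a user might apply it to the management cluster. This assumes the v1alpha3 API current at the time of this talk, and all names and referenced kinds are hypothetical examples, not taken from the talk:

```yaml
# A minimal sketch of a user-supplied Cluster CR (v1alpha3 assumed).
apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: my-workload-cluster        # hypothetical name
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  # Glue to the interchangeable components discussed in this session:
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: KubeadmControlPlane
    name: my-workload-cluster-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSCluster               # or AzureCluster, VSphereCluster, ...
    name: my-workload-cluster
```

Note how the core Cluster object itself only holds references; the provider-specific detail lives in the objects it points at, which is what makes those parts swappable.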
Additionally, we have the MachineHealthCheck, which is a way to specify how the core controller determines whether a machine is healthy or needs to be replaced, for example within a MachineSet.

So far the core provider only gives us the concepts, the glue between the interchangeable components, because the core component itself is not interchangeable: when you're using Cluster API, the core controller manager is always the same, while everything we talk about next can be swapped out.

The first interchangeable piece is the bootstrap controller manager. Bootstrap is handled, at least in this example, in two places: on the MachineSet and on the Machine, because a machine needs bootstrap data, provided by the bootstrap controller manager, to actually get started. Again this follows the example set by vanilla Kubernetes: the MachineSet has a bootstrap template, while the Machine has a bootstrap config, which is basically an instance of that template, similar to how a pod template relates to an actual pod.

From here we can continue on to infrastructure, and here we see the first small difference. We again have an infrastructure machine, which is responsible for spinning up the actual machine. In a real-world example on AWS, the infrastructure controller manager contains the controller that handles the infrastructure machine resource, and when an infrastructure machine is created, that controller is responsible for creating an actual machine at your infrastructure provider. On AWS, it would spin up an EC2 instance with the bootstrap data that the bootstrap controller provided in the bootstrap config. So these components are interchangeable, which is quite logical, and on the right we again have an infrastructure machine template.

What's a little different now is that there is also an infrastructure cluster. Why that's necessary is simple to explain: there may be infrastructure that is not tied to any single machine but is still relevant to the cluster, or needs to exist for the cluster to exist; a cluster is not only a set of machines. For example, the infrastructure cluster might create subnets for the machines to live in, some network setup, or load balancers that are unique to one cluster. A cluster might have only one set of these, they need to be created as soon as the cluster is created, and they're not directly tied to a machine; you can't just say "for every machine I also create a load balancer", because that's usually not how it works. That's why we need the infrastructure cluster as well.
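To show how the bootstrap and infrastructure pieces plug into one core object, here is a hedged sketch of a single Machine, again assuming the v1alpha3 API, the kubeadm bootstrap provider, and the AWS infrastructure provider as examples; all names are hypothetical:

```yaml
# A sketch of how one Machine glues the interchangeable pieces together.
apiVersion: cluster.x-k8s.io/v1alpha3
kind: Machine
metadata:
  name: my-workload-cluster-md-0-abcde   # hypothetical generated name
spec:
  clusterName: my-workload-cluster
  version: v1.18.15
  bootstrap:
    configRef:               # reconciled by the bootstrap controller manager
      apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
      kind: KubeadmConfig
      name: my-workload-cluster-md-0-abcde
  infrastructureRef:         # reconciled by the infrastructure controller manager
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSMachine         # e.g. backs this Machine with an EC2 instance
    name: my-workload-cluster-md-0-abcde
```

Swapping providers means swapping what these two references point at; the core Machine object stays the same.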
From there, we come to the final component, the final controller manager I want to mention here: the control plane. Why is the control plane not simply managed like a MachineDeployment? That's easy to see with an example from Kubernetes: a Deployment you can just scale to zero, and in theory you could do the same with a MachineDeployment, scaling it to zero, one, or two. But a control plane is usually backed by etcd. To have the stateful storage of etcd for your control plane, you are running etcd members in the background. If you scale etcd down to zero, your data is lost, and scaling, for example, from two down to one is also not such a good idea with etcd. So there are very practical reasons to split out the control plane component: it needs a different kind of management than a plain MachineDeployment, and it is therefore its own interchangeable component.

And now we have the full picture of Cluster API on a theoretical level: the controller managers that run, the different CRDs that Cluster API provides, and which of those are completely replaceable, namely the bootstrap, infrastructure, and control plane parts. Now Jason is going to give us a nice little demo that shows this in action and in real life.

Thanks again, Marcel. For this demo, I'm going to create a Kubernetes cluster on physical hardware that I have here with me. To do that, I'm going to leverage the Tinkerbell project. So what is the Tinkerbell project? It's a CNCF sandbox project that aims to bring cloud-native infrastructure management into the data center: similar to what Kubernetes does for applications, Tinkerbell is trying to do for the actual infrastructure in the data center.

How does it do this? There are several microservices that comprise what we call Tinkerbell. The first is the Tinkerbell service itself, the underlying workflow engine. This is the bit that lets you define which hardware you want to manage with Tinkerbell, and allows you to define what we call templates and workflows; this is where you actually tell the hardware what you want it to do. There's also the boot service, which provides DHCP and PXE boot services for the hardware; this is what helps get the actual infrastructure up and running into the Tinkerbell environment. There's also a metadata service called Hegel, which allows you to define metadata on an instance similar to what you would have with a public cloud provider: you can define user data and other attributes, and then query those attributes from the hardware itself. Finally there's the worker component, the bit that actually runs on the hardware; it reaches out to the Tinkerbell service and asks: what do I actually need to do? Do I have any active workflows, and what are the actions that comprise them?

And how does it look when we start putting this together? With the Cluster API provider for Tinkerbell, we provide the standard resources you would expect from a Cluster API infrastructure provider: a Tinkerbell cluster, a Tinkerbell machine, and a Tinkerbell machine template. But we also have a shim layer that provides a small abstraction on top of Tinkerbell itself and exposes the Tinkerbell resources through the Kubernetes API, so we also have a hardware resource and a workflow resource. This allows us to say specifically which hardware, out of all the hardware Tinkerbell may be managing, we want to make available to Cluster API. It also allows us to simplify the reconciliation logic a little and avoid dealing with the impedance mismatches we would hit trying to put all of that directly into the core Tinkerbell provider controllers.
With that out of the way, let me take you into my data center here. On the right is a small form factor box that's running the Tinkerbell services themselves, and on the left are five small form factor machines that are going to be the available hardware for our Kubernetes cluster.

All right, let's get started. If I come in here right now and use the tink command line, I can see that I've already predefined the five hardware resources corresponding to the hardware, and that I have not yet defined any workflows or templates. That's expected, because Cluster API is going to create those resources for me. And if I look in here, I don't yet have any Cluster API related or Tinkerbell related resources defined in Kubernetes either.

Now, to get started, I need to create a hardware resource within my Kubernetes cluster so that the Cluster API provider knows which hardware to use, and so that the initial control plane instance lands on a predetermined machine. I'm only going to predefine one hardware resource right now, and I will define the rest after that hardware has been assigned to the control plane. So let's do that; well, let's first look at what that looks like. Here we are defining a resource of kind hardware, we're giving it a name of just hw-a for now, and, most importantly, there is this ID right here, which matches up with the actual ID of the resource within Tinkerbell itself. So let's create that, and let's take a look at the resource now that it exists.

Now we can see there's more information than what we provided. That's because the controller reached out to Tinkerbell and said: tell me what you know about the hardware with this ID. We can see in the status that we're pulling in which disks are defined on that device; we use this so we can install the pre-built image onto the right disk device. We also see some information about the network interfaces defined for the machine. And most importantly for our use case, we see that netboot is configured to allow PXE booting and to allow workflows. If either of these were false, this machine wouldn't actually PXE boot when the time comes, so that's an important piece as well.

Now that we've defined the hardware, let's create our cluster. I'm going to predefine a few pieces of this here as well, but let me get this started and power on that hardware so it can start bootstrapping while I talk. While that's booting, I can see from the command line here that I've overridden the pod CIDR. That's because the default pod CIDR, which aligns with Calico's default network configuration, would collide with the Tinkerbell server I have right here, which sits on 192.168.1.1; so I'm overriding it. I'm also telling it that I only want one control plane machine, and I've set the workers to zero for now because I haven't predefined that hardware yet. And I'm telling it that I want Kubernetes version 1.18.15; that'll come into play when we do the image lookup, and I'll get into that in a bit.
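To tie these settings back to the control plane discussion from earlier, here is a hedged sketch of roughly what the generated control plane manifest could look like for these demo settings. It assumes the v1alpha3 KubeadmControlPlane API and a TinkerbellMachineTemplate kind; resource names are hypothetical:

```yaml
# A hedged sketch of the demo's control plane manifest (v1alpha3 assumed).
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  name: demo-control-plane       # hypothetical name
spec:
  replicas: 1                    # a single control plane machine for the demo
  version: v1.18.15              # drives the image lookup mentioned above
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: TinkerbellMachineTemplate
    name: demo-control-plane
  kubeadmConfigSpec: {}          # kubeadm settings would go here
```

The overridden pod CIDR itself would live in the Cluster CR's clusterNetwork.pods.cidrBlocks, as in the earlier sketch, chosen so it does not overlap with the Tinkerbell server's 192.168.1.0/24 network.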
So let's take a look at what that created; let's do a get of the Cluster API resources here. You may see that there is some activity on the monitor behind me; don't worry about it too much. Basically, at this point the machine is PXE booting into the Tinkerbell environment and getting ready to run the workflow that Cluster API created for it. Among these resources, we can see that we now have a Tinkerbell machine, and it is saying that it's ready. We also see that the instance ID matches up with the ID of the hardware we predefined. So that machine is now bootstrapping and getting ready to go.

In addition to those resources, we also created some Tinkerbell resources. Alongside the hardware we defined earlier, there is now a template and a workflow. As I mentioned earlier, the template tells everything what to do, and the workflow ties it all together with the actual hardware. So let's look at the workflow and see what's going on there. In the spec of this workflow, we see that it references a specific piece of hardware, the hw-a one we defined earlier, and that it references the template that was created for bootstrapping this instance. And in the status we now have actions defined, and these actions feed into the workflow and say exactly what we want to do to bootstrap this instance into a Kubernetes cluster.

The first action is this image2disk action, where we tell it to write the contents of this image URL to this destination disk, which is the same one that was discovered earlier. So if I had specified hardware with a different disk device, this would be different in this example, and it would write to the correct block device for that type of hardware.

After it writes the contents of that disk image to disk, the next thing it does is write a file to that disk as well: some additional cloud-init configuration to make things work a little better. Here we're telling it to use the EC2 data source, and telling it not to use a strict ID, because the Hegel metadata service is not exactly the same as EC2; if we left this as true, it would fail. We're also telling it to use this metadata URL, which is a link-local address I configured earlier on the Tinkerbell server, pointing at the Hegel metadata service on port 50061. That basically gives it access to the user data that Cluster API set on that hardware for us. The next thing we see is that we've also defined a default user here with sudo access. That gives us the ability, if we had wanted to with the cluster we created, to define an SSH key or a password for this user, so we could access the machine if something had gone wrong. But I trust that this is actually going to do the right thing, so I haven't done that here.

After running that cloud-init configuration, the next thing it does is use this kexec action, which basically tells it to boot into the kernel of the OS image we just wrote to the disk.
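Pulling those three actions together, here is a hedged sketch of the overall shape of such a Tinkerbell template. The action names, image references, environment variables, and values approximate the actions described above and are not copied from the actual generated template:

```yaml
# A hedged sketch of a Tinkerbell template driving this bootstrap.
version: "0.1"
name: capt-control-plane          # hypothetical name
global_timeout: 6000
tasks:
  - name: provision
    worker: "{{.device_1}}"
    actions:
      - name: stream-image        # write the OS image to the target disk
        image: image2disk
        environment:
          IMG_URL: http://192.168.1.1/ubuntu-1804-kube-v1.18.15.gz  # hypothetical URL
          DEST_DISK: /dev/sda     # taken from the hardware's discovered disks
      - name: add-cloud-init-cfg  # extra cloud-init config for Hegel
        image: writefile
        environment:
          DEST_DISK: /dev/sda1
          DEST_PATH: /etc/cloud/cloud.cfg.d/10_tinkerbell.cfg
          CONTENTS: |
            datasource:
              Ec2:
                metadata_urls: ["http://169.254.169.254:50061"]  # hypothetical link-local address
                strict_id: false  # Hegel is EC2-like, not identical
      - name: kexec-into-os       # boot into the freshly written kernel
        image: kexec
        environment:
          BLOCK_DEVICE: /dev/sda1
          FS_TYPE: ext4
```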
So at this point, I believe the machine has already bootstrapped into the Ubuntu instance, but it probably hasn't run cloud-init yet. This hardware is a little slow; it's a few years old and doesn't have the fastest disks. If this were newer hardware with NVMe drives, it would have finished bootstrapping already. But it looks like things are getting pretty close.

So let's get the kubeconfig from that host and use it to get nodes. I can see that I do have a node now, and the node is not ready. So let's deploy Cilium here, and that'll get the CNI solution going; as soon as the CNI is up, that node will change to a ready status. But I don't want to wait for that quite yet; let's create the additional hardware I mentioned. I've already defined it in this test hardware YAML file. Here we created hardware B, C, D, and E, corresponding to the other instances I have there. And let's scale up that machine deployment now, to three.

If I look at the Cluster API resources now, I can see it has associated the three additional hardware resources with those Tinkerbell machines. Based on the predetermined naming I used, I can see it wants to use hardware B, C, and D, so let's get those powered up; in just a little bit, those instances will actually be booted. Let's also take a look at the additional Tinkerbell resources that were created. Here we can see, in addition to the five hardware devices we defined, the template and the workflow related to the control plane machine we previously bootstrapped, and similar templates and workflows defined for the three additional instances. The exact same process is going on, except instead of doing a kubeadm init this time, it's doing a kubeadm join for a worker node. In just a short while we'll be able to see those nodes come up, so let's run that get nodes one more time.

All right, we can start to see that hardware coming up now. Hardware B is already starting to come up, and I expect, yep, there's hardware D, and hardware C probably isn't far behind either. As these come up, the kubeadm join has now run, and we're basically just waiting on the CNI resources to finish coming up before these nodes report ready. That'll happen here in just a second, but I don't think we need to go any further. So thanks a lot, and back over to you, Marcel.

Thank you very much, Jason. From the demo, we now come back to the Cluster API project. As I mentioned at the very beginning, Cluster API is a subproject of SIG Cluster Lifecycle, and from here I want to talk about the roadmap and our plans for the future. The next upcoming release will be v1alpha4, or version 0.4, most likely in Q2 2021. The focus for this release is mostly on UX and stability, along with support for some more use cases. I have a selection here that we will go over. If you've seen the deep dives at previous KubeCons, we've talked about this quite a lot, and there are now a lot of changes involved, so I'm only showing a selection.
The three inclusions for this release I quickly want to talk about are as follows. First, the management cluster operator. As we saw previously, there are a lot of CRs involved in creating a cluster, and therefore also a lot of controller managers that need to handle those CRs; these controller managers also have webhooks to do validation and mutation of the CRs you supply to them. Deploying this stack of controller managers for Cluster API can be a little complex, especially if you want to update the Cluster API version or keep track of which versions are compatible with each other. The management cluster operator aims to make this simpler: to simplify the deployment of the Cluster API components by providing an easy-to-use CR that, through the operator, deploys the necessary controllers. You can compare this to how the Prometheus operator works. I assume most of you know it: the Prometheus operator makes it easy to deploy Prometheus by giving you a high-level abstraction of what you need in a CR. The idea here is similar: to make the UX of actually running Cluster API a little easier.

Second, the externally managed infrastructure proposal, which introduces a way for you to have infrastructure that is not managed by Cluster API but is still represented in Cluster API (see the sketch after this overview). One example: a subset of your machines, or all of them, are managed by some other system, so Cluster API cannot interact with their lifecycle, but you still want them represented as machine CRs in your management cluster. For that, we introduced the concept of externally managed infrastructure, so that the Cluster API controllers do not start fighting with infrastructure that is already managed by an outside system. That should support more use cases in this area.

The final point I want to touch on is improved multi-tenancy. Multi-tenancy in this context means that, for example on Azure, you can have multiple Azure subscriptions, and the workload clusters you create from your management cluster might end up in different subscriptions; one management cluster might spawn three workload clusters in three different subscriptions. This is currently not easy to do, so we are overhauling it to improve the UX and make it easier for users of Cluster API. At the bottom of the slides, which are provided, there is also a link to our full roadmap if you want a complete overview of the planned changes in v1alpha4.
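For the externally managed infrastructure item, the proposal's mechanism is roughly an annotation on the infrastructure cluster object that tells the Cluster API controllers an outside system owns its lifecycle, so they consume it without reconciling it. A hedged sketch, with the annotation key following my reading of the proposal and all names hypothetical:

```yaml
# A hedged sketch of externally managed infrastructure (v1alpha4 proposal).
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AWSCluster                  # any infrastructure cluster kind
metadata:
  name: externally-managed-example
  annotations:
    # Signals that an external system manages this object's lifecycle,
    # so Cluster API controllers should skip reconciling it.
    cluster.x-k8s.io/managed-by: "my-external-system"   # hypothetical value
```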
From that, I also want to quickly talk about Cluster API versioning; there's a discussion going on about it right now, and the link is at the bottom if you just want to click through and check it out. It covers Cluster API version 1.0, or v1beta1 for our APIs: what is really necessary, and what is our road to get there? A realization the maintainers of Cluster API have already made is that we are already mostly, though not fully, operating like a GA project. We already support our APIs for up to a year in some cases; not always, but in most cases the support window is close to a year. We also support conversions: not yet at the level of a GA project, but we support conversion webhooks and conversion between our API versions to a higher degree than would be expected from an alpha project. So from there, the proposal essentially asks what is actually missing for us to move towards v1beta1 or Cluster API 1.0.

In this discussion it was brought up that there still needs to be a lot of work on stability. Even though we're already in a pretty stable place and the foundations are being kept in place, we're not 100% confident that this will continue to be the case, or which additions need to be made. For this, it would be really valuable if you would bring up your use cases and ideas for Cluster API, so we can get a better picture of what actually needs to be improved. There also needs to be more work done to reduce some technical debt that is currently still in Cluster API.

The final thing I want to say here: production-ready obviously means different things to different people. Currently, on v1alpha3 and v1alpha4, there are already people using Cluster API in production, and that has been happening for quite a while, but the version number does not reflect it properly. v1alpha3 or v1alpha4 basically signals that it's not a production-ready project, yet it is being used in production, and with our work on conversions we effectively acknowledge that. So this discussion is very interesting for the Cluster API project, and we would be very happy for you to join in.

Transitioning from that, we can go straight into how you can help and how you can get involved in Cluster API. There are quite a few ways to do this. The first: do you have writing skills? We have a documentation book; you can read through it and improve it however you want by providing a pull request, and that's open to everyone. If you have product skills, as I mentioned before, we are very happy to get more use cases and to understand what users of Cluster API want or expect in the future. There's also backlog grooming, which always needs to happen; we have milestone maintenance going on, so if you're interested in that, feel free to contribute. If you have coding skills, there are different ways to get involved. The first is to review pull requests: we always have a lot of open pull requests, and reviews are really valuable to us. You can review pull requests with the goal of becoming a reviewer in the Cluster API project, or you can search for issues labeled "help wanted" or "good first issue" if you want to get involved in coding right away. Those are two great ways, in my opinion, to get started. And finally, if you have other skills I didn't mention, or you simply want to get involved, we have weekly community meetings as well as a Slack channel and a mailing list where you can easily join in; and on the Cluster API repository there are open issues, and you can always open an issue or start a conversation. Again, all the links are in the slides as well if you want an easy-to-click link.

And with that, we're done for this deep dive session, and we can now go into Q&A.