So hello, everyone. Welcome to our talk, welcome to the tutorial: writing a Cluster API provider. We're just going to quickly introduce ourselves. I am Anusha. I'm a technical product manager at Nirmata. I've recently shifted my focus to Kubernetes policy and governance, but prior to this I worked extensively on Cluster API, and I'm also a maintainer of the Cluster API provider for Bring Your Own Host.

Hi, everyone. My name is Richard. I work as a principal engineer at SUSE. I'm currently one of the maintainers of the AWS and microVM CAPI providers, and I have a particular interest in how you represent managed Kubernetes in Cluster API.

I'm Avishay, from Red Hat. Among other things, I'm one of the leads of deploying OpenShift on-prem, and part of that is using Cluster API.

And this is Winnie Kwan. I'm an engineering manager at VMware. I'm also a contributor to the Cluster API providers for AWS and GCP, and I've also worked on managed Kubernetes in Cluster API.

OK, then let's get started. Sorry, the podium is here, the slides are there, and we're going to talk from here. So, quick show of hands: how many of you are familiar with Cluster API, with what Cluster API is? Have you all worked with a provider before? Yes, perfect. And how many of you have an actual use case for writing a provider from scratch? And how many are looking to start contributing to existing Cluster API providers? OK, perfect. I think our tutorial caters well to all of these.

So today we'll start with what Cluster API is, just the basics, so that all of us are on the same page. We'll go a little bit into Cluster API provider theory and then jump right into writing a provider. Before that: we have a GitHub repo with all the instructions, so if you can, scan the QR code or head directly over to capi-samples.github.io. All the prerequisites are listed there. Feel free to start on the prerequisites, it's just a bunch of tools you'll have to install, while we go through the basics in parallel.

OK, can we move on? We have only about 90 minutes today, and I'm going to take up 10 or 15 of those talking theory. In case you need help during the tutorial, and for anyone joining us virtually, I recommend you join our Slack channel in the CNCF Slack: it's called KubeCon NA 2022 CAPI Provider Tutorial. You can either raise your hand here and one of us will help you out, or, in case we cannot, please use the channel for all discussions. And not just during the session: if you run out of time today and you're still determined to complete the tutorial, you can continue to use the Slack channel after the session as well. I'm going to stay on this slide for about five seconds.

All right, let's get started. So what is Cluster API? It is a solution for declaratively specifying a Kubernetes cluster, just like you declaratively specify the workloads that run on the cluster. The project was built on the premise that cluster lifecycle management is difficult: historically there have been a number of ways, methods, and tools to provision Kubernetes clusters, and the user experience is not consistent among them. What Cluster API aims at is a consistent user experience for the lifecycle management of clusters: create, upgrade, delete, the entire LCM part of it. And we also have a CLI called clusterctl.
clusterctl handles the lifecycle of the CAPI management cluster, which in turn does the lifecycle management of workload clusters. So you can use clusterctl for initializing your providers, upgrading your providers, et cetera. There are community calls, that is, weekly office hours on Wednesdays, and every provider has its own separate office hours on its own cadence; all of these are on the Kubernetes community calendar. For a complete walkthrough of CAPI and an understanding of what CAPI is and how it works, there is a series on YouTube by Stefan and Fabrizio called "Let's Talk About..."; feel free to check it out. And on Friday afternoon there is a CAPI tutorial by the core CAPI team that might be of interest to some of you.

So what is a Cluster API provider? A provider is essentially a Kubernetes operator. It implements infrastructure- or operating-environment-specific functionality that is used together with core CAPI to manage the lifecycle of a Kubernetes cluster. A provider adheres to certain contracts defined by CAPI. CAPI specifies different contracts for different kinds of providers; we will shortly look at the different types, but there is a contract defined for each type. The contract is implemented by the providers via custom resource definitions, and adherence to the contract is what allows interaction between core CAPI and its providers.

Let's refresh our CAPI glossary. The Cluster API book has pretty extensive documentation on the various concepts and experimental features, with lots of code snippets and examples, but these are the high-level things. What is a management cluster? It is a Kubernetes cluster that manages the lifecycle of workload clusters. The management cluster is also where you run your providers, and where custom resources like Machines or MachineSets are stored. A workload cluster is what you run your workloads on. The control plane is the set of components that serve the Kubernetes API and continuously reconcile from a given state to a desired state. A Machine is the declarative spec for an infrastructure component hosting a Kubernetes node. MachineDeployment and MachineSet are analogous to a Deployment and a ReplicaSet: they make sure the desired number of machines is always present in your cluster. And a MachineHealthCheck, as the name suggests, checks the health of your machines; if a machine is deemed unhealthy, CAPI just rolls out a new machine for you.

We talked about different provider types; there are currently three in CAPI. One is the infrastructure provider, and this is the most widely available kind. As the name suggests, this type of provider provisions whatever infrastructure is used to create a Kubernetes cluster in a target environment. For example, the Cluster API provider for AWS creates AWS resources like VPCs or EC2 instances. But the infrastructure provider itself doesn't provision Kubernetes; for that, you'll have to use a bootstrap provider. A bootstrap provider is something that supplies bootstrap scripts. Suppose, say, you're using kubeadm as your bootstrap provider: it will provide you with either kubeadm init or kubeadm join, depending on whether you want to initialize or join a cluster. These scripts are made available in the form of secrets that your infrastructure provider reads and executes, so that the cluster itself can be bootstrapped.
And the third type of provider is the control plane provider. It is used to represent and manage the lifecycle of a Kubernetes control plane. Or, if you're using a managed Kubernetes service like EKS or AKS, it will directly manage those services in AWS or Azure. Quickly moving on.

(Anusha, sorry to interrupt: maybe let's go back to the prerequisites slide to make sure everyone is set.) OK, yeah. For people who joined late, we do have documentation on GitHub, and I urge you all to go through the prerequisites and install the tools that are needed, so that when we get to the tutorial section we can jump right in. And a quick disclaimer: we just figured out this morning that Docker Desktop version 4.13 has some issues, so anything up to 4.12 will work perfectly fine. There was some problem with the kind cluster, I guess. OK, I'm going to jump back.

So let's quickly discuss the different custom resource types in CAPI. CAPI has a number of custom resource types that logically represent the different parts of a Kubernetes cluster, and all the gray boxes here are CAPI custom resources. The one on top, the Cluster itself, logically represents the cluster as a whole and contains general configuration information like the pod CIDR block, or the Kubernetes version of the cluster itself. Then we have the resource kinds that represent ways to manage machines: Machine, MachineDeployment, MachineSet, and MachinePool. A Machine is equivalent to a single node, whereas a MachineDeployment and MachineSet make sure the number of replicas of your machines always matches the desired number. A MachinePool, on the other hand, maps to things like auto scaling or virtual machine scale sets in Azure; you can dynamically scale a machine pool up and down. And then we have MachineHealthCheck: like we described earlier, it checks the health of your machines, and if a machine goes down and is deemed unreachable or unhealthy, Cluster API provisions a new machine for you.

Those were the core CAPI cluster resources. Now, if you're creating a bootstrap provider, you'll be creating custom resources to represent bootstrap information. These need to contain configuration like kubeadm init or join, depending on whether you want to initialize a cluster or join a node to an existing one. In this diagram, all the pink boxes represent bootstrap provider resources. One thing to note is that all the pink boxes are nested inside the gray boxes: it means the core CAPI resources either own these provider resources or reference them.

Next, if you're writing an infrastructure provider, you'll be creating custom resources to represent the infrastructure you'll create in the target environment; all the orange boxes here are infra provider resources. The infra cluster on top represents the base infrastructure required for the cluster, things like networking and security groups, but it doesn't contain any information particular to the machines themselves. For that you have the infra machine and infra machine template: those custom resources hold the configuration needed to actually bring up a compute instance, for example an EC2 instance or a vSphere VM. You can also see examples here of the naming convention adopted by different providers.
You just prefix the resource kind with the provider name: you have AWSCluster and DockerCluster, and similarly you'll have an AWSMachine or a DockerMachine. By now you get the drift. In the same way you'll have control plane provider resources, and all of these resources are either owned by core CAPI resources or referenced by them.

So what actually makes up a provider? We discussed that CAPI defines certain contracts for the different kinds of providers, and depending on the type of provider you want to write, you need to adhere to the corresponding contract; you adhere to it through your CRDs. We also discussed that a provider is nothing but a Kubernetes operator, which means it will have a bunch of controllers that reconcile the CRDs you've written. Then there are the additional Kubernetes resources to deploy the controller itself: a bunch of YAML files, things like RBAC configuration. And then metadata and repo layout. This is not strictly a CAPI contract, it's more a set of best practices, and following it makes your provider releases easier; also, if you want to use clusterctl to initialize your provider, having the repository in a certain layout makes it easier for clusterctl to discover and initialize it. All right, I think that was enough theory. Over to Richard.

Cool, so we can now move on to the practical part: the actual tutorial. It's meant to be hands-on, so you do the work yourself and actually build your own provider. What are we actually going to build? An infrastructure provider for Docker. It's loosely based on the existing Docker provider within CAPI, but simplified and achievable within the time frame we have. As Anusha mentioned, since it's an infrastructure provider, our provider has to provision whatever infrastructure the cluster requires; in this case that means container instances. We'll create a container instance for a load balancer, which fronts the API server, and then container instances for every machine, that is, every node, within the cluster. We'll be interacting with the Docker API to spin those container instances up. We'll also be using the kubeadm control plane and bootstrap providers: those will do the actual bootstrapping of Kubernetes within the containers for us; we just consume them.

A bit about the tutorial format. The idea is that you work through the sections in the tutorial docs in order, and they slowly build up the provider for you. (I'll put the link up again in case anyone hasn't got it.) We'll allow a certain amount of time per section for you to work on it, and then move on to the next section so we can discuss each one as we go. But the main thing is: don't panic. You don't have to finish within that time; take your own time, we'll just keep moving for the sake of the tutorial. What we want is to give you everything you need to go away and potentially build your own Cluster API provider. So if you get stuck, or you just have any questions, please raise your hand and one of us will come and help you.
As you go through this there may be questions, or there may be issues; just make sure you raise your hand. Also if you just want some extra context, raise your hand as well and we'll answer whatever you want; hopefully you don't go away with any questions unanswered. But if you do, you can always follow up with us on the Slack channel after this session, or grab us for the rest of KubeCon.

I'm just going to put the link up again for those who've just arrived. If you scan that code, it'll take you to the tutorial documentation. It's going to live there after this session as well, so you can go back to it and use it as a reference. Also, if you go to the site and click through to GitHub, you'll see a reference implementation that goes along with this. And again, feel free to join Slack as well.

So there are a number of sections. Hopefully you've already started installing the prerequisites. Then we're going to do some basic setup, which deals with the repositories and so on. Then some scaffolding, that is, code generation, and then we'll set up Tilt. Tilt is an essential quality-of-life improvement; without it, it can be quite a painful iterative process. Then we move on to the meat of writing the provider, and this is what starts to differentiate it from a normal operator: we'll implement the Docker cluster, or the infrastructure cluster as CAPI sometimes calls it, then the Docker machine representation and its controllers, and then we'll talk about webhooks. After we've done all of that, we can actually create a cluster with our provider. Hopefully we'll all get to that point, so you can apply our YAML to your management cluster and see your new provider create a Kubernetes cluster for you, which will be great. And right at the end we'll cover some things around releases: if you want your provider to be installable via clusterctl, there are certain things you must do as a provider implementer around GitHub and certain files, and we'll cover that too if we have time.

So I'm going to switch the slides off; give me two seconds. Hopefully everyone can see that screen; raise your hand if you can't see it or need anything bigger. And again, a reminder: raise your hand if you have any questions. We won't be executing every single command from the tutorial documents. For some sections we'll just highlight key points; for the rest we'll give you an overview when we get there, with the expectation that you run through the instructions from the site. When you go to the site, click Start Tutorial and you'll see the various sections on the left-hand side.

A note about the prerequisites: has everyone installed them, or are people still doing that? Anyone need more time? No one's raising their hands... oh, you're done? Cool, that's even better, great. So the first thing is the setup. In the setup we essentially have to do two things. First of all, ideally fork the cluster-api repo, or clone it down to your local machine; you can do either.
A thing to note here: when you're cloning it, clone it into your GOPATH. It seems a bit antiquated now to clone things into the GOPATH, but this is for historical reasons: some of the code generation tooling doesn't work outside the GOPATH and ends up embedding things in crazy locations. So it's much easier if you clone these into your GOPATH. Do that with the cluster-api repo, and then create a new repo in GitHub under your own username called cluster-api-provider-docker. The name follows convention; I probably don't need to explain it, it's pretty self-explanatory, and pretty much all providers are named the same way, with some slight variations. And at the bottom we add some gitignore entries; feel free to do those or not, it's up to you. I'll give people a couple of minutes to do that, and I'm going to do the same. (Is that better? Yeah. Cool.)

I've already created the repo in GitHub, so I'm just going to clone it down to my machine. I generally have two terminal windows, or panes if you're using tmux, open at a time: the first in the directory of the provider I'm working on, the second in the directory of CAPI itself. Does anyone need more time on this first part? One note: when you go through these instructions you'll sometimes see references within GitHub to capi-samples. In most instances, especially in commands, you should replace that with your own username, in case any were missed in the documents.

Cool. Now that we have the repos set up and cloned down to our local machines, we can move on to scaffolding. The purpose of this section is to scaffold the project: to generate all the code that is the basis on which you then build your controllers. It takes a lot of the pain out of that initial code, because Kubebuilder generates it for you. These steps are applicable to essentially any operator you build using Kubebuilder, so you can take this away and build whatever operator you want. I'm going to run these steps, but essentially we're going to use Kubebuilder to initialize a project, and then use Kubebuilder to add APIs and controllers for our Docker cluster and Docker machine. At that point it will have generated a whole bunch of code and YAML for us that we can use without writing it ourselves.

The first thing is initializing the project with kubebuilder init. There are a couple of things to note. One is the domain: most providers create their CRD definitions within an API group that ends with cluster.x-k8s.io. The other thing: in the --repo flag, put the repo you have just created. What this does is generate some code for us and then tell us: well, now that you've created a project, you should probably add an API to it. So that's what we'll do next; I'll just pause for a minute.

The next thing we do is add a new API definition, a CRD, and this is for our DockerCluster; you can see that at the end, in the kind. Another thing to note is that we're using a group here.
The group is a prefix to the domain we just used, and the convention is that the group is prefixed with the type of provider you're building. In our case we're building an infrastructure provider, so we prefix the Kubernetes API group with "infrastructure". You'll also see, just to overload the terms, that we specify an actual API version in the Kubernetes and CRD sense: v1alpha1. You can choose whatever you want here, but be aware that depending on the level of your API there are certain API guarantees you should adhere to; alpha APIs are the most lenient to change.

So I'm going to execute that. It's going to ask: should I create the resource, that is, the custom resource definition? Yep. And should I then create a controller for it? There are situations when you run this where you won't want a controller, specifically for things like templates. Later on you'll have something called a machine template, and as the name suggests, it's a template that ultimately results in an actual machine; you don't need a controller generated for a template resource. But in this instance we're creating the cluster infrastructure resource, the DockerCluster, so we do want a controller, and I'm just going to say yes. It then generates a whole bunch of stuff for me, code and YAML. Then we do basically the same for the machine. So we'll end up with two API definitions, DockerCluster and DockerMachine: one to represent the cluster-level infrastructure as a whole, and one to represent individual machines. It's exactly the same command, obviously just changing the name. I'll execute that now, and yes, I want to create the CRD and the controller.

As part of the initial code generation, kubebuilder init creates a Makefile for you, and there's a bunch of targets in there that you'll need as you develop your provider. The first one is make generate. What make generate does, as the name suggests, is run a bunch of code generation tools against your API definitions, specifically for things like generating deep-copy functions, which allow you to create copies of your API types. It will also run things like defaulter generation: if you want to supply default values that don't fit into the kubebuilder default markers, something a bit more complex, you can create what are called defaulters, and a code generation tool run by make generate handles that too. I'm going to run that now, and we'll have a look at what it has done shortly.

The next make target to be aware of is make manifests. As the name suggests, this actually creates the Kubernetes manifests from your definitions, from your code mainly. When we start to look at the code, you'll see a whole bunch of kubebuilder special comments throughout it, and make manifests runs tooling that scans the source code for those comments and generates Kubernetes artifacts as a result. So as you make changes to your API definitions and your controllers, make sure you run make generate and make manifests to do that code generation. Now we've got our code and we've got our manifests.
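To make those special comments concrete, here is a hedged sketch of the kind of markers you end up with on an API type; the Port field is purely illustrative, not part of the tutorial's actual API:

```go
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// DockerClusterSpec: the Port field is only here to show validation markers
// that make manifests turns into OpenAPI schema inside the generated CRD.
type DockerClusterSpec struct {
	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:validation:Maximum=65535
	// +optional
	Port int `json:"port,omitempty"`
}

// DockerClusterStatus gets filled in properly later in the tutorial.
type DockerClusterStatus struct {
	// +optional
	Ready bool `json:"ready"`
}

// The two markers below tell controller-gen that this is a root CRD object
// and that it has a /status subresource.
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status

// DockerCluster is the Schema for the dockerclusters API.
type DockerCluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   DockerClusterSpec   `json:"spec,omitempty"`
	Status DockerClusterStatus `json:"status,omitempty"`
}
```

make generate, by contrast, is what writes the zz_generated.deepcopy.go file containing the DeepCopy and DeepCopyObject methods for these types.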
We should probably also check that it actually compiles, because it might not; there's a build target for that. How's everyone getting on so far? Any problems? No problems, cool. I'll give you a few minutes to get to this point, because it's the starting point for the rest of the sections. For anyone who's just joined, I'll show the link again: feel free to scan it and go to the tutorial docs; we're just in the setup and scaffolding sections. And in case anyone missed it, there's a Slack channel in the CNCF Slack that you can join; you can ask questions there now or after this session and we'll try to answer them.

(Helping an attendee:) I think I've seen this one before; I've seen it, I just forgot. Let's try it here too. OK, so what you can do is actually build that; there is a make target for it. I'll probably give people another minute or so and then carry on. What are we running, make generate? Yeah, if you do that, it will go.

OK, so now let's have a look at what these commands have generated for us. I'm just going to use VS Code; let me change the size. Hopefully that's better. There are a couple of things to note; all of this was generated by those kubebuilder commands. I'll start at the top and work my way down, just to give you an idea of what it generated. The first part is our API definitions, our CRD types, and these are in the api folder. We ran two kubebuilder commands, one to create the DockerCluster API and one for the DockerMachine API, so it has generated two API definitions for us. I'm not going to go into too much detail, but if you go in there you'll see the definitions for DockerCluster, its spec and its status; these will be covered a bit more later on. And the same has been done for DockerMachine. Also, remember the make generate command: anything with a zz_ prefix is auto-generated by make generate and the underlying code generation tools, so if you see those files, you don't make any modifications to them.

As well as creating the API definitions, we said yes to creating the controllers, and these live in the controllers folder. Again, there is one for each of the APIs we created: one controller for DockerMachine and one for DockerCluster.

(Someone's asking about the screen. Is it hazy? I think the contrast is a little hard to see, yeah.)

So it has generated scaffolded controllers for us. I mentioned the kubebuilder comments earlier, and you can see some examples here, specifically around RBAC. This is where you add kubebuilder comments to your code, and when you run make manifests, it goes through the code, sees comments like these, and generates the RBAC definitions for us. There are various other forms of these comments throughout the code.
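As a hedged illustration (the exact group, resources, and verbs depend on what kubebuilder scaffolds for your module), those RBAC markers sit just above the Reconcile method, and make manifests turns them into Role and ClusterRole rules under config/rbac:

```go
package controllers

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// DockerClusterReconciler is the scaffolded reconciler struct.
type DockerClusterReconciler struct {
	client.Client
}

// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=dockerclusters,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=dockerclusters/status,verbs=get;update;patch

// Reconcile is the scaffolded entry point; we fill in the body later.
func (r *DockerClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	return ctrl.Result{}, nil
}
```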
Then there's the config section: this is where kubebuilder has spat out a whole bunch of Kubernetes YAML for our provider. We will be making changes to this, and some of it is influenced by those comments in the code. That's pretty much all I'll walk through in the code, though there's the usual other stuff. One thing to note is the PROJECT file: this is a kubebuilder thing where it records everything that has been created for you. Just so you know.

There is one small change we want to make at this stage to the auto-generated code: the port number of the health endpoint. It was generated with port 8081, and we need to change it to 9440. So go to main.go, find the health probe bind address, and you'll see it is 8081; change it to 9440 and save. This is mainly because we're simplifying the artifacts and deleting some of the Kustomize artifacts that were using a different port number.

So that's pretty much the scaffolding: we've scaffolded the code and we've got the bare bones of our API and our controllers, so now we can move on. Before we actually get to implementing the controllers and the API definitions, we want to cover something called Tilt. Has anyone used Tilt? Nice, good. It's a real quality-of-life improvement: before Tilt, it was just so painful to run your controllers and debug them. The CAPI contributors have put a lot of work into their Tiltfile, and it has a lot of really, really helpful features. To give one example (and we'll be going through this), you have the option to start your controllers via Delve, so you can just connect, set a breakpoint, and debug your controller within the cluster.

For your provider to be usable via the upstream CAPI Tiltfile, you need to tell that Tiltfile about your provider, and this is done via a file in your repo called tilt-provider.json. It gives the CAPI Tiltfile various information about your provider, and we're going to create this file now. I'm just going to clone the repo so we don't have to create all of the files up here; if you want, you can instead go through the instructions and build it yourself.

So, per the instructions, you create a tilt-provider.json file in the root of your repo, and it's going to look like this; hopefully you can see that. There are a few things you need to be aware of in this file. Number one is the name: this is the name you'll use later with clusterctl when you say you want to install this provider into your cluster, so choose a name that represents your provider. The next is the image, and this is important: the image you specify here has to match the one you use within your manifest definitions for your controller, in the deployment. If they don't match, then when you run Tilt it won't replace the image with your locally built one, and you're going to have issues. And then there's the live_reload_deps section: you can specify folders and files that you want Tilt to watch, and if there are any changes to those files, Tilt will essentially recompile your provider and redeploy it into your cluster. So you get into this iterative development cycle where, for every change, it builds and redeploys for you. In the instructions there are various other tasks around making changes to the Kubernetes manifests, and you will need to do those. But essentially, if you do all of that, you get to the point where you can create a kind cluster.
You're going to have to use a specific kind configuration here, because you need to pass the Docker socket from your host machine into kind so that it can then be used by your provider. So if we go to the instructions: it asks you to do a whole bunch of stuff with the manifests, and then it asks you to create a tilt-settings file in the cluster-api folder. This is the thing that, when you run Tilt and it interprets the CAPI Tiltfile, tells it where to locate your provider. You'll see at the top a reference to your provider, to the repo, and you'll see it's a relative location, so you need to make sure that's right. That's what tells Tilt to load your provider, along with the tilt-provider file you've just created.

There are a few other things you can do within the settings. If you want to enable experimental features in your provider or within CAPI, you can set the environment variables here that enable those features. Likewise, you can override the arguments for your controller; a common usage is increasing the log level, so you get trace messages as opposed to just informational messages as normal. And then the really magical part for implementing your provider is the debug section down at the bottom. If you add a debug section for your provider (you'll see in this instance the docker-kubecon entry in there, specific to my provider), what it's basically saying is: when you run my provider in the cluster, start it via Delve and make it listen on the port specified here. You can then just use VS Code, create a launch configuration, set a breakpoint, attach to your provider, and start debugging it.

After you've done that, like I said, you're going to use a specific configuration file when creating your kind cluster. The kind cluster is going to act as your management cluster in this instance, and the config is just to pass in the Docker socket, like I said. I'm going to do that now; that will run and hopefully start sometime soon. Once it's done, you can do tilt up, and... oh, that's not going to start. That's what happens if you put in the wrong path: Tilt will fail to start and you'll get a big red message. If I change that to capi-samples (I'm going to use the reference implementation), it should be a lot happier when I start it.

This will take some time. What you'll see in a minute, on the left-hand side, as it enables the profiles and installs the providers, is that they appear grouped by provider type, but also by binaries and controllers as a whole. While it's starting up: because we added a debug section, if you wanted to, you could create a launch configuration (in VS Code, or you can do the same in GoLand or on the command line): create a Go launch configuration, say "connect to server", and give it the same port you put in the tilt-settings file, which I think was 31000. At that point you should be able to set a breakpoint within your controllers, start the debugger, and it will hit the breakpoint. I'm not going to do that now, because we'll cover it later on.
You can see now that all of the providers have started on the left-hand side, and you should see our Docker provider there as well. If you do all this, this is what you'll get. And if you change any of the files, it will reload them for you. So I think we're done with that section, and now we can get to the real implementation, which Avishay will do.

(Is everyone good? Let's give it five minutes; that was a lot of steps, and we need to give people time. One person's cluster was taking forever. Do you need more time, or can we proceed? A couple more minutes then. Keep going.)

OK, so we're going to work on the DockerCluster resource now. I'll go through it with you in the tutorial. I recommend you don't copy and paste all the lines of code right now; you can follow along in the capi-samples repo, but I think it'll be more valuable for you to understand what we're doing rather than typing it all out.

Basically, what makes implementing DockerCluster different from any other controller, any other API, is that we have to adhere to the specification, to the contract, that we have with Cluster API itself. You can see a bit of an overview here: it tells us that we must have a spec with a control plane endpoint in it, and we must have a status with ready as a boolean, and there are also some optional fields we can add. This is part of what's called the Cluster API book, a highly recommended reference; it has more details, including a flow chart of what a typical controller does, if you go through it on your own time.

For our example, as Richard and Anusha mentioned, we're going to be implementing machines as Docker containers. Each machine we bring up will be a Docker container, both for control plane and for workers. And because we're going to have three control plane nodes implemented as Docker containers, we need a load balancer to sit in front of them and direct the API traffic to those three control plane nodes. That's the main work our DockerCluster controller is going to do.

The first thing we do is define the DockerCluster API. The scaffolding you did in step three gave us a basic file; not empty, but just the basics, with a foo string in there that you can delete. We put in the control plane endpoint that the contract requires of us, and in the status we put ready. Ready is something CAPI itself is going to look at: once it's true, it knows that all the infrastructure is ready and it can proceed. In addition, in the spec we have the load balancer image, which users can optionally override. And we have a finalizer. For those of you not familiar: if a finalizer is placed on a resource, the resource won't be deleted as long as that finalizer is there. So if a user goes and tries to delete the DockerCluster, Kubernetes won't delete it yet. The controller we're about to implement will see that the DockerCluster is being deleted, it'll clean up the infrastructure, mainly the load balancer, and then remove the finalizer once everything is cleaned up. And then Kubernetes can go ahead and delete the DockerCluster.
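As a rough sketch of those types, assuming the same overall shape as the capi-samples reference implementation (treat the exact names, and the APIEndpoint type, as my reconstruction rather than the canonical code):

```go
package v1alpha1

const (
	// ClusterFinalizer blocks deletion of a DockerCluster until the
	// controller has cleaned up the load balancer container.
	ClusterFinalizer = "dockercluster.infrastructure.cluster.x-k8s.io"
)

// APIEndpoint is a host/port pair for reaching an API server, following the
// shape CAPI expects for the infra cluster contract.
type APIEndpoint struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// DockerClusterSpec defines the desired state of DockerCluster.
type DockerClusterSpec struct {
	// ControlPlaneEndpoint is the contract field: the endpoint used to
	// reach the workload cluster's API server, i.e. our load balancer.
	// +optional
	ControlPlaneEndpoint APIEndpoint `json:"controlPlaneEndpoint"`

	// LoadBalancerImage lets users optionally override the container image
	// used for the API server load balancer.
	// +optional
	LoadBalancerImage string `json:"loadBalancerImage,omitempty"`
}

// DockerClusterStatus defines the observed state of DockerCluster.
type DockerClusterStatus struct {
	// Ready is the contract field core CAPI waits on before it proceeds
	// with bootstrapping machines.
	// +optional
	Ready bool `json:"ready"`
}
```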
If you did that yourself, or if you're using the example, you would now run make generate and make manifests, and that updates all of the generated scaffolding from the beginning to reflect those changes.

Now let's look at the controller itself. That's in the file controllers/dockercluster_controller.go, in the Reconcile method. For those of you who are new to controllers: Kubernetes calls this Reconcile method every time something changes. What is that something? By default, it's only the DockerCluster. Skipping down a bit: if you look at the SetupWithManager function, it defines exactly when the Reconcile function will be called. Here it's called for DockerCluster, and (you'll see it with DockerMachine in a bit) you can also watch other resources. You have to make sure your Reconcile function is going to be called whenever it needs to be: if you're missing something, if you're not watching the proper resources, Kubernetes won't know to call your Reconcile function and then things won't happen.

OK, so we're going to implement it. We create the DockerClusterReconciler struct, initialize the logger, initialize the context and imports, and let's get to the real work. Our Reconcile function got called; we don't know for what yet. So the first thing we do is fetch the DockerCluster instance that was modified. (Reconcile is called whenever a DockerCluster is modified, and at some other points too: for example, if we returned an error, we'll get called again at a later point to try again, and if we return requeue, we told Kubernetes to requeue us, so we'll get called. But generally it's called when a DockerCluster instance is modified.) So we fetch that DockerCluster. Next, we get the CAPI Cluster that owns this DockerCluster; there's a utility that makes that easy for us, based on the owner reference. Now if we got an error, as part of the usual controller convention, we return the error, and that will requeue and call the Reconcile function again at a later time. And if the cluster is nil, the owner reference wasn't set yet, so we just return and try again later. There may also be a paused annotation: for example, clusterctl move will pause the cluster, and in that case we're just not going to do anything right now, because it's paused, so we return.

And now, finally, after all that initialization stuff, we can do what we set out to do, which is create a load balancer. We create the load balancer object, and we create a patch helper, because, as you're all familiar with, every time a controller does something it should update the status of the resource. This patch helper is going to do that for us, and we have a defer: any time we return from now on, we're going to patch the resource, which means updating it in etcd. And now we basically have a fork in the logic: if we detect that the DockerCluster instance was deleted, which we do by checking whether the deletion timestamp is non-zero, we call a function called reconcileDelete; and if it was not deleted, we call a function called reconcileNormal.
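Here's a minimal sketch of that skeleton, expanding the scaffolded Reconcile from before. It assumes CAPI's utility packages (util, util/annotations, util/patch) and a placeholder module path; reconcileNormal and reconcileDelete are stubbed, since they're what we walk through next:

```go
package controllers

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	"sigs.k8s.io/cluster-api/util"
	"sigs.k8s.io/cluster-api/util/annotations"
	"sigs.k8s.io/cluster-api/util/patch"

	infrav1 "github.com/your-user/cluster-api-provider-docker/api/v1alpha1" // your module path here
)

// DockerClusterReconciler reconciles DockerCluster objects.
type DockerClusterReconciler struct {
	client.Client
}

func (r *DockerClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := ctrl.LoggerFrom(ctx)

	// Fetch the DockerCluster instance that triggered this reconcile.
	dockerCluster := &infrav1.DockerCluster{}
	if err := r.Get(ctx, req.NamespacedName, dockerCluster); err != nil {
		if apierrors.IsNotFound(err) {
			return ctrl.Result{}, nil
		}
		return ctrl.Result{}, err
	}

	// Find the owning CAPI Cluster via the owner reference.
	cluster, err := util.GetOwnerCluster(ctx, r.Client, dockerCluster.ObjectMeta)
	if err != nil {
		return ctrl.Result{}, err
	}
	if cluster == nil {
		log.Info("waiting for Cluster controller to set the owner reference")
		return ctrl.Result{}, nil
	}

	// Respect the paused annotation, set for example by clusterctl move.
	if annotations.IsPaused(cluster, dockerCluster) {
		log.Info("DockerCluster or owning Cluster is paused, skipping reconcile")
		return ctrl.Result{}, nil
	}

	// The patch helper plus this defer persist any spec/status changes we
	// make below, i.e. the object gets updated in etcd on every return.
	patchHelper, err := patch.NewHelper(dockerCluster, r.Client)
	if err != nil {
		return ctrl.Result{}, err
	}
	defer func() {
		if err := patchHelper.Patch(ctx, dockerCluster); err != nil {
			log.Error(err, "failed to patch DockerCluster")
		}
	}()

	// Fork: a non-zero deletion timestamp means clean up, otherwise reconcile.
	if !dockerCluster.DeletionTimestamp.IsZero() {
		return r.reconcileDelete(ctx, dockerCluster)
	}
	return r.reconcileNormal(ctx, dockerCluster)
}

func (r *DockerClusterReconciler) reconcileNormal(ctx context.Context, dockerCluster *infrav1.DockerCluster) (ctrl.Result, error) {
	// Covered next: ensure the finalizer, create the load balancer, set
	// spec.controlPlaneEndpoint and status.ready.
	return ctrl.Result{}, nil
}

func (r *DockerClusterReconciler) reconcileDelete(ctx context.Context, dockerCluster *infrav1.DockerCluster) (ctrl.Result, error) {
	// Covered next: delete the load balancer, then remove the finalizer.
	return ctrl.Result{}, nil
}
```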
So we define those two functions, and let's look at what reconcileNormal does. Again, we set up the logging. The first thing we do is check whether the DockerCluster instance already has the finalizer we talked about earlier; if not, we add it, and then we return with requeue true. So we add the finalizer, we come back, and our Reconcile function gets called again.

An important thing to note here: our Reconcile function always takes a look at the spec, takes a look at the real world, makes the real world match the spec, and then reflects the state of that real world in the status. So it has to be idempotent, at any time. Don't assume that your Reconcile function was called earlier; don't assume that it wasn't. You really have to check and make sure that it's totally idempotent. So next we create the actual load balancer and get its IP: lbIP will have the IP of the load balancer. Then we put that IP and the port into spec.controlPlaneEndpoint. Finally, once we've done that, we set status.ready, and we can return; remember, we had that deferred function earlier, so it'll patch the DockerCluster instance. And we're good to go.

The next thing is delete. I'm not going to go through it in detail, but: the DockerCluster instance was deleted, so we delete the load balancer, and we remove the finalizer so that the instance will actually be deleted. And that's pretty much it for this controller. Oh, sorry, one thing I missed earlier about SetupWithManager: we're watching DockerCluster, but we're also watching the CAPI Cluster. And not all of the CAPI Clusters: if you look, it's watching only the CAPI Clusters that point to DockerClusters. When you're setting up watches, it's important to filter out as much as you can. For example, if you need to watch secrets for changes, don't watch all the secrets in your entire system; that's going to take forever. Watch a specific namespace, or filter by label, or something like that. And again, once you finish all that: make manifests, make build, so that everything you just did takes effect.

OK, the next thing we're going to do is the machine, DockerMachine. I'll hopefully go through this a bit quicker. The API has a providerID; that's according to the contract. It's a unique ID which is a reference to this machine. It has the ready boolean just like the cluster did, and it has addresses, which will contain the IP addresses for this machine. And just like the cluster, we have a machine finalizer. After you finish defining your API: make generate, make manifests, as usual.
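A minimal sketch of those machine types, again following the shape of the capi-samples reference (names are my reconstruction); the LoadBalancerConfigured field anticipates the load balancer bookkeeping described next:

```go
package v1alpha1

import clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"

const (
	// MachineFinalizer blocks deletion of a DockerMachine until the backing
	// container is removed and the load balancer is updated.
	MachineFinalizer = "dockermachine.infrastructure.cluster.x-k8s.io"
)

// DockerMachineSpec defines the desired state of DockerMachine.
type DockerMachineSpec struct {
	// ProviderID is the contract field: a unique ID for this machine that
	// must match the providerID set on the corresponding Node.
	// +optional
	ProviderID *string `json:"providerID,omitempty"`
}

// DockerMachineStatus defines the observed state of DockerMachine.
type DockerMachineStatus struct {
	// Ready is the contract field telling core CAPI that this machine's
	// infrastructure is provisioned.
	// +optional
	Ready bool `json:"ready"`

	// Addresses holds the container's IP addresses; core CAPI mirrors them
	// onto the owning Machine.
	// +optional
	Addresses []clusterv1.MachineAddress `json:"addresses,omitempty"`

	// LoadBalancerConfigured records whether this control plane machine has
	// been added to the load balancer backend.
	// +optional
	LoadBalancerConfigured bool `json:"loadBalancerConfigured"`
}
```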
So what are we doing here? Users aren't creating machines directly. When you create your cluster definition, you'll typically create a MachineDeployment, which is similar to the Deployment you might be familiar with for pods, and which among other things has a count of how many machines should exist. And it has a DockerMachineTemplate, which tells Cluster API how to create each individual DockerMachine.

So what does the Reconcile look like here? First we do all of the setup: getting the machine, getting the cluster, and so on; I'm not going to go into that here, you can see it in the capi-samples GitHub. So assume we already got the DockerMachine, we got the DockerCluster, we got the CAPI Machine, we got the CAPI Cluster, we got all of that. Now we check the status of the cluster: if cluster.status.infrastructureReady is false, we don't have anything to do at this point; we're waiting for that infrastructure to come up. If the machine already has a providerID, that means we have no work left to do; we're probably just missing status.ready, so we set it and return.

OK, now we want to do the actual work. To do that, we need bootstrap data; if we don't have the bootstrap data, we're going to return. The bootstrap data is basically what will turn that simple Docker container into a Kubernetes node: it has all of the instructions to do that, in either cloud-init or Ignition format. If you're working with a cloud provider, the cloud will typically execute the cloud-init or Ignition for you. Here, because we're on Docker, we don't have that luxury, so we're going to actually execute it ourselves in the code.

So: we check whether the external machine, meaning the Docker container we want to create, exists; if it doesn't exist yet, we create it. Now that we've created the Docker container, we check whether it's a control plane machine for which we haven't yet configured the load balancer. If so, we update the configuration of the load balancer so that it points to the Docker container we just created, and then we set status.loadBalancerConfigured to true and move on. Next we check whether the DockerMachine has been bootstrapped already. If not, we do what I mentioned: we get the bootstrap data, that's with the helper function, and then exec the bootstrap, which basically just executes all those commands one by one in the Docker container. You can see the getBootstrapData function here. Then we get the IP address of the Docker container and set it in the status.

This next thing we need to do is a bit of an implementation detail, but it's important: we're setting the providerID. The providerID is what I mentioned: a unique identifier for this machine. We set it on the DockerMachine, it'll be copied to the CAPI Machine, and it will also be set on the node in the cluster we're creating. We have to make sure the providerID on the machine and on the node match. That's important for things like autoscaling, so that CAPI can see: OK, I'm now deleting this machine, which corresponds to this node. It's also used for CSR (certificate signing request) approval. We're doing that manually here: we get the providerID, we set it on the node here, and we set it in spec.providerID. So at the end of this, both the node and the DockerMachine have the providerID set. You can skip the rest of the details for now. We set status.ready to true, and then the defer statement will patch the DockerMachine instance.

For delete, it's very similar to the delete of the DockerCluster: we delete the Docker container, we update the load balancer configuration, if it's a control plane machine, to no longer point to this container, we remove the finalizer, and we're done. The SetupWithManager here is very similar to that of the DockerCluster, so I'm not going to go into it right now. And the last thing is the DockerMachineTemplate. I'm also not going to go into details because of time constraints; you can see it in the capi-samples. Basically it's just a new CRD that wraps the spec of the DockerMachine, and that way CAPI knows how to create DockerMachines when it needs to. For cloud providers it might have things like the number of CPU cores or the amount of RAM: the properties you want each specific machine to have.
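As a hedged sketch of two of the steps just described, the bootstrap-data helper and the providerID setting, assuming a DockerMachineReconciler that embeds client.Client: the "value" secret key is the convention CAPI bootstrap providers use, while the base64 encoding and the docker:// providerID scheme are illustrative choices, not gospel:

```go
package controllers

import (
	"context"
	"encoding/base64"
	"errors"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	infrav1 "github.com/your-user/cluster-api-provider-docker/api/v1alpha1" // your module path here
)

// DockerMachineReconciler reconciles DockerMachine objects.
type DockerMachineReconciler struct {
	client.Client
}

// getBootstrapData fetches the cloud-init/Ignition payload that the bootstrap
// provider wrote to a secret and referenced from the CAPI Machine.
func (r *DockerMachineReconciler) getBootstrapData(ctx context.Context, machine *clusterv1.Machine) (string, error) {
	if machine.Spec.Bootstrap.DataSecretName == nil {
		// Not ready yet; the reconcile loop returns and retries later.
		return "", errors.New("bootstrap data secret name is not set yet")
	}

	secret := &corev1.Secret{}
	key := client.ObjectKey{Namespace: machine.Namespace, Name: *machine.Spec.Bootstrap.DataSecretName}
	if err := r.Get(ctx, key, secret); err != nil {
		return "", err
	}

	// Bootstrap providers put the payload under the "value" key by convention.
	value, ok := secret.Data["value"]
	if !ok {
		return "", errors.New("bootstrap secret has no value key")
	}
	return base64.StdEncoding.EncodeToString(value), nil
}

// setProviderID mirrors the providerID step above; the docker:// URI scheme is
// illustrative, the important part is that the Node and DockerMachine agree.
func setProviderID(dockerMachine *infrav1.DockerMachine, containerName string) {
	providerID := fmt.Sprintf("docker:////%s", containerName)
	dockerMachine.Spec.ProviderID = &providerID
}
```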
Winnie? (I need to change the computer... Are you sure? We have 15 minutes. OK, we'll switch over; the demo is on this one.)

While we switch: in the capi-samples repo we already have a Cluster API provider for Docker uploaded, so if you're not able to catch up within the next few minutes, feel free to browse through the code while Winnie runs through a demo; it'll be easier to follow along. The tutorial is live and the Slack channel will be active for a few more days after the session, so feel free to ask us questions there. The idea is to help you get started writing a provider, and maybe also to get you contributing to Cluster API and its providers; you can get started with all of those.

OK, so we actually have material on webhooks and also on releasing, but we only have about 13 minutes, so I will not go through the adding-webhooks section. Basically, Kubebuilder takes care of all the logistics of creating the webhook server, providing the certificates, and everything; just follow that section along when you have time after this.

So I'll go straight to the create-cluster section. We created the APIs, we wrote the controllers, and now we want to actually create a cluster. Every provider usually supplies templates. If you look at our provider, we've actually already made the 0.1.0 release, and you can see the cluster template we provide as a provider. This section has instructions for creating the default template and all of that, but I won't go through them; instead, hold on.

So, if you followed our instructions, there will be a templates directory that has a cluster template. This has the definition of a Cluster, a DockerCluster, DockerMachines, a KubeadmControlPlane: everything you need to create a cluster. So you have this, and that's what we're explaining here, and there is an instruction to generate the template. As you can see, we have some variables, what we call tokens, in here, and when you generate the template you set them as environment variables. When you use clusterctl generate cluster with this template, it replaces those variables with your environment variables. That's how you generate the template, and I've actually already created one; just hold on.

So, I already have a template generated. On the left side I'm going to run this watch clusterctl describe cluster command; we don't have a cluster yet. And on the right side I'm also going to watch docker ps, because we're creating Docker containers. So let me do this. If I apply this template, that's a bunch of generated YAML, and you can see that clusterctl describe is showing the status of our cluster creation. As you can see, it keeps changing: it's scaling up, bootstrapping, always showing the latest status of the cluster creation. And if you look on the right, you can see that this container is the load balancer container Avishay explained. The next one is a control plane coming up, and this is our worker node coming up. So it's almost done. And I've put some commands here that you can use.
You can watch your cluster status here, but if you want to know a little bit more about different things, you can get the cluster: you see that, oh, it's provisioned, and you can see our DockerCluster. You can get the DockerMachines, and you see the control plane and worker nodes. And if you want to know more about what's actually happening, you can get the YAML of a DockerMachine, and we have conditions on it. You can see that the container is provisioned, that bootstrap execution succeeded, and, combining those, there's a Ready condition showing that this DockerMachine is ready. So that's what's happening. Yeah, so it's done.

This machine deployment shows as not ready because we haven't applied a CNI yet; once you apply a CNI it will actually become ready. I won't go through that, but basically our cluster is ready except that it doesn't have a CNI yet. And docker ps shows everything. Just for your information, if you want to know what's really going on, you can look at the logs of our controller. For our Docker controller I normally just follow the logs, and if you have Tilt you can look at them from the Tilt UI. It shows all the steps our provider takes: reconciled this, waiting for the bootstrap provider controller to set the bootstrap data, and so on. So if you have some issue, you can look at the logs of this controller and see what's happening.

So, we have about five minutes left. That's actually all I had, but there are more instructions in this tutorial, so please take a look: we have a release section that explains how to release a provider using GitHub Actions, with all the examples. So take a look at it and let us know. Does anyone have any questions?