Hello, everyone. It's now after 4:30, so we're going to go ahead and get started. I hope you saw the first slide, or actually this one says it too: if you're looking for the tutorial on how to write a reconciler, you're definitely in the right room.

I'll introduce myself first. I'm Scott Rigby. I'm a developer experience engineer at Weaveworks. I'm one of the maintainers of Helm, I co-maintain the Flux community repo, I co-chair the GitOps Working Group, and I work with OpenGitOps. So that's a little bit about me. I'm really excited to be presenting with these fine fellows.

...EKS and open source, and in my daily work I write controllers, or I write generators for controllers.

Hi everyone, I'm Nikki. I'm a software engineer at Weaveworks. I was also a maintainer of eksctl, and if you've used it, that's really cool. I also write controllers currently.

But also to say: if you're not familiar with the Environmental Sustainability Technical Advisory Group, the TAG within CNCF, please check that out. Nikki's super active in there. Yes, please ask her or any of us questions about that if you're interested in energy optimization and carbon optimization for Kubernetes. Reach out to me anytime.

Hello, my name is Suley. I work for Weaveworks as well. I'm also a Flux maintainer, and I'm working on the controllers for Flux.

Hello everyone, my name is Sofya Chi. I'm a developer experience engineer at Weaveworks. I'm also a maintainer of one of the controllers that make up Flux CD. For most of this session I'll be walking around, so if you're following along and you have any issues, you can reach out to me and I'll help out.
Thank you. Yeah, so I think the thing to do is probably to raise your hand if you need any kind of help, like "hey, this didn't set up correctly" or whatever, and one of us will come over.

Great. So this is just a very quick overview of what's happening today. We're starting by setting expectations, just to let you know what we're going to go through and what you're going to learn. That includes a very practical and memorable challenge that this demo is set up around, and also what your final outcome should look like by the end. If you just want to listen, you're welcome to just listen, but if you want to do the hands-on part, we'll make sure all of you can install the tutorial prerequisites. We're going to walk you through setting up a local dev environment using one of several methods that we have, and unblock anyone so they can get in and write the code. All of that comes just before the actual step-by-step part.
Before the step-by-step, we're going to introduce some key, very important and less widely known concepts and other ways of working with controllers, like some of the details around conditions, really just to help people envision what we're doing as we go through the steps. Then we're going to spend the bulk of the time getting your hands dirty going through the step-by-step. We'll have you check out a Git repo and check out a different branch for each step. Then we'll do a little wrap-up and Q&A.

So, we know that Kubernetes controllers are responsible for making the current state of your Kubernetes cluster come closer and closer to the desired state. This process is called reconciliation. Perhaps you are here because you would like to know how the Kubernetes built-in controllers work, or you might want to write your own controller for your custom resources, your CRDs. In this 90-minute tutorial, we'll walk you through building your own controller using controller-runtime and Kubebuilder, which is a framework for building APIs with custom resource definitions (CRDs). We'll also explain lesser-documented best practices, conventions and concepts for writing controllers that the community has developed through trial and error in projects such as Flux and Cluster API, also known as CAPI. After this tutorial you should walk away with an understanding of what Kubernetes conditions are, how to set and respond to them, and why they matter. Lastly, we'll also look at some common pitfalls and helper libraries that make writing controllers easier, more fun and more reliable.

Since this tutorial is on how to write a reconciler, we need something for a controller to reconcile, so here's the challenge for today, from the point of view of a hypothetical KubeCon speaker who wants to have fun and learn things.
I want to be able to manage my KubeCon proposal submissions declaratively. So we have created a CFP API, a mock call-for-papers API. The code and docs are at this link, which we'll show again in the next slides. You can also find the presentation on Sched, so you can open the slides and click the links from there, and we'll have a QR code and a Bitly link in a minute. We'll make sure it's up multiple times in the next slides.

We built this mock API to allow us to draft, edit and finally submit proposals, and also to create speakers. The API supports a speaker object with a one-to-many relationship, so one speaker may be referenced by multiple proposals. Each speaker and proposal has multiple required fields and a few optional fields that we'll look at. Our challenge today is to write a Kubernetes controller that defines CRDs for both the speaker and the proposal, and a reconciler for each of those resources.
And just to be clear, I hope everybody knows this, but this API doesn't really exist. I hope the Linux Foundation doesn't get complaints from people that they can't submit their CFPs through an API or do it declaratively. This is just to help you get your mind around an example.

In order to run the controllers on your local machine, we have prepared multiple methods for you to develop the controllers. Sanflush and I will be available in the room: if you have any problems with your setup, with compilation or anything like that, please just wave at us and we'll come and help you out. The repository can be found at this Bitly link. You can also scan this QR code to get the link. The name of the repo is at the bottom as well; it's the title of the session, "How to write a reconciler using Kubernetes controller-runtime". I'm just waiting a few awkward seconds to make sure everyone has it. And if you need it again, we'll put the slide back up. The link is still at the very top there, so you can listen and get the link too.

We have prepared four different ways for you to do this tutorial with us. The first one, of course, is that you can just watch and learn and see what will be presented in a few minutes from now. The second method, and the easiest one if you don't have a development environment on your machine, is a Gitpod link. It's going to spin up a VS Code or GoLand environment for you with everything in it. Don't forget, if you ever see a password prompt, it's root/root: the username is root and the password is root. We also have a Vagrantfile, if you prefer virtual machines and you already have something to spin up a virtual machine, or the Vagrant tool.
Please go ahead and use the Vagrantfile; you're going to find instructions in the README file. And of course, for the bravest of you, if you already have Go, kind, Docker and Kustomize, you're all set to go and you can just follow the tutorial directly.

I mean, honestly, I think this might be the easiest for folks, because if you have these on your computer, or at least most of these dependencies, it'll run faster. But Gitpod should be nice for anyone who might not have them. The Gitpod one might take seven to eight minutes to spin up a k3s cluster for you, so if you want to go for that, please spin it up right now, or at least click on the button right now. Do you all see where? There's an "Open in Gitpod" button in the README. So for those who want to do it, at least open up the repo so that you'll have that option. I myself tried to spin up the Gitpod cluster and I was not able to do it, but others were, so just let us know if you have any trouble.

Hey, do you want me to introduce this part for you? So Suley is going to show a quick demo of basically what you'll see in a little bit as you go through the demo yourself, and what it should look like at the end.

We have a Makefile that you can use to set up a kind cluster and then to deploy everything in the cluster. That I have already done. Sorry, just real quick: can everybody read it, or does it need to go bigger? Bigger, make it bigger. Like this? I'm seeing hands saying more. Is that readable?
So I have already set it up: I have a kind cluster running, and here we can see that the controller is running in the namespace cfp-system, and the API is running in the namespace cfp.

What we're going to do is install the sample custom resources that we have here. We're going to install a speaker, and then we're going to deploy a proposal. So let's do that. First we apply the speaker, and we describe the speaker. What it has done is that we have a new speaker: the controller makes a call to the API and creates a speaker, and then we have a condition telling you the status of the speaker. The speaker has been reconciled successfully, and you have an ID for this speaker, which is default-speaker-sample. So the ID is the namespace, a dash, and the name of the speaker.

Now let's apply a proposal, and we describe the proposal. Here it's the same: there was an API call to create a new proposal, and what it tells us is that we have a new proposal, last updated just now, and that the submission is a draft. So I'm going to edit this proposal and change the submission status, if I find it here, to final. Now I describe it again. We have to wait a little bit, because now that I have edited it, the controller has to reconcile it, and then it's going to make an API call and make it final. It's taking some time here. What's happening is that when I edit, the API server is supposed to create an event. That's not happening for now. Did I edit it as I was supposed to?

You're editing the one under status. Edit the spec.

Yeah, thank you. I'm sorry. Okay, now I have edited the spec. Describe it. Yeah, now it's final.
The other thing is that if I delete the speaker, what I get in the proposal is new conditions telling us that the proposal cannot be reconciled now, because the speaker it references doesn't exist anymore. That was it.

Yeah, so that just gives you a sense of what it should do in the end. We also have test cases that explain exactly what should happen, and we have the Makefile with some make commands that you can run yourself, and we'll show you what each of those tests does.

I think I'm going to get a milk crate here. Okay, cool.

So I promised that we would introduce some of the lesser-known concepts that we're going to see throughout this tutorial. Some are very well known, some are not. Some of you might know all of them and some of you may know none of them, so that's why we're going to go over these, and there are just five of them. This is an intermediate tutorial, so it assumes some knowledge of Kubernetes. But if anybody in here is wondering what this is all about, I encourage you to check out the CNCF Glossary, at glossary.cncf.io, where those concepts are spelled out for anyone approaching them for the first time, like what a Kubernetes cluster is and things like that.

So I'm going to talk about reconciliation, which is at the center of all this. If you haven't checked out the Kubernetes community repo, have a look: it's full of great guides on how to build different things.
There's also a great guide on controllers for developers that's been around since forever, and that has a list of best practices and information about how to build controllers. But there's a catch. First, as written in these docs, a Kubernetes controller actively reconciles resources. What that means is that a reconciler watches an object for CRUD operations (create, update, delete, et cetera), compares the desired state of the world with the actual state, and then applies some logic to make the current state match the desired state. The simplest implementation of this control loop is the loop that we see on the screen. But in practice things are not that simple. Today we have built-in controllers for the default Kubernetes resources, Deployments et cetera. The scheduler is a controller, an operator is a controller, and we have controllers for CRDs as well.

Back to the Kubernetes developer guide on controllers: it's really great for learning how to build a controller from scratch for built-in resources, like Deployments, that you can use easily without having to build any scaffolding around them, and it doesn't make use of scaffolding libraries such as Kubebuilder. In essence, as you can see if you've opened up the guide, you initialize a controller struct that reads from a cache of resources. You then have to list and watch a specific resource. Then you use an informer to retrieve information about the resource that you are watching. Then you handle the logic for the create, update and delete events that are occurring, and then you reconcile, you re-sync, and you handle graceful shutdowns and errors. This is relatively straightforward for common Kubernetes resources like Deployments and Pods, but it becomes harder for custom resources. And this is where Kubebuilder really shines, and this is what we're going to be looking
at today.

First, what is the controller-runtime framework? During this tutorial we're going to be building controllers using controller-runtime, but before diving into it, let's take a look at how native controllers like Deployment and ReplicaSet are built. Initially they were built using four components: informers, shared informers, listers and work queues. At the left of this diagram you can see the Deployment informer, the ReplicaSet informer and the Pod informer. Those are Kubernetes components that register what we call event handlers. Whenever you create a Deployment, a ReplicaSet or a Pod, these informers catch the event and apply one of those functions. When those functions are applied, they send the object to the work queue, where it will be processed later by a manager. The manager, on the right, is the component that executes the reconciliation function, which moves, or tries to move, your object from the current state to the desired state.

controller-runtime is a set of Go libraries for building controllers. It simplifies a lot of things. It is the product of years of experience building controllers: we learned a lot from building native controllers like Deployment and StatefulSet, that went into controller-runtime, and that's what we're going to use today. It also hides the complexity of building controllers. For example, during this tutorial we're not going to see what a shared informer or a lister is; we're going to directly use components that simplify this whole pipeline for us. It also allows you to focus on the logic of the reconcile function. That means you don't have to deal with events or any specific handler functions; you just focus on how to move your objects from the current state to the desired state. It is well tested and maintained.
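To make the pipeline just described more concrete, here is a minimal sketch in plain Go, using only the standard library. It is not client-go or controller-runtime code: `Event`, `Queue` and `Drain` are hypothetical names invented for this illustration. It shows the one idea that matters for the rest of the tutorial: event handlers only enqueue an object's key, and the worker calls the reconcile function once per key, however many events arrived.

```go
package main

import "fmt"

// Event is a simplified stand-in for the create/update/delete
// notifications an informer delivers to its event handlers.
type Event struct {
	Kind string // "create", "update" or "delete"
	Key  string // "namespace/name" of the affected object
}

// Queue is a toy work queue that de-duplicates keys. It only mimics the
// idea behind a controller work queue, not any real API.
type Queue struct {
	keys    []string
	pending map[string]bool
}

func NewQueue() *Queue { return &Queue{pending: map[string]bool{}} }

// Add enqueues a key unless it is already waiting to be processed.
func (q *Queue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.keys = append(q.keys, key)
}

// Get pops the oldest key; ok is false when the queue is empty.
func (q *Queue) Get() (key string, ok bool) {
	if len(q.keys) == 0 {
		return "", false
	}
	key, q.keys = q.keys[0], q.keys[1:]
	delete(q.pending, key)
	return key, true
}

// Drain plays the manager's role: it hands every queued key to the
// reconcile function and returns the keys in processing order.
func Drain(q *Queue, reconcile func(key string)) (processed []string) {
	for {
		key, ok := q.Get()
		if !ok {
			return processed
		}
		reconcile(key)
		processed = append(processed, key)
	}
}

func main() {
	q := NewQueue()
	// Two events for the same object collapse into a single reconcile.
	for _, ev := range []Event{
		{"create", "default/speaker-sample"},
		{"update", "default/speaker-sample"},
		{"create", "default/proposal-sample"},
	} {
		q.Add(ev.Key)
	}
	done := Drain(q, func(key string) { fmt.Println("reconciling", key) })
	fmt.Println(len(done), "objects reconciled")
}
```

Because only the key travels through the queue, the reconcile function never learns which event triggered it. It has to read the current object and work out what to do from its state, which is the level-based style that controller-runtime encourages.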
Of course, it's open source, and it's a very good read if you ever want to go and see how it's built inside. Just a few examples of what you're going to find in this library. There is pkg/log for standardized logging, so you don't have to import anything else for logging; everything regarding verbosity and debugging is built in. There's also leader election. If you've ever heard that controllers can have a leader election mechanism, where multiple controller instances compete for the responsibility of managing your resources, you don't have to build that yourself: there is pkg/leaderelection. There is also a library to manage webhooks (validating webhooks, mutating webhooks) and other things like rate limiting and testing. For example, whenever you want to test your controller, you can do unit testing, or spin up a real kind cluster or a real Kubernetes cluster, or you can use pkg/envtest, which provides an environment very similar to Kubernetes. It is not a real cluster, but it can be used for testing purposes.

We are going to use Kubebuilder for our project here. Kubebuilder is a framework built on controller-runtime, and it provides scaffolding that builds the whole project for you, so you really only have to focus on implementing your controller logic.
So there is kubebuilder init and kubebuilder create api, which you can use to build the project and have the whole API set up for you. There are other frameworks as well, including the Operator Framework, and you have Knative as well that does that. You can go to the Kubebuilder book if you want to know more about that.

So let's talk about conditions. As we showed during the demo, you can have conditions set in the status of your custom resource. The conditions are a set of types: you can have a Ready condition or a PodScheduled condition, for example, and each type has a status. The conditions represent the current state of your custom resource, so they are computed, and they can be recomputed at any time.

You use conditions to communicate the state of custom resources between different controllers. For example, if you have one producing controller and one consuming controller, the consumer can use the conditions to know the current state of your custom resource. Conditions can be set with abnormal-true polarity. To give an example: in our project we make a call to fetch a speaker, and if the call doesn't succeed for any network reason, we can have a condition that we call the fetch-failed condition, and it will be FetchFailed=True. We say abnormal-true because those conditions are supposed to be on the custom resource only when something not normal happens. Otherwise, you should not have them.

I can take this one if that's cool, unless you want to. So, how many of you know about observed generation, or work with it at all, in your day-to-day or ever? Okay, so a few hands.
It's a really good thing to know, just like conditions are. Basically, custom resource definitions are supposed to set a field called observedGeneration in the status object. This is already what built-in Kubernetes resource types do. The controller that you're building, the one that we set up as a step-by-step for you, should update that field every time it sees a new generation of the resource. The point is to allow the controller to tell the difference between resources that don't have any conditions set because they're already fully reconciled, and resources that don't have any conditions set because they've just been created or aren't quite there yet. That is very helpful. If you've ever used controllers that don't do that, there are often lots of edge cases, race conditions and other things that are pretty hard to reproduce. This was introduced into Kubernetes a while back to help solve those cases.

So, we're right on time. Does anyone need the link to the repo again, so we can go back? We'll bring it up every time. And is everyone okay with the dev environment before we start with the step-by-step guide? It's okay to just watch, but if there are any issues, raise your hand. Great.

So, the step-by-step. We have written all the controllers already, and we have set tags for every step of the controller-writing process. So step one is just to use the tag s1, for step one, and create a branch from that. I'll do this. Oh, real quick: has everybody checked out the Git repo that was on that QR code? Okay, cool. Oh, you have done it.
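The two concepts covered above, conditions and observed generation, can be sketched together in a few lines of plain Go. This is an illustration only: `Condition` is a simplified stand-in for the Kubernetes `metav1.Condition` type, and `Speaker`, `SetCondition`, `DeleteCondition` and `IsReady` are hypothetical helpers written for this sketch, not the Flux or controller-runtime APIs used in the tutorial repository.

```go
package main

import "fmt"

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type               string
	Status             string // "True", "False" or "Unknown"
	Reason             string
	ObservedGeneration int64 // generation this condition was computed for
}

// Speaker is a pared-down custom resource: Generation is bumped by the
// API server on every spec change, Status is written by the controller.
type Speaker struct {
	Generation int64
	Status     struct {
		ObservedGeneration int64
		Conditions         []Condition
	}
}

// SetCondition adds the condition, or replaces one of the same type,
// stamping it with the generation it was computed for.
func SetCondition(s *Speaker, c Condition) {
	c.ObservedGeneration = s.Generation
	for i := range s.Status.Conditions {
		if s.Status.Conditions[i].Type == c.Type {
			s.Status.Conditions[i] = c
			return
		}
	}
	s.Status.Conditions = append(s.Status.Conditions, c)
}

// DeleteCondition removes a condition by type. Abnormal-true conditions
// such as FetchFailed are deleted once the problem has gone away.
func DeleteCondition(s *Speaker, condType string) {
	kept := s.Status.Conditions[:0]
	for _, c := range s.Status.Conditions {
		if c.Type != condType {
			kept = append(kept, c)
		}
	}
	s.Status.Conditions = kept
}

// IsReady reports whether Ready=True was recorded for the current
// generation, so a stale Ready from an older spec does not count.
func IsReady(s *Speaker) bool {
	if s.Status.ObservedGeneration != s.Generation {
		return false
	}
	for _, c := range s.Status.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True" && c.ObservedGeneration == s.Generation
		}
	}
	return false
}

func main() {
	s := &Speaker{Generation: 1}
	fmt.Println("new object ready?", IsReady(s))

	// A failed fetch surfaces as an abnormal-true condition.
	SetCondition(s, Condition{Type: "FetchFailed", Status: "True", Reason: "NetworkError"})

	// A later successful reconcile clears it and records the generation.
	DeleteCondition(s, "FetchFailed")
	SetCondition(s, Condition{Type: "Ready", Status: "True", Reason: "Succeeded"})
	s.Status.ObservedGeneration = s.Generation
	fmt.Println("after reconcile ready?", IsReady(s))

	// Editing the spec bumps the generation; the old Ready is now stale.
	s.Generation++
	fmt.Println("after spec edit ready?", IsReady(s))
}
```

The reason name `NetworkError` is an assumption made for the example. The generation comparison in `IsReady` is what lets a consumer tell a freshly edited object apart from one that has been fully reconciled.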
Okay, great. Yeah, so the password. Super secure password: root/root. Username root, password root. Never do that.

As Suley is going through these steps, you can also compare, if you want: check out the workshop branch and look at the commits. The commits are very well separated, so that each commit shows what's done step by step, and those correspond to the tags that Suley will guide you through checking out.

Okay, now that we are on the first step, something to know is that we have a test target in the Makefile. It generates the manifests for you, then it runs the API for you, then it runs all the individual unit tests that you have, and at the end it cleans up the API. It does this locally, either in Gitpod or in your own local environment. The run-API step actually runs the API's main.go, and the clean-API step kills the process and cleans up all the data that has been created.

So what do we have here? We used Kubebuilder: we ran kubebuilder init beforehand and then wrote all the controllers from there. We have a Speaker custom resource definition. Inside, you have a spec with the name, the bio and the email of the speaker, and we use Kubebuilder markers, which you can see here, to add some validation: the name has to be a string, and the email has to follow a regex pattern. Then we have the status, and in the status you have the observed generation, the ID and a set of conditions. You can see that all of them are optional, so at the beginning we won't have any ID or any condition, and through the reconciliation process we can add those. And then what we have here is set by Kubebuilder.
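The Speaker type described above might look roughly like the following. This is a hedged sketch, not the repository's actual file: it uses only the standard library, so the Kubebuilder markers are inert comments here, whereas in a real Kubebuilder project controller-gen reads them and emits OpenAPI validation into the CRD. The email pattern, the choice of which spec fields are required, and the `Validate` helper are all assumptions made for illustration; `Validate` just mimics what the API server would enforce from the generated schema.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// SpeakerSpec holds the desired state: the fields named in the talk.
type SpeakerSpec struct {
	// +kubebuilder:validation:Required
	Name string `json:"name"`

	// +optional
	Bio string `json:"bio,omitempty"`

	// The pattern below is a placeholder, not the one used in the repo.
	// +kubebuilder:validation:Pattern=`^[^@\s]+@[^@\s]+$`
	Email string `json:"email"`
}

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type   string `json:"type"`
	Status string `json:"status"`
}

// SpeakerStatus holds the observed state. Every field is optional, so a
// freshly created object starts with an empty status.
type SpeakerStatus struct {
	// +optional
	ObservedGeneration int64 `json:"observedGeneration,omitempty"`
	// +optional
	ID string `json:"id,omitempty"`
	// +optional
	Conditions []Condition `json:"conditions,omitempty"`
}

// emailPattern repeats the placeholder pattern from the marker above.
var emailPattern = regexp.MustCompile(`^[^@\s]+@[^@\s]+$`)

// Validate applies the same rules the markers express, so the effect of
// the generated schema can be seen without a cluster.
func Validate(spec SpeakerSpec) error {
	if spec.Name == "" {
		return errors.New("spec.name is required")
	}
	if !emailPattern.MatchString(spec.Email) {
		return fmt.Errorf("spec.email %q does not match %s", spec.Email, emailPattern)
	}
	return nil
}

func main() {
	ok := SpeakerSpec{Name: "Luke Skywalker", Bio: "first speaker bio", Email: "first@example.com"}
	bad := SpeakerSpec{Name: "Luke Skywalker", Email: "not-an-email"}
	fmt.Println("valid spec:  ", Validate(ok))
	fmt.Println("invalid spec:", Validate(bad))
}
```

Putting validation in markers rather than in the reconciler means a malformed object is rejected at admission time, before the controller ever sees it.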
So the Speaker has to embed TypeMeta and ObjectMeta. In TypeMeta we have the kind and the API version, and in ObjectMeta you have the name, the namespace and other fields. So that's it for the resource. Let's move on.

What Kubebuilder does for us as well is generate the CRDs and the manifests to deploy.

Just to comment on this: if you haven't used Kubebuilder, most of the scaffolding you see was generated by Kubebuilder. We do actually have the commands, not in the README itself, but the README of the Git repo shows how the code is organized: where the CFP API is, which we made as a dummy API, and then the cfp folder, which is named that way because it needs to be named for your controller. Inside of that are the steps that we used to generate it, so if you want to go back and say "oh, how did they do that?", you can. Most of it was generated by Kubebuilder with just the initialization command, and most of the work that we will do will be in the Reconcile function that we will see in a minute.

What we have here in internal/cfp is a client. It is an HTTP client that calls the API in order to do CRUD operations on the speakers endpoint and on the proposals endpoint. This is the API client we will be using in our controllers.

Now, if you look at main.go, this is where everything is set up. First, we have a scheme that is set up.
What the scheme does is let you map between the REST API of the Kubernetes API server, where your custom resources live, and the Go types that you have in your controller. controller-runtime, and client-go underneath it, has to fetch those objects from the Kubernetes API server, and because client-go works with Go types, it needs a way to go from the group, version and kind you have for your Go type to the REST endpoint to call on the API server. The scheme is what gives you that: you create your type and you register your type in the scheme.

Then we have some flags. The first two are set by Kubebuilder: the first is the metrics address, for metrics, and the probe address is for the health checks. We have added a flag here for the CFP API endpoint address, which is the endpoint address of the API that we have written, and the default is localhost with the port. We don't enable leader election here. We could, but we have only one running pod, one running instance per controller, so we don't need leader election.

Then we create a new manager. As was said, we have a manager that manages controllers, and the manager does a lot of things for us. It creates a cache and informers for us and sets up watches on the Kubernetes API server, so we get events when new objects are created or objects are updated. It manages our controllers' life cycle, and when we have new events it forwards those events to our controllers through the queue. Then we declare our two reconcilers: the speaker reconciler that we are talking about now, and then the proposal reconciler. We set them up with the manager, and that's it.

Now let's talk about the controllers themselves.
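The role of the scheme can be illustrated with a toy registry. The sketch below is an assumption-laden simplification, not the real `runtime.Scheme` from apimachinery: `Scheme`, `Register`, `KindFor` and `ResourcePath` are hypothetical names, and the API group `talks.example.com` is a placeholder. It shows the lookup described above, from a Go type to its group, version and kind, and from there to the REST path for a namespaced custom resource.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// GroupVersionKind identifies an API type, e.g. talks.example.com/v1 Speaker.
type GroupVersionKind struct{ Group, Version, Kind string }

// Speaker and Proposal stand in for the generated Go API types.
type Speaker struct{}
type Proposal struct{}

// Scheme is a toy registry from Go type to group/version/kind.
type Scheme struct{ kinds map[reflect.Type]GroupVersionKind }

func NewScheme() *Scheme {
	return &Scheme{kinds: map[reflect.Type]GroupVersionKind{}}
}

// Register records which GVK a Go type corresponds to.
func (s *Scheme) Register(obj any, gvk GroupVersionKind) {
	s.kinds[reflect.TypeOf(obj)] = gvk
}

// KindFor looks up the GVK for a Go value; ok is false if unregistered.
func (s *Scheme) KindFor(obj any) (gvk GroupVersionKind, ok bool) {
	gvk, ok = s.kinds[reflect.TypeOf(obj)]
	return gvk, ok
}

// ResourcePath builds the REST path of a namespaced custom resource.
// Real clients resolve the plural resource name through API discovery;
// lower-casing the kind and appending "s" is a shortcut for this sketch.
func ResourcePath(gvk GroupVersionKind, namespace, name string) string {
	plural := strings.ToLower(gvk.Kind) + "s"
	return fmt.Sprintf("/apis/%s/%s/namespaces/%s/%s/%s",
		gvk.Group, gvk.Version, namespace, plural, name)
}

func main() {
	scheme := NewScheme()
	scheme.Register(Speaker{}, GroupVersionKind{"talks.example.com", "v1", "Speaker"})

	if gvk, ok := scheme.KindFor(Speaker{}); ok {
		fmt.Println(ResourcePath(gvk, "default", "speaker-sample"))
	}
	// A type that was never registered cannot be resolved to an endpoint.
	_, ok := scheme.KindFor(Proposal{})
	fmt.Println("Proposal registered:", ok)
}
```

This is why forgetting to register a type in the scheme typically shows up as a "no kind is registered" style error at runtime rather than at compile time.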
So what we did is set up a test suite. Something we didn't say is that we use the Flux SDK here, because we work on Flux mostly, and in the Flux SDK we have put many things that we learned writing controllers, in packages that make it easier for us to write our controllers. We think you can use those, because they really make it easier, as we will see, or you can write your own helper functions.

One thing I wanted to do, that I didn't do, is have a slide right about here showing the thousand or so lines of handwritten code needed to work with conditions in the Kubebuilder tutorial, versus using the Flux package, where it's really just a few lines. We can explain what they do, but the functions in the Flux CD runtime package are semantically named, so they should tell you, as you write and read the statements, what they're actually doing. If any of that's unclear, just ask us and we can link to it. Does anyone have any questions so far, or is anyone not following? Can you take the mic?

Check, check. I just had a question around the leader election ID in main.go, and why that was being set to a static value. I'm curious: if there is, say, another instance of this operator that started up, how does that work? Wouldn't they need unique identifiers?

If I heard you right, he's asking about the static ID for the leader election: why we set a static ID, and what would happen if you were to install multiple instances of this. Yep, I can answer that. So you're talking about this one? Did I get that right? Okay.
Yeah, exactly. So, what we do with leader election: when we have several instances of a given controller, like the speaker controller here (we only have one pod, but suppose we have several), they're supposed to do the same job. So they need some kind of synchronization between them, to know who reconciles a given object when it comes in, and to make sure that two or more instances do not do the same job. What this does is that we have an ID, and the ID is going to be held by a Lease object. When a leader election is happening, it is using Raft under the hood. I don't know if you know about Raft. Are you familiar with Raft? Yeah, I thought so, because you asked that question. So all the instances use that to try to get elected leader, and the one that is elected leader gets the lease and holds it until it crashes or something like that. There is a, how do you call it, a health check that is done at a given interval to check whether the leader is still active and is still the leader. If the leader responds when the health check is done, it says "okay, I'm still the leader, I'm active". If it's not active, a new leader election happens and a new instance gets to be the leader. We also have this LeaderElectionReleaseOnCancel option that can be set, so controller-runtime can say: okay, at some point this leader crashed, so we cancel it and do some cleanup. I've never used it, actually. We don't use it; we just didn't need to show that for this part, so we commented it out.

Thanks.

So, coming back to the tests. There's another question? You could probably take one of the mics too.
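The lease-holding part of that answer can be sketched in plain Go. The `Lease` type and its `TryAcquireOrRenew` method below are hypothetical, written only for this illustration; the field names echo the Kubernetes Lease resource, but none of this is the client-go leader election code. One assumption worth stating: in a real cluster the update is made safe by the API server rejecting conflicting writes, which a single-process sketch cannot show.

```go
package main

import "fmt"

// Lease is a simplified stand-in for a Kubernetes Lease object. Times
// are plain seconds on a shared clock to keep the sketch deterministic.
type Lease struct {
	HolderIdentity       string
	RenewTime            int64
	LeaseDurationSeconds int64
}

// TryAcquireOrRenew reports whether candidate holds the lease after the
// call. A candidate wins if the lease is free, if it already holds it
// (a renewal), or if the current holder let it expire.
func (l *Lease) TryAcquireOrRenew(candidate string, now int64) bool {
	expired := now-l.RenewTime > l.LeaseDurationSeconds
	if l.HolderIdentity != "" && l.HolderIdentity != candidate && !expired {
		return false
	}
	l.HolderIdentity, l.RenewTime = candidate, now
	return true
}

func main() {
	// Every replica is configured with the same election ID, because the
	// ID names the one shared lease they compete for. What differs per
	// replica is its own identity, "pod-a" and "pod-b" here.
	lease := &Lease{LeaseDurationSeconds: 15}

	fmt.Println("pod-a acquires at t=0: ", lease.TryAcquireOrRenew("pod-a", 0))
	fmt.Println("pod-b tries at t=5:    ", lease.TryAcquireOrRenew("pod-b", 5))
	fmt.Println("pod-a renews at t=10:  ", lease.TryAcquireOrRenew("pod-a", 10))

	// pod-a crashes and stops renewing; the lease eventually expires.
	fmt.Println("pod-b tries at t=30:   ", lease.TryAcquireOrRenew("pod-b", 30))
	fmt.Println("current leader:", lease.HolderIdentity)
}
```

Seen this way, the static ID from the question is not a problem: it identifies the lock, not the contender, so additional replicas of the same controller are expected to share it.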
So, for the tests, what we use is envtest, which is provided by the Kubernetes SIG. It generates a test environment for us, with a real kube-apiserver that we can use to run our tests. We use this, and we have a wrapper around it, which is the Flux testenv, here. This is from the Flux runtime testenv package. What it does is give us nice helper functions, so that when we create a given object on the API server, we can wait to make sure that the object has been created, for example. So we set the API endpoint, then we create our reconcilers here, then we start this test environment, and at the end we stop the test environment and clean up. That's what it is doing for us.

For the reconcilers we have test cases here. For every test case we have a name, a speaker name, a bio, an email, and some conditions to assert against, and we pass those conditions to an assert function. The first test case is to create a speaker and reconcile it. We create a speaker named Luke Skywalker, we set the bio to "first speaker bio" and the email to first at potomail.com, and the condition that we expect is that the Ready condition reports that the reconciliation succeeded, with the name of the object that has been reconciled successfully.

Should we have everyone check out the step one branch for creating the speaker? Yep. We said that everyone has checked out the tag s1 and created a branch from that. Okay, not yet? This is how you do it: you type git checkout tags/s1. That should also create a branch, s1-branch. Oh, yeah, good point. Then cd into cfp and run make test. Thank you. So let's go on.
So what we do for every test case is create a namespace, and at the end of the test we delete the namespace. When we delete the namespace, we clean up everything that is inside it. We create a speaker with all the information that we have in the test case, we wait for the speaker to be created, and then we run the assert function with the conditions that we want. For the first one, what we do is wait to have the object with the finalizer. I'll explain later what the finalizer is. We wait to have the speaker with the finalizer, and then we wait for the speaker object to be ready. For it to be ready, we need the observed generation of the Ready condition to be the same as the generation of the object, and we need the observed generation that is set in the status to be the generation of the object, to make sure that we have reconciled the right generation and that the Ready condition represents the state of that generation. And then we make the assertion: we look at the status conditions to make sure they match the conditions that we are waiting for. So to test this we can just do make test. Now, where am I? Yep, make test. And it runs the tests.
But just as a reminder about our progress bar, if we had a kind of Pizza Hut delivery progress bar on top, we're in step one of seven, and each step is just the tag that's checked out.
So, as I explained earlier, this will run the API, then run the tests, and clean everything up at the end for you. So regarding this... Oh, yeah. If you're using Gitpod you should go to the "prepare k3s" make terminal, not the shell, not the other one. Okay, not the one that automatically opens. Just go to the one above. That's hard. What's the difference? The automatic one is the one at the bottom? Yeah, it opens on the wrong one.
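The readiness rule the test waits on (the Ready condition and the status must both refer to the current generation of the object) can be written down as a small predicate. This is a sketch with hypothetical types; the real objects carry these values in `metadata.generation`, `status.observedGeneration` and each condition's `observedGeneration`.

```go
package main

import "fmt"

// Condition is a trimmed-down status condition.
type Condition struct {
	Type               string
	Status             string
	ObservedGeneration int64
}

// Object holds only the fields needed for the readiness check.
type Object struct {
	Generation         int64 // bumped by the API server when the spec changes
	ObservedGeneration int64 // last generation the controller finished reconciling
	Conditions         []Condition
}

// IsReady reports whether the object is Ready for its current generation:
// the status must have observed the current generation, and the Ready
// condition must be True and refer to that same generation.
func IsReady(o Object) bool {
	if o.ObservedGeneration != o.Generation {
		return false
	}
	for _, c := range o.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True" && c.ObservedGeneration == o.Generation
		}
	}
	return false
}

func main() {
	// The spec was edited (generation 2) but the status still describes generation 1.
	obj := Object{
		Generation:         2,
		ObservedGeneration: 1,
		Conditions:         []Condition{{Type: "Ready", Status: "True", ObservedGeneration: 1}},
	}
	fmt.Println("stale status ready:", IsReady(obj))

	// After the controller reconciles generation 2, the object counts as ready.
	obj.ObservedGeneration = 2
	obj.Conditions[0].ObservedGeneration = 2
	fmt.Println("current status ready:", IsReady(obj))
}
```

Without the generation comparison, a test could accept a Ready condition left over from before the spec change.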
That's not in the repo. Yeah, okay.
Now let's see what this does. So what we have is a speaker reconciler. What kubebuilder does for us is create the speaker reconciler and put a client in it. This is the client used to speak to the API server. Something to say about the client: it speaks to the API server directly when it wants to write something, to create an object for example. But when it is reading objects, it is always reading through the cache; it is not speaking directly to the API server for that. We have also added an HTTP client to talk to our API. We set the controller name, and this is the API endpoint. Then we set it up with the manager: we say that we watch for objects of type talks v1 Speaker. That's from my API. Then what we need to implement is the Reconcile function, which returns a Result and an error. At the end of the Reconcile function, if we return that the reconcile has failed, what controller-runtime, the controller manager, is going to do is queue the object again to be reconciled, after a backoff. But if we have no failure, it's up to us to say whether we want to be queued again or not. We can do that through the Result. If you look at Result, we can see that we have Requeue, which is a boolean, and we can set RequeueAfter, which is a duration, so we can requeue after some time.
So what we are going to do, when a new reconciliation happens, is retrieve the object that we are supposed to reconcile. We do this by getting a namespaced name, which holds the namespace and the name of the object that we want to retrieve, and we use the client to retrieve it. Then what we use is the patch helper, patch.NewHelper. This is from Flux. What it does for us: say we have retrieved an object to reconcile it, we have changed some things, the status, the conditions, and we want to patch it. What the patch helper does when it is doing the patch is deal with conflicts, because we are in a distributed system, so maybe some other controllers are acting on our objects as well. The patch helper is going to resolve any conflict that we have. In order to do that, what it needs from us is the set of conditions that we own. If there is a conflict on those conditions that we own, it is going to ignore the conflict, go on, and still patch the object.
The conditions that we own are the Ready condition. Reconciling (maybe I have to make this bigger): we set it to say that we are reconciling a given object. The reconciliation process can take several passes: if we are reconciling an object and we have a failure, we requeue, and then after the backoff we reconcile again. We are still in the reconciling process, so we set the condition for that. Stalled: we set Stalled to mean that there is nothing we can do, so we need some human intervention on that one. We'll see here when we set it. And then we have the create failed and update failed conditions.
So we have a deferred function that runs at the end. We make it a deferred function because, no matter what happens after the reconciliation, we want this to be applied. What we want applied is: if we have a Stalled condition or a Ready condition, we want to set the observed generation. This means that we are not in a reconciliation process anymore. We have finished, either because it's stalled or because it's ready, and we want to set the observed generation of our reconciliation in the status. And then we patch. That's what we do here in the deferred function.
The other thing we do is set the finalizer. Here we have put a link, if you want to know more about finalizers. When we set the finalizer, we add it to the object using a controller-runtime function. But the finalizer is actually... let's see, if you get the proposal right here, we can see the finalizer that we set. We can set several finalizers; they are a set of strings. When you do kubectl delete, say kubectl delete proposal, the kube-apiserver is going to tell you, okay, the proposal has been deleted. But actually the proposal is still in etcd, because it's waiting for the controllers to unset those finalizers before finally deleting it from etcd. So if you do kubectl delete, the finalizer is still there. If you do kubectl get proposal, you will still get the proposal until someone removes the finalizer. We set this so we can do some pre-deletion work. Here for the speaker, for example, when we do kubectl delete speaker, what we want on our side is for the API server to tell us, okay, we're going to delete the speaker, but you have a finalizer, what are you going to do about that? And on our side what we do is make an API call to delete the speaker from the API, and then we unset the finalizer.
Yeah. Can we pause for a second? Does anyone have any questions so far? About... yeah. One sec.
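The point that finalizers are just a set of strings that gate the final removal from storage can be modelled without any Kubernetes dependency. In this sketch the finalizer name and all types are hypothetical; the helpers only imitate the behaviour of the controller-runtime finalizer utilities on a toy object.

```go
package main

import "fmt"

// finalizerName is a hypothetical finalizer key used only in this sketch.
const finalizerName = "example.com/finalizer"

// Object models the two metadata fields that matter for deletion.
type Object struct {
	Finalizers []string
	Deleting   bool // stands in for a non-zero metadata.deletionTimestamp
}

// containsFinalizer reports whether name is present on the object.
func containsFinalizer(o *Object, name string) bool {
	for _, f := range o.Finalizers {
		if f == name {
			return true
		}
	}
	return false
}

// addFinalizer adds name if it is missing and reports whether it changed the object.
func addFinalizer(o *Object, name string) bool {
	if containsFinalizer(o, name) {
		return false
	}
	o.Finalizers = append(o.Finalizers, name)
	return true
}

// removeFinalizer drops every occurrence of name and reports whether it changed the object.
func removeFinalizer(o *Object, name string) bool {
	kept := o.Finalizers[:0]
	changed := false
	for _, f := range o.Finalizers {
		if f == name {
			changed = true
			continue
		}
		kept = append(kept, f)
	}
	o.Finalizers = kept
	return changed
}

// removableFromStorage mirrors the rule from the talk: a deleted object is
// only dropped from storage once no finalizer is left on it.
func removableFromStorage(o *Object) bool {
	return o.Deleting && len(o.Finalizers) == 0
}

func main() {
	obj := &Object{}
	addFinalizer(obj, finalizerName)

	obj.Deleting = true // what a delete request does to an object with finalizers
	fmt.Println("removable while finalizer is set:", removableFromStorage(obj))

	removeFinalizer(obj, finalizerName) // the controller finished its cleanup
	fmt.Println("removable after finalizer is removed:", removableFromStorage(obj))
}
```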
Let's get the mic. Yeah.
This is probably a more generic question about the finalizer portion, but let's say you're managing an external resource with your controller, and the controller can't find the resource. Is it better to stay within that finalizer loop and post an error, or is it better to complete the finalizer and complete the deletion? Or does it depend?
If you can't find the resource when you have set the finalizer?
Yeah. Let's say your controller is managing something completely separate, like in a cloud provider or something else, and you can't find that.
Just like in this case, where it's managing something within an external API.
Exactly, yeah. Is it better to err on the side of caution and keep the CRD up in the finalizer state? Or is it better to just delete and say, okay, well, I can't find it, I'm not going to take any action, and then delete myself?
Well, this is logic you have to implement yourself. What we do here... let me show you what we do. Do we have the commit that shows the... Well, we don't have it for now. We just put the finalizer; we will see later. But what we do on our side is try to delete the object, and if we get an error we still go on and remove the finalizer. But this is something you have to implement on your side.
Gotcha. Thank you.
So, yeah, we set the finalizer, and then we requeue. So when we set the finalizer, we do not continue in the same reconciliation: we requeue, and after the backoff we continue the reconciliation. Then we get a new client. At this point we create the HTTP client, and something might not work. What we do in new client is look at the endpoint. If the endpoint is not a real endpoint, we don't create it. Otherwise, we just set an HTTP client. If this doesn't work, because we don't have the right endpoint when we set up our manager with the manifest, we mark the object as stalled. This means that someone has to change this and recreate the controller.
And then we do reconcile. We have an unexported reconcile function for this, which does the actual reconciliation. Again, at the beginning we have a deferred function, and this deferred function deals with all our conditions. What we state is that if we don't have a requeue in the result and we don't have any failure, we delete all the Reconciling and negative-polarity conditions and we mark the Ready condition as true, stating that, okay, the reconciliation process was successful. We have a Ready condition. If not, what we do is see what type of error we have, coming from our client, which we have here. In the client, every time we do an API call we return a given type of error. All the errors we have are here. Depending on the method, say a GET on the speaker path, we say we have an error fetching the speaker, fetching the proposal, updating the proposal or the speaker, or deleting them. Then, given an error, we can mark conditions as true or false. Here, if we have a creation failure, we mark the create failed condition as true, with the reason that we get from the error, and we mark the Ready condition as false. So we will have an abnormal-polarity condition set to true and the Ready condition set to false. If it's a creation request, we set the Ready condition to false only, and the default is just setting the Ready condition to false. And here, if the object generation and the observed generation in the status differ, this means that we are in a new reconciliation, because we have a new object generation. So we mark it as Reconciling, and we go through the reconciliation process every time we have a failure, until finally we are successful. And if we are successful at the end, we delete it here. So that was the deferred function.
After setting Reconciling, we then create the speaker. Creating the speaker is just creating a payload from what we have in the object and making the API call. We create it, and at the end what we do is set the ID in the status, because it was successful. That was the first step.
Question, maybe? Yeah, do you have a question or need help? Okay, give just a sec.
There's a mic coming. Thanks.
Sorry, I guess I'm thinking a little further ahead, on the metrics and observability side of the controller. How are you guys monitoring or understanding the efficiency of the controller, or whether it's running into specific errors? Because sometimes you build stuff out and you think it's going to work out fine, but then metrics come back and it might not be as efficient, or it's not doing the things that you want it to do. So as far as observability goes, where are you putting that in?
I was just thinking... so we're setting events, right? On the question: with controllers you have events and logs, right? It's running in the cluster as a pod, so you can log stuff. controller-runtime comes with a logger, a standard logger. You can add the name of the custom resource that you're looking at, and the namespace, and then some extra messages. There's also an event recorder, so you can emit Kubernetes events. Those are the top two ways. And then of course we're posting back to the status, which is also a way of communicating. You talked about the length of reconciliation. Those things are specific to your application, right? Those are application-specific things that you know how to measure. You could use Prometheus; you could export some Prometheus metrics. In the Flux kustomize-controller, for example, we try to measure how long it takes to reconcile resources. So those are application specific. But mainly logs and events.
Yeah, so do you also measure the disruption to the API server, to see how the operator impacts the API server?
Yeah, because it's making calls, right?
So you could try to... like, we defer our calls so that they happen after it runs. Of course, there are some things it has to do: it has to update the object, so that the state is reflected back to people running kubectl commands to see the updates. Those are necessary, but it should not be making so many API calls that it runs the API server down.
Yeah, and we really prefer that pattern with Flux controllers, by the way, because it's just so useful for folks to get all of the information that they really need on the CRD itself. Moving on.
So that was it for the first step. Now we move on to the second one. Do the same: just create a branch from the tag and move to that branch. That's two. The difference we have here is that we have a new test case. This time we are going to update our speaker. So what we have here is: we create a speaker the same way, and after creating the speaker we change the bio of the speaker and expect the speaker to be updated and to be ready at the end. You can run make test the same way if you want to test this one. I'm just going to go over the code changes and additions. What we have added is in the reconcile part. We have new conditions now: we have an error case for when we update the speaker. And here, before creating the speaker, we check whether we have an ID in the status. If we have an ID in the status, it means that a previous reconciliation was successful and that the ID was set. If that's the case, it means we have to handle the speaker update somehow. That's what we do here in handle speaker update. We create a payload with the new spec of the object, and we retrieve the existing payload using the ID, and then we compare them. If they are not equal, we make an API call to update the given object. If it's successful, this means that the update was successful. We don't change the ID, because the ID is unique. See here: it means that it is successful, and if it's successful we just return. We don't go through the creation process. That's the only addition we have for the update. So we can do make test to make sure that everything works.
And hopefully, if you do need to go back over some of this and say, what was that logic again, you can just look at the commits that we have here, which are separated: the first step was "create speaker", this one is "update speaker". If you look at that commit, it shows pretty clearly, with comments, why all of those little bits of logic are there to handle the different cases. Question, or need help? Okay, does someone have a... Somtochi, do you have the mic still? Okay, cool. Right, can you raise your hand again? Okay, cool. Thank you.
So if the controller is managing an external resource, say something like a GCS bucket, and that bucket gets updated in a different way, maybe I go through the console and delete it, when does the reconciliation logic get triggered if I make such an external update?
I'm having a really hard time hearing, I'm sorry.
Let's say that we have a controller that manages an external resource like a GCS bucket. If I update the bucket externally, if I go through the Google console and update the bucket, does the reconciliation logic kick in at that point, or do I need to update the CRD for the reconciliation logic to kick in?
Do you want to take that one, Somtochi? So he's saying, if you update the external resource, does the reconciliation kick in?
So, like, if you make a change on the API, does the controller automatically reconcile? It doesn't, right. Basically it reconciles at an interval that you can set, and it also reconciles when you make updates to the object. Those are the two things. So if you make a change to the external API without doing anything on the cluster, that will get reflected on its next run. It won't be immediate. But you could ask... for example, in Flux we have a flux reconcile command. What it does is add an annotation to the resource, and the controller knows to watch out for that and trigger a reconcile. So yeah.
Exactly. And there are ways to speed that up, to be more consistent with, let's say, a push-based approach. Instead of waiting for that next interval, you could build... this is actually what, let's say, the Flux source-controller does: it also has a webhook. So let's say when you make a Git change, in that case, or some other source change, you can emit a webhook to say, oh, go ahead and check again, because we've updated it. It will still be happening on the regular interval, but you can issue a poke. And you could do that with any other API as well, if you wanted to build something like that, let's say if it was really time sensitive and you wanted to hear events as well.
Something you have to know here is that this is eventually consistent. If there's a failure when we do the update, and the API call fails because of a network failure or something on the API side, we don't, well, we don't really care, I would say. But yeah, okay: we would set the condition stating that this reconcile is still not okay. But eventually, if the API comes back up and has the behavior that we expect, at the end we will set the condition saying, okay, finally we updated your object, and we can set that in the conditions.
So we're moving on, because, yeah, we've got about 10 minutes, 11 minutes. Yeah, I'll just do the delete. I wonder if there are any steps that we could cut out; at least we could go through...
So the last part is about deletion. The same way, we have a test case to do the delete, and what we expect at the end is that the object is deleted, that it is not found when we try to poll it from the API server. This is tied to our finalizer; we talked about the finalizer earlier, stating that the object cannot be deleted because we have the finalizer. What we do as well, before we do the reconcile, is check for deletion. When you do kubectl delete, a deletion timestamp is set on your object by the API server. If we have this timestamp, we call reconcile delete. In reconcile delete, we check if we have a status ID. If we have a status ID, it means that we have a speaker that exists on the API side, so we issue an API call to delete that object. Right. I said that we go on and remove the finalizer, but actually we don't if...
I'm so sorry, one thing: did everyone check out the step three tag and make a branch from it? Or you can just follow along on the screen. But if you're doing it, just make sure that you've done that.
Yeah, so actually what we do is try to delete, and if we cannot delete, we don't delete the object on the Kubernetes side. We just continue, after every backoff, trying to delete again until someone fixes it on the API side. But if the delete is successful, what we do is, using controller-runtime, remove the finalizer. So in the set of finalizers we won't have any finalizer anymore, and then that is patched on the API server. And then we don't requeue, actually, because we stop the reconciliation, and on the API server side it can finally delete the object. So that's what we have here, and we're going to run make test one last time.
And so we won't have time to cover the proposal one, but it's pretty much the same, except for the dependency differences, right? Yeah, so I'm going to show you this. I'm going to jump directly to step seven. Question?
So when we get the event the first time, where the timestamp is set to non-zero, meaning that the object's been deleted, is that an update event to our controller reconciler, or is that a delete event?
You know, we have a queue, so if something is happening at the same time...
Okay, but for the purposes of your predicate, to filter out update versus delete events, what would it be triggered as? In this case we're returning true for both update and delete events. But I'm just curious: if I'm writing a controller and I want to, say, filter out deletes because I don't care, but I want to get update events and maybe do something different for that scenario...
Take the question up, though.
Yeah, would there be a distinction there? Because I guess, does delete fire only when the object's actually removed from the API server?
Because I think you're going to cover this in the logic. It's covered in the logic. Yeah. Well, thanks for that one. I think we're going to address that as we go through the logic, and let us know if it doesn't. Cool. Okay. Yeah.
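The deletion flow from this step (delete on the external API first, keep the finalizer and retry while that fails, remove the finalizer only after it succeeds) can be sketched as follows. All names here are hypothetical, including the `flakyAPI` fake and the finalizer key; in the real controller the returned error is what makes controller-runtime requeue with backoff.

```go
package main

import (
	"errors"
	"fmt"
)

// finalizerName is a hypothetical finalizer key used only in this sketch.
const finalizerName = "example.com/finalizer"

// Speaker keeps only the fields the deletion flow needs.
type Speaker struct {
	Finalizers []string
	Deleting   bool   // stands in for a non-zero deletion timestamp
	StatusID   string // ID of the speaker on the external API, if created
}

// speakerAPI is the slice of the external API used during deletion.
type speakerAPI interface {
	Delete(id string) error
}

// flakyAPI is a fake that fails a fixed number of times before succeeding.
type flakyAPI struct{ failures int }

func (a *flakyAPI) Delete(id string) error {
	if a.failures > 0 {
		a.failures--
		return errors.New("api unavailable")
	}
	return nil
}

// reconcileDelete returns an error while the external delete fails, leaving
// the finalizer in place so the caller can retry later. Once the external
// delete succeeds (or nothing was ever created) the finalizer is removed.
func reconcileDelete(s *Speaker, api speakerAPI) error {
	if !s.Deleting {
		return nil
	}
	if s.StatusID != "" {
		if err := api.Delete(s.StatusID); err != nil {
			return fmt.Errorf("deleting speaker %s: %w", s.StatusID, err)
		}
	}
	kept := s.Finalizers[:0]
	for _, f := range s.Finalizers {
		if f != finalizerName {
			kept = append(kept, f)
		}
	}
	s.Finalizers = kept
	return nil
}

func main() {
	s := &Speaker{Finalizers: []string{finalizerName}, Deleting: true, StatusID: "42"}
	api := &flakyAPI{failures: 1}

	// First attempt: the external API is down, so the finalizer stays.
	fmt.Println("attempt 1:", reconcileDelete(s, api), "finalizers left:", len(s.Finalizers))
	// Retry after backoff: the delete goes through and the finalizer is removed.
	fmt.Println("attempt 2:", reconcileDelete(s, api), "finalizers left:", len(s.Finalizers))
}
```

Whether to give up and remove the finalizer when the external resource cannot be found is, as the presenters say in the Q&A, a decision each controller has to make for itself.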
Okay, cool. Thanks. We can talk about it at the end as well.
So, the proposal, quickly, before we finish. On the proposal we do the exact same things: we create a proposal, we update the proposal, and we delete the proposal. But there are some differences that I'm going to explain quickly. The first difference is that in order to create a proposal we need to have a speaker. And that would be in step four; you check out the s4 branch. I'm at the end here, explaining everything, because we don't have time. We try to get the speaker first. If we don't have a speaker, we cannot create a proposal, and we set a condition stating why we cannot create the proposal. If the speaker exists, we can then create the proposal and reference the speaker. On the API side we have a validation: when we try to create a proposal, we validate that the referenced speaker exists. That's the first thing that is different.
The second thing that is different is on the watch side. What we want is, every time there is an update on the speaker side, to be notified that the speaker has been updated, and to update our proposals accordingly. The way we do this: as I said, we have a cache. When we watch the API server for events, we put those objects in our cache, and we want to index those objects in our cache so we can retrieve them. So we index them by a speaker index key. We index them so we can retrieve them later. Okay. So we have a function that is called index proposal by speaker. What it does is take a given proposal and index it by the speaker ref's name, the speaker that owns this proposal. Then, when we set up our manager, we set it up for proposals, but we want to watch the speakers as well. So we watch the speakers, saying every time there's an update on a speaker we want to handle it and requeue for that speaker. We set up our own function for handling this, which we call requests for speaker change. In requests for speaker change we check that it's a speaker, and if it's a speaker we check the ID, and then we retrieve all the proposals that have this given ID as a key in the index. And for all those that we retrieve, we ask to reconcile them. We have a loop where we do this. So that's the difference.
Everything here is on step seven, and we have everything in main, actually. In main you have everything written, and you can go from there with the tests. What we have as well, that I didn't mention, is the deploy target. We have a deploy target here... go to main, and in deploy what we do is, we have an image that just contains the controller, and if you do make deploy you can install everything, the API and the controller, in your given Kubernetes cluster. That's it.
Okay. Yeah, I don't know if your question was actually answered, about the deletion. I think you were asking, if I got it right: how does the reconcile function in the speaker controller know if it's a deletion or not, as opposed to an update? Did I get that right? And I think, if I'm understanding correctly, basically if there's a deletion timestamp, then we know that it's a deletion. Otherwise, it's not. And in fact that's in, let's say, lines 142 to... basically, there's a comment about that on line 141, and then there's an if statement in there, inside of the speaker controller Go file.
That wasn't his question. His question is that we have... oh, well, I think it doesn't matter, but on the controller side you have a queue, and all requests come through the queue. For every case we try to get the object anyway, so if it's not there anymore, you don't do anything. And on deletes as well, you do the same: you try to get the object from etcd, and if the object is not there anymore, nothing happens. You first check the object. If it's not there anymore... let's check this right here. You see, when we're getting... Yeah, we did that today. If it's deleted completely, we don't care. Okay, so... yeah, we can talk about it just after the tutorial.
Okay, cool. Well, we are actually at the end of our time. I hope everyone was able to get through a part of this tutorial. You can still access those instances for a period of time. I mean, I don't know how long they're going to be available for. Do you know, for Gitpod?
So I think the default is one hour. Okay. The community edition for Gitpod, if I'm not mistaken. Maybe I'm making a mistake.
Do you know how long the session is going to stay?
How long?
No, no, the Gitpod session, the actual Gitpod instances.
30 minutes. It's not one hour, it's 30 minutes.
Okay, cool. So you can also go through that again, I believe, correct? We don't have a specific time limit after which they can do this. I just wanted to let people know what the next steps are. Yeah, you can either watch it again or do it again. I think you can do it on Gitpod again, but either way you can certainly run it locally and follow the steps all the way to seven and see each of the comments. So I hope that helps a lot, and thanks everyone. Thanks to the Gitpod folks, and all of you for coming.