Welcome. When you're thinking about a controller, what comes to your mind? Games, something fun, hopefully. Unfortunately, I'm not normal, so whenever I'm talking about controllers, I'm thinking about code. I'm a little bit twisted on that end. During this talk I'm going to be talking about my favorite part, the controllers. In Kubernetes especially, but not only: you can also think about controllers in OpenShift, or controllers that operate your own specific resource, which you can define in Kubernetes as a custom resource definition. Controllers per se are not that hard. I'll hopefully explain everything in enough detail that you get a rough idea of how to write your own controllers and, most importantly, where to look for the information. What you see on the screen is a real controller that is part of the Kubernetes source code. It is the cron job controller, one of the simplest controllers available in Kubernetes, sort of; at least the implementation is very simplistic. If you go to the URL you can see the code; it's roughly 400 lines. Unfortunately, you should not take the cron job controller as a good example, and I'm going to explain a little bit more why it is actually a bad one. Why am I talking about controllers? You might be wondering why this dude standing in front of you has any knowledge about the topic. I originally wrote two controllers in Kubernetes. Has anyone had a chance to play with jobs or cron jobs? Okay, perfect. I wrote both of those controllers. That's why I can say the cron job controller is a bad example and you should not be looking at it. The job controller is much better; unfortunately, it's also a little bit more complicated. So let me try to squeeze those 400 lines of code into something more digestible: the control loop. This is the heart of every single controller.
The control loop basically gets items from the queue, processes them, and eventually handles any errors or re-queues items that need reprocessing. That's it. That's all you need to know about every single controller. Even the method names will usually be the same in most cases. These are the pieces you will be looking for: the worker, which is usually those three lines of code, and processNextItem, which gets an item from the queue, calls the sync handler, and then either re-queues the item or generally handles errors. I'll talk about the sync handler a little more in a bit, but all you need to know about it at this point is that it does the actual work on your resource. In the case of a job, the sync handler is responsible for spawning new pods so that the job fulfills the requirements the user specified in the job spec. Does that make sense so far? Okay, cool. Well, in that case I could probably finish my talk, because that's all you need to know. So thank you very much, it was very nice meeting you all. No, just kidding. Where do I start? There are a couple of levels at which you can approach controllers. When I was writing my first controllers about four years ago, maybe even five at this point, there was no documentation. There was nothing. The only resource I had available was the existing source code. Currently Kubernetes has more than 30 controllers. Additionally, there are lots of community-written controllers and provider-written controllers. OpenShift, aside from the core controllers and the ones dealing with builds or deployment configs, has a bunch of operators that Jessica will be talking about after my talk. So definitely stay and check out what she has to say about operators, because it will be relevant to my talk. Going back to my story: I didn't have any documentation back then.
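To make the worker / processNextItem / syncHandler shape concrete, here is a minimal sketch of that control loop in plain Go. It deliberately avoids client-go; the `fifo` type, `workItem`, and `syncHandler` are made-up stand-ins for the real workqueue and sync logic, just to show the three steps the talk describes.

```go
package main

import "fmt"

// workItem is whatever key identifies a resource, e.g. "namespace/name".
type workItem string

// fifo is a toy stand-in for client-go's workqueue: first in, first out.
type fifo struct{ items []workItem }

func (q *fifo) add(i workItem) { q.items = append(q.items, i) }

func (q *fifo) get() (workItem, bool) {
	if len(q.items) == 0 {
		return "", false
	}
	i := q.items[0]
	q.items = q.items[1:]
	return i, true
}

// syncHandler is where the real work would happen; for a job controller,
// this would spawn pods to satisfy the job spec.
func syncHandler(key workItem) error {
	fmt.Println("syncing", key)
	return nil
}

// processNextItem mirrors the three steps of every control loop:
// get an item, sync it, and on error re-queue it for another try.
func processNextItem(q *fifo) bool {
	key, ok := q.get()
	if !ok {
		return false // queue drained
	}
	if err := syncHandler(key); err != nil {
		q.add(key) // re-queue items that need reprocessing
	}
	return true
}

// worker is the usual three-line loop you will find in real controllers.
func worker(q *fifo) {
	for processNextItem(q) {
	}
}

func main() {
	q := &fifo{}
	q.add("default/my-job")
	worker(q)
}
```

In a real controller the worker runs forever in a goroutine and `get` blocks; the loop terminating on an empty queue here is only so the sketch finishes.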
When I was writing the job controller, I opened the replication controller, as it was called back then; currently it's replica set. I was looking at that controller and writing my own job controller. That is one option for how you can start writing your own controller, but it is an advanced option. The other thing is that throughout the past four years, all of the controllers got complicated. Extremely complicated, because we got lots of feature requests, lots of improvements and fixes that made the controllers quite a bit more complicated than they were when I was starting. So yes, that is currently a more advanced path, but there are entry-level options that let you write your own controllers. The most important is the sample controller. The sample controller is nothing more than a very simplistic controller that lives in the Kubernetes repo. Why is it important that it lives in the Kubernetes repo? It means that every single PR that lands in Kubernetes ensures the sample controller stays valid and working, because we have a proper amount of tests around it that keep the code up to date. That is very important, because the sample controller will be one of the resources for you as controller authors to look at, and it's very simple. I can't remember the exact line count, but it is small, it is a good starting point, and it is well documented. The other option is Kubebuilder. Kubebuilder started partially in SIG CLI, which I'm currently co-leading, and it grew over time. Kubebuilder provides you a framework: it will generate your custom resource definition files and scaffold the controller. So you can either use Kubebuilder or copy-paste the sample controller, and then fill in the bits that are relevant for you, which is, like I mentioned before, the sync handler, the heart of your controller. Does that make sense so far?
Aside from that, there do exist other examples that I haven't mentioned, and there is the documentation; I'll be pointing to the relevant parts of the documentation that you might need over time. So let's look under the hood, and let me pull up again the control loop that I showed you a minute ago. I told you this is very simple, but I'll make it even simpler by splitting it into bits, so that you can understand every single line of the code on the screen. First, the queue. I've mentioned that one of the important parts is the queue, and this is how the incantation for instantiating the queue looks in the majority of cases. When you think about the queue, it is nothing more than first in, first out. That's it. There are some additional features, such as rate limiting, which lets you penalize items that are having problems: you can re-queue the same item, but only after some delay, so that a permanently failing item doesn't DDoS your controller. There are a couple of options for defining the rate limiting: there's the default rate limiter used here, there's an exponential one, a fast-slow one, and a couple of possible combinations. This particular example is from the sample controller, so it will be easier for you to know what you're looking at when you go look for the code. I'll mostly be talking about Kubernetes sources, because I mostly work between Kubernetes and OpenShift, so that's where my examples come from. The majority of the main core controllers in Kubernetes actually use the named rate-limiting queue constructor. The second important element is that we need to somehow feed the queue with data.
To do that, we will be using shared informers, in all of the controllers. Here the cron job controller is an exception, and that's one reason it is bad and needs to be rewritten almost from scratch: it does not use shared informers. Why are shared informers so important? Shared informers provide you with a data cache for a specific resource; in the case on the screen, it's for pods. Aside from the cache, they also provide you with a mechanism to distribute notifications about changes to your objects; I'll be talking about event handlers in a minute. Why is this important? Obviously, you don't want every single controller reaching out to the Kubernetes API server to get fresh information about what's going on. Every single controller is usually listening, let's call it that way, to changes to at least two resources: the controller's own resource and the controlled resource. For example, the job controller looks at jobs and pods. Now imagine that you have thirty controllers doing pretty much the same thing, and every single one of them reaching the Kubernetes API just to get the same data about pods. That is way too inefficient; you don't want to do it. That's why you do want to use shared informers. They have proven to work flawlessly at this point in time, they are very efficient, and every single controller except for the cron job controller uses them. Event handlers: like I mentioned before, aside from registering the shared informers, you then define event handlers. What are event handlers? They are nothing more than a registration for the events you want to receive about your resources. The three possible events are creation of an object, modification, and deletion. You can react to all three of them, but you don't have to.
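The shape of those three event handlers can be sketched as follows. This is a plain-Go illustration, not client-go: the `obj` and `handlers` types and the `resourceVersion` comparison are simplified stand-ins for the real resource event handler callbacks, showing that adds and deletes usually just enqueue a key while updates first compare metadata.

```go
package main

import "fmt"

// obj carries just the metadata an event handler typically inspects.
type obj struct {
	namespace, name string
	resourceVersion string
}

func key(o obj) string { return o.namespace + "/" + o.name }

// handlers mirrors the three callbacks you register on an informer:
// one per event kind (creation, modification, deletion).
type handlers struct {
	onAdd    func(obj)
	onUpdate func(old, cur obj)
	onDelete func(obj)
}

// newEnqueueHandlers wires all three events to the given enqueue
// function; updates skip objects whose metadata did not change.
func newEnqueueHandlers(enqueue func(obj)) handlers {
	return handlers{
		onAdd:    enqueue,
		onDelete: enqueue,
		onUpdate: func(old, cur obj) {
			if old.resourceVersion == cur.resourceVersion {
				return // nothing changed, we don't care
			}
			enqueue(cur)
		},
	}
}

func main() {
	var queue []string
	h := newEnqueueHandlers(func(o obj) { queue = append(queue, key(o)) })

	a := obj{"default", "my-job", "1"}
	h.onAdd(a)
	h.onUpdate(a, a)                             // same version: ignored
	h.onUpdate(a, obj{"default", "my-job", "2"}) // real change: enqueued
	fmt.Println(queue)
}
```

Note that the handlers only put the key into the queue, never the object itself; the fresh object is fetched later from the lister, which is why the update comparison happens here rather than in the sync handler.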
It will always depend on your specific use case and how your specific controller reacts to your resources. In the majority of cases you will want to react to all three of them; I don't think I've seen a controller that doesn't, but I might be mistaken. To my knowledge, it is usually all three. So, as you see, these are the signatures of the functions that need to be defined. Usually, in the case of a creation or a deletion, the functions just enqueue the resource, putting it into the queue so that it gets processed. In the case of an update, we compare some metadata between the two versions of the resource to know whether we want to work with the object or we don't care. The final element of the shared informers is listers. The notifications provide us with objects, but the queue only holds the metadata needed to identify a particular object. We then need to get a fresh copy of the object to work on, and for that we use listers. They provide you a view over the actual store. So what does the sync handler I mentioned before actually look like? Every single sync handler will have more or less this flow. First, you take the key of the resource, which is usually namespace slash name; there is a helper method to split it back into the namespace and the name. The next step is getting the actual object you want to work with from the caches. And here comes a very important bit: caches are shared data stores, so before modifying an object, you need to perform a deep copy of it.
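That deep-copy rule is easy to demonstrate in miniature. In this sketch the `job` type, its `deepCopy` method, and the `cache` map are toy stand-ins for a real Kubernetes object (whose generated `DeepCopy()` does the same thing) and the informer's shared store; the point is only that the sync handler mutates its own copy, never the cached one.

```go
package main

import "fmt"

// job is a toy resource; parallelism stands in for any spec field.
type job struct {
	name        string
	parallelism int
}

// deepCopy returns an independent copy, like the generated DeepCopy()
// methods on real Kubernetes objects.
func (j *job) deepCopy() *job {
	c := *j
	return &c
}

// cache plays the role of the informer's store: shared between all
// controllers, so its contents must be treated as read-only.
var cache = map[string]*job{
	"default/my-job": {name: "my-job", parallelism: 1},
}

// sync fetches the object for a key and works on a private copy.
func sync(key string) *job {
	cached, ok := cache[key]
	if !ok {
		return nil // object was deleted; nothing to do
	}
	j := cached.deepCopy() // never mutate the cached object
	j.parallelism = 5      // ...your actual logic goes here
	return j
}

func main() {
	updated := sync("default/my-job")
	// the copy changed, the shared cache did not
	fmt.Println(updated.parallelism, cache["default/my-job"].parallelism)
}
```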
The reason is that if you don't, you will be modifying the cached version of the object, which basically means all other controllers working with the same resources will get corrupted data, because the data in the caches will no longer reflect the data in the Kubernetes API. So that is a very important element. And then, finally, your logic. Your logic can be creating pods, your logic can be spinning up some VMs, whatever you can think of for your resource; this is where your actual logic comes in. There's a nice picture made by the gentleman credited in the bottom-left corner, who visualized how client-go and a controller interact. The bottom part is what I described as the shared informers and the work queue. You don't have to look too deeply into what's going on in the top part; client-go is the library you will be using when writing your own controller, and the work queue and shared informers are the interfaces you use to access data from the Kubernetes API. You can clearly see how it all works under the hood, but what you will mostly be interested in is the bottom part: the event handlers putting data into the work queue, processing an item, re-queuing the item for reprocessing in case of a failure or an error, and, in case of successful processing, calling the forget function so that the queue stops tracking failures for that object. Does that make sense? Okay. So, before closing the entire discussion: there are eleven ground rules that every single controller should fulfill. This is crucial. All of the Kubernetes controllers except for the cron job controller fulfill them; the cron job controller hopefully will too within a year, maybe two, depending on how much time we have. So let's go through them. There's a link, of course, to the documentation if you want to read about them. One: every controller operates on a single item at a time. Always.
You're not processing more than one item at a time, only a single item, nothing more. Next, there's random ordering between items. This is a distributed system; there's no ordering. That's the whole point of Kubernetes. You should not be introducing any ordering; if you need it, it should be done at a different level. You don't know how people will be modifying your resource. For example, the cron job controller produces jobs based on a schedule. I don't know what the schedule will be; the user can modify it at any point in time, and I need to react to the changes the user performs at any given point in time. That applies to every single controller. If the deployment controller gets the information "I want to scale this application to five pods," it needs to respond immediately and spin up those five pods, even if there were two or three running beforehand. I don't know what the order will be, so ordering never matters in a controller. Level driven, not edge driven. Like I said, this is a distributed system; you cannot assume your controller is always running and sees every single change as it happens. You might be processing constantly because there are constant changes, but there will also be times when your resource is modified only once in a while and then nothing happens for a good period of time, because the system is working stably. Use shared informers. Like I said, shared informers are crucial when it comes to working with data; always use them. You will find controllers that instead list the resources directly from the server. That's what the cron job controller does. The reason we did it that way back then was that shared informers didn't exist yet. The other thing is that the cron job controller is slightly different when it comes to implementation, so it's not that simple to rewrite it.
All the other controllers always react to the current changes and apply them immediately to the resources they work with. The cron job controller is slightly different, because it needs its own scheduling mechanism to spin up jobs per the defined schedule, not immediately. Never mutate original objects. That's what I told you, and it relates to the previous rule about shared informers: you are always working with shared data, so before modifying anything, you need your own copy. Wait for secondary caches. Your controller can be working with multiple caches; I mean, you might be listening to multiple kinds of objects. You always have to ensure that all the caches are up to date, so the majority of controllers have a couple of calls that wait for the caches to be synced, somewhere at the beginning of the run loop. There are other actors in the system. Again, we're in a distributed system. The kube-controller-manager alone runs about 30-plus controllers. If you're running a more advanced system, there could be additional custom resource controllers, or in the case of OpenShift there will be Kubernetes controllers and OpenShift controllers, and in the 4.0 world there will also be operators running in the cluster. So there might be 50 or 60 controllers working across the entire cluster. You don't know which objects they are working with; you need to ensure that you touch only your own objects and behave properly. That also applies to re-queuing on errors. Anything can happen. Say you are having problems updating the resource: you process your object and want to ensure its status is updated, but the update fails. You want to re-queue the same object, re-process it, and update the status at some later point in time.
If there's a permanent failure, obviously the re-queuing should happen with some rate limiting. Watches and informers will sync; that's the guarantee we are giving you. Use observed generation when possible. Every single object has a spec and a status in its definition, and observed generation is just a number that lets you easily tell whether the status actually reflects the current specification of the object. Deployments use this heavily. And consider using owner references. Currently it is still written as "consider" in the doc, but at this point it is actually a requirement, because garbage collection relies on the information about your dependents, so that when you're removing your resource, all the dependent objects get removed if the user wishes so. If not, there is an option to orphan the dependents, but that decision should belong to garbage collection and the user, not to the controller writer. Okay, I think I'm out of time. Thank you very much. I hope the presentation was helpful. If you have any questions, I'll be outside the room or somewhere in the room throughout the entire conference. Thank you very much. Thank you.