Here it goes. So hello, hello. Let me first quickly introduce myself. My name is Sergey, and I'm a Python developer. I like to say that I'm a Python developer from Siberia, though now I live in Berlin, Germany. I have something like 20 years of software engineering experience, depending on what and how you count, of which about 11 years are in Python, and I have loved Python all this time. Currently I work at Zalando, a well-known European e-commerce company for clothes and fashion, as a backend software engineer in price forecasting, where we run machine learning infrastructure on Kubernetes. And this is what I'm going to present now: how to use Kubernetes to extend your applications and make it a part of them, not only from the operations point of view, but also from the development point of view.

But before we go there, let me briefly remind you what Kubernetes is in general, which we will need later. It is advertised as a container orchestrator, which is true, of course, and for some time it was only a container orchestrator. You could run containers, group them into pods, one or multiple containers per pod, and on top of that build abstraction layers: deployments, which keep some number of pods running; services, for pod and container discovery; ingresses, to connect the internet to your application; and so on. All of this together represents the operations part of Kubernetes: where you run and operate your applications. However, to take a deeper look, we need to understand what Kubernetes is under the hood, how it works. And when you look at it as a software engineer, you see two main components.
First, it's an API, a RESTful API — importantly, a standardized RESTful API. Second, there's a database, a state. So it's basically a stateful RESTful application, which does some magic of running the containers and distributing them to the workers — to EC2 instances or bare-metal machines and so on. There's also a command-line tool, kubectl, and numerous other applications that integrate with Kubernetes: UIs, dashboards, monitoring systems, et cetera.

And what we can do here is extend Kubernetes with our custom resources — not only using the built-in resources, but adding custom resources which do whatever we want them to do. For that, we only need to create one single YAML file, which looks like the one on the right. It is a custom resource definition, of which the most important parts are the names: the singular and plural names, the kind, and the group and version. You can optionally add aliases, such as short names, and OpenAPI schemas for validation of the objects, among other things. But those names are the important ones, because they are used later to create the custom resources — not the resource definitions, but the resources themselves, which you can see on the left. Those resources can contain any information you like in the specification, the spec field. For example, here you can see that we have some durations, some fields, some items defined in our custom resource, and, just for convenience, labels and annotations.

But that alone would make Kubernetes only a stateful database for YAML documents, which makes no sense. Or worse, a database over HTTP, which makes no sense at all.
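To make it concrete, such a custom resource is just structured data once an API client has deserialized it. Here is a minimal sketch, as a Python dict, of how an object like the one on the left might look; all field names and values are illustrative, not from a real CRD:

```python
# A custom resource as an API client would see it after deserializing the
# YAML/JSON. Field names and values are made up for illustration.
resource = {
    "apiVersion": "kopf.dev/v1",
    "kind": "KopfExample",
    "metadata": {
        "name": "kopf-example-1",
        "labels": {"somelabel": "somevalue"},
        "annotations": {"someannotation": "somevalue"},
    },
    "spec": {  # arbitrary, domain-specific content
        "duration": "1m",
        "field": "value",
        "items": ["item1", "item2"],
    },
}

# The operator's domain logic usually cares only about the spec:
spec = resource["spec"]
print(spec["duration"])  # -> 1m
```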
The magic begins when you add some logic behind those YAML documents, behind those objects — which is called an operator. What kind of logic it is doesn't matter: it does something whenever you create a resource, whenever you delete a resource, or when you change any field of that resource. What's convenient for us software developers is that when you apply the YAML with a custom resource definition, Kubernetes automatically creates all the API endpoints; it is automatically supported in the command-line tools and all other applications without any additional configuration. You can just run `kubectl get` for your custom resource and it will work. You can use the API clients and they will just work.

However, when you want to write an application which implements that logic, you quickly find yourself doing a lot of infrastructure work. You have to talk to the API properly. You have to authenticate with that API. You have to throttle your requests in case of errors — otherwise you could overload Kubernetes with some bug and it would just die, which happened to us once. You have to process your objects in parallel if you have hundreds or thousands of them. You have to persist the state of those objects. You have to handle exceptions in your own code, or worse, in the external environments, and retry things. You have to connect this JSON API notation to your functions, and so on. Logging is also needed. So there is a lot of infrastructure hassle that needs to be done. And whenever something reusable appears on the scene, what do you usually do? You make a framework. And so, welcome Kopf, a framework for these Kubernetes operators.
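To give a feel for just one small piece of that infrastructure hassle, here is a simplified, hand-rolled retry-with-backoff helper of the kind you would otherwise have to write yourself (the function names and parameters are made up for this sketch):

```python
import time

def with_retries(fn, attempts=3, backoff=0.01):
    """Re-run fn until it succeeds, sleeping between attempts with
    exponential backoff -- one tiny piece of operator infrastructure."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise                      # give up after the last attempt
            time.sleep(backoff * 2 ** attempt)

calls = []
def flaky():
    """Simulates an API call that fails twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient API error")
    return "ok"

print(with_retries(flaky))  # -> ok, after two failed attempts
```

And that is only retries: authentication, throttling, parallelism, and state persistence each need similar hand-written plumbing — which is exactly what a framework can absorb.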
I will first quickly run through the features available in this framework, and then through how it can be used and which software development patterns it opens up for us. So, what's your best guess: what would be the simplest Kubernetes operator ever — how many lines would it contain? The answer is three lines: import the library, declare the occasion on which we want to trigger something, and accept the information. The fourth line is already your code — something useful, in theory at least. Here it's not very useful, but it can be your domain logic; you can do whatever you want, or just pass and do nothing.

But how do you run this operator? When you start, the easiest way is to run it locally on your own machine — a notebook, a workstation, whatever you have — not necessarily deploying it to Kubernetes with all those deployment files and permissions. You can just use the permissions you personally have for that specific cluster. To run an operator, you just do `kopf run` and your Python file, and optionally enable verbose mode. Then you create your objects and see how it behaves. If we have time at the end, I will show you a quick live demo — hopefully I will be fast with the presentation. Once the operator runs, you will see some log output: something is happening, the object is processed, the events are stored, and so on. So the magic just works with these four lines of source code and one command line for the object creation.

Regarding the features, let's quickly run through what is available. When people come from the Go language environment, they ask how to make informers, for example. They start at a very low level, and the very low level of resource processing in an operator is basically receiving the events as they happen in Kubernetes, directly as-is.
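The "declare the occasion, accept the information" mechanism is just decorator-based registration and dispatch. Since we can't run a cluster here, this is a toy sketch of that mechanism in plain Python — the names `on_create`, `registry`, and `dispatch` are illustrative, not Kopf's actual API:

```python
# Toy sketch of the decorator mechanism behind an operator framework:
# the framework keeps a registry of handlers and calls the matching
# ones when an event arrives. Names are illustrative, not Kopf's API.
registry = {}

def on_create(resource):
    def decorator(fn):
        registry.setdefault(("create", resource), []).append(fn)
        return fn
    return decorator

@on_create("kopfexamples")
def create_fn(spec, **kwargs):
    # This is "your code already" -- the domain logic.
    return f"Created with spec: {spec}"

def dispatch(event, resource, spec):
    """The framework side: route an incoming event to the handlers."""
    return [fn(spec=spec) for fn in registry.get((event, resource), [])]

print(dispatch("create", "kopfexamples", {"duration": "1m"}))
```

The real framework does the same routing, but fed by the live Kubernetes event stream instead of a direct `dispatch` call.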
For that there is a decorator, `kopf.on.event`, which does exactly that: it receives the event and you can do something with it. But if you failed, or there was an error, or you just didn't notice this event because the operator was down for some time — well, there will be no retries, no second chance, because the event has already happened. You can use this for some features, but not for all of them.

Most likely you would want to use the change-detecting handlers instead, which understand when the object was actually created, updated, or deleted, or when a specific field was changed. They notice these changes even if the operator was down when the change happened, because they store the last-handled configuration directly on the resource, or in other databases, and therefore they ensure that your processing will definitely happen — it will not be missed. A kind of eventual consistency, or eventual execution of the handlers.

This is also a reason why you see two creation handlers here. Every handler, every function, is considered a single unit of work — more importantly, a single atomic unit of work. It is either done or failed, and if it failed, it will be retried multiple times until it succeeds, or until it fails permanently and explicitly. This is very convenient if you create some sub-resources — say, a persistent volume claim, a deployment, and a job — and you would like each creation to happen only once. You put every single object into its own handler or sub-handler, and once the creation happens, it will not be re-executed again, even if other objects are still in the process of creation, maybe over multiple retries. The framework also takes care of the error handling.
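The "atomic unit of work" idea can be sketched in a few lines of plain Python: per-handler progress is recorded, so a handler that has already succeeded is never re-executed, even when a sibling handler keeps failing and forces another processing cycle. (In the real framework this progress lives on the resource itself; the `done` set and handler names here are made up for the sketch.)

```python
# Sketch: each handler is an atomic unit of work. Progress is stored per
# handler, so a succeeded handler is skipped on subsequent retry cycles.
done = set()      # in the real framework, persisted on the object itself
created = []

def run_once(name, fn):
    if name in done:
        return            # already succeeded earlier -- never re-run
    fn()
    done.add(name)        # mark as done only after success

attempts = {"job": 0}
def create_pvc():
    created.append("pvc")
def create_job():
    attempts["job"] += 1
    if attempts["job"] < 3:
        raise RuntimeError("API hiccup")  # fails twice, then succeeds
    created.append("job")

# Three processing cycles: the PVC is created exactly once,
# while the job handler is retried until it finally succeeds.
for _ in range(3):
    try:
        run_once("create_pvc", create_pvc)
        run_once("create_job", create_job)
    except RuntimeError:
        pass

print(created)  # -> ['pvc', 'job']
```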
As I mentioned, if you fail to create something, it will be retried multiple times until you reach the limits — either by time, or by the number of retries, or by some other limitation. The picture shows four creation handlers, of which only one, the second, will eventually succeed after three attempts; the other three will fail. This is fine for a demo operator, but in your real operator you probably wouldn't like to fail — you would want to bring all the handlers to the final success.

Another high-level feature is timers: you can execute a function every few seconds — two, in this example — or, in some cases, every few seconds only if the object hasn't changed for some other number of seconds. For example, the second handler here will be triggered every two seconds, starting ten seconds after the last change. This is very convenient if you want to monitor systems outside of Kubernetes, or even inside of Kubernetes but not part of your operator: for example, if you want to query some deployments and their number of replicas, or if you have an external or internal application with its own API that you want to talk to and get the CPU load from. That is exactly when timers are handy.

And yet another thing, just to mention it, because it is also very handy in some cases: daemons. They are close to what timers do, but they run permanently. The point is that a timer is executed from time to time and is expected to exit after, say, a fraction of a second, or a few seconds in the worst case. Daemons can run permanently — for hours, days, maybe months if you are lucky — doing something: some calculations, some querying of data, whatever else. And you don't need to manage the threads or asyncio tasks for that; you just use the decorator and it runs, magically and automatically.
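The "every two seconds, but only after ten seconds of no changes" semantics can be captured as a small pure function. This is only a sketch of the scheduling decision, with made-up parameter names, not framework code:

```python
# Sketch of timer semantics: fire every `interval` seconds, but only
# once the object has been untouched for `idle` seconds.
def timer_due(now, last_fired, last_changed, interval=2.0, idle=10.0):
    return (now - last_changed) >= idle and (now - last_fired) >= interval

# Object last changed at t=0: nothing fires before t=10,
# then the timer fires every 2 seconds.
assert not timer_due(now=5.0, last_fired=0.0, last_changed=0.0)
assert timer_due(now=12.0, last_fired=0.0, last_changed=0.0)
assert not timer_due(now=13.0, last_fired=12.0, last_changed=0.0)
assert timer_due(now=14.0, last_fired=12.0, last_changed=0.0)
```

A daemon, by contrast, would not be scheduled this way at all: it is started once per object and simply keeps running until the object goes away.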
This is exactly the point where Kopf makes everything in Kubernetes easy: you just declare what you want — declare your intentions — and describe your domain logic, your business domain code, without any infrastructure. And since Kopf is intended to be friendly to humans, breakpoints work: a very important feature of modern IDEs. You want to put breakpoints and investigate what's happening in your operator, and since you can run the operator locally on your machine, you can do this not only in the terminal but also in the IDE. Whenever something happens, you can just put a breakpoint and see the available variables and their values.

On the right are the keyword arguments available to every single handler. Some are information about the object itself, and some are helpers from the framework: say, `memo`, to store temporary in-memory data of any kind, or a `patch` object, which allows you to patch the object, and so on. There is a documentation page on every one of them, so please go read it for more details.

And there are many more features implemented over the past year — actually a year and a half since Kopf was released for the first time: logging, custom authentication, configuration of the framework. The operator can also be embedded into other applications: if, for example, you have a UI application and you want it to monitor what's happening inside the cluster, you can just embed the operator as a component running in its own thread, and so on.

So these are the features. But what can we do with them? There are a few typical patterns for how operators are developed. It doesn't actually matter in which language — it can be Go, it can be Python, it could be, say, JavaScript if there were a framework — the patterns are the same.
The first and most obvious pattern: when you create a parent resource, here on the left, the operator automatically creates the child resources — a persistent volume claim and a job, which may optionally create some sub-children of its own. The operator can monitor their status, from time to time polling the status of the pods: has the job actually finished, was the execution successful or failed — and report something back on the original object that initiated this job, in its status field.

To write such an operator, you only have to write these lines of code. You react to the custom resource creation and create a child resource — only one child here, a job. On the other side, you monitor the pods — please notice, not the jobs but the pods, which are indirect children — that belong to this application, and you make some decisions on what's happening inside those pods and how to report back to the cluster. Please notice: only about 40 lines of source code, of which maybe 20 lines are the big YAML text, which is collapsed here.

Another pattern: if you can orchestrate everything inside of Kubernetes, why can't you orchestrate things outside of Kubernetes? Say SageMaker, say Spark, say a MySQL or Postgres somewhere out there, or whatever else. It's basically very close to the previous pattern, except that you have to query the status of those applications and report it back, which you can do with timers. In this example, every 10 seconds we check for something running outside — or simulate that with random values — and report the status on the object locally.

When we combine these two patterns, we get another very common pattern: application-specific operators. Say a Postgres operator, or in this case an Elasticsearch operator done by Zalando — which is not Kopf-based, sorry, but it illustrates the idea.
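At the heart of the parent/children pattern is a simple comparison: derive the set of children that *should* exist from the parent's spec, list the children that *do* exist, and act on the difference. A minimal sketch of that decision step (all names are made up):

```python
# Sketch of the parent/children pattern's core decision: compare the
# desired children (derived from the parent's spec) with the actual
# children (listed from the cluster), and decide what to create/delete.
def reconcile_children(desired, actual):
    to_create = sorted(set(desired) - set(actual))
    to_delete = sorted(set(actual) - set(desired))
    return to_create, to_delete

desired = {"pvc-ephemeral", "job-main"}     # what the parent spec implies
actual = {"job-main", "job-stale"}          # what exists in the cluster
print(reconcile_children(desired, actual))  # -> (['pvc-ephemeral'], ['job-stale'])
```

In a real operator the create/delete actions would then go through the Kubernetes API, and the results would be reported back into the parent's status field.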
You run the same kind of operator to create the child objects — child deployments, child pods — but in addition to that, you talk not to Kubernetes but to the application itself, through the application's own API, and you check whether it is alive, or what the CPU load is internally. If something is going wrong, you can just upscale or downscale the application. And here is a code snippet of how this can look. In the background it's the same application as in the first pattern, where we just created children; but in the foreground you can see that we first collect — or simulate — the CPU load, then make a decision on what to do with the pods, whether to upscale or downscale based on that load, and then change the child deployment accordingly. This is the whole operator needed for that, so the complexity is very low, as I said.

And yet another pattern, just to quickly mention it: cross-cluster communication. This is an interesting example. On the left, the user has access to cluster A only, and they can create the custom resources there — but there is no operator for them there. The point is that the operator runs in a totally different cluster, cluster B, on the right, and the two communicate through a little proxy with your authentication system — it can be OAuth2 or whatever else you have for authentication. The execution and the workload of the intention on the left, expressed by the user via a custom resource, is shifted to a different cluster, where the operator actually performs all the work. This is quite a complex and complicated setup — and what do you think, how many lines are needed for that? The answer: only these few lines, fewer than ten.
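The scaling decision in that application-specific pattern boils down to a small pure function: given the current replica count and the measured load, decide the new replica count. A sketch with made-up thresholds (the real operator would then patch the child deployment's replica count via the API):

```python
# Sketch of the upscale/downscale decision. Thresholds and bounds are
# invented for illustration; a real operator would tune them.
def decide_replicas(current, cpu_load, low=0.2, high=0.8, min_r=1, max_r=10):
    if cpu_load > high:
        return min(current + 1, max_r)   # overloaded: upscale
    if cpu_load < low:
        return max(current - 1, min_r)   # underused: downscale
    return current                       # load is fine: no change

print(decide_replicas(current=3, cpu_load=0.95))  # -> 4
print(decide_replicas(current=3, cpu_load=0.05))  # -> 2
print(decide_replicas(current=1, cpu_load=0.05))  # -> 1 (never below min)
```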
Because you can just inject your custom authentication: as long as you have a token, you can talk to a separate cluster instead of the local one. And all these patterns together bring us to a totally different way of looking at Kubernetes and at our applications. We can extend domain-driven design to Kubernetes and arrive at state-driven design — or, as I sometimes call it just for fun, desire-driven design. The users express their intentions — the desired state of the system — with YAML documents applied to the Kubernetes cluster, and the operator does the magic of reconciling, with its reconciliation loop, bringing the actual state to the desired state. And these actual and desired states are not necessarily states of Kubernetes itself, but of any application.

In this example, a user — a data scientist — asks for a forecast to exist. Initially it doesn't exist, but the operator does whatever it needs to, and what exactly happens there is totally not the user's concern. The operator can run a local job, can run a SageMaker job, can do the calculations directly inside the operator — but eventually it brings the actual state to the desired state by uploading a file to an S3 bucket. The user just notices that and says: voilà, my intention was satisfied — and they get their forecast.

These are only the few patterns that I see, which can be applied in operators, but I hope this is not all, and you can bring some new ideas of how operators can be used and which patterns can be implemented.

And this is not quite the end of the presentation — one or two minutes more. Do we have time? We have time. So, just to quickly mention it: there are three major tracks in the current roadmap. One is the technical track, where the new features will be implemented.
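The reconciliation idea behind this desire-driven design can be sketched abstractly: the operator loops, nudging the actual state toward the desired state, and *how* each step happens is hidden from the user. A toy sketch where the "state" is just a set of artifacts that should exist (all names are illustrative):

```python
# Sketch of the reconciliation loop: keep applying steps until the
# actual state matches the desired state. How a step works (a local
# job, SageMaker, an in-process calculation) is not the user's concern.
def reconcile(actual, desired, apply_step):
    while actual != desired:
        actual = apply_step(actual, desired)
    return actual

# Toy state: a set of artifacts that should exist (e.g. files in S3).
def create_one_missing(actual, desired):
    missing = desired - actual
    return actual | {next(iter(missing))}   # create one artifact per cycle

state = reconcile(set(), {"forecast-2020-10.csv"}, create_one_missing)
print(state)  # -> {'forecast-2020-10.csv'}
```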
I mentioned basically the same features a year ago at another conference, but this year didn't go well from the beginning, and some of the features were not implemented — but this is still the plan. And now, since the framework was forked recently, this August, I plan to build a community around it, or maybe join some other community — a community first of contributors, and also of users; some support for the developers who make Kopf-based operators. The third track is making all those YAML files easier, because YAML-driven development is sometimes really annoying: you sometimes have more YAML code than Python code.

To summarize: operators can be easy. They can be so easy that you can write ad-hoc operators for the here and now, without any reuse, and it wouldn't be a waste of investment. You can do this in Python, and you can orchestrate everything you need for your business domain, without the infrastructure. So please use Kopf, please spread the word, please share your operators — or at least some articles on how you make your operators, if you cannot share them directly. Here are the links; I have uploaded the slides already, so feel free to download them, and ask any questions on Twitter or in the issues.

And before we go to the questions: do we maybe have three to five minutes for a quick live demo? — Yes, there don't seem to be any questions yet; there was a small one about some URLs, but that seems to be resolved. You have about three to four minutes for a live demo. — Okay. So this is the operator which I mentioned at the very beginning, the simplest operator ever, with one additional line which we can comment out right now. And we want to have a cluster — I highly recommend k3d for any experimentation.
It's a very lightweight Kubernetes cluster which starts in something like 20 seconds, totally clean and from scratch, and I will show you how to configure the cluster and get to a running operator. This operator on the left is part of the repository — the examples folder, the first, minimal example. There are more advanced examples, like creating the child resources. Okay, let's take this one instead; it's more fun. Here we have a pod description, a YAML file, which runs a busybox and sleeps for whatever we have specified in the duration field of the custom resource. The operator adopts the pod, making it a child of the custom resource it is handling at the moment, and creates the pod with a client — in this case pykube-ng — and remembers some information.

So, fine, we have a cluster running. We can see the pods, maybe of some other application or of the system itself. What we need to do first is create the custom resource definition, which is this long YAML file with the names. Then we can run the operator in PyCharm, and most likely it will work — I hope so; at least it did 30 minutes ago. Yes: it has configured itself, it has authenticated with the cluster, and it's ready to run.

What we do now is apply the custom resource itself, with a spec and a duration of one minute — this is how long the pod will be sleeping. In some cases I type a one-letter alias instead of kubectl, but it's basically the same, just for my convenience. So here we do the `kubectl apply` with this object file — and please notice, the operator reacts immediately. It notices that it's a creation of the resource; it calls the `create_fn` handler; it does some low-level communication, which belongs to urllib3 in this case, not our own log lines; and when that's done, it puts some status on that resource.
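The "adoption" step in the demo — making the pod a child of the custom resource — amounts to inserting an ownerReference into the child's manifest, so Kubernetes garbage-collects the pod when the parent is deleted. A hand-written sketch of that idea (the manifests and the `adopt` helper here are illustrative examples, roughly what the framework's adoption does):

```python
# Sketch of "adopting" a child: add an ownerReference pointing at the
# parent custom resource. Manifests are hand-written examples.
def adopt(child, owner):
    ref = {
        "apiVersion": owner["apiVersion"],
        "kind": owner["kind"],
        "name": owner["metadata"]["name"],
        "uid": owner["metadata"]["uid"],
    }
    refs = child.setdefault("metadata", {}).setdefault("ownerReferences", [])
    refs.append(ref)
    return child

owner = {"apiVersion": "kopf.dev/v1", "kind": "KopfExample",
         "metadata": {"name": "kopf-example-1", "uid": "abc-123"}}
pod = {"apiVersion": "v1", "kind": "Pod",
       "metadata": {"name": "sleeper"},
       "spec": {"containers": [{"name": "main", "image": "busybox",
                                "command": ["sleep", "60"]}]}}

adopt(pod, owner)
print(pod["metadata"]["ownerReferences"][0]["uid"])  # -> abc-123
```

With the reference in place, deleting the parent custom resource makes Kubernetes clean up the sleeping pod automatically.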
So we can now see the pod in the namespace which we created for this demo, belonging to this custom resource. And we can also see the custom resource itself and its status. Here, on the custom resource, we can see the YAML representation of that object, with a lot of fields injected by Kubernetes itself, by kubectl, and by our own framework. The most important thing is that the status is updated in this last line, where we have remembered the UID of the pod that we created. So in the subsequent handlers, whenever something happens with these pods or with the custom resource, we can just recall the things we have created and talk to Kubernetes by referring to them by name, instead of querying "what pods belong to our application, please tell us" — we already know their names. So that's it: the operator is up and running, that easy. The whole code base for that fits on the screen with a proper zoom. That's it from my side. Now the questions, if there are any.