Okay, so welcome everyone. I'm Maísa Macedo and I'm here today with Sharca and Michal to present about Kubernetes controller and CRD patterns in Python. First we will go over a bit of what Kuryr is, how it works, and what the advantages of using it are. After that, Michal will go over the event-driven patterns in Kubernetes and how they're done in Python. Then Sharca and I will go over CRDs and finalizers, why we decided to support them, and the patterns that we use.

So, Kuryr is an OpenStack project that aims to provide networking for Kubernetes pods by relying on Neutron, and to implement Kubernetes services by relying on Octavia. Kuryr has two components. The first one is the kuryr-controller, which is the only component that interacts directly with OpenStack: it keeps watching events on the Kubernetes API and triggers actions on the OpenStack APIs, either Octavia or Neutron, and the data generated from the OpenStack resources is then saved on Kubernetes objects. The second component is kuryr-cni, which is responsible for providing the proper port binding for the pods.

Kuryr can provide a really great gain when double encapsulation of packets is avoided, basically when container workloads are running on top of OpenStack VMs. The benefits can be seen in the graph on the side: for a TCP stream between pods running on different hypervisors with ML2/OVN, the blue line is OpenShift SDN and the red one is Kuryr, and there is a considerable difference in throughput. Aside from those gains, Kuryr is also a good choice because it combines everything under just one SDN. It simplifies your networking stack, because the operator no longer needs to troubleshoot and debug two orchestration systems, just one now.

But today we are not going to cover details about how kuryr-cni works. We are going to focus on the kuryr-controller, watching the events and saving the data. And now I will hand it over to Michal.

Right, so I will talk about what the controller pattern in Kubernetes is and how we implemented it in Python in Kuryr. Controllers are pretty much a control loop. The distinguishing feature is that they are always watching for changes in resource status in the Kubernetes API, and based on the status of all those resources and their own logic, they either interact with some external systems or try to change the state of the resources in the Kubernetes API to match whatever logic they have implemented. So basically the goal is that the system state will converge to the system specification, which is embedded somewhere in the Kubernetes API or in the controller's logic.

So if you consider how the kuryr-controller works, it's basically a set of controllers. For any pod, we need to create or delete the Neutron port. For services, we need to create or delete Octavia load balancers, and endpoints are basically members of those load balancers. And a network policy is a security group in the Kubernetes world, so basically we create and delete Neutron security groups based on the network policies.
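As an aside, here is a minimal sketch of that control loop in Python, written against today's official `kubernetes` client. As we'll see in a moment, that client couldn't watch when Kuryr was created, so this is just the shape of the pattern, not Kuryr's actual code, and the `reconcile` function is a hypothetical placeholder.

```python
from kubernetes import client, config, watch

def reconcile(pod):
    # Compare the observed state with the desired one and act on the
    # difference, e.g. make sure a Neutron port exists for this pod.
    pass

def main():
    config.load_kube_config()  # or load_incluster_config() inside a pod
    v1 = client.CoreV1Api()
    # The watch yields a stream of events; event["type"] is one of
    # ADDED/MODIFIED/DELETED and event["object"] is the pod itself.
    for event in watch.Watch().stream(v1.list_pod_for_all_namespaces):
        reconcile(event["object"])

if __name__ == "__main__":
    main()
```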
So pretty simple, right? Well, if you consider how to implement that: if the kuryr-controller were written in Golang, it would be very simple, because there's client-go, which is maintained by the Kubernetes team and is simply the standard way to write this in Golang. But in Python there's nothing, at least there was nothing back in 2015 or 2016 when Kuryr was actually being created. So this is what we ended up writing in Python. I told you that it's simple, right? Turns out it's not that simple. So let's take a look at each of those components and go through their functions.

The Kubernetes Client is simply the piece that contacts the Kubernetes API directly, and it's been with us since 2016. Back then the Python Kubernetes client maintained by the Kubernetes community didn't support watching, so we had to implement that on our own. It basically implements the watch API in our case, and the watch API is a simple streaming API: we get a stream of events from the Kubernetes API as a simple JSON stream.

What's interesting is that it retries the connections. Normally we contact the Kubernetes API through an Octavia load balancer, and back in the Queens version of Octavia, connections were dropped after 50 seconds of inactivity, so we need to make sure that we retry. And we retry with the latest resource version that we've seen: each event has a resourceVersion, so we save the latest handled one and then retry the connection specifying that resourceVersion, so that Kubernetes knows it needs to also stream us the older events that we might have missed. An interesting part is that we noticed we were sometimes losing the connectivity silently, without the client really noticing, so we actually needed to implement dropping the connection ourselves after a timeout of inactivity. Well, that's basically what Octavia was doing, and it seems that was a feature, not a bug. A sketch of this watch-and-retry loop follows at the end of this section.

Okay, so the next component, that's the Watcher. The Watcher is a container watching multiple URLs in the Kubernetes API, and it handles threading: each endpoint, each URL, is watched in a separate green thread so that those won't interfere. It retries on a broader set of failures than just dropped connections. And it does reconciling, which is probably not the correct term, but basically the Watcher periodically fetches the full list of resources and puts it into the handlers at the end of the pipeline, just to make sure that even if we somehow lost some events, we will still figure out the correct state of the system.

Okay, the ControllerPipeline is very simple: it's just a chain of dispatchers and handlers. The dispatcher is directing stuff to the correct handler, and the handler is processing stuff. So that's easy. The LogExceptions handler just logs exceptions if they come up the stack this far. And the Async handler is where the magic happens. It divides handling into green threads based on the resource UID: for each instance of a resource, we get a separate queue processed in a separate green thread. So basically this means that each resource is processed separately, and later on you'll see that this is pretty useful with the Retry handler. Green threads are pretty self-explanatory; sketches of both the watch-and-retry loop and the Async handler follow.
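First, the promised watch-and-retry sketch, written against the raw Kubernetes watch API with `requests`. The URL, the timeouts and the error handling are illustrative rather than Kuryr's actual code, and authentication is omitted.

```python
import json
import requests

# Illustrative endpoint; real code would add a bearer token and CA bundle.
URL = "https://kubernetes.default:443/api/v1/pods"

def watch_forever(handle_event):
    resource_version = None
    while True:
        params = {"watch": "true"}
        if resource_version is not None:
            # Resume from the last handled event so nothing is missed.
            params["resourceVersion"] = resource_version
        try:
            # The read timeout doubles as the inactivity timeout: if the
            # stream goes silent (e.g. a load balancer dropped us without
            # closing the TCP connection), give up and reconnect.
            resp = requests.get(URL, params=params, stream=True,
                                timeout=(5, 60))
            resp.raise_for_status()
            for line in resp.iter_lines():
                if not line:
                    continue
                event = json.loads(line)
                if event["type"] == "ERROR":
                    # Typically 410 Gone: our resourceVersion is too old,
                    # so start over with a fresh list.
                    resource_version = None
                    break
                resource_version = (
                    event["object"]["metadata"]["resourceVersion"])
                handle_event(event)
        except requests.exceptions.RequestException:
            pass  # reconnect, retrying with the saved resourceVersion
```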
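And a sketch of the Async handler idea: one FIFO queue and one worker per resource UID, so events for the same object stay ordered while different objects don't block each other. Kuryr uses eventlet green threads; plain OS threads are used here to keep the sketch self-contained.

```python
import queue
import threading

class Async:
    """Fan events out to one FIFO queue plus one worker per resource UID."""

    def __init__(self, handler):
        self._handler = handler
        self._queues = {}

    def __call__(self, event):
        uid = event["object"]["metadata"]["uid"]
        q = self._queues.get(uid)
        if q is None:
            q = self._queues[uid] = queue.Queue()
            threading.Thread(target=self._worker, args=(q,),
                             daemon=True).start()
        q.put(event)

    def _worker(self, q):
        while True:
            # May block (e.g. in the Retry handler); only this one
            # resource's queue waits, every other resource keeps going.
            self._handler(q.get())
```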
So now the Retry handler. It handles a predefined list of exceptions and retries the events when one of those exceptions is caught. We have the special ResourceNotReady exception that we use in many cases: for example, when Neutron is slow activating the port that we are waiting for, we just raise that exception so that we won't wait in the handler itself but in the Retry handler, and with exponential backoff it will retry the event until it's correctly processed. And it is blocking, but that's not a problem: because the Async handler divides the processing of each resource into its own green thread, we are guaranteed that even if we block, we will just retry the event and then proceed with the queue of next events normally, and in order. And in order is pretty important, because for example we don't want to process a delete event and then, after that delete event, process some other events that consider the resource to still exist.

Okay, the dispatcher is pretty simple: it decides which method of which handler to call. We have on_present, on_deleted and on_finalize. The first two are pretty self-explanatory; the third one will be explained by Maísa a little later on. And then the real work: those are the handlers that are actually doing the hard work of creating the OpenStack resources and so on. This is basically how the pipeline looks in the kuryr-controller; a combined sketch of the Retry handler and the dispatcher closes this section. I will now hand over to Sharca, and Sharca will explain CRDs and how we use them to save data.

So if we want to talk about custom resource definitions, CRDs, we need to start with custom resources, which are extensions of the Kubernetes API. A resource is an endpoint in the Kubernetes API that stores a collection of API objects of a certain kind. Custom resources let you store and retrieve structured data. The CustomResourceDefinition API resource allows you to define custom resources: when you create a new CRD, the Kubernetes API server creates a new RESTful resource path for each version you specify. That means that defining a CRD object creates a new custom resource with the name and schema that you specify. The relation between the CRD and its objects is similar to classes and instances in object-oriented languages: the CRD is a template, and after defining it, you as a user can create custom resources with the structure predefined in the CRD.

At the beginning of this presentation we talked about what Kuryr does; now we will talk about why we moved to custom resource definitions. And that's basically because we need to save data about OpenStack resources. For example, we need to know which Neutron port matches which Kubernetes pod. Furthermore, we have to make sure the OpenStack resources are cleaned up when the Kubernetes native resources are gone.

So what had we been using before CRDs? We had been using annotations. That was because CRDs were not there yet, and third-party resources were kind of limited. Annotations are key-value string pairs, and they're tied to the Kubernetes native resources. So now the question is why we abandoned them. One of the most important reasons was improving readability and debugging. In the picture you can see how the annotations looked, and as you can see, it is really difficult to find any information or to debug something; it's really hard to read. And next there's an example of the CRD for the same data, and I think you will agree with us that this is way better to read than the annotations.

Another reason is that with CRDs the data is not tied to the native Kubernetes resource anymore. That basically means that if the pod is gone, it doesn't mean that the Neutron port associated with it is gone as well; on the other hand, the port will be gone when the KuryrPort CRD object is removed. Last but not least, CRDs are the more Kubernetes-like pattern.
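Here is the promised combined sketch of the Retry handler and the dispatcher. The class names, the backoff numbers and the attempt limit are illustrative, not Kuryr's exact implementation.

```python
import time

class ResourceNotReady(Exception):
    """Raised by a handler when e.g. Neutron hasn't activated a port yet."""

class Retry:
    """Retry the wrapped handler with exponential backoff on known errors."""

    def __init__(self, handler, exceptions=(ResourceNotReady,), attempts=8):
        self._handler = handler
        self._exceptions = exceptions
        self._attempts = attempts

    def __call__(self, event):
        delay = 1
        for attempt in range(self._attempts):
            try:
                return self._handler(event)
            except self._exceptions:
                if attempt == self._attempts - 1:
                    raise
                # Blocking here is fine: thanks to the Async handler each
                # resource has its own queue, so we only delay this one.
                time.sleep(delay)
                delay = min(delay * 2, 60)

class Dispatch:
    """Route an event to the right method of the resource handler."""

    def __init__(self, handler):
        self._handler = handler

    def __call__(self, event):
        obj = event["object"]
        if event["type"] == "DELETED":
            self._handler.on_deleted(obj)
        elif obj["metadata"].get("deletionTimestamp"):
            # Marked for deletion but held back by finalizers.
            self._handler.on_finalize(obj)
        else:
            self._handler.on_present(obj)
```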
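To give a feel for what working with these CRD objects looks like from Python, here's a sketch using the official client's CustomObjectsApi. The group, version, plural and object name below are illustrative rather than Kuryr's exact API.

```python
from kubernetes import client, config

config.load_kube_config()
co = client.CustomObjectsApi()

# Fetch the custom resource that stores the Neutron-side data for a pod.
kuryrport = co.get_namespaced_custom_object(
    group="openstack.org", version="v1", namespace="default",
    plural="kuryrports", name="my-pod")

# The structured data that used to live in one giant annotation:
print(kuryrport["spec"])
print(kuryrport.get("status", {}))
```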
Now we will move to Maísa, and she's going to tell you more about the patterns that we use with the CRDs, which will help you better understand why we moved to them and how it led us to split the work into smaller, specialized controllers.

Okay, so a few patterns that we included in our custom resource definitions. Basically, we needed to define a new API version in order to handle the new CRD objects that we are creating. One of the resource types that we added was the KuryrLoadBalancer, which basically points to a Service, and we also have the spec and the status fields. Those fields make it easier for the controller to sync between the current state of the system and the desired state. And we also have the finalizers on the metadata. Basically, finalizers are a list of strings that you define in order to avoid a hard deletion of the resource: in order to remove a CRD object that has finalizers defined, you would need to remove all the finalizers first, otherwise the deletion would be blocked.

So how do we handle the addition and removal of finalizers in Kuryr? When a deletion of a resource that has finalizers defined on it happens, the first thing Kubernetes does is set a deletion timestamp on it, and when that deletion timestamp is included, it comes along as a modification event. Kuryr keeps watching for those events, for any kind of event actually. If the resource comes with no deletion timestamp defined, Kuryr will treat it as an addition of that resource: it will first add the finalizer to the resource, here in this case the Service in our example; after that, the finalizer also gets added to the custom resource; and lastly, Kuryr will trigger the creation of all the load balancer resources that are needed to properly handle that event. But if the resource that comes with the event has a deletion timestamp defined, Kuryr will remove all the load balancer resources first; after that, it will clean up the custom resource by removing its finalizer, allowing Kubernetes to delete that custom resource; and lastly, it will remove the finalizer from the Service, so the Service can also get deleted. A sketch of this flow follows at the end.

Aside from the scenario shown in this sequence diagram, it's important to notice that whenever the kuryr-controller restarts, the whole list of resources existing in the cluster will be gathered again and watched by Kuryr. This means that if an object comes with a deletion timestamp, the event will proceed with the regular deletion of all the resources that are handled by that object. So the event will not get lost if the controller restarts.

And I guess that's all. Thank you very much, and if you have any questions, please let us know.
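To close, a sketch of the finalizer flow just described, on the Service side. It assumes the official Python client, the finalizer name is made up, and a JSON Patch body (a list) is used so that removing an entry actually replaces the finalizers list instead of merging into it.

```python
from kubernetes import client, config

FINALIZER = "kuryr.openstack.org/service-finalizer"  # illustrative name

config.load_kube_config()
v1 = client.CoreV1Api()

def _set_finalizers(svc, finalizers):
    # A list body makes the client send a JSON Patch; "add" on an existing
    # member replaces the whole finalizers list, and also works when the
    # list doesn't exist yet.
    v1.patch_namespaced_service(
        svc.metadata.name, svc.metadata.namespace,
        [{"op": "add", "path": "/metadata/finalizers",
          "value": finalizers}])

def on_service_event(svc):
    finalizers = svc.metadata.finalizers or []
    if svc.metadata.deletion_timestamp is None:
        if FINALIZER not in finalizers:
            # Addition: block hard deletion until we've cleaned up, then
            # go create the Octavia load balancer resources.
            _set_finalizers(svc, finalizers + [FINALIZER])
    else:
        # Deletion requested: remove the load balancer resources first,
        # then drop our finalizer so Kubernetes can finish the deletion.
        _set_finalizers(svc, [f for f in finalizers if f != FINALIZER])
```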