Okay. Hello everyone, I hope you are all doing great. In today's session we will be discussing building operators like a ninja with Kubernetes operators. Let me just share my screen. Is my screen visible? Okay, cool. So this is the topic for today. Myself: my name is Avni Sharma, I am a software engineer at Red Hat, and my day-to-day work revolves around Kubernetes, OpenShift, and operators. I work on the developer tools team for OpenShift, and along with me I have my colleague Akash, who is a software engineer as well; we work together on the same project. So what is our goal with the talk today? Our goal is to help you build your very first operator. It is a very beginner-oriented session: we will take you through why we need an operator and when we should plan on adopting one, and then Akash will do a live demo of building an operator. So what are operators? At the very onset of the talk, when we say operator, I would like you to visualize a human being who is really adept, who has very deep knowledge of the systems they work on, who knows how those systems behave and how to react if there are any problems. So when I say operator, think of a human operator who does manual tasks day to day, someone who is really skilled and knows the application domain they work in. The operator is basically a pattern, and the word was coined by CoreOS. Now consider the stateless application scenario in Kubernetes today. For a typical 12-factor web app we can do the following: our pods can survive crashes and node failures through a replication controller, we can easily scale up and down, and we can internally load-balance traffic to instances via Services.
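As a concrete sketch of that stateless story: a Deployment (which manages replicas much like the replication controller mentioned here) plus a Service. The names and image below are illustrative placeholders, not anything from the talk:

```yaml
# A minimal stateless app: the Deployment keeps 3 replicas alive across
# crashes and node failures; the Service load-balances traffic to them.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                 # scale up/down by changing this number
  selector:
    matchLabels: {app: web}
  template:
    metadata:
      labels: {app: web}
    spec:
      containers:
        - name: web
          image: nginx:1.21   # stand-in for any 12-factor app image
          ports: [{containerPort: 80}]
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector: {app: web}
  ports: [{port: 80, targetPort: 80}]
```

Because no pod holds state, Kubernetes can reschedule or replace any replica freely, which is exactly the property that breaks down for the stateful case discussed next.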
So currently in Kubernetes, creating and managing a stateless application is really seamless and smooth. But what happens when we have a database, or, on a broader spectrum, anything stateful? It can be a database, a message queue, a cache, anything that has state attached to it. We have some legitimate concerns around that: instances need to stay with their data, and scaling may not be as simple as it was for stateless applications on Kubernetes. As we all know, special knowledge is required to manage a database, and it cannot be rescheduled on just any host the way stateless services can, so it is pretty cumbersome. For a stateful application I have taken the example of an etcd database, and it needs the following actions: create and destroy the etcd cluster, resize, backup, and upgrade. So, recapping the human-operator image we discussed and bringing it into the Kubernetes world, we can say that an operator extends the Kubernetes API to create, configure, and manage instances of a complex stateful application. Why do we say complex stateful application? Because specific application-domain knowledge is encapsulated in it. We build the operator upon the basic Kubernetes concepts of resources and controllers, and then add our own logic for our complex stateful application, and this is known as an operator.
Now I have divided the definition into three important parts. The first is custom resource definitions. Today we can do kubectl get pods or kubectl get deployments, but suppose I want to do kubectl get hello, something arbitrary that I have made and that is not available in native Kubernetes out of the box. That is my resource, a custom resource. And when I talk about a custom resource, I also have to talk about a controller, because once we have our resource we need certain actions performed against it, and since it runs our logic we call it a custom controller. Of course the custom controller has domain knowledge, since it is custom, and that domain knowledge can be anything: install, cleanup, self-heal, restore, backup, update. When I say I need a custom resource, it basically means I want the whole configuration of my resource in one place, declared in one YAML, and I want actions happening based on that YAML. Before we discuss CRDs, let me show what an API request looks like for a native Kubernetes resource. When I do kubectl get pods, pods is a resource, and the typical request path looks like /api/v1/pods: pods is the resource, v1 is the version, and the group is the core group, which is left implicit in the path. So to make my own custom API request, I definitely need a group, a version, and a resource name, and this is effectively how we extend the Kubernetes API beyond what is available in a default Kubernetes installation. As I said, we call these CRDs, custom resource definitions, and a CRD is basically a way to tell Kubernetes that my custom resource is going to have a particular format. The CRD is the blueprint, and a custom resource is an instance of that blueprint.
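To make the blueprint/instance distinction concrete, here is a minimal sketch of a CRD and a matching CR, in the apiextensions v1beta1 form the talk refers to. The group example.com and the field names are illustrative placeholders, not the exact values from the slides:

```yaml
# The CRD: the blueprint that registers the custom "Hello" kind
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: hellos.example.com      # must be <plural>.<group>
spec:
  group: example.com
  versions:
    - name: v1alpha1
      served: true
      storage: true
  scope: Namespaced
  names:
    kind: Hello
    plural: hellos              # enables `kubectl get hellos`
    singular: hello
---
# A CR: an instance of that blueprint; spec carries the desired state
apiVersion: example.com/v1alpha1
kind: Hello
metadata:
  name: example-hello
spec:
  size: 3
```

Once the CRD is applied, the API server serves requests for the new resource at /apis/example.com/v1alpha1/hellos, exactly parallel to /api/v1/pods for the built-in type.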
So this is a snapshot of a CRD example; it does not cover the whole YAML file, I have just taken a snippet of it. As you can see, I have taken a very arbitrary name, hello, and under spec I have defined my group and version, customized to my needs, and the kind is Hello; just as we have kind Pod, I have given kind Hello. Beneath that you can also see plural and singular names: just as you can do kubectl get pods, the plural, I can similarly do kubectl get hellos. I can register my custom API in Kubernetes because of this very cool feature known as a custom resource definition; if you look at the kind on the second line from the top, you can see it is CustomResourceDefinition, and it is at the v1beta1 stage. So that is what a typical CRD YAML looks like. Now we have the blueprint, the format; we need to create an instance of that format. I have the group I described earlier, I have my version, and my CR name, my instance name, would be example-hello. Now look at the spec: the spec specifies the desired state of the cluster, "this is what I need", and that is what you define in your instance's spec. Just summarizing: we have a CRD, so very analogously to kubectl get pods I can do kubectl get hello, and just as I can have an instance of Pod, say example-pod, I can have an instance of Hello. So now I have my instance and I apply it to my cluster, but what happens now? I have applied my instance and nothing is happening around it. For that we need a controller, and that controller has the domain knowledge that we put into it; this whole arrangement is a custom controller. A custom resource needs a controller to act upon its presence inside the cluster, and such controllers are called custom controllers. So this is the current state of the cluster, and we need to reach the desired state; that is the whole aim
of the controller. This is the very basic loop in Kubernetes, and it works in three major steps: observe, analyze, and act. Observe means observing the current state of the cluster. Analyze means comparing the current state to the desired state, where the desired state is what our custom resource defines; I showed you the spec that described our custom needs, say size 3, such-and-such a name, and so on. Once I know my current state and where I need to reach, I can act upon it: spin up more pods, delete pods, anything. And you see this is a loop. Whenever there is any CRUD event, whether it is a create, a kubectl edit, or a kubectl delete, an event happens, and this loop is event-driven: the event triggers observe, then analyze, then act, in sequence. The events drive the whole loop, and we have a nickname for it that we will use from now on: the reconcile loop. So when should you opt for an operator? You should consider these points when deciding whether you should even think of building one, and go ahead if these boxes are ticked: you want an application that uses a declarative API; you want first-class support from kubectl; you want basic CRUD actions on your instances, that is, on your custom resources; and you need the whole reconcile loop, which is essentially built-in automation for watches and the whole event-handling scenario, where that event handling is governed by the business logic encapsulated in your custom controller. While it might seem really tempting to just dive in and build an operator, it can be really challenging, because without a framework you have to write a lot of boilerplate code from scratch. It can be really tedious: you need advanced knowledge of informers, good testing,
scaffolding, a good repo organization, and research into the tools that interact with the API. So it can be very cumbersome and intimidating, and we definitely need an easier way to create and manage operators. You might think it is time to freak out, but hold on tight, because we have a solution for that: the Operator Framework, which is an open source project. The Operator Framework has three important parts: the Operator SDK, the Operator Lifecycle Manager, and Operator Metering. We will be touching on the Operator SDK, which is a framework that lets you build your operator, and it is really easy to use. Today we will concentrate just on building an operator. OLM, the Operator Lifecycle Manager, is a way to distribute your operators and manage their lifecycle; install, upgrade, update, all of that is governed by OLM. Operator Metering is related to metrics. We won't be touching on those two; you can refer to the link mentioned here on the slide. Within the Operator SDK we can develop operators in three ways: Go, Ansible, and Helm. Today we will go through building operators with Go. So this is the demo workflow; here I am concentrating on the reconcile logic. First we have observe, where we are watching API requests. This is my custom resource: I apply it, and once applied it is in the cluster. Now I analyze it: I check the size and the version mentioned in the spec of my custom resource, which is my desired state. Currently I have nothing in the cluster, so the current size and current version are none; the desired count is three and the desired version is also specified. Now I see that the current size, since I have nothing, is less than the desired size, and the current version is not equal to my desired version. I have analyzed it; now I act upon it. I can either scale up pods
depending on the size, scale down, or change image tags, and a more advanced piece of custom logic would be to place each pod on a separate node. This is all custom logic; you can act however you like. The other events can be kubectl edit and kubectl delete, and all of these events eventually retrigger these steps. That is how the reconcile loop works. So it's demo time, and I would like to hand the baton over to Akash now. Hi everyone. Avni has given us the theoretical background on how we can use the Operator SDK to perform automation on our cluster. In this demo we are going to create a sample etcd operator. What will this operator do? It will help us deploy an etcd cluster inside our Kubernetes cluster; we can scale up, we can scale down, and we can also destroy the cluster if we want, and all of these operations can be performed by the operator itself. So let's go ahead and start. First we need to create a repo, we need scaffolding, and the Operator SDK provides a way to do that. This is the Operator SDK, and you can see I have it installed. We run operator-sdk new, the command to create a new operator, and we pass two arguments: one is the operator name, the other is the repo path, which is where we provide our GitHub repo link so that the import paths are aligned. Let's go ahead and create the operator. Okay, I already have it, so I'll just delete it and create it again using the same command. You can see it has been creating some Go files, a set of files which are required to run an operator, so we can go ahead and look into that. I can show you the tree: it has created nine different directories, some related to the build, and there are some YAML files required by Kubernetes to set up our
operator, and some Go files. Right now we can see we don't have anything inside our pkg directory, just an empty apis file, and we don't have anything in our controller either. First, as Avni mentioned, we need to have our custom resource defined in our Kubernetes cluster. How do we do that? There is another command, operator-sdk add api, and this is where we pass the kind. Kind is nothing but the name of the custom resource, the way we have Pod, Deployment, or ReplicaSet in Kubernetes; we are creating our custom resource with this name, and we also provide a version and group. As soon as we do that, it creates another couple of files: we can see it has created some files under the pkg directory and the deploy directory. Let's look at the pkg directory: here we can see it has created this version and the types, the Go equivalent of the custom resource which we are going to create. So now we have an API, but we need a custom controller which acts on the changes we make to the CR: basically, if we create a CR, update a CR, or delete a CR, our controller should get triggered. Right now we don't have any controller defined, so let's go ahead and create one. To create a controller we again use operator-sdk, this time add controller, and we pass it parameters: the same kind and api-version we passed in the previous command. This is where we say that we want to watch this particular resource, with this version, in our Kubernetes cluster, and the Operator SDK is going to generate a controller code base for us which already watches these resources. Here we can see it has created some controller files with some boilerplate code. So I will not go into these
created files yet, because we still have to add our custom business logic to them, and we also have to define our CR spec. In my notes here you can see our CR will look something like this: it has a spec with size and version. Size is basically the number of pods we want for that etcd cluster, and version is nothing but the etcd version the pods will get created with. The idea is that as soon as we create this CR, we should have an etcd cluster ready with that number of pods. So what we are going to do in the custom controller is create a Deployment based on the CR's values. Let me go ahead and open the already created operator and show you the code base. First, these are the directories which got generated by the Operator SDK. I'll go into the apis directory, and you can see the API here: this is the API struct we created. In the spec I have already defined size and version; we want this data inside our spec field, so whenever we create this CR those fields get automatically mapped to this particular Go struct, which makes it easier for our controller to fetch the values and build the Deployment YAML based on the values we provided in the CR. We also have something called status: the status reports the state of the current CR, so whenever we create a CR and our controller performs some operation, as output we should know whether there were any errors and what the status of that particular CR operation is. So this is what the API looks like. After that, let me show you the code of the controller itself. This is the code, and most of it is auto-generated. The first thing is that we need to watch a
particular resource, and that has been done in the add function here: we watch this particular resource, and we can also pass some predicates indicating whether we want to trigger the reconcile loop on create, on delete, or on certain update fields. That's up to us; this watch is completely customizable. After that, whenever changes are detected by the operator, it triggers the reconcile loop, which is the main backbone of the custom controller. We receive the CR in the Reconcile function and we can perform whatever business logic we want. In our example we want Deployments created for the specified size and version, and that is what we are doing here. Let me jump directly to the code where we create Deployments; this function gets called internally somewhere in the reconcile loop. Looking at the create: what we do is simply build the Go struct equivalent of a Deployment YAML, and there are two values we specifically set. One is the size: here, instance is nothing but the CR which we pass to this function, and we read the size value, which is nothing but the pod count. The other is that we need a proper etcd image to deploy: I'm getting the etcd image from the Bitnami repo and substituting in the instance's version. So whenever we pass, as in this CR, 3.4.9, it is going to pull the bitnami/etcd:3.4.9 image. That is how it will work. In Reconcile we will create this Deployment, and we'll have extra business logic for what to do when there is an update and what to do when there is a deletion. In case of deletion we simply delete the Deployment, and in case of
an update, we simply patch the Deployment. So that is all about the controller; now let's go ahead and run this operator. To run the operator locally, we first make sure that our kubectl is pointing to the correct cluster, so I'll just check if my nodes are up. Right now I have a 1.18.2 Kubernetes cluster. To run the operator against that cluster we do operator-sdk run --local. What this does is basically start a Go process, and client-go will be pointing at the kubectl config. Before starting this, though, we need to make sure that we have the CRDs registered, because our cluster has no idea about an EtcdCluster CR. Inside deploy, in crds, we have this particular YAML file generated by the Operator SDK; it contains the whole schema of our CRD: what fields we have, the OpenAPI schema validation, everything will be inside it. Let's go ahead and create this CRD; now it is registered, and we can see it was created. Now let's go to the root and run operator-sdk run --local. The operator is now running against our cluster, so we should be able to perform any operation which we just discussed, like creating and managing the cluster. I'll just go into the directory; we already have an example here, and I'll show you the example CR. It specifies the apiVersion and kind, so it is just an ordinary Kubernetes YAML, but the kind is EtcdCluster, and it has the two specific spec fields which we defined in our API: one of them is size and the other is version. We have specified the default value, three, and the etcd version we want the cluster to run with. Before creating it, I'll just watch the Kubernetes Deployments here, because we should see the Deployments come up with the right size; we can
see here that right now we don't have any resources. So let's go ahead and create this example. Here we can see our reconcile loop got triggered as soon as we created this CR with the etcd API, and we can see it has already created a Deployment; the image should be pulling now, and you can also see the pods with their containers creating. So what we have done is create a simple CR with particular values, and it has created an etcd cluster for us with that particular version. Now let's say we want to scale up the cluster: we don't want only three nodes, we want more nodes running for this cluster. All we need to do is edit that particular CR: kubectl edit etcdclusters plus the name. I'll just set my editor so it is easier to edit, go straight down to the spec field, and instead of three let's say six, and I'll save the file. Internally it will apply the change, and we can see the reconcile loop has been triggered and the Deployment size has changed. So it is easy to scale up or scale down, and in the same way we can scale it back down to three: I'll head to the size, change six back to three, and apply it; internally it patches that YAML, and we can see the Deployment has been changed back to three. Now let's say we no longer want our cluster; all we need to do is delete the CR. Whenever we delete the CR, the Deployment also gets deleted, because we have set the owner reference of this Deployment to the CR which we created, so as soon as we delete the CR all the dependent resources get cleaned up from the cluster. Let's go ahead and do that: we delete etcdclusters and the example cluster, the cluster is deleted, and you can see the Deployment is gone.
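The scale-up and scale-down behaviour we just watched is the observe-analyze-act loop from earlier in action. Stripped of all the Kubernetes machinery, one pass of it reduces to a sketch like this; the types, function names, and messages are illustrative stand-ins, not the Operator SDK's actual API:

```go
package main

import "fmt"

// DesiredState stands in for the custom resource's spec (e.g. spec.size).
type DesiredState struct{ Size int }

// Cluster is a toy stand-in for the real cluster state the operator observes.
type Cluster struct{ Pods int }

// Reconcile runs one observe-analyze-act pass and reports what it did.
func Reconcile(c *Cluster, want DesiredState) string {
	current := c.Pods // observe: read the current state
	switch {          // analyze: compare current state to desired state
	case current < want.Size:
		c.Pods = want.Size // act: scale up
		return fmt.Sprintf("scaled up %d -> %d", current, want.Size)
	case current > want.Size:
		c.Pods = want.Size // act: scale down
		return fmt.Sprintf("scaled down %d -> %d", current, want.Size)
	default:
		return "nothing to do" // already at the desired state
	}
}

func main() {
	c := &Cluster{Pods: 0}
	fmt.Println(Reconcile(c, DesiredState{Size: 3})) // scaled up 0 -> 3
	fmt.Println(Reconcile(c, DesiredState{Size: 3})) // nothing to do
	fmt.Println(Reconcile(c, DesiredState{Size: 6})) // scaled up 3 -> 6
}
```

Each kubectl edit or delete in the demo is just an event that retriggers this pass, which is why editing spec.size from three to six was enough to resize the Deployment.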
So our cluster is cleaned up now. This is how we can use the Operator SDK to manage a small application or any complex application; it's up to us and up to the business, and we can have as much complex business logic as we want inside our reconcile loop, inside our custom controller. That is all about the demo, so Avni will now give us a recap of what we did here. Back to you, Avni. I hope my slides are visible. A quick demo recap; we have little time left, so I will rush through it so that we can take questions. What we got with the Operator SDK was awesome boilerplate code. The other thing we saw Akash doing was defining the spec of the CR in Go, and you can see it is very clearly laid out as EtcdClusterSpec; it is easy to locate where to add things, and the status part can be added there as well. That is the types.go file, so defining APIs is pretty easy now. The other thing is defining custom logic in the Reconcile function; we have the etcdcluster_controller.go file under controller/etcdcluster. It's like our life is sorted now: everything with one framework, in a very seamless manner. I would also like to mention OperatorHub, which is like an app store of operators: a registry where you too can contribute your operator, or use any available operator which is already built and out there in the community, and operators from OperatorHub can be installed in any Kubernetes cluster. This is OperatorHub in OpenShift, and you can see there is an operator here, for example the Service Binding Operator; you just click on it, click Install, and there you go, your operator is up and running in the cluster. At the end of the presentation I have mentioned some links in the PDF, and it will be on Sched as well; I have uploaded the presentation, so you can take a look at all these relevant references, and under the demo example you will find the
example that Akash ran, the etcd demo operator. You can ask us questions there via issues or drop us an email, and just take a look at the example; there is community support around it too. You can visit the Kubernetes Slack workspace and the kubernetes-operators channel, or hang out in the Google Groups discussion, and get your questions clarified there as well. So this is amazing: now we can all build our own operators. How cool is that? Thank you all for attending today's session, and we are open to questions. While we are taking questions, I would urge you all to scan this QR code and drop us feedback, plus any questions you feel were not answered because we are running short of time. Any questions? I think we have a couple of questions in the chat itself. So: the operator is demoed locally, and the client is basically pointing to the remote Kubernetes cluster, so it is not actually deployed on the Kubernetes cluster. If you want to deploy it, the Operator SDK provides a way to put the whole operator in an image and then deploy it, and there is also OperatorHub, which is a kind of marketplace for operators: you can have your operator on OperatorHub and then install it on any cluster. Are there any plans to support other languages in the Operator SDK besides Go? I don't think so. Right now we support Go primarily because the whole Kubernetes ecosystem uses Go, so we have all the dependent packages, and it is very easy to write an operator because most of the code is already written. But I think there are a couple of open source projects still trying to make use of the controller runtimes in different languages; I think Python is one of them, if I'm not wrong. Just to add to that, I have come across Python client libraries, like we have client
go, and some Java client libraries, so there are client libraries out there for making custom controllers; however, I'm not sure about any framework around them. So we do have client libraries, and you won't be deadlocked due to any language constraints, you can still go ahead, but I'm not sure about a framework being built around them. Also, we have a way to build an operator with Helm, but using Helm we can only create different resources; having real business logic is difficult with Helm, so Go is the better way if you want to write an operator. What are your use cases for using an operator? At Red Hat we have the Dev Console operator, and a couple of other operators; for all the functionality that needs to sit deep inside the OpenShift cluster, we have an operator. For example, we have something called the Dev Console, where we can just provide a Git URL, and as soon as we provide it our code gets deployed without any issue: it gets compiled, the images get built and deployed to Quay or any other Docker registry, and everything works seamlessly. That is something we are working on at Red Hat. We also have a Service Binding Operator, which I was recently working on. What the Service Binding Operator does is take applications and services, where a service can be any backing service, any database, Kafka, anything, and it connects your application with that particular backing service without you having to know about it; it all happens in the background. That's an operator use case we are contributing to, and you can explore it on OperatorHub. I can add a few more points about the Service Binding Operator: you can basically connect two different applications in Kubernetes without doing any extra configuration. For example, if you have an etcd cluster and another Node.js application which
uses that etcd cluster, you just have to create Deployments for these two different things, the database and the application, and then create a ServiceBinding CR specifying that you want to connect this etcd cluster to this Node.js application. Our Service Binding Operator will handle figuring out all the required metadata, such as host URL, username, and password, and it will bind it into the Node.js pod. That is the use case we have. I guess my slides were not shared at the end, my screen was not shared, so you can take a look at the slides again; I have uploaded them to Sched as well. Okay, there is one more unanswered question: how does the status help in your reconciler? Right now I did not show you the status, but in the status, when everything goes right, we just print a message that the etcd cluster has been deployed with this number of pods, or nodes, and with this etcd cluster version; and if it fails, we append the error, so that it is easier to debug as a Kubernetes admin. And we have another question: is your operator demo running locally or within the Kubernetes cluster? Sorry, which one? "Is your operator demo running locally or within the Kubernetes cluster?" I touched on that, but I'll repeat it again. Basically the operator is running locally as a Go process, so it is demoed entirely locally, but when we run the operator code it actually connects to the remote Kubernetes cluster using the kubeconfig file or whatever auth mechanism we have in our local setup, on our local system. There's one more: why do you choose to create an operator? So we mostly work on Kubernetes or containerized applications, and there should be a proper way to automate such processes; since Kubernetes is expanding more and more, we need a proper mechanism to get our custom applications deployed
in Kubernetes. With a custom controller we have complete control in our hands: we can actually control the lifecycle of the CR, and I think that is very important when we build a Kubernetes application. The Operator SDK provides good scaffolding, it has great support, and you can do a lot with just a few commands: you get the API code generated, the controller code generated. I think that is the reason we chose to create an operator. If I can elaborate on that: suppose, in the case of a sharded database, I know how I want the sharding done on my database, I want certain parts of it on one node and others on another, and that is something only I know; nobody else will come and shard my data. That is something really niche, something only I operate, so I would encode such custom logic, sharding the database onto particular nodes in a set manner, because I know where everything should go. Also certain manual exercises, like wanting to SSH to a server after a particular event happens, anything you do manually and know well because you operate it on a daily basis and it can be automated, you should go forward with an operator for that. Of course, if you have a very good configuration file, good Secrets or ConfigMaps, and the configuration is settled, you can use StatefulSets as well, but then it becomes a whole orchestra of Services, StatefulSets, PVCs, volumes. I want the whole configuration in one YAML, and I want to make it really convenient for myself: the database, the StatefulSet, everything, the whole desired state, in one place. That makes it really easy. I guess that's it, and if you all have any other questions you can reach out to us on Slack or Twitter. Thank you for joining!