Hi everyone, and thank you for coming to our presentation. This is SOS: Simplifying Operators Substantially. To start off, we're going to do a bit of an introduction. My name is Manal. I'm an Associate Software Engineer at Red Hat. I work on the Ecosystem Experience Engineering team, specifically on the Operator Enablement team. Before software engineering, I was an English teacher in South Korea, and I graduated with my English degree from the University of Minnesota. I decided I didn't want to be an English teacher anymore, so I went to App Academy, which is a coding boot camp, and that's how I landed my job here at Red Hat. Next is my partner.

Hello, everyone. I'm Sandhya, working as a Software Engineer at Red Hat on the Operator Enablement team. I'm an international student from India. I came to the U.S. to pursue my Master's in Computer Science, graduated from the University of Cincinnati, and then landed a job as a software developer here at Red Hat.

Okay, let's start our presentation: SOS, Simplifying Operators Substantially. We're going to be talking about the 5 Ws and 1 H of operators. To touch on how we came to this topic: a couple of months ago, we went to Cloud Native Rejekts and gave a presentation on an operator we were creating called the L5 operator. It was an operator that fulfilled all five capability levels of an operator. We'll talk about that later on. After that presentation, we were constantly hit with the question, why operators? And to really understand why operators, we have to touch on what operators are. So that's what we're going to be talking about in this presentation. It's going to be a very beginner-level course. We're going to talk about everything at a very high level. There's definitely more to operators, but this will be a starter course on what operators are.
So, the 5 Ws and 1 H of operators: Who created operators? What are operators? Why are operators valuable? Where do operators thrive? When should we use operators? And how can we create our own operators? At the end of this presentation, we'll also do a little demo, and we'll provide you with some operator exercises that you can mess around with and experiment with on your own to learn how to create operators. Then we'll also give you some resources that helped us during our learning journey in creating operators.

So, who created operators? Brandon Philips. He is the one who first coined the term operators. He was the CTO of CoreOS. In 2016, he wrote an article called "Introducing Operators: Putting Operational Knowledge into Software." He writes that a Site Reliability Engineer is a person who operates an application by writing software. They're an engineer and a developer who knows how to develop software specifically for a particular application domain. The resulting piece of software has an application's operational domain knowledge programmed into it. He goes on to say that his team has been busy working in the Kubernetes community, designing and implementing this concept to reliably create, configure, and manage complex application instances on top of Kubernetes. And he calls this new class of software operators.

So, what is an operator? An operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of a complex stateful application on behalf of a Kubernetes user. An operator builds upon the basic Kubernetes resource and controller concepts, but it includes domain-specific or application-specific knowledge to automate common tasks. So an operator contains three major components: a controller, a resource, and knowledge. These are the three main things that make up an operator.
The first component of an operator is a controller. A controller is basically a control loop that watches the state of your cluster and moves the current cluster state closer to the desired state. It watches your state, checks for differences, and if anything is out of place, it acts to make sure that your current state matches your desired state. Some examples of controllers that you see in Kubernetes are the ReplicaSet controller, the Deployment controller, and the DaemonSet controller.

To reiterate and be really specific: controllers aren't operators, but operators are controllers. Operators contain application-specific and domain-specific knowledge, and plain controllers don't contain that. Operators are controllers because they contain that control loop, which is constantly checking that everything, including that application-specific knowledge, is as you defined it in your desired state.

For a visual representation, everybody's played the game where you have to find the ten differences between two pictures, right? That's basically what your controller is doing. You give it the desired state, which is the top image. It looks at your current state and notices that it's not the same as your desired state. So it makes those changes until your current state matches your desired state.

Another way to represent a controller is this diagram here. Say a user changes your custom resource. We'll talk about custom resources in the next few slides. They change your custom resource, and your controller gets informed about that via informers. It then runs its reconciliation, making sure that your current state is as you defined it in the desired state. Then it updates the status, and it makes the world, or your cluster, look the way you specified.
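The watch, compare, and act cycle described above can be sketched in a few lines of Python. This is only a toy illustration, not how a real Kubernetes controller is written: the `Cluster` class, the `reconcile` function, and the application names are all made up for this example, and a real controller would be driven by informers and would talk to the API server instead of editing a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for cluster state: app name -> running replica count."""
    replicas: dict = field(default_factory=dict)


def reconcile(desired: dict, cluster: Cluster) -> list:
    """Compare desired and current state, fix differences, report the actions taken."""
    actions = []
    for name, want in desired.items():
        have = cluster.replicas.get(name, 0)
        if have < want:
            actions.append(f"scale up {name}: {have} -> {want}")
        elif have > want:
            actions.append(f"scale down {name}: {have} -> {want}")
        cluster.replicas[name] = want
    # Anything still running that is no longer desired gets removed.
    for name in list(cluster.replicas):
        if name not in desired:
            actions.append(f"delete {name}")
            del cluster.replicas[name]
    return actions


# Hypothetical desired and current state, for illustration only.
desired_state = {"web": 3, "worker": 2}
cluster = Cluster(replicas={"web": 1, "old-job": 1})

first_pass = reconcile(desired_state, cluster)
second_pass = reconcile(desired_state, cluster)  # nothing left to change
print(first_pass)
print(second_pass)
```

The second call returns an empty list because the first call already brought the current state in line with the desired state. That is the property a control loop relies on: it can run again and again, and it only acts when something has drifted.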
So, Kubernetes controllers allow us to run and manage our application inside a cluster. Controllers are the control loops that watch the state of our cluster and make the requested changes where they're needed.

The next component of an operator is a resource. We have a controller, and then we have a resource, specifically a custom resource. A resource is an endpoint in the Kubernetes API that stores a collection of API objects of a certain kind. Some familiar resources are a Pod, a Secret, or a ConfigMap, but specifically for operators, we are creating custom resources. You might be wondering what that is. A custom resource is the resource that our operator is watching, and like any other resource, it is an endpoint in the Kubernetes API. The easiest way to find your custom resource, or to find different custom resources, is to run oc api-resources. Then you get a list of the resources that are available in your cluster.

Behind a custom resource, we have a custom resource definition, which defines the kind. The custom resource we create from it is the manifest that our controller is continuously watching. The controller wants to emulate it. It always wants the cluster to be as similar as it can be to the custom resource that we define. So here's our custom resource, which is Bestie. Bestie is our kind. Remember, we have Pods, Secrets, and ConfigMaps. Bestie is the custom resource that we have for our operator.

Next slide. For a custom resource, there are quite a few different things that you need for your API. The REST API is the fundamental fabric of Kubernetes, and all operations and communications between components, as well as external user commands, are REST API calls that the API server handles.
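To make the idea of a resource as an endpoint in a REST API a little more concrete, here is a minimal Python sketch of how Kubernetes lays out the URL paths for resources. The path layout itself (`/api/<version>/...` for the core group, `/apis/<group>/<version>/...` for every other group) is standard Kubernetes behavior. The helper `resource_path`, the group `pets.bestie.com`, the plural name `besties`, the namespace `demo`, and the object name `bestie-sample` are assumptions chosen for illustration and may not match the actual L5 operator.

```python
def resource_path(group, version, plural, namespace=None, name=None):
    """Build the REST path the API server serves a resource under."""
    # Core resources (Pods, Secrets, ConfigMaps) live under /api,
    # every other API group lives under /apis/<group>.
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    parts = [prefix]
    if namespace is not None:
        parts.append(f"namespaces/{namespace}")
    # The path uses the lowercase plural resource name, not the kind.
    parts.append(plural)
    if name is not None:
        parts.append(name)
    return "/".join(parts)


# A built-in resource: all Pods in the "default" namespace.
pod_path = resource_path("", "v1", "pods", namespace="default")

# A custom resource; group, plural, namespace, and name are hypothetical.
bestie_path = resource_path(
    "pets.bestie.com", "v1", "besties", namespace="demo", name="bestie-sample"
)

print(pod_path)
print(bestie_path)
```

Once a custom resource definition is installed, its objects are served under a path of the same shape as the built-in ones, which is why tools that speak to the API server can work with custom resources without special handling.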
The API server exposes an HTTP API that lets end users, different parts of your cluster, and external components communicate with one another. So this is how all the components of your cluster communicate: through the control plane and the API server.

For your custom resource, you have a group, a version, a kind, and a spec. For groups, aside from the core API group, there are also groups such as apps and storage.k8s.io. Versioning means different API versions, and different API versions indicate different levels of stability and support. Alpha may have some bugs in it. Beta is pretty well tested but may still need some work. Stable means your custom resource, your application, is good. Kind is your resource and how it should be referenced: Pods, Deployments, or in our instance, Bestie. And the spec is how your users define their intent, or the desired layout, for the given kind. For example, you can specify what container image you want to run, what port should be exposed, or configuration that changes how your app is deployed in a given cluster.

This is how it usually looks. If you're familiar with Kubernetes, you have the APIs, then the group, version, and kind. You can also have the APIs, then the group, version, your particular namespace, and kind.

So after controller and resource, we have knowledge. Again, the three main things of operators are controller, resource, and knowledge. What is knowledge? Knowledge is domain-specific or application-specific, and it is often learned from users or administrators rather than developers. What does that mean? It means that the knowledge that you want to implement in your operator is often made up of things that your end users need, or things that your administrators want to implement as well.
That includes things like being able to install the application, or being able to scale your application: if there are more requests than you can currently handle, you want to be able to scale up enough pods so that there won't be any downtime. It can mean implementing self-healing, so that when pods go down, new pods are created. It covers backups, updates, and cleanups: if there's a new version of your application, this application-specific knowledge will be able to back up and update your application accordingly. And with resilience and observability, you can incorporate metrics into your operator so that you understand what's going on with your application and how to respond to it.

An example of the knowledge that we implemented in our operator, the L5 operator, is being able to create a job to seed our database before our application pods were created. We use a Postgres database, and we use the Postgres operator within our operator, so it's like an operator within an operator. We decided to seed our database beforehand, because if the database wasn't seeded and our application pods were already up and requests were coming in, there would be an error, since the database wouldn't have all the information it required. We implemented this in our operator, and that's how we got our database seeded before our application pods were created.

Here are a few other examples of the knowledge that other operators implement. The Argo CD operator automates the tasks required when operating an Argo CD cluster by managing the full life cycle of Argo CD and its components. For the L5 operator, the knowledge it implements is automatic installation, seeding data, managing the application version and upgrades, horizontally autoscaling the pods, and giving back metrics according to the request traffic in the application.
For the Crunchy Postgres operator, the knowledge it implements is being able to install the database software, create databases, start and shut down instances, and manage users and security, among a whole bunch of other application-specific knowledge for that software.

So, again, what are operators? Operators contain three major things: controller, resource, and knowledge. It's basically the robot with a post-it note in the image here. It has the controller and it has the resource, and it's just trying to make sure that your application is as you want it to be. And then it has other jobs on top of that.

Another example: say you're the owner of a shop, and you have a store manager. You can give a list of tasks to your store manager. They can manage the money, take customer orders, buy products, do shipping, and all of that. Or they can manage everything else while you manage the money. An operator is like your store manager. You give it a list of the things that you want it to manage, and it will manage them for you. The people coming into your store are your requests, and the manager handles all of that according to the list you gave them.

So, why are operators valuable? Operators are valuable because they build an ecosystem of software that is easy, safe, and reliable to use and operate as a cloud service. We have this entire library of operators that can be used by anybody, by companies and by individuals, in their own applications. It's easy, safe, and reliable because we run tests on these operators, and we make sure they are up to par and that their capability level is what it is declared to be. Operators are also valuable because they're low touch, remotely managed, and they have one-click updates.
So, again, on OperatorHub or on Red Hat OpenShift Container Platform, you can install these operators onto your cluster very quickly, with just one click, and you can use them in your application. Operators are also valuable because it's software that works wherever Kubernetes works, since OpenShift operators are built upon, and extend, existing Kubernetes concepts.

So, when should we use operators? Operators are definitely very useful, but they're also not for everyone. These are some guiding questions that can help you determine whether an operator would be useful for your application. Is your application a stateful application? Does your application contain application-specific tasks that could benefit from managed automation that Kubernetes doesn't provide? And would you like your application to be more scalable, repeatable, and standardized? Stateful applications include persistent storage and other elements external to the application, which require extra work to manage and maintain, and operators can ease that work. As for creating an application that's scalable, repeatable, and standardized, operators can help with that because you can build consistent responses to events that occur in your application.

So, where are operators currently being used? Operators are being used all over the place: AWS, PostgreSQL, Dell, GitLab, IBM, Prometheus, Grafana. There's a whole variety of operators that are in use and available on OperatorHub. We'll talk more about that in a bit. There's a whole library of operators that you can use, even to create your own operator. For example, we used the Crunchy Postgres operator for our database when creating our L5 operator. That makes creating operators easier, and by building more operators you also help the community while making your own job easier.
Next slide: how can we create our own operators? I'm going to hand it off to Sandhya as she talks more about that, and as we get into the more hands-on stuff.

So far, we have seen the why, what, and where of operators, and why they are valuable. So we have covered all the Ws of our presentation. Now I'll walk you all through the one H, that is, how can we create our own operators? The tools make it much easier to create or develop your operator from scratch, and the tools are nothing but the Operator Framework. The Operator Framework has three components. The first is the Operator SDK, which covers building, testing, and iterating on your operator. The next one is the Operator Lifecycle Manager, which covers tasks like installing, managing, and upgrading the operator once you have built it using the Operator SDK tools. And the last component is OperatorHub.io. Once your operator is ready, the next step is publishing it and sharing it with other Kubernetes developers and users, so this acts as a repository where you can publish your operator.

Let us look at each component in depth, starting with the Operator SDK. We say the Operator SDK tools make things easier. Why do we say that? Because the SDK provides many high-level APIs and abstractions, which help in writing the logic more intuitively. It provides various tools and libraries for scaffolding your project, which generate the default files that are required to run your operator and publish it on OperatorHub.io. And it also provides extensions to cover common operator use cases. As for installation, the Operator SDK can be installed with a single command if you are using a Mac, or it can be installed from the GitHub repository: you can just clone the repository and run make install, and you will have the Operator SDK running on your system.
So, when we say scaffolded, what is actually scaffolded here? A lot of files, like the Makefile, main.go, the PROJECT file, and the Dockerfile, plus a lot of boilerplate and tooling. The main.go file is nothing but the controller manager, the PROJECT file has the metadata for the Operator SDK, and the Dockerfile is for building a container image containing your operator. We will still be missing our controller logic and API code, which we will explain during the demo session on how to scaffold your first operator project. The main goal or benefit of the Operator SDK is that it maximizes opportunity and minimizes human effort and burden. All the end user needs is the container image and the manifests, and they are good to go to develop the operator.

The next component in the Operator Framework is the OLM, the Operator Lifecycle Manager. This component helps users install, update, and manage the lifecycle of the operator and the other services associated with it while it runs on the cluster. It provides a declarative way to install and manage the lifecycle of your operator on the cluster.

As for the features of the Operator Lifecycle Manager: it provides a rich update mechanism to keep your operators up to date automatically. With the OLM packaging format, operators can express dependencies on the platform and on other operators as well. The framework makes operators and their services available for cluster users to select and install. It also prevents conflicting operators that own the same API from being installed, which ensures cluster stability. And then there are the declarative UI controls, through which it enables operators to behave like managed service providers through the APIs they expose. So these are the features and benefits of OLM. Next, let us see the architecture of OLM.
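As a rough illustration of the point about conflicting operators that own the same API, here is a small Python sketch of what such a check could look like. This is not OLM's actual implementation. It only shows the idea that each operator declares the APIs it owns as group, version, and kind, and that an install can be refused when a candidate's owned APIs overlap with those of an operator that is already installed. The function `find_api_conflicts`, the operator names, and the API groups are all hypothetical.

```python
def find_api_conflicts(installed, candidate_owns):
    """Return {operator: [apis]} for installed operators that already own
    an API (group, version, kind) the candidate also wants to own."""
    conflicts = {}
    wanted = set(candidate_owns)
    for operator, owned in installed.items():
        overlap = sorted(set(owned) & wanted)
        if overlap:
            conflicts[operator] = overlap
    return conflicts


# Hypothetical operators already installed on the cluster.
installed = {
    "db-operator": [("db.example.com", "v1", "Database")],
    "cache-operator": [("cache.example.com", "v1alpha1", "Cache")],
}

# A candidate that owns a new API can be installed.
ok = find_api_conflicts(installed, [("pets.example.com", "v1", "Bestie")])

# A candidate that claims an API already owned by another operator cannot.
clash = find_api_conflicts(installed, [("db.example.com", "v1", "Database")])

print(ok)
print(clash)
```

An empty result means no installed operator owns any of the candidate's APIs. A non-empty result names the operator that already owns the API, which is the situation where two controllers would otherwise fight over the same custom resources.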
OLM mainly has two components: the OLM operator and the catalog operator. By default, the Operator Lifecycle Manager runs in OpenShift Container Platform, where it aids the cluster administrator in installing, upgrading, and granting access to the operators running on their cluster. Let us look at the OLM operator and the catalog operator in detail.

The OLM operator is responsible for installing the applications defined by a cluster service version, or CSV. Once the required resources specified in the CSV are present, the OLM operator does its job. The OLM operator is not concerned with the creation of those underlying resources. If that is not done manually, the catalog operator, which is the second component, can help by resolving these needs.

Next is the catalog operator. While the OLM operator does the install part, the catalog operator takes care of monitoring the subscriptions, the catalog sources, and the catalogs themselves. Whenever the catalog operator finds a change in a subscription, an operator group, or an install plan, it generates a new one and makes sure that things are updated automatically. So this is the basic architecture of OLM.

When we talk about the workflow, it can be discussed in three parts: the CSV, the catalog source, and the subscription. The CSV can be defined as the main entry point for packaging an operator for OLM. It is the cluster service version, a YAML manifest created from operator metadata that assists OLM in running the operator on a cluster. The CSV mainly holds information such as the logo, description, and version, and any technical information related to the operator is also mentioned in the CSV. If we look at the components of a CSV, we have the metadata, the install strategy, and the CRDs.
Here, the metadata holds the application metadata, which includes the name, description, version, any related links, labels, and an icon for your operator, whose base64-encoded data is included in this file under the metadata. Next, the install strategy has the type, which is deployment, along with the set of service accounts and required permissions, and the set of deployments. The last one is the CRDs section, which has information related to the namespaces and the full list of resources that the operator will be interacting with. It also has a descriptors section, which includes details about the CRD spec and status fields to provide semantic information.

Next, the other components here are the catalog source, subscription, install plan, and operator group. We have seen what the catalog source is earlier. Again, it is nothing but an operator index: it represents a store of metadata that OLM can query to discover and install operators and their dependencies. A subscription is nothing but an intention to install an operator. It describes which channel of an operator package to subscribe to and whether to perform updates automatically or manually. All these details are included in the subscription, and it also defines the name and namespace of the operator. Next is the install plan. It defines the set of resources to be created in order to install or upgrade to a specific version of your operator, which is defined by a CSV. OLM is designed in such a way that it supports all the namespace modes: single namespace, all namespaces, multi-namespace, and own namespace.

Next comes the last component, the operator group, which provides multi-tenant configuration to OLM-installed operators. An operator is said to be a member of an operator group.
That is, if its CSV exists in the same namespace as the operator group. So this is about the Operator Lifecycle Manager. Lastly, how do you deploy using OLM? Here are the quick commands that you would use to deploy your operator using OLM once you have developed, built, and tested it using the Operator SDK tools.

The last component of the Operator Framework is OperatorHub, which is nothing but a web interface that cluster administrators use to discover and install operators to automate the deployment and maintenance of platform services and workloads. OperatorHub is a repository, a place where you can find all the operators that have been developed and published. You can use these operators in your application to automate the functionality or tasks that you would like to incorporate in your project.

OperatorHub is designed to address the needs of both Kubernetes developers and users. For Kubernetes developers, it provides a common registry where they can publish their operators along with a description and relevant details like version, image, code repository, and so on. They can also update already published operators to new versions when they are released. So not only can they publish their operators, they can also submit updated versions of operators that already exist there. For users, it helps in discovering and downloading operators from a central location whose content has been screened against the previously mentioned criteria and scanned for known vulnerabilities. In addition, developers can guide the users of their operators with prescriptive examples of the custom resources that they introduce to interact with the application.
Another instance similar to OperatorHub is Red Hat OpenShift Container Platform, which is a private platform as a service for enterprises that runs OpenShift on public cloud or on-premises infrastructure. It is a Kubernetes-based platform, and it comes with a streamlined automatic install feature. With a single click, users or developers can install operators and use them in their application. There are many other capabilities of OpenShift Container Platform, like portability, an integrated ecosystem, the user interface, integrated CI/CD pipelines, automatic upgrades, multiple clusters, persistent storage, and scalability.

Let us see in brief what these capabilities mean. With OpenShift Container Platform, we can quickly scale our applications to thousands of instances across hundreds of nodes. It provides persistent storage, which we can use to run stateful applications as well as cloud-native stateless applications. It has an extensive ecosystem of third-party tools created and integrated by its community. It ensures that containers are easily portable between a developer workstation and a production environment. And it has a convenient user interface that gives you direct access to a large number of command-line tools, a multi-device console, and much more. So these are the capabilities of Red Hat OpenShift Container Platform. Instead of doing all the work from scratch, you can use already published operators from OperatorHub.io or Red Hat OpenShift Container Platform with a single click.

So far we have seen how to develop an operator and why we use operators. The last part here is the capability levels, which are divided into five levels as per the Operator SDK guidelines.
The first one is basic install, by which we mean that our application should support all types of install modes, be it a single click from OperatorHub or installing the CRD. It should support every install mode as per the SDK framework.

The second one is seamless upgrades, which means automatic upgrades of the operand or operator. Whenever there is a change in any functionality of the operator, or there are internal software updates, your application should be able to upgrade automatically without any human support or work.

The third one is full lifecycle, which includes backup and failure recovery. Whenever we update the operand or operator, the application should ensure that it still works with your previous version without generating any errors. That is what we mean by full lifecycle, backup, and failure recovery here.

The fourth one is deep insights, which involves monitoring all the metrics related to your application. If we take the example of the L5 operator that we have been talking about in our presentation, it reports results like how many requests have been made, how many were successful, and how many failed. It keeps a log of this information, and this can be done using the Prometheus operator, which the Operator SDK makes use of as a built-in function.

The last level is autopilot, which covers horizontal or vertical scaling, abnormality detection, and scheduling tuning. So these are the five capability levels of an operator.

Next, we will walk you all through how to scaffold a project using the Operator SDK. Initially, we create a project directory in our home directory with mkdir, and then we cd into that path. Then we initialize the project with the command operator-sdk init.
This generates a go.mod file to be used with Go modules. We pass the repo path here, because if you are creating the project outside the GOPATH, you have to specify it, since the init command needs a valid module path. Once you have initialized your repository, the next step is creating an API for it. As you can see, it says "writing scaffold for you to edit." The basic skeleton of the project is generated, giving us all the files that are required for a project. When you run the tree command, you can see a quick skeleton of your project on the command line.

We see that a Dockerfile, Makefile, PROJECT file, and main.go have all been generated by the scaffolding commands, along with the kustomization YAML files, the manager folder, and the manifests. Even after doing this, we still have to create our new API and controller, which will hold our application-specific controller logic. The memcached_controller.go file will have the controller programmed for your application. Then, in the config folder, under manifests, you will have the cluster service version YAML file. We create our CSV there, and it includes all the details of your operator, like the logo, description, technical details, and other prerequisites for using your operator. So these are the steps you would use to scaffold your project.

We have also shared some quick references and exercises that will be helpful for you all if you would like to start your operator journey and develop a new operator from scratch. We have included these slides, presentations, and articles here, along with the exercises, and you can also find the link to the repository, which has separate branches for each of the five levels of the L5 operator. Thank you.
Thank you for attending the presentation, Simplifying Operators Substantially. If you have any questions, feel free to put them in the chat later or contact us. We'll be happy to answer your questions. Thank you. Thank you all.