Hi, my name is Gilbaros. I don't speak Italian, sorry. Oh, and "another beer, please." The important things to say, right?
So I am in the cloud business unit. I'm a partner product manager, and that means I work with partners and customers. We call it herding cats. I help integrations come together, and operators in particular have been my latest focus. You've heard a lot about operators in the last few hours, sort of at a high level. I'm going to dig a little deeper into operators today and tell you why you should care, why it's important, and what you can get out of them. We're also going to talk about best practices, and really what we've learned in the roughly two years that we've been doing operators, and what we've created to help you get there.
So we're really gaining momentum. A lot of open-source projects are using operators as the de facto way of deploying on Kubernetes and on OpenShift. Nowadays, if you look on OperatorHub, you can see that a lot of the projects you use are already there. There's quite a significant backlog, but the team I'm on is working on getting new operators for new open-source products added to OperatorHub. What we're also seeing is that ISVs are looking at operators for their commercial products, for their enterprise distributions, as the way to deploy on OpenShift.
I'm going to assume that you have some experience with operators. You've been here for a couple of hours; that's sufficient experience. We'll dig into that a little more. And we're assuming, of course, that you have some experience with Kubernetes and OpenShift.
So let's add to that. Operators are procedural best practices. They take what's in the minds of your admins and your developers on how to lifecycle your application: everything from day one, deploying and installing the application, to fun stuff like metrics, health checks, and auto-tuning. Day one is fairly well understood, right? Everybody has a script to install their app.
Everybody has an installation tool. That's easy. Day two is difficult, right? How do you make sure that you can scale when scaling time happens? How do you detect that you need to scale? How do you detect that there's something wrong with the health of your app, and act on that without having to rely on the one admin, the one person who deployed it, who remembers, "Oh, you have to do it like this, otherwise this breaks"? So the idea is that you have your subject matter experts create the logic for how to do these things and put it in the operator.
Very quickly, on how it works: there's a custom resource, and the operator is effectively a Kubernetes controller. It's watching the custom resource and its configuration and basically looping, making sure that the operator and the app are in the state that you want them to be in: making sure there are enough resources, making sure that everything is where it's supposed to be.
But more importantly, why should you care? I mentioned that it's automating things that are in people's heads. You don't want to get that page in the middle of Saturday night. So we make it so that the operator can do this. And the day-two stuff that I mentioned is the complicated stuff, right? Resizing isn't just scaling. Resizing is: I have to add stuff to load balancers, I have to think about quorum, I have to decide where to scale something to. Is it going to be in this data center? Is it on this server? Is it over there? Does it have to be in Asia Pacific? Does it have to be in the US? So there's a lot more to resizing than just "add one more instance," and you can do that with operators.
Same thing for upgrades, right? Upgrades tend to be complicated.
You tend to worry about them, but if you codify all of that in your operator, it simplifies it, and I'll talk in more detail about that as we go.
Reconfiguring or configuring your app: a lot of times there are 500,000 possible tunables and configuration options, but you really only want your users to change five of those, right? You don't want them to change every other little thing that's going to break something else. You only want them to change a few of those settings. So you can make it so that your operator's configuration options only include the ones that you want your users to change.
Backups: backup sounds simple until you start talking about backing up databases or stateful data. So you can prepare your environment for backup with your operator.
And we talked already about healing and what that means: detecting that something is not in the right state, and defining what the right states are. But it's more complex than someone looking at the dashboard and saying, "Oh, this is orange now. What do I do?" You can make your operator know what a bad state is, know to set it orange, and then know how to fix that. Are you running out of resources? Do you need to move workloads around? Things like that.
But let's get into a little bit more of what makes a good operator, right? What are the best practices? Those were scenarios or situations in which an operator would be useful, but what are the things that we have to do to make this better?
So this is not as bad of an eye chart as Brian's initial one, but let's talk about some of these. Do I have a pointer? I do, excellent. I'll try not to blind anybody. So this is for the development side of your operator, right?
If there's one thing that you should leave with today, it's this: an operator should do one thing and do it well. Don't create an operator that deploys your entire stack. Create an operator that does that one part of the app, that deploys your app and declares its dependencies, right? So your operator could say, "For me to deploy this web front end, I need a database, I need these three other things." But have OLM, the Operator Lifecycle Manager, handle those dependencies, because you don't want to own the operators for all of these other apps, all of these other open-source projects, which aren't your responsibility, right? Those should be handled by the subject matter experts of those operators. I think that's the first two bullets there.
Use an SDK. The Operator SDK already creates all of the scaffolding; all of the reusable code that's necessary to make an operator work is already provided for you. So use an SDK. At the end I'm going to have some links to a lot of the documentation on this, but more importantly also a link to the learn.openshift.org page, which walks you through creating an operator. It's how I learned to do operators a year ago: you just click the learn button and it pops up an environment for you, and it pops up the step-by-step, "this is how you create an operator." And of course it uses the Operator SDK, which we'll talk about a little more.
Don't hard-code things. Don't hard-code namespaces. You'd be surprised how often this happens, and there's a collision out there waiting to happen.
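As a rough illustration of the namespace advice, here is a minimal Go sketch that takes the target namespace from the environment instead of baking it into the code. The variable name `APP_NAMESPACE` and the helper `resolveNamespace` are hypothetical, chosen only for this example; they are not part of the Operator SDK.

```go
package main

import (
	"fmt"
	"os"
)

// namespaceEnv is a hypothetical environment variable name used only in this
// sketch; a real deployment would set it in the operator's manifest.
const namespaceEnv = "APP_NAMESPACE"

// resolveNamespace returns the namespace to operate in. The lookup function
// is passed in so the logic can be exercised without the real environment.
func resolveNamespace(lookup func(string) (string, bool)) (string, error) {
	ns, ok := lookup(namespaceEnv)
	if !ok || ns == "" {
		// Fail loudly rather than silently falling back to a hard-coded name.
		return "", fmt.Errorf("%s is not set; refusing to guess a namespace", namespaceEnv)
	}
	return ns, nil
}

func main() {
	ns, err := resolveNamespace(os.LookupEnv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("deploying into namespace:", ns)
}
```

The point of the design is that the same operator image can land in any namespace on any cluster without a rebuild, which is what avoids the collision mentioned above.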
Not all of your environments are going to look the same.
And I think lastly on this one: APIs have a tendency of sticking around for way longer than we expect, so version them properly. Have different versions for your operators and make sure that they make sense. There are Kubernetes guidelines, and just semver guidelines, for how to do that.
So, running operators on clusters. The scaffolding, the framework, is already set up so that you don't need to run your operators as root. And I know you're thinking, "Oh no, my specific scenario requires my operator to run as root." It probably doesn't. Talk to us about that; we're happy to help. There's a whole team of us that helps people with their operators: helps them develop their operators, helps them troubleshoot their operators. And we've created this framework so that things that usually would need to be run as root are handled by the framework, so your operator itself doesn't have to, right? You don't need to handle all those privileges that could cause issues in the future. So things like CRDs get registered by OLM, so you don't need to do it yourself, and you don't need the privileges for that.
Writing meaningful status information is, surprisingly, a problem that we bump into fairly often. One of the groups I'm in validates, tests, and provides feedback on operators that show up in our GitHub repository for OperatorHub. So I look at a lot of operators during the day, and I review them and test them. And one of the big things when I'm testing an operator is that I don't know that much about your app. I read the description, I read a paragraph about your app, and I'm trying to deploy your operator. The first thing that I look for when it doesn't look like it installed, when it doesn't look like the operator is working, is: what are the status codes that I'm seeing? What are the status messages? Is it telling me that it broke?
Is it telling me anything useful about how it broke? Think of me as a user, right? I'm a good example of a person in your organization, or a person out on the web, who is interested in deploying your app, is interested in using your operator, but doesn't know as much as you do about it. So tell me as much as possible in those error messages, in those status codes. And there are proper places to put them, in the custom resource object, et cetera.
We'll talk about updating. But the key thing (this is not the company line, this is my line), the key thing for operators, for me, is: one operator version, one app version. So don't have one operator version deploy eight different versions of your app. I'll talk about that more later. But more importantly, an operator version should be able to understand how the n-1 version of that operator works and update that. So when you develop version 1.1 of your operator and it gets installed on the system, it's going to be working with an existing version of your app, and it needs to be able to update that. So that's the one thing that you should focus on: that logic needs to go from n-1 to n. Don't think of n-2, n-3, n-4, because we're just going to do it incrementally.
Don't deploy other operators. I talked about dependencies already. The Operator Lifecycle Manager is going to handle dependencies for you, so: easy.
Okay, so: the app should always be able to deploy and come up without user input.
So that's another piece of feedback that I have from installing a dozen, or hundreds, of operators over the last year: I don't always know what all the possible tunables are. I don't know what the required configuration options are. So when I subscribe to an operator, it should install the most basic version of the app with the most commonly used configuration settings, so the app comes up. At that point I can go in there and configure it, and the operator will then configure the app. But when you deploy an operator to your cluster, it's better for it to come up with a version of the app than for it to immediately error out and say, "You didn't configure blah, blah, blah." Just have it install a basic version: this app can't talk to anybody, but it's up and it works. Because then you give that feedback to the user: "Oh, okay, everything is good, it's working, I just now need to go and configure it."
All right, so how do we do this? Hopefully you memorized the previous two slides. There is a quiz at the end, a test, and what will define how many beers you get at the end of the day is how well you score on the Twitter. So we'll get there.
So, the Operator Framework. We're aware that there are a lot of best practices, that there are a lot of things where we're telling you, "You should do it like this, you should do it like that, you shouldn't do this." So we've tried to codify all of this into a framework, right? The operator mentality is: put the logic in something so that you don't have to repeat it. And we've created some tooling. There are three big parts to this. First is the Operator SDK, which I mentioned before. It basically creates all that scaffolding for the operator, so that when you're writing your operator you're really just having to focus on writing the logic necessary to get that app deployed, right? You're not worried about the controller stuff.
You're not worried about the resource. We're trying to make it as simple as possible. The Operator Lifecycle Manager is what's running on the cluster. It handles the catalog, it handles providing access to the operators for the users, and it's also what's taking care of the lifecycle of the operator, right? We've talked about metering a good bit already, so I'm not going to go into that, but as you'd expect, metering works for operators as well, on a per-namespace basis, et cetera.
So we're going to go through a step-by-step scenario on how this works. I usually call this person Jane, but I think Julia would be better. Okay, so Julia the developer uses the Operator SDK to create a new operator for her app. The Operator SDK does all the scaffolding and she adds the custom logic, right? She adds the "how do I actually deploy this app," and now she has an operator. Easy enough. She gets to focus on the custom logic.
Taking a step back: we've come up with a sort of five-phase model of the depth of the operator that you've created, right? Phases one and two are of course the most common: basic install, so you can install your app, and upgrades, so you can upgrade it from version one to version two or whatever. And you'll see that there are three different options for how you're going to write your operator. Helm charts, of course: a lot of you already have them, but they're somewhat limited in that they can really only handle install and upgrades. We'll get into a little more detail on those. But the cool stuff is over at the end, over here, right?
So we can handle scaling, auto-tuning, detection of issues, scheduling. All of the more complex stuff is also doable over here, but you need to use either Ansible or Golang or something like that. The cool thing about doing this with Helm is that if you already have the Helm charts, the SDK can basically just take those and run with them. It's a very low barrier to entry. It's a good start, but of course you're limiting yourself a little bit on how much functionality you get. If you use Ansible, it's fairly similar. You need to be able to use the Ansible Kubernetes modules, but otherwise it's just Ansible, which you already know, and the Operator SDK can take your Ansible playbooks and run with them, with some adjustment. But if you really want the full breadth of what you can do with operators, write it in Go and you'll go from there. Wow, that was terrible. But yeah, so: three options for how you can get your operator written.
Okay, so I believe this was Joe before, when I do this in the US, so Joe will be... okay, Marco, if I roll the R. Marco. Okay. So Marco will take Julia's operator. He's going to add some metadata and package it up, and then add the operator into the lifecycle manager, right? The metadata is just things like what kinds of permissions our users are going to have, which kinds of users have different permissions, the RBAC stuff. And once it's in the lifecycle manager, the user, who we'll name later, can then see that operator, look at the packages, things like that.
I think I mentioned this before, right? Operators are really first-class citizens, which means they can do things via OLM that require escalated privileges without having those privileges themselves. And OLM really manages these operator components for you, so you don't have to worry about doing a lot of that lifecycle stuff.
You don't have to worry about pre-registering CRDs. You don't have to worry about setting the actual role bindings. You don't have to worry about setting role-based access control or namespaces or any of that. So Operator Lifecycle Manager really helps set the system up and get things going.
But talking about OLM and the lifecycle: the way this works is that you subscribe to the operator, right? The user is subscribing to the operator that was in the catalog. Marco put the operator in the OLM catalog, then the user... what did I call the user? Luca. Luca is the user. Luca subscribes to the operator on his system, and the operator gets lifecycled and versioned as it goes, right? So whenever a new version of the operator gets put in the lifecycle manager, OLM updates it on the cluster.
So if we take this a step further (I lost an app off the end of the screen), we're talking about operator version 1.12 deploying your app version 3, and 1.13 deploying your app version 3.1, et cetera. So this operator knows how to update from this one to this one. Easy, right? Which means this operator knows how to update from this version of the app to this version of the app, and not necessarily from previous versions. It doesn't have to do that, because that's too complex. Why make your life difficult?
Of course, you can; we'll let you do that. I may get annoyed when I see that in the operator when I'm reviewing it, but yeah, okay, they wanted to do that, go ahead. And how you do this is basically just to set a configuration option in your resource that says, "Deploy this version," and then if you want to update it you would just change the resource to, "Okay, deploy the next version," and handle the updates.
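The incremental n-1 to n rule from the talk can be sketched as a walk along a chain in which each operator version names the single version it replaces. This is an illustrative Go sketch only: the version numbers, the `exampleReplaces` map, and the `upgradePath` helper are assumptions made for the example, not OLM's actual resolution logic.

```go
package main

import "fmt"

// exampleReplaces maps each operator version to the one version it knows how
// to upgrade from (its n-1). The values are made up for illustration.
var exampleReplaces = map[string]string{
	"1.13.0": "1.12.0",
	"1.12.0": "1.11.0",
}

// upgradePath returns the versions to install, in order, to move from
// installed to target one step at a time. It never skips a version.
func upgradePath(installed, target string, replaces map[string]string) ([]string, error) {
	var reversed []string
	v := target
	for v != installed {
		prev, ok := replaces[v]
		if !ok {
			return nil, fmt.Errorf("no incremental path from %s to %s", installed, target)
		}
		reversed = append(reversed, v)
		if len(reversed) > len(replaces) {
			// More steps than entries means the chain loops back on itself.
			return nil, fmt.Errorf("cycle detected in replaces chain")
		}
		v = prev
	}
	// The walk went from target back to installed, so flip it.
	path := make([]string, 0, len(reversed))
	for i := len(reversed) - 1; i >= 0; i-- {
		path = append(path, reversed[i])
	}
	return path, nil
}

func main() {
	path, err := upgradePath("1.11.0", "1.13.0", exampleReplaces)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("apply in order:", path)
}
```

Because every version only needs upgrade logic for its immediate predecessor, a cluster that is several versions behind is brought forward by applying each step in turn rather than by one operator that understands every older release.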
This, one operator deploying several app versions, seems complex to me. It's not... you're welcome to do it, but do that instead: it's so much easier for you to keep track of your app versions by your operator versions. Both will be supported, we're happy either way, but remember: n-1 to n keeps life easy.
All right, you've seen this slide before; William stole it from me. So, dependencies, right? We've talked about having your operator do one thing, keep it simple, do one thing well. And that means your operator is just going to say, "I require Jaeger," or "I require CockroachDB," and OLM is going to then go and install those for you. So don't write your own version of the operator for Jaeger. There's already a Jaeger operator; use it. Lifecycle manager will take care of that for you.
Here we go: Julia, Marco, and Luca. All right, so the operator's made it to here, and now Luca says, "Oh, okay, I want that app, so let's subscribe to that operator channel. I'm going to say which namespace I want it deployed in." And then suddenly there's an operator instance, and the operator creates a managed application. Easy enough. Note that Marco sets the privileges, so Luca is only going to see, or have access to install, the operators that he has the privileges for. So when Luca looks at OperatorHub in his OpenShift console, he's going to see certain operators in his catalog and which ones he can install. He just goes and clicks "install in this namespace," and he's good to go.
So that was my 10,000-foot view of operators: what they do, why you care. There are some links on how to get started. I don't have a learn.openshift.com link here.
I'm sorry, but you saw that in some of the other talks. But yeah, go take a look at operators, go try them out. It's very easy to try them out: if you go to try.openshift.org, it'll deploy a cluster for you, and then you can go into the console, into OperatorHub in the console, and just click through and install an operator. Try it, and play with creating operators on learn.openshift.org. It makes it really easy. It was the first time I'd used that sort of interactive learning system. It's all in the web browser. You don't need your own cluster or anything; it all happens in your browser.
Excellent. Thank you.
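For readers following along in text, the watch-and-loop behavior described in the "how it works" part of the talk can be sketched roughly as below. This is a simplified Go model using only the standard library. The `DesiredState` and `ObservedState` types and the `reconcile` function are hypothetical stand-ins for a custom resource's spec and the cluster's actual state, not code generated by the Operator SDK.

```go
package main

import "fmt"

// DesiredState stands in for the spec of a custom resource (hypothetical field).
type DesiredState struct {
	Replicas int
}

// ObservedState stands in for what is actually running on the cluster.
type ObservedState struct {
	Replicas int
}

// reconcile compares desired and observed state and returns the actions
// needed to converge. It returns an empty slice when nothing needs doing.
func reconcile(desired DesiredState, observed ObservedState) []string {
	var actions []string
	switch {
	case observed.Replicas < desired.Replicas:
		actions = append(actions, fmt.Sprintf("scale up by %d", desired.Replicas-observed.Replicas))
	case observed.Replicas > desired.Replicas:
		actions = append(actions, fmt.Sprintf("scale down by %d", observed.Replicas-desired.Replicas))
	}
	return actions
}

func main() {
	desired := DesiredState{Replicas: 3}
	observed := ObservedState{Replicas: 1}
	// A real controller is driven by watch events; this loop only simulates
	// a few repeated passes over the same resource.
	for pass := 1; pass <= 3; pass++ {
		actions := reconcile(desired, observed)
		if len(actions) == 0 {
			fmt.Printf("pass %d: already in desired state\n", pass)
			continue
		}
		fmt.Printf("pass %d: %v\n", pass, actions)
		observed.Replicas = desired.Replicas // pretend the actions succeeded
	}
}
```

A real operator's reconcile step would cover far more than replica counts (load balancers, quorum, placement), but the shape is the same: compare what is wanted with what exists, act on the difference, and repeat.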