Well, hello and welcome again to another OpenShift Commons briefing today. I'm with Jeff McCormick, one of my favorite people from Crunchy Data, who's going to talk to us about the PostgreSQL operator that he's created, using it with OpenShift, and what all that means. It's a new concept to me, so I'm going to let him explain it and introduce himself. So without any further ado, Jeff, take it away.

Thanks, Diane. Yeah, today I'm going to give an overview of a new capability that we have named, basically, the Postgres operator. I'll go into details about what it is and how it's built in this briefing, and then we'll wrap it up with a short demonstration at the end just to show you how it behaves. I work for Crunchy Data; we're a Postgres company, and that's essentially all we do. We specialize in open source Postgres, sales, support, and also custom development for people doing unique things with the Postgres database. We have a presence in the federal and government space with a certified version of open source Postgres as well. So I encourage you to look at our website, crunchydata.com, for more information about the different kinds of services we can offer, including 24/7 support. So first of all, what is an operator? That was my question about eight months ago at KubeCon, actually, where some guys at CoreOS mentioned this to me, and they said, you should really write an operator for Postgres. And I was like, I don't really know what an operator is in this context. So starting about four or five months ago, we started this project called postgres-operator to do just that. First of all, you can find all of the code that we're going to talk about today. It's open source; it's out there on GitHub at the link I've listed there. So you can pull it down, build it, inspect the code, look at it. There are built binaries on there for you to try out as well.
And the containers it uses are actually out on Docker Hub too, so you can run examples and play around with it. An operator really is just a controller piece of software. In this context, we've written some software that controls deployments of Postgres database components within a Kubernetes or OpenShift cluster. That's really the focus of this operator and this controller, but if you're familiar with controller patterns in general, that's what this is doing. You deploy the operator on a Kubernetes or OpenShift cluster. It runs out there and waits, basically, for stimulus from external events to cause it to do things. And the things we're going to have it do are all Postgres database related. You would use an operator to automate things as well. In the world of databases, there are all kinds of workflows that DBAs would do, or that people deploying Postgres clusters on an environment like Kubernetes or OpenShift would do. So what we can do is automate a lot of those manual tasks, and the operator itself is a place where we can build those kinds of automation layers. This particular operator is built in Golang, and it uses the Kubernetes client API for Golang; there's a link there. That's an interesting open source project that this project depends upon. It allows me, from Golang, to interact with the Kubernetes API using code that I write. The operator basically is just interacting with Kubernetes APIs to do all sorts of things: update labels of containers, create containers, delete containers. Everything it does is based upon leveraging that Kubernetes client API. This Postgres operator is different from some other operators in that it has a command line interface that human beings can use to cause the operator to take action.
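To make that controller pattern concrete, here's a minimal, self-contained Go sketch of the kind of event loop an operator runs. It's an illustration only: the real postgres-operator receives these events from the Kubernetes watch API via the client-go library rather than from a plain channel, and the `Event` type and action strings here are invented for the example.

```go
package main

import "fmt"

// Event represents a change to a watched resource, for example a pgcluster
// third party resource being added or deleted. In the real operator these
// events arrive from the Kubernetes watch API via client-go.
type Event struct {
	Type string // "ADDED", "MODIFIED", or "DELETED"
	Name string // name of the pgcluster resource
}

// runController drains events and records the Postgres-related action the
// operator would take for each one, mirroring its watch loop at a very
// high level.
func runController(events <-chan Event) []string {
	var actions []string
	for ev := range events {
		switch ev.Type {
		case "ADDED":
			actions = append(actions, fmt.Sprintf("create cluster %s", ev.Name))
		case "MODIFIED":
			actions = append(actions, fmt.Sprintf("update cluster %s", ev.Name))
		case "DELETED":
			actions = append(actions, fmt.Sprintf("delete cluster %s", ev.Name))
		}
	}
	return actions
}

func main() {
	events := make(chan Event, 2)
	events <- Event{Type: "ADDED", Name: "redhat"}
	events <- Event{Type: "DELETED", Name: "redhat"}
	close(events)
	for _, action := range runController(events) {
		fmt.Println(action)
	}
}
```

The key idea is that the loop itself is passive: nothing happens until a watched resource changes.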
From a command line perspective, the operator works a lot like kubectl or the oc command, in that it gives you, from your desktop, an ability to interact with the Kubernetes or OpenShift API. Whenever you do that, you can get information back from the Kubernetes cluster, and you can also create objects on Kubernetes using that command line interface. That's the primary means right now for the Postgres operator to understand what you want it to do and for you to cause it to do things. The operator runs as just a standard deployment, so you run it like any other deployment out on your OpenShift or Kubernetes environment. It sits out there and watches for third party resources that we've defined; I think there are five or six different Postgres related third party resources, and the operator is sitting there watching for changes on those. So when you create a third party resource called, say, pgcluster, the operator will notice that event and take action. That's the interesting thing about operators: they're based largely on third party resources. Yeah, in future versions of Kubernetes, they're changing from third party resources to another type, custom resource definitions, which we'll talk about. But in reality, they serve the same purpose from a Postgres operator perspective: they're just a means for us to catalog or store metadata about Postgres deployments. It's a place for us to store that metadata and interact with it through a standard Kubernetes API, as opposed to inventing our own similar construct. So I'm really excited about third party resources and custom resource definitions. I think that was a really great feature the Kubernetes team put together; it makes things like the Postgres operator much more possible than before. The operator uses a template-based approach for what actually makes up a Postgres cluster.
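To picture what one of those third party resources might store, here's a small Go sketch of a pgcluster-style spec serialized to JSON. The field names are illustrative guesses, not the operator's actual schema; check the project on GitHub for the real definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PgClusterSpec sketches the kind of metadata a pgcluster third party
// resource might carry. These fields are invented for illustration.
type PgClusterSpec struct {
	Name        string `json:"name"`        // cluster name, e.g. "redhat"
	Replicas    int    `json:"replicas"`    // desired read-only replica count
	CCPImageTag string `json:"ccpimagetag"` // Postgres container image tag
}

// toJSON serializes the spec the way it might be stored in the resource body.
func toJSON(s PgClusterSpec) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, _ := toJSON(PgClusterSpec{Name: "redhat", Replicas: 0, CCPImageTag: "9.6.3"})
	fmt.Println(out)
}
```

The point is simply that the resource is metadata the operator can read, watch, and act on through the standard Kubernetes API.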
That may be a master database container, a series of replica containers, services for those containers, maybe a PostgreSQL-based router or proxy. All of those things make up what we're calling a Postgres cluster, and you can define them in a template. The Postgres operator is designed so you can add your own set of templates that meet your particular requirements. There's a default initial definition, but over time you'll see more definitions placed out there. This diagram shows you kind of the schematic of what the operator consists of. On the outside of OpenShift you have this pgo client, and that's just a command line Go binary like any other. When you run it, it connects through the Kubernetes API over to your OpenShift or Kubernetes cluster. That's how it interacts with the operator, exactly the same way kubectl or the oc command works. The operator is running inside OpenShift as that deployment, and it has a close relationship with the third party resources we've defined specifically for this Postgres environment. It's listening and watching for events on those and making changes over time as those resources change. What it does is primarily create those Postgres deployments, those deployment boxes. If you take a look inside one of those, it looks sort of like this, and it can be quite complicated. In Postgres you can have a master database, and then you can have a series of read-only replica databases connected to it, replicating state essentially. So you have a Postgres cluster, and then you can put services out in front of those databases. Those databases also have a related persistent volume claim. There's a lot going on here, and one of the values of the operator is that it treats all of those things as basically just a Postgres cluster. It's a simplification of Postgres clustering mechanics.
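The template-based approach can be sketched with Go's standard text/template package, which is roughly the mechanism a Go program would use to fill cluster-specific values into a deployment definition before handing it to the Kubernetes API. The template text and parameter names below are invented for illustration, not the operator's actual templates.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// A toy stand-in for the operator's templates: the real operator reads
// template files for services, deployments, and PVCs, then fills in
// cluster-specific fields. This template text is made up for the example.
const deploymentTmpl = `name: {{.Name}}-deployment
image: crunchy-postgres:{{.Version}}
replicas: {{.Replicas}}`

// ClusterParams holds the values substituted into the template.
type ClusterParams struct {
	Name     string
	Version  string
	Replicas int
}

// render executes the template against the given parameters.
func render(p ClusterParams) (string, error) {
	t, err := template.New("deploy").Parse(deploymentTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, _ := render(ClusterParams{Name: "redhat", Version: "9.6.3", Replicas: 1})
	fmt.Println(out)
}
```

Swapping in your own template is then just a matter of supplying different template text, which is the customization point described above.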
Without the operator you basically have to construct and deploy all of these things in pieces, on a kind of manual, piece-by-piece basis. So why would you want to do this, or what is this useful for? That's the question I typically get asked: why do I need this operator? I can just run the containers, build templates, and deploy those, and it's been working that way for a couple of years. We have a suite of containers, there are lots of examples, you run some scripts, there are some JSON or YAML files, and you can deploy those things. If you're a developer or a good DevOps person, you can string together a series of scripts that will help you automate the deployment of those things. Well, I list some reasons here why you would find the operator useful, and it's really geared towards people who want to automate workflows around databases. These are typically things a DBA would do, like backing up databases or restoring them. If I implement some of those workflows inside the operator, you can do things like reduce human errors. You don't necessarily have to build your own scripting around the base-level containers to do certain things. And when you start working with large numbers of database deployments, that can get really unwieldy over time, given all of the things that make up a functional, robust Postgres cluster. Without some sort of automation, you have lots of things to keep track of manually. Some people deploying lots of databases want an ability to implement a standard set of practices or policies around their databases. The operator gives us a means to enforce those sorts of standard practices for people with very specific policy needs. And then there's ease of use. I'll show you in the demonstration, but once you have the operator running and deployed, it's really pretty simple to deploy a Postgres cluster with it versus some other means.
Large scale deployments: this is where I think the operator will really shine. If you have, say, hundreds of Postgres databases you want to deploy and manage, that's where the value really goes way up. It gives you an ability to collect and maintain metadata on all of those clusters, and you can then query based on that metadata. So it helps you manage and navigate across large numbers of these Postgres deployments on your OpenShift. Complex orchestrations is another one. The operator gives us a place to develop the advanced or hard things, the multi-step pieces that need to happen in order to do certain database related things. For example, cloning a Postgres database. There's a series of steps a DBA would have to do to manage that process. Well, we can implement those in the operator so that it's much more user friendly and consistent to manage those complex orchestrations. Building blocks: the operator depends on a set of base containers to deploy Postgres. There's another project we have called the Crunchy Container Suite, and it's open source as well. Right now we use four containers out of that suite of about 12 containers total, and the operator is basically manipulating and instantiating those containers in order to perform certain things. So it's using them as a set of building blocks, and you can look at that base project if you want to understand those building blocks and what they do; the operator just leverages them. So what are the features of the operator? Here are things you can do with it. From the command line you can say pgo create mycluster, and that's going to create a Postgres cluster deployment named mycluster.
Everything that makes up that cluster deployment, the services, the deployments, the PVCs or persistent volume claims, that whole collection of things will get created and instantiated just by issuing that command. Likewise, you can delete all of those related objects just by saying pgo delete. For instance, there are secrets that are used to store the Postgres credentials. Those automatically get created and deleted by these pgo commands for you behind the scenes; you don't have to manually go in and delete those things. pgo show is a command that just displays cluster information; I'll show you that in a minute. pgo test runs a simple SQL query test against that Postgres cluster. It also prints out the equivalent psql commands you could use, and that's useful for people who just want to manually run the Postgres SQL client and know how to connect. It's a simple way to test your cluster connectivity and what's running. pgo show pvc will display the contents of a PVC; Postgres stores its data on a dedicated persistent volume claim, and this is just a real simplistic way to view the contents of that. Constantly working with these containers, I was always wanting a way to do something like an ls command on a persistent volume claim, and this is a simplistic way to do that. pgo scale is a command that lets you scale up the number of read-only replicas in that Postgres deployment. Initially, when you define a Postgres cluster, you can specify zero or n read-only replicas, and the default is just zero, meaning you don't have any read-only replicas. Well, if you ran pgo scale mycluster with a replica count of one, it would basically just spin up a new read-only replica. Eventually that replica will come up to the same level as the master database, after however long it takes to replicate the master database.
Right now those replicas are asynchronous; in a future release it'll support synchronous replicas as well. pgo backup, followed by a cluster name, performs a full database backup. That uses the pg_basebackup utility essentially, but it will create a full database backup of a cluster and store it out on a persistent volume claim that you can then reference to do a restore. And that restore command is there below: pgo create, and if you pass it the backup PVC name and the backup path, you can actually restore from one of those backups. pgo upgrade is an ability to do a minor or major Postgres upgrade, and this is very useful. Let's say you've got an existing Postgres 9.6 and you get a new point release of that. You basically say pgo upgrade and it'll automatically take down the old image and bring up that cluster with the new image, with the same data essentially. You can also do a major upgrade, and that was actually quite interesting to develop. What it will do is convert from, say, a 9.5 to a 9.6; that would be considered a major upgrade. That involves running an upgrade container based off of the old version and then spinning up the new version. It's quite an involved process, and that pgo upgrade command automates the workflow. pgo create policy is a way to create a SQL-based policy and give it a common name. This is useful for people who have a series of SQL statements they want to apply against a database. Those can be security related or just application related. Basically, they're pieces of SQL that you name, and then you can apply those policies to a series of clusters based on a selector criteria, which is really useful. For instance, if you had a hundred Postgres databases in use and you wanted to apply a specific security policy, you could run pgo apply against that entire suite, or anything that matched the selector.
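The selector mechanism behind pgo apply can be illustrated with a tiny Go function that checks a key=value selector against a cluster's labels. This is a simplified stand-in; the real operator relies on Kubernetes label selectors, and the cluster names and labels here are made up.

```go
package main

import "fmt"

// matchesSelector reports whether a cluster's labels satisfy a simple
// "key=value" selector, which is the idea behind picking which clusters
// receive a policy via pgo apply.
func matchesSelector(labels map[string]string, key, value string) bool {
	v, ok := labels[key]
	return ok && v == value
}

func main() {
	// Hypothetical clusters with their labels.
	clusters := map[string]map[string]string{
		"redhat":   {"name": "redhat", "env": "dev"},
		"restored": {"name": "restored", "env": "prod"},
	}
	// Apply a policy only to clusters matching env=prod.
	for name, labels := range clusters {
		if matchesSelector(labels, "env", "prod") {
			fmt.Printf("applying policy1 to %s\n", name)
		}
	}
}
```

Running a policy against a hundred databases is then just evaluating this match over every cluster's stored metadata.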
And that's a nice way to maintain policies; it'll actually catalog which policies are applied to a cluster as well. So at any time you can look at a cluster and see which policies have been applied to it, and that gives you an ability to know exactly where all your clusters are with regard to certain policies you've set up. And then lastly, pgo clone cluster. That was complicated to write, and it's an example of an advanced database-related workflow. It takes an existing database, creates a complete new copy of it using a replica, waits for the replication to complete, decouples the replica from the master, performs recovery on it, turns it into essentially a fully functioning master, relabels it, and sets up a new service for it. That was an interesting one to write, but it's very useful for making a clone, a copy of a database. It kind of combines backup and restore all in one piece, and the operator is able to do watches on all of that workflow and know whether things have actually completed, whether the replication has finalized, and whether it's actually back up and running. So that's an interesting command that some people will find useful. And those are really the main features of the operator. Again, it's an open source project; you definitely can take a look at it, and there are a few releases of it out there now. We try to do a new release about once a month or every six weeks to add new features over time. What it really is, is a means of controlling Postgres deployments. It gives a high-level abstraction around that, and we think people doing lots of Postgres deployments on Kubernetes or OpenShift will want to look at it for sure, to help make their jobs easier. It integrates tightly with the Kubernetes API.
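The clone workflow just described is a fixed sequence of steps that the operator walks through, advancing only when it observes that the previous step has completed. As a summary, here's a Go sketch that lists those steps in order; the step descriptions paraphrase the talk and are not the operator's actual internal states.

```go
package main

import "fmt"

// cloneSteps returns the ordered workflow for cloning a Postgres cluster,
// as described in the talk: replicate, wait, detach, recover/promote,
// relabel and expose.
func cloneSteps(source, target string) []string {
	return []string{
		fmt.Sprintf("create a new replica of %s", source),
		"wait for replication to catch up",
		fmt.Sprintf("detach the replica from %s", source),
		"trigger recovery and promote the replica to a master",
		fmt.Sprintf("relabel as %s and create a new service", target),
	}
}

func main() {
	for i, step := range cloneSteps("redhat", "myclone2") {
		fmt.Printf("%d. %s\n", i+1, step)
	}
}
```

Encoding the workflow as an explicit ordered sequence is what lets the operator resume or verify progress between long-running watch events.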
So it plays nicely with Kubernetes and will always be in line with where the Kubernetes API is, and it does fairly sophisticated orchestrations like backup, restore, cloning, and policy management. In the future, what you will see here are more advanced security and management features, no doubt, and more advanced templates of what a Postgres cluster actually consists of. It will follow the path of the Kubernetes API too: today it uses third party resources, and in the future it will definitely support custom resource definitions as TPRs become deprecated. There's some information there if you need to reach out and contact me, and where the project is. And Diane, do we have time to do a quick demonstration?

We certainly do. But I have one little question, because you keep talking about pgo clone being a technically difficult command to write. If someone wanted to create another command and add it, what does the work behind that look like?

Yeah, what you would do is get into the project. You'll see where all of the commands are divided out, and where the client has all of the commands in different Golang packages. You would basically develop your own command package, add it to the project, and submit a PR for it. And behind those commands is usually some code you'd add into the controller, so you would add some code there to implement a command. They play hand in hand. On the client side, you're typically interacting with those third party resources, so for some new functions you'd have to add a new third party resource, and then the controller has to have code in it to deal with changes to those third party resources. So there's code you would have to add on both the client and the operator side in order to implement, say, a new command.
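The extension point Jeff describes, a new client command plus matching controller code, can be pictured with a small Go command table. The real project splits subcommands into separate packages rather than one flat map, and these handlers and their output strings are hypothetical; this only shows the general shape.

```go
package main

import (
	"errors"
	"fmt"
)

// handler is the signature a hypothetical pgo subcommand might implement.
type handler func(args []string) (string, error)

// commands maps subcommand names to handlers. Adding a new command would
// mean registering a new entry here on the client side, plus controller
// code on the operator side to react to the resources it creates.
var commands = map[string]handler{
	"create": func(args []string) (string, error) {
		if len(args) == 0 {
			return "", errors.New("create requires a cluster name")
		}
		// In the real client this would create a pgcluster resource
		// via the Kubernetes API, which the operator then acts on.
		return "created pgcluster resource " + args[0], nil
	},
	"show": func(args []string) (string, error) {
		return "showing cluster info", nil
	},
}

// dispatch looks up and runs a subcommand.
func dispatch(name string, args []string) (string, error) {
	h, ok := commands[name]
	if !ok {
		return "", fmt.Errorf("unknown command %q", name)
	}
	return h(args)
}

func main() {
	out, _ := dispatch("create", []string{"redhat"})
	fmt.Println(out)
}
```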
Right now that code is not pluggable. It's not something you would dynamically add code to; right now you'd have to add the code directly into the project. That's how you would extend it today. And the same for the templates as well. You can add to the templates, but in some advanced cases you're going to have to basically submit PRs to the project to get the templates you want added in. That's probably going to change over time as this gets more pluggable and we start taking more advantage of Go's ability to add dynamic modules and things like that.

I can see where people might have custom commands and larger-scale things. You've put the clone in, you've done all the basic stuff there, and I actually can't think of anything else off the top of my head that I might want to do, but I can assure you there are probably DBAs that have specific things, and maybe the templating bits help them out with that as well. It's totally cool what you're doing, and that's the first time someone's explained operators as nicely as you have. And we do have plenty of time for the demo. There are a couple of folks on and I don't see any questions from them, so why don't you go ahead with the demo, and then we'll see if there are any questions after that.

Okay, hopefully everyone can see my screen here. So first of all, let's just create a new Postgres cluster.

But we can't see your screen yet; we're still seeing the questions screen.

There we go.

Perfect. Good.

So, I've entered a command here, pgo create cluster, and I'm going to call this one redhat. By entering that command, I just created a third party resource on OpenShift, and the operator saw that third party resource get created.
And it said, hey, I've got a user that wants to create a Postgres cluster named redhat, and the operator does all kinds of things: creating the services and the Postgres deployments, spinning up the containers, handling persistent volume creation. If I do pgo show cluster redhat, you'll basically see everything that it created. It tells you things like 9.6.3, the version of Postgres it's actually running. It gives you the names of all of the related things, like the deployments, the replica sets, the pods, and gives you the status of each pod; one slash one means it's actually functioning. It gives you the service endpoints. You can now say pgo test redhat, and this will perform the SQL ping. You'll see the first three say it's working, and it prints out the equivalent psql command, which a lot of Postgres people will find useful. Now, it won't give you the password here. There's another command I could show you that will display the password secrets, but it will at least give you the psql command here. The last three are not working because there's no replica defined; it basically tries to ping it and there's nothing behind it, because I haven't scaled it up. What I can do next is create a backup and say pgo backup redhat. What that command just did was create a pgbackup third party resource, and the operator detected that, which causes it to create a Kubernetes job. That job runs the crunchy-backup container; it connects to the redhat database and pulls all of its data to a backup persistent volume claim that gets allocated. All of that orchestration is done with that one command. You can say pgo show backup, and that gives you the status of the backup. It basically says it was submitted and gives you things like the backup persistent volume claim information. You can look at a PVC with pgo show pvc.
And if that's working, it will display basically what's on that PVC. There are some flags you can pass to have it dig down to lower levels as well; you could do that version of it and it will actually peek at a lower level. So it gives you a timestamp there of a database backup that it just performed. Now what I can do is restore a database from that. It's a little wordy.

Yeah, if you can click on the word hide on your meeting-sharing pop-up, it's covering up your...

Sorry about that.

That's okay.

So what this command does, and it's a wordy command, is create a brand new Postgres cluster called restored. When I pass it a backup PVC flag and a backup path flag, that's a clue to this command and the container that you want to restore from a previous backup; that's why you're giving it those paths. And when you say secret-from, it's going to use the credentials from the redhat database; it'll copy those credentials so that it has its own unique set of credentials. So if I run that command, the operator is now off doing all of that orchestration for you. You say pgo show cluster all, and now you'll see you have quite a few clusters. Test was one I created before this demo, but there's the restored cluster, and you'll see that it's still kind of working when it says zero slash one. If I say pgo show cluster restored, it'll just show you that one; now it's up and running. I can say pgo test restored, and basically I'm able to connect to it and ping it. What I can also do is show you a little bit about policies. A policy can be just any bit of SQL, whatever defines a SQL policy for you: creating objects, adding security settings, whatever you want. This SQL will eventually get applied and run on that Postgres database as the Postgres user.
So if you need to switch to different users, you would do that inside your SQL. But I have some policies defined here, and pgo create policy; you have to give it a policy name. Sorry, that just created a policy, and that's what the policy looks like. That's actually a third party resource as well. And now I can apply those policies by just saying pgo apply against a database. So that would look like this.

There's a question Brad's asking, back on the backup stuff: does the backup persistent volume type need to support ReadWriteMany?

It depends. This will work either with a shared volume type like hostPath or NFS, but it also will work in a different configuration where the PVCs are created individually. But yes, you would need it to at least have read-write access, since you're basically creating the backups. I hope that answers your question. But it is interesting in that we're trying to make this work for all the different volume types, whether they're shared ones or things like GCE or AWS volume types where you basically can't share them. You'll see in the documentation where, through some configuration, you can tweak those settings. You can also change the templates behind the PVC if you need to add your own attributes to what actually gets created. So if you wanted to add a storage class or something to that effect, you could add it in the templates that the operator is using. It's a template-driven thing; the operator is reading templates for services, for deployments, and for persistent volume claims as well, so you can get in there, do some fiddling with it, tweak it, and add some attributes. And you can set up defaults in the pgo configuration file that specify things like the size of the PVCs, and I think also the read-write access; you can specify that there as well and override whatever default it's set to. I can apply a policy with this command. I'm sorry.
Let me explain this command. pgo apply policy1 is going to take that policy1 there, which is a CREATE TABLE command in Postgres, and apply it against anything that matches that selector; in this case I'm going to say name equals redhat. And I've got something wrong there, I think. Yeah, it already existed; it's already run that, so it's giving me an error back saying you've already run it, essentially. But if I do it on restored, it gives you output like that, and what happened is the operator just runs that SQL against whatever clusters match up with that selector. So that's a neat feature as well. If I want to scale up one of these clusters, I can say pgo... I can't type today... pgo scale redhat. So that scales up the replica deployment to one, or it sets it to one. Now if I do pgo show cluster redhat, you've got two pods out there, and it says redhat-replica; this one here is basically the one that just spun up. So there are two deployments: one for the master and one for the replica. The master is always set to one, because Postgres is a single-master database. The replica deployment is initially set to zero; you can set that via configuration to whatever you want. If you set it to zero, it's basically just sitting out there waiting for you to scale it up if you need to. Now, for development, a lot of people are not going to need to scale up a Postgres cluster, so a reasonable default for some people will be just zero, but it's sitting out there in case you do need to scale it up. If you take this and run, say, a production system with it, you very likely will want to scale up those read-only replicas. But that's an example of how to scale up, essentially. And lastly, probably the most complicated thing is this clone command.
I say pgo clone redhat and name my clone, say, myclone2. What this command is going to do is create a new Postgres cluster called myclone2 and base it off of the redhat cluster. It's going to do what I'm calling a thick clone, meaning it's going to create a brand new replica, attach it to redhat, and allow the replication to finish. Once the replication is finished and that replica's ready to go, it's going to detach it. So I'm basically doing a backup, but I'm using a live replica to perform that backup. It detaches it, triggers a recovery on it, and creates a brand new service for it, for the myclone2 database, and basically you've just done a thick clone. Now, the operator can watch these things for a very long time. So if these backups or restores take hours, it's okay, because it's just sitting there reading the Kube API, watching events. It's registered to look at all of these Postgres objects, and it will only continue the workflow if it knows that, hey, the backup is finished, so therefore I'll go ahead and trigger a recovery on it. That's the beauty of using those Kube APIs: it's able to do these very long watches and watch for events, so a workflow may take hours or days and it doesn't really matter to the operator. It's just sitting there waiting for things to happen. If I do pgo show cluster myclone2, we should have a... well, I've got something off there. I guess it's just taking a while for the clone to happen, or I had a problem with it, but that essentially would have normally spun up a secondary clone. I must have typed something wrong or caused it to error off. But that's all of the demo that I have today. And I guess, Diane, with that, I'm at a point to wrap it up.

I think that was great. And doing a live demo, if you don't have something go sideways a little bit, it's not a live demo.

Exactly.

Perfect. I think you've answered the questions that the folks had in chat.
I'm totally impressed with the level of automation that's in there now and what it's capable of, and it's a great example, I think, just using Postgres for one. But I think that's a great way to learn about operators and Kubernetes as well. So thank you for taking the time today to do this. And along with containerizing Postgres, you guys at Crunchy Data have done some amazing stuff to make Postgres super awesome on OpenShift, and we really appreciate it. So thank you very much for all your work. Just double checking chat; that was the last question. Yes, we're all kind of excited to see all the Postgres stuff work, and I keep seeing Postgres everywhere. So, good stuff. Thank you very much, Jeff. Cheers.

Thank you.