So, hello everyone. Welcome back. This is, I think, the fourth talk today. Jimmy is going to speak about Kubernetes and deploying Postgres on Kubernetes. He works for SolarWinds. Enjoy.

Thank you. My name is Jimmy, I work for SolarWinds in lovely, very EU-friendly Edinburgh, and I'm going to talk about how you can get your favorite database deployed on Kubernetes. Now, that sounds simple, right? Kubernetes does containers. First of all, how many people here are familiar with Kubernetes? Okay. And who uses it on a daily basis? Fewer. Okay. So the first part of this talk is going to be about Kubernetes itself, so a few of you will be bored because it's things you already know. But the reason I want to give this talk here is to expose the Kubernetes workflow and system to DBAs and people who play with databases, and to see how it can make their lives easier, if at all.

So, what's the motivation for this talk? First: service-oriented architecture and everything it encompasses. Everything is moving in this direction, and Kubernetes is no exception. SOA, including microservices, is the way forward because it decouples your applications from your database, and it decouples the components of your system from one another. It's easier to code in parallel, and it's easier to replace systems when they're decoupled; that's what SOA is all about. And Kubernetes is a perfect example of SOA, because everything is abstracted: you abstract services, you abstract controllers, you abstract deployment methods. Everything is disconnected from the actual code and the actual systems it runs on.

The second reason is that Kubernetes is here to stay. If you remember Ganeti, another doomed Google project: Kubernetes is not a thing that's just going to get abandoned and only be pedaled around by one company.
Kubernetes is supported by the Cloud Native Computing Foundation, and that's a lot of companies and a lot of people behind the effort. It is very well adopted by the community by now, and it isn't going anywhere.

One other reason is so we can have fewer phone calls that wake us up at 4 a.m. Kubernetes automates things so you don't have to look after your systems all day long. You don't have to be watching to see what died: the database died, we need to restart the database. Kubernetes does all that for you. Kubernetes is also free, so you can play around with it at home, and it's very well supported commercially. Red Hat, for instance, supports Kubernetes very well, and you can also buy Kubernetes environments from Google or Amazon. The reason you can do that is that cloud compute and storage are turning into commodities now. It used to be a luxury to offload the running of your systems elsewhere, but now it's sometimes cheaper than actually running on bare metal, so it's becoming a commodity. Why not have your systems orchestrated by you but running elsewhere?

Another reason is that Postgres is hard. Well, at least industrial-strength Postgres is hard. It's hard because you need to think about reliability, availability and resilience. When a cluster node goes down, which system replaces it? Which one's the master, which one's the slave? Is it replicated? Is it backed up? These are all things you can automate using Kubernetes. And the end goal is that you want Postgres to be a commodity for your users. If you're a DBA, or someone who looks after the database in your environment or organization, you want Postgres to be a commodity: you want people to just request a database, get it, and then forget about you. It should just work; they shouldn't have to bother you anymore. That's the end goal.
If it's a commodity, you give it to them, it keeps working, and no administration is necessary for as long as it works.

By no means is this presentation an exhaustive list of the ways to deploy Postgres on Kubernetes, or even a good list. It's a few things I tried out at work and at home, and it's more an attempt to demystify Kubernetes, this magical word, for us database people. This presentation is also not me fiddling around with terminals. If you've been to other Kubernetes presentations, you'll have noticed that people like to type into their terminal, move terminals around, and show you how everything works when it works. But sometimes it doesn't, so let me delete the configuration I left behind when I tried it at home before the demo, and restart... This presentation is not that. It is going to cover: the basics of Kubernetes; deploying at small scale; deploying using Helm charts (we're going to talk about what that is and how it works); deploying using the Crunchy Data Postgres operator; and some observations on the previous methods and in general.

So let's start with the basics. By the way, if you're not familiar with the term, K8s is just a short way to write Kubernetes; it doesn't hold any extra meaning. So, what's a container? A container is a lightweight, standalone, executable package. It's all-in-one, like a mini system that you deploy: it has all the bits and bobs it needs to work, all the libraries, all the executables, and it's the same everywhere. No matter where you run the container, it should behave exactly the same.
This is obviously resource-efficient, because we used to slice off new VMs for everything we wanted to try. Maybe I should try Postgres 11: let's build a new VM. Now let's change this one thing in the configuration: another VM. You eventually run out of resources, whereas containers let you utilize the same resources, whether they're provided on the cloud or on bare metal. It's also platform-independent: it solves the problem of the developer going "but it worked on my machine, why doesn't it work on the server?" If it's exactly the same as on your machine, it should work on the server.

So, Kubernetes is a container orchestrator, written in Go, supported by the Cloud Native Computing Foundation as we said. The things it does automatically are scaling, load balancing, and letting you update your application in a safe and controlled manner. How it does that is through the Kubernetes API. Everything is controlled through the API, and the API is entirely an abstraction: it can be running on bare metal, on some cloud provider, or on your laptop. It doesn't really care; it behaves exactly the same. The API exposes what are called resources (because we don't have objects in Go), and resources are the building blocks that make up the Kubernetes API.

Another thing to consider is the pets-versus-cattle debate. Do you treat your containers as pets that you lovingly take care of, where you would really feel bad if one of them died, or as cattle that you just breed and that are replaceable? We're trying to move away from a world where we took painstaking care of our systems and tried never to let the database fail, because it should be up 24/7 and what happens if it goes down? We need to not care about that, because we know it will go down. So in this case we treat it as cattle: we have many database replicas, and when one of them dies it gets replaced with something else, the service gets restarted. If you ask Kubernetes to run three replicas and one of them dies, it
restarts, and you move from two back to three replicas, and so on.

So let's look at a few terms. A cluster in Kubernetes is made up of a master node, which runs the API server (our interface to the whole of Kubernetes), and some worker nodes. Worker nodes run the kubelet, which monitors what are called pods, and pods are the basic units of execution, roughly (but not exactly) equivalent to containers. Namespaces are a way to separate your cluster into many virtual clusters, so you can hand them out to separate users and assign resource quotas. You can have a namespace called database, a namespace called sso in the same cluster that lets people log in, a namespace called web where all the web service pods run, and so on, and you can assign different quotas on CPU, disk or memory usage to each of these virtual clusters. The kubelet, as we said, talks to the master node, and a pod is a container or group of containers sharing the same execution environment. What does that mean? It means they effectively think they're running inside the same box, whatever that box happens to be. And why would you want that? You'd want it, for example, so that containers can share a common volume for storage, or so they can have inter-process communication.

More terms: Minikube is a way to run a small-scale Kubernetes environment on, say, your laptop or wherever you want. It's Kubernetes in a VM: you install VirtualBox or your favorite virtualization tool, you point Minikube in its direction, it creates a Kubernetes cluster, and you're good to go; you can start experimenting. Prometheus is the monitoring solution most people use with Kubernetes. It's described as the best fit for Kubernetes because it also originated with the Cloud Native Computing Foundation. It has its limitations that people are trying to work around, but let's not get into that now.
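The namespace-with-quotas idea can be sketched in YAML; the names and limits below are made up purely for illustration:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: database
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: database-quota
  namespace: database           # the quota applies only to this virtual cluster
spec:
  hard:
    requests.cpu: "4"           # total CPU all pods in the namespace may request
    requests.memory: 8Gi
    persistentvolumeclaims: "5" # cap on storage claims in this namespace
```

Every pod created in the database namespace then counts against these limits.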
One other thing I want to mention is custom resource definitions. CRDs are the things you use to extend the Kubernetes API: custom code you write to do the things Kubernetes doesn't do out of the box, and to automate your own stuff. One thing you can build with CRDs is custom controllers, called operators. If you want something to look after your database, you write a database operator that does the specific things you need: you code it in Go to manage your system the way you want, when Kubernetes doesn't do it out of the box, and it manages the application for you. This eliminates the need for hand-rolled automation: you don't need extra sidecars loaded onto your containers to take care of things, and you don't need supervisors keeping track of your pods externally from Kubernetes. You can do it all inside Kubernetes, scripted in Go.

Now, Kubernetes is all about defining things, and the definitions are all in tons of YAML, so if you're involved with Kubernetes you should expect to see a lot of YAML. What you put in these YAML files (I'm not going to show many of them) is, first, the kind of resource. Each resource you put into Kubernetes has a YAML file it's sourced from, and you describe the kind by saying, for example, "this is of kind Pod". Then you put in some metadata: this is the name of this resource, it's labeled "fast disk" or whatever you want to label it with, to group things together using labels. And what you put inside the spec is the desired state for your resource, something Kubernetes should do for you. If you want the state to be three replicas, for instance, you put it in the spec, and Kubernetes will try to keep three replicas running at all times for you. kubectl, or "kube cuddle" as many people say, is the command-line tool you use to communicate with the API: you feed the YAML file to kubectl create, and it creates the resource inside your cluster.
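A minimal kind/metadata/spec definition of the sort described above might look like this; the image tag, label and password are just examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pg-test
  labels:
    disk: fast               # arbitrary label, usable later by selectors
spec:
  containers:
    - name: postgres
      image: postgres:11
      env:
        - name: POSTGRES_PASSWORD
          value: example     # fine for a throwaway test; use a Secret for real
```

Feeding this to kubectl create -f pod.yaml turns the textual description into a running resource.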
So you go straight from a textual description of what you want to an actual resource created in Kubernetes, and you can use kubectl to run commands like "kubectl get pods", which shows your pods.

Now, one of the basic building blocks in Kubernetes is the Service. A Service is a resource you define that exposes your pods through a single stable entry point. If you have a group of pods that you want to operate as a database, with replicas, you put them behind a Service, and the entry point for those pods is the Service. It targets pods through labels: in the definition of the Service you put a selector that says, for instance, "I want the one labeled primary", and it selects the primary and exposes that. The types of Services are: ClusterIP, which answers on behalf of all your pods inside the cluster; NodePort, which exposes the same port for this Service on each Kubernetes node, across all nodes; LoadBalancer, which uses an external load balancer to send traffic to your Services; and ExternalName, which also takes care of DNS. So a Service is a way to route traffic from outside the cluster into your pods.

Deployments are controllers, a type of resource in Kubernetes that takes care of your systems, and a Deployment is basically the controller that automates the running of your application. It enables you to perform upgrades by sending a command to the Deployment, it enables you to roll back to a previous state or a previous version of your application, and it defines the number of replicated pods: you can say "please deploy a replica set of these three exact replicas for me". It also has upgrade strategies, like RollingUpdate, which means we turn off one pod at a time, replace it with the new version, then move on to the next pod, and so on; or Recreate, when we want to take down the entire service and build the new pods from scratch.

Now let's talk a bit about state. A Deployment is mainly useful for stateless applications.
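Purely to illustrate the mechanics above (for a real database you'd want the stateful machinery the talk turns to next), a Deployment with three replicas and a rolling-update strategy, plus a NodePort Service selecting its pods, might be sketched as:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: pg
spec:
  type: NodePort             # opens the same port on every node
  selector:
    app: postgres            # routes to pods carrying this label
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pg
spec:
  replicas: 3                # desired state: three identical pods
  strategy:
    type: RollingUpdate      # replace one pod at a time on upgrade
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:11
```

The names, image tag and label here are illustrative, not part of any particular chart or product.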
It doesn't matter which pod your request lands on; they're all identical, and since there are no locally stored differences in each pod, any one of them can die and be replaced by another with no consequences. That's basically what stateless means here.

Stateful is much closer to our use case when we're talking about databases, because we need persistent storage: we need something that can die and come back up with the original data it had in it, because otherwise what's the point? We also need to differentiate between nodes, because you have different data on your master and different data on your slave, or replica. So you need stable tracking of your storage, you need stable network identifiers for these pods, and you need to deploy them in the correct order, because what if a node doesn't know whether it's a replica or the master? The StatefulSet is essentially what does all this for you. A StatefulSet defines a set of replicated pods, what are called persistent volumes, and what's called a headless service, which takes care of networking and DNS for you. It also contains persistent volume claims. A persistent volume is storage you define as available for use, and a persistent volume claim is something that uses that storage. It's the same as with CPU and memory: pods use up CPU and memory, and persistent volume claims are, in a way, the pods' usage of your storage resources.

If you delete a StatefulSet, there are side effects. It does not remove the persistent volumes, so if you take down the service, it doesn't delete the data, which is the safer behavior. It also doesn't guarantee that the pods in the StatefulSet will terminate gracefully, so what you should do is scale the StatefulSet down to zero pods first, and then delete it.
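A sketch of the pieces a StatefulSet ties together: a headless service, the replicated pods, and a volume claim template. Names and sizes are illustrative, and the Postgres-specific configuration (passwords, replication setup) is omitted:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: pg-headless
spec:
  clusterIP: None            # headless: each pod gets a stable DNS name
  selector:
    app: postgres
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: pg
spec:
  serviceName: pg-headless   # the headless service providing identity
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:11
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:      # one PersistentVolumeClaim per pod;
    - metadata:              # survives deletion of the StatefulSet
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Pods come up in order as pg-0, pg-1, pg-2, each keeping its own claim, which is exactly the stable identity a database needs.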
So let's look at deploying Postgres at small scale in Kubernetes. First, there is an image: you can build your own, or use an existing one. The official image is the Docker community "official" image maintained in the Docker library; it's what you get when you run docker pull postgres, and it's the most basic image you can pull. Then there is the Bitnami Docker image, which has a few interesting twists: for example, there's no root access inside the containers. And Crunchy Data also provides containers. There are many different container images; they're really easy to find if you search on GitHub.

So how do you deploy this? You create a ConfigMap for the configuration values. A ConfigMap is like a fake volume that you can mount onto your pods, and it only contains configuration. It's not very interesting: you describe the kind, the metadata and labels, and then some data you need, the values you want your containers to pick up, like the database user and password. The next step, as we said, is storage: you create a persistent volume, and a persistent volume claim for your database that will use those resources. Then you create a Deployment that describes how to use this container image and this persistent volume, and you create a Service to expose it. The simplest case is a Service of type NodePort, which just opens a port on the node that's running Kubernetes. I'm not going to go into the YAML of it, because you can search for it: Kubernetes has good documentation on how to write these things, and there are millions of examples on GitHub. The last thing is to connect to your database, which you can do through the exposed port on the node, or by using kubectl port forwarding and connecting on localhost.
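The first two steps above (ConfigMap plus claim) might be sketched like this; the keys and sizes are just examples:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pg-config
  labels:
    app: postgres
data:                        # values the containers will pick up as env vars
  POSTGRES_DB: mydb
  POSTGRES_USER: myuser
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 5Gi           # claimed from whatever storage the cluster offers
```

The Deployment then references pg-config (for example via envFrom) and mounts the pg-data claim at the data directory.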
Another method is Helm charts. What is Helm? Helm is essentially a package manager for Kubernetes. helm is the client you run the commands with, the same as you would with apt-get: you run the commands to install and remove things through helm. Tiller is the server-side component, because you need something inside the Kubernetes cluster to receive these commands, and you have to have Tiller running to respond. Charts are the descriptions of the packages, and, you guessed it, they're also YAML. They describe a set of related resources, so you can group everything we mentioned (persistent volumes, services, deployment definitions) into one chart, and that will create all of them at once, thereby giving you an application package you can use in Kubernetes. You can also customize a chart prior to deployment: there's a file called values.yaml that helm reads as it creates your deployment, and you can add additional files to your charts that get loaded into your containers as they're deployed on Kubernetes.

Our use case for Helm with Postgres is a one-stop installation of a database, where you can also request replicas, all with one helm install command. It makes sense to be able to deploy with a single command without configuring anything. The chart I'm going to go into is contributed by Bitnami, and it uses their own image repo, but you can substitute any image you want and it will still deploy. The installation is dead simple: helm install, give it a name, optionally give it some values (configuration for your Postgres instance, which is all documented), and then specify the name of the package, stable/postgresql. "Release" in this context means the name of your deployment: my-database, or the sales database, or whatever you want to name it; it doesn't have anything to do with release numbers. And the output of this command will be some magic you can run to connect to your newly installed instance.
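As a sketch of the customization step: the exact keys depend on the chart version, so check the chart's documented values, but a values.yaml for the Bitnami stable/postgresql chart of that era looked roughly like this:

```yaml
# values.yaml -- key names vary between chart versions; check the chart docs
postgresqlUsername: myuser
postgresqlDatabase: mydb
replication:
  enabled: true              # create read replicas alongside the master
  slaveReplicas: 2
metrics:
  enabled: true              # start the Prometheus exporter sidecar
```

You would then deploy it with something like: helm install --name my-database -f values.yaml stable/postgresql.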
And as we said, you can already provide files that will be fed into the installation, so you can provide your own postgresql.conf or pg_hba.conf. You just put them in the files folder, because a chart is a collection of files in a folder, and anything you put in files will be mounted as a ConfigMap and used by the pods you deploy. This is the output when you run it: it says "I created these resources for you, these services, this is the IP they got, it is a StatefulSet", and right now the default configuration only creates one pod to connect to. At the time of this output you can see it wasn't ready yet, because Postgres... sorry, Kubernetes does eventual consistency. You describe the state that you want; it doesn't guarantee it will get there in milliseconds, or instantly, or block until you get a response. It takes care of things for you, it is eventually consistent, and it brings the system to the desired state. And it gives you some magic: this is how you will access your instance from within the cluster, this is how you get the password, this is how you can connect to your database through port forwarding or through the cluster.

Now, the internals of this particular Helm chart. It creates, as we said, a StatefulSet with only one replica, so you have one database; it creates a headless service; and a persistent volume claim to claim storage from your storage provider. You can configure this chart to load your own custom scripts, the things you always ran when you installed a fresh Postgres instance: just put them in the files folder and they will get executed for you as the instance comes up. It can also start a metrics exporter for Prometheus. The postgres_exporter project on GitHub has a few definitions of exported metrics, so if you install this on your pod, it will send these to the Prometheus instance in your cluster and you'll be able to get performance metrics and other things from your pods.
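Custom queries for postgres_exporter are themselves YAML. As an illustrative sketch (the query and metric names here are mine; the format follows the project's documented examples):

```yaml
# custom-queries.yaml for postgres_exporter
pg_replication:
  query: "SELECT CASE WHEN pg_is_in_recovery()
            THEN EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))
            ELSE 0 END AS lag"
  metrics:
    - lag:
        usage: "GAUGE"
        description: "Replication lag behind the primary, in seconds"
```

Each entry becomes a metric (here pg_replication_lag) that Prometheus can scrape and alert on.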
For example, it can export pg_stat_activity, pg_stat_replication, or even custom queries that you write yourself to determine how your pod or your database is doing.

Another chart I found interesting is the ready-made Patroni chart. You can use it directly from the Helm incubator, which is like the staging area for Helm charts. It also creates stateful sets, of a master database with replicas; the default installation in this case is five nodes, each running the combination of Postgres and Patroni, put in the same image by Zalando. The way you install it: you add the repository to helm and say "please get it from here", you update the dependencies, and then again helm install, this time incubator/patroni, and similarly it creates the same type of cluster.

Now let's look at something completely different, which is the operator pattern. With the operator pattern, you have Go code administering your database for you. Helm, being a package manager, is fire-and-forget: you run the command, it installs, you forget about it, and the native Kubernetes controllers like Deployment and StatefulSet take care of the running for you. With an operator you can do more advanced things, because the Crunchy operator knows about databases, and specifically about Postgres. You can find it on GitHub; it's also open source. You can deploy Postgres, again with streaming replication. You can send it commands to scale your databases up or down. You can add additional features like pgpool and PgBouncer to your deployments. You can add metrics, as we saw with the Helm chart. You can use the operator to change policies in your cluster, and to assign labels to your pods, to your databases. You can also perform minor version upgrades, backups and restores. It also has a task scheduler, so you can schedule backups and they will be performed automatically, everything within the Kubernetes cluster. You don't need anything external; you don't need a
backup machine that runs a cron job to do this; it is all done internally.

So what you do is: you git clone the repo and check out whatever version you want (the latest one is a bit risky, but some of them are quite stable). You then give it the required environment for an example deployment: you set up an example namespace called demo, and you can find the configuration of the operator in its YAML file. The next step is to give the operator permissions on the cluster to create the resources and controllers it needs, and then "make deploy operator", a convenience target that runs the script that actually deploys it for you. pgo is the client, the thing you use to interact with the operator once it's been loaded into your cluster. Its capabilities include "pgo create cluster"; in this case a cluster is what we call a cluster in Postgres, a database server that contains database instances. You can give it parameters like --metrics so that it exports metrics right from creation. "pgo show cluster" shows you the state of the cluster and how many nodes it administers, and "pgo scale" can scale it up or down, so you can add replicas or remove them. It can also create PgBouncer and pgpool deployments across your pods, and it knows about pgBackRest, which you can use to create backups of your cluster with a single command. If you don't use pgBackRest, you can just type "pgo backup" and that will do a pg_basebackup of your stuff. "pgo restore mycluster" does what you'd expect. You can also instruct it to perform manual failovers, to see what happens when another pod takes over: "pgo failover mycluster --query" (that should be two dashes, sorry) gets the failover targets, and then you pick one of those targets with --target to fail over to it.

So, a few observations. Deploying by hand: it's easy, it's
good for getting started quickly, and it's easy to find templates: you copy the YAML, substitute your own values in the configuration, and fire and forget. It offers decent isolation, comparable to the isolation a VM would give you, with much less trouble than creating VMs, creating drives to attach to the VMs, working out how to share drives, and so on. It also saves resources, because it can reuse the free memory, disk and CPU that aren't being used on the cluster node. But it doesn't offer any cloud-native advantages: it's convenient, but architecturally it's about the same as a VM if you don't leverage any of the Kubernetes controllers. For production usage, I think it would be a nightmare to have everything deployed by hand, because in order to find out what's happening you would have to dig through all these YAML files, and then examine your cluster's resources and query everything. So it's not ideal, and the point of Kubernetes is to avoid having an army of DevOps or DBAs looking after your stuff 24/7, so it kind of defeats the purpose.

Helm charts are good for one-time deployments: I don't care, I need to deploy this database and forget about it, I can scale it up at will, I can add memory, I can add disk, I don't really have any requirements. Good for fire and forget. It's very clean and transparent, because everything is defined in the YAML: you find the chart, you read it, and you see exactly what it's deploying and exactly what objects it's creating in your cluster. One sticking point is major version upgrades: there is no way to automate that through the Helm chart. You can perform minor upgrades by substituting the image, but when there is binary incompatibility, as in a major version change, you need to handle it manually. Also, slave replicas don't actually fail over automatically unless you explicitly set that up, so out of the box you
won't have an HA cluster as such. But because it's so simple, it gives you the flexibility to carry on using your existing solutions; it's just that the stuff will be running as containers inside a Kubernetes cluster, as opposed to on VMs or bare metal. Another advantage is that it can be used without really special permissions: if you have permission to create resources within a namespace, you can just install Tiller, run helm, and deploy anything.

The Crunchy operator, on the other hand, is, let's say, less transparent, because unless you go into the Go code of the controller, you don't know exactly what it's doing. But it does let you perform many actions through the CLI, and it takes care of a lot of things automatically. A disadvantage is that you need to be cluster admin in order to use it right now: you need to create the role-based access control rules, and you need cluster admin permissions to do that. And because custom resource definitions, the way you extend the Kubernetes API as we said, are not namespaced (they're universal for the whole cluster), you can't just say "deploy this in namespace sales". You need your cluster admin to create them for you, which in a multi-tenant system with hundreds or thousands of tenants can get tedious for the administrator of the cluster. And this is the relevant issue: people were asking for this functionality on GitHub, but the developers said they're not going to look into namespacing custom resource definitions for the time being.

Also, the Crunchy operator is really nifty, but I can't tell you "go ahead, use it, it's fine", nor "don't use it". It's at your own risk, because it's under heavy development: it may not be ready for production, or it may do 100% of what you need. Try it on your own and test it to make sure it's exactly what you want. The caveat is that Kubernetes is also under heavy development: it changes from minor version to minor version
with breaking changes; everything changes all the time.

Observations, continued. A hard problem that is not solved by the things we've mentioned is how to create a Postgres cluster with multiple write nodes. It's not easily solvable on Kubernetes right now. And multi-master is not always a solution anyway: do you really want one database where people from all over the place write with different latencies, and you need to take care of locking for everyone? Is that something you really want to deal with, just to have people writing into the same table or set of tables? But what you can do is take the solutions we mentioned and use pglogical to expose another way to replicate tables and write into remote tables. You don't even need a custom image for that: you can add it, for example, as a post-install hook in Helm, and it will run it for you in your container as it creates the pod. You can just have a command that does apt-get install pglogical, CREATE EXTENSION, and so on, and you can start using it.

Now, what alternatives do we have? You can go for a database-as-a-service, or a whole platform-as-a-service like Heroku, which costs a lot more than the methods mentioned here. You can also go the route of managed databases on the cloud, like EnterpriseDB Postgres that runs on AWS and is managed for you. And then there are the solutions on this slide, which offer more or less Postgres in some sort of guise, or something compatible with Postgres; your mileage may vary. It is what it is: it's a commodity, it's really cheap compared to other managed systems, but you get what you pay for. And you can define all of the above as a Service in Kubernetes and connect them to endpoints. Kubernetes can even orchestrate things that are running outside it: all you need to do is define an endpoint, and you can say that this endpoint, this IP that Heroku gave me, is a database, and it's part of this cluster.
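That endpoint idea can be sketched as a selector-less Service paired with a manually defined Endpoints resource; the IP below is just a placeholder:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-db          # no selector: Kubernetes won't manage the endpoints
spec:
  ports:
    - port: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-db          # must match the Service name
subsets:
  - addresses:
      - ip: 203.0.113.10     # the externally hosted database
    ports:
      - port: 5432
```

Pods inside the cluster then reach the external database at external-db:5432, just like any other Service.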
So, thank you very much. Any questions?

The first question, a hard one: why didn't you mention the Zalando operator, and Patroni?

I didn't have time to look into it. There is also the Zalando operator, which I haven't had the chance to try yet.

Another interesting question: is there any proxying layer in the Crunchy Data operator, a proxy for queries that can handle some ingress point, on or off the node?

I'm not aware of that; I haven't really tried it.

Okay, another question: can I use local storage instead of a PVC? I mean hostPath?

Storage is totally independent; you can use any sort of storage as long as you define it, with or without a storage class. emptyDir, for example, or hostPath; yes, you can use hostPath.

Another question: what kind of replication can I use? Okay, last question, about backups. It is good to have the database and its backups in Kubernetes. Can I make a backup to S3-compatible storage?

Yes, you can. You can go to any sort of storage you have attached to Kubernetes, because storage is an abstraction in Kubernetes: it can be an S3 bucket, it can be a volume somewhere else, it can be EBS, it can be whatever you want.

Speaking of storage: persistent storage is mostly network storage, I suppose, so is it more susceptible to the fsync bug?

Sorry?

Persistent storage is usually network storage, I believe, so is it more susceptible to the fsync bug?

It depends where your cluster is running. If you've defined it to use network storage somewhere else, yes; if you've defined your node to use local storage that is attached to the node, then it's local. It doesn't have to be over the network, that's what I'm saying.

Any questions from anyone else? Okay, just one question please, because we have just one minute.

You have integration with Prometheus, but what about query analytics? Can I monitor what is going on in my cluster? Is there anything in the operator for monitoring, for query analytics?
Well, they mostly tend to go the way of Prometheus, but you can change the Go code and make it output to whatever you want.