All right, I think that's... okay, the mic's on now. So, I'm looking at the camera now: all the other questions that were asked so far, meaningless — you didn't need to hear those. The question this young gentleman just asked is: where does all the magic happen that allows normal streaming replication? Normally there are SSH keys and all this other stuff involved. How is that happening here? And that's why I said, OpenShift doesn't know any of that. That's not the OpenShift part. OpenShift just knows how to spin up the replicas. It's the Docker image builder, Jeff, who made it so that when they spin up, they can do this kind of stuff. So now, Jeff. Yeah, this is really simple in that there's no SSH key generation required for this type of Postgres streaming replication, because it's using just the Postgres protocol. So what happens is, before a slave connects to the master, all I do is execute pg_basebackup and create that replica for that slave container. At that point I just start up Postgres and it's in replication mode; the master speaks the replication protocol with that slave to establish the replication. So there's no SSH key generation required in these examples. So maybe it's ignorance on my part — I've got to get this part in, just because it's important — but there's no encryption between the master and the replica in this case. And then, wait: Jeff said no, but Josh looks concerned about that statement. You're not using an SSL connection? There is not an SSL connection between the slave and the master; it's just a straight, unencrypted Postgres connection. Yeah. It's not currently doing that, but as soon as Jeff leaves here, he promises me he will make a new Docker container. No, it sounds like an easy fix — you'd basically just have to put the certs on the different machines, right? Yeah.
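To make that bootstrap flow concrete, here's a minimal sketch of the sequence Jeff describes. None of this is taken from his actual image — the replication role name `replica`, the paths, and the `recovery.conf` contents are illustrative assumptions, following the pre-PostgreSQL-12 `recovery.conf` convention:

```shell
#!/bin/sh
# Sketch of the replica bootstrap: clone the master, then come up as a standby.
# Assumptions: a replication role "replica" exists on the master, and the
# master's pg_hba.conf allows replication connections from the pod network.
setup_replica() {
  master_host="$1"; pgdata="$2"
  mkdir -p "$pgdata"
  # 1. Clone the master's data directory over the streaming protocol.
  #    (Commented out here; this is the pg_basebackup step from the talk.)
  # pg_basebackup -h "$master_host" -U replica -D "$pgdata" -X stream
  # 2. Write recovery.conf so Postgres starts as a hot standby.
  cat > "$pgdata/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$master_host port=5432 user=replica'
EOF
  # 3. Start Postgres; it connects back to the master and streams WAL.
  # postgres -D "$pgdata"
}
```

The master never has to know about the slave in advance — the slave initiates everything over the ordinary Postgres wire protocol, which is why no key exchange is needed.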
In the Docker image itself. I believe Postgres gives you that option, but I didn't choose to do that with this example. The way I would fix it is I'd go into the bash script in that container, and where it goes to set up the connection, that's the point where I could tweak the configuration of the replication. By default, this is also using asynchronous replication instead of synchronous. So those kinds of configuration changes are possible either by tweaking the bash script the way I did it, or — I give you another avenue with this — I let you mount a local volume that contains your own postgresql.conf and pg_hba.conf files. So you can even override the entire Postgres configuration using a set of external files. That's just a Docker volume, a config volume, that I give you the option to use: you can mount it and put something in it, and if something's in there, it'll use that; otherwise, it'll use the one that gets pre-generated. The environment variables that Steve pointed out are another way of tweaking the configuration that's generated by default. But you have the option to completely override the Postgres configuration using your own local config files. Yeah. It's not public — by default, the only traffic we route externally is HTTP traffic. Yeah. Oh, right, right, right. Typically, you would mount any kind of certificate store as a Docker volume so that you've externalized all those keys. And there's one of the volume types that we've also put into OpenShift — it'll probably get pushed back up into Kubernetes soon enough — a secret volume. So you can mount that volume; every pod can make a claim to that volume and get those secrets inside. It's a separate volume claim. So we have that as well. I know your secrets. You know our secrets now? Yeah. We'll leave now, sir. I'm just kidding. So he said something that was inconsequential.
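The "use the mounted config if one exists, otherwise use the generated one" logic might look like this in the container's startup script. The `/pgconf` mount point and file layout are assumptions for illustration, not the actual paths in the image:

```shell
#!/bin/sh
# Pick a postgresql.conf: prefer a file the user mounted into the config
# volume; otherwise fall back to the one the startup script pre-generated
# from environment variables.
choose_config() {
  mounted_dir="$1"; generated_dir="$2"
  if [ -f "$mounted_dir/postgresql.conf" ]; then
    echo "$mounted_dir/postgresql.conf"
  else
    echo "$generated_dir/postgresql.conf"
  fi
}
# e.g.: conf=$(choose_config /pgconf /tmp/generated)
#       postgres -D "$PGDATA" -c "config_file=$conf"
```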
I know you couldn't hear it, but the young gentleman here was also making a suggestion about how we might want to tweak it so that the path was an environment variable rather than hard-coded in the image. I want to keep going. I don't know where we are on time. We have a half hour? That's still a lot of time; we're good. So now, this young gentleman in the front — you couldn't hear it because I didn't repeat it — asked: what happens when you scale up to three? Funny you should ask that question. Let's do that. Go into the screen. You know why I said that? Because I increased the font so that you can see it. There we go. Overview. Ready? Here's how hard it is to scale up to three read replicas. What we're probably waiting for here is that these are Docker images: we have a master, and then we have a bunch of nodes that actually run the Docker images. If a node hasn't pulled that Docker image down already, it's got to pull it down and then spin it up. So we've got two pods running. Or did I just go to two? Did I not go to three? OK, there we go. So now I've got to wait for the scale to three. That's what we're waiting for right there. Uh-oh: "the object has been modified." Oh, there, it's two. I've never seen that error before. Close. Welcome to live demos. I know, exactly. Is there any way in OpenShift to ensure that my replicas end up on different physical hosts? Yes. So the question was, is there a way I can make sure that my replicas end up on different hosts? Now we have three pods running — so that's all it took to get your three running. We'll do the data in a bit. I'll answer your question in a second, Josh; I just want to show one thing. Here are the two that just spun up. They have to now spin up Postgres, right? So I want to show the logs while it's going.
I can't show that it's replicated yet until that's actually finished. The question Josh asked again was: is there any way to make sure they end up on different hosts? And yeah, that actually can be controlled through deployment — actually through the cluster admin, who can set policies for how the deployment of containers happens. It's actually a set of rules. You can have something as simple as anti-affinity: I want you always to pick a node that doesn't have one on it already. So when it goes through all the nodes asking "where should I schedule this?" — this is the scheduling part we talked about, which Docker doesn't handle by itself — you say: I want those three replicas; I've got four nodes; this one's already got a master, this one already has a slave; where am I going to set up the next one? One of the other ones that don't have it. That's a simple anti-affinity rule. But you can also set things like: I want them all to be in the same data center, or I want them all in different geographic data centers — so you do an affinity or anti-affinity rule on data centers. You can also say: I want the node that is not on the same host, has the least CPU load, and the most free memory. So it'll go through the nodes and apply rules like that. It should be up now. Did we watch it do it? Yeah. So now I'm on another one of the new ones. Here I am again. Can you see that in the back? They're actually really talking to each other. Here's the master again. Here's where I copy and paste again. Can you see it now? Copy and paste not working? Is it Control-Shift here too? No. Oh, no I didn't. Let me try one more time. Nope. OK, you guys get to watch me write SQL on the slide.
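In the OpenShift release being demoed, this spreading is cluster-admin scheduler policy; in today's Kubernetes the same "don't co-locate my replicas" rule can be written on the pod template itself. A sketch only — the label value is borrowed from the demo, and the exact API shape depends on your Kubernetes version:

```yaml
# Illustrative pod anti-affinity: one replica per host.
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            name: pg-slave-rc
        topologyKey: kubernetes.io/hostname
        # Use a zone/region topologyKey instead to spread replicas across
        # data centers rather than across hosts.
```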
Thank you, pair programming. Love it. And I don't even have to sit next to you — that's the best part. There. Done. So I inserted it into the master. Now this is the part where Jeff gets nervous. OK. There. So that's all set up live. OK, that's enough attention for him; don't reward him for his insubordination. OK, so if I go back to this again, and make this smaller — did I make it smaller enough? One more. There we go. One of the things I really want to point out again here: I would never be able to do this if I was trying to set all this up myself. I might be able to do it after about five days' worth of yak shaving, and I'm sure I'd set it up wrong, and I'd probably give up in the end, because I'm not going to spend five days. Does everybody know what I mean when I say yak shaving? This is a Linux conference. Really? OK. Well, that's why — I think I'm on a BSD list. I'll tell you now, since we have time. Yak shaving is this idea: you want to check your mail at the end of your 100-yard driveway, and it's the winter in Manitoba, so it's freezing outside. So you think, oh, I'd better put a sweater on before I go get my mail. Oh, I don't have a sweater. OK, I'm going to get a sweater. Oh, I can't go to the store — I've got to make a sweater. Do I have any yarn to make a sweater? No, I guess I'd better go spin some yarn. Do I have any wool? No, I don't have any wool; I'd better go get some wool. OK, I'd better go outside and shave my yak. And so you spend the whole day shaving your yak, and you still haven't gotten your mail. Which is a typical problem for developers a lot of the time: I really just want to write an app that has a database and a web tier, and I want the database tier to be replicated.
OK, well: spend five days setting up a Linux VM, and then you start playing around with installing Postgres, and then with how you're going to replicate it, and five days later you're frustrated, and your boss is coming to you saying, where's that project I wanted? And you're like, well, I'm still mucking around with Postgres. And that's when they say, use Microsoft SQL Server, because I don't want you spending all this time mucking around. And then you're sad, and then you quit your job, or you become an alcoholic. Yeah. OK: when you created the other replica, did it copy the data from the first replica or from the master? So the question was, when it created the other replica, did it copy the data from the master or from the first replica? No, this is typical master-replica replication: the new replica comes online, it checks in with the master and says, where should I be? — and then it proceeds to read from the master and make itself like the master. Is that correct? That was just a fact check from the guy who asked me. I like that. Yes. We were not going to get into that other inferior database whose name begins with an M and ends with SQL. We're not even going to mention that unholy — no, it's a great project. I've been a PostgreSQL user for 10 years now, and I understood why people used MySQL, because it was really easy to set up. This is not Red Hat speaking at all; this is just me personally. I just never saw a reason to use it. I wanted replication, I wanted foreign keys, I wanted transactions, and I wanted stored procedures, and I wanted triggers, and I wasn't willing to give that up, and it was never so much slower that I needed to give it up. I'm not at Google scale, or whatever scale everybody thinks they need to be at. Yeah, any other questions? What happens if I lose the master? In this scenario?
A bad thing, right? Or if it fails for a short amount of time — a network disconnection? Nothing like that ever happens in the real world. So the question was, what happens if the master fails, or the master gets disconnected? Jeff? Yeah, right now there's no automatic failover happening for you. You're going to have to, as a DBA, get in and trigger failover to one of those slaves. Wait, but the master would re-spin itself back up. Yeah, that's true. So if OpenShift can see that it's down — and there are lots of different kinds of "down"; we used to say "going down" and that used to mean something very particular about what it meant that the database was down, but there are actually lots of different down states — if the OpenShift cluster, or the Kubernetes cluster, can't see it, and the replication controller says there's supposed to be one of these and it's not up, it will spin up a new one if it can't see it at all. We also have liveness checks, like a readiness check. I don't know if you've defined one, but OpenShift actually has the ability to say: hey, I know the container's actually running, but are you actually working? So you can define a probe that the cluster will keep hitting, asking: is it working? Is it working? If it's not working, it'll spin up a new one and bring down the old one. But I don't know what will happen if I bring up a new master with no data in it. What I'm assuming is going to happen — because this is the problem with using an emptyDir volume — is there'll be no data. If we were using some other data directory as the PGDATA, it could come up with the old data; with emptyDir, it comes up empty. Sorry? Yeah, and then you wipe out all your data. So this is not a production setup for anything that actually matters. Wait one second — yeah, Josh.
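A probe like the one described might be defined on the Postgres pod roughly like this — a sketch, not taken from the demo; `pg_isready` is the stock PostgreSQL client utility for exactly this check:

```yaml
# Illustrative liveness probe: the cluster keeps asking "are you actually
# working?", and replaces the container if the answer is repeatedly no.
livenessProbe:
  exec:
    command: ["pg_isready", "-h", "127.0.0.1", "-p", "5432"]
  initialDelaySeconds: 30   # give pg_basebackup time to finish first
  periodSeconds: 10
```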
What are we talking about on Sunday? So on Sunday, Josh will be talking about automatic failover — with containers? Ah, yeah, with containers. So Jeff should attend, and I can give a better talk next time. I'll let Jeff answer that. One thing — that's a great question, best question of the day — in that it depends on how the master fails. If the node, the physical server that you've deployed to, let's just say, goes up in flames: at that point, you need to have made sure that your slaves, your replicas, are running on separate physical hosts, and you're going to have to make a decision. Can I restore that server? If so, when you start Kubernetes back up, it'll try to restart the master, and things will come back. But if the server that it's on is completely toast, then you'll probably want to trigger one of the slaves to become the new master. You're going to have to do that manually, currently, with the way these containers are constructed today. So you as the DBA would have to say: OK, I want that one to become the new master. Or you're going to have to go to a backup, create a new node somewhere else on another server, and call that one the master. But when you do, you're going to have to reconfigure this cluster pretty manually at this point. I'll show you another project here soon that does this for you to a certain degree. But that's a great question, and it's one of the downsides to using the emptyDir volume type: it sits on that physical server, and that emptyDir volume is managed and provisioned by OpenShift on that server. The upside is high performance — you're directly accessing local disk. If I make the decision to deploy all of this on, say, NFS volumes, then all of my Postgres performance deals with the latency of that network storage.
So that's the architectural quandary you need to resolve when you deploy this: am I deploying for ultimate I/O performance? If so, I'm probably going to use emptyDir, and deal with losing a server — here are my procedures to re-initialize this cluster somewhere else. Or, if performance is acceptable using whatever network volume type Kubernetes supports — Amazon or Google Cloud storage are both options — then you'd probably go there, because now you have a lot more resiliency. So that's a great question, and it's one where, when I was writing these containers, I couldn't answer it architecturally for everybody, because everybody in this room has a different problem they're trying to solve. But I did want to make sure that I could make the emptyDir volumes work, because I know a lot of people are going to say: man, I've got to have ultimate I/O performance. So as tempting as it is to just say "always use NFS" and not have to deal with that problem, a lot of the thinking, at least in this project and another one I'll briefly introduce to you, is that it's using emptyDir for a reason: to give you the highest I/O bandwidth. And it's not a simple failure mode either. The other thing Jeff was talking about, when he asked what happens when one of the others becomes the master: OpenShift and Kubernetes, as soon as that pod dies — because the replication controller says there should be one of these running — are going to spin up a new master right away. And I'm assuming the replicas will try to reconnect to that new master that just came up, and it'll say, "I have no data, make yourself like me," and you'll wipe out all your data. So I'm not showing the ultimate production scenario right here; I'm trying to show something that most of you have probably never seen before.
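The two sides of that trade-off look like this in volume terms — an illustrative fragment, with a made-up NFS server and export path:

```yaml
# emptyDir: node-local disk. Fast, provisioned by OpenShift on the node,
# but the data dies with the node (the failure mode discussed above).
volumes:
- name: pgdata-fast
  emptyDir: {}
# Network storage: survives the node, but every Postgres I/O pays the
# network latency. Server and path here are hypothetical.
- name: pgdata-durable
  nfs:
    server: nfs.example.com
    path: /exports/pgdata
```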
Right — and to get some of the advantages of that kind of architecture, more work would have to go in, and you could engage the Crunchy people and the OpenShift people to figure out a better way to do that. Wait, there was a question in the back, and then I'll get to you. Is it going to — no, the pod will have a completely different name, but what stays the same is the service name, and that's why the replicas are actually reading through the service, rather than connecting directly to the pod. So it will work, because it's talking to the service, and because you put the service in front, it doesn't care what the actual pod name behind it is. Does that make sense? Yeah, everything uses the service name, and the service name is the IP entry point through which everything can find everything. So in this case, whenever I create the replica, I'm passing in the master service name — and if you look inside the pod... I'm going to show them right now. OK. Docker will actually always give you a service-name-to-IP translation — actually, no, Kubernetes does. So Kubernetes and OpenShift insert it — is it going to be "postgres", or is it going to be "master"? Well, just cat out /etc/hosts and you'll see something interesting there too. Right, now I'm in one of the replicas, and I probably have to make this bigger. Come on, can you see it? OK, so as he said, the slave is actually probably using that environment variable to get the IP address of the master, rather than hard-coding it in, right? Am I right? It's using PG_MASTER_HOST as the connection host. Yeah, two down — this one? Yeah, it's using that. And you're like, OK, how does that get resolved? So cat out /etc/hosts. I nearly had a heart attack on that one. Do you see it down there?
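For reference, the stable name the replicas connect through is just a Kubernetes Service with a label selector. This is a reconstruction of what the demo's master service would roughly look like, not a dump of the actual object:

```yaml
# Sketch of the master service: pods connect to the name "pg-master-rc";
# the cluster gives that name a stable IP and routes to whichever pod
# currently carries the matching label.
apiVersion: v1
kind: Service
metadata:
  name: pg-master-rc
spec:
  selector:
    name: pg-master-rc   # label on the master pod
  ports:
  - port: 5432
```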
I'm not seeing what I was expecting to see there, but anyway — I keep forgetting what Kubernetes is doing for me versus Docker, but Kubernetes, through that service, actually maps the IP address. Yeah, and it's getting resolved: that IP address is actually getting resolved by Kubernetes, but I'm referring to it as the connection host name. If I had gone more into the whole OpenShift thing — back in that slide with the gold and blue — one of the hard parts about setting up Kubernetes is that it doesn't include a networking layer; it expects you to bring a networking layer. OpenShift actually brings a whole software-defined network, and runs its own internal DNS to do that kind of stuff. So it's doing it all under the hood. But in the scenario where the master dies, the slaves aren't going to re-resolve, they're just going to reconnect, right? No — the IP of the service stays consistent. Remember what I said: the IP address for the service never, ever changes unless you delete the service and bring it back. Service IPs never change, which is why you're talking to the service, not directly to the master. All of your traffic goes to the service, and under the hood, Kubernetes knows to reroute it to the pod that's actually behind it. The IP for the service never changes — that's the benefit of this framework. Your containers just don't have to care about pods going up and down. The drawback, in this case, is that you can end up wiping out the entire database. Steve might have mentioned it, but the service for the slaves is actually a proxy, so it implements a simple round robin. So your app, when it connects to that service and makes a request, just gets round-robined against however many slaves you have. That's the other point.
So this is the last part of the demo I want to talk about, because I want Jeff to show off his stuff as well. Go smaller. Thank you. So this is for the app developers, or even for the DB admins who talk to their developers, which is a few of you. The other nice part about this architecture is — remember when we talked about the web tier talking to this? Oh, I promised you an answer to a question first. Sorry. So the question was: if I promote one of the replicas to master, how do I get it to respond to the master service? Kubernetes uses labels for everything, right? So if you look at the service right here — I'll click on the service — and we look at the selector, the selector for that service says: route to anything where name equals pg-master-rc. So the service knows what to route to based on that selector. Anything that carries the label name=pg-master-rc, that service will start routing to. So all you have to do — and you can do this in the version I'm showing here — is oc edit, and you can edit labels. If you go back and look at your pods here, these three pods, if I click on these, let's see if it shows me the labels right here. Nope, they must be under annotations. Maybe. Where are the labels? All right, let me — oh, there, there are the labels. Sorry, it's the new web console; the labels are actually all in blue at the top. So this one has the name pg-slave-rc. All you have to do is edit that label to pg-master-rc, and automatically that service starts routing traffic to this pod. You can manually edit this and do it. Without restarting the container, without doing anything else, you can redirect traffic just by changing the label.
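The mechanism is worth pinning down: a service routes to exactly the pods whose labels match its selector. Here's a toy model of that matching (pods encoded as hypothetical `podname:key=value` strings — real selectors match structured label maps, not substrings):

```shell
#!/bin/sh
# Toy model of label-selector routing: print every pod whose label string
# contains the service's selector.
route_targets() {
  selector="$1"; shift
  for pod in "$@"; do
    case "$pod" in
      *":$selector"*) echo "${pod%%:*}" ;;
    esac
  done
}
# route_targets "name=pg-master-rc" "pod-a:name=pg-master-rc" \
#               "pod-b:name=pg-slave-rc"   # -> pod-a
```

With the real tooling, promoting a slave is a one-liner along the lines of `oc label pod <pod-name> name=pg-master-rc --overwrite` — the service re-routes immediately, no restart needed.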
So that's one of the other great benefits of this kind of architecture. Does everybody understand that? And, or, let's say you have a misbehaving slave. It's working, but it's throwing a bunch of errors, and you go to the terminal and see all these errors. So OpenShift doesn't want to kill it, but you don't want to delete it either, because you want to look at what went wrong. You can just change the label from pg-slave-rc to pg-slave-rc-busted. And what will happen is: the replication controller will say, oh, I don't have three of those, I only have two — I'm going to spin up one more. And the service will say, oh, I don't want to route to this one, because it doesn't have the right label. But it'll still be up there and running, not dead, so you can go and do all your diagnostics. If you fix it, you can just change the label back again. And, OK, so the cool thing — right, I was showing the cool thing. It's not actually even a cool thing to show; it's just something that naturally happens. When you're writing your application, it's very easy now: if your application is doing writes, the only service it talks to is pg-master-rc. I didn't have to do any weird configuration. I don't have to tell my programmers anything different. I just say: this is your write service, and this is your read service — all the reads should happen off pg-slave-rc. For me as a developer, that's just an easier way to think about what I'm doing: I read from the slaves and I write to the master. Those are the two things I talk to, and I don't have to worry about anything changing under the hood, or how many replicas are up, or anything like that. I know pgpool can take care of a lot of that stuff, but here, even if you choose not to use it, it's there.
And I don't have to care how many replicas are behind it. And I don't have to make my master do any reads at all, so my master can handle writes exclusively. Which is the preferred architecture, right? Yeah. So just out of the box, we get that preferred architecture, and we make it easy for the programmer to work with it. What? Why did you warn me? Man, you're on a roll — you're covering everything I was going to talk about anyway. Now, seriously: two minutes, then we're done. I'll just mention a few more things. Go. I'm about two weeks out from having a pgpool container that will work with this, by the way, pulling that from this other project called Crunchy Postgres Manager. So in about a two-week timeframe you'll have a pgpool container that goes along with this. The idea there is you'd deploy that into this environment; it'll know who the master service and the slave service are, and it'll configure pgpool for you so that it does smart load balancing. So your apps, at that point, would have one entry point instead of two. And that's actually going to be pretty useful for people. Also, backup and restore capabilities — people will ask about that. That's probably another one where I'm pulling from another Docker project, again the CPM project. It's a simplistic backup and restore: you can run a backup against this, and it'll create a network volume that contains that backup. I have to use a network volume in that case because you want your backups to be long-lived, and you can deal with any slowness from network storage there. The restore basically fires up this Crunchy PG container; it looks for some restore flags, and which restore archive you want to build your new container upon.
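The read/write split described above boils down to two stable connection strings for the application. The service names come from the demo; the port and database name are assumptions for illustration:

```shell
#!/bin/sh
# All writes target the master service; all reads target the slave service,
# which round-robins across however many replicas currently exist.
dsn_for() {
  case "$1" in
    write) echo "host=pg-master-rc port=5432 dbname=app" ;;
    read)  echo "host=pg-slave-rc port=5432 dbname=app" ;;
    *)     return 1 ;;
  esac
}
# e.g.: psql "$(dsn_for write)" -c "INSERT INTO t VALUES (1)"
#       psql "$(dsn_for read)"  -c "SELECT count(*) FROM t"
```

Scaling replicas up or down changes nothing in the application: both DSNs stay the same, because they name services, not pods.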
And it'll basically pull down that archive to the local emptyDir storage and start up Postgres using that. Those are another couple of containers that you'll see on this GitHub project pretty soon. And those are just what I'd call basic features to get started. Allowing for a simple failover is definitely something I've been thinking about how to do — maybe exposing a trigger-file location via a Docker volume so that you could trigger it that way, and maybe some utilities that would help with managing some of the state. Some other points: this runs on CentOS and RHEL currently, so it's up to you which one you want to use; I supply Dockerfiles for both. It uses the RPMs from the PGDG repo currently — that's where it gets its Postgres bits. Crunchy itself has our own repo, and we're looking at maybe offering that version up as well, for a more secure or supported version of Postgres, so you could run these containers using that in, say, an enterprise scenario where you needed support. Another thing: I put PostGIS and pgRouting in here by default, so the size of this container is pretty hefty by Docker container image standards. The reason is there's a whole bunch of stuff in there that I just said, let me throw it in. If people need it, great, it's there; if not, they can pare it out just by chopping it out of the Dockerfile if they want to shrink down the image size. But I haven't found a case where an image size between 300 MB and 500 MB really matters, for anything I've noticed. What else was there? I could probably talk for an hour just on the Postgres UID and GID configuration settings in OpenShift. The bottom line is that Postgres needs and requires to run as the postgres user. Out of the box, OpenShift generates random UIDs. That's a problem if you just start this container up: Postgres will die, saying, hey, I need a postgres UID to run under.
It can't run under some dynamically generated number. So there are different strategies for that. That's a security measure in OpenShift, but there are configurations in OpenShift that allow you to totally change that behavior, and say: no, OpenShift, run all these containers as any user. You'll see in the Dockerfile I wrote where I specified the postgres user to run this under. And also in the Dockerfile, you'll notice that Postgres is the last thing that runs, and it runs in the foreground, so it hangs there. The reason for that is that if it ever dies, Kubernetes will know the entire pod has died and try to perform a restart. And that's just another thing that you'll notice if you look in detail. Quick question? Not in this example. What I do is start up a bash script — I think it's start.sh — and in there, it eventually makes its way down. So if you look, PID 1 is actually that bash script. Now, some Postgres examples don't work that way at all, but I'm still using a bash script in this example to set up some of these configuration options, and then the very last, hanging process — essentially Postgres — keeps that script alive so it won't exit. I don't have time today, but there's a whole other project here. There are a couple of parallel tracks at Crunchy in terms of Docker work; this has certainly been a big one — we wanted to make a way for people to run Postgres in an OpenShift container world, a Kubernetes world. There's a whole other project I did called CPM, or Crunchy Postgres Manager, that has a web UI and is Docker-only, so it does its own level of orchestration. And it's for people who want to implement a purely on-premise Postgres-as-a-service. It uses Docker Swarm, and it's using Docker volume plugins and things like that, to do sort of what you saw today, except through a point-and-click web UI.
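That pattern — a named user, a bash entrypoint as PID 1, and Postgres as the final foreground process — might be sketched like this. Paths and file names are illustrative, not copied from the actual Crunchy Dockerfile:

```dockerfile
# Illustrative fragment: run as the postgres user (requires the OpenShift
# security context to permit a named user rather than a random UID).
USER postgres
COPY start.sh /opt/cpm/bin/start.sh
# The bash script becomes PID 1 inside the container.
CMD ["/bin/bash", "/opt/cpm/bin/start.sh"]

# start.sh does the configuration setup, then ends with a foreground,
# blocking Postgres as its last command, e.g.:
#   postgres -D "$PGDATA"
# If Postgres dies, the script exits, the pod dies, and Kubernetes
# restarts it.
```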
And there are certain things I can do in that environment, because I can control the entire host topology, that I can't do in OpenShift. So I encourage you to look at that project as well. In that one, for instance, I can do a backup and restore with just point and click, and I can also do restores from things like pg_basebackup. I can run pgBadger on any container. I can start up a predefined profile of clustering, and I also built in Postgres metrics collection using Prometheus. So it includes a whole lot of enterprise functionality as well. We have a slide on it with the link to the Git repo as well, isn't there? So just to wrap up: we're in a time of container explosion. Take this as my gift to the Postgres maintainers right now — except for Jeff, because he already knows it. If you don't know this already: some of the ways Postgres does things, which were great for the world it was built in, are not so good anymore in the world we live in with containers. So it would be nice if we started to think about making Postgres friendlier to containers, and stopped assuming that everybody stands it up the way they used to stand it up before. I had a big argument — well, not with the actual person — about extensions, for example: please stop requiring root or the postgres user to install an extension. That is not the world we live in anymore, in the cloud and container world. I shouldn't need to be root or postgres to install an extension, and it can still be secure from an administrative standpoint. The world is changing, and some of the ways Postgres does things need to change along with it. And I get to say that now because I've gotten away with it so far. Kubernetes and OpenShift bet big on containers, so it's a good time to go ahead with containers with Postgres, upstream first. All the stuff that you're seeing is upstream, both Jeff's work and Red Hat's work. It's all done in open source first, and you have access to it right now to play with.
Reach out to me on Twitter or any of the other places, or you can reach Steve at redhat.com. You can't talk to Jeff, because he refuses to use any electronic media. And that's it. Thanks, everybody.