Okay, so before we start our presentation, please do us a favor: wake up from the food coma. Come on, wake up. Yeah, thank you.

So, who are we? We are all from Stark and Wayne. Stark and Wayne is essentially a consultancy. We help you and your company be superheroes, succeeding with your Cloud Foundry or PaaS story. We help with everything from infrastructure operations to any kind of automation, building backend services and applications, 12-factor applications; we help you figure all of that out. Basically everything up and down the stack, that's what we do. We fill in any gaps you may have in your services and PaaS, and we integrate all the different pieces. We don't just do that for you, though; we also teach you how to do it yourselves. The approach we use is to partner with you and work alongside you, so that you gain the skills as well, and that greatly increases your speed of adoption of the Cloud Foundry PaaS ecosystem. Put another way, we help you be productive pretty much from the first week.

Okay, so you may wonder: who created this wonderful company? It's this guy, Dr. Nic Williams. He's trying to hide in the background there; he's a little silly like that. Good guy. If you ever meet him, come up and say hi, he'll talk. Next. Who is Xu Zhao? Xu Zhao is a cloud engineer at Stark and Wayne, and she's been working with us on our GE project, which we'll be getting to shortly. Who is Wayne? Wayne is the CTO of Stark and Wayne, so he's my boss. He also gave me my nickname, Kaixinguo, which in Chinese means happiness, someone who always makes people laugh. I guess some of you have already seen that.

I highly encourage all of you to visit our blog. We post about any technology we're exploring, not just Cloud Foundry and PaaS but also PostgreSQL and many other things: Vagrant, VirtualBox, all different kinds of scripting and automation, various languages, whatever happens to be on our mind or whatever challenge we're facing at a particular point in time. We post it there, and comments and discussion are always welcome.

Yeah, so that's about us. Next we'll go to our talk. We don't have an outline here. The reason is that we're going to tell you a story, as a conversation: how this project happened, what its current state is, and what the future holds. So, long, long ago, a princess lived in... okay, I'm joking; you can cut that part.

So, seriously, what is this project? All right, first let me give you some context. Yesterday, our first keynote speaker of the conference, Parag from GE, described the Predix platform, a great overview. Well, GE built this industrial internet, or IoT, platform called Predix around Cloud Foundry, which is a platform-as-a-service, a PaaS. The project we're giving this talk on is the system providing PostgreSQL to that Predix platform; it was displayed as a PostgreSQL tile in his slides during that keynote presentation. Yeah, so if you attended the keynote, you probably already have a good idea of what this project is. But why did we need this project? Will you share more details?

Sure. So, in the beginning there was PostgreSQL. And then, a long time later, along came this thing called Cloud Foundry.
Cloud Foundry is basically an enterprise-grade platform-as-a-service for running applications, Heroku-style if you're familiar with that, or Docker-style if you're familiar with that. So: buildpacks or Docker. A lot of excitement ensued once people actually figured out what the heck it was and what it did for the flexibility and business velocity it enabled. The thing is, Cloud Foundry, as a PaaS, kind of punts on stateful services, which are databases and things like that. PostgreSQL falls directly into that category. So that's why we needed this project: CF doesn't provide it.

Yeah, so how did it get started? Why did they find us to work on this? Well, we had already spent several months helping GE be very successful with building, deploying, and automating everything around the Cloud Foundry-based Predix platform. After several months, we met with them and they asked if we could help them provide a PostgreSQL service to the platform. And of course, as usual: absolutely, that's exactly what we do. Yes, on it.

Following that conversation came the immediately obvious next conversation: let's gather some requirements, let's see what you guys need. Keep in mind that at this point in time the Predix platform itself was under construction and had no users, so no real use cases could easily be fleshed out. Our initial requirements came entirely from the viewpoint of GE's cloud services team, the team that was building out the platform.

Yeah, so next I'll pretend I'm the GE side, and I'll tell him what we require. "Who are your users?" "At first we want to use this inside our own business, for other business units, for example the industrial internet application group. Later on, hopefully, we can provide this service to external Predix customers." "What kinds of workloads?" "If you want an example: OLAP and OLTP." "Okay. How many databases do we need to account for on this thing?" "That's a good question. We don't have real users yet, so we have no idea at this time." "Can you tell us what you actually do view as requirements, then?" "I can tell you two things. First, I want it to be HA, as highly available as humanly possible. Second, I want it to have a DR solution, so you can do backup and restore. Yeah, can you guys do that?" "Absolutely. Great, we'll get started on this."

Moving on from that, we had internal discussions amongst ourselves, a lot of it in my head. We sat down and reviewed the requirements we had gathered. We discussed our architecture, how we were going to distribute things, and how we were going to handle lifecycle and operations management. Yeah, and you'll see there are lots of things outside of Postgres that have to be built in order to support it. It's not just Postgres.

So the first thing we did was ask: based on what we know from that discussion, what are the requirements of this thing? Well, we know it's connecting to Cloud Foundry for the Predix platform, so we obviously have to have a CF service broker. That's the mechanism by which services like databases get connected to applications running in a Cloud Foundry platform.
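To make the broker idea concrete, here is a minimal sketch of the kind of catalog a PostgreSQL broker might advertise to Cloud Foundry. A real broker returns JSON over HTTP (a GET /v2/catalog endpoint, plus provision and bind endpoints); this is a hypothetical YAML rendering, and the service name, plan, and IDs are placeholders of ours, not what the GE broker actually used.

```yaml
# Hypothetical broker catalog (real brokers return JSON from GET /v2/catalog;
# YAML is shown here for readability). All names and IDs are placeholders.
services:
  - name: postgresql                  # what users would see in `cf marketplace`
    id: 00000000-0000-0000-0000-000000000001
    description: PostgreSQL databases on managed service clusters
    bindable: true                    # apps can bind and receive credentials
    plans:
      - name: shared
        id: 00000000-0000-0000-0000-000000000002
        description: one database on a shared cluster, pooled via PgBouncer
```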
We also knew it was basically going to end up being a bit of a distributed backend, because it was going to need multiple clusters in order to run who-knows-how-many databases. And we were pretty sure at the time that there would be different service types. So, again, when we stepped back and asked what they needed: we're going to have multiple clusters running, and we need the CF service broker connecting databases to the applications in CF. We're going to have a basically semi-complex distributed system, so we're going to need a management cluster to manage the thing, and then many service clusters to run the actual databases that the end users connect to. Potentially hundreds of thousands of applications might connect to this thing, so we're probably also going to need some connection pooling, a PgBouncer kind of thing. "Oh, also, maybe it should be easy to scale out." Oh yeah, scaling out: basically, great, we're at capacity on these VMs, we need more; we need a mechanism for doing that. Yeah, that's a pretty good point. Later on we'll show you how we make scaling up and scaling out so easy.

Great, so based on this review of the requirements, we figured out these pieces. For the ability to redirect and load-balance the incoming connections and traffic, meaning the admin API, the PostgreSQL connections, and the service broker API, we used HAProxy at that layer. Then, based on the other requirement, the massive number of connections, we used PgBouncer. The reason we didn't just use PgBouncer at the load-balancer level, since it obviously can do that job, is that we were also terminating the HTTP and HTTPS traffic there for the APIs.

Then there was the whole ultra-mega-insane, "as highly available as humanly possible" requirement. Well, at the time we didn't really have much to go on, but we knew for a fact that BDR was being run in something like a hundred different production environments. There's a soft point to that claim, but I'll get to it later. It has some restrictions, but if you could accept those restrictions, it could provide exactly what we were looking for in high availability. So we went with Postgres with BDR for our service clusters to get the high availability: if any one node went away, we just redirected the traffic. At the time that was 9.4; 9.5 hadn't been released yet.

To communicate amongst the components of the cluster, for the backend system that's managing all this stuff: what we've found historically is that the best system for us to build distributed systems around is Consul. We used to use etcd, and then Consul just provided us that much more, so we standardized on Consul within the company, and we now use Consul for that. If you don't know what Consul is, the brief synopsis is that it's a multi-datacenter-aware key-value store and DNS system. Service discovery. Service discovery using DNS. It also has some failure detection capabilities built in; a very robust tool for how young it is. And we prefer to write our backend service pieces in the Go language, so the service broker and the agents and daemons that run on all the clusters, talking to each other via Consul and the APIs, we wrote in Go.
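As an illustration of the Consul piece: each node can register the services it runs along with a health check, and other components can then discover healthy PostgreSQL nodes through DNS or Consul's HTTP API. Below is a minimal sketch of what one service-cluster node's registration might look like. Consul actually consumes these definitions as JSON or HCL, and the service name, tags, and check command here are our assumptions for illustration, not the project's real configuration.

```yaml
# Hypothetical Consul service definition for one service-cluster node
# (Consul takes these as JSON/HCL; YAML shown to match the other sketches).
service:
  name: postgresql-service-cluster   # resolvable as *.service.consul via DNS
  tags: [bdr, cluster-1]             # invented tags for grouping and routing
  port: 5432
  check:
    script: pg_isready -h 127.0.0.1 -p 5432   # non-zero exit marks node down
    interval: 10s
```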
The reason we prefer Go is that all the pain is handled on the development side; when you go to run the thing in production, you copy out the binary and it just starts. That's a great story for operations, versus having to manage dependencies and install dependencies just to be able to use the tools. There are a lot of great tools out there, like WAL-E, Patroni, other things like that, but they all fall very short on the operations side. They make operations personnel manage a bunch of code, because there are all these dependencies, and you can get into dependency hell if you're not careful, with virtualenvs and such to manage. Why should an operator have to do that? You shouldn't have to. Anyway, rant done.

So, we've talked about all those components; now, how do they all hook together in our project? This is a simple architecture diagram for it. This part is the CF part. Then on top we have the load balancer. You can see that in the management... management... I cannot talk... the management cluster, we have PgBouncer, the PostgreSQL management agent that talks with CF using the SB API, the service broker API, and the admin API, whose worker tasks talk with the service clusters. At this stage, the early stage, we were still using BDR, so you'll see that each service cluster has two nodes: basically two nodes in one BDR group. This is the early stage.

And how do we scale up and scale out? Very easily. Scaling up means, let's say, you run out of disk for your database: you want more disk, or you may want larger VMs. Basically we only change some configuration in the deployment, then click a button to deploy again, and you get larger VMs and larger disks. For scaling out, say originally I planned for 1,000 databases, and now we have 2,000, 5,000; how are we going to handle that? Very easily: you add more VM instances, and that way you can create more user databases. So given this service cluster structure, we can very easily scale up and out. We're not going to show the details here, because that happens in the tool we deploy with, called BOSH. Did I pronounce that right? Yeah, you said it earlier. But if you're curious, it's in the later slides.

Also, internally, we run our own management database. Internally we can live with the restrictions of BDR, given the problems it has: we account for the DDL locks and things like that, and you're good to go. So right now, even today, we're still using BDR in the management cluster. We're re-evaluating that decision because of something we'll get to later in the presentation, but that's where we started. The main caveat, just for your information, is that BDR, for the features we were trying to use it for, set us up for some failures. You want nodes to be able to go away while things still work; otherwise you end up in a state where something is wrong, and it's very difficult to find.

So where does PgBouncer run? There's one on every single service cluster node, as well as on the management nodes, at least in this diagram. So when people connect through, they're not actually connecting to PostgreSQL first; they're connecting to PgBouncer. Right. We went with the most supportive, least assertive pooling setup, if there can be such a thing. Anyway. Good? Good? Okay. Yeah, so next. Ah, yes. So she kind of gave you a little sneak peek.
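To preview that "change a number, click a button" scaling story concretely: in a BOSH-style deployment manifest (which the next part of the talk introduces), capacity is just declared numbers, and redeploying reconciles the running system to match. A hypothetical excerpt, with the group, VM type, and release names invented for illustration:

```yaml
# Hypothetical BOSH manifest excerpt for the PostgreSQL service clusters.
# Scale OUT: raise `instances`. Scale UP: pick bigger vm/disk types.
instance_groups:
  - name: postgresql-service-cluster   # placeholder group name
    instances: 3                       # bump to 10 and `bosh deploy` to grow
    vm_type: medium                    # switch to `large` for more CPU/RAM
    persistent_disk_type: large        # grows every node's data disk
    jobs:
      - name: postgresql
        release: postgresql-service    # placeholder release name
```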
So this thing that we've been describing, with all these components, is a complex series of systems, and it's quite a job. So how the hell do you deploy that in a repeatable fashion? How do you maintain and operate it? That's a big thing, especially since we're not the ones doing the operating: it's run by GE, in multiple data centers. Maintenance: how do you maintain the thing? And, as she pointed out, scaling: you need to be able to scale up and out easily. So we had to account for all of that. And the answer is: how do you do those things? With BOSH. Yeah, it's a word I cannot pronounce correctly. BOSH stands for BOSH Outer Shell. Oh my god, did they really do that? Yes, they did, it turns out; it's a recursive acronym.

BOSH is essentially an infrastructure operations and orchestration platform. So basically, you have your infrastructure-as-a-service. You may be using EC2, you may be using Google Compute Engine, you may be using Azure, you may be using OpenStack, maybe on-premise in your own data center, or in the cloud. BOSH supports all of it. Which means that once we built this system on top of BOSH, we could then deploy this system to all of them. It avoids vendor lock-in. This also allows people to have a hybrid strategy: they can have some of their resources in a physical data center and some of them in the cloud. That's pretty wild. They could have the application on the cloud platform and the databases in the data center. BOSH even has bare-metal CPIs, Cloud Provider Interfaces, allowing you to deploy these things on bare metal as well as on VMs. Highly flexible, enterprise-grade.

You can even use it to deploy the Cloud Foundry platform itself. It was actually written to deploy Cloud Foundry, which is basically a complex set of microservices, about 21 or so, all interconnected across multiple VMs and so on. So it was built to deploy that complex system. Yeah. It's also the main tool we use for all the different kinds of GE Predix deployments. Yes, that's true. The GE Predix platform that was being demoed: all of the Cloud Foundry deployments and the services, like the databases and Redis, are all deployed using BOSH in multiple data centers. Yeah, good times.
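The way BOSH stays IaaS-agnostic, as we understand it, is that the deployment manifest never names EC2 or vSphere directly; a per-environment cloud config maps abstract names like `medium` onto whatever the local CPI understands. A hypothetical sketch of two cloud configs backing the same deployment, with the instance types and sizes as placeholder values:

```yaml
# Hypothetical AWS cloud config: `medium` resolves to an EC2 instance type.
vm_types:
  - name: medium
    cloud_properties:
      instance_type: m4.large
---
# Hypothetical vSphere cloud config: same abstract name, different backing
# resources; the deployment manifest itself never has to change.
vm_types:
  - name: medium
    cloud_properties:
      cpu: 2
      ram: 8192
```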
Okay, so that's great: now we know how we're going to deploy it and scale it, and what kind of tool is going to manage this running beast. Well, there's a further story to this. You want to handle the whole lifecycle of your software, not just the initial deploy but the upgrade path, and BOSH has built in the concept of versions and how to deploy them and so on. But you don't want to stop there. There's a lot of stuff about CI/CD and continuous deployment and those kinds of things. So we wanted to make sure that our deployments and our upgrades were automated: no human intervention, all the configuration defined up front. We wanted to make sure that they were audited. Come on, this is GE; these are the kinds of environments we're building these things for. So we need full audit trails of all the changes to the infrastructure going forward. And we wanted the ability to use workflow: if this passes, then that; if this passed and preceded this state, we get here and that deploys to production, without somebody manually clicking the buttons to make it happen. Those kinds of things.

Yeah, so what do we use for this? For this, we use Concourse. This is another offshoot of the Cloud Foundry effort; somebody from Cloud Foundry wrote this as an open source project. And it's not simply a Jenkins competitor. You know Jenkins and how it works: all your configuration ends up in the web UI. None of that here. Concourse is a far superior system for a few reasons. First of all, none of your changes go through the web UI. You basically describe everything in a YAML manifest, you feed the YAML manifest into the system, and it turns itself into that pipeline. Your tasks run in containers. And you have full workflow ability. To construct workflows in Jenkins, you kind of have to fake it up a little bit. To construct workflows in this thing, you basically just describe the workflow. You have jobs and resources and these different pieces; I'm not going to go into detail, but basically you can construct your workflow however you want it.

So next, and the Concourse pipelines are my favorite, I'll give you a high-level view of how we use Concourse in the GE deployments. So, basically we have two types of pipelines. One is for product development; the other is automated lifecycle management. For product development: let's say many people work on one project, and you make some change, and you want to test that the change works. If you have a pipeline set up, making the change automatically triggers the pipeline, which runs the unit tests and the integration tests, and if the tests pass, it generates a release for you. So basically it automates the flow of product development. So that's for the release itself.

Then, let's say we've already cut a release and we want to deploy it into different environments. For that there's another pipeline, called the automated lifecycle management pipeline. So basically, before you push anything to production, you deploy it into a sandbox and do some tests, then go to production. In this pipeline, they test that your deployment is working, the functionality, and, if there's a different release version, how to upgrade. If you want to push from sandbox to prod, how are you going to do it? Once the pipeline is set up, it's very easy: when you see that it went through the tests, through the sandbox tests, you just manually click one button, and it pushes to the production environment. We made that step manual because we want to double-check that everything is well before we really go to production. So those are basically the two pipelines they use to develop this.

So, again, the two pipelines: the first one keeps the development environment, the QA environment, and the staging environment working in an automated fashion, so that people can just review the results and then tweak things to fix them. Yeah, so given their requirement that we provide this PostgreSQL service for their CF-based platform, we not only built the service itself, we also built the workflow around it: how to upgrade this service and deploy this service more easily. So that's what you see in all this BOSH and pipeline stuff.
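Here is a compressed sketch of what the shape of that second pipeline could look like in Concourse YAML. The resource, job, and task file names are invented for illustration, and the real GE pipelines were certainly larger; the point is the `passed:` constraint plus a non-triggered job, which is exactly the "manual button before production" described above.

```yaml
# Hypothetical Concourse pipeline: auto-test in sandbox, manual gate to prod.
resources:
  - name: postgresql-release          # placeholder: the release under test
    type: git
    source:
      uri: "https://example.com/postgresql-release.git"   # hypothetical repo

jobs:
  - name: deploy-sandbox
    plan:
      - get: postgresql-release
        trigger: true                 # every new version runs automatically
      - task: deploy-and-test         # bosh deploy + smoke tests
        file: postgresql-release/ci/deploy-and-test.yml   # hypothetical task
  - name: deploy-prod
    plan:
      - get: postgresql-release
        passed: [deploy-sandbox]      # only versions that survived sandbox
        # no trigger here: a human clicks this job to push to production
      - task: deploy
        file: postgresql-release/ci/deploy.yml            # hypothetical task
```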
So we worked on this for several months, and, yeah, time flies. After those few months, we had our first version in production. Finally, we had real users! That's our Chinese way of expressing excitement, you know. And essentially, the applications flooded in. That was fascinating, and it happened pretty quickly. Once they turned it on it was like: oh, we've got some databases. Yay! And then the number kept on ballooning, quickly. Scary, but at the same time, yeah.

So at the same time we found that, since we now had real usage, the requirements were changing too. We got user feedback like this: BDR has many restrictions, and actually users are not okay with that. They were thinking: we just need PostgreSQL, no BDR. And at the same time they asked us to add more extensions. As for why users weren't okay with these BDR restrictions, we'll go through the operational issues BDR has.

I guess at this part we start to get into the content that DBAs may care more about. Earlier it was deployment and Concourse, which you may care less about; from here on there's more database stuff.

So, one of the operational issues we had was frequent failures when trying to add or remove nodes from the cluster. If we were trying to add a new node, or a new cluster, or anything else like that: BDR's code is still young, so nodes would get stuck in in-between states, and there did not seem to be, at the time (it may have changed), many ways for a DBA to go in and actually fix up the BDR catalog tables. You were pretty much locked out, which wasn't very fun.

Next: each database in BDR requires extra connections, because it's logical replication, right? If you have three nodes, then every database on each node connects to the two other nodes, which stream the logical changes. So 100 databases running on a node means 200 connections already, before any users have connected. Plus you have to account for the user connections on top of that. Your number of connections really starts to balloon as you load up your servers.
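To spell out that connection arithmetic (our generalization of the numbers in the talk): with per-database logical replication, each database on a node holds a replication connection to every other node, so

$$C_{\text{per node}} = D \times (N - 1)$$

where $D$ is the number of databases on the node and $N$ is the number of BDR nodes. With $N = 3$ and $D = 100$, that's $100 \times 2 = 200$ replication connections per node before a single user connects, and the count grows linearly in both the database count and the node count.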
Then global sequences: they're only sequential on a per-node basis. Basically BDR goes and allocates chunks; you get this chunk, you get that chunk, and if you run out of your chunk, well, you've got to wait for the nodes to agree on the next chunk, and they may not agree. And restores: this is the issue we had with restores; you can't setval on a global sequence during a restore, and we tried to modify our dumps in order to be able to restore databases onto this thing. That got really interesting, I'll just say that, and it ended up as some great late-night entertainment. A lot of those failures come from that chunk issue we were just referring to.

What we ourselves learned from this: BDR is a really wonderful idea, but a very young project. You can be successful with BDR if you are in control of your application code, so that you can account for the DDL locks, and account for the other restrictions, listed and unlisted on this slide. And if you hire 2ndQuadrant for support staff, you will be successful with BDR.

That didn't happen. Which brings us back to regrouping with the customer: what's going on here? We had done a best effort using BDR, from our initial requirements; that's what we were trying to do. It turns out that's not what the users actually needed or wanted. So, taking a step backwards in order to go forwards, we ended up agreeing to go with solo service clusters, a single master to start. What does this address? "Just PostgreSQL." That's what users were expecting: why isn't this plain PostgreSQL, why am I getting this weird error, it's not acceptable. It also addresses the operational issues. Also, we were going to add support for more extensions, which is real straightforward. So of course they said "make it so." Well, they didn't; they don't watch that movie, they're not very nerdy. It turns out we are extremely good at adjusting course and addressing issues that arise. So we did: we made it so.

So you may be curious where this project is today. On the keynote slide it said 3,000 databases; when I last checked there were 5,000; and a little birdie this morning said they were up to around 13,000 databases, across multiple data centers. There are about four data centers; about two to three of them are loaded up and active, and the other two are under construction.

Yeah, and next, about the architecture we talked about earlier: we had a BDR cluster in each service cluster, and now, after the requirement changed to "only PostgreSQL," we have single-node clusters. And yeah, they really know their stuff; they basically helped us figure out where our direction was going next, so we ended up going with this. And even when we changed the structure, the tools we showed earlier for deployment, automated deployment, and releases make it very easy, so you can make this change very quickly without manual changes for each environment. Again, you click a button, and your change is pushed to the different data centers and different environments. Nobody has to go in by hand. You literally just go in there and increase the number of instances for your cluster, say from 3 to 10, and `bosh deploy` goes and spins up the other 7 VMs, and when they come online there's just that much more cluster. They pre-provision the databases, and then they tell the management cluster about the databases that are available, and from there you've got extra capacity. All that stuff: if you need more disk, more CPU, more RAM, you change the instance type, or change the size of the disks by moving to a bigger disk type. Going down technically works too, except you've already filled the disk up, so it doesn't. And then you just say `bosh deploy`, and the platform grows underneath the SQL service automatically. There's more to that story as well. As for where this platform runs: it's running on EC2 for the original deployments, and we also have it running in data centers on vSphere. Those are the two main ones; we run this on other things, but for this, we're using those.
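We don't know the project's actual wire format, but to illustrate that "new node announces its pre-provisioned databases" step, here is one hypothetical shape such an announcement could take as a Consul KV entry. The key layout and every field are our invention:

```yaml
# Purely hypothetical Consul KV value a new service-cluster node might write
# so the management cluster learns about pre-provisioned capacity.
# key: service-clusters/cluster-7/nodes/node-2
cluster: cluster-7
node: node-2
state: available
pre_provisioned_databases:    # unclaimed until the broker binds them to apps
  - d9f2c1
  - a41e77
  - 5c03b9
```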
Yeah, so that's where we are, but we're still working on the following things. The first is that, since we changed to solo, to plain single-node PostgreSQL, we're going to work on streaming replication. The step back was basically to here; that's the step forward. We're also currently working on automated administration for inter-cluster database migration. Because there's also the problem where you have this one user who's storing very large binary blobs in the database, true story, and you have no idea why, and you're not willing to stop them, so you've got to deal with that. So your options are: migrate everybody else off to another cluster, or migrate that user to another cluster. The problem with doing this manually is that you go in there, you do a dump over here and then a restore over there, you use netcat in between to make it real quick and easy, or whatever you have to do. You can kind of automate this as well, with shell scripts and stuff like that, but we said: that's all great, but the way we approach the world, we try to automate everything. So we made an API endpoint that says: migrate this database from this cluster to that cluster. This is also why we have PgBouncer, that load-balancing layer: once the migration is complete, all we have to do is change where the connection points, over to the new cluster. That's what enables this flexibility. And it also helps the backup-and-restore story, because a restore is basically restoring the database somewhere and then changing where the connection points. So yeah: just an API request to the admin API, and the database gets migrated in an automated fashion in the backend. That's a good story; that's what we're working on. And besides this type of migration, we also need to move all the BDR databases onto the solo clusters, because since we spun up the BDR ones first, we already have users in there, and now that we're changing to solo, we're also working on migrating all those BDR databases onto the solo clusters. So that's what we're currently working on.
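For illustration, here's a hypothetical shape for that migration call against the admin API. The transcript only tells us such an endpoint exists; the path, field names, and response below are entirely invented:

```yaml
# Hypothetical admin-API migration call (rendered as YAML; a real admin API
# would more likely speak JSON). Everything here is a placeholder.
# POST /v1/databases/{db-id}/migrate
request:
  source_cluster: cluster-3    # where the oversized database lives today
  target_cluster: cluster-9    # freshly scaled-out capacity
response:
  status: accepted             # runs async: dump, restore, verify,
  task_id: 4211                # then repoint PgBouncer at cluster-9
```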
Would you like to share what's up next? Obviously, the step forward is adding streaming-replication-based clusters, so everything we had as solo before will be migrated onto streaming replication, so we can enable failover stories and things like that, and we're going to use that same API loop to do that migration. That will be nice. And then after that... after that we'll give you a demo. Should we wait for the end for the demo? Yeah, we can do that at the end; we'll quickly show you how to use it. (We are going to run out of time to do the demo at the end.)

An announcement: okay, so GE has graciously flipped the open source bit, and the repositories were made public today. So if you go looking, it's there. If you're familiar with BOSH, you'll be able to take this and run with it. If you're not familiar with BOSH, try it out, ask us questions; we're happy to have everybody play with it, and if everybody plays with it, that would be great.

Okay, yeah, so, the open source stuff: what does this mean for the future? Well, the future is kind of exciting. Right now the state is kept in sync between the service clusters and the management cluster; we're going to move all the state to be stored on the management cluster. We're going to make a plugin-based architecture for the service clusters, and we're going to define contracts, because that opens up some very interesting things. Yeah, and the next one is actually more exciting: that opens up more service cluster types. Once that contract is there, we'll be able to bring up other services. Maybe we want to add Redis, maybe we want to add ELK, maybe we want to add time-series databases, who knows. Obviously my own interest is just Postgres.

The other thing we want to build is a customer self-service dashboard. It would be nice for an application user of these databases to say "back up my database right now, and give me a URL where I can download it," or other administrative tasks. While I like APIs, not everybody does; some people like point-and-click, which is what we want the UI for. Also, identity and secrets are a big issue in services these days, so we're also looking at extracting secrets out into that kind of system. Since this opens things up, more effort can be added, and wonderful things will happen in the future.

But you may wonder how to get involved in this project: give us feedback, build the story with us, send pull requests, or actually work on the project with us by hiring us for what you need done. All the stuff I've described is a possibility; what we focus on next is going to be whatever we or a client need next. So if you have something you want our focus on, we are very much open to that.

So maybe I'll give the demo now? Sure, so she's going to give a quick demo. Let me see... turn me on... yeah, so let me see if I'm connected. So this is one instance we already created for the user, and I'm already connected to this database. What I mainly want to show you is how we can do backup and restore. So inside here you can see we have a table called pgc2016; let me see what's inside this table. So I need the terminal... okay. Basically I just inserted the information of the two of us. When I introduce myself to people I say, "Hey, I'm XJ," and then later people say, "Hey, are you XYG?" No, I'm XJ. And our company... oh, this is our, what's it called, slogan: basically, when you're working with us, you're not just getting the skills and talents of the single pair of people you're working with. We talk internally a lot and share all our experiences internally, so you basically get all thirty or so of us. Yeah, so I tried to be funny and put something in here, because people literally think I'm XYG.

So now let's... okay, go here. Sorry, I need to go here. So this is the command we use to back up the database. I run this, then it backs up the database. Yeah. Now we're going to do something terrible: I'm going to drop table pgc2016. It's hard to see, because it's not on my screen. Now, as you see, we don't have that table here anymore. Now we go back to restore our database. Let me do this one... I'll go to the