Hello, good morning, good evening, depending on which time zone you are in. For the next 20 minutes I'll be talking about Postgres HA with Pgpool-II and what's been happening in the Pgpool-II world. Originally this talk was 45 minutes, but I only got 25 minutes allotted for this session, so I'll have to fly through a few of the slides; bear with me.

Okay, hello, this is me. I'm Muhammad Usama; you can call me Muhammad. I'm a senior database architect with HighGo; prior to that I was associated with EnterpriseDB. I've been working in or around Postgres for the last 12 years, and I'm also an active member and core committer of the Pgpool-II open source project.

Okay, enough about me, let's get on with the actual content. This is the agenda for the talk; it has two sections. First I'll give a brief overview of database high availability and ways to implement it using Pgpool-II, and in the second part I'll go through some of the recent enhancements that have gone into Pgpool-II and talk about the future direction of the community.

Okay, so, database high availability. Let's start with the HA part. Well, if you look up Wikipedia for a definition, it states something like: high availability is a characteristic of a system which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period. So effectively, implementing database high availability means designing the database system in such a way that it is able to sustain almost all types of failures: database failures, physical hardware failures, and any kind of software failure. Another important aspect of database high availability design is to ensure there is no data loss in case of any component failure.

Okay. So what is the recipe to design such a database system? Basically, there are three key principles for implementing high availability. Number one is to eliminate any single point of failure. This is done by adding redundancy, so that the failure of any one part of the system does not lead to the collapse of the entire system. Number two is detection of failures: the second thing we require is a health check, or ping, mechanism to effectively detect any kind of component failure. The third strategy or principle is to implement a reliable crossover. Once we have our redundant system and a mechanism in place to detect faults, the next thing we require is to act upon the alarms generated by the failure detection system, and for this part of the strategy we need a mechanism or a component that can reliably remove the failed component and shift the workload to a healthy one.

I've also listed load balancing as a fourth principle for HA, but that is an optional one. The thinking is: when we are adding standby components, specifically in a database system, why not put them to work instead of having them sit idle waiting for the primary to fail? It's good to have no idle components, but it's not a hard and fast requirement for HA. So load balancing is good to have, but an optional component for HA.

So this is how we will implement each part of the required strategy to build a highly available Postgres cluster. The first thing we need to do is eliminate the SPOF, or single point of failure, for our main resource, the Postgres server, and that is done by installing multiple replicated database servers. There are lots of third-party tools that can do online database replication, like Slony; Pgpool-II also has a built-in replication mechanism, and Postgres has its own built-in replication mechanism as well. So we can create cold standbys, warm standbys, and hot standbys using the built-in streaming replication that is provided by Postgres. Well, no matter which technology we use for creating the standby database, the requirement is just to have a standby database, and Pgpool-II can effectively work with almost all existing replication solutions and technologies. But the go-to choice with Pgpool-II is almost always built-in streaming replication and hot standby, so this is what we use for reference.
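Just to make that concrete, here is a minimal sketch of creating a hot standby with built-in streaming replication. The hostnames, the replication user, and the paths are placeholders; on PostgreSQL 12 or later the -R flag writes the primary_conninfo setting and standby.signal file for you.

    # On the primary (placeholder names): allow a replication user to connect
    #   psql:       CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD '...';
    #   pg_hba.conf: host replication replicator <standby-ip>/32 scram-sha-256

    # On the standby: clone the primary and configure it to stream from it.
    pg_basebackup -h pg-primary -U replicator -D /var/lib/pgsql/data \
        -X stream -R -P

    # Start the standby; it now follows the primary in hot standby mode.
    pg_ctl -D /var/lib/pgsql/data start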
So once we have redundant Postgres database servers, the next piece of the puzzle is to detect failures. The idea here is to continually monitor the health of each component, and as soon as a failure is detected, act upon it to make sure we don't get a service disruption. On the face of it, health checking seems like a simple job, but it is a little trickier than it looks. An effective health check mechanism needs to ensure that it filters out false alarms and temporary glitches, and on top of that it needs to ensure that the failure it is seeing is not its own failure; I mean, that it is not the health checking system itself that is failing. We need to take care of that too. For example, a database health checker pings the server every few seconds to check if the server is available or not, and an unsuccessful ping would mean a server failure; but at the same time, it could be because of a problem with the system the health checker is installed on, or it could just be a network issue between the server and the health checking system. So it is the responsibility of an effective health check system to identify and validate the problems, and it should be quick enough about it; we don't want to waste precious time here.
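To give you an idea of how this looks in practice, Pgpool-II drives its backend health checking from a handful of pgpool.conf parameters. A minimal sketch with illustrative values; it is the retry settings that filter out those temporary glitches and false alarms:

    # pgpool.conf -- health check sketch (illustrative values)
    health_check_period = 10        # ping each backend every 10 seconds
    health_check_timeout = 20       # give up on a single ping after 20 seconds
    health_check_user = 'pgpool'    # placeholder Postgres user for the ping
    health_check_max_retries = 3    # retry before declaring the node dead,
    health_check_retry_delay = 5    #   so a transient glitch does not trigger failover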
After we have a redundant database system and an effective and efficient failure detection system installed, all that stands between us and a highly available Postgres database system is a component that can act on the alarms from the health checker and switch over to a healthy component should any component fail. For our Postgres system, the most valuable component is the database server, and the most critical scenario is when the primary database server fails. So we need a failover management system that is required to do four basic tasks, listed here: promote a standby to primary when the primary Postgres fails; adjust the standby nodes to follow the new primary when the primary changes; seamlessly retire failed standby nodes; and allow new standby nodes to be added without disrupting the system.

So how can we do all that? Here is where we stand: we have a standby Postgres server that can take over in case the primary server fails, but we need a health check system to detect the problem, and we need an automatic failover system to make the switchover. The answer to all these problems is Pgpool-II.

So let's start with what Pgpool-II is. Well, Pgpool-II has been around for more than a decade now, and I'm sure most of you already have a good idea about Pgpool-II and what it does, so I'll try to quickly go through this slide. And yeah, I've listed the main features and components of Pgpool-II here, so let's get on with the next slide.

This is the basic idea of Pgpool-II. It is a middleware server that sits between the clients and the database servers: effectively, all client applications connect to Pgpool-II, and it routes the client requests to one or multiple Postgres servers depending on the availability of the servers and the type of the request; write requests go to the primary only, and reads are load balanced.

Okay, so as you know, standby servers are read-only in nature, and we always need a primary database server to serve the write workload. So what happens when the primary database fails? To be fair, that is a disaster situation, and there is nothing much we can do other than promote a standby server to become the new primary. Like other HA systems out there, Pgpool-II does exactly that and promotes one of the available standby servers to be the new primary. Since Pgpool-II sits between the clients and Postgres as shown in the picture, which of course has its own pros and cons, this setup enables Pgpool-II to make the primary server switch very smooth and almost unnoticeable for the clients. As far as the clients are concerned, they just keep connecting to the same IP, port, and database even after the primary server fails and the switchover happens. Only the in-flight transactions get aborted, and the clients only feel a very small and brief interruption. So yeah, the picture just depicts this: the primary fails, and one of the standbys, the third one, becomes the new primary.

Okay, similarly, failure of any of the standby servers is smooth and not noticeable as well. When a standby server fails, Pgpool-II detects it and detaches it from the cluster by itself. The administrator gets a notification and can take whatever action they want, but the main thing is that the database service remains uninterrupted.
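To make the crossover part concrete: the promotion is wired up through the failover_command parameter, which Pgpool-II runs with a set of %-placeholders when a backend is declared dead. A minimal sketch, with placeholder hosts and paths, and assuming passwordless ssh for the postgres user:

    # pgpool.conf
    failover_command = '/etc/pgpool-II/failover.sh %d %P %H %R'

    # /etc/pgpool-II/failover.sh -- minimal sketch
    #!/bin/bash
    FAILED_NODE_ID="$1"    # %d: id of the node that failed
    OLD_PRIMARY_ID="$2"    # %P: id of the old primary node
    NEW_MAIN_HOST="$3"     # %H: host of the new main node
    NEW_MAIN_PGDATA="$4"   # %R: data directory of the new main node

    # A failed standby is simply detached; nothing to promote.
    if [ "$FAILED_NODE_ID" != "$OLD_PRIMARY_ID" ]; then
        exit 0
    fi

    # The primary is gone: promote the chosen standby to be the new primary.
    ssh postgres@"$NEW_MAIN_HOST" "pg_ctl -D $NEW_MAIN_PGDATA promote"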
So is that enough? Does just making sure that a failed primary server gets replaced by promoting a standby to be the new primary, and seamlessly removing failed standbys from the cluster, give us a highly available database system? Is that what we were striving for? The answer, sadly, is no. What if Pgpool-II itself fails? Just by looking at this diagram we can see that Pgpool-II is now the single point of failure in the system. We have taken care of database server failures, but a failure of Pgpool-II would take down access to the whole database service, and that's a disaster. So what's the solution? Watchdog.

So what is the watchdog? Well, it's a built-in HA component of Pgpool-II that is implemented as a sub-process of Pgpool-II, and it is responsible for making sure that client applications always have a single point of contact to the database service, and that they always see a consistent view of the database cluster, no matter what component fails or whatever happens.

Briefly, some of the core tasks that the watchdog process of Pgpool-II performs. It does the health checking of all Pgpool-II nodes within the cluster; that is different from the Postgres backend node health checking that Pgpool-II does to monitor the health of the database system. The watchdog performs the health checking of the other Pgpool-II nodes; it has nothing to do with the Postgres health checking. Another responsibility of the watchdog is electing the best leader node among all the alive nodes. It also manages and controls the virtual IP address switching that is used for always keeping a single point of contact to the database service for the client applications. Finally, the watchdog also manages and performs the distributed failover, to make sure that each Pgpool-II node that is part of the cluster always has a consistent and correct view of the database servers and the cluster.

So this is how the Pgpool-II cluster looks after the addition of the watchdog. You can see that instead of a single Pgpool-II node, we now have multiple Pgpool-II nodes, all configured with the same database servers. This follows the first principle of providing high availability: we must have redundant systems. So now we have redundant Pgpool-II systems, and this configuration ensures that in case of a Pgpool-II node failure, we have a standby Pgpool-II to switch to. And to keep the single point of contact for the client applications, we now use a virtual IP address instead of a physical IP address.

Okay, so here is what happens when the active Pgpool-II fails: the standby Pgpool-II takes the center stage and takes over the virtual IP address from the failed Pgpool-II node, does the arping ritual and stuff like that, and eventually the client applications just keep connecting to the same IP and port without noticing any difference. So now, with the backend failover and the Pgpool-II watchdog, we have a highly available Postgres database system. This is the recipe for providing that.
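Here is roughly what that looks like in pgpool.conf, sketched for a three-node Pgpool-II cluster (an odd number of nodes, so quorum can be established). The hostnames and addresses are placeholders, and the parameter names follow the 4.2-style multi-node format; older versions spell these slightly differently:

    # pgpool.conf -- watchdog sketch (same file on all three nodes, 4.2+ style)
    use_watchdog = on
    delegate_ip = '192.168.1.100'    # the virtual IP clients connect to

    hostname0 = 'pgpool0'            # watchdog node definitions
    wd_port0 = 9000
    pgpool_port0 = 9999
    hostname1 = 'pgpool1'
    wd_port1 = 9000
    pgpool_port1 = 9999
    hostname2 = 'pgpool2'
    wd_port2 = 9000
    pgpool_port2 = 9999

    # Commands used to move the virtual IP (the arping ritual)
    if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev eth0'
    if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0'
    arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I eth0'

Once it's up, something like pcp_watchdog_info -h 192.168.1.100 -p 9898 -U pcpuser -v will show you which node currently holds the leader role and the virtual IP.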
So, before concluding this HA section, let me just list out the advantages of using Pgpool-II for building HA. Well, as we know, there are lots of systems out there that can be used to configure a highly available Postgres cluster, so why should we use Pgpool-II? One of the things that makes Pgpool-II stand out from the rest of the available systems is its ability to utilize the standby nodes, as we discussed earlier; those standby nodes might be sitting idle with other HA systems, in the hope of getting their turn when something goes wrong with the primary database. In a nutshell, the seamless load balancing of Pgpool-II sets it apart from the rest of the HA providers. Other than that, a few of the good things about Pgpool-II are: it makes the cluster appear as a single Postgres instance, so no special handling is required on the application front, which means all standard Postgres clients work seamlessly with Pgpool-II. It has highly configurable and robust node health checking and automatic failover mechanisms that are custom-made for Postgres database servers. It gives you flexibility and control to select the new primary node when the primary fails. And finally, it has its own built-in watchdog to solve its own SPOF. So it is an all-in-one solution: you only need Pgpool-II, and it will take care of every aspect of HA for a Postgres database.

Alright, so this brings us to the next part: what's been happening in the Pgpool-II world, and where the community is heading. I think I've already consumed my allocated part of the time, so I'll try to just quickly go over some of the major highlights and try to make up some time.

Basically, the recent focus of the Pgpool-II community revolves around four aspects: reliability, performance, high availability, and usability. Keeping these focus areas in mind, the recent versions of Pgpool-II have added to and improved quite a few modules of Pgpool-II. Like PCP, the management interface of Pgpool-II: it used to be a one-command-at-a-time interface, and now it can handle multiple concurrent commands. On top of that, we have added some new PCP utilities, and the existing ones got new options as well.

Similarly, we have made lots of strides in terms of performance and memory utilization optimizations. The extended query handling was revamped in version 4.1, I guess, and that resulted in around 100% performance improvement in some cases. A new shared relation cache mechanism was introduced in Pgpool-II last year and also produced some good performance numbers.

And when it comes to high availability functions, I think this is the area that has received the most upgrades and improvements. The watchdog, which is the core HA function of Pgpool-II, was rewritten from scratch around a couple of years back, and now it uses modern distributed system algorithms to ensure quorum and build consensus to verify and detect faults, and it uses a very specific mechanism to elect the best leader node. The Postgres backend node failover system was also revamped, and it now builds consensus and ensures quorum to make the whole process less error-prone and more reliable. On top of that, we have enhanced the failover mechanism to avoid distributed locks, and that paid dividends in terms of improved failover speed.

Other than that, if you compare a three or four year old version of Pgpool-II with the latest or current one, you'll immediately feel the improved reliability. We have spent lots and lots of hours on fixing the reliability and memory utilization aspects of Pgpool-II; we have added exception and memory managers and squashed lots of bugs, and that has resulted in a very robust and reliable Pgpool-II. So the latest version of Pgpool-II is really very stable. And yeah, there were some complaints around the reliability and the memory management of Pgpool-II a few years back, but you'll not see those anymore in the current versions.

The next two slides give more details on the enhancements that I have just given an overview of, so I'll just quickly skip through these. No more split brains: those of you who are familiar with or have used the older versions of Pgpool-II might have come across some split-brain syndrome complaints, and the newly rewritten watchdog in Pgpool-II has taken care of all these types of complaints. And yeah, this is some internal improvement we have made to a component of Pgpool-II, which I think I've already covered: we have implemented a quorum and consensus mechanism.

And this is another little usability feature we have added. The rewritten watchdog got away from using binary communication and now uses the JSON data format, and it exposes an interface to hook in external health checking systems. This is a very useful small feature for deploying Pgpool-II in cloud environments that already provide a health checking mechanism; for example, if your cloud provider already has a built-in health checking mechanism, why not use that? Alongside this, we have also added the wd_cli utility; you can take a look at that in case you are interested in hooking in some external health check mechanism. It has handy functions, and you just need to call those functions to integrate; you don't need to write your own JSON and stuff like that. So this is a good improvement we made.

Well, this slide just lists some more of the notable enhancements that have been added to recent versions. Most of these are self-explanatory, but one thing I do want to mention here is the new snapshot isolation mode.
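Turning it on is a single setting in recent versions; 4.2 introduced the backend_clustering_mode parameter. This is just a sketch, as the mode has requirements of its own (the backends are independent servers that Pgpool-II replicates to itself, native-replication style):

    # pgpool.conf (Pgpool-II 4.2+)
    # Pgpool-II replicates writes to all backends itself and additionally
    # guarantees atomic visibility of transactions across them.
    backend_clustering_mode = 'snapshot_isolation'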
It's a really interesting new addition and can be used to build a multi-primary Postgres setup with atomic visibility. I don't think there are many solutions out there for Postgres that provide atomic visibility with a multi-primary Postgres setup. This is a very big topic and I can't explain it any further here, but do explore it if you are interested in a scale-out solution for Postgres that also provides atomic visibility.

Okay, so yeah, the current version's main focus was usability improvement, and I've listed a few of the features that we have added towards that. And similarly for 4.3, which is the current development version, some of the things in the pipeline are: Chinese documentation (we currently provide Japanese and English, and a partial French translation, but now we are focusing on providing Chinese documentation as well); memory usage optimization; and easier configuration and administration. And yeah, we are also working on a new GUI monitoring and configuration tool for Pgpool-II; that's a big project, and hopefully we'll get it done before the 4.3 release.

Okay, this brings me to the end of the talk. So thank you very much for sharing your precious time. I'll be happy to take questions here, or you can just reach me through email, LinkedIn, Skype, whatever suits you. Thank you very much.