Okay, who has used PgBouncer before? No? Okay, it's about a third. Okay, so, quick intro. PgBouncer calls itself a "connection pooling daemon", but I like to think of it as a Postgres protocol proxy. Basically, you have a database client, a normal Postgres database application, which would normally connect to a Postgres database, and instead you can stick a proxy in between. The concept of proxying is well known in other protocols: HTTP proxies are very popular and have all kinds of functionality, and other protocols have proxies as well. This is just the same: it reads the protocol, looks at a couple of things, may or may not do something with them, and then forwards the data on to the regular database, and also on the way back. That's what PgBouncer does, and there are alternatives such as pgpool, which I'll talk about briefly later, that essentially do the same thing.

Just briefly, so you can understand some of the later details: PgBouncer is started as a command, and you give it a configuration file, which is often at that location, but it doesn't have to be; you can put it wherever you want. I find that useful: when I log into a box, I look at the process list and I can actually see what the configuration file is, because otherwise that could be confusing.

The configuration file has this .ini format, which is mainly known from Windows, so it has these bracketed sections. In the PgBouncer configuration the main sections are the [databases] section and the [pgbouncer] section. The [databases] section determines where your downstream connection goes. So in this case, if I connect to the proxy and say I want to connect to the database foo, then PgBouncer says, aha, I'm actually going to send this to this place, and that is a Postgres connection string. If you connect to this database, it goes there; and if you connect to a database that's not listed here,
it gives you an error message that the database doesn't exist. There are a lot of details to this, but that's the basic functionality. Then down here in the [pgbouncer] section you give general configuration information. The most important setting, perhaps, is that you have to give it a port number. As you know, Postgres uses 5432 as its standard port. For PgBouncer I have seen all kinds of port numbers used in practice; the default, in the sense that the source code's template configuration file uses it, is 6432. But there's no reason you'd have to use that. You could also cheat and use 5432, I guess, if you want, and have the actual Postgres server use a different port; that might be a good idea or not. So that's just a super quick intro to PgBouncer.

So let's talk about connection pooling, which is the most well-known functionality but, as I will argue later, maybe not the most useful one. So this is a client and server, with a database application and a database server. What happened in the really old days — the typical example was a PHP application, which was composed of a bunch of pages. Because a page only lived as long as the page was executing, everything was removed afterwards. Every time you loaded a new page it had to open all its resources again, and it had to open all the database connections again, which is very slow. Then some smart people said, well, let's just have a connection pooling thing in between — and admittedly PgBouncer was not the first one to come up with that; pgpool was the first to do it. You would possibly run it on the same host, even, so in the same box here, and the application would connect to the proxy, which is quite cheap and quick, and the proxy would keep connections open to the backend database — which are slow to open — and just hold on to them. Nowadays you could say, well, why don't you just keep your connections open and not close them all the time,
And that's probably a good idea, right? But there are still reasons to do this, just in terms of how you design your applications, in terms of resiliency. Especially if your applications are complicated and you have multiple threads and all kinds of things, it can be very difficult to keep one connection open all the time and be able to reuse it everywhere. Sometimes it's actually quite useful to just throw the connection away when you don't need it anymore, because you know that opening another one is cheap. So this is still kind of useful, but maybe the main use case — the one from the 90s I was describing — is going away.

So let's look at some measurements. The resolution on this is a little bit low, but I'll read it out. I was interested in how much overhead this presents, so I wrote a couple of quick test programs, which I will supply with the slides; they're very simple C programs. The first one just opens a connection, makes an error check that it actually opened the connection, and then disconnects; do that 10,000 times. If you connect directly to Postgres — so 10,000 connection opens and closes directly against Postgres — 24 seconds. Connecting to PgBouncer: four point something seconds. So that basically shows why this is useful.

What I did a couple of months ago is I wrote my own clone of PgBouncer in Python, just as an experiment — highly recommended if you want practice in programming. It's written in Python, totally unoptimized, doesn't have any smarts about it. It just accepts connections, puts them into a list, does some locking, and forwards the packets through. Connecting to that 10,000 times: 10 seconds, compared to 24 directly to Postgres. So you can see how slow connecting to Postgres really is: even with a totally dumb proxy implementation, you can totally beat connecting directly to Postgres.
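The dumb Python proxy I just described isn't in the slides, but it can be sketched in a few dozen lines. This is my reconstruction under assumptions, not the actual test program: a plain byte-level TCP forwarder with no pooling and no protocol smarts, one thread per direction. The upstream address default (Postgres on localhost:5432) is an assumption for illustration.

```python
# Minimal byte-level proxy sketch: accept clients, open one upstream
# connection per client, and shovel bytes in both directions.
# No pooling, no protocol parsing -- roughly what a toy PgBouncer
# clone has to do at minimum.
import socket
import threading

UPSTREAM = ("127.0.0.1", 5432)  # assumption: Postgres on its default port


def pump(src, dst):
    """Copy bytes from src to dst until src closes its side."""
    try:
        while True:
            data = src.recv(8192)
            if not data:
                break
            dst.sendall(data)
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, upstream_addr):
    upstream = socket.create_connection(upstream_addr)
    # One thread per direction; each exits when its source closes.
    threading.Thread(target=pump, args=(upstream, client), daemon=True).start()
    pump(client, upstream)


def serve(listen_port, upstream=UPSTREAM):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", listen_port))
    srv.listen(64)
    while True:
        conn, _ = srv.accept()
        threading.Thread(target=handle, args=(conn, upstream),
                         daemon=True).start()
```

To run it you would call something like `serve(6432)`, the port PgBouncer conventionally listens on; the point of the sketch is only that even this naive forwarder beats fresh Postgres connections, because the expensive backend handshake is avoided on the client side.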
So that's the rationale for connection pooling. The second test was connecting, then running a query — in this case just a trivial select — and then disconnecting, so at least you're doing some work in the backend. Directly to Postgres, 10,000 times: 27 seconds. Through PgBouncer: almost seven seconds. And my own proxy clone: almost 13 seconds. You can see the overhead of actually running the query is the same; it's the connection overhead that is really significant here. Obviously we're not doing a whole lot of work; once you do more work, the difference goes away a little bit.

And finally, this is just keeping one connection open, doing 100,000 queries in this case — not 10,000 — and then disconnecting. There you can see there is some overhead to having a proxy in between. If you just keep a connection open directly to Postgres, it's almost seven seconds; via PgBouncer it's 11 seconds. So that's quite a bit of overhead. Obviously the work here is cheap, just select 1; once the work gets more significant, that difference shrinks. But there is some overhead: with my own stupid implementation, 21 seconds; and with pgpool, 12 seconds, so slightly slower than PgBouncer. The reason I have pgpool here and not on the previous slides is that those previous two tests actually crashed pgpool.

Sorry, what's that question? ... Whatever was packaged, mostly. I gave a version of this talk a couple of months ago at the Philadelphia Postgres user group where I did these measurements, and pgpool actually functioned, but I made a mistake, as was pointed out to me afterwards: I had originally run these tests with PGDATABASE=postgres, and pgpool came out really badly — it was really slow. Someone told me later that if you connect to the postgres database, pgpool doesn't do any pooling; it somehow special-cases it.
That's kind of weird. So in this case I had to make sure I used some other, non-default database, and in that case pgpool actually did its pooling, but still didn't do so well. If someone here is actually an expert on pgpool, I would gladly demonstrate that and debug it with them later, or online.

So that was the sort of default use case, connection pooling. More interesting to me is connection queuing. That's a term I made up, so it's not a standard term. I work at MeetMe.com, which is a social networking site, so we are basically a website exposed to the internet. There are some web clients out there, and they are exposed to the open internet, so the number of connections that could be coming in here is unbounded — it depends on whatever happens out there in the world. But we learned quickly, and this has become quite important to us in the database administration team, that it's very important to restrict the number of connections that a Postgres server takes on. There are ways to configure that inside Postgres — max_connections, and you can have connection limits per user, that kind of thing — but once those are exhausted, what happens to your clients? They just error out, and that's not good for them.

So what PgBouncer can do, and does by default, is this: you tell it how many connections you want to pass through, and everyone else has to wait. Its default setting is 20. The right value depends mainly on how big your database server hardware is; you could make it proportional to the number of CPU cores, possibly, or set it depending on what workload you have. But you want to have it, you know, less than 100 or 200, something like that, right?
Certainly not a thousand. Those incoming connections are — well, not infinite, as it says there, but there are lots of web servers, each with this many threads and this many connections, so it could be many thousands. PgBouncer helps control that. PgBouncer itself can accept many thousands of client connections at once, and it will make sure that only 20 or so are passed through at a time; the other ones have to wait. And they effectively only wait briefly, so this doesn't actually hurt anyone.

The way you set this up is, in the configuration file, in the [pgbouncer] section, you set default_pool_size. The default is 20, but again, you should set it to something that suits you. Another option that's in there, which I haven't actually used: you can configure it so that, in this example, if a client has to wait three or more seconds to get a slot, PgBouncer will open up a reserve pool of five more connections, or whatever you set. I don't know how useful that is, but those are the kinds of options that PgBouncer provides.

Now you say, well, okay, if there aren't enough connections, what do you do? You wait. Obviously that can work briefly, but eventually, if there's too much waiting, that's at least a signal for you: maybe you should provide more database resources, or split up your databases in a different way, or optimize your queries, or something like that. PgBouncer gives you a way to monitor that: it has a sort of internal fake database. For those who are not too familiar with PgBouncer, a brief explanation. The normal functionality of PgBouncer is: I connect to it,
it just passes the connection through. But if you connect to the database called pgbouncer, that's actually internal to the proxy, and there you can read statistics out. It has these fake SQL-like commands, like SHOW POOLS, for example, which gives you information about this situation. In this case there's not a lot going on, but say these here were the incoming clients from the web tier: currently there are none waiting. If this had full traffic, the active-servers column would say something like 20, you might have a couple waiting here, and down here it says how long the longest one has been waiting. This is the sort of thing you want to point your monitoring system at. Basically, whatever monitoring system you use, write a check — which can use whatever normal Postgres client you have — that connects to this address and runs this query. This is a normal Postgres result set, so you can use all your normal APIs to look at this data. And when the number of clients waiting is too big, you should at least record that or throw an alert, and if that happens a lot, then you need to provide more processing power.

So the experience for the client might be that it has to wait a while, but at least it won't error, and at least the database server will stay up. The alternative is that the client might get a result, really slowly, but then all the clients are going to be super slow and the database server gets thrashed. That's not good; that's what we want to avoid. At least for a site like ours, where you are basically exposed to the open internet, this is super useful. Questions about this?

So then let's talk about connection routing. Again, this is a term I made up. How do you connect to Postgres?
There was a presentation at another conference a couple of years ago where someone was presenting their use case of how they use Postgres and what issues they had, and the first issue was exactly this: how do you connect to Postgres? Well, obviously you do something like this, right? Whatever programming environment you have, you pull out the driver, put in some host name and so on, and then you connect. That's easy, right? But that's obviously not what you really do, because you don't want to hard-code host names in your code. Maybe you do, but okay, many people don't want to. Then you think, okay, I should have a configuration file. But how do I distribute that configuration file? Maybe I should have a configuration server system — how do I get that distributed? And maybe you have something really fancy, but how does it work across different programming environments? The first line there is supposed to be PHP, the second line is supposed to be Python, and you could have many more. How do you make sure all these environments get your connection configuration information, all the time? Well, maybe you have that figured out, but then surely you have some random cron jobs running around somewhere, just a shell script, which is probably not going to connect to whatever super configuration system you have. There's always some nonsense like that lying around, where someone is just hard-coding configuration information.

So how do you address that? Well, let's look at this diagram from earlier again. If you do this, then your client doesn't actually have to know where your database is, because all it has to know is where this guy is. And that's pretty easy, because it's on localhost, and once you decide on the port number, that's pretty much all you need to know. And this guy, in its configuration file, knows where all the databases are. That solves that problem.
So essentially you write a configuration file for PgBouncer like that. You have to name all your applications somehow, but that's pretty much all you need to do. Then you have one configuration file format which contains all the knowledge of where all the database servers are, what port, maybe a different user, anything like that. You have this one file — the same file — that you can distribute somehow across all your servers, and then all your code needs to know is localhost, which is the default, so you don't even have to mention it, plus the port. As I mentioned earlier, this is not what I'm necessarily recommending, but you could also run PgBouncer on 5432, and then you can even omit the port. Maybe that's not such a good idea — too implicit, maybe. And that's it.

And that's how our applications actually look, because we do this. The application developer, whoever writes this code, doesn't have to know anything about what the database servers are. All they need to know is: on your local machine there's a PgBouncer running on that port; connect there and everything will be fine. My job for the last year has actually been to trace down all the code that violates that, which is a very difficult process — identify the code, find out who's responsible, who wrote it ten years ago — and it's very annoying. So this approach is much more helpful.

I'll talk about this a little more later, but PgBouncer does not have the fanciest access control mechanism. It has usernames and passwords, so you can have some access control.
Yeah, that's basically a good idea. How you distribute passwords is perhaps a separate question, but it's no different from what you would normally do: you can use a .pgpass file, or an LDAP lookup, or there are other ways to do it. I don't have a really sophisticated solution for that, but it's basically the same as what you would normally do.

Now, the previous example was just one host. In practice it might look more like this, and in practice you have a lot more clients than those three, certainly — we have hundreds. You have a lot more databases too, but probably a lot more clients than databases. One thing this allows you: if you need to move one of the databases down there somewhere else, that's quite easy, because there's only one place you have to change it. But it could be difficult to make that one change and propagate it to all those hundreds of clients at the same time — especially if you want to do a sort of live failover or things like that, where you really need everyone in sync within a very, very short time frame. That can be difficult just in terms of how long it actually takes to log into all these machines and make those changes, depending on what system you use. And what if just one of those clients is currently down? Then your whole process will fail, right?
So if you have that situation, then I would rather propose a system like this, which is actually what we're using: have another layer of PgBouncer, which has all that knowledge and is then very simple to update. So we have these locally running PgBouncers, which provide connection pooling to the app and let the app easily connect to Postgres, because it only has to know about this; this guy knows about that guy, and that guy knows about all the databases. Whenever there's a change in one of the databases — we have to add another one, fail over, change a standby, anything like that — all we have to do is update this, and that can be done very quickly and sort of quasi-atomically.

It does introduce latency, but we accept that, basically. There are other considerations to application design, obviously — there's caching here and things too — so by the time you hit the database, you have already accepted that you have much more latency than if you read from a cache. Yes — because that also gets moved around. If you really don't like this, you can also do what he proposed and just skip those, but then you expose these guys to your design down here, and sometimes this changes. Obviously we don't have only one here, because that would be a single point of failure; we have several, which are load-balanced, and we currently use a hardware load balancer, but this changes from time to time. So we have basically made the choice to insulate all this business up here from all this business down here.

Well, you could also be cynical and say — you might have heard this — that the way an architecture comes out is informed by how the teams that worked on it are divided. That is in a way reflected here, because the organizational boundary is kind of here, right?
So, again: if you have only one team, maybe you don't want to do this. But the organizational situation, the way it is, is also a fact we have to work with. So this affords us insulation: these guys don't have to worry about it, and these guys don't have to worry about it. It's just a sort of robustness, on all levels. Yes?

We don't do that, but you could. What he is referring to is: PgBouncer is a single process. It uses libevent to do its multiplexing, so it's a bit like Redis in that it only really uses one CPU. So the proposal was that you could run multiple PgBouncers on one machine. Yes, you could certainly do that. We have not found the need to, because, as was said, it's fast enough. We have never had any problem with PgBouncer not being fast enough, and we have literally tens of thousands of connections coming into the client side there, and it's not a problem. But you could certainly do that if you want; that's an option. We use multiple instances in this middle layer, but we have also chosen to put them on different hardware boxes, for the obvious reason that we need to be resilient against hardware failure. If we needed, say, 20 PgBouncers, then I would say, okay, let's only buy five boxes and put four on each; but it turns out we don't need 20. But that's an option too; you could do that. So that's kind of how we approach it.
I don't actually know how the load balancer works, and we are actually moving to different load balancers now; I've never had a problem one way or the other. If you're interested, I could get that information from our guys who do the load balancing. As far as I know, you don't have to do anything particularly special.

So, connection maintenance. This comes back to what I said: you're here, and you want to move some of these databases around down there. You could say, obviously, I don't need PgBouncer at all — I can just use virtual IP addresses and all that kind of business. You could certainly do that. We have chosen not to, because there are all kinds of complications with the different networks involved, and it's also not protocol-aware. You could say, okay, I can move the virtual IP address around, but then the protocol gets broken, so the client gets confused, or you have to break the connection and the client gets upset. What PgBouncer can do is do this in a Postgres-protocol-aware fashion.
So you can again connect to this sort of fake internal administration database, and you can use a command like this: you say PAUSE for this application. What will happen is that PgBouncer will wait for all the queries currently processing on those connections to finish, and it will not allow any new ones through. What "not allow" means is that it just lets them hang: the client will not realize what is going on, except of course that it takes a little while to come back. Other than that, it doesn't even know anything happened. Then you do your maintenance, and when you're done, you say RESUME, and everything just keeps flowing.

If you put that together, you can do things like this: you can restart your backend Postgres instance without the client even realizing, if you build a shell script like that — maybe with a bit more error checking and things like that, but the principle is like this. You pause; pausing will wait for everything to finish. If you have a long-running query, this will not make you happy, because PAUSE will just wait. So, (a) don't have long-running queries; (b) put in statement timeouts or things like that, or have timeout logic in the script. You need to make this work for your case. Then you can do restarts, or other things — move things around, anything like that — and then you run RESUME. Depending on how long the restart takes, you can often do this within a couple of seconds, and your clients just keep working. They don't notice that the connection was broken or anything like that. So that's really useful.
There are other commands too. For example, you can also just kill all connections to a database, which is a little more radical than PAUSE; but if the PAUSE somehow doesn't finish for you, you can say KILL. Then of course the client will get upset, but at least you get rid of all those connections you don't want anymore, in case the client doesn't have proper behavior or timeouts. We have some databases that are called "production", and I like to run this command — it always scares me a little when I do — but it doesn't actually do anything. That's always kind of fun.

And this is the magic. The previous couple of snippets I showed are for restarting a Postgres server online. But you can also restart PgBouncer itself online, and it does that if you provide this -R option. So initially we start PgBouncer without the -R option; now, say we want to do an online restart: we pass the -R option, and what happens is this. The new process starts up and connects to the old process — it knows where the old process is because it has the same configuration. It tells the old process: give me all your sockets. And it takes all the sockets. I didn't know that worked, but it does — one process asks another process for its sockets. Then it starts servicing those sockets, and it kills the old process. And the clients don't notice. So that's total magic, but it does work, and it's really, really useful. This functionality is usually wired into some init script or something like that, right?
You don't have to do that manually. You should check whether your init script does this; some init scripts written a long time ago, when you do /etc/init.d/pgbouncer restart, might actually just do stop, sleep, start, and that's not what you want. But with a lot of the scripts distributed in community RPMs and Debian packages and the like, if you do a restart, they will actually do this, so you can just go in and restart PgBouncer and it will not interrupt your traffic. Which is awesome.

So those are the facilities that PgBouncer provides us: the connection pooling; the queuing, to control the traffic to the actual database server; the routing of connections, so that not everyone has to know where all the database servers are; and these maintenance commands, so you can start and stop and move things around without interrupting the online traffic too much.

Now, what does PgBouncer not do so well? As we already discussed, the access control is not very sophisticated. It has a password file — you can put usernames and passwords in there, and it does the normal password authentication — but it doesn't have anything like LDAP or Kerberos.
So And it doesn't have any SSL support Either income and or outgoing for incoming I've heard people talk about using s tunnel That I've seen work I don't actually use it actively at the moment But s tunnel has some kind of awareness of the post post protocol so it can kind of make this work In practice the answer in places I've seen is to kind of firewall off your Database subnet as much as possible and then only allow certain access points And then tightly control who can you know read the password files and things like that from there So that that's usually the approach that you have to take if you want to Use lots of PG Bouncer PG pool is better like that PG pool has a more of an pghba.com approach Managing the pool sizes is tricky in a way, you can think of all this kind of as a Pipes lots of pipes right with lots of connections and you want to control how big the pipes on how many connections are coming through and If you have multiple hops and things you have big pipes and small pipes And then when you do maintenance basically what you do is you turn the water off right and things like that So you kind of become a little bit of a plumber at all levels And you have to make sure if you have you know these these Somewhat complicated setups here and a different You know these might all also be different of different sizes and these might be different And so you need to just stay on top of how many connections you allow at each point so that you're you know Your network of pipes will make sense at the end otherwise You know pipes burst I guess or the water backs up or anything like that, right? 
So those are problems. PgBouncer originally came out of Skype, because Skype used to be a big Postgres supporter, and they wrote all kinds of cool tools, including PgBouncer and PL/Proxy and Londiste and SkyTools and things like that. I think they probably still use Postgres — they're obviously not just going to change that from one day to the next — but as is well known, Skype is owned by Microsoft now, and most if not all of the original team has left. So there's no full-time, professional maintenance of PgBouncer anymore. I know people are still working on it, but it's more of a spare-time thing, the way I understand it. I tried to reach out to the maintainers before this talk; I haven't heard an answer yet. So it's still being worked on here and there, but the complete roadmap is a bit doubtful. It's open source, it's out there, you can fork it — but it's super complicated code, in my opinion.

If you need more features, there's pgpool, which is also well known. pgpool effectively carries the same label, "Postgres connection pool", but has a totally different purpose. It has a lot more logic in it: it does things like parsing queries — is it a read query, is it a write query — so it can do load balancing; it can do health checks on backend servers, and things like that. So it's a lot bigger in terms of functionality. I like PgBouncer because it's lightweight and fast, so that's a pretty obvious trade-off — and there's the fact, of course, that pgpool crashed on my test case. But especially for a dot-com type website, all of this functionality is really indispensable, and some of these drawbacks we can work around; in other cases, maybe that's not the case.
So that's what I had to say. I wonder what he thinks about that — my wife always asks me when I go to these Postgres conferences, "are you going to talk about the dog?" And it's like, no — but now I've done it. So that's our dog. Do you have any questions about the dog? Any questions about other topics?

If you do a PgBouncer restart and someone is still connected to the old PgBouncer — yeah, it will take that over. Okay, in this case you don't actually — so, I'll repeat that for the microphone. His question was: if you remove an entry from the PgBouncer configuration for an old database you don't use anymore and then do a restart, what happens to existing clients? First of all, in that case you wouldn't actually do a restart, because you can do a reload: it keeps the same process and just changes the configuration. What happens to existing clients in that case, I couldn't actually tell you, because at that point I would usually already have ensured that nobody is connecting anymore before removing the entry. But you could easily test that out.
I couldn't guarantee one result or the other. You don't actually have to do a restart, but if you do a reload and there are still clients connected, I don't know what happens. The difference between restart and reload is similar to the difference between restart and reload in Postgres: for most configuration changes you just do a reload, but there are certain things for which you need a restart. The most obvious examples are changing the port number, or installing a new version.

Yeah, that's basically how a package upgrade works, right? You just do a yum upgrade or whatever, and if the package is set up properly, which most of the recent ones I've seen are, it will do a restart as part of its upgrade procedure, and that restart uses this online takeover procedure, which works very well. So you almost always do a reload for any of these changes to connection parameters, or even certain timeout settings. The only time you really do a restart in practice, as I said, is if you change the port number, but you don't change the port number often, so mostly it's for upgrades.

The question was: is it possible to pass the client application information through PG Bouncer, just for tracking? That's a little bit of a tricky point. There's a patch in the Git master, which is not released; it's in version 1.6 release candidate 1 or something like that. With it, the client information is passed through via the application_name connection parameter, so on the backend, what you will actually see in the application_name is that this came through PG Bouncer, with this connection, from this client. But that's not released, and I'd actually have to test whether it still works if you have multiple hops. What you can also do is
log into the PG Bouncer admin database. It has a list of the clients and servers that are currently connected and how they are paired up, and then you can trace it that way. That's a somewhat manual process; I actually published a blog post yesterday on how you can do that. It's not easy, but the information is there; you just have to string it together yourself. And if you want to keep this kind of information up to date, the problem is that clients connect so quickly, and there are so many of them, that maintaining a full view would be very slow and unwieldy, I think. Does that help?

Okay, the question was which pool mode we use. PG Bouncer has a pool_mode parameter, which can be statement, transaction, or session, and it basically determines how long PG Bouncer holds on to a server connection for a given client; those are three different granularities. If you say session, one client keeps the same server connection for the entirety of the session, and so on for transaction and statement. We use statement pool mode for everything. The reason is that we basically do everything through stored procedures, so we don't need any other setting, and the other settings would also interfere with all of the queuing and routing, and would hold connections open longer, for possibly idle sessions and things like that. Now, in statement mode you can't use prepared statements and the like, but by doing everything through stored procedures we get this implicit preparation, in PL/pgSQL for example, so that works around it. Of course, using procedures is itself overhead, so we pay a price for other facilities.

Was there a question in the back? No? Okay, thank you. If anyone wants to debug that PG pool bug, let me know.
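For reference, the pool_mode setting discussed above lives in the [pgbouncer] section of the configuration file; this fragment is a sketch with illustrative host, database, and user names:

```ini
[databases]
foo = host=127.0.0.1 port=5432 dbname=foo

[pgbouncer]
listen_port = 6432
; one of: session, transaction, statement
pool_mode = statement
; users allowed to connect to the admin console
admin_users = pgbouncer
```

The client-to-server tracing mentioned earlier works by connecting to the special `pgbouncer` admin database and running `SHOW CLIENTS;` and `SHOW SERVERS;`, then matching the rows up yourself.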
Otherwise, thank you.