 Welcome to Enterprise Authentication and Connection Pooling. This session is all about Enterprise Connection Pooling and Authentication with Postgres. Today you will learn how to reduce Postgres resource utilization with per user and global connection pooling limits and how to simplify database access management with one active directory and LDAP and two group extraction for Postgres. My name is Lindsay Hooper and I'm actually one of the conference organizers as well as your moderator for today. I'm here with Eric Brandsburg, CTO at Heimdall, and Roland Lee, head of product at Heimdall. Eric specializes in application networking technologies for web and backend data. His diverse experience at Altion, Citrix, and Juniper Networks gave him a solid understanding of operational complexities in a modern infrastructure. And this is what led him to the founding of Heimdall data. Roland heads up product at Heimdall. His background is also in application networking focused on improving web and SQL performance. With that, I'm going to hand it off to our speakers. Eric and Roland, you can take it away. Thank you, Lindsay. Hi, everyone. My name is Roland. And as Lindsay mentioned, today in this session, we're going to be talking about how database proxies can help out with enterprise authentication and connection pooling. Again, you're going to learn how to reduce Postgres resources with our advanced pooling function with per user and global connection limit controls. And you're also going to learn how to simplify database access management. And this slide here gives a overview on the various database proxy vendors out here in the marketplace. You're probably familiar with PG Pool 2 and PG Bouncer. They support a certain level of connection pooling. However, Heimdall supports a two level connection pool. There's also read write split that we can support. We also add an extra capability of replication lag. And then what's notable on our proxy is we can also support automated query caching and invalidation and LDAP for which we'll be talking about today. The way that a database proxy is deployed or how Heimdall is deployed, you can deploy in two ways. On the right hand side, we can instantiate as a JDBC driver. So if your application is a JVM, you can insert our JDBC driver in between your current application and your Postgres driver. So all it requires is a change of a URL and you're done. So it's no application changes. If you're not a JVM, you could use us as an agent on the left hand side here that runs as a separate process. It could be as a separate proxy tier or you can even put us either in the application instance itself or in the Postgres database. This agent can basically be run or installed anywhere. In this situation, it requires no application changes. It's just simply a networking change, IP import change for which the application connects to the proxy and the proxy connects to Postgres. This is an overview of the various modules. Heimdall Proxy supports and I'm going to start with the first row, which is the most notable and beneficial for Postgres. We can support automated query caching, as I mentioned. The way that query caching works is that we provide the query caching logic and the invalidation logic to the storage of your choice. So you choose the storage or the grid cache, whether it be Redis, Amazon Elastic Cache, Hazel Cache, or even Local Heap. You point our proxy to that storage and we'll begin to populate that cache intelligently and we'll ensure that the cache is fresh and eviction and validation occurs. We also support read-write split. So if you want to horizontally scale out your Postgres to a write master in read replicas, you can do that without any application changes. We'll automatically route the writes to the write master in the read to the read replicas and load balance across the read replicas. What's notable about our read-write splitting solution is that we can support replication lag. We can detect the lag and ensure that whenever there's a read from Postgres, you're always going to get fresh data. And this helps meet asset requirements. The third module is what we'll be talking about is connection pooling. We have granular connection pooling limits to ensure that there's a minimal amount of connections to the backend and that connection management is optimally used so that Postgres will not be overwhelmed from hardware resources whether it be CPU or memory. The other module that Eric will be talking about is authentication and authorization. And I'll have him discuss on this in the next few slides. But this gives you a flavor of the power of a database proxy and what modules are supported on the Heimdall proxy. The use cases are plenty and it depends on what your needs are for Postgres. The way that the Heimdall proxy is deployed is that, again, it's deployed in the middle between the application and Postgres. You point the application to the database proxy tier and the configuration and the analytics are being controlled by a central console. So it's quite simple and it's operationally elegant. You have a control plane, which is the central console and the data plane, which is a bunch of proxies that represent the proxy tier. Let's speak on behalf of Postgres. Now we're going to get into the topic of our session. And this is enterprise multi-user database requirements. Cusford typically wants some type of unified authentication via Active Directory or LDAP. They may want support for large user accounts. So basically there are a lot of users that are directly connected to Postgres. Typically they may not have an application server. And you need to have some type of mechanism or solution to manage these large user accounts, these many connections going to Postgres. They also want the auditing of queries, knowing who accessed, what, when to ensure compliance and also security. And the Heimdall proxy provides such a solution. And with that, I'd like to hand it off to Eric to get into these granular features that we support. Eric? Thank you, Roland. The main issue with scalability for Postgres is that it uses an operating system process model as opposed to a threading model that other databases use. So while Postgres, if you're dealing with say even a thousand connections, all can result in a huge amount of memory being allocated. Other databases like SQL Server, for example, will have a very small amount of memory with a thousand idle connections. The net result of that is that within Postgres you have to carefully manage and control how many connections are connecting to the database. And what you want is to try to keep the number of idle connections to a minimum without disrupting the user's traffic. So as a result, what we want to do with a proxy is allow users to connect to the proxy and it will manage the actual back-end connections to the database so that the total resource load is as low as possible for the amount of traffic that is actually needed. In addition, while Postgres is great when you have a relatively small number of users that are connecting, for example, if you're using an application server, trying to manage large numbers of users becomes a fairly large burden in order to manage those users and to account for any data governance policies within a company. So that includes ensuring that, let's say, a user is terminated or they move groups within an organization and you have group policies on who should be able to access what data, then the process of managing that within Postgres itself can become a very large burden. And this is where Heimdall comes into play in order to help bridge the gap between what enterprise environments need and what Postgres provides natively. So the first piece that Heimdall provides, and I'm going to be comparing against PG Bouncer because it is fairly commonly used. It's something called multiplexing and within that PG Bouncer has transaction pooling or it has something called statement level pooling. Heimdall provides that as well, but it provides additional functionality that it'll be getting into in a little bit that makes it work even better and in a more generic way. The second piece is that you need to be able to control connection limits carefully on Postgres and that boils down not only to the total number of connections that are allowed at a global level, but the connections that each individual user is allowed. Now with PG Bouncer, what we found is that the way that it creates pooling is that if you set up a pool configuration limit, then every user that connects inherits that limit. So if you set a pool limit of say 100 connections, you would think that all users would be constrained to 100 connections. If you only have one user, then you're correct. But if you have five users, then it would be five times 100. Heimdall allows you to specify connection limits on a per user basis and you can have different connection limits for different users. So for example, if you have a system user, then you'll be able to set that user to have say 100 connections of their own and then individual data science users could be limited to say 10 connections each. And we allow the connection limits to be handled on a per pool or per user basis and you can control a total connection limit that represents the total capacity to the database. Another feature that we have is multi-port listening, where you can specify different ports and then you can even change the behavior of the pool or based on what port is coming in. So instead of necessarily configuring different users to have different behavior, how they connect to the database itself can dictate the behavior. And sometimes that's useful when you want to have a different port for system accounts versus what your individual actual physical users are connecting as. Finally, we get into the LDAP configuration and Active Directory support. Heimdall provides a very simple LDAP configuration. It also provides LDAP group extraction and group synchronization. And this for anybody that has worked with trying to map roles to a central directory. This is huge because you can now specify on a group by group basis what individual users are actually able to do. So if they're a member of accounting, they can access different data than if they're a member of say HR. Okay, next slide. Okay, going into the LDAP configuration, Heimdall has made it easy in order to configure the LDAP. You can see that even in complex scenario with LDAP search parameters, you have four parameters to enable LDAP. Further, you can do rule-based configuration and what the demonstration on the right-hand side is showing is how you can synchronize LDAP group information into your database. Now, you'll see in the notes that this was for green plumb, but this applies equally to Postgres as well. And what this is doing is basically allowing you to create a stored function on the database that will accept the user, the LDAP groups that in fact, they can also accept the password that was provided in order to configure and manage the users as needed. So this way, we're not trying to enforce a one-size-fits-all behavior. Instead, you can code everything within a stored function on Postgres that receives this information and then using appropriate run or execute as credentials on that function. You can ensure that the database configuration is in sync with your Active Directory or LDAP configuration. So just to counter dealing with PG Bouncer, let's talk again about this. All PG Bouncer was really designed to provide connection pooling to a single user. That means that if you've got an application server shut up, then that will work well. But once you start using multiple users, the constraints that it enforces start to break. So as an example, if you did have the limit of 100 per pool and you've got five users, you're going to end up with 500 actual connections allowed. And that doesn't work right because Postgres wants to configure the limit as a global setting, not necessarily just as a per user, although you can boil down to that. But this is all about memory and constraining the total and allowing everybody an equal share. Contrasting that with Heimdall, we work with a total max active pool and then you can set the max user active settings. What will happen is as the number of users approaches the max active total, we become more aggressive in calling idle connections. And one of the important points is how we do multiplexing, which we'll be getting into in more detail in just a few minutes. But what will happen is that an individual user may have 50 connections open, but they actually only need a couple of connections in order to service the queries that those 50 connections are needed. Okay, so as we approach the total number of active connections, we'll end up calling idle connections and forcing the connections per user down. And then they will reuse those connections as they are available in order to service active queries. You also have the ability to set a maximum wait time so that if they try to open more queries than they actually can get connections for, it'll wait queuing up the queries and then it'll allow those queries to come through in a as the connections free up. In order to manage all the connections, obviously you also want to be able to see what users and what connections are being used and you want to be able to manage that. In order to do that, Heimdall supports actually intercepting commands that are issued say through Psequel or Dveaver or other tools. It allows you to show how many are connecting, how many are busy, how many are sit waiting idle. And it allows you to have an idea of who's doing what. And then we also have the ability to clear individual users connections to the backend server. So if one user is using too many or something needs to be reset, you're able to do that as well. So additional tools are available in order to manage things at the proxy level. Oh, this is still highlighting kind of another idea reiterating that you can have lots of users on the front side of the proxy, but then on the backside you can constrain the users so that their total number of actual connections to the backend database are very much more limited. So in this slide, you're trying to show that user A has 10 physical connections of the proxy, but only one is allowed to the backend. User B may only have five, but they're allowed four physical connections in the backend. Now multiplexing is more interesting piece. And if you're familiar with the transaction statement pooling, this relates to it, but I'm going to get into why Heimdall is better. And what this is, is it's taking the individual query on the front side. And it's breaking apart the dependency to a connection on the backside. So while a connection is idle on the front of the proxy, you don't have to have an actual assigned connection on the backend. What will happen is when a query comes in, we will then I'll take that query, pick an appropriate connection on the backend, execute it on the backend, pull the result set, and then push that to the client. So this can dramatically reduce the number of connections that are actually established to postgres, freeing up precious memory resources. Now here's where it becomes a very different from PG Bouncer with its transaction or statement pooling. The first one is Heimdall supports a feature called delayed transactions. All when you're in a transaction, you actually have to pin the front side connection to the backside for the duration of the transaction. So if you start a transaction at that point, the resources are allocated on the backend server. With a delayed transaction, Heimdall will actually wait until the transaction is actually needed for DML before it actually starts the transaction. And this is as a result, this feature is needed because many frameworks insist on doing all activity against the database in a transaction, even when it's not necessary. As a result, connection multiplexing doesn't give you much value in that type of environment. Well, with a delayed transaction, if you start the transaction and then you did simply do a bunch of selects, no transaction will ever start. And the time between selects will actually allow the backend connection to be freed up. Okay. And that's quite important. The next piece is rule driven multiplexing. There are some cases, for example, if you're using temporary tables that you don't want multiplexing to occur, well, you can use Heimdall rules in order to turn off multiplexing for a given user session on an as needed basis. So it's again, not a one size fits all. You can deal with your special cases on an as need basis, while still getting the benefits of multiplexing for the vast majority. And then again, multi listeners, you can change the behavior of multiplexing and pooling based on what port someone is coming in on so that you can simply differentiate traffic as a result. Now, I'm not going to go into all these options, but the connection pooling interface for Heimdall is based on a very commonly used library. It is the Tomcat connection pooling library that we have extended to support multiple users. As a result, there are a lot of different options that can be used for tuning. And then all these parameters can also be adjusted on the fly as needed in order to change behaviors of individual sub pools for individual users. And with this, I am going to go into live demo of a couple of the features that we have. And at this point, if you have any questions, please let me know and or just add them into the Zoom group chat, and then we will answer them at the end. And at this point, you should be able to see my screen can roll in. Can you verify that you can see it? I can. Okay. Thank you. So what I'm going to show here is a couple of different pieces. I have an application that's called Udu, which is an e-commerce application that operates off of a Postgres database. Now, at the moment, I don't have any features active against this particular database, so we're very few. What we can do is we can come to the rules section and we can see that a lot of different features are enabled simply by configuring rules. If I enabled this rule, then it'll enable caching and we can simply do a commit. We come back over to our dashboard and what we see is that our cache hit rate goes up to about 80% now. Our total or our average query response time in this case is going from about 1.2 milliseconds down to around 200 microseconds, simply as a result of the caching. And this is just one simple easy to use and understand example of how Heimdall operates on the rules. Another example that I'm going to do is I'm going to enable ReadWrite Slit. And this is a feature that allows for read-only nodes to be leveraged in order to offload the primary database. So I'm going to come back over here and I'm going to come and you can see that when I applied this, the cache hit rate went down because they disabled that. But you can now see that there's a difference between on the traffic load. The primary server while caching was enabled was getting around 100 to 200 queries per second. And we're still getting around the same number, maybe a little bit lower when ReadWrite split is enabled. But you can see all the queries that had been cached are now being pulled off of the secondary server. So the net result of that is the central writable server still doesn't have any additional load. We've just pushed it off into a different place. Now the same type of rules can be used to apply connection pooling behaviors or multiplexing on a per user basis. So I could say user Tom I want to change his pooling behavior to disabled multiplexing. So any multiplexing functionality will be disabled for the user Tom, but not any other user. Okay, we can also do say port of five four three three. If someone connects on port five four three three and we come over here and we enable port five four three three to be configured on our listener, then it changes behavior based on the port that's coming in. Okay, so we can have, for example, a default multiplex enabled, but for those users where the multiplexing does not work properly, we can simply disable it and not have to worry about different configuration or different PG pool type configuration for that user that's managed separately has its own connection limits, etc. So this allows for vastly simplified configuration. Okay. All as another option that we have, we have something called password provider query that allows you to drive all your authentication out of a separately managed table within the database itself. And what I have here is an example that just has a user set up for postgres with a password. I can set up a artificial LDAP groups if I'm not actually doing LDAP group extraction within the options. If I wanted to, I could configure LDAP authentication here. If you're familiar with the PG HDA configuration file, this is closely matched to align with that. The same type of functionality, the same behavior except that it is table driven instead of it being a physical file on the postgres file system, which in my mind, this is the way that it should have been done to begin with. Now, if you go into postgres 10 or higher, it actually has a table that you can use to build this table for Heimdall as a reference that represents the PG HBA file, a configuration file. So it's actually moving towards exposing it as a file just like this. The benefit of this is that you can automate through the database itself, adding and removing users if you're not using Active Directory. You can manage their configurations. You can manage everything else. The password file here, you'd say, well, that's not very secure. You don't need that. You can actually have pass through authentication as well where you simply tell the through the password authentication method that it should use a password and then it'll simply trust the database for that user that you can then tune all other behaviors. Just like with rules, you can change the behavior on a per user basis. So this column is providing a way to control the pool multiplexing behavior just like this rule would right here. So instead of doing it through a rule within Heimdall, you can do it within an authentication table as well. Hopefully this has sparked some interest and if you have any questions later on, just drop us an email at support at Heimdalldata.com. If you want to try us out, we do have a free trial on the AWS Marketplace where you can engage us for free trial of the software as well in any other environment. And thank you for your time.