Am I too obnoxiously loud? Yes? Okay. All right. So our topic this morning is audit logging with Postgres. My name is David Steele. I am the Senior Data Architect at Crunchy Data Solutions. I could be a little louder. Okay. How's that? A little better? Okay. Great. So I'm the Senior Data Architect at Crunchy Data Solutions. We're obviously one of the myriad Postgres support companies, but we do focus on security. We focus on auditing. We focus on compliance and standards. So that's some of what we're going to be talking about today. I've been actively developing with Postgres since 1999. So many, many, many years. But before we start, I'd like to make sure that I give 2ndQuadrant a call out. The extension you're about to see is based on an extension that was written at 2ndQuadrant by Simon Riggs, Ian Barwick, and Abhijit Menon-Sen. So what happened was there were some capabilities we wanted to add to that. We wanted to maybe try to get it committed to core. So I took over that extension and added a lot of new functionality, a lot of tests, et cetera, et cetera. So at this point, Crunchy Data and 2ndQuadrant are in lockstep on the PG Audit module. There's just one PG Audit module that we're both supporting. If you go to the 2ndQuadrant PG Audit GitHub, it's now a mirror of the PG Audit project. All right. So here's our agenda. The first thing is we're going to talk about what audit logging is. If you're here, you probably know. But if you were sent here by your boss, you may not know, so we're going to cover that really quickly and make sure we're all on the same page. We're going to talk about why you would audit log. Of course, there are lots of reasons. We're going to talk about how to audit log, i.e., not with PG Audit. This is a persistent need, so over the years lots and lots of ways have been developed to do it, of which PG Audit is the newest. We're going to look at some examples of how PG Audit is superior to other auditing techniques, and then we're going to have a demo. So let's get some boilerplate out of the way. An audit is an official inspection of an individual's or organization's accounts, typically by an independent body. So the distinction to make here is that what PG Audit does is create an audit trail or an audit log. PG Audit itself does not do an audit. An audit is an independent action by somebody that comes in, looks at your records, and decides whether you're in compliance with whatever standards you're trying to be in compliance with. So the purpose of PG Audit is to preserve the information that is required to do the audit. And like I said, there are lots of different types of audits. You've got government audits, standards audits, financial audits, ISO certifications. There are lots of reasons why you might be under audit. It might be completely internal. Your company may cross audit, or they may have their own internal auditing body, or any of that kind of stuff. It can also actually be used for debugging and metrics and monitoring. There's a lot of information you can extract from PG Audit, so it's possible that you may use it in development. That's not its primary purpose, but of course there are lots of ways that you can use tools. So how to audit. Given that this is such an important topic, over the years lots of people have done lots of things to do auditing. The first one is using triggers. In Postgres, of course, you can set up insert, update, and delete triggers.
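To make that concrete, here's a minimal sketch of what such an audit trigger might look like. The table and column names here are hypothetical, not from the talk:

    -- Hypothetical audit table; names are illustrative only.
    CREATE TABLE audit_log (
        logged_at  timestamptz NOT NULL DEFAULT now(),
        logged_by  text        NOT NULL DEFAULT current_user,
        operation  text        NOT NULL,   -- INSERT / UPDATE / DELETE
        table_name text        NOT NULL,
        row_data   jsonb                   -- the row as written (or deleted)
    );

    CREATE FUNCTION audit_trigger() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'DELETE' THEN
            INSERT INTO audit_log (operation, table_name, row_data)
            VALUES (TG_OP, TG_TABLE_NAME, to_jsonb(OLD));
        ELSE
            INSERT INTO audit_log (operation, table_name, row_data)
            VALUES (TG_OP, TG_TABLE_NAME, to_jsonb(NEW));
        END IF;
        RETURN NULL;  -- return value is ignored for AFTER row triggers
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER account_audit
        AFTER INSERT OR UPDATE OR DELETE ON account
        FOR EACH ROW EXECUTE PROCEDURE audit_trigger();

Note that a trigger like this only ever fires for writes, which is exactly the limitation discussed next.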
And whenever that trigger fires, you can take the information that's being inserted, updated, or deleted, put that into an audit table, and you've got your audit trail, at least for modifications. So who here has written these types of triggers or seen them in place? All right, yeah, more than half the room. So that's not surprising at all, because I've written them probably 10 times over my career myself, in different forms. There are some downsides to triggers, though. The first and foremost thing is that you can't log selects. So if you're just looking at modifications to the database, then you're in great shape. But if you wanna see what people are looking at, then the triggers don't do you any good at all. The second thing is that DDL was previously always ignored by triggers. Now you actually can do event triggers on DDL, and as of 9.5 they're actually useful in PL/pgSQL. It's broken down enough that you can figure out what command is being run, what kind of command it is, and what object it's being run on. So it's a lot more useful. However, it still doesn't work with global objects like databases and, most importantly, roles. And modifications to roles are something that a lot of people wanna see. So unfortunately, this is a big hole in event triggers. And the last thing about event triggers is that they are tied to the transaction that you're running in. So if the transaction rolls back, you won't have a record of the change. Now there is actually no change, of course, because the transaction rolled back, but you won't have a record that someone tried to make the change, which in a lot of cases can also be important. Another way to go about it is with functions. So essentially what you do is, for every operation you wanna do on the database, you write a little... I'm getting feedback, aren't I? Let me try to notch this down just a little bit and see how it goes. Maybe I'll stand a little further away. Can you still hear in the back with that minor adjustment? Okay. So this technique relies on using functions. Basically, for every operation you wanna do in the database, every insert, every update, every delete, every select, you write a function. And the users, the developers, whoever your users are, call those functions for all their database operations. So they can't directly do insert, update, delete, select, or whatever on the database. So this works. You can capture everything you wanna capture about every operation the users are doing on the database, but it's extraordinarily painful, and every time you wanna do something new, you have to write a new function. I mean, you're basically crippling yourself, because here you have this wonderful SQL system and you've essentially told your users, no, you have to call this function to get your data. So it's a pretty bad method, but if it's the only thing you have and if the requirement is absolute, this is one way to do it. And the next thing that's used is setting log_statement in Postgres. There are actually a couple of different levels. Here I've just kinda given the example of log_statement set to all for illustrative purposes, but you could also set it to just DDL, or DDL plus modifications, or the setting we have here, which is all. So this one's great because it gets everything. Every client statement that comes to the database is logged. So that sounds like it'd be a great thing. The problem is, it's very hard to parse.
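For reference, a minimal sketch of that setting and the kind of raw line it produces; the query shown is just an example:

    # postgresql.conf
    log_statement = 'all'    # valid values: none, ddl, mod, all

    # which yields raw log lines along the lines of:
    # LOG:  statement: SELECT * FROM account WHERE id = 42;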
And we'll go into examples of why that's true, but you're just getting the raw statement from the client. It could be a do block. It could be some kind of dynamic SQL. There are all kinds of things that could make it very hard to understand what's actually going on. The other thing is that there's really no way to filter. I mean, there is sort of, as I said, because you can set log_statement to four different levels. But within those levels, there's not much way to filter. And if you wanna see everything, then you turn this thing on on a busy system and just forget it. Not only will performance suffer, but you're gonna start filling up disks quickly and you're gonna need to offload those logs somehow. So that brings us to PG Audit: how to log the PG Audit way. What PG Audit does is bring a lot more granular logging to the table. The idea is to introduce more classes and allow you to turn them on and off independently, rather than having this tiered system. So the classes that it supports are: read, which is kinda obvious, select, of course. But there are also other read operations that might not be so obvious, like, for instance, copy. Copy can be a read operation. If you're doing insert from select, that's a read operation and a write operation, so it's gonna be logged two ways. You're gonna log the things you were selecting from and the things you were writing to. You've got write, which again covers things like copy, update, delete, insert. So there are a couple of things that are included there. Function means to log all the function calls that are made. So whether you're calling them directly or you're calling them in your SQL, you'll get a list of all the functions that were called, which can be a little overwhelming. Role is a pretty important one. We've actually split this out. You can kinda think of role and DDL as almost the same category, because you're creating and moving objects around. But because roles are so important, we split those out into their own category. You've got DDL, which is pretty obvious: create table, drop table, you name it. And then miscellaneous. Miscellaneous is things like vacuum, reindex, things like that, that are primarily administration functions and don't actually modify data. That's a pretty important thing about the miscellaneous category: nothing in there actually modifies data. It's just moving things around, kind of admin-type functions that you probably don't wanna see but are fairly rare anyway. In the case of vacuum, it wouldn't log autovacuum and things running in the background. These would be explicit vacuums run by the user or the superuser that would actually get logged. So these broad classes make up session logging. You can imagine uses for this; I've been in situations before where we had to basically prove that our DDL had not been updated, functions hadn't been updated, tables had not been updated from release to release. And we had to show that all those updates happened in the release window. That was one of the audits that we went through. So you could use this for that. You could turn on DDL logging, and now it's gonna go through and capture all the DDL. You can capture those logs and show the timestamps of all those DDL commands and show that those happened within the release window. In this case it was just a verification that people weren't randomly making changes to production, which is a bad thing generally.
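As a sketch, turning on session logging looks something like this, assuming a pgaudit package is installed; the class list and the database and role names are illustrative:

    # postgresql.conf -- pgAudit must be loaded at server start
    shared_preload_libraries = 'pgaudit'

    # log DDL and role commands for the whole cluster
    pgaudit.log = 'ddl, role'

    -- and in the database itself:
    CREATE EXTENSION pgaudit;

    -- or, as a superuser, scope the classes more narrowly:
    ALTER DATABASE sales SET pgaudit.log = 'read, write';
    ALTER ROLE deployer SET pgaudit.log = 'ddl';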
But obviously, especially for things like read and write, you can get that fire hose again. So you turn on read logging, and now suddenly every select is being logged. Now we're in a bad situation again on a really busy system. So we've also introduced object logging. What object logging does is give you fine control over exactly what you wanna log in terms of select, insert, update, and delete. You can create an auditor role and assign permissions to that role, and the logging will be based on those permissions. So for instance, if you gave the auditor role select privileges on certain columns in a table, let's say the password column, then every time someone selected from that password column you would get an audit event. If they selected other columns in the table, there would be no audit event. It would be ignored completely. And you've got fine-grained control over select, insert, update, and delete using the grant system on that auditor role, and those are exactly the things that will be logged. So for the read and write classes you can narrow this down enormously. You can literally just audit one column of one table in your entire database if that's all you wanna see. And then from there, of course, you can build policies that extend out. You can have different audit policies for different users. So there's a lot of flexibility here. It can get a little complicated, but it is incredibly flexible. The other thing about PG Audit is there's a lot more detail in the logs. So log_statement set to all was basically just giving us the big client statement. With PG Audit you're gonna get a lot more information. You're gonna get the object type. Say it's a create table: not only will it tell you the command is create table, it'll tell you the object operated on was a table. It's gonna give you the fully qualified object name, which is really important because with search paths, one particular query can work very differently for different users. A lot of people will use search paths to do multi-tenancy, so the same query can be interpreted differently. One user is gonna go into one schema and get this table for this customer, and another user is gonna go in and get that table for that customer. If you're just looking at the bare query that was sent to the server, this is not going to be obvious. So one of the things PG Audit does is go in and figure out, in that context at that time, which tables are being addressed, and it will list those out for you, so you know exactly where you're going. It also gives you some nice things like stack depth. So if you've got functions calling functions calling functions, you can actually see where in the stack you are. Is this a top-level call, or was it something that was called by something else? It gives you the actual statement that was run, of course, the bare statement. Although, as we'll see, even that's gonna be more useful than what log_statement set to all will give you, and we'll have an example. It'll also spit out the parameters. So if it's a parameterized query, you'll get a list of parameters that were passed to that query, rather than just a bunch of question marks, which really aren't that useful, et cetera. So before we go look at an example, let's talk a little bit about the design. PG Audit is implemented as a Postgres extension. It is open source, of course. There are several support companies that provide packages for it, including, of course, Crunchy Data, who I work for.
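Here's a sketch of that object-level setup. The pgaudit.role setting is real; the auditor role, table, and column names are hypothetical:

    -- Object audit logging is driven by the grants given to a dedicated role.
    CREATE ROLE auditor;

    # postgresql.conf: tell pgAudit which role's grants define what is logged
    pgaudit.role = 'auditor'

    -- Now only access matching the auditor's grants generates audit events:
    GRANT SELECT (password) ON account TO auditor;
    -- SELECT password FROM account;  -> audited
    -- SELECT name FROM account;      -> not audited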
And so you don't have to build it yourself. It is possible to get packages for it, but if you wanna build it, you're welcome to download it and do that. It's supported on 9.5 and 9.6. It uses features that were introduced in 9.5, especially the event triggers, but there are also some internal reasons why 9.5 is required. So basically it sits there and uses almost every hook in the book. Executor hooks, utility hooks, object hooks, I mean, everything. And what it's doing is trying to figure out what Postgres is doing internally from the information that's coming through all these hooks. It's a bit of a trick, actually. Oh yeah, sorry. So it works. It actually works extremely well, and it will give you an incredible amount of detail about what's going on in your system. But if you actually decide to go look at the source code, don't be surprised. Anyway, a caveat is that it may log statements that eventually raise an exception. So let's say you're in a transaction, you're doing update, update, update, update, update, and then you get to some statement that errors out and rolls back the entire transaction. Well, those statements will still be logged. There is a tool that we're gonna look at a little bit later that will help you identify transactions that did not commit. So you can identify statements that did not commit, but in general, if you're just looking at the bare logs, you need to be a little careful, because not everything necessarily will have committed. It also doesn't log statements that contain syntax errors. If the SQL itself is malformed, it's not gonna get far enough down into Postgres. Basically, it's gonna blow up in the parser, and that'll be it. It'll get rejected and sent back to the client. PG Audit will never see it, and therefore it won't be logged. There are a couple of mitigating things here, though. First of all, the error will be logged as long as log_min_messages is at error or lower, which it is by default. So that error is gonna pop up in your Postgres log just like any other SQL error would. The second thing, of course, is that the user never got any data from that statement. Whatever that statement was, it was rejected by Postgres before there was any extraction of data from the database. So you know that they didn't see anything. So you can make the argument that the statement does not need to be audited at all. And the same thing is true for statements that are attempted while a transaction is in an aborted state. Postgres is just gonna immediately kick those statements out; it won't allow anything to run while the transaction is aborted. So those also won't be seen by PG Audit. All right, let's look at an example. So imagine the user has sent us this statement. This is a pretty contrived example, but I think it illustrates the idea I'm getting at, of how difficult it can be to just look at the statements coming from the client side and interpret exactly what they are. In this case, this is a DDL statement. We're going to create a table which we are going to call important_table. And of course there's a little bit of obfuscation here to make it not obvious what's going on, even though we can read it quite clearly. The next statement below is what actually gets logged with log_statement set to all. So what gets logged is exactly what was sent. So that's kind of cool.
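The slide isn't reproduced here, but the example closely resembles the one in the pgAudit documentation. Roughly, the obfuscated statement and the two kinds of log output look like this:

    DO $$
    BEGIN
        EXECUTE 'CREATE TABLE import' || 'ant_table (id INT)';
    END $$;

    -- log_statement = 'all' just echoes the do block verbatim:
    -- LOG:  statement: DO $$ BEGIN EXECUTE 'CREATE TABLE import' || ...

    -- pgAudit, by contrast, emits entries along these lines:
    -- AUDIT: SESSION,1,1,FUNCTION,DO,,,"DO $$ ... $$;",<not logged>
    -- AUDIT: SESSION,1,2,DDL,CREATE TABLE,TABLE,public.important_table,
    --        "CREATE TABLE important_table (id INT)",<not logged>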
It's a verbatim copy of what was sent, but it's not very useful. You might be able to grep through pretty easily and figure out that a table was created, but getting the name of that table is going to be extremely difficult from here. And getting the schema of the table would be even more difficult, because you'd have to know what the search path was. So let's look at the same example with PG Audit. The user statement comes in, and then what gets audited is a couple of things. The first thing that gets audited, because in this case I actually do have function-level auditing turned on, and do blocks are considered to be functions for this purpose (they're just anonymous functions), is basically the same thing that log_statement set to all logged. But there are a couple of differences already. One is that, can you see my arrow up here? There we go. One is that it's been classified as a function. So we know that the statement that was sent is a function. We know that it's a do block, so we know it's an anonymous block of code. There are a couple of fields here which aren't filled in for this particular action. And then the next thing we get is the actual create table statement. So obviously when this function runs, that string is put together and it's sent to Postgres with the execute. So Postgres eventually gets the real create table statement and, lo and behold, here it is, we can actually see it. So it's DDL, the command type is create table. The type affected is table, which is pretty obvious in this case but isn't always obvious. The name is public.important_table, so it clarifies that this was created in the public schema. And finally it gives you the statement, and we can see here that now the statement clearly says that we're creating important_table. So sure, we can get that information out of the columns here, but if you really do wanna see the actual statement that was run, here it is. So this makes dynamic SQL auditable. So one of the things about PG Audit: it does not stand alone. PG Audit is expected to work in an ecosystem with other tools to fulfill certain purposes. So to that end, these are the standards: Common Criteria level two plus, which we got certification for last year, and the Security Technical Implementation Guide, which Jason worked tirelessly on, and some of you may have seen his talk on Tuesday. That actually just came out last week. So these are two standards that you can use, or the community can use. Common Criteria is a little different, but certainly in the case of the STIG, this is just a general guideline for making Postgres secure and making it compliant with the NIST and DoD standards. If you're DoD, then in theory, or maybe more than in theory, you're actually supposed to comply with all of the provisions of the STIG. But if you're not DoD and you're just interested in securing certain parts of your enterprise, or getting logging working right, or getting auditing working right, role management, encryption over the wire, I mean, there's a whole list of things that the STIG covers, so you're welcome to cherry pick and say, okay, we're really interested in doing authentication and encryption correctly, so go look at that. An example is that the standard does not allow MD5, and Postgres currently, in 9.x, uses MD5 for internal password hashing.
So there are alternatives laid out in the STIG, things that are DoD compliant like Kerberos and certificates and some other things that you can use. Oh, by the way, the password thing is solved in 10.0, though. The SCRAM patch made it in, so now we actually have good password hashing. Yeah, woo, everyone's happy about that. All right, so let's do a bit of a demo here. Let's look at some actual live data. That's always good, family fun, okay. So tell me if you can read this in the back. How does that look? I can go bigger. How about that? Top window look good? Okay, I'll make the bottom window match the top window, or maybe it already did match. They both look good? Okay. All right, so what we're gonna actually look at here is PG Audit Analyze, which is a companion program to PG Audit. What it allows you to do is take the raw logs that PG Audit is spitting out into the Postgres logs, parse and analyze those, and get them into data structures that are much easier to work with. In addition, it adds a couple of extra features which are not specifically enabled by PG Audit, with regard to user logging: determining when users have logged on, how many successful attempts they've had, last successful attempts, number of failures. This is all really standard kind of stuff that you see in a lot of systems but isn't available with Postgres natively. But you can turn on connection logging in Postgres, and it will spit out log messages for everything it does with reference to logon and disconnect and all that kind of stuff. So that's what we use here. We collect all that information out of the log, and then in addition, of course, we're pulling all the PG Audit information out of the log and loading it into tables. So what I'm actually gonna do is run some of the regression tests, and we're just going to watch the output and talk about it a little bit. So we need to get this started. We need to figure out what our log file is now named. We'll poke around the database a little bit too. All right, so this is a lot. Part of this is because we've just started up the database, so we've got a whole lot of messages that have come across that we're mostly going to get rid of at this point. So these first tests are all related to connections and making sure that connections are logged correctly. We're gonna kind of skip over most of those and come back and look at some of those user tables in a minute. Like I said, this part of PG Audit Analyze is not directly related to anything that PG Audit produces. These are log messages that are produced already by Postgres. So we're going to skip those and hopefully get to something we're a little more interested in. So this is something more interesting. All right, so this test is just making sure that a select statement is logged using the read setting we talked about, session read. We're not gonna do anything fancy here yet. We're just going to go look at this and make sure that this is turned on. So we go through, and the first thing we do, since we're doing everything in session, is just set the logging privileges that we want. Now, to be clear, all the PG Audit settings can only be set by a superuser. So a user is never going to be able to set their own audit settings. That would be kind of bad.
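A simplified sketch of what that test sequence amounts to; the user and table names here are illustrative:

    -- connected as a superuser: create the test user and its audit settings
    CREATE USER user1;
    ALTER ROLE user1 SET pgaudit.log = 'read';
    ALTER ROLE user1 SET pgaudit.log_relation = on;

    CREATE TABLE test_table (id INT);
    GRANT SELECT ON test_table TO user1;

    -- then, reconnected as user1, run the statement to be audited:
    SELECT count(*) = 0 FROM test_table;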
But for the purposes of the example, it actually logs on as postgres, creates this user, sets a bunch of settings, logs on again as that user, and then does the audited command. It works well for testing. So the first thing we do is set pgaudit.log to read. The next thing we do is set log_relation to on. What this means is, rather than creating a single audit record for a query, which is the default, it's actually going to create a separate record for every table that was touched in the query. So this is a very verbose mode that we're going to use to allow us to get a different line item for every relation and make sure we're seeing everything that we want to see. So we create our test table, and then we grant select on that test table to our test user. And then we connect as that user and run our SQL: select count star equals zero from test table. Okay, so not too exciting, but it is a query. And then there's this next statement here, which we can kind of come back to in a minute. So what's happening in the background is that these statements are coming out, and the PG Audit Analyze daemon process is tailing the log and reading everything. And once it gets that information, it goes and inserts it into tables in the database. So you won't want to audit those tables. Auditing the audit table is a really bad idea, because obviously it's going to create this lovely little loop for you. This has happened once or twice. So this query, which is a little complicated and we'll play with it in a couple of minutes, is allowing you to just go in and pull information out of these audit tables. The audit tables are highly normalized, so on top of them there's this view called vw_audit_event that brings all the highly normalized tables together in kind of a big denormalized view. If you're writing code against the audit tables, you probably don't want to use this view. It's not going to be as efficient as going against the tables. But if you're just an admin poking around and looking for something, the view might be the way to go. So, in this case, the test was: does this row exist in the audit table? So select count star equals one from the audit table where all these conditions are true, which will exactly match the query that we just audited. And that demonstrates that, yes, the audit records for that query did appear in the audit log. And we can actually see them down here, of course. Like I said, this is all pretty verbose, but we can see here's our audit record. And it shows us that it's a session audit record, the type is read, it's a select statement, it selected on a table, the name of the table is public.test_table, and then here's the statement. If you're actually auditing multiple tables, you can turn off the statement logging for subsequent tables. So it'll just log the statement for the first one, and for subsequent ones it won't. And the way you would actually associate those is by the statement ID. This is a sequential number that's going to increase throughout the entire user session. So if there are multiple audit records for one statement, you can associate them in that way using the statement ID. This guy next to it is the stack depth. As you work your way down through functions, this stack depth is going to increase. All right, so let's look at our next one. All right, so in this case, what we've got is, we've gotten an error.
So this is an example of what I told you, that you'll still get errors in the log. What we're trying to do here is change the audit permissions for a user, as that user, which isn't going to work. This is actually the reason why we have a test for this: to make sure you cannot do this. You should not be able to change your own audit settings. If somehow there's a code regression that allows that, then that would be a major violation. So we do test that this error actually occurs correctly, and that's what we're doing here. And you can actually see these attempts at setting things: the successful attempt was logged, and the unsuccessful attempt was also logged by Postgres here. So it says, error, permission denied to set parameter. This is not an audit log, right? The PG Audit module did not create that, but you do have this in the log, and the PG Audit Analyze program is able to pick these up and show which transactions errored like this. Let's see here, what's our next one? Oh, that was a little boring. I'm gonna skip that one. More role stuff. We have a lot of... all right, let's look at the last one. And for some reason I lost my tail. I'm guessing that the log switched. That's what happened. So let's find the most recent one. Okay. So unfortunately we got a bunch of junk in there we don't want. But this is actually pretty much exactly the same example I showed you before: creating this important table, and then there's some SQL there too that shows us what actually happened. But what we can do now is get out of this, and we can run psql. Oops. By the way, one of the reasons why I do scripted demos is because I'm absolutely terrible at typing in front of an audience. I can speak, but type, absolutely not. So let's go look at the statement that we had just run, if I can get this all pasted in correctly. So as I said, we did this DDL, we created the table, and now we're going to check the actual event logs to see that it exists. And like I said... let me, hang on a second, there we go. That's going to be a lot more readable. So now we can see that this is the same information that was in the log, of course. But now it's all been nicely parsed out. It's in columns, it's easy to query, it's easy to use. And as I said, the tables that all this stuff is based on are highly normalized. So you actually have a user session table that'll describe the entire session of that user, with all the audit records for that session. And then you're going to have a statement table which defines a high-level statement that came in from the user. So they've come in, and now they're asking Postgres, the backend, to do something. And then underneath that, you're going to have every sub-statement that ran as part of that statement. So you can really get into it, and this view just brings it all together into one big chunk so it's easy to look at, because the normalized tables obviously are going to be hard for me to write a query against here. So yeah, you can get the information that way. Other things you can do: PG Audit Analyze is partly supplied as a reference program, so that you can go in and look at how audit logs can be analyzed and parsed out. The audit records themselves are compliant CSV, and Postgres embeds them in CSV if you use CSV logs, so everything is very easy to get out and very easy to parse.
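Going back to the view for a moment: as a sketch, querying it might look something like this. The view name vw_audit_event comes from the talk, but the schema and column names here are illustrative guesses, so check the pgaudit_analyze SQL for the real definitions:

    -- hypothetical: poke at the denormalized audit view as an admin
    SELECT session_id,       -- which user session produced the event
           statement_id,     -- sequential per session; groups related records
           class,            -- READ, WRITE, DDL, ROLE, ...
           command,          -- e.g. CREATE TABLE, SELECT
           object_name,      -- fully qualified, e.g. public.important_table
           statement
      FROM pgaudit.vw_audit_event
     ORDER BY session_id, statement_id;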
But obviously these records are going into the Postgres log, which isn't always ideal. You might want your audit records to go elsewhere. That's another place where you might want to consult the STIG, which has guidelines on using syslog to move things off the server and to split stuff out. So there are lots of possibilities. PG Audit does not have any logging facility built in itself. And this decision was made just on the basis of, well, logging is actually surprisingly hard. I know it sounds like a really easy thing, but doing logging well, getting it working, making it reliable, and making sure everything's logged in the same place and nothing is lost is actually a pretty big problem. And Postgres has solved this problem. There are some gaps there, there are definitely some things we could improve in the way Postgres does logging, but it has a pretty good logging facility, and so we made the decision to stick with that. And if you need to take logs off the server or split them or do things like that, we feel it's better to do that downstream. There are a number of available tools to make that work. And like I said, the STIG actually has examples of how to make this work. So it is a practical thing that you can go and look at and go, oh, okay, so here's how I can split these logs. And also, importantly, get the logs off the server. You really don't want the audit logs living on the database server. You'd really like to get them someplace where they're duplicated, backed up, and so on. All right, so that's about all I got. I kind of burned through quickly. I meant to say at the beginning that you can ask questions during the talk, and I completely forgot. So normally the questions actually chew up a fair amount of time. But anyway, I guess now would be the time to ask: does anyone have any questions? Do you mean actually disabling the hooks in the PG Audit module itself? Okay. Well, I imagine on a fork like that you would actually have a little difficulty building PG Audit in the first place, because of course it's going to expect all those hook variables to be there. And it's gonna insert itself into the chain and then call down. So it probably would not build. So yeah, I would say that, depending on the hook, there are some hooks that are there to add richer information to information that's already there. So we use multiple hooks to layer information in. You might be using the utility hook plus the object creation hook plus the event DDL hook, all three of those, and each one layers information in, so it gets richer and richer. So there are certain hooks you might be able to take out where you would just be reducing the amount of information you're getting. There are other hooks that, if you took them out, would just hobble it completely. So it would definitely be something you'd have to look at very carefully if you were doing it. Yes. No, it doesn't. Like I said, PG Audit Analyze will help you figure out whether actual transactions committed or not. And the transaction IDs are actually logged; PG Audit Analyze uses information that's provided by Postgres. PG Audit doesn't touch it, but the CSV log will automatically have it, and if you're using text logs, you can include the transaction information in the log line prefix. PG Audit doesn't do that because that's already a function of Postgres. I'm sorry, I forgot to repeat the question.
He was just asking if XIDs were logged, so that you could identify statements that had or had not committed. Postgres already provides that information, so you can get it there. So the question was basically, will vacuum run if PG Audit is enabled? Yes. Well, you're not gonna actually get any audit records for automatic vacuums that are happening in the background, as I said earlier. The only time you would get an audit record for a vacuum is if you had miscellaneous logging turned on and you were doing manual vacuums as a user. So let's say autovacuum is running in the background all the time. That will never be logged. But if you had miscellaneous turned on and you do that one weekly vacuum, right, that a lot of people do, that would be logged. But really, to be honest with you, the reason why we created that miscellaneous category is that this really is all stuff that basically no one wants to log. So if you really want to, it's there. PG Audit will log everything that Postgres does. But these were things that we categorized as, like I said, reindex, vacuum, things like this, that people probably don't wanna log. But if you do, you can. I feel like I'm not quite understanding your question. I'm sorry. Yes, oh, okay. So sorry, you were talking about sort of cleaning up audit logs and things like that, not vacuum in the sense of vacuuming the Postgres database. So basically it's more of a question about performance and log generation. Okay, so really the question is, what's the overhead? How many logs are generated? Well, kind of the point of the granularity that we provided is to allow you to log just what you want, right? So it could be a lot. So yes, that is something you'll have to contend with, and you will have to provide log space, or you'll have to be offloading logs. There are a variety of ways to do this. You can use syslog to log stuff. You could use an ELK stack to offload things. There are a number of solutions to the problem of how to deal with logs and how to archive logs. We don't really want to try to solve that problem here. As far as performance goes, what I can tell you is, last summer PG Audit underwent extensive testing by a major financial firm, and they found the performance to be extremely good. I mean, there was just a little bit of performance off the top, a percent or two, but overall they were extremely happy with how transparent it was in terms of performance. So there is obviously the cost of logging and the cost of writing to disk, but what we looked at through that effort is the actual cost of generating the audit logs internally, because you think, with all these hooks in there, that's definitely going to affect performance. As far as we can tell, across a very wide variety of use cases, it really does not noticeably affect performance. Yes, it's a few percentage points slower, but for what you get, for all of us that was deemed acceptable, because we kind of went into the entire exercise thinking the overhead might be as much as 10%. And when we discovered it was a quarter or less of that, everyone was really happy. Like, wow, okay, yeah, we can do this. It was totally within the margins that they were hoping for.
So I think you can not worry about performance, as long as you aren't turning on the fire hose. You do limit the data as much as you can, not only for reasons of performance, but because later you've got to sift through this data, do something with it, and store it. Right, there's this whole list of things, so you don't want to produce more data than you have to if you can help it. And that's why we have all these granular controls in there, to allow you to limit it. Jason? [Are CREATE POLICY statements audited?] Yes, yes. Yes, they are. Okay. Yeah. [Audience question, partially inaudible: about auditing a single column, for instance a phone number column that everybody can select from, where a normal program selects one or two values at a time. Is there a way to only log when someone selects, say, ten or fifteen of them at once?] No, unfortunately. So the question was basically, are there any ways to only log after certain thresholds have been crossed, or something along those lines? And unfortunately, the answer is no. This is exactly the kind of thing that would slow logging down a lot, unfortunately. And we did consider some things along these lines, but in order to make that work, you've got to keep state. So now I've got whole collections of statements with counts and other things like that, and that is going to add even more overhead. We were trying to keep the overhead as minimal as possible and leave the analysis till later. One thing you can do, though, is use roles to limit the logging. So if you've got logon roles for people that you need to audit, you can make sure they're audited, and your ETL process, say, which goes through some other certification, doesn't have to be audited. Yeah, there are a number of ways to filter down the data, but none of it is stateful. PG Audit keeps track of the entire statement stack, of course, but once the statement is done, it forgets about everything forever. So the idea is that that kind of analysis would be done after the fact, in post-processing, if you will, and that's a lot of the kind of stuff that PG Audit Analyze does: go through, find relationships, and pull everything together. Any other questions? Sure. [Audience question, partially inaudible: the questioner's application does its own audit logging at the application layer, for the limited subset of the application that involves persistence, which lets them retain the reasons that certain decisions were made in the logic. He hadn't considered using something like PG Audit, and asks what value PG Audit could provide in that kind of scenario.] Well, I think the main value adds... so the question is, essentially, this gentleman has basically implemented their own audit logging at the application layer, and by doing that they're able to add a lot more intelligence, make sure things are working the way they expect, and take care of contextual stuff like he was talking about.
The primary advantage of PG Audit is that, because it's running inside the database, you can have a lot of confidence that everything gets caught. At the application layer, you're going to have the problem of, what if someone forgets to write the audit command for this particular new module? So you're obviously gonna need to audit your code to make sure that always happens. If a new program is written, if some ETL process is brought up, if now you're doing BI, right, by putting auditing in the application you've kind of put the burden on yourself to do it everywhere, or to decide where it needs to be done. The nice thing about being centralized is, if you tell PG Audit to audit certain commands, it will. And you can have a high degree of confidence that's gonna happen for every application that's calling the database. So is it less than ideal in terms of having to come back through and look at the logs and pull those up? Yeah, a little bit, but like I said, you have a central source where all your logs are produced as well. So I think there are some real advantages there over an application-based solution. I think there was one more question back here. Well, I think for my money, what I would do, if you're using PG Audit to do that kind of logging, is turn log_statement off, right? I would turn it off, because there's nothing it does that PG Audit doesn't do. However, the one example I gave back here: log_min_messages, right? This controls what the server's going to log, and the default is error on every distribution that I know of, and you'll wanna leave that, right? You wanna see when those errors come out, because those are the things that PG Audit cannot catch. There are certain types of errors that will not get far enough down to be seen by PG Audit, which really starts at the planner stage; that's when it starts to get informed about what's going on. So you'll always wanna have this one; it's good to use in conjunction. But as far as log_statement goes, I would just keep it at none, actually, sorry: the off setting is none. So you wanna keep log_statement at none, more than likely, because PG Audit's gonna give you much more granular logging for anything that log_statement can do. Okay, so we have time for one more question. It is server level. So PG Audit runs in every backend that starts, and it's started at server start. It's loaded via shared_preload_libraries, so it comes up initially and it's forked with every process. So it'll run for everything. PG Audit Analyze is per database, actually, though. So you have to set it up for each database, yeah. So in that case, it's per database, but PG Audit itself is definitely per cluster, for everything, based on your policies. All right, and that's all the time I have. Thank you very much.