Okay. So, hello everybody. Let me introduce Joe Conway, who is a long-time contributor to PostgreSQL and who will talk about seccomp for your PostgreSQL. Thank you.
So is this mic back on? Can you hear me in the back? All right. So like I said, my name is Joe Conway. I've been working around and with the Postgres community for a lot of years, 20-plus years using Postgres. I've been a committer for about the last 16 or 17 years, I guess. And I'm on the PG infrastructure team. I'm also the VP of Engineering for a company called Crunchy Data in the U.S. We do a lot of work with companies and organizations that care a lot about security, and so I've done a number of talks on security. The way I like to think about it is kind of in a holistic way. Security involves protecting your database from the outside, right? The operating system that it's on, the box, actually getting to the box, how the operating system is protecting Postgres itself. Also from the inside: inside of Postgres, you lock down permissions so that only certain users can access certain tables and so on. So that's kind of the crunchy core. But once someone is inside your database, you have to worry about them finding some kind of a bug in Postgres and exploiting it, right? So part of what you want to do to enhance security there is what's called confinement: reducing attack surface. And that's what this talk is all about. It's about reducing attack surface once you're inside of Postgres, using seccomp. And then the other aspect of security, which we should mention, is that you should instrument. You should have monitoring and alerting set up. And it should not just be monitoring and alerting around performance issues. It should cover things like whether someone restarted my database, and send an alert, because if someone did restart your database, how come you didn't know about it? Or hopefully you did know about it.
So what we're going to do today is talk a little bit broadly about seccomp. I'm going to show some example C code, actually, just to give you a feel for how it works. And then I'm going to go into how you might use systemd to set up seccomp for a service, and then specifically for Postgres. And then I wrote an extension called pgseccomp, and so I'll end with that. The repository for that we just made public about a week ago. It's brand new. I've not actually even done a release out of it yet. But the code is pretty solid. I've been working on it for several months. And in fact, it originates from a patch that I submitted to the upstream Postgres project last August. The community was not quite ready for the idea of having seccomp filters built in. So instead, I implemented it as an extension, which is one of the great things about Postgres: you can do a lot of things through extensions. So I'll talk about how that works and how to use it. So just to give me a feel: I assume if you're in here, you have some idea what seccomp is? Anyone? Half the room? Okay. So this paragraph, which I'm not going to try and read, is from the documentation on kernel.org. But basically, seccomp stands for secure computing with filters. It's a filtering mechanism that's built into the kernel, and it allows kernel attack surface reduction. What it allows you to do, at a really high level, big picture, is either block or at least audit what syscalls are made to the kernel. The original version of this was called strict mode. It came out in about 2005 in Linux 2.6, so it's been around quite a while. But this was a pretty limited-capability version of it. It was basically hard coded. The idea was that you would make a call to load seccomp, and at that point the process irreversibly could only make these four syscalls.
So you could basically exit, you could read and write from a file descriptor that's already open, and that's all you could do. So that had, I'm sure, some useful applications, but was pretty limiting. And any other syscall would actually cause the process to be killed immediately with SIGKILL. In 2012, seccomp filter mode came out. This is also sometimes called seccomp BPF. I think I was in a talk yesterday and someone mentioned seccomp BPF. BPF is a built-in kernel facility for doing filtering. It was originally for packet filtering, but it can also be used for other things. And so this was used to add a flexible way to filter which syscalls you want to allow, which ones you don't want to allow, and what action you want to take when they get called. This is kind of an oversimplification to some extent; there are more actions than this. But as far as the basics go, based on the syscall, you can add a rule that will say: kill the process if the syscall is used; throw an error, and you get to supply the error number; log the event, so don't do anything, allow it, but log it in the auditd log; or just simply allow the syscall. You can use these along with the fact that you can set a default in your filter. So you can build a whitelist or you can build a blacklist. If your default is allow, and you set up rules for blocking certain syscalls, you've effectively built a blacklist. That's not what's recommended. What's recommended is that you build a whitelist, which means you set your default to something like log or kill, and then you provide a specific list of syscalls that you want to allow. Now there's a library called libseccomp, I think written by Paul Moore. That uses seccomp BPF, and it is also a nice interface to seccomp from a C program. So that's what I use for pgseccomp. I really have no idea what systemd is doing internally. I didn't try and look at that.
So first off, you need to have support for seccomp built into your kernel. The way you can check that is using this grep command, and you should see something like this, basically indicating that your kernel has been configured for seccomp. These days, I'd be pretty surprised if you came across one that wasn't. But it is worth checking anyway. This is a little bit of a side issue, but there's another kernel parameter, basically called no_new_privs. What that does is say that if you set no_new_privs, then the process and any child process of that process can never increase privileges. And in order to use seccomp as an unprivileged user, this has to be set. Otherwise it wouldn't make sense: if you didn't have this set, and you had some filters, someone might be able to bypass those filters down the line, and you wouldn't want that. So in terms of libseccomp, the basic usage is that you're going to init seccomp, as I said earlier, with some kind of a default action. You're going to add a number of rules for specific syscalls, and then you're going to load the filter. And you can load multiple filters; basically, they just layer one on top of the other. The key aspect of this, again, is that once a filter is loaded, subsequent filters cannot relax the restrictions. So you cannot at one point say that for the write syscall I want to throw an error, and then later load a filter that says I want to allow the write syscall. That won't work, because the most restrictive action is the one that will get used. So as I just said, once the filter is loaded, you cannot relax the restrictions. You can load multiple filters. And importantly, all child processes inherit all the active filters as well; they can also set their own. And the highest-precedence action is always the one that's going to be taken. So is this readable in the back? Good.
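The layering rule just described (filters stack, and the most restrictive action wins) can be sketched as a small model. This is not libseccomp and it does not touch the kernel; the `Filter` class and `effective_action` function are hypothetical names made up for illustration, and only the four actions mentioned in the talk are modelled.

```python
from dataclasses import dataclass, field

# The four actions discussed in the talk, most restrictive first.
# The kernel defines more actions than these; this model ignores them.
PRECEDENCE = ("kill", "error", "log", "allow")


@dataclass
class Filter:
    """One loaded filter: a default action plus per-syscall rules."""
    default: str
    rules: dict = field(default_factory=dict)

    def action_for(self, syscall):
        return self.rules.get(syscall, self.default)


def effective_action(loaded_filters, syscall):
    """Combine every loaded filter; the most restrictive answer wins."""
    if not loaded_filters:
        return "allow"  # nothing loaded yet, so nothing is restricted
    answers = [f.action_for(syscall) for f in loaded_filters]
    return min(answers, key=PRECEDENCE.index)


# First filter: error on write, allow everything else.
loaded = [Filter(default="allow", rules={"write": "error"})]
before = effective_action(loaded, "write")

# A later filter tries to allow write again; it cannot relax the first one.
loaded.append(Filter(default="allow", rules={"write": "allow"}))
after = effective_action(loaded, "write")

print(before, after, effective_action(loaded, "read"))  # error error allow
```

In this model a later filter can only ever move a syscall towards the restrictive end of `PRECEDENCE`, which is the behaviour the talk describes for real seccomp filters.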
So what I wanted to do here, and I actually was doing this just to investigate the behaviors of libseccomp and seccomp in general as I was writing these slides and doing the work on the extension, because one of the things that became clear to me is that seccomp can be really kind of hard to understand and get your mind around if you haven't played with it a little bit. So this is a simple example, but it illustrates a number of important points. Basically what I'm doing here is creating a context for seccomp. I'm going to loop three times, so I'm going to create and load three seccomp filters. On the first iteration, my write action is going to be allow. On the second iteration, it's going to be log, and on the third iteration, it's going to be allow. What I'm doing is initing seccomp with a default action of log, and then I'm adding a rule that says the write syscall should be whatever my action is. So the first time through it'll be allow, the second time it will be log, and the third time it will be allow. And then I'm going to load the filter. Then I'm going to try and use printf. And finally, at the end, I'll release the context. So I'll load three filters. Here's basically running that compiled C code. And if you look through the audit log, what you find is that you get these syscalls that get logged. Is it too low in the back when I turn my head? It's all right? Okay, because I was getting the feeling it was louder when I was looking forward. So what's interesting to note here is that the only thing I really did was call printf, and you get all these kernel syscalls being made. I'll go through this on the next slide, but syscall 5 ends up being fstat. These two syscalls end up being prctl and seccomp. Syscall 1 is write. And this one is basically the exit on return. And I'll talk about it a little bit later.
But in the pgseccomp extension, I provide a little shell script that will basically pull just the names of the system calls out of the audit log and put them in a form that makes it very easy to use with the extension. So as you look at that simple bit of C code, what you're seeing is, first of all, that before the first time you've loaded a filter, nothing gets logged, right, because there's no seccomp filter loaded. So going into the first pass, there were no calls that were logged. The very first thing you see is that printf calls fstat, but it only calls it once. I haven't even really researched why that is. I assume that's because the first time it gets called, it needs to check to see if standard out is there, I suppose, right? And printf clearly requires write. But in that first filter, I said write was allowed, so it did not get logged. Now in the second loop, I had a rule that says write should now be logged. And now all of a sudden we see output in the log for prctl and seccomp, because they were actually caught by the default log action of the first filter. And fstat, like I said, is no longer called. In the third loop, again, I see prctl and seccomp. That third loop tries to add an allow rule for write, but it's ineffective because write has already been restricted to log and I can't relax that restriction. And then finally exit_group gets called when the program exits. So it's a very simple program, and yet there's a fair amount of stuff going on with syscalls. So this is a second example. In this one, I basically have specific allow rules for all of the other syscalls except for write; I'm still doing the allow, log, allow thing with write. So when you run it now, it looks a lot simpler, because all the other syscalls have been specifically allowed. And now you can see that the write syscall does not get logged the first time, but it does get logged the second two times because of that second filter.
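The idea of pulling syscall names out of the audit log can be sketched as follows. This is not the shell script shipped with the extension; it is a stand-in written for illustration. The `SAMPLE` records are made up and abbreviated (they only imitate the shape of auditd SECCOMP entries and follow the order of syscalls described for the first example), and the name table is a deliberately partial x86_64 mapping.

```python
import re

# Partial x86_64 syscall table covering only the numbers mentioned in the talk.
X86_64_NAMES = {
    1: "write",
    5: "fstat",
    157: "prctl",
    231: "exit_group",
    317: "seccomp",
}

# Matches the syscall number in a SECCOMP-type audit record.
SECCOMP_RECORD = re.compile(r"^type=SECCOMP\s.*\bsyscall=(\d+)\b")


def logged_syscalls(lines):
    """Return the syscall names seen in SECCOMP records, in first-seen order."""
    seen = []
    for line in lines:
        match = SECCOMP_RECORD.match(line)
        if match is None:
            continue  # not a seccomp record
        number = int(match.group(1))
        name = X86_64_NAMES.get(number, f"unknown({number})")
        if name not in seen:
            seen.append(name)
    return seen


# Made-up, abbreviated records shaped like auditd SECCOMP entries.
SAMPLE = """\
type=SECCOMP msg=audit(0.000:1): pid=100 comm="demo" arch=c000003e syscall=5 compat=0 code=0x7ffc0000
type=SECCOMP msg=audit(0.000:2): pid=100 comm="demo" arch=c000003e syscall=157 compat=0 code=0x7ffc0000
type=SECCOMP msg=audit(0.000:3): pid=100 comm="demo" arch=c000003e syscall=317 compat=0 code=0x7ffc0000
type=SECCOMP msg=audit(0.000:4): pid=100 comm="demo" arch=c000003e syscall=1 compat=0 code=0x7ffc0000
type=SECCOMP msg=audit(0.000:5): pid=100 comm="demo" arch=c000003e syscall=157 compat=0 code=0x7ffc0000
type=SECCOMP msg=audit(0.000:6): pid=100 comm="demo" arch=c000003e syscall=317 compat=0 code=0x7ffc0000
type=SECCOMP msg=audit(0.000:7): pid=100 comm="demo" arch=c000003e syscall=1 compat=0 code=0x7ffc0000
type=SECCOMP msg=audit(0.000:8): pid=100 comm="demo" arch=c000003e syscall=231 compat=0 code=0x7ffc0000
type=SYSCALL msg=audit(0.000:9): arch=c000003e syscall=59 success=yes
"""

print(",".join(logged_syscalls(SAMPLE.splitlines())))
# fstat,prctl,seccomp,write,exit_group
```

The comma-separated output is the kind of form that can be pasted into an allow list; a real tool would need a complete, architecture-specific syscall table rather than the five entries used here.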
So that makes sense. Like I said, as you go through this, you really have to wrap your mind around what's going on. Otherwise, you sit there and you look at it and say, what, you know, weird stuff's going on. So as I said, printf uses write, and it gets logged twice. Okay, so now we're going to switch gears into systemd support for seccomp. Any questions about what I've talked about so far? Go ahead. Yeah, I'm going to get to that; I'll talk about that more. So the question was, how do we map glibc calls to kernel syscalls? I talk about that more as I'm going through this, but in short, it's not an easy thing to do. And everyone I've asked about it admits it's not an easy thing to do. There are ways to do it with things like, I guess, ptrace, and there are people who have written BPF filters that do that kind of thing. But what I found to be by far the easiest thing to do, which is really the only one I did do, was I wrote my own library for Postgres, right? And one of the reasons I only support the latest versions of libseccomp is two things. It supports this log action, which the older kernels do not and systemd does not, which allows everything to happen just like it would have happened, but everything goes to the log. That makes it really easy to run your software through its paces. Now, as people in the Postgres community pointed out, that's not perfect. But what I will argue, and I guess I'll argue it now instead of later, is this: run a production system sufficiently through all of its paces, with the regression tests for your application, maybe all the regression tests for Postgres, and maybe you make your default rule log for even six months, right? Let it run in production for six months and watch the audit log. You're going to catch 99.9% of all syscalls that Postgres is going to make in your audit log.
And after you've done that initially, and I'll show you that process at the end, the vast majority of the ones that pop up in syslog are going to be anomalies. You're going to go investigate and see, well, why did I get that? And then if you determine, oh, yeah, okay, in this very rare case Postgres will use that syscall, fine: I add it to my allow list and I restart Postgres. But if there's no explanation for it, maybe the explanation is that someone's trying to compromise your system, right? So I think in practice it's workable. And that log action really makes it a whole lot simpler to figure out which syscalls you need. So a little bit of a long-winded answer, but hopefully that's good. So systemd supports seccomp filtering via some options. There's an advantage to that, in that the control over the use of seccomp in this case is now in the hands of your sysadmin, not your database admin. Your sysadmin may see that as an advantage; your database admin may not. And it was also brought up on the Postgres mailing list that, since it's kind of an external control from Postgres, it might be more difficult for someone who's trying to hack Postgres to subvert. I think that's maybe a fair comment. It does require extra coordination. I did find it required extra syscalls to be allowed. And it gives you less flexibility; as I said, systemd doesn't have quite the flexibility that I've got built into the extension that I specifically wrote for Postgres. Although it may well be that more recent versions of systemd will have that. But I've tried on Buster and I've tried on RHEL 8, and so far, at least, the versions of systemd that I've found are not as flexible as I'd like. They don't have that log option specifically.
The other thing that they don't have is, and I think I mentioned this on another slide, when you set the error action for a syscall, by default, when the error is thrown, it does not get audit logged, which means nothing shows up in auditd. You just get an error in your application. I find that very surprising and disconcerting also. So with libseccomp, you can specifically flip a switch that says, even if I've got an error action, I want it audit logged anyway. And that to me makes it a lot easier to figure out what's going on. So the first parameter that you use in systemd is SystemCallFilter. It's basically how you set up a whitelist. The default action will be kill, and it's kill with a SIGSYS signal. Seccomp does that. It's a non-blockable signal, so the process is going to get killed no matter what. You can override that with SystemCallErrorNumber, which will basically let you supply an error number that will be used for the error action instead of killing the process. And you can specify it more than once in your systemd control file. You can also set up a blacklist: if you use a tilde in front of the list, it basically inverts it. If you suffix your element with a colon and a number, or a colon and the name of an error, then it will use that error for that syscall. But whitelisting is actually what's recommended. One of the things in systemd that at first looked kind of neat, but later on I decided was not as useful as it seems, is that it has these predefined sets of syscalls. So you can say, I want to allow all of the system-service syscalls, or I want to allow all the file-system syscalls. On the surface that sounds like it would be convenient. The problem is that the list of syscalls in those sets could vary from systemd version to systemd version, and could vary from kernel to kernel. There's a lot of variability, and without inspecting it carefully, you don't know what you're getting.
And if you combine that with the difficulty in figuring out which syscalls you need to allow, I just ended up not wanting to go there. There is a systemd-analyze syscall-filter command that will let you enumerate the actual syscalls that are in the filter sets. SystemCallErrorNumber is what allows the override of the default action; it gives you the error action instead of the kill action. But as I said earlier, it's not logged by default with systemd, and I couldn't figure out any way to turn on the logging with systemd. What's that? It's not implemented. So, a comment from the audience: it's not implemented in systemd. There's another parameter called SystemCallArchitectures. That's one of the things that, again, makes seccomp difficult: system calls are architecture specific. And there are some nuances, like if you're on a 64-bit system, you might actually have the 32-bit syscalls available as well, and that can actually be a path to exploits. So what you really want to do is restrict the syscalls to just the ones that are native for your architecture, and that's the keyword native. There is a parameter to set NoNewPrivileges. The documentation claims, and maybe I just misunderstood the documentation, that whatever this value is, it will be overridden if you use SystemCallFilter. But with a little bit of experimentation, I didn't feel like that was true. So maybe I'm just misunderstanding the documentation, but in any case I'll show you what I mean. So with Postgres specifically, trying to derive the whitelist was too painful. Although I'm doing the slides in the order of showing you the systemd implementation before pgseccomp, I actually did pgseccomp first. I got a list, and then I used that in systemd, and that worked okay. Like I said earlier, there were actually extra syscalls that I needed. So I started out with that list, and then I tried to start Postgres.
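The four systemd directives mentioned so far can be collected into a drop-in. The sketch below just renders the text of such a fragment from a syscall list. The directive names are the systemd ones named in the talk; the `render_dropin` helper is a hypothetical name, and the four-entry syscall list is a placeholder, not the list Postgres actually needs.

```python
def render_dropin(allowed, errno_name=None, no_new_privileges=True):
    """Render a [Service] fragment that whitelists the given syscalls."""
    lines = ["[Service]"]
    # Whitelist: anything not named here gets the default (kill) action.
    lines.append("SystemCallFilter=" + " ".join(sorted(set(allowed))))
    if errno_name:
        # Return this errno instead of killing the process.
        lines.append(f"SystemCallErrorNumber={errno_name}")
    # Only permit syscalls of the machine's native architecture.
    lines.append("SystemCallArchitectures=native")
    lines.append("NoNewPrivileges=" + ("yes" if no_new_privileges else "no"))
    return "\n".join(lines) + "\n"


# Placeholder list for illustration only.
example = render_dropin(["read", "write", "exit_group", "futex"], errno_name="EPERM")
print(example)
```

Passing `errno_name=None` leaves the default kill behaviour in place, which matches the configuration the talk goes on to use.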
And Postgres gets killed. I go look and see what the syscall is, add it to the list, and rinse and repeat five or six times until I've caught all the extra syscalls that were needed when starting Postgres using systemd. And then finally Postgres would run. But I did determine that NoNewPrivileges needed to be set. So this is kind of an interesting way of looking at it. If you look in /proc/PID/status and you grep for any of these three terms, you'll see that before you load seccomp, NoNewPrivs is not set, Seccomp is not set, and you're "thread vulnerable" for Speculation Store Bypass. And what I'm showing here is basically all of the Postgres processes. So now I edit postgresql.service. I didn't want to list all of them here, but basically there are 94 syscalls that I determined were needed for Postgres, and I add them to the SystemCallFilter list. I left the action as kill; I mean, I didn't make it an error action. You can see I set the architecture to native. And for the first time out, I left NoNewPrivileges=yes commented out. So now restart the Postgres daemon and look at the audit log. Now when I go look, you can see NoNewPrivs is still set to zero. So it was not set by systemd for you. That's the part I didn't quite get from the documentation. Seccomp is set, and it's in mode 2. Yes. Okay. So the documentation was not really clear on that point, at least to me. Mode 2 for seccomp is basically the seccomp filter, which is the BPF filter. If that said 1, it would be the original seccomp strict mode, which I don't think anyone ever uses anymore. And then you can see that, as kind of a side benefit here, I get "thread force mitigated" for Speculation Store Bypass. Just a note on that: about a week or two ago, I tried this. This is all done on a latest Linux Mint machine, which I guess is based on Ubuntu 18.04.
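The status check described above can also be done programmatically. A minimal sketch, assuming a Linux kernel recent enough to expose the `NoNewPrivs`, `Seccomp` and `Speculation_Store_Bypass` fields in the process status file: the helper names are made up for illustration, and the `SAMPLE` text is illustrative, mirroring the values reported in the talk rather than captured output.

```python
FIELDS = ("NoNewPrivs", "Seccomp", "Speculation_Store_Bypass")
SECCOMP_MODES = {"0": "disabled", "1": "strict", "2": "filter"}


def seccomp_status(status_text):
    """Extract the three fields of interest from /proc/PID/status text."""
    found = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in FIELDS:
            found[key] = value.strip()
    return found


def read_status(pid="self"):
    """Read and parse the status file of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return seccomp_status(fh.read())


# Illustrative text only, mirroring the values described in the talk.
SAMPLE = (
    "Name:\tpostgres\n"
    "NoNewPrivs:\t0\n"
    "Seccomp:\t2\n"
    "Speculation_Store_Bypass:\tthread force mitigated\n"
)

status = seccomp_status(SAMPLE)
print(status["NoNewPrivs"], SECCOMP_MODES[status["Seccomp"]])  # 0 filter
print(status["Speculation_Store_Bypass"])  # thread force mitigated
```

On a live system, `read_status(pid)` could be called for each Postgres process ID to reproduce the before and after comparison.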
And that's what I see there. On a latest Fedora machine, which I tried just a week or two ago, I did not get Speculation Store Bypass mitigated, and I haven't really investigated why that is yet. So now if we uncomment NoNewPrivileges and run all this again, we'll see that we do get NoNewPrivs set. Okay, so that's it for the systemd implementation. Any questions about that before I move on? Sure. pgseccomp? No. So the question is, will this be in the official packaging, and specifically the systemd filtering? I've not talked to anyone about that. No one's asked me about that. The packaging for Postgres, and therefore the service file, I guess, is done by one of the Postgres community members, so conceivably that could be done. Yeah, I could talk to Christoph and see if he's willing to do that. Good point. So as I said, the repository for pgseccomp (sorry, this is my refrigerator temperatures) is under my company name, Crunchy Data, then pgseccomp. It's basically the Postgres license. It was just opened up a week ago. I've not done an actual official release. And just to forestall the probably inevitable question, I don't know whether this will get packaged by Debian or not. But again, I could talk to Christoph about it, and I could talk to Devrim about maybe the RPMs. It's literally just been opened up within the last week. I wrote this over the last several months because, for the people that weren't here earlier, I tried to get this into core Postgres and the community was not willing to take it. So I implemented it as an extension. So this is seccomp filtering through Postgres config options. I think there are a lot of advantages to this. Not everyone agrees. But it gives the Postgres admin control over the seccomp filtering. And by the way, there's no reason this can't be used in conjunction with systemd.
So as I pointed out earlier, you can add filters down the line as long as you're making things more restrictive. And in fact, when Postgres runs in a container on Kubernetes, which my company does a lot of with our customers, it's already running under seccomp, because containers do have, I think, a blacklist filter, not a whitelist. There are about 350 syscalls and they blacklist about 50 of them, so they allow 300 or so. I've found that Postgres needs about 100. So that leaves a lot of room for improvement as far as security posture goes. In any case, the extension provides more flexibility. It does have the log action. It does ensure that if you use the error action, it gets logged. It allows you to have different settings at the postmaster level, which is the parent Postgres process, and at the session level. It also allows you to have different filters at the session level based on the user that's logged in to Postgres. And then finally, there's a client command, which I'll show you, which allows even the client application to set its own filter. So now you can imagine the system administrator enforces some basic level of seccomp filtering with systemd. The database administrator enforces some seccomp filtering at the postmaster and session level. And then the person writing the application can lock it down even further, the first thing they do out of the gate, and then it can never be relaxed for the rest of the session. So if you're really talking about layering security, now if someone figures out how to do SQL injection into your app, you've blocked stuff as far as you can block it. And it was pointed out that it might be less resilient than the systemd method, but you can use the two in conjunction. And of course, the other thing is that in order to change the values at the postmaster level for this extension, you have to restart Postgres.
And again, for the benefit of the people that weren't here at the beginning: good security practice is that if Postgres restarts, you'd better get an alert. And if you didn't plan that restart, you'd better figure out why it restarted. Okay, so it's implemented as a Postgres extension. It's loaded via shared_preload_libraries, which basically means it's loaded immediately. Postgres won't start if it doesn't load, which is maybe a good point: if you go edit your postgresql.conf to do this, and you go to start Postgres and it doesn't start, it might be because you made a syntax error. The Postgres error log should tell you that. There's a global config setting, which is at the postmaster level and requires a restart. The client settings are applied through something called the client authentication hook, which runs immediately after the client authenticates, but before the client gets access to the session. So it's before any user input is taken. The client filters require a reload, not a restart, which is kind of important if you want to be able to modify the client settings as the administrator without having to restart your whole database. Now again, this may be something that someone complains about and says it would be more secure to require a restart. I found that kind of hard to work with. I think this is good enough. This is still, you know, version one of this, so maybe I could be convinced otherwise later on, but for now that's the way it is. And it provides this seccomp filter table function, which will actually show you the merged filter, what it looks like in your backend, to the best of its ability. The reason I say that is because, in addition to the fact that there's no easy way to know which syscalls are used by which glibc calls, there's also no way to read from the kernel and find out what the loaded filter looks like.
So if a filter was loaded by systemd, this extension has no idea about that, or by Kubernetes, for that matter. I've talked to the kernel maintainers about that and they said, yeah, that would be pretty cool; I'm not sure how hard it would be to implement, or if it's even possible. In any case, in order to enable this, you have to have shared_preload_libraries set to at least pgseccomp. You might have other stuff in there, maybe pgaudit, if you care about security. That's a whole other talk. pgseccomp.enabled is your overall on-off switch for this feature. For the global configuration, which is at the postmaster level (we'll see an example of this later), there are basically four lists, ending in _allow, _log, _error, and _kill. Those are each specific lists, and then there's a default action. So you could set the default action to log, and you could have a list of items that you want to allow, and you could have a specific list of syscalls that you never want to allow, which are set to error or kill. And then similarly, at the session level, you've got the equivalents of the global settings, and then this session roles business. This is a little sidebar from when I went to implement this. Postgres does allow you to specify settings that are bound to a specific user. However, those settings are read in later than the client authentication hook that I'm using, and so there was no way to get to those from this extension.
Now, maybe in some future version of Postgres I can convince the community to add a hook in the right location, but right now there isn't one in a better location. So the way I did this was, basically, you provide a list of specific roles that are going to have specific seccomp filters, and then you have each of these entries again, except with a dot and the role name on the end. That gives you a list that's specific to that logged-in user, and it will get used instead of the default session one when that user logs in. And then finally, as I talked about earlier, an application can call this SQL statement and create a filter on the fly, dynamically. It can only be called once per session. If you try to call it a second time, it'll throw an error, just because I think a second call would be confusing, since it wouldn't do anything. So now, this is my 10-step process for deriving your list of syscalls, at least for Postgres. In your postgresql.conf, you're going to enable seccomp, you're going to allow everything at the global level, and you're going to log everything at the session level. Just a note here, and it says it on the slide: an asterisk at the session level basically means just use whatever's in the global list. I just found that to be a useful notation. When you're doing this, you don't want to do it on your production machine, please. You're going to want to modify auditd. I found that auditd is actually lossy by default, so if you overload auditd, you start losing audit records, which I find a bit strange, actually. So you want to make it lossless, and you also don't want the files rotating out from under you, because when you first run this, you will get very large growth of your auditd log very quickly.
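The lookup rules for session-level lists described above (a role-specific list replaces the default session list, and an asterisk means "use the global list") can be modelled in a few lines. This is only a model of the behaviour as described in the talk, not the extension's code. The dictionary keys and the `session_list` helper are hypothetical names, not the extension's parameter names, and applying the asterisk rule to role-specific lists as well is an assumption.

```python
def session_list(cfg, action, role):
    """Pick the session-level syscall list for `action` that applies to `role`."""
    per_role = cfg.get("roles", {}).get(role, {})
    # A role-specific list, when present, replaces the default session list.
    chosen = per_role.get(action, cfg["session"][action])
    if chosen == "*":
        # Asterisk: fall back to whatever the global list contains.
        chosen = cfg["global"][action]
    return sorted(chosen)


cfg = {
    "global": {"allow": {"read", "write", "clone"}, "error": set()},
    "session": {"allow": "*", "error": {"clone"}},
    "roles": {"joe": {"error": set()}},  # joe gets no session-level error list
}

print(session_list(cfg, "allow", "alice"))  # ['clone', 'read', 'write']
print(session_list(cfg, "error", "alice"))  # ['clone']
print(session_list(cfg, "error", "joe"))    # []
```

In this made-up configuration every role except `joe` has `clone` in its session-level error list, which is the shape of per-user exception the role-specific entries are meant to allow.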
So you clear out the auditd log and restart the service, and restart Postgres after making all these changes. Now, and this is what I alluded to earlier, you want to exercise Postgres through as many paces as you can. So you want to run all your application regression tests; you basically want to try and get your application to use the database to do everything it might possibly do. You might also want to run the Postgres regression tests. I've got the formulas here. Interestingly, this is how you use make check-world and specify some extra stuff to go into the configuration, which you would need in order to test this. When you run these, you will add a lot of seccomp entries to the auditd log. So at this point, stop auditd and run the script provided with the extension that I talked about earlier, and it will give you a list that you can cut and paste into the session syscall allow list. Excuse me. So now you're basically going to repeat this at the global level and paste that into the global syscall allow list. And I've found over time that, as we talked about earlier, it's not 100% deterministic figuring out which syscalls are made, because some syscalls are only made when certain things happen. Maybe something gets flushed, or, you know, I can't even explain all the reasons why it might happen. But I did find that if I re-ran this two or three times, I might catch one or two other syscalls. Now at this point, optionally, you might decide to change your defaults to error or kill. As I said earlier, I think what you would probably want to do is leave it set to log for some period of time, implement all this in production, and then monitor your auditd log and see if anything else pops up. It should be fairly lightweight at this point. It shouldn't happen very often, but when it does, go investigate and see if you can figure out, using the Postgres source or whatever, why you got this syscall.
And if you can convince yourself that it was legitimately Postgres, add it to the allow list. If it wasn't legitimately Postgres, you'd better figure out why that happened. So on an ongoing basis you would want to monitor Postgres and react as required.
Question? So the question is: since the process is not killed, is there a way to get a stack trace or something to aid you in figuring out where the syscall originated? There's a patch that was being talked about, and I can't remember if it's actually been committed for Postgres 13, that would allow you to configure getting stack traces, I believe. But I'm not sure, and in any case, if it's in there, that's going to be Postgres 13, which is not out yet. So in my imagination, what I would do is start Googling and start grepping through the Postgres source. I don't honestly have a great answer for that. I don't have a lot of experience using this yet. Like I said, this project is brand new. The requirement for this is actually fairly new. I was driven to it because we have large organizations we work with that are starting to make this a requirement. So there's still some room to learn, but if you figure out something that's better I would love to hear about it.
Okay, so now I just want to go through a couple of examples. I'm kind of running out of time, so I'm going to try and do this quickly. In this example I'm going to block the readlink syscall. So initially I create a tablespace, and you can see here, this is the use of that seccomp filter. This output shows me that the readlink syscall, which is number 89, is set to allow in the session context. If I now use readlink with this call, my call works. But if I move readlink from allow to error and restart, or at least reload, you can see that now it's set to error, and when I rerun this I get a permission denied.
Here's an example showing that I cannot reduce the restriction on nanosleep. If I add nanosleep to the session allow list but it's already in the global list for log, you can see that it's still going to get logged.
In this example I'm going to block clone. This is kind of an interesting one. How many people are familiar with PL/PerlU? PL/PerlU is basically the untrusted version of PL/Perl. You can do things like shell out and run stuff, which can be really useful but is also potentially dangerous, right? So in this example I create a PL/PerlU function that's going to let me cat a file and send it to the client as output. And you can see that works, right? Now if I add clone to the session error list, I'm going to get an error. So now I'm going to create a special entry for the user Joe that does not have clone as an error. And when I log in as Joe, you can see I can use that function again. So basically I can use this to say Joe is allowed to use PL/PerlU to shell out and do stuff, whereas everyone else is not.
Five minutes.
Okay, and then finally I'm going to show how that set client filter function could be used. Even though I'm logged in as Joe, I can create a client filter right here with this call, which will again deny use of the clone syscall. And so that is it. I've got, I guess, four minutes left for questions.
Thank you for this excellent talk. We have some time for questions, so does anybody have a question?
You said that it's possible to reduce privileges at a later point. I'm an OpenBSD developer. We have pledge, and what we found was that with quite a lot of applications it's possible to hoist some of the code into the initialization phase and then drop additional privileges. Did you have a similar experience with Postgres? Did you find certain components where it was like, okay, this is where I drop some privileges, but I need those privileges here, but actually it can be done in initialization? Did you find some of those cases?
So do I need to repeat the question, since you used the mic? Yeah. I do. So let me see if I can summarize that a little bit. You found that with certain applications, you need more syscalls available when you initialize, and later on you can restrict and get away with a reduced surface area, right? And yes, absolutely, that's true. Actually, in the filters that I've developed for Postgres using pgseccomp, more syscalls are required at the global level than at the session level, so the session level, in the filters that I've developed using that 10-step process, is already more restrictive for the client than the global level is.
One thing that's related to that, which you didn't really ask about but I'll talk about a little bit, is that Postgres has, I forget exactly, something like seven or eight processes that spin up when it starts. There's the postmaster, but then there are some auxiliary processes. One of the things we've looked into is that some of those auxiliary processes in particular probably need considerably fewer syscalls. Unfortunately, so far there's not a good way to hook into specific processes to reduce their surface area. One of the things that I hope we will get into a future version of Postgres is some additional hooks so that this could be extended. For instance, the autovacuum daemon that runs as part of Postgres probably needs far fewer syscalls than a normal session does, but that's not something that's possible to do yet. But as I said, you can restrict at the session level, and you can even restrict Postgres relative to systemd, because, as I talked about, systemd actually needed more syscalls than just starting up Postgres does. So if you use this with systemd, you'd have a bigger list with systemd, a slightly smaller list at the global level for Postgres, and a slightly smaller list again at the session level.
Any other question?
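The layering in this answer, and the earlier nanosleep and clone examples, follow from how stacked seccomp filters combine: the kernel applies the most restrictive verdict returned by any installed filter, so a later layer can tighten a rule but never loosen it. The sketch below is a toy Python model of that rule using the four action names from the talk. It is not pgseccomp's code; the layer dictionaries, the way Joe's per-role entry is modeled, and the function name are all hypothetical.

```python
# The four actions named in the talk, ordered from least to most restrictive.
ACTIONS = ("allow", "log", "error", "kill")


def effective_action(syscall, layers):
    """Return the most restrictive action that any layer assigns to a syscall."""
    verdicts = [layer["rules"].get(syscall, layer["default"]) for layer in layers]
    return max(verdicts, key=ACTIONS.index)


# Hypothetical filter layers mirroring the examples from the talk.
GLOBAL = {"default": "log", "rules": {"clone": "allow", "nanosleep": "log"}}
SESSION = {"default": "log", "rules": {"clone": "error", "nanosleep": "allow"}}
SESSION_JOE = {"default": "log", "rules": {"clone": "allow", "nanosleep": "allow"}}

if __name__ == "__main__":
    # The session layer cannot relax what the global layer already logs.
    print("nanosleep:", effective_action("nanosleep", [GLOBAL, SESSION]))
    # Ordinary sessions have clone tightened from allow to error.
    print("clone:", effective_action("clone", [GLOBAL, SESSION]))
    # Joe's entry leaves clone alone, so the global allow still applies.
    print("clone (joe):", effective_action("clone", [GLOBAL, SESSION_JOE]))
```

A client-level filter would simply be one more entry at the end of the list of layers, which is why it can deny clone again even for Joe.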
Does it have an impact in terms of latency? Latency treatment? I'm sorry, can you repeat that? Does it have an impact in terms of latency treatment? An impact on latency? Yeah. So does it have an impact on performance in general? Right. I've not done any performance measurements at this point. I didn't see anything noticeable. Actually, I mean, you'll see something noticeable when you're going through that process and everything's getting written to the auditd log; if it's all on the same machine, that'll have a noticeable impact. But once you clean up your filter so that not every query is getting logged, the performance goes back to more or less normal. And yes, I should at some point do some more rigorous performance testing to see what the impact is, but the reality is that places that care about this kind of stuff are probably willing to make that trade-off.
I don't know if you have time to go into this, and seccomp I don't know that thoroughly, but the way Postgres implements certain functions, it's kind of separated into specific processes. So is it possible to have different seccomp rules for different system processes within Postgres, for example replication, like a different kind of profile for different processes?
So the question is, is it possible to have different profiles for different processes? Parts of the Postgres processes. Yeah, and that's actually what I was just talking about a minute ago. There are these extra daemons that get launched, which are auxiliary processes. Right now they just get what the postmaster gets, because there's really no convenient way to hook into them, but I hope that in the future I'll be able to improve that.
I think that's probably it, right? We're done? Yeah, we're done. Thank you again for this excellent talk, and please give him a round of applause.