Yes, to everybody I know, hello again, and to everybody I haven't met, let's talk sometime. My name's Steve Grubb. I'm a security architect at Red Hat, and I work on Red Hat Enterprise Linux, primarily looking at security for certifications such as Common Criteria, FIPS, SCAP, and other things we do. So this talk is going to be about application whitelisting, and we're going to dig into some interesting topics along the way.
So this is the outline of what we're going to be talking about: what application whitelisting is, how it compares to other solutions, and how code executes. We're going to talk about the design of a solution, talk about sources of trust, and how this might fit into an overall system design.
So a couple years ago, NIST released a special publication, 800-167, and it's useful because it defines a few terms and talks about different things that application whitelisting should take into account. The way they define a whitelist is that it's the list of applications, libraries, or files that are authorized to be present or active based on a well-defined baseline. A blacklist is a list of discrete entities previously determined to be associated with malicious activity. So permitted activity corresponds to a whitelist, and not-permitted activity corresponds to a blacklist. Also, Common Criteria has an optional requirement, the software restriction policies, where you may elect one of several ways to restrict the execution of software. Some people use digital signatures. You can also use hashes, or you can use file paths.
Antivirus is a blacklisting approach. It defines the malware, but the problem with it is that there's much more out there that we don't know about. Mandatory access control usually restricts based on behavior and on subject-object rules around information flow and access. Provenance of software is really not taken into account.
Application whitelisting is a different approach, where you tell it, these are the things we know about, which is simpler to say because usually you know what's installed on a system. So if you take a look at the Lockheed Martin kill chain or the MITRE ATT&CK framework, which describe the way that intrusions happen and what they try to do, the place where application whitelisting sits is right between compromise and execution. And that's the area that we're able to target. Coincidentally, should they get past execution, then we'd be targeting everything on the right-hand side also.
OK, so one of the things we need to think about is how programs execute. Usually they start with an execve, which calls the kernel, and it opens a file, loads it, and passes it to the runtime linker. The runtime linker resolves the libraries, and then it jumps to main and takes off. But there's also another trick, and that is you can execute ELF files directly from the runtime linker. So if you have something that's mounted with noexec, this is a trick that people have used in the past, where you can just download it as a read-only file and then use the runtime linker to execute it. Because the runtime linker just reads the file, it doesn't try to execute it at that point, but later it does. And it doesn't do it through an execve.
Another way that you can force execution is by using LD_PRELOAD, and there are several viruses that take advantage of this. Once it gets loaded, it gets passed on to every child process. And if it gets into the user session, then it can do things against your account. There's also another variable called LD_AUDIT, which is a cleaner version of LD_PRELOAD, but the effect is the same: you intercept the runtime linker. I mean, with LD_AUDIT you can do a lot more damage, because it allows you to intercept all of the runtime linking and setup, and you can redefine the functions. So it's a formidable tool to know about.
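To make the LD_PRELOAD and LD_AUDIT point concrete, here is a minimal Python sketch that flags those two variables in a process environment block. It is only an illustration and not how the tool in this talk detects them; the function names and the SUSPICIOUS_VARS list are assumptions made for this example. Note that /proc/<pid>/environ shows the environment the process was started with.

```python
from typing import Dict

# Dynamic-linker variables named in the talk (list chosen for illustration).
SUSPICIOUS_VARS = ("LD_PRELOAD", "LD_AUDIT")


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse a NUL-separated environment block, as found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def linker_overrides(raw: bytes) -> Dict[str, str]:
    """Return only the non-empty runtime-linker override variables."""
    env = parse_environ(raw)
    return {name: env[name] for name in SUSPICIOUS_VARS if env.get(name)}


def inspect_pid(pid: int) -> Dict[str, str]:
    """Read the start-up environment of a process (needs permission to read it)."""
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return linker_overrides(handle.read())
```

Calling inspect_pid on a process ID you own returns an empty dictionary when neither variable was set at start-up.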
Another way that things can execute is that somebody can go change the ELF interpreter. Embedded inside these files is the preferred interpreter, and usually it points to ld-linux.so. But it doesn't have to. It's something you can define at compile time. So what you can do is have a legitimate program, but the ELF interpreter's changed. And instead of doing runtime linking, a malicious interpreter just takes off and does malicious things. So it looks like you're executing one thing, but you're really executing something else.
Another way that programs execute is through language interpreters, such as Python, Perl, awk, and other things like that. So those are the traditional ways. But there are also much more malicious ways of executing code, and that's mobile code. When we talk about executing through mobile code, we're talking about piping things into standard input to be executed. You can also pass programs as command line arguments. I think you have up to 4,096 characters on the command line, which could be a small program that bootstraps itself into other things. Then there's remote fetching, like what Python does. You can override and redefine the module import so that it pulls Python code across the network. And another thing about Python is that it can make arbitrary system calls. So what you can do is open a file with memfd_create, download something into that, and then execute it. And it never touches a disk. It's entirely in memory. Another thing is you can just paste the program straight into the shell.
And here's an example of some of this mobile code. This is a function that's written in bash. It's functionally equivalent to the wget command, but it's entirely in bash, because somewhere along the line somebody decided it was a good idea to have bash do TCP/IP. It can also do UDP. If you are an attacker and you get to a shell, even if you're in a container and it's mounted read-only, they're still going to get you.
And the reason why is because of this. If they have access to bash, they can pull down anything. As a matter of fact, I can show you something here. This is a white hat talk, but we can do black hat for a little while.
So here we've got a bash shell, the one you see here. Here's the function that you can just paste right into the shell. So if you're a bad guy, you can paste this right in. You get a shell, a reverse shell, and now you've got the function. OK, so there's someone there. Let's just source it to make sure. Now, I've already got something set up here. So entirely using bash, right there, it can pull down a Python script. And if you go and just pipe that right into Python, and let's pull up another terminal, you can see right there on port 8080 we have a server. So you just never touch a disk. This came off of the internet.
There are other things that you can do. For example, here's another one with wget. OK, so this just pulled another program off of the internet. And this is the bad one. This is a program called SnakeEater, and you can find it out on GitHub. What it does is create a URL to a shared object. The next thing it does is open a file descriptor right here, and it does that by making an arbitrary system call, memfd_create. Then it reads the shared object into it. And then down here, it sets a path to this FD, and then it executes it. So here's a quick demonstration, again, of a program in Python that pulls down a shared object and executes it. So you can run arbitrary ELF code using something like SnakeEater, and it never touches disk.
OK, so let's go back to white hat. This is just some of the objectives that an attacker may try to achieve. But we can narrow that down a little bit for this talk and say that, without privileges, what you can probably do is download malicious tools or escalation tools.
You can change search paths for an account so that it's trying to resolve things out of an attacker-controlled directory. And you can ransomware the account. With privileges, you can modify and replace applications or libraries. You can install new applications, backdoors, rootkits, ransomware, crypto miners, everything. Or you can inject malware into a running process using ptrace.
OK, so with all this, we're going to start thinking about a solution to try and catch these things. There is an API that the kernel has called fanotify, the file access notification system, and it's been available since the 2.6.37 kernel. It allows recursive monitoring within a mount point. It allows the user space application to say yes or no on an access or execution. And the kernel hands an open descriptor of the file to the monitor program so it can read it. This was originally designed for antivirus. It has a couple of drawbacks: you don't get any notifications on deletes or renames or file moves.
This is just to show you what you get. When you do the fanotify_init and then you set a mark, eventually you get a callback, and the data that comes to you is in this structure right here. The two things that we're interested in are that you get an FD to the file, and it's open, so that you can inspect the file. But you also get the PID of the program that's trying to open the file.
So from the FD, we can gather some information about the file. We can take a look at what the file's full path is by using readlink against /proc/self/fd. We can also take a look at what the type of the file is by passing the descriptor to libmagic. So from this API, you can figure out that this is a Perl program, this is Python, this is Ruby, this is PHP, this is ELF, or it's just a text file. We can also figure out what device it's on through the udev library. We can also figure out a trust status by looking it up.
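The object-side lookups just described can be sketched in a few lines of Python. This is a hedged illustration only: path_from_fd uses the readlink on /proc/self/fd mentioned above, while sniff_type is a deliberately crude stand-in for libmagic that only recognizes the ELF magic number and a "#!" script header. Both function names are invented for this example, and it assumes a Linux system with /proc mounted.

```python
import os


def path_from_fd(fd: int) -> str:
    """Resolve the full path behind an open descriptor via /proc/self/fd."""
    return os.readlink(f"/proc/self/fd/{fd}")


def sniff_type(fd: int) -> str:
    """Very rough file typing from the first bytes (a stand-in for libmagic)."""
    head = os.pread(fd, 4, 0)  # read without moving the file offset
    if head == b"\x7fELF":
        return "elf"
    if head[:2] == b"#!":
        return "script"
    return "other"
```

Because the kernel hands the monitor an already-open descriptor, both lookups work on the descriptor alone and never need to reopen the file by name.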
Once we've got the path, we can look it up in a database and see if this is something that we know about. We can also calculate a SHA-256 hash of it using libgcrypt.
We can also get some other information about the subject. The FD, obviously, is information about the object. But we can get information about the subject by looking at the PID. We can go into the proc file system, and we can pull out what the command name is. We can pull out the executable. We can figure out what type it is by passing that into libmagic. We can also figure out the UID, the login UID, and the session ID from the proc file system.
So with these primitives and attributes, we can start to fashion a policy along these lines, where we have a decision, some statements about the subject, and some statements about the object. For the decision, we can tell it, if these things match, to allow the access. Or we can tell it to do that with auditing, so that we get an audit event saying that this was allowed. We can also deny the access. And we can also create audit events based on the denials if we want them.
The subject attributes are just what we talked about a second ago in the pictures. You can tell it all, a login UID, UID, session ID, PID, comm, and also some patterns. On the object side, we can tell it paths, directories, devices, file types, and hashes. We can have multiple statements, and they're ANDed together.
This is a little bit of information about what is in these different things. The UIDs and session IDs are numbers, and so are process IDs. Comm is a 15-character string. Executables are also strings. The exe dir also has some keywords that we can tell it: executable dirs, which would be stuff like sbin, bin, lib, lib64, and libexec. System dirs would also include /etc and maybe one or two other things. Now, we also can do pattern detection, because the way that programs start up is different depending on what's happening.
I did have a pattern in here for LD_PRELOAD, but then one day I was looking at it on a system that had LD_LIBRARY_PATH set for an NVIDIA graphics card, and that causes the runtime linker to do completely different things. So I decided it was unreliable at the moment. There's another way that we can add this back in, but I pulled that one out for now.
For the object statements, we can just tell it all. We can have a path. We can tell it that it has to be trusted or untrusted, meaning whether it lives in the trust database. We can also tell it a device, like /dev/cdrom. One of the main uses, though, is the file type, and because libmagic has things based around MIME types, we list things in that format.
So this is a sample policy, to show what it looks like. This is a first-match-wins kind of evaluation. At the very top, we tell it that we don't want any execution straight from the runtime linker. So we tell it to deny that with audit and to trigger on the pattern of starting the program from the runtime linker. And that's for all objects. We also don't want to let untrusted executables run. So we tell it we want to deny that with audit, that the execution dirs have to be the exec dirs that I mentioned before, and that it has to be trusted. If it's untrusted, then we're going to deny. And that's against all objects.
We also have a pattern here to allow all ELF applications. This is a pattern where you have to tell it to allow the types that you want, that's the whitelist, and then to deny everything else. We do the same thing with ELF libs: we tell it what we want to allow to execute, and then we tell it to deny everything else. Same thing with Python. We can restrict it to the system directories, so that it has to come from a system directory. But we also want it to be trusted.
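The first-match-wins evaluation described above can be sketched roughly as follows. This is a simplified model, not the tool's actual rule syntax or engine: the Rule class, the attribute names (pattern, exe_trust, ftype, trust), the MIME strings, and the default decision are all assumptions chosen for illustration. An empty condition set stands for "all", and conditions within a rule are ANDed together.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Rule:
    decision: str                                          # "allow" or "deny_audit"
    subject: Dict[str, str] = field(default_factory=dict)  # empty means "all"
    obj: Dict[str, str] = field(default_factory=dict)      # empty means "all"


def matches(conditions: Dict[str, str], attrs: Dict[str, str]) -> bool:
    """Every listed condition must hold; the conditions are ANDed together."""
    return all(attrs.get(key) == value for key, value in conditions.items())


def evaluate(rules: List[Rule], subject: Dict[str, str],
             obj: Dict[str, str], default: str = "deny") -> str:
    """Walk the rules top to bottom and return the first matching decision."""
    for rule in rules:
        if matches(rule.subject, subject) and matches(rule.obj, obj):
            return rule.decision
    return default


# Hypothetical policy mirroring the order described in the talk.
POLICY = [
    Rule("deny_audit", subject={"pattern": "ld_so"}),        # no runs via the linker
    Rule("deny_audit", subject={"exe_trust": "untrusted"}),  # untrusted executables
    Rule("allow", obj={"ftype": "application/x-executable", "trust": "trusted"}),
    Rule("deny_audit", obj={"ftype": "application/x-executable"}),
    Rule("allow", obj={"ftype": "text/x-python", "trust": "trusted"}),
    Rule("deny_audit", obj={"ftype": "text/x-python"}),
]
```

Because the first match wins, rule order carries the meaning: the allow rule for a file type has to come before the catch-all deny for that same type.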
The design goals of this policy were to have no bypass of the security by starting a program from the runtime linker, and that only approved executables that are in the trust database can run. ELF and Python files have to come from the system directories, and this prevents LD_LIBRARY_PATH and PYTHONPATH redirection. Also, the other languages are disallowed by default. So if you have Perl on the system, you would want to go in and adjust the rules. Same if you're using Ruby or PHP or anything like that.
The design of this application looks like this. We get the events in a reader thread. And the reason we do this is because the system will deadlock itself if this application tries to open anything. So what the reader thread does is receive the event, look at the PID, and see if it's the PID of the monitor program, fapolicyd. If it is, we go ahead and approve the access, because why wouldn't we want to approve the accesses that we need? If it's not that PID, the event goes into an event queue, and a decision thread gets it. This way, the reader thread can continue getting events and putting them into the event queue.
The decision thread then needs to figure out, what are we looking at? What's the subject? What's the object? And the thing is that it takes several system calls to figure out who's what. So what you really want to do is cache this information. The first thing it does is try to figure out, is this already something we know about? Is it in the cache? Because if it is in the cache, then we don't need to open up all these proc things. We can just go ahead and shortcut to evaluating the rules based on the cache. The cache is designed as a least recently used cache so that it's self-cleaning. Things that are recently accessed stay at the top of the list, and things that haven't been accessed for a while eventually get popped off of the cache. The decision thread also has a trust database.
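The least-recently-used cache described above can be sketched with Python's OrderedDict. This is a generic illustration of the technique, not the data structure used in the actual daemon; the class name and the idea of keying entries by a process ID are assumptions for the example.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        """Return the cached value and mark it as most recently used."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        """Insert or refresh an entry, evicting the oldest one if over capacity."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry

    def __len__(self) -> int:
        return len(self._items)
```

A cache hit here would let the decision thread skip the several proc lookups and go straight to rule evaluation, which is the shortcut the talk describes.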
In this particular case, since I'm designing this on a Red Hat system, the trust database comes from RPM. So basically what the policy is saying is that everything that we know about is packaged. The packaging information is trusted, and we're going to use that to make decisions.
Oh, there's one other feature in this, and that's a watchdog timer. Because this is approving or denying access to things, I think it's likely to be a target for attack. So I've tried to design this in a way that, if somebody does try to get execution control, there's a watchdog timeout that both of the threads have to acknowledge periodically, or the watchdog is going to kill the application. The program, since it might be an attack target, doesn't run as root, but it retains capabilities. It also loads a seccomp policy that prevents execve, so that if somebody were lucky enough to exploit this program, the one thing they probably want, which is an execve, is denied by the policy. And then they're also going to have to deal with that watchdog timer.
So I mentioned the sources of trust. We can use a package database such as RPM. Out of that database, there's the path, there are permissions, there's ownership, there's a SHA-256 hash, and all of the entries are signed. So every package in the database can be trusted.
There's another source of information that's new and on the horizon. It's called SWID, which is an acronym for software identification. This is covered by an ISO standard, and NIST has also got an interagency report, 8060, which details their take on the ISO standard, and they put their electives into it. SWID is also being driven into all of the Common Criteria protection profiles. One by one, they're asking for manufacturers to include SWID information for all the software they're shipping. So today it's kind of sparse and you might not find many SWID tags, but over time, it's going to be everywhere.
Now to talk a little bit more about SWID tags, because this is really an up-and-coming standard. There are four kinds of tags. One is called the corpus tag, and the corpus tag is for a body of software, like a CD-ROM. So a CD-ROM would have a SWID tag in a specified directory, and it would give the information about what's on the CD-ROM. There's a primary tag, which allows you to describe the product. There's a patch tag, but this is really aimed more at the Microsoft world, where you install something and then you have Patch Tuesday and you keep updating. And then there are supplemental tags, which allow you to add information like links to web pages for more information.
The SWID tags are XML files. They convey information about the publisher and licensing. And then there's an optional payload section, which details files, sizes, and hashes. This can also be extended with information about permissions and ownership. And then the whole thing is digitally signed using the XAdES specification. On Fedora and in RHEL, you can find SWID tags in /usr/lib/swidtag. And here's an example of what they look like. There's much more to this, because this one doesn't have the optional payload section, which is the part that's probably most interesting.
Okay, so let's take a quick look. I'll do another live demo here. Okay, so it's coming up right now. It's running, it loaded its rules, it changed the UID, initialized the database, did an integrity check of the database, and it found a miscompare. So right now it's rebuilding its database. One thing to mention is that we use LMDB as the database, and the reason why is because it's way faster than the RPM database. So what we do is create our own database out of the RPM database. And that way, we don't have to worry about locking and other applications. Normally it works a lot faster than this, but this is an old laptop.
And I was having trouble getting the PDF file to open, so there's something busy on this machine. It was taking way too long. Yeah, I'm wondering if RPM had something locked.
Okay, now we've got this thing up and running. This is the debug mode, which is not the way you would normally run it, but it's a lot more informational, so you can see what it's doing. What it's telling you is the rule that it triggered on, what the decision was, and then some information to help debug with, which is what executable is involved and what file is trying to be accessed.
Okay, so let's follow this demo script I've got here. Let's go over here to the temp directory. And you can see that when I ran cd, it created a whole lot of accesses. So what we want to do is copy the system ls into the temp directory. And let's also make a symlink. Let's try to run this program. Over here you can see the denials. Okay, so let's try it from a symlink. And you see it was allowed. The reason why is because it resolved it to the system ls program, way back here. So with the symlink, you don't get the symlink's path, you get what the real object was.
Okay, so let's try to run this program from the runtime linker. And so we got a denial. And just in case, well, we've already got a copy of it there. Let's try to run that one, just to show you, because the temp directory is mounted with a permission so that it doesn't allow execution. So you see here it denies it, even from the home directory. And you can see that the denial has scrolled way past.
Okay, so back to the demo. Let's take a look at a program that has the interpreter changed. And just to show you that the interpreter is changed, we'll take a look at it with readelf. That's right. Oh, there it is, yeah. I think it was denied. Yeah, there was a denial. Well, let's try to run it. Actually, I'll show you this file in just a second. Let's run it. So there was a denial. And one last one. Okay, let's go try and run that.
You can see that system Python files are allowed to run. And one of the things that I have it do in debug mode is output some statistics, so that you can see how it's performing. In this case, there were about 1200 accesses allowed and 11 of them denied. The cache size is 1024, and 43 slots were in use. We had a lot of hits. Same thing on the object side. Okay, that's shut down. You can see right here in the ELF that it's not requesting the normal runtime linker but a fake interpreter.
Okay, so I did show you the statistics report, but it also leaves a breadcrumb trail. Over in /var/log, there's an fapolicyd file that's dropped there when the program shuts down, so that, just in case you need it for forensics, there's a breadcrumb trail of everything the system was doing right before it shut down. But that's also configurable, because there may be cases when you don't want that information sitting on the disk drive.
Okay, so looking back at our original code coverage, we can deny everything that's got a line through it. The things that are in red are within reach. Those I can cover with just a little bit more work. And then there's also this fetched-remotely thing. This is being handled in an entirely different way. I've been talking to people in the Python community and explaining the dangers of Python and mobile code and things like that. So there's a PEP 578 that's been approved and is being worked on for Python 3.8, and what it will do is add audit hooks to the Python interpreter. And then you can also have a monitor that looks at these hooks and decides yes or no. This will be inside the same binary. So we will be able to have a policy inside the Python interpreter that says no code from standard in. You're not allowed to override the module resolution. You can't pull things from the net. You can't have programs that come from the command line.
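As a rough illustration of the PEP 578 mechanism, here is a minimal Python 3.8+ sketch using sys.addaudithook. The event names cpython.run_stdin and cpython.run_command are audit events documented by CPython; the BLOCKED_EVENTS set and the hook name are assumptions for this example. A hook added from Python code like this runs too late to stop the interpreter's own start-up, so a real policy would more likely be installed through the C API before initialization. Treat this as a sketch of the idea only.

```python
import sys

# Events to refuse: code arriving on standard input or via "python -c".
BLOCKED_EVENTS = {"cpython.run_stdin", "cpython.run_command"}


def deny_mobile_code(event: str, args: tuple) -> None:
    """Audit hook: raising an exception here aborts the audited operation."""
    if event in BLOCKED_EVENTS:
        raise RuntimeError(f"blocked by audit policy: {event}")


# Hooks cannot be removed once added; they stay for the life of the interpreter.
sys.addaudithook(deny_mobile_code)
```

Events such as import, socket.connect, and urllib.Request are raised through the same hook interface, so a fuller policy could also cover module resolution and network fetches.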
So this should be in 3.8, and it will solve the problem there. But you've got the same problem in other languages, and I don't have leverage or ways to influence the communities for the other interpreters.
Some things I really wish for, that would make life a little easier: one is if we got notification on exit. I suppose there's a way to do it, but it wouldn't be serialized with the main event queue, and that causes some problems. The other thing is, because we have to figure out whether this thing is in the cache, you know, what are we looking at, the very first thing we need to do is get some stat information. So if that was passed with the event, that would save one system call, and we could make a decision a lot faster. And then the other thing is that in the proc file system, everything is in different files. It would be awesome if some of this information was collated. For example, /proc/self/status does not have the login UID. You have to go open /proc/self/loginuid to find that. So it would be good if some of this was consolidated, so there's just one open system call and then a read, and then we can make a decision.
Things that are in the near term are reinstating the LD_AUDIT and LD_PRELOAD coverage, detecting statically linked applications, and detecting interpreters pulling code from standard in. Even though this will be solved by PEP 578, there are still old systems out there that need protecting. Then there's detecting code from the command line, standalone shell usage, and adding more threads so it can scale out.
So how this might fit, I'll briefly go through this. The audit system has a bunch of event feeds. From the kernel, we can get promiscuous socket, core dumps, symlinks, netfilter, TTY, syscall, and file watches. There are also trusted programs that send events, such as PAM, login, shadow-utils, passwd, semanage, CUPS, Clevis, and Libreswan. And there are also policy engines: LSMs and seccomp can also cause events.
And then there are integrity apps: AIDE, fapolicyd, and USBGuard. So we can take these events and start to fashion a system together where events wind up coming to the audit daemon. Right here it just shows the application whitelisting daemon. And then the audit daemon can have an IDS plug-in to look at things. The audit system is really easy to use. You don't have to worry about parsing events. There's an audit parsing library which takes care of all the idiosyncrasies of the audit system. You just make some calls, and it gives you the data in nice little chunks. And then what we would look at is putting this into an IDS system with an ensemble model, something like this, that looks for bad events, does pattern analysis, does burst analysis, looks at historical norms, and does misuse detection. And then that all gets summed up, and there's a reaction to it. That concludes the presentation. Questions?
So is there a revocation story for the trusted database?
Yes, anytime that the database is updated, we get signaled and we update the database. So if, for example, you do rpm -e to erase a package, then that would trigger an update to the database, and then we would know that it's no longer there. So that's kind of how revocation would work.
The IDS model that you talked about at the end is more of a blacklisting model.
I'm afraid I didn't understand the question.
So the last thing that you talked about is not whitelisting. You said you are looking for patterns, blacklisting patterns, basically.
Yes, yes, it's doing more than whitelisting. You're absolutely right, because whitelisting is just about the provenance and whether or not it's in the trusted database. But in a way you can stretch that to say that the runtime linker is what we trust, and if they're calling a different runtime linker, then it's outside what's trusted.
You mentioned SWID tags. In case you're not a big fan of XML, I just want to let you know there's an IETF draft for doing SWID tags in a concise binary format, called CoSWID. Also, I just want to give a forward reference: at our three o'clock presentation, we are defining another type of SWID tag for firmware. So just kind of underbashed.
So how does this work along with other access control systems like IMA or SELinux? How does it coexist with other ecosystems?
Well, because we're in the file access path, I think that we're before the DAC permissions, and SELinux is after the DAC permissions. So I suppose in a way this program gets first vote, or something like that. But it coexists fine. They're complementary. As for UEFI Secure Boot, this doesn't try to solve that problem. So it's complementary on top of that, because really what this is aimed at is solving the problem of somebody popping a shell from a daemon, or getting access to somebody's system through their credentials. And that's way after boot. But that's really what this is designed for.
So, okay, last question.
So, two questions, since it's the last question. Okay, so do you have any plans to extend this to kernel module whitelisting?
Would we extend this to be a kernel module?
No, to whitelist kernel modules.
Oh, to whitelist kernel modules. Yeah, I believe we can do that. You know, I would just have to double check to see if the fanotify system gets notified. And if it does, then yes, we can.
The second question is this. I believe the trusted database, or the trust model, works efficiently when the software or the applications are installed from the package manager. Suppose you have legitimate scenarios where you are not using a package manager, or you are not installing packages; rather, you are directly dropping in the applications, distributing them by a different distribution mechanism.
And you have a legitimate need to allow only certain binaries which are distributed by a different distribution system. How do you foresee this faring?
Okay, I'm not entirely sure I understood all of the question, but what I believe I heard was: if somebody has applications that are not packaged, and you install them and you want them to be trusted, how do you handle that situation? And if that's the case, there's an admin-defined list that you can also modify, which would then grant trust to them. So there is a way to add your own whitelist to this, so that you don't have to depend entirely on the package database.
So we're out of time for questions, but I'm sure Steve will be around later. Thank you, Steve.
Yes, thanks.