Hello, everyone. My name is Daniel Bergener. I'm an engineer at Microsoft, and I'm going to be talking to you today about a tool that I developed to do static analysis on SELinux policy. This tool is called SELint. I'm going to start with a brief introduction, then talk about some of the challenges involved in writing SELinux policy, then show some example checks that SELint can perform, and then give a brief overview of how to use SELint and take advantage of some of its features.
As I said, my name is Daniel Bergener. I've been working on SELinux policy for about eight years, I'm at Microsoft, and I maintain SELint. In my time working on SELinux policy, I've spent a lot of time reviewing policy written by people who may be newer to policy or less experienced with SELinux. Doing that, just like with any code review, I find myself talking about the same issues over and over again, and those issues are catchable by automated analysis. The goal here is for SELint to be a tool that gets a lot of that normal cruft in policy development dealt with automatically, so it doesn't need human review. There's the GitHub link where you can find SELint. We just released version 1.1 in May, and there's still lots of active development going on to improve it, add more checks, et cetera.
The reason for this is to improve SELinux policy scalability and maintainability. As I said, one goal is to save reviewer time. In most of the environments that I've worked on policy in, you tend to have a lot of people who are very new to policy, perhaps developers who are writing code and then also writing policy for that code, and then you have a handful of people who have more background with SELinux. We want to minimize the amount of time that those people have to spend reviewing policy, to free them up for other issues. It can also help to find issues prior to failures in production, taking some of the burden off of testing and moving it to static analysis.
One of the main goals here is to have something that's good for inclusion in automated build pipelines. As another benefit, we've already submitted quite a few fixes to reference policy, which means that everyone gets the benefit upstream of the issues SELint is finding, and we hope to continue submitting upstream reference policy fixes based on things SELint finds as we add new checks.
First off, let's talk about some of the challenges with using SELinux policy. As a story, I recently purchased a new scanner, and I was on the scanner manufacturer's website trying to get drivers for it, which worked great, and I was able to get my scanner set up very easily. But as I was looking at their FAQs, I noticed a "what happens if SELinux is blocking the use of the scanner" entry, and they had this mess of instructions, which is really, really long and intimidating if you don't really know what SELinux is. You've just tried to install the scanner on your SELinux-enabled system, and what should you do? It's also worth noting that these instructions involve editing a generated file, so this may very well not persist after a policy update and may need to be redone, which is not indicated here. But fortunately, they do provide alternative instructions, which is how to disable SELinux, and those are short, sweet, and simple. So I imagine that most users who are using SELinux are going to opt for the disable.
Now, the target here is not your end desktop user. They're not gonna write their own policy. The target here is policy writers who, in my experience, often end up in the same kinds of situations, where SELinux is hard and intimidating. They have a big mess of instructions, and this is just one of the many possible solutions to be considered. So I would say that SELinux is not necessarily hard, but it is complex.
There's a lot of granularity exposed by SELinux, there are a lot of different solutions, and which one is appropriate is context-dependent. This is not so much blaming the scanner manufacturer for providing bad instructions as recognizing that they don't know what your computer setup is. It's hard to provide generic instructions. So if disabling security features is common, then it's important to make the use of security features easier. As a Linux security community, we don't wanna spend all of our time developing cool security features only for people to not use them.
So why is SELinux hard? I think there are fundamentally four reasons. There may be more, but these are the four that I think are most significant. One of them is that computing systems, Linux included, are inherently complicated. This is why we have all these discussions at LSS, for example, about securing the Linux kernel, and why we've spent years and years on it and there are still bugs: code is complicated, it's challenging, and there are bugs. SELinux provides very granular exposure to the low-level implementation details of Linux and the applications on your Linux system. That can result in some very unexpected corner cases, where the person writing the SELinux policy was not even aware they needed to be thinking at the level of this particular library, or the level of the kernel, or the level of a particular user space daemon, or whatever.
A second reason why SELinux is complex is that there is some inherent complexity in SELinux as a system. We've got five different models of security in SELinux, and permission denials can come from any of those models. Additionally, there are various user space components that are SELinux aware. So you may be dealing with a problem related to SELinux that actually comes from systemd or D-Bus or PAM or any of these other SELinux-aware components.
And now suddenly, as an SELinux developer, you're thinking about the implementation details of systemd, which is not something you necessarily wanted to have to think about.
Additionally, policy should generally conform to high-level security goals. Security can be hard because things like threat modeling are hard. Things like understanding the security properties of your system are hard. Now, hopefully having static security policy is a helpful solution to that, and we can enable people to articulate their high-level security goals in a way that's analyzable. But users have to be thinking about high-level security goals and have to write policy that correctly implements those goals.
Finally, writing scalable and maintainable policy is the fourth issue that I see as troubling. When we have good policy design, that makes the security and functional implications of our policy a lot more clear. Adherence to conventions, just like when writing any kind of source code, can make the policy much more readable for future maintainers and other developers, and reduce the chance of error. The proper use of appropriate abstractions will make the policy more resilient. We're gonna see an example of an issue with that in a little bit. Also, portability requires some attention from the developer. Again, just like using any language, if I wanna write SELinux policy that's portable to another system, or that correctly handles changing properties of the system I'm working on, I need to pay attention to how I do portability. And finally, reference policy, which is the commonly used wrapper layer around SELinux, uses the M4 preprocessor language. That has some subtleties. As someone developing security policy, we don't necessarily want to have to become an expert on the subtleties of M4 in order to write good security policy. But sometimes those details get exposed.
So how do we address these challenges?
Reference policy is a huge step in terms of addressing a lot of these. Reference policy focuses on abstracting away the Linux complexity. It focuses on creating a portable base policy. It focuses on allowing scalability by having encapsulation in different modules. We also have analysis tools that can help check our high-level security goals. But using reference policy requires the correct use of reference policy. It's very easy to have a build system based on reference policy and not take advantage of the features that it affords you, or to take advantage of them incorrectly, in a way that's going to increase the future maintenance burden. So the overall goal here with SELint is that if users can spend less time on reference policy syntax details, they can spend more time on security. And the end result is hopefully that everyone using SELinux will have more secure systems.
In terms of goals, SELint wants to report violations of normal conventions as well as poor style. I'm grouping convention and style differently. Convention means the commonly expected norms: it's not necessarily right or wrong to do it one way or the other, but we've all as a community agreed, at least in general, to do things this way, and so following that makes the policy more readable. Style issues are things that are really more objectively wrong: you're not necessarily introducing a bug or an error, but it's going to cause you a maintenance burden down the road.
We also want to warn about policy that could potentially cause unexpected errors. There's a lot of policy that works fine right now, and then when you change something seemingly unrelated, suddenly you're getting an error at compile time or at runtime. We wanna be able to warn about those situations early, when we're doing static analysis, so that we're not having trouble diagnosing a challenging error down the road.
We also wanna treat SELinux syntax and refpolicy M4 usage as a unified grammar. A lot of existing tools will only deal with the base SELinux language. The ability to treat reference policy syntax, and the subset of M4 that it uses, as all part of one language grammar really enables us to look into the style of how reference policy is being used. We also wanna be fully configurable, so that people can turn on and off the checks that they do or don't want, and have granular disables, so that this can be very usable. If people are using a static analysis tool and it's giving them false positives that they don't want and have no way to turn off, they're going to stop using it. And lastly, we wanna make it easy for people to contribute checks upstream, so that if there is a particular policy development issue that's bothering you, it's easy to add a check for it.
So I'm gonna go over just four examples of the kinds of issues we find. We actually find quite a bit more. I have the number on a later slide, but here's just four examples to give you a taste.
This first one is a nice, normal SELinux allow rule. We're going to allow a domain called foo_t to read and write its own anonymous pipes. This is based on a rule that I saw in a policy I was working on, and this is the rule that's recommended by audit2allow. We did some testing, we observed denials for these three permissions, and we added the rule: now no more denials. Then we're going along, and a few months later we get another denial, for the ioctl permission. What's happening here is that, on anonymous pipes, the normal behavior is that you're only reading from and writing to these pipes. However, the pipes do accept ioctls, and there are certain situations where someone may want to configure the behavior of a particular pipe using an ioctl. It's less common, but it does happen.
And so if we had just used the appropriate reference policy permission macro, rw_fifo_file_perms, we would have gotten that ioctl for free. We never would have had the test failure later on when this behavior changed. That's why reference policy really helps improve policy robustness, and why we want to be using the appropriate permission macros. So SELint has a check, and here's the message it will display: foo.te, line 26, S for style (this is a style failure), and it suggests that you probably wanted to use rw_inherited_fifo_file_perms. Then it helpfully tells you that you had getattr, read, and write, and that using this macro would add ioctl, lock, and append. So you need to decide if there's a security risk associated with adding ioctl, lock, and append. Maybe in your scenario you actually don't want to be allowing this ioctl. That's a call for you to make, and we'll talk in a little bit about how you can create an exception in your policy if you don't want to allow this. But I think often you do want to allow these.
All right, a second example check. One of the things that reference policy does is create interfaces, which enable encapsulated policy in one module to be accessed from a different module. These interfaces require policy developers to declare their own dependencies. This particular interface is taken from upstream refpolicy. It's called dev_rw_cardmgr, and it allows a domain to read and write the card manager device. You may notice that this interface accesses the device_t type but doesn't declare that in the gen_require block up at the top. Ordinarily, in most policies, this is probably not going to be a problem, because you've probably called other interfaces in the devices module that require the device_t type, and the way macro expansion works with refpolicy is that these just get plugged in as macro expansions.
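The permission-macro check from the first example can be pictured as a set comparison. What follows is only a minimal Python sketch under stated assumptions, not SELint's implementation (SELint's code is not shown in the talk). The macro's permission set is the one implied by the message described above (getattr, read, write plus ioctl, lock, append), and the function name `suggest_macro` is hypothetical.

```python
# Assumed contents of the refpolicy permission macro, reconstructed from the
# talk's description of the SELint message (not read from refpolicy itself).
PERMISSION_MACROS = {
    "rw_inherited_fifo_file_perms": frozenset(
        {"getattr", "read", "write", "append", "ioctl", "lock"}
    ),
}


def suggest_macro(rule_perms, macros=PERMISSION_MACROS):
    """Return (macro_name, extra_perms) for the smallest macro covering the rule.

    Returns None when no known macro is a superset of the rule's permissions.
    """
    rule_perms = frozenset(rule_perms)
    candidates = [
        (name, perms) for name, perms in macros.items() if rule_perms <= perms
    ]
    if not candidates:
        return None
    name, perms = min(candidates, key=lambda item: len(item[1]))
    # The extra permissions are what the reviewer must consciously accept.
    return name, sorted(perms - rule_perms)


# The rule from the example: allow foo_t self:fifo_file { getattr read write };
print(suggest_macro({"getattr", "read", "write"}))
```

Reporting the added permissions alongside the suggestion mirrors the point made in the talk: adopting the macro is a security decision, so the extra permissions should be visible rather than implied.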
So the first time you require device_t, it's now required, and then we won't hit the error. But if we were to, say, reorder our interface calls, or maybe use this interface in a new module that didn't otherwise require device_t, suddenly we're getting errors, and we're saying: I just reordered a couple of lines, why am I suddenly getting a "dependency not satisfied" kind of error?
So we've got two different checks. The one here reports that device_t is used but is not required in the interface. We submitted a patch to upstream refpolicy, and had it merged, to correct 57 instances of this. So this is no longer in upstream refpolicy; this is the situation as it was before that patch. Then there's also the reverse problem, which we corrected 48 instances of. That's if we required device_t but didn't actually use it. This means that our interface would have a dependency on something it doesn't, strictly speaking, need, and so if that dependency for whatever reason became unavailable, we would have a failure that we didn't need to have. And so here the message reads: devices.if, line 1671, and it's a warning this time, because this could potentially cause a compile failure, and it says the type device_t is used in the interface and not required.
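The two require checks amount to comparing the types named in the `gen_require` block with the types used in the rest of the interface. The Python sketch below is a rough illustration only: the interface text is a simplified, hypothetical reconstruction of the pre-patch `dev_rw_cardmgr` (the slide is not reproduced in this transcript), and a regular expression over `_t` names stands in for SELint's real parser.

```python
import re

# Hypothetical, simplified reconstruction of the interface discussed above.
INTERFACE = """
interface(`dev_rw_cardmgr',`
    gen_require(`
        type cardmgr_dev_t;
    ')

    rw_chr_files_pattern($1, device_t, cardmgr_dev_t)
')
"""

REQUIRE_BLOCK = re.compile(r"gen_require\(`(.*?)'\)", re.DOTALL)
# Assumption for this sketch: type names are identifiers ending in "_t".
TYPE_NAME = re.compile(r"\b[a-z_][a-z0-9_]*_t\b")


def check_requires(source):
    """Return (used_but_not_required, required_but_not_used) type names."""
    match = REQUIRE_BLOCK.search(source)
    require_text = match.group(1) if match else ""
    body_text = REQUIRE_BLOCK.sub("", source)  # everything outside the require
    required = set(TYPE_NAME.findall(require_text))
    used = set(TYPE_NAME.findall(body_text))
    return used - required, required - used


missing, unused = check_requires(INTERFACE)
print("used but not required:", sorted(missing))
print("required but not used:", sorted(unused))
```

A real check has to work on a parsed tree rather than on text, since attributes, roles, and class names all need their own handling; the sketch only shows the set difference at the heart of both checks.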
The next one is a pretty uncommon situation, and I just think it's kind of interesting. There's an interface foo, and in it we have allow $1 $2, using both of the arguments. The critical thing here is that they are next to each other without anything in between them. If we were then to call the interface elsewhere without using a comma to separate our arguments, M4 is going to interpret that as one argument rather than two, which in this case was presumably the intention. This means that instead of $1 being bar_t and $2 being baz_t, as we probably wanted, $1 is now "bar_t baz_t", with a space in the middle. In this particular instance that actually works fine, and I did see this come up in a policy once where everything was working fine. But if we were to go back later and modify this interface to have something where $1 was not right next to $2, then this call would break, and it would give you a confusing error message, and you would say: why on earth did modifying the interface make the call break? That shouldn't have happened.
SELint actually recommends three different things for this particular interface. The one that we've been discussing is this first one here: argument number one of the call to interface foo contains an unquoted space. Generally speaking, you don't wanna include unquoted spaces in your interface calls unless that behavior is what you want, but in that case quoting the arguments is probably gonna be more clear.
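To see why the missing comma goes unnoticed, it helps to mimic the argument substitution. The Python sketch below is a deliberately simplified stand-in for M4 (it splits on commas and ignores quoting and nested parentheses), and the interface bodies are hypothetical, written to match the spoken description rather than copied from the slide.

```python
import re


def expand(body, call_args):
    """Substitute $1..$9 in body, roughly the way M4 substitutes macro arguments.

    Simplification: arguments are split on commas only; missing arguments
    become the empty string, as M4 does for omitted arguments.
    """
    args = [arg.strip() for arg in call_args.split(",")]

    def substitute(match):
        index = int(match.group(1)) - 1
        return args[index] if index < len(args) else ""

    return re.sub(r"\$([1-9])", substitute, body)


def has_unquoted_space(arg):
    """Flag an argument that contains a space and is not wrapped in `...'."""
    arg = arg.strip()
    quoted = arg.startswith("`") and arg.endswith("'")
    return " " in arg and not quoted


original_body = "allow $1 $2:file read;"
print(expand(original_body, "bar_t, baz_t"))  # intended call, two arguments
print(expand(original_body, "bar_t baz_t"))   # comma forgotten, one argument

# Hypothetical later edit where $1 is no longer always followed by $2:
modified_body = "allow $1 self:process signal; allow $1 $2:file read;"
print(expand(modified_body, "bar_t baz_t"))   # first rule is now malformed
```

With the original body, both calls expand to a rule naming `bar_t` and `baz_t`, which is why the mistake survives testing. After the edit, the first rule receives both names as its source, and the failure shows up at the call site even though only the interface changed.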
Also, we have a check that you should have a documentation comment before your interfaces, and there's no documentation comment here because I threw this together as an example for slides. And finally, we have the same one from previously: I had foo in the gen_require at the top of this and didn't actually use it, because I grabbed an interface that had allow $1 foo, changed it, and forgot. That's a really common problem when people are writing policy: they copy-paste a policy block and then forget to make all of the changes.
So, last one. We also do file contexts. There's actually a fair number of other tools that can do this same kind of work, but I think it's helpful to have a one-stop kind of tool here that gets that as well. So this is redundant with some other tools, but I think it's good to have multiple tools there. With this rule here, we're going to label some_script.sh with our some_script_exec_t label. This works totally fine on my system, but there are actually two problems here. First off, the path portion of the file context is a regex, which means that dot is a special character. So this does match some_script.sh just fine, but it would also match any character in the place of that dot, which is not the biggest deal in the world, but it is not what you intended. Second, the gen_context macro takes a second argument, but M4 doesn't care if we just omit arguments; it just substitutes the empty string. So we were supposed to supply an MLS component. Now, this rule came from a non-MLS policy, and so everything was fine. But if we were to later turn on MLS in our policy, suddenly we're getting all sorts of breakage for what should have been a one-line config change.
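The unescaped-dot problem is easy to reproduce with any regex engine. The Python sketch below uses a hypothetical path (the full path on the slide is not in the transcript) and a small helper, `unescaped_dots`, that is only an illustration of the idea and not SELint's check. It deliberately ignores a dot followed by a quantifier, on the assumption that something like `.*` is intentional.

```python
import re


def unescaped_dots(pattern):
    """Return the positions of '.' characters that are probably meant literally.

    A dot is skipped when it is escaped with a backslash or when it is
    followed by a quantifier such as '*', '+', or '?'.
    """
    hits = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "." and pattern[i + 1:i + 2] not in ("*", "+", "?"):
            hits.append(i)
        i += 1
    return hits


loose = "/usr/bin/some_script.sh"      # hypothetical path, dot left unescaped
strict = r"/usr/bin/some_script\.sh"   # dot escaped, matches only a literal dot

print(bool(re.fullmatch(loose, "/usr/bin/some_script.sh")))   # True
print(bool(re.fullmatch(loose, "/usr/bin/some_scriptXsh")))   # True, unintended
print(bool(re.fullmatch(strict, "/usr/bin/some_scriptXsh")))  # False
print(unescaped_dots(loose))
```

Python's `re` module is not the regex engine SELinux uses for file contexts, but the meaning of an unescaped dot is the same in both, which is all this sketch relies on.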
So we've got messages for both of these issues reported by SELint. You get a style warning that your MLS levels are not specified in the gen_context, and we've also got a warning that the file context contains a potentially unescaped regex character. We don't just check for dot; we check for a variety of different characters. Dot is kind of the big one that shows up a lot, because you get a lot of file extensions, and it's really easy to forget that you need to escape that dot there.
All right, so I also wanna talk a little bit about how to use SELint, if you're interested in using it. You can again get it from our GitHub here, and this is a link to the releases page, where you can see all the latest releases. Again, 1.1 is the latest release. There's been a little bit of work since then, but 1.1 is pretty up to date at this point. And then it's just a normal installation process.
In total, we check for however many issues the sum of those four numbers comes to. There are only three convention checks, and that's kind of a future work thing: I would really like to get more convention checks going in this. Convention checks tend to be a lot harder to write, because the conventions are written to be human readable, or understood to be human readable, and it can be challenging to translate that into something machine readable. So improving the number of convention checks is a big growth area going forward for SELint. We have 10 style checks. This is really a lot of the bread and butter here: things like, you aren't using this reference policy abstraction appropriately, or you're not using it at all and you could be taking advantage of it and getting some benefits. We also have 11 warning checks, for things that could possibly break down the road, and six error checks. Now, you might ask why we have an error check if the compiler would just catch it for you, and that's a really good question. Generally, we try to avoid having error checks if the compiler is going to catch the problem helpfully for us.
However, when SELinux policy is compiled, you've got a multi-step process, right? First you're running a preprocessor, then you're doing the policy compilation, then you're linking the policy, then you're loading the policy on your system. Errors coming in at link time or at load time are often expressed in Common Intermediate Language (CIL) syntax, which can be very difficult to debug, and now we're asking policy developers to also understand CIL syntax if they're trying to figure out what the error is. For cross-compiled policies, errors that don't show up until load time, which are often file context errors, can come very late. If you have a lengthy process between a build and getting it installed on your system, that can be troubling. So we try to provide error checks where the compiler's error message is either unclear or comes very late in the process.
We can enable and disable checks in the config file or on the command line. So here's an example enabling checks W3 and W2 and disabling check W5. I also use the recursive flag here. It's also possible to do individual disables in policy. So here's an example of adding a comment to disable style check S010. This would be for when, maybe, I decided I really didn't want the ioctl in this situation, like we discussed earlier.
A couple of other helpful features. Ordinarily, SELint returns zero if everything ran successfully, even if it found issues. For inclusion in automated pipelines, you may wanna return an error code on any issues found, and you can do that with the dash capital F flag. There's also a run summary; that's the dash capital S flag here. In this example I also use the dash dash summary-only flag, which says: don't actually tell me the issues, just display the summary at the end. And so this was the state of reference policy as of when I generated this slide. You'll notice that there are a few issues there.
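For pipeline use, the exit-code behavior just described can be wrapped in a small build step. The Python sketch below is one possible shape, under stated assumptions: `-F` and `-S` are the flags named in the talk, while the `-r` spelling of the recursive flag and the idea of passing the policy directory as the last argument are assumptions that should be checked against `selint --help` for the installed version. The function names are hypothetical.

```python
import subprocess


def selint_command(policy_dir, fail_on_issues=True, summary=True):
    """Build an selint command line for a CI step (flag spellings partly assumed)."""
    cmd = ["selint", "-r"]   # assumed spelling of the recursive flag
    if fail_on_issues:
        cmd.append("-F")     # non-zero exit status when any issue is found
    if summary:
        cmd.append("-S")     # print the run summary after the issues
    cmd.append(policy_dir)
    return cmd


def policy_is_clean(policy_dir):
    """Run selint and report whether it exited with status zero."""
    completed = subprocess.run(selint_command(policy_dir), check=False)
    return completed.returncode == 0


print(selint_command("policy"))
```

Without `-F`, a zero exit status only means the run itself succeeded, so a gate like `policy_is_clean` would pass even when issues were reported.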
Obviously the big one is this C005, with 709 hits. That is the ordering of permissions in an allow rule. It's a pretty new check that we added, and we haven't had time to go through 709 reference policy lines and get them all fixed up yet. But hopefully that'll happen soon.
There's also a context flag. Sometimes you may want to only report issues on the policy modules you're developing, but since SELint wants to be aware of all of the interfaces used in your policy and all of the other types declared, it's helpful for it to have the global view of your policy source. So this context flag says: scan the rest of the policy to get the context, but only report issues on this path.
If anyone wants to help contribute, contributing checks is designed to be really straightforward. There's a simple function prototype. It gets two arguments. One is some metadata about the file being scanned, that kind of thing. The other is a pointer to a node. We parse the policy into an abstract syntax tree, so that node should hopefully have everything you need to write your check: all the information that this is an allow rule, it has these permissions, et cetera. These checks return null if the check passed, or a check result structure that has the information about what failed. The big thing there is a string that you display to the user describing the failure. So you create a function, you set up the registration, which is a couple of lines of code, and then you can get that merged. I think about half of the checks have been written by people other than myself, which is really exciting for me. We don't have a huge number of people contributing yet, but those who have, have contributed quite a bit.
So that is the end of my talk, and it does look like quite a few questions have come in.
So, from your experience, what is the best approach for finding suitable refpolicy interfaces for a given denial?
So that's actually something that I really hope to add as a future feature to SELint, but it's not in there yet: for SELint to suggest those interfaces if you have the allow rule. My recommendation in general is to grep refpolicy for the target type in your denial to find which module in reference policy owns that type, and then look through the interfaces that require that type.
All right. Yes, so there's past research, and there is, yeah, another tool, which I unfortunately found out about after we named this SELint, that also shares the name SELint. So, sorry about the naming conflict there. The other SELint was developed as a research project a few years ago, and it's focused specifically on SEAndroid policy, while this one is focused on reference policy. This SELint actually does not currently support SEAndroid policy. Unfortunately, that's something I would really like to add, but there is some future work to be done there. Some of the ways that SEAndroid handles its policy don't work with our current parser structures. There's some parser refactoring that needs to get done to handle SEAndroid, but hopefully soon.
Is Microsoft using SELinux in any of its products or solutions? If so, where?
Yes, and that's why they have hired SELinux developers. I don't know of anything that's public that I'm allowed to talk about right now. So I'm sorry, I can't really give a great answer on that.
How does SELint get implemented into existing devices? You mentioned the end user early on. How can developers, in policy or other representation, otherwise make implementations of SELinux (I assume is what that means) without sacrificing the usefulness of enabling it?
I'm not 100% sure I understand the question.
So, from an SELint standpoint: typically when we're developing an SELinux policy, the first thing we do is start by forking reference policy and coming up with a policy that's custom to our system, developing policy based on the denials that we see on our system. SELint is then a good thing to integrate into, say, a build pipeline for your custom policy for your system. And that's a great way to make the implementation of SELinux a lot easier. So I hope I kind of addressed that question. I'm not sure that I really understood it, though, sorry.
What have you done to try to get this to be part of the process for anyone developing SELinux policies?
Yeah, so I think there have been kind of two prerequisites to get this to become part of the general process. One of those is that, right now, as I showed on the slide a moment ago, these are the issues that were in reference policy as of about a month ago. So if you're gonna fork off of reference policy to build your policy, or you did it a while ago, now you're dealing with a bunch of reported issues that are upstream. My number one priority in terms of policy integration right now is getting these fixed upstream, so that someone can have a clean base to work from. In terms of being part of the process, I think a big part of it is also just getting the tool more mature. This tool was originally published back in January. It's gained a lot of maturity, and we've done a lot of development fairly quickly, but we're continuing to mature the tool. And then lastly, there's evangelizing it. That's part of why I wanted to do this talk now: to raise awareness, so that people can make it part of the process. We're looking to integrate it into some of our pipelines here at Microsoft.
How would you debug denials that are not due to type enforcement but possibly come through other SELinux-aware applications or other things like constraints?
Those are really difficult. And that's part of the complexity here.
audit2allow does actually display a message if it's a constraint violation, which can be really helpful in that regard. The SELinux-aware applications can be tricky. I think the real long-term solution there is to fix the issues upstream in those applications' code, so that they do their error reporting more cleanly. And that's been happening a lot. I remember back when I got into this eight years ago, it was really challenging, and it's gotten a lot easier as applications like systemd and PAM, et cetera, have done things like honoring permissive mode appropriately, adding better error messages, et cetera. So that's the big improvement there, but I don't know that there's a huge shortcut, unfortunately.
Can SELint be used for Android?
I said that a moment ago: unfortunately not, but I'm hoping to improve that soon. I think there's a big growth area there. I definitely welcome your contributions, but that is something that's on my to-do list as well.
I have not uploaded my slides anywhere yet. I think they will be auto-uploaded at the end of this talk.
How much of this could be applied to other M4 inputs?
So the way that this is implemented right now is really specific to the subset of M4 that's used by reference policy. I'm not sure that there's a ton of code to reuse for, say, generic M4, but I think the exact same techniques could be used for other M4. I'm not sure what's out there in terms of autoconf input or whatever, and whether there are static analyzers, but I would think that exactly the same sort of technique could be used there. Again, this is really targeted at refpolicy specifically.
So I think that's all the questions in my queue. I hope this was informative to you guys, and thank you very much for your time.