Hello, everyone. My name is Mickaël Salaün. I started working on Landlock some time ago, and now it is finally in Linux mainline. At first, it started as a personal project driven by personal needs: I wanted to take control of my own data on my Linux workstation. I then continued this work with my previous employer, the French National Cybersecurity Agency, and now with Microsoft. This talk is complementary to the one I gave yesterday at the Open Source Summit, which was named Sandboxing Applications with Landlock. But this time, I focus on the inside. We will see how Landlock works underneath, and what the constraints of an unprivileged access control system are.

But first, let's see what we try to solve here. This drawing illustrates the reason why Landlock exists. It says: if someone steals my laptop while I'm logged in, they can read my email, take my money, and impersonate me to my friends, but at least they can't install drivers without my permission. Here, the problem is not a stolen laptop but a compromised application, which will then be able to access user data.

This talk first explains the goal of Landlock and the related consequences. We'll see how to use it, and a bit of the history that led to the current design. I will then explain the current implementation constraints and the potential limits of the current and future features.

But first, let's see what sandboxing really is. Sandboxing can have different meanings, but in this talk, it is used as a security approach, not a binary state, to isolate a software component from the rest of the system. We suppose that the running system is trusted and not compromised at an early stage. However, once started, it can be attacked and, without compromising the trusted computing base, illegitimate access to restricted data could happen because of a compromised application. Indeed, an innocuous and trusted process can become malicious during its lifetime.
The threat models are, first, to protect from vulnerable code maintained by the application developer. They are also to protect from malicious or vulnerable third-party code. And finally, the threat model can be defined by the developers as it fits best.

But what is Landlock? Landlock is the first mandatory access control system available to unprivileged processes, since Linux 5.13. The idea is to restrict ambient rights, according to the kernel semantics, for example global filesystem accesses, for a set of processes, because this cannot be done with seccomp. And Landlock enables developers to build sandboxing into applications, either to protect against exploitable bugs in trusted applications, or to protect against untrusted applications thanks to sandbox managers. Contrary to other mandatory access control systems, like SELinux, AppArmor, Smack, or Tomoyo, Landlock empowers any process, including unprivileged ones, to safely restrict themselves.

From the user's point of view, Landlock offers three syscalls. Each one of them is designed to do a specific action. It can be seen as a builder pattern: an initial ruleset is created thanks to the landlock_create_ruleset syscall. This ruleset is then usable thanks to a specific file descriptor that can be passed to the landlock_add_rule syscall. And finally, once the ruleset is ready, it can be enforced to restrict the calling process thanks to the landlock_restrict_self syscall.

Landlock enables protecting user data from unauthorized access or disclosure by making it possible for a thread and its future children to only be allowed access to a set of file hierarchies. The access rights are: execute, read, or write to a file, list a directory, remove a file, or create files according to their type.

Here is an example in C, but we can do the same thanks to helper libraries for Go and Rust. First, we need to create a ruleset attribute, which will define the ruleset, and it will contain a set of access rights.
All these actions will be denied by default, except if a rule inside this ruleset allows the action. Then we need to pass this ruleset attribute to the landlock_create_ruleset syscall. And if the call succeeds, then we get a ruleset file descriptor.

The second step is to add rules to this ruleset. To define a rule, we can create a path_beneath attribute, which contains a set of access rights. Then we can open a path, which will be identified thanks to another field of this path_beneath struct. We can then, thanks to the ruleset file descriptor, pass this struct, this rule, to the landlock_add_rule syscall. If the call succeeds, the ruleset gains this new rule.

The final step is to enforce the ruleset. Once all rules are added to the ruleset, we can then pledge to the kernel that the current thread will not gain new privileges, which could otherwise happen thanks to setuid binaries, for example. Then we pass the ruleset file descriptor to the landlock_restrict_self syscall. And if the call succeeds, the calling thread will be restricted by the ruleset.

I started working on Landlock five years ago. At first, it was a proof of concept to extend seccomp, which was called seccomp-object, and it was using the seccomp syscall. It was a bit hacky, but the idea was to be able to filter not only on arguments, but on kernel objects as well. But then I switched to the LSM framework, so not at the syscall layer, but really closer to the kernel semantics. And I also used eBPF and seccomp. A major step after that was to create a new BPF helper, which was able to identify file paths. After that, it was mainly about trimming patches to make the patch series minimal. The major bump happened in 2020, when we removed eBPF entirely and replaced the use of the seccomp syscall with a new dedicated syscall, which was in fact a multiplexer. The next step was then to replace this multiplexer syscall with three dedicated syscalls, which are now landlock_create_ruleset, landlock_add_rule, and landlock_restrict_self.
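The three steps described earlier (create a ruleset, add a path-beneath rule, enforce the ruleset) map onto these three syscalls roughly as follows. This is a minimal sketch and not the example shown on the slides: the helper name restrict_to_read_only and the particular set of handled access rights are assumptions chosen for illustration, and the code assumes kernel headers from Linux 5.13 or later, which provide <linux/landlock.h>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/landlock.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrappers: the C library may not provide these, so use syscall(2). */
static int ll_create_ruleset(const struct landlock_ruleset_attr *attr,
                             size_t size, unsigned int flags)
{
    return (int)syscall(__NR_landlock_create_ruleset, attr, size, flags);
}

static int ll_add_rule(int ruleset_fd, enum landlock_rule_type type,
                       const void *rule_attr, unsigned int flags)
{
    return (int)syscall(__NR_landlock_add_rule, ruleset_fd, type, rule_attr,
                        flags);
}

static int ll_restrict_self(int ruleset_fd, unsigned int flags)
{
    return (int)syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
}

/*
 * Hypothetical helper: restrict the calling thread so that, among the
 * handled access rights, only reading beneath the directory `dir` stays
 * allowed. Returns 0 on success and -1 on any error.
 */
int restrict_to_read_only(const char *dir)
{
    assert(dir != NULL);

    /* Step 1: every access right listed here is denied by default. */
    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs = LANDLOCK_ACCESS_FS_EXECUTE |
                             LANDLOCK_ACCESS_FS_WRITE_FILE |
                             LANDLOCK_ACCESS_FS_READ_FILE |
                             LANDLOCK_ACCESS_FS_READ_DIR |
                             LANDLOCK_ACCESS_FS_REMOVE_FILE |
                             LANDLOCK_ACCESS_FS_MAKE_REG,
    };
    int ruleset_fd = ll_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0)
        return -1; /* for example, Landlock is not supported or is disabled */

    /* Step 2: one rule that allows reading files and listing directories. */
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access = LANDLOCK_ACCESS_FS_READ_FILE |
                          LANDLOCK_ACCESS_FS_READ_DIR,
    };
    path_beneath.parent_fd = open(dir, O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        close(ruleset_fd);
        return -1;
    }
    int err = ll_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                          &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        close(ruleset_fd);
        return -1;
    }

    /* Step 3: pledge not to gain new privileges, then enforce the ruleset. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        close(ruleset_fd);
        return -1;
    }
    err = ll_restrict_self(ruleset_fd, 0);
    close(ruleset_fd);
    return err ? -1 : 0;
}
```

Calling restrict_to_read_only("/usr") early in a program would leave the thread, among the handled access rights, with read access beneath /usr only. Access rights that are not listed in handled_access_fs are not affected by this ruleset, and a return value of -1 usually means that the running kernel does not support Landlock or that the directory could not be opened.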
And finally, after some iterations, in 2021, the 34th version was merged in mainline for Linux 5.13.

So why no more eBPF? Because of seccomp-bpf, on which the first version of Landlock was based, I then used eBPF as a way to define security policies, which could be updated on the fly thanks to eBPF maps and evolve over time. The main goal of Landlock was and still is to bring sandboxing features to all users, which means having an unprivileged access control system. eBPF is very powerful, and I proved with previous versions of Landlock that it is possible to implement an access control system with it. However, a programmatic access control does not fit well with unprivileged principles. Indeed, eBPF can also be leveraged by attackers against the kernel, which is why eBPF is no longer meant to be used by unprivileged users. But also, programmable interfaces with I/O, for example eBPF maps, can lead to side-channel attacks against other processes, which is an issue for unprivileged access control. It is also not possible to efficiently compose eBPF programs, but only to chain them, which is what seccomp-bpf does, for example. But that is not enough to get an efficient access control for the filesystem. But still, this work contributed to bootstrapping the BPF LSM, previously called Kernel Runtime Security Instrumentation, which then gained extra features thanks to BTF that make it much more powerful.

Now, let's see the prioritized underlying principles for Landlock. First, we need to make sure that we don't weaken the system security by adding new features. When modifying the Linux kernel, these new features could potentially be leveraged for attacks against the kernel and all the resources it controls access to. Second, only sandboxed processes should be charged for the cost of the sandbox, and we should limit resource use within the kernel that is not accounted to user space. This includes access check time and allocated memory. Third, we need to protect non-sandboxed processes from sandboxed processes.
Indeed, sandboxing is meant to isolate potentially compromised processes and thereby limit the malicious impact on other processes. Being able to manipulate or impersonate other processes may also be used for privilege escalation, for example with a confused deputy attack. And finally, of course, sandboxing should be useful to limit access to data.

Now, let's see the implementation constraints of an unprivileged access control system. First, it needs to be useful for multiple and different applications, which means that independent but innocuous and composable security policies must be guaranteed. We also need to prevent bypass through other processes and to follow the principle of least privilege. And finally, we need to limit the kernel attack surface, for example by relying on simple, basic rules rather than bytecode.

How do we compose security policies? First, at the system level, there are other access control systems, and that is handled mainly thanks to the LSM stacking work. The other part is specific to Landlock: it is to compose all sandbox policies in an efficient way. From the kernel point of view, Landlock is implemented as a Linux Security Module, an LSM, which means that it relies on a set of access control hooks, like the other LSMs. In practice, to be useful and used, Landlock needed to be usable as a stackable LSM. Indeed, for a defense-in-depth approach, we should not replace other security mechanisms but add more security layers on top of them. Landlock has been a recognized motivation for the development of LSM stacking. By the way, thank you.

An LSM can register a set of hooks for a set of actions, which are able to check access, for example when opening a file, sending a network packet, doing an ioctl, and so on. And each LSM can also now register blob sizes for a set of kernel object types, which are used to tie specific data to kernel objects, for example inodes, files, sockets, and so on. The kernel denies an action as soon as a first hook call returns an error.
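The "first error denies" behavior of stacked hooks can be illustrated in user space. The following is only a toy model of the idea, not kernel code: the hook type, the three sample hooks, and call_file_open_hooks are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook type: returns 0 to allow or a negative errno to deny. */
typedef int (*file_open_hook)(const char *path);

/* Sample hook standing in for a module with no objection. */
int allow_all(const char *path)
{
    (void)path;
    return 0;
}

/* Sample hook that denies anything beneath /etc. */
int deny_etc(const char *path)
{
    return strncmp(path, "/etc/", 5) == 0 ? -EACCES : 0;
}

/* Sample hook that denies everything. */
int deny_everything(const char *path)
{
    (void)path;
    return -EPERM;
}

/*
 * Call every registered hook in order. The first error is returned right
 * away and the remaining hooks are not called; the action is allowed only
 * if no hook objects.
 */
int call_file_open_hooks(const file_open_hook *hooks, size_t count,
                         const char *path)
{
    assert(hooks != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        int err = hooks[i](path);
        if (err != 0)
            return err;
    }
    return 0;
}
```

In this model each stacked module can only add restrictions: adding a hook to the array can turn an allowed action into a denied one, but never the reverse.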
These are sequential checks. Thanks to unprivileged confinement, an important advantage of Landlock is the ability to compose security policies. Indeed, because each application can define its own security policy, the kernel must merge them in a safe way. From a user point of view, this must behave like a stack of scoped sibling or nested policies. However, for performance reasons, this composition cannot be simply implemented as a stack of policies. Moreover, dealing with a complex data structure like the filesystem implies dealing with multiple underlying mechanisms like namespaces, mount points, overlays, and special filesystems. And this also means that we need to handle a hierarchy of policies: a sandbox can only drop more accesses.

Sandbox policy composition applies to file identification and brings two constraints. First, we cannot use extended attributes on files, because we must handle multiple policies, and that would mean adding a lot of data to these extended attributes. But we also must be able to embed policies, which means ephemeral identification, because applications can be updated, and the corresponding identification must follow these updates. And we should be able to deal with read-only files. The second thing is that we cannot directly use file paths, because we may not have access to the root, we may be in a container, in a filesystem namespace, and we must not become a way to bypass other access control systems, for example through side-channel attacks to infer where we are in the global filesystem.

So how does file identification work for Landlock? Well, it uses inode tagging: access rights are tied to inodes by user space thanks to open file descriptors and a new system call, landlock_add_rule. All access rights for the same inode are stored inline in a dedicated kernel structure, a kind of tag, including a flexible array, which enables efficient lookups for such inodes.
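The idea of a per-inode tag that stores the rights of every layer inline can be sketched in user-space C with a flexible array member. This is a simplified model and not the kernel's actual definition: the struct names, the fields, and the inode_id stand-in for an inode reference are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ACCESS_READ  (1u << 0)
#define ACCESS_WRITE (1u << 1)

/* Rights granted on one inode by one sandbox layer (hypothetical layout). */
struct layer_access {
    uint16_t level;  /* which layer of the policy stack */
    uint16_t access; /* bitmask of allowed access rights */
};

/* One tag per inode; all layers live inline after the header. */
struct inode_tag {
    uint64_t inode_id; /* stands in for a reference to a kernel inode */
    size_t num_layers;
    struct layer_access layers[]; /* flexible array member */
};

/* Allocate the tag and its layers in a single contiguous block. */
struct inode_tag *inode_tag_new(uint64_t inode_id,
                                const struct layer_access *layers, size_t n)
{
    assert(layers != NULL || n == 0);
    struct inode_tag *tag = malloc(sizeof(*tag) + n * sizeof(tag->layers[0]));
    if (tag == NULL)
        return NULL;
    tag->inode_id = inode_id;
    tag->num_layers = n;
    if (n > 0)
        memcpy(tag->layers, layers, n * sizeof(tag->layers[0]));
    return tag;
}

/* Rights that `level` grants on this inode, or 0 if it has no rule here. */
uint16_t inode_tag_access(const struct inode_tag *tag, uint16_t level)
{
    for (size_t i = 0; i < tag->num_layers; i++) {
        if (tag->layers[i].level == level)
            return tag->layers[i].access;
    }
    return 0;
}
```

Because the layers sit in the same allocation as the tag header, checking every layer for one inode is a short linear scan over contiguous memory, with no extra pointer to follow per layer.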
And then, the lifetime of such tags depends on the lifetimes of the associated sandbox domains and of the underlying superblock, thanks to a new LSM hook that we added, security_sb_delete. The second part is to check the file hierarchy. When access to a file is requested, we walk up through the parent directories until all domains have been checked or the root is reached.

Now let's see an example. Let's say there is a shell application that wants to sandbox a user session. In this case, it is the first layer of sandboxing, and there are seven rules. There is one to allow execution of other applications, one to read the configuration, and other access rules to enable read or write for specific directories, including the user home directory. Now let's say a user in this session wants to launch an application that also sandboxes itself. The application will require access to maybe libraries, cache files, configuration files, and, let's say it is a picture viewer, the Pictures directory in a read-only way. And now let's say that the picture parser can also sandbox itself to reduce the attack surface. In this case, the parser inside the picture application will only have access to the cache and the selected picture, which is in this case /home/user/Pictures/cool.jpeg. So this is the third layer.

Now let's see how the kernel checks whether a specific access to a file is allowed or denied, for example for this cool picture. First, it checks the first inode, which is the file, and in this case, the third layer allows access to this file. So the access is okay only if all the other layers also allow access to this file for the same action, which is here a read-only action. So the kernel walks to the parent directory, here Pictures, and it checks that the second layer also allows read-only access to this file.
It then continues to walk to the parent directory and finds that this directory, /home/user, is also allowed to be read. So the kernel knows that all three layers allow this action on this specific path. The kernel doesn't need to continue walking up to the parent directories, and the action is allowed.

Now let's look at another angle: the policy hierarchy. Let's say there is a first process, P1, that creates a new child, P2. None of these processes are sandboxed, but now P1 wants to sandbox itself, so it creates a new sandbox. In this case, we can see that P2 is not sandboxed because, well, it existed before the P1 sandbox. So the process hierarchy is not the same as the sandbox hierarchy. Then P1 wants to create a new process, P3, which inherits the P1 sandbox, but P3 can also create its own sandbox domain. And this doesn't mean that it will escape the P1 sandbox, but that it will gain an additional sandbox. Then if P3 creates a new process, P4, this P4 process will automatically inherit the P1 sandbox domain and the P3 sandbox domain. So all these restrictions will apply to P3 and P4.

Because Landlock sandboxes can be nested, the kernel must make sure that the sandbox boundaries cannot be crossed in a way that would lead to a privilege escalation. To make this simple and effective, Landlock checks sandbox hierarchies when a sandboxed process requests access to a process from another sandbox, which means through ptrace, which is a debug feature. Only a process pertaining to a parent sandbox, or to no sandbox at all, can access processes from a child sandbox, which means processes with the same or fewer privileges. In this example, P2 can ptrace P1, P3, and P4. P1 can ptrace P3 and P4, but not P2. And P3 and P4 can ptrace each other, but not P1 nor P2. It also applies to special procfs paths.

A new access control mechanism should come with guarantees.
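Going back to the picture viewer example, the layered file check, where the kernel walks from the file toward the root until every layer has granted the requested access, can be sketched in user-space C. This is a toy model under stated assumptions: it identifies files by absolute path strings instead of tagged inodes, numbers layers from 0, and every name in it (struct rule, access_allowed, the ACCESS_ macros) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ACCESS_READ  (1u << 0)
#define ACCESS_WRITE (1u << 1)

/* One rule: `layer` allows `allowed` on `path` and everything beneath it. */
struct rule {
    unsigned layer;
    const char *path;
    unsigned allowed;
};

/* Strip the last component in place: "/a/b" -> "/a", "/a" -> "/".
 * Returns false when the path is already the root. */
static bool to_parent(char *path)
{
    if (strcmp(path, "/") == 0)
        return false;
    char *slash = strrchr(path, '/');
    if (slash == path)
        path[1] = '\0';
    else
        *slash = '\0';
    return true;
}

/* Allow `request` on `path` only if every one of the `num_layers` layers
 * has a rule granting it on the file itself or on one of its parents. */
bool access_allowed(const struct rule *rules, size_t num_rules,
                    unsigned num_layers, const char *path, unsigned request)
{
    char cur[256];
    assert(num_layers < 32);
    if (path[0] != '/' || strlen(path) >= sizeof(cur))
        return false;
    strcpy(cur, path);

    const unsigned all_layers = (1u << num_layers) - 1;
    unsigned granted = 0; /* bit N set: layer N has granted the request */
    do {
        for (size_t i = 0; i < num_rules; i++) {
            if (rules[i].layer < num_layers &&
                strcmp(rules[i].path, cur) == 0 &&
                (rules[i].allowed & request) == request)
                granted |= 1u << rules[i].layer;
        }
        if (granted == all_layers)
            return true; /* every layer agreed: stop walking early */
    } while (to_parent(cur));
    return false; /* root reached and at least one layer never agreed */
}
```

With a rule for /home/user in the first layer, one for /home/user/Pictures in the second, and one for the picture file in the third, a read request on the picture succeeds after three steps of the walk, while a write request fails because only the first layer ever grants it.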
We use the kselftest harness framework, which comes from seccomp and which we made more generally available to other users. We worked on test cases to check the different access control types and to make sure that all the relevant kernel code is covered. As a result, there are twice as many lines of test code as lines of kernel code. The code not covered only deals with internal kernel errors, for example memory allocation, and race conditions.

Now let's see another way to test the kernel: kernel fuzzing. syzkaller is an unsupervised, coverage-guided kernel fuzzer. More than just carefully writing and reviewing the code, we extended the syzkaller fuzzer to get a decent and meaningful code coverage of Landlock. As a proof of effectiveness, it led to a bug discovery, independently fixed, in a patch series. We added the Landlock system calls and extended some specific system calls related to the filesystem or ptrace. I added tests to help it discover specific Landlock kernel code and reach a good coverage, which is really hard to improve further. And finally, we checked that it can indeed find bugs. Thanks, Dmitry, for your help.

The current state of Landlock in mainline is a minimum viable product. The idea was to upstream the core part of Landlock that makes it both useful and still as simple as possible. There are then filesystem limitations, to avoid potential policy bypass because of policy composition. In a sandbox, file reparenting is denied, which means renaming or linking a file to a different parent directory. Also, filesystem topology changes are denied, like for example doing a mount or changing the root directory with pivot_root. However, the chroot syscall is still allowed. This is still okay for generic runtime environments.

Because of the unprivileged approach, there are design limitations. Unprivileged access control cannot restrict everything: for example, not all processes and not the kernel. Then, [inaudible].
Current LSM hooks need to be updated to bring more access control types to Landlock. For example, there are a lot of inode-based hooks, but not so many path-based hooks, which are the ones used by Landlock. However, to fill the current limitations, seccomp-bpf can help to complete a sandbox.

Let's see the current state of the roadmap. In the short term, we need to improve kernel performance for the current features and to add the ability to change the parent directory of files, like we saw before. In the medium term, we want to add audit features to ease debugging, to extend the filesystem access control types to address the current limitations, and to add the ability to follow a deny-listing approach, which is required for some use cases. In the long term, we'd like to add minimal network access control types to build application firewalls, and to add the ability to create file descriptor capabilities compatible with FreeBSD's Capsicum.

Okay, let's wrap up. Landlock is designed to be inclusive, which means it is an unprivileged access control system available to any process and any user, and safe to use, from the design to the implementation, with a lot of tests. Any process should be able to protect user data or even system data, considering some implementation constraints. In a nutshell, Landlock is a minimal but extensible interface to create sandboxes. Please feel free to ask any questions on the chat or on the mailing list, and you can take a look at the website to find more information. Thank you.