Hello, everybody. Welcome to this presentation about extending systemd security features with eBPF. I'm Mauricio Vazquez and I work for Microsoft.

Before I talk about the new eBPF-based security features that we introduced in systemd, I want to talk about the eBPF-based features that systemd already supported before our work.

The first one was the IP firewall. This feature allows us to count IP packets and to allow or deny them based on their IP addresses. Those programs were implemented by writing the eBPF code directly in assembly: there was a kind of custom compiler that translated the configuration provided by the user into eBPF assembly code.

Another eBPF-related feature in systemd was the support for custom programs, also for the IP firewall. In this case the user implements the eBPF program, compiles it, and is also in charge of loading and pinning it. The only thing passed to systemd is the path of those pinned programs in the BPF file system.

Some months ago, systemd got support for libbpf. A new property called AllowBindPort was implemented, and it was the first property implemented using the libbpf support in systemd. With this property, users can restrict the set of ports that an application can bind to. This was done by Julia from Facebook and it is already released: it is available in the latest systemd release, version 249. The nice thing about the work that Julia did is that it also introduced support for libbpf in systemd.
So it was not only about implementing AllowBindPort: a whole framework for implementing new eBPF-based functionality on top of libbpf was merged into systemd.

Why is this support relevant? The first important thing for us is that it makes it much easier to implement new eBPF-based features, mainly because with libbpf we don't have to worry about writing the code in assembly. We can write the code in C and use Clang to compile it to eBPF. This is the same as in other areas of programming, where we usually write code in a high-level language and use a compiler to generate the assembly code. In the specific case of eBPF, there are not many people who are able to write eBPF assembly code, it is slower to develop functionality in assembly, and it is also very difficult to maintain. In summary, developing the programs in C is much easier and faster for us.

The other important factor about libbpf is that we have the full libbpf API available. By using this library we can handle the different BPF objects in the kernel. For instance, we can create them, we can update, delete and get the elements in a map, and we can load programs into the kernel and attach them to the different hook points. There are many functions in the libbpf API that are now available in systemd.

The next part may be a little obvious, but I want to make clear why we chose libbpf. libbpf is the official kernel library for eBPF, and it is actually developed in the kernel source tree. This is important for us because each time there is a new eBPF-related kernel feature, support for it is implemented in libbpf. So we can say that libbpf is always up to date with the latest eBPF features in the Linux kernel.
Some of those features are especially interesting for us, given the implementation that we did for systemd. The first is Compile Once, Run Everywhere (CO-RE). This technology allows us to compile the eBPF program while we compile the systemd binary, and then just deploy that program on the different target machines without having to recompile it. I will explain a little more about that in a second.

The second feature that was interesting for us is the BPF skeleton. The BPF skeleton is a mechanism implemented in bpftool. With it we can generate, from a BPF object file, a C representation of all the elements inside that file. For instance, when we wrote the BPF program we defined some maps, some programs, and so on, and we can handle all of those through the skeleton. This simplification again makes it easier to implement and handle the eBPF programs, maps, and so on.

Okay, let's go into the details of the properties that we implemented. We did two of them. The first one is RestrictNetworkInterfaces. As the name implies, it restricts the network interfaces that processes can access. We support both an allow list and a deny list. For instance, this could be used when we know that there is a service that shouldn't be accessing the internet. In that case we could either deny the traffic on the internet-facing network interface, or lock that service to the loopback interface. So we have different options. The idea is that if we know the interfaces that a systemd service should be using, we can limit it to those interfaces and be sure that the service is not able to use any other network interface on the system.
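As an illustration of the allow-list and deny-list semantics just described, here is a minimal sketch in plain userspace C. It is not the systemd or eBPF code: the struct iface_policy type and the iface_permitted function are hypothetical names introduced only for this example, and interface indexes are passed in as plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one service's interface policy (not systemd code). */
struct iface_policy {
    bool is_allow_list;        /* true: only listed interfaces pass; false: listed ones are blocked */
    const uint32_t *ifindexes; /* interface indexes named in the list */
    size_t count;              /* number of entries in ifindexes */
};

/* Returns true if ifindex appears in the policy's list. */
static bool iface_listed(const struct iface_policy *p, uint32_t ifindex)
{
    for (size_t i = 0; i < p->count; i++)
        if (p->ifindexes[i] == ifindex)
            return true;
    return false;
}

/* Returns true if a packet on ifindex should pass, false if it should be dropped. */
static bool iface_permitted(const struct iface_policy *p, uint32_t ifindex)
{
    assert(p != NULL);
    bool listed = iface_listed(p, ifindex);
    /* Allow list: pass only listed interfaces. Deny list: pass everything else. */
    return p->is_allow_list ? listed : !listed;
}
```

With a list containing only the internet-facing interface, an allow list passes traffic on that interface and drops loopback traffic, while a deny list built from the same entry does the opposite.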
Regarding the implementation of this feature, we did it by attaching two eBPF programs to the cgroup ingress and egress hooks of the systemd service. What is nice about these per-cgroup eBPF programs is that we can keep separate eBPF programs for each systemd service. So we have two different eBPF programs for each systemd service that uses this feature. From the implementation point of view, we don't have to handle the logic of working out which cgroup the application is running in, because each time the programs are executed that cgroup is already implicit.

To determine whether a network interface is allowed or not, we have a hash map where we save the network interface indexes. If the interface is not allowed, we just drop the packet.

The size of this program is rather small: something like 50 lines of C code. I want to mention that the PR that implemented this is a little big, but the piece that does the real eBPF work is very small. The logic around it is there to load the programs, populate the maps and so on. The logic inside the eBPF program itself is very, very small.

For this feature we require kernel version 5.7. This is because we decided to use eBPF links for handling the systemd daemon-reload and daemon-reexec. Because we get a file descriptor for the link, we don't have to worry about unloading the program, detaching the program and so on. We can just pass the descriptor for the link from one systemd instance to the other.

The support is already merged. It was merged some months ago and will be available in the next systemd release, version 250.

Okay, it is time for a quick demonstration of how we can use this. In my system, I have two different network interfaces.
I have the loopback one and another that I use to reach the internet. To create a temporary systemd unit I'm going to use systemd-run, and I'm going to pass the RestrictNetworkInterfaces property to show you how it works.

Let's start by creating a service that is only able to access the interface that is used to reach the internet. In this case the ping works: it goes to the public internet through the interface that we use to reach the internet. Everything works as it should. If I try to ping localhost with the same rule, it is not going to work, because those packets go to the loopback interface but the restriction only allows the enp0s3 interface. If I change this to the loopback interface, it starts working again.

Another thing that I can do is use a deny list. With this syntax I am saying that the service can use any network interface on the host except enp0s3. In this case the ping should work because it uses an interface other than enp0s3. This is useful, for instance, to keep services from reaching the internet.

What else can I do? I can define a list of allowed interfaces. If I try to ping the loopback it works, and if I try to ping the internet it also works, because both interfaces are allowed. Finally, I can also use the deny approach with a list. In this case it says that none of those interfaces is allowed. This ping does not work, and the same happens for this one, which also does not work.

Let's go to the second property that we implemented. This is again about security. This time it is for restricting the types of file systems that processes are able to access. We also support an allow list or a deny list in this case.
This feature provides an additional security layer because we are able to restrict processes from accessing dangerous file systems. One example use case: if we have a service that should only be accessing files on a given partition, we can restrict the file system types that this service is able to access to the file system type of that partition. Also, if we know that other services don't require these dangerous file systems, we can prevent those services from using them. Some examples of dangerous file systems are [unclear], tracefs and so on.

Regarding the implementation, this uses Linux Security Modules (LSM) together with eBPF. These are the traditional LSM hooks, but this time implemented with eBPF. What I mean is that we can use eBPF to decide what to do once the hook is invoked: we can do a lookup in a map, or implement whatever logic we want to make the decision. In this specific case we use the file_open hook, so each time a process tries to open a file we run our eBPF program, and based on some logic we decide whether the process is able to access the file or not.

LSM eBPF programs are not cgroup-aware, which means that we have to keep a single global program on the host. Because of that we have to keep a global eBPF map of maps where we store the magic numbers of the file systems, indexed by cgroup ID. The outer map is accessed by the cgroup ID, and we have a helper to get the ID of the cgroup where the application is running. In the inner map we have a list of all the allowed or denied file system magic numbers, and I will show you in a second how we are able to access those magic numbers.
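The lookup just described (an outer map keyed by cgroup ID, an inner list of magic numbers) can be modeled in plain userspace C roughly as follows. This is only a sketch under assumptions, not the actual eBPF program: arrays stand in for the BPF maps, the names fs_policy and check_file_open are hypothetical, the allow or deny mode is assumed to be stored next to each list, a cgroup with no entry is assumed to be unrestricted, and returning -EPERM for a denied open is also an assumption. The three magic values are the ones defined in the kernel's linux/magic.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* File system magic numbers, values as in linux/magic.h. */
#define MAGIC_EXT4  0xEF53u
#define MAGIC_TMPFS 0x01021994u
#define MAGIC_PROC  0x9fa0u

/* Stand-in for one outer-map entry: a cgroup ID and its inner list. */
struct fs_policy {
    uint64_t cgroup_id;     /* key of the outer map */
    bool is_allow_list;     /* assumption: the mode is stored with the list */
    const uint32_t *magics; /* inner "map": listed magic numbers */
    size_t count;           /* number of entries in magics */
};

/* Outer lookup: find the policy for a cgroup, or NULL if it has none. */
static const struct fs_policy *lookup_policy(const struct fs_policy *table,
                                             size_t n, uint64_t cgroup_id)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].cgroup_id == cgroup_id)
            return &table[i];
    return NULL;
}

/* Returns 0 to allow the open, -EPERM to refuse it. */
static int check_file_open(const struct fs_policy *table, size_t n,
                           uint64_t cgroup_id, uint32_t magic)
{
    assert(table != NULL);
    const struct fs_policy *p = lookup_policy(table, n, cgroup_id);
    if (!p)
        return 0; /* assumption: no entry means no restriction */

    bool listed = false;
    for (size_t i = 0; i < p->count; i++)
        if (p->magics[i] == magic)
            listed = true;

    if (p->is_allow_list)
        return listed ? 0 : -EPERM; /* only listed file systems pass */
    return listed ? -EPERM : 0;     /* listed file systems are refused */
}
```

With an allow list of ext4 and tmpfs for one cgroup, an open on tmpfs passes, an open on proc is refused, and a process in any other cgroup is not affected.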
The program receives this struct file. This is an internal kernel structure, so we have to use BPF CO-RE to read the magic number: we have the file, and we follow the different members to find the magic number of the file system that is being accessed. The important part here is that the layout of this structure could change between kernel versions, and that is why we use CO-RE: even if the layout changes from one kernel version to another, libbpf will be able to calculate the specific offsets before loading the program into the kernel.

Again, this program is rather small, just 66 lines of code. The kernel requirements are very similar to our previous property: version 5.7, but this time we also need this compilation flag. Either the kernel is built with bpf included in this compilation flag, or we have to boot the kernel with this parameter, including bpf in the list of Linux security modules that are used.

Regarding the status of this PR, we already have one approval and we are waiting for the other reviewers. There has been a lot of interaction there, a lot of comments and a lot of reworking done by Iago, and I think this is going to be merged for the next release, because it is almost ready.

Okay, it is time for another demonstration, this time of this feature. I have enabled the BPF security module in the kernel boot parameters. It is important to make sure this parameter is set to be able to try this feature. I'm going to create a file in the temporary folder just to test. Let me check that we are able to access that file from the host. Yes, this works fine. Again I'm going to use systemd-run to create a temporary service, but this time with the RestrictFileSystems property. For instance, let's say that we restrict it to ext4 only, and then I'm going to cat the contents of that file. As you can see,
here we got an "operation not permitted" error message while trying to access that file. If we look at the output of the mount command, we can see that the /tmp folder is of type tmpfs, so this is the reason why it is not working. We can add tmpfs here, and you can see that everything works fine: we were able to access the contents of the file.

For instance, if I try to get the command line of the process, again there is a problem because proc is a different file system. To be able to do that I should add proc there, and then it works. I could also, for example, say that the service can use whatever file system it wants except proc. Then if I try to read the file on the tmpfs file system it works, but if I try to read something on procfs it does not. So by using this feature we are able to restrict the file systems that processes can access.

If you want to know more details about the work that we did, there are some links in the presentation. The first one is the post where we include all the details of the implementation. We also have some pointers to the code, and here we have pointers to the different topics that I covered in this presentation.

I think this is all. Thank you very much for your time. I'm happy to take any questions that you may have. Thank you, bye.