Okay. Hello everyone, I'm Luigi Leonardi, a PhD student in computer engineering at the University of Pisa. Together with Professor Giuseppe Lettieri and Dr. Giacomo Pellicci, we developed a framework that we call eBPF-based Extensible Paravirtualization.

This is the outline of today's presentation. I will start with a few words on paravirtualization, although I'm sure you all know paravirtualization better than I do. Then I will say a few words on eBPF and what this technology is, and we will see what hyper-upcalls are. After that we will move on to our framework, eBPF-based Extensible Paravirtualization, and then we will look at a use case of the framework together with some results. At the end there will be time for Q&A.

So let's begin. I guess you all know paravirtualization, so this is just a quick look at how it can be achieved. There are several techniques, each with its own pros and cons. The most important one is hypercalls, which are much like system calls, but for the hypervisor. The main advantage is that we can do almost anything with hypercalls, but on the other hand we have to modify the guest kernel. Something else to take into consideration is that the guest operating system knows that it is being virtualized; this is not an issue per se, but it is something to keep in mind. Hypercalls are also quite expensive, because they imply a context switch from the guest operating system to the hypervisor.

Another technique is VM introspection. The hypervisor has full access to the virtual machine memory, so it can access all the guest data structures. With this technique we do not need to modify the kernel, and the guest kernel does not know that it is being virtualized. On the other hand, techniques like KASLR (kernel address space layout randomization) or confidential computing defeat this approach, because the guest memory is simply not understandable from the outside. Another technique, which we will see in a moment, is hyper-upcalls; it is actually similar to VM introspection.

So what is eBPF? BPF stands for Berkeley Packet Filter, so it started as a packet filter many years ago; it has since been extended, and the "e" stands for extended. Today it is a flexible and efficient technology composed of an instruction set, storage objects (maps), and helper functions. It is a very useful debugging tool because it enables dynamic tracing: we are able to insert tracing points into live software with almost zero overhead. And the interesting thing is that it is supported in mainline Linux.

As I said, eBPF has its own instruction set. It is a reduced instruction set, and for this reason it can be formally verified: we can verify all the code that is going to be executed. The kernel supports all of this, so the kernel does not need to be modified, because the support is already in mainline Linux.

So what does the eBPF workflow look like? eBPF code is written in a restricted, C-like language and is then compiled into bytecode, for instance with Clang, which supports an eBPF target. Inside the Linux kernel there is a just-in-time compiler, or alternatively an interpreter, and this works on almost every architecture. Once the bytecode has been generated, it is presented to the kernel to be verified. This step is quite important, because we are injecting something into the kernel, so we must be sure that it does not do any harm to the operating system. If the verification is successful, the code is loaded inside the kernel, and then we can collect debug information, statistics, and so on, as we will see in a few minutes.
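Just to make the workflow concrete, here is a minimal sketch of what such a program can look like in restricted C and how it is typically compiled to bytecode with Clang. This is purely illustrative and not code from our framework; the probed function is an arbitrary example.

```c
/* Illustrative sketch of a tiny eBPF tracing program in restricted C,
 * using libbpf-style annotations. The probed symbol is only an example.
 * A typical build step is:
 *     clang -O2 -g -target bpf -c probe.bpf.c -o probe.bpf.o
 * The resulting bytecode is then handed to the kernel, verified, and
 * JIT-compiled (or interpreted) before it is allowed to run. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("kprobe/do_sys_openat2")     /* attach point: an example kernel function */
int trace_open(void *ctx)
{
    /* the in-kernel verifier checks memory accesses, helper usage and
     * bounded execution before this program is accepted */
    bpf_printk("openat() invoked\n");
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```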
So what are hyper-upcalls? As I said, with VM introspection we have full access to the virtual machine memory, but the hypervisor does not really know where the guest data structures are located or how to access them, because of different kernel versions and so on. The idea here is that the guest sends a message to the hypervisor, registering these hyper-upcalls. What does this message contain? Two things: the first is some code, the second is a set of references. The code is what the hypervisor will run in order to access the guest data structures: we are telling the hypervisor how to handle them. Of course there is a security issue, because we are sending code to the hypervisor to be executed, so it needs to be verified, and eBPF is used here as well. If the verification is successful, the registration is complete and everything works: whenever the hypervisor wants to know something about the guest operating system, it can invoke the hyper-upcall. There is no context switch between the guest and the host, so this is very fast, and the code is verified. But, as I said before, this does not work in a confidential computing environment.

So, moving on to our system. The idea we implemented is that the host sends a message to the guest. We have a guest agent that receives these messages and consumes them according to some information contained in the message itself. As you may have guessed, the message contains some eBPF code: we are injecting eBPF code into the guest. As the hypervisor we use QEMU/KVM, which you probably know, and the message is delivered through a virtual device. Because we have a virtual device, we also developed a device driver; it is a simple kernel module, so, like any module, it can be loaded and unloaded at any time, and the eBPF code can be loaded and unloaded at any time as well. So we do not need to modify the guest kernel at all.

One interesting aspect is that the guest is free to decide whether to load the eBPF program or not. What does this mean? When the guest agent receives the message, it receives the eBPF ELF file and can analyze it, so it can see what the program will do. For instance, the program may contain a kprobe, and with a kprobe you have to specify which function will be probed, so the agent can keep a list of allowed functions. In other words, we can implement a policy. We did not implement any particular one, for the sake of simplicity, but it can be done, so the guest also has the possibility to enforce some policy; a small sketch of what such a check could look like follows in a moment.

So how does our system compare to hyper-upcalls? We both send messages, but in our case the message goes from the host to the guest, while hyper-upcalls work the other way around: the guest sends something to the host. We both use eBPF, mainly for the verifiability property of this technology. One interesting difference is the response time. With hyper-upcalls everything is performed by the hypervisor, so there is no context switch between hypervisor and guest, which is quite fast; it also means the hypervisor does not need the help of the guest operating system, which may be busy doing something else. In our system, on the other hand, we do need the help of the guest operating system, because the eBPF code runs inside the guest. So we have a context switch, because we need to pass information from the guest to the host, and the response is, let's say, asynchronous.
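About the policy check I mentioned: here is a minimal sketch of how a guest agent could vet the eBPF ELF it receives from the host before loading it. It assumes a libbpf-based agent, and the allowlist and function names are illustrative assumptions, not our actual implementation.

```c
/* Hedged sketch: a guest agent vets an eBPF object received from the host by
 * checking which kernel functions its programs want to probe.
 * Assumes libbpf; allowed[] and the section names are illustrative only. */
#include <stdio.h>
#include <string.h>
#include <bpf/libbpf.h>

static const char *allowed[] = { "kprobe/sched_setaffinity", NULL };

static int section_allowed(const char *sec)
{
    for (int i = 0; allowed[i]; i++)
        if (strcmp(sec, allowed[i]) == 0)
            return 1;
    return 0;
}

int vet_and_load(const void *elf_buf, size_t elf_len)
{
    struct bpf_object *obj = bpf_object__open_mem(elf_buf, elf_len, NULL);
    struct bpf_program *prog;

    if (!obj)
        return -1;

    /* guest-side policy: refuse programs that probe functions not on the list */
    bpf_object__for_each_program(prog, obj) {
        const char *sec = bpf_program__section_name(prog);
        if (!section_allowed(sec)) {
            fprintf(stderr, "policy: rejecting program in section %s\n", sec);
            bpf_object__close(obj);
            return -1;
        }
    }

    /* the in-kernel verifier still runs on load; this policy is an extra layer */
    if (bpf_object__load(obj)) {
        bpf_object__close(obj);
        return -1;
    }
    bpf_object__for_each_program(prog, obj) {
        if (!bpf_program__attach(prog)) {
            bpf_object__close(obj);
            return -1;
        }
    }
    return 0;
}
```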
So let's see a use case of our technology: virtual-to-physical CPU affinity. You may ask yourself why. The first reason is speed: we all want our applications to run as fast as possible. Then you could ask: why not achieve CPU affinity with a static allocation? Of course you can, but that is not flexible. With our approach the pinning is performed on the fly: it is there just when you need it, and when it is not needed anymore it can be unloaded, since, as I said, eBPF code can be loaded and unloaded at any time. Last but not least, there is security: maybe you want to perform some pinning for security reasons.

So what is the issue here? What we want to achieve is the picture on the right. We have a guest thread that asked to be pinned to vCPU 0 ("v" stands for virtual, "p" for physical), and we want that vCPU to be pinned to one of the available physical CPUs, for instance pCPU 0. But what actually happens is the picture on the left: the guest thread is pinned to vCPU 0, but a vCPU, inside the host system, is just a process, so it can be scheduled on any of the available physical CPUs, because nobody told the host to perform this binding. This is the semantic gap.

So how can we overcome this issue with our framework? The host sends a message to the guest, using the technique I explained before, containing some eBPF code with a kprobe on the sched_setaffinity function, which is the kernel function invoked when the corresponding system call is issued. So every time this function is called, it means that someone invoked the system call, in other words that someone wants to be pinned to a specific CPU, and our eBPF code runs. The eBPF code writes into an eBPF map, a data structure that can be accessed both from user space and, from the kprobe, in kernel space. Then the guest agent, which is constantly checking for changes in this eBPF map, picks the information up and sends it back to the hypervisor, which can then act according to some internal policy: it can do the binding, it can do whatever it wants. How do the guest agent and the virtual device communicate? Simply with the ioctl system call.
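To give an idea of what the guest-side program could look like, here is a minimal sketch of a kprobe that records affinity requests in an eBPF map. The map name and layout are illustrative assumptions, not our exact code; a real program would also record which CPUs were requested.

```c
/* Hedged sketch of the guest-side eBPF program: a kprobe on sched_setaffinity
 * notes which thread asked to change its affinity, in a map that the
 * user-space guest agent can poll. Map name and layout are illustrative. */
#include <linux/bpf.h>
#include <linux/ptrace.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, __u32);      /* guest thread id */
    __type(value, __u64);    /* placeholder for the requested affinity */
} affinity_requests SEC(".maps");

SEC("kprobe/sched_setaffinity")
int on_setaffinity(struct pt_regs *ctx)
{
    __u32 tid = (__u32)bpf_get_current_pid_tgid();
    __u64 marker = 1;

    /* record that this thread issued sched_setaffinity(); the guest agent
     * polls this map and forwards the event to the hypervisor */
    bpf_map_update_elem(&affinity_requests, &tid, &marker, BPF_ANY);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```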
We then performed some experiments. We have a single-producer, single-consumer lockless queue, so a producer and a consumer exchanging messages, and we measure the throughput in millions of packets per second on the Y axis. Starting from the plot on the left, there are two bars: the red bar is the throughput when vCPU pinning is disabled, the green bar when vCPU pinning is enabled. There are two sets of bars because we consider a no-load and a low-load scenario; as the load we used a simple CPU-bound program whose output was redirected to /dev/null. As you can see, when there is no load on the system there is no noticeable difference between using vCPU pinning and not using it, which is something we actually expected. In the low-load scenario, instead, there is some difference. Why? Because there is some load on the system, and what the Linux scheduler tries to do is balance the load among all the available cores. For this reason the application is moved from one core to another, losing cached data and thus reducing the overall throughput; this is called CPU hopping. With vCPU pinning in place this simply cannot happen, because the scheduler is not allowed to move the process from one CPU to another.

Moving to the plot on the right, here we are in a high-load scenario and we consider a different issue: serialization. As I said, we are working on a lockless queue, so we have a producer and a consumer that both want to do as much work as possible. If they are scheduled on the same core, and without vCPU pinning in place nobody prevents this from happening, they compete for the same resource, which reduces the throughput because one has to wait for the other to complete. Going from left to right, the serialization time goes up from 0% to 20%, and as you can see the overall throughput goes down, while with vCPU pinning in place, the green bar, the throughput stays quite high. One interesting thing to point out is that even at 0% serialization time the red bar, so the throughput without pinning, is still lower than the green bar. Why? For the same reason as before: we are in a high-load scenario, so the process is moved from one CPU to another, just like before.

Something else we considered is hyperthreading, or simultaneous multithreading if you prefer. CPUs may be numbered differently on a virtual machine and on a physical machine, so here we have another semantic gap: maybe we want to pin two threads on two different hyperthreads of the same core, but without any remapping they can end up pinned to two different cores. So this is a semantic gap, and a remapping has to be performed; we implemented this. We ran the same experiments as before, considering no-load, low-load, and high-load scenarios. Just like before, the red bar is when CPU pinning with hyperthread remapping is disabled and the green bar is when it is enabled. As you can see in each of the cases, looking at the two plots starting from the left, the overall throughput is higher when the CPU pinning is in place.
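Just to make the remapping idea concrete, here is a small hedged sketch of how the host side could look up hyperthread siblings from sysfs, so that two vCPUs that are siblings in the guest can be pinned to the two threads of the same physical core. The helper and the parsing are illustrative, not our exact implementation.

```c
/* Hedged sketch (host side): find the hyperthread sibling of a physical CPU
 * by reading the standard sysfs topology file. Assumes the common two-entry
 * "a,b" or "a-b" format of thread_siblings_list. Illustrative only. */
#include <stdio.h>

int hyperthread_sibling(int cpu)
{
    char path[128];
    int a, b;
    FILE *f;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list", cpu);
    f = fopen(path, "r");
    if (!f)
        return -1;
    /* parse "0-1" or "0,8": two CPU ids separated by '-' or ',' */
    if (fscanf(f, "%d%*[-,]%d", &a, &b) != 2) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return (a == cpu) ? b : a;
}
```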
Last but not least, we ran the same experiments on a physical machine and on a virtual machine, with and without vCPU pinning in place. The red bar is the physical machine, the green bar the virtual machine with vCPU pinning, and the blue bar the virtual machine without vCPU pinning. As you can see, there is no noticeable difference between running on a physical and on a virtual machine in terms of overall throughput, but there is a remarkable difference without vCPU pinning, both with load and without load. So this means that hardware-assisted virtualization actually works quite well.

This concludes my presentation; I hope it was all clear. Here you can find my email, my LinkedIn account, and the GitHub page of this project, of which I am actually the maintainer right now. Thank you. If you have any questions, I'm here and I will be happy to answer.

You mentioned confidential computing. Confidential computing has some sort of boundary, a trust model around the guest, and you also talked about eBPF, which is also a matter of isolation. How do you reconcile the guarantees you want from confidential computing with the things you might get from eBPF? In other words, could you ask the guest to hand over confidential information through what would be a perfectly valid eBPF program, in a way that violates confidentiality?

So the question is how we can deal with confidential computing and eBPF together. Honestly, I don't know; it is a very tough question. It is all about trust, and it depends on your system. With eBPF you can verify the code, so you can be sure that it does not do anything bad, but at the same time you may be giving away confidential information. So I think it depends on the application.

My thought on this is something like ACPI: when the machine starts up, the host can give it ACPI tables that include code to be executed. Is that a possible use case for this? I am trying to think of when you would want the host to give the guest code to execute, and that seems to be one of the main ones. Did you think about ACPI and how it works?

So the question is whether we considered ACPI as a possible use case. No, we did not, but it could be an idea. We only considered this one use case, but the framework can be extended to many possible applications, and this could be one of them. It is a powerful framework, I think.

Can you speak louder, please? What about the security implications of sending these eBPF programs? Generally, aren't they limited to the root user?

So the question is about privileges. Yes, we are using root privileges: we are probing kernel functions, so of course we need root privileges. The programs are sent from the host to the guest, so all this code must be verified, and that is performed by the kernel before loading it. And, as I said before, the guest agent can also, let me show you the slide, enforce some policies: if it sees a kprobe on a specific function it can say, no, I don't want this to happen. So we can also have these security measures.

Something else: one common use case for an eBPF program is that you collect some data, populate a map with it, and then later something in user space reads the map and does something with it. Have you thought about how that use case would apply to what you are doing here? Is there some way the host can get access to what is being collected in the guest maps?

I don't think I understood the question.

Normally, with eBPF, you have something in user space that injects an eBPF program, the program collects data, and you later read it out from the maps. But here it is the host that injects an eBPF program into the guest, so how would it later get the data that was collected in the guest?

Yes. So the question is how the hypervisor can get the information back from the guest. We have a guest agent that is constantly checking for changes inside the eBPF map, and when it finds any change it simply uses the ioctl system call on the virtual device.

Okay, but I would probably quite often want to multiplex that on the host, right? I might be collecting from, say, multiple VMs, rather than just one as in this case.

With multiple VMs, I don't know; that is a tough question.
I think we only considered one VM. You send this information to the virtual device and in that way the hypervisor has it; if you have multiple virtual machines, then you need to understand which VM the information is coming from. I don't know, I would have to think about it.

Very nice. You have a virtual device, so...

That is a possibility, yes. Any other questions? Oh, from the back.

How do you deal with the maintenance nightmare that would arise if...

Please speak louder.

How do you deal with the guest having to keep, like, the same functions and all?

Sorry, I didn't hear the first part.

How do you deal with the guest possibly changing things, for instance when you upgrade the guest OS?

Ah, so the question is how we deal with maintenance. That's a nightmare, you said it correctly, a real nightmare, because not all the functions we rely on, helpers included, are a stable ABI, so this is an issue. We did not think about a possible solution, so if you have one, please tell me. The only choice would be to rely on a stable ABI, that is, on functions that will not be modified, but I think there is no obvious solution here. The other point is verifiability: do you trust code sent by someone else? That is the whole point: you can verify it, and then you are free to load it or not.

So why don't you use hypercalls? Well, we could use hypercalls, of course, but then we would have to modify the guest kernel.

Please.

With hypercalls you also have to change the host kernel. Having a stable hypercall API is sometimes problematic, and you have to keep host and guest synchronized. Basically, I think this pushes the problem of the stable API down to the Linux kernel interfaces, which you have to deal with anyway. The thing I like about this is that I don't have to care about it: I just punt it to the existing infrastructure. The problem of the stability of these interfaces has been there for years, and in practice people keep using the same scripts and do not really care that they are not technically stable. Of course it needs a lot of verification on both sides, that's the point.

Thank you. We are running out of time; is there one last question? I guess not. Thank you again for listening, and I will be around for two more days, so if you have any questions please feel free to ask. Okay, thank you. Thank you.