Hello everyone, my name is Andrea, and today I'm going to walk you through a small part of the rust-vmm security journey. For those of you who don't know about rust-vmm, I'll give a really quick intro, because we don't have a lot of time. rust-vmm is an open source project that provides virtualization components written in Rust. The focus of the project is to provide high quality virtualization components, and sometimes this comes at the expense of features. We are also looking at providing code and components that are easy to extend and easy to use, and that is because we are serving different customers that have different needs. The main rust-vmm customers are virtual machine monitors such as Cloud Hypervisor and Firecracker. Now, just to give you a few examples of the components that we have in rust-vmm. We offer hypervisor support through the kvm-ioctls and kvm-bindings crates, and we recently added the Microsoft Hyper-V crates. We also provide device implementations, and here we have a few categories. We have the legacy devices, which are implemented in vm-superio, and then we also offer some primitives for working with devices, such as the MMIO bus, the PIO bus, and device managers. Another important part of rust-vmm is the implementation of virtio devices, and here again we have primitives, if you wish: the virtio queues and the virtio device traits. We also have device implementations, such as the vhost-user devices, for example the vhost-user I2C device, and other primitives. These components are in various states: some of them are already published, and some of them are still being implemented. So with this short intro out of the way, we can go back to the security story. For us security is very important, and we try to apply security at multiple levels, starting with the organization setup and ending with operating in production. Now I'm going to walk you through these security levels and how we apply them.
In terms of organization setup, well, people have said it multiple times and I'm just going to say it again: we're called rust-vmm, so we're writing components in Rust, and this actually helps a lot, because writing in Rust already eliminates entire classes of vulnerabilities from your code. The other thing that we do is that we offer one Rust package, called a crate, per component. This helps because customers of rust-vmm can import only what they need, so they can reduce their code base by importing only the components that are actually needed. Another thing that we do is that all components run the same set of tests. We really want to make sure that all components are at the same level in terms of quality, and these tests include unit tests, builds, and linters. Auditing for security vulnerabilities in dependencies is also part of the testing, but I mention it as a separate thing because I want to talk a bit more about it, as it is important. Auditing for security vulnerabilities is done through cargo-audit, and the way this cargo-audit tool works is that it checks for vulnerabilities in the dependencies of a component. To do this check, it consults a Rust vulnerability database, the RustSec advisory database. If a vulnerability is added there, you are going to be alerted when you run cargo-audit. But there is a catch. We are running this at the rust-vmm level, but it is really important for customers of rust-vmm, and anyone else writing Rust code, to run these cargo-audit checks in their products as well. That is because rust-vmm provides library components, which means that we do not actually pin the dependency versions; the versions are only fixed when releasing binaries. So again, it's really important for customers to also check for vulnerabilities. In terms of development, we apply a few things. One of the things we aim for is to reduce the number of external dependencies.
And this whole topic can start a long debate, but let's just say that what we are trying to do is to use common dependencies that we can trust. Most of the rust-vmm components have dependencies on just a few crates, and these crates are libc and serde. serde is the serialization and deserialization crate that is widely used in Rust. We also have zero-dependency components, and here are a few examples of that. We have a Flattened Device Tree (FDT) implementation that has no dependencies. Then the crate that provides the implementation of the legacy devices has no dependencies, and the same goes for vm-device, which is the crate that provides device abstractions, let's call them: the bus implementations, the device managers, and so on. Another thing that is important during development, and especially before declaring a component to be production ready, is to add negative testing, and that is because we really want to test our assumptions. Another thing that I would recommend people to do when they are consuming rust-vmm, or anything else for that matter, is to run the unit tests and the integration tests of the component that you want to consume, because that's a safe way to check for assumptions that might not hold in your environment. And thirdly, we are trying to reduce the usage of unsafe code. Again, this is an important topic, so let's look at it a bit more in depth. The path of least resistance here is to just open a huge unsafe block and write everything in it, because it's easy: you don't have to check for anything, and you don't have to actually go in depth and look at what exactly is unsafe. We do not want to do that. What we do instead is that we limit the unsafe code and make sure that we only open unsafe blocks where they are needed.
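To make this concrete, here is a hypothetical sketch (not actual rust-vmm code) of the pattern: the unsafe surface is kept to a single small block, the precondition is checked in a safe wrapper, and a comment explains why the unsafe operation is sound:

```rust
/// Reads a `u32` from a byte buffer at `offset`, in little-endian order.
///
/// # Safety
///
/// The caller must guarantee that `offset + 4 <= buf.len()`; otherwise the
/// read goes out of bounds and the behavior is undefined.
pub unsafe fn read_u32_unchecked(buf: &[u8], offset: usize) -> u32 {
    // SAFETY: the caller guarantees `offset + 4 <= buf.len()`, so the pointer
    // arithmetic stays inside the buffer and the unaligned read covers 4
    // valid bytes.
    u32::from_le((buf.as_ptr().add(offset) as *const u32).read_unaligned())
}

/// Safe wrapper: validates the precondition, then opens the smallest
/// possible unsafe block.
pub fn read_u32(buf: &[u8], offset: usize) -> Option<u32> {
    if offset.checked_add(4)? > buf.len() {
        return None;
    }
    // SAFETY: we just checked that `offset + 4 <= buf.len()`.
    Some(unsafe { read_u32_unchecked(buf, offset) })
}
```

The `# Safety` doc section is exactly what the Rust linter asks for on public unsafe functions, which comes up again below.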
Then another thing that is really important is to check the return values of unsafe code blocks where this is possible, and to document why the way you are using the code in an unsafe block is actually safe. This documentation really reduces the risk of the code being misused, and that is only one advantage. The second advantage is that when you start writing this documentation, you really have to figure it out: you really have to understand why this code is actually safe. And who knows, maybe you're going to get a surprise and realize that the code is actually not safe. This next one is probably my favorite topic. One of the things that the Rust linter does is that it makes you document unsafe public functions: if you have a public unsafe function, you need to add a Safety section in which you document why this function is unsafe and how it can be used safely. And something that we added recently is threat model documentation. This one is particularly important because with the threat model documentation we also found a way to document expectations from consumer products. When writing the threat model documentation, we did a few things. We tried to understand what is trusted and what is untrusted, we tried to identify the actors that are involved in a component, and then we also tried to identify threats and mitigations. Now I'm going to give you an example of how I wrote the threat model documentation for the serial console. But in order to be able to do that, I first have to quickly explain how the serial console is implemented in rust-vmm, and I will just let you know upfront that this is overly simplified. We implement a simplified serial port with a 64-byte FIFO, and if we were to simplify the operational mode of the serial console, we would say that it is just receiving and transmitting data. We have three main areas of ownership here.
We have the VMM code, which has ownership of the serial input and the serial output. Then we have the serial code, which basically knows how to handle read and write operations. And then we have the driver code, which is just receiving and transmitting data. The host is considered trusted and the guest is considered untrusted. The way a read operation works is that the VMM takes the serial input and forwards it to the serial console through the enqueue_raw_bytes method that is defined on the serial console. What happens in the implementation of the console is that we take these bytes and add them to the FIFO, and this FIFO is consumed whenever there are read requests from the driver. For the write operation, what happens is that the driver just transmits data to the serial console; we receive the request and forward it directly to the serial output. So here we didn't actually implement a FIFO, we are just writing directly to the serial output. Now, you might already be thinking about a few problems with this implementation. But before I start, I just want to let you know that this threat model is already available in the rust-vmm vm-superio repository on GitHub, and that the threats we are talking about here are already fixed. The first problem is that a malicious guest could generate large memory allocations by flooding the serial input. This one already has a CVE allocated, and we had to fix it. First of all, the serial input can be forwarded to the serial console from untrusted parties, and that is why we consider this to be a security vulnerability. And the problem was that we did not have a limit on the input FIFO, so the input FIFO could have grown without bound. The fix was relatively straightforward.
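Conceptually, the fix looks something like the following sketch. This is hypothetical code with made-up names, not the actual vm-superio implementation: the input FIFO gets a fixed capacity, and enqueueing returns an error instead of growing without bound.

```rust
use std::collections::VecDeque;

/// Fixed capacity of the input FIFO, mirroring the 64-byte FIFO of the
/// simplified serial port described above.
const FIFO_SIZE: usize = 64;

#[derive(Debug, PartialEq)]
pub enum SerialError {
    /// The input FIFO is at capacity; the VMM should stop forwarding input
    /// until the driver drains the FIFO.
    FullFifo,
}

pub struct Serial {
    in_fifo: VecDeque<u8>,
}

impl Serial {
    pub fn new() -> Self {
        Serial { in_fifo: VecDeque::new() }
    }

    /// Queues input bytes for the guest driver, erroring out instead of
    /// letting the FIFO grow unboundedly.
    pub fn enqueue_raw_bytes(&mut self, bytes: &[u8]) -> Result<(), SerialError> {
        if self.in_fifo.len() + bytes.len() > FIFO_SIZE {
            return Err(SerialError::FullFifo);
        }
        self.in_fifo.extend(bytes.iter().copied());
        Ok(())
    }

    /// Models a driver read request consuming one byte from the FIFO.
    pub fn read_byte(&mut self) -> Option<u8> {
        self.in_fifo.pop_front()
    }
}
```

The important part is the error: it is the signal the VMM needs in order to cooperate with the mitigation, as described next.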
What we did is that we limited the size of the input FIFO, and we return an error whenever the FIFO is full. This fix also needs cooperation from the VMM, because at the VMM level we should check for FIFO-full errors so that we don't get spammed with events from the serial input. A second threat that we identified is that a malicious guest can fill up the host disk by generating a high amount of data to be written to the serial output. If you remember the previous slides, the serial output is in full control of the consumer; in the serial console implementation we don't have an output FIFO or anything like that. So the mitigation is only possible at the VMM level, and here we recommend the consumers of vm-superio to rate limit the output. For that, you can use a ring buffer, because it has a fixed size and will just overwrite data, or a named pipe, which again has a fixed size, so there is no opportunity for unlimited growth. Now, the way I imagined this presentation would look is that I would come here and say: I ran fuzzing and I discovered I-don't-know-how-many bugs. But actually, here I am telling you that I basically just looked at the code and discovered these bugs while reading it. There's an important lesson here. I had read the serial code multiple times but didn't identify this problem from the beginning; only when I actually got to read the code with security in mind, putting that security hat on, was I able to discover these problems. Another lesson for me was that it's really important to follow the input and the output and understand what is trusted, what is untrusted, and what the side effects are. In terms of fuzzing, for virtualization components things are a bit more complicated than for other kinds of code. And I'm going to be completely honest and tell you that continuous fuzzing is not yet implemented in rust-vmm.
What we're trying to do is component-based fuzzing, and this has a few advantages but also a few disadvantages. One of the advantages is that we are fuzzing library code, so it is relatively straightforward to pass the input from the fuzzer to the target interface. These components can also be tested in isolation, so you don't have to build up a whole VMM just to fuzz one tiny component. The other advantage is that you can directly test the low-level interface, and you don't have to go through multiple higher levels. Those are the advantages. The problem with this low-level testing is that testing side effects becomes harder, because you only know the effects on this particular component, and not what happens in the big picture. Another problem is that the issues you identify might not reproduce in, for example, a VMM, because maybe that malicious input can't actually reach the component there. One of the challenges with running fuzzing that we identified is also that mocking the driver code is not really straightforward. What we did so far is that we identified the target interfaces, and here we have the virtio queues and the device implementations. One of the first targets is the virtio block implementation. We also started writing a mock framework for virtio devices; this was partially implemented as part of Google Summer of Code 2021. What we already have available is a framework that can create descriptor chains, and what is particularly important for fuzzing is the ability to write arbitrary data into those descriptor chains that we are creating. Next, we are looking at creating a specialized mock framework for devices as well.
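To give an idea of the shape of component-based fuzzing, here is a minimal sketch. Everything in it is hypothetical: parse_request is a made-up stand-in for real device request parsing, and with a cargo-fuzz/libFuzzer setup the body of fuzz_one would live inside a fuzz_target! macro instead of a plain function.

```rust
/// A toy "device request" parser standing in for real request parsing.
/// The layout (1 type byte, 2 little-endian length bytes, payload) is
/// invented for illustration.
fn parse_request(data: &[u8]) -> Option<(u8, &[u8])> {
    let (&req_type, rest) = data.split_first()?;
    if rest.len() < 2 {
        return None;
    }
    let len = u16::from_le_bytes([rest[0], rest[1]]) as usize;
    let payload = rest.get(2..2 + len)?;
    Some((req_type, payload))
}

/// The fuzz entry point: the fuzzer hands us arbitrary bytes, and the
/// property we check is simply that parsing never panics and never yields
/// an out-of-bounds payload.
pub fn fuzz_one(data: &[u8]) {
    if let Some((_, payload)) = parse_request(data) {
        assert!(payload.len() <= data.len());
    }
}
```

This illustrates the balance discussed below: the fuzzer provides the raw bytes, and the harness only imposes enough structure to reach interesting code paths.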
But here we have to pay a bit of attention, because we really need to find the correct balance between random data from the fuzzer and useful data. We want to look at what coverage we can get with the fuzzing, but we also do not want to send only valid data to the device, because then we couldn't call it fuzzing. Another thing that is important for us, because the same problems we have with fuzzing also apply to testing the normal behavior of devices, is to reuse this mock framework. We are already doing that: we reuse it in unit tests and integration tests, and we also have some benchmark tests where we are going to reuse this mock framework. Another thing that is super important is reporting vulnerabilities. I'm saying this because rust-vmm is already operating in production: there are multiple VMMs that are using rust-vmm in production. So if you find a security vulnerability, we would really want you to follow the security policy. The rust-vmm components have a security policy attached; you can find it by going to the repository and then to its security policy. If you look at the policy, the too-long-didn't-read version of it is that you will have to send an encrypted email to the rust-vmm maintainers, and then we are going to work with you to assess the issue and start an embargo if that is needed. I know that was a lot, but if I want you to take three things away from this presentation, it would be these. First, it is important to apply security at all levels, from project setup to development and operation. Second, security vulnerabilities don't only get discovered by fuzzing; you can also discover them by reading the code with the security hat on. And once you do that, it's really important to write the threat model, and to read the threat models of the components that you are consuming.
And the third thing, and this one is really important: if you discover a security vulnerability in one of the rust-vmm components, please use the security process for reporting it. So, that is all. Thank you for joining this session, and if you have any questions you can just contact me.