Hello everyone, thanks for having us at Open Source Summit North America. We are going to talk about the Rust for Linux project and give a status update on it. My name is Miguel Ojeda, I am a maintainer of the project. Hello everyone, I'm Wedson, I'm also a maintainer of the project, and Miguel is going to get us started. So Rust for Linux is a project that aims to bring Rust support into the Linux kernel as a first-class citizen. And by first-class, what we mean is that it can be used for anything that you would use the C language for right now in the kernel. As you may know, there are other languages in the kernel, some shown here in the plot. Some of them are non-portable, others are used via scripts and utilities, etc. But the idea is that if Rust gets into the kernel, it becomes the second main language, one that could be used for writing anything that you would be writing in C. So the bar that you see there on the right-hand side for Rust would eventually grow to the same order of magnitude as C. So why do we want Rust in the kernel in the first place? Well, the first and main advantage of Rust is that it decreases the chance of having memory safety bugs. This includes use-after-free, double-free, data races, etc. Rust is a systems programming language that offers these advantages in its safe subset, as long as the unsafe code that we need to write in the abstractions is sound, and we will see a bit more about that in a bit. We also hope that it helps decrease logic bugs thanks to the stricter type system. And with these advantages, it should also help with the development and review of drivers. For example, if you are a kernel developer and you're writing a driver, and you don't have to use unsafe code, you stay in the safe subset, then, as long as, again, the abstractions are sound, you should not have any memory safety bugs. So you don't need to worry about that.
Also, because the type system is stricter, that should help catch cases where the functions that you are calling from the driver, and the types that you are using, change. The compiler ensures more things at compile time, and therefore, when changing a feature, when refactoring drivers or refactoring the abstractions, it should help kernel developers catch more bugs. To reiterate and reinforce the idea of why memory safety is important, I have here this slide which reminds us that, according to Microsoft, Google, and other companies that have gathered data on the vulnerabilities they have had over the last years, around 70% of the vulnerabilities in their C and C++ projects come from undefined behavior, that is, memory safety issues. You can see more in that link. So how does the project work? How have we set Rust for Linux up? There are several ways to integrate Rust and C in a project, but what we are doing in Rust for Linux is encapsulating the unsafety of the C APIs that you see there on the right side of the slide. We are wrapping those APIs in safe APIs, with what we call safe abstractions. These are basically Rust code that wraps the C APIs and provides a safe API to the drivers or other modules, or even other code, but to simplify, let's say Rust drivers. You see there, as drawn in the slide, that there is a forbidden line from the drivers to the C APIs. This means that eventually we want to forbid any direct call from the Rust drivers to the C APIs. The reason for this is, first, to get a bit more encapsulation of the C APIs and, second and more importantly, to avoid any direct calls to the unsafe C APIs and get the drivers to avoid unsafe code as much as possible. It doesn't mean that all the APIs of the safe abstractions are safe.
We have some, and we may need more, that are unsafe, but most of them are safe and we want them to stay that way. In this slide, we zoom in a bit on the details. You can see here, for example, that a driver, my_foo_driver, would right now call into the kernel crate; the kernel crate is where we currently have all the abstractions for all the subsystems that we are wrapping. And these call into the C APIs, which we don't call directly, because Rust doesn't understand C code. So we use a tool called bindgen that generates what we call the bindings crate from the headers that are in the include folder of the kernel. It basically exposes the C functions and the C types in their Rust equivalent representation. This diagram here gives another quick overview of the different pieces of the project right now. This may change in the near future, because the kernel crate that you see on the right side may be split up. But right now what we have is a kernel crate which contains, again, the abstractions; a macros crate, which contains procedural macros, which are like plugins for the compiler in Rust; the alloc crate, which is a copy of Rust's alloc crate and contains, for example, the containers and other things that require a memory allocator, and which in turn depends on the core crate. The core crate is the one that contains the foundational types and all the facilities of the Rust language. Both core and alloc we cross-compile to whatever target we need. And the alloc crate, again, is for the moment a copy in the kernel tree; we hope that we can take it out and use the upstream alloc crate, but for the moment we need some tweaks and changes, so we keep a copy inside the tree. Then we have other pieces that are not as important. Now, what have we done in the last year?
So, first of all, on the infrastructure side, the first point is important because it was discussed on the mailing list: Linus wanted to see the panicking allocations gone, and we got rid of them. For that we introduced the alloc crate into the tree, and in upstream Rust there is work going on to provide fallible allocation APIs as well. We also moved to the latest edition of the Rust language; Rust has this edition system that allows the language to improve over time while keeping backwards compatibility. We also moved to the stable releases of the Rust compiler. Last year we were using the nightly compiler; now we are using the stable compiler and tracking it, so we upgrade to new versions as they come out. This doesn't mean that we are not using unstable features, but we are compiling with the stable releases. We also got further architecture support, and we got some testing support as well: we are able to run the documentation tests, as they are called, the code that you write in the documentation. This is now run inside the kernel as KUnit tests. We also got support for host programs in Rust, which in the kernel are the utility programs used for other tasks outside the C compilation itself. We also got on-the-fly generation of the target specification file based on the kernel configuration. The target specification file is basically all the features and configuration that you need to provide to the Rust compiler for a given target. This is specific to the Rust compiler, and we are also working with upstream to see how we can move away from it, because that part is unstable. Previously, we didn't have this on-the-fly generation, so we shipped customized target files in the prototype. Now, on the abstractions, there has been a lot of work.
Wedson, who is presenting as well and introduced himself before, has been working on the example drivers and many, many of the abstractions, for example, container things like red-black trees, as well as other functionality that the drivers need, for example, files, tasks, credentials, etc. Also a lot of synchronization features: semaphores, revocable types, mutexes, etc. We also stopped using Arc and Rc, which are types from the alloc crate, and started to use Ref, which is simplified and tuned to the kernel, and we expect that over time we will change other things and customize types to be more kernel-like, as we did in this case. And we also have one important thing, which Wedson will talk about later: async, why it's important, why we work on it and what is coming. On other projects, it's important to talk a bit about what happened around us last year. Rust stabilized some features that we were using; we will see a bit more on that later. We got some improvements in upstream Rust: in the compiler, in the standard library, in the tooling. Some of those came from new contributors, and the experience working with them has been great. There is support now in the ecosystem for the new v0 symbol mangling scheme for Rust. The 0-day bot from Intel has started building with Rust enabled, which is great; we thank them for that. Linaro's TuxSuite also added Rust support. Then, on the Rust compiler side, rustc_codegen_gcc, which is a backend for the official Rust compiler that uses GCC through libgccjit, got merged into the main Rust repository. And GCC Rust, which is a new frontend for GCC, gained a few months ago a second full-time developer, so they have been doing great work and things have been speeding up.
Also, on Compiler Explorer, we suggested some improvements that the Compiler Explorer team implemented, which we think are useful for developers when they need to test things in Compiler Explorer and see the output. On events, we presented the project at several venues last year; here you see some of them, or most of them, and you can refer to them if you want more information. As a fun fact, at LPC last year, the keynote had an informal poll asking a few questions about Linux and the ecosystem, and one of them was which emerging technology you were most excited about. Attendees voted Rust as the one they were most excited about. Now, to give the current status quickly: the project is still, as we have said many times, experimental. This means we are not expecting users to use it in production right away, but it's usable already for writing new abstractions, new drivers and other modules. Even for writing new subsystems from scratch: if you are a kernel maintainer of a subsystem, you can write a new subsystem in Rust from scratch instead of developing abstractions for the current C APIs. We have two modules working, even if they are not 100% complete: a GPIO driver and the Binder module for Android, which we have shown in the past. Quickly, on the architectures: this is the list of architectures that we support. And again, the target specification file is now generated from the kernel configuration, so kernel maintainers of these architectures can tweak the flags and the features they need, and they can start providing better support for those architectures. We typically focus on x86 and arm mostly, but we are happy to see more progress on the others as well.
On unstable features in Rust, this is a fun topic, because in order for us to establish a minimum version of the Rust compiler or toolchain in the kernel, like we have for GCC or Clang, we need to basically cross all these features off the list. Some are more important than others, some we can maybe work around, etc. But we wanted to show you the progress since last year and which features went away in that time. We are very happy to see all this progress happening in the Rust compiler. As for GCC, as I mentioned before, there are two main projects to generate code through GCC, apart from, of course, LLVM in the middle. rustc_codegen_gcc nowadays already passes most of the Rust compiler tests and can actually bootstrap the Rust compiler. And GCC Rust, which is the new from-scratch frontend, is looking into compiling the core library of Rust. We are amazed at the work that they have done so quickly, and we hope to start playing with both of these projects as soon as possible, to have the Rust parts of the kernel compiling with GCC as well. And with that, these are basically the main things that we have been doing and how we have got to this point. Now Wedson is going to talk a bit about asynchronous Rust, which is something that I already mentioned before; more and more things are coming there in the near future. We think it's something that you cannot really do in C, let's say, and Wedson will explain much better why that is the case. Yeah, thanks, Miguel. So before I start getting into it, I'll talk a little bit about what motivated us to do this. One piece of feedback that we got last year at LPC was that we focus a lot on the memory safety and undefined behavior guarantees that Rust offers, right?
And perhaps we should also look at the aspects of Rust beyond that. One aspect that we feel strongly about is productivity, developer productivity in general. And we feel that asynchronous Rust is, as Miguel mentioned before, something that makes quite a big difference between Rust and C. So let's start getting into it a bit more. The first question that comes along is: what is asynchronous programming? What do we mean when we say asynchronous? What we mean is basically that we have a piece of code that we want to execute, but for some reason we cannot make progress. There's a piece of data, or some state that needs to be reached, that isn't available yet, so we can't progress. That's when asynchronous programming comes in. When we are in these situations, there are usually two broad approaches. One is to create state machines, have events feeding into those state machines, and let the state machines evolve. Another option is to have dedicated threads: you write straight-line code, and when you cannot make progress, you just put the thread to sleep; eventually, when you reach your state or when the data becomes available, the thread wakes up and continues to make progress. The problem with this second option is that while you wait for your state or data, that thread is idle, it's not doing anything, so you're burning a whole call stack and a bunch of state for it. This doesn't scale, and we'll talk a little bit about that later on. Now, what is it that Rust offers that helps approach this problem? The first thing, and this is the big differentiator between Rust and C here, is that the compiler builds the state machine from straight-line code. We'll see some examples that will make this clear, but I'll first introduce the concepts here.
The developer writes straight-line code as if they were doing things the dedicated-threads way, so it's very straightforward. Let's say you have a network server and you read something from your socket, and let's say you haven't read enough information yet, so you wait for more to come; once more is available, you read more, and so on, and then you start parsing your data, producing your results, and writing them out. These are examples of steps that one could take in this straight-line code. So the Rust compiler reads straight-line code like that; there are a few differences, which we'll see in a second. But then what the compiler does is it automatically generates a state machine for you. This state machine is a combination of code and context data, and this will become clear when we look at examples. So the compiler does that. Then, once you have the state machines, you need a way to execute them, to drive them. So we have this concept in Rust called the executor, which is basically a way for you to make these state machines go forward. And once you reach those states where you cannot make progress, the state machine is registered with a reactor and then blocks. The idea is that blocking the state machine doesn't block a thread: it's just some state in memory, it doesn't take over a call stack, and we'll see that it's a minimal state. The thread that the executor was running on can just continue doing something else, perhaps another state machine. And this reactor thing is, first of all, independent of the executor; what it does is, once that state is reached and the state machine can make progress, it goes to the executor and says: hey, this state machine is ready to continue.
So the executor can continue running it. And the reactor is independent of the executor, so you can have different types of executors, and we'll talk a little bit about the different types in the kernel. One piece of information that I haven't given you yet is that the work the compiler does is really just the first part, creating the state machine for you. It doesn't provide executors and it doesn't provide reactors. This is related to Rust's goal of zero-cost abstractions. So the environment has to provide those, and we'll see how we provide them in the kernel in a second. For executors, we actually use work queues to run the state machines: the idea is that each state machine holds a work item, and when the state machine is ready, we just queue the work item; it will eventually run and make progress until it blocks again, at which point the work item completes its execution and the worker thread can work on something else. We also have the option to run on a single thread; we can have all the state machines attached to a single thread. And for reactors, the kernel of course already has this concept of asynchronous behavior. We have sockets: we can read and write sockets, and if a socket is not ready for reading, for example, we can register to be told when the socket becomes ready, so my executor can run the state machine again. Then we have virtual file system operations, like reading or writing to and from a file; those may block because we have to go to the block layer to read some data. In fact, the block layer also has asynchronous block I/O. We have URBs. We have timers, which are also a reactor: you can say, I want this state machine to block for 10 seconds, and eventually, when the timer expires, it gets queued again and gets to run. So here's a brief example of what I meant when I said you write straight-line code. This is an echo server.
All it does is accept connections, read data, and write the data back to the peer. If you look at the acceptor loop, it literally just has a loop and it calls accept. If we accept a new connection, then we spawn a new task, which is the state machine, and this is the one: echo_server_stream, this guy here, and this `async` keyword is what triggers the automatic generation of the state machine by the compiler. If you look here, you'll see that we have a buffer, it's initialized, then we have a read and then we have a write. All the error handling and sleeping and blocking is in here. This is a complete example, and it gives you a sense of how simple things are: a straight line, with all the error handling and everything right here. To give you a sense of what the state machine looks like, I'll show you the one for the echo server. Here we have the memory layout, and what the compiler does automatically for us is create this struct, which is the context. The stream is shared by all the states; the buffer is also shared by all the states. Then we have a tag here that says which state we happen to be in. State zero is when the function starts; there is no extra state associated with it, it only has the buffer and stream shared by all states. What the code does is initialize the buffer to zero, come into the loop, and eventually read. Then we have state one, and this has a socket-wait object in it. This is how we talk to the reactor: we talk to the socket and say, if you're not ready to be read, then I'll block, wake me up when you're ready. Then there's a third state, state two, which is on the write, because the write can also block: if the peer is not draining its side, then we won't be able to write. I expanded the write all down here. The idea is that we have `n`, this local variable here, which is the number of bytes that were read and that we want to write. Within that function we also have the remaining buffer here.
It also appears here in the state machine. We also have another socket-wait object, for waiting in case the write blocks. The idea here is that all of these states overlap: this is a union, and the current state tag dictates which one of the union members is active. One thing that is really interesting here is that even though the echo server is a function that calls blocking operations, we actually don't need a stack to represent the state when we park the execution of the state machine, because all the state that we need to have saved and restored at a later point is stored in this context. The compiler does that automatically for you. You just use them, as you can see here, like local variables: it appears like a local variable here, and appears like a local variable there, but they are in fact context variables stored in this layout. So you can think of these as coroutines or green threads, but they are stackless and automatically generated by the compiler for us. So why does this matter to us? The main reason is that the kernel has a lot of handwritten state machines, and the kernel is intrinsically asynchronous: it has this mode of operation where there's always some piece of data or some piece of state that hasn't been reached yet, and we just have to wait. When we build these state machines by hand, we can build highly scalable systems, but they are very complex, and with this increased complexity we increase the risk of bugs. So what Rust brings to the table here is that it may actually eliminate the tradeoff between simplicity and scalability. When I talked earlier about having a thread and writing straight-line code: that's not scalable, because if you create a new thread for each connection that you accept, then eventually you use too much memory to represent each client that is connected to you. Rust eliminates this dilemma; you don't have to choose between being scalable and writing straight-line code.
You can do both with Rust, and it's going to be as efficient as before. Cleanup is also quite simple: we use Rust's Drop implementations for most of the cleanup. For example, if you look at this code, and the server is blocked here on the read, and you want to terminate this connection, really all you have to do is come to this task that we created and say close. What happens is it knows that it's in state one, and it knows that the socket-wait object is alive, so it destroys the socket-wait object, and the socket-wait object in turn knows which socket it's attached to and knows how to unregister from it. So cleanup is automatic: you don't have to think about it yourself and implement it. And of course we get all the Rust memory safety guarantees. Now here I have a bunch of examples; I'll just run through them and leave the slides for folks to refer to later on. KSMBD is an example of a server that we have in the kernel. It actually has a mixture of the straight-line code, which is less scalable, and the state-machine, more scalable style. But it has a few problems: one problem is related to blocking even while doing work in the work queue. Another problem is that even in the straight-line code, due to cleanup, instead of just blocking on this call, it actually sleeps for 100 microseconds at a time, so we're spinning, and every 100 microseconds we're waking up, just to see if the interface was removed so we can back out; which is something you wouldn't have to do with Rust: you would just close the state machine and everything would be taken care of automatically. And here is the example of the straight-line code that blocks: this read can block, and now we're burning a whole call stack on it, and this other read can also block, and so on as it processes the request and sends it over.
NVMET is another example, but this one doesn't use a mixture: it just does the state machine, it's fully async and very scalable, which is great, but it comes at the cost of complexity. This is an example of receiving one thing: we have a state machine, if the state is this then do that, if the state is this then do that, and each of these steps can fail halfway through because there's not enough data, so it returns with -EAGAIN and waits to be called again. Here's the send side of things: again, a lot of states. The last example is bcm203x, which just uploads firmware to a Bluetooth device. It also has a state machine: it's a USB device attached to some host, and the state machine is driven by the completion of USB requests. So what I'd like to show with this slide is just to reiterate what I and Miguel have said before: async Rust allows us to write straight-line code, which reduces complexity, meaning fewer bugs and fewer vulnerabilities; it has the same performance and scalability as the complex, manually written state machines; and of course it benefits from Rust's memory safety. So we feel that developers are more productive if they can write code this way, without incurring a performance penalty. And with that, Miguel is going to come back and tell us a bit more about what's coming along in the next year. Yeah, thank you, Wedson. So async is one of the things we are working on, and there are other milestones that we have for the next year. Even if it is extrapolating a bit, and we cannot, of course, promise the future, what we want to see happen is, first of all, more use cases, more uses of the Rust support inside the kernel, because this is basically the end goal: having more example drivers, etc.
We want to split the kernel crate that we talked about before into several crates, and manage crate dependencies throughout the kernel tree, etc. This would improve development for kernel maintainers and developers. We also want to improve the integration of the documentation, the testing, and the rest of the toolchain; there are things to improve in several areas. One of the critical things as well is getting more subsystem maintainers involved. There are a few subsystem maintainers that are interested in using Rust and have contacted us, but we want to see more and more people joining, trying the code, writing new abstractions, etc., as well as more companies involved, and more researchers, people from academia. We also want to see most of the list of unstable Rust features that we showed before gone. We are not exactly sure when, but hopefully, if we cross out as many lines as we did last year, by next year we are mostly there. As we mentioned, we want to play with compiling the Rust portion of the kernel with GCC, and of course getting merged into the kernel, which should help everything else, or make everything else easier. Another point that I would like to mention is that we want to have more learning resources: more documentation on how to use this Rust support in the kernel, which, as Wedson said, is some of the feedback that we got last year. And as a very last thing, there is some work on the NVMe driver that Wedson started back then; Andreas Hindborg from Western Digital is working on it and may have something ready in the future. Also, there are some upcoming events we are very excited about. The first one is Kangrejos, which is the Rust for Linux workshop, a meeting and conference that we organize for everyone that is interested in this effort; this year we are going to have it face to face, so let's see how it goes. Right after that, the next week, we will be at Plumbers, at LPC, and this year we will have a microconference on Rust: not just Rust for Linux, but everything Rust related. If you are a Rust developer, if you have used Rust outside the kernel and you want to speak there, please send us a proposal; the call is open. And finally, something we wanted to mention: earlier in the talk we said that there have been two sessions of the Linux Foundation Live mentorship series on Rust for the kernel; the first one was a bit of an introduction, covering what safe and unsafe Rust mean, and there are three more coming from Wedson: the first one will happen in July, and the other two are being scheduled. And with that, thanks a lot for being here. We would have liked to be live, but we will try to answer questions in the chat while the talk is going on. Thanks a lot, thanks everyone, bye.