So, hello everyone. I'm impressed there are this many people here at the last session of the week; you're more diligent than I am. My name is Alistair and I'm here to talk about writing an embedded operating system in Rust. I thought I'd start with a bit about myself. I work at Western Digital Research and I've been working with Rust, specifically embedded Rust, for a few years now. So, this talk is kind of what I like about Rust, and the little bit I don't like about Rust, I guess. But just pre-warning, I really love Rust, so there's not a lot of bad stuff to talk about. Okay. So, I think people here probably know what Rust is, but I couldn't tell how many people are using Rust every day versus how many have just heard about it and thought it was the next great thing. So, I thought I'd talk about what it is and why it's exciting, and then go into the embedded part. Rust is a systems language, so it's similar to C in that regard. There's no garbage collector, there's no virtual machine, it's all compiled down to bare assembly, and it provides memory and thread safety. That's probably the thing you hear the most, that it's a safe language, and that's all done at compile time. I'll talk about that for the next few slides. It's extremely powerful, has great performance, and is slowly being introduced into open-source projects. I was at LPC earlier this week and there were multiple sessions about using Rust inside the Linux kernel. The patches are on the list and it looks like it's going to happen. To give you an idea of performance, there was an interesting talk at LPC as well about rewriting the NVMe driver in the Linux kernel in Rust. This is a great experiment because the NVMe driver is written in C and it's not some side driver that no one cares about: pretty much every laptop in this room is using NVMe, and every high-end server probably is too. So, it's extremely performance-critical.
So, it's currently written in C and there are a lot of smart people all over the world looking at it, squeezing every bit of performance they can out of it. If Rust can compare to that, you know it's going to be good, compared to comparing against some clunky driver that no one really cares about and that's slow anyway. The numbers from that work were that Rust was in most cases about equal to C, and in the worst case about two and a half percent slower than the hugely optimized C code. So, Rust is basically as performant as C. In this case as well, the kernel version uses some auto-generated bindings, and we think that might cause some of the slowdown, so it's possible Rust could even overtake C, because the Rust compiler actually knows more about the code than the C compiler does. So, Rust is great. But, I mean, why bother switching, right? If it's only as good as C, why not just stick with C? C has worked for so long, I think everyone in this room probably knows C, so why learn something new? And the reason is memory safety. Some people might say, oh, it doesn't matter, right? I never write buggy code. All my code is perfect, I've never written a bug in my life. I find that claim is just not believable. But maybe you are amazing and you've never written a line of bad code; you probably still work with someone who has, and you'll probably leave the project, or you inherited the project, and someone else will introduce bugs, right? And I think these two statistics give that away. Microsoft and Google's Chromium are heavily invested in, right? These aren't hobbyists working at night; they're huge projects with huge software teams behind them doing fuzzing, reviews, audits, all sorts of things. And they both say about 70 percent of their security bugs are memory safety issues. Think about that: security bugs.
That's not just the background color being wrong or something; that's impacting users by exposing them to malicious attackers, right? And 70 percent are memory safety issues. So, if we could fix that, it doesn't get us all the way there, we're never going to be perfect, but that's a good percentage of security issues just kind of fixed if we use a memory safe language. And that's where Rust really shines. I kind of assume everyone here knows C or is familiar with C, so I want to start with how Rust is similar to C. It's ahead-of-time compiled, like I talked about; Rust uses LLVM to produce the assembly output. And it focuses on maximum programmer control. Unlike a scripting language, you have full access to the hardware. There's nothing below you, no JVM or anything like that: full access and zero runtime overhead. It works well for bare metal, which is why I'm here. It's statically typed, it has performance, I just talked about that. And it's easy to link with C programs, so you can call from C into Rust and from Rust into C. You don't have to rewrite everything in Rust overnight; you can replace bits and pieces and call into them. And it has the same basic flow control. So how is Rust different, right? If it's the same, why would we bother switching? It's strongly typed, much more so than C, which is interesting. It has a module system. In C, you have these headers with #ifdef SOMETHING_H guards, and it gets clunky: you have all these headers, you've got to split them out and track which ones you use. Rust doesn't have that, it's just modules. You want to use something, you import the module, you have access to it. And all statements evaluate to values, so you can write let variable equals if, have your if/else branches, and get a value back.
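To make the "statements evaluate to values" point concrete, here's a minimal sketch (my own example, not from the talk's slides):

```rust
fn main() {
    let n = 7;
    // `if` is an expression, so it produces a value you can bind directly,
    // with no uninitialized variable and no separate assignment step:
    let parity = if n % 2 == 0 { "even" } else { "odd" };
    println!("{} is {}", n, parity);
}
```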
And where the memory checking starts to shine is references. If you have a reference or a pointer, you can have one mutable reference, which means you can read and write it, or many immutable references, which are read only. And this is how we start getting the memory safety. You might think, oh, that's such an annoying constraint, I can only have one mutable reference to this variable, how can you get anything done like that? But the more you look at it, the more you realize that's all you do in C anyway. If you're passing around pointers that you're writing from multiple places without locking, you're just asking for trouble, right? You can still do that kind of sharing in Rust, with locks and similar. So, these are things you do in C anyway; Rust just makes you do them. It has generics, which I have an example of as well. Macros in Rust are complex. If anyone's ever looked at them, they're hard to get your head around, but they're much more powerful than C's. And almost all of Rust is a safe subset, enforced through static analysis, which is where you get the memory safety. I'll talk about the unsafe part later as well. So, this is an example of C code next to Rust code. On the left is the C, and it's pretty simple: I have a string, I allocate another string, copy from one to the other, print them, and then I change the w from lowercase to uppercase through pointer accesses. The Rust code looks pretty similar. I create a mutable variable, because Rust variables are read-only by default, so you have to declare it as mutable. I print it. I get a pointer, and then I access the pointer by adding an offset into it. And this is where Rust is different to C. I've got my pointer, I'm adding six, counting six characters in, and I'm dereferencing it to get the character. But that's an unsafe operation.
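Reconstructed from that description, the Rust side looks roughly like this (the exact string and variable names are my guesses, not the slide's):

```rust
fn main() {
    // Variables are read-only by default; `mut` makes it writable.
    let mut s = String::from("hello world");
    println!("{}", s);

    // Get a raw pointer to the string's bytes.
    let p = s.as_mut_ptr();

    // Offsetting and dereferencing a raw pointer is an unsafe operation:
    // we're promising the compiler the access is in bounds and the byte
    // we write keeps the string valid UTF-8.
    unsafe {
        *p.add(6) = b'W';
    }
    println!("{}", s); // prints "hello World"
}
```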
And so, in Rust, there are lots of different ways of thinking about unsafe, but the way I like is this: you're not saying it's wrong, it's not bad, it just means you know something the compiler doesn't. That's what unsafe is. In the embedded case, say you have a UART MMIO region. You've read the data sheet, and the data sheet says if I write to this address, it's going to print something on the UART. But the Rust compiler doesn't know that. So, if you try to write to that address, it's going to say, that's dangerous, what are you doing? That could segfault, that could crash, who knows what it's going to do? Unsafe is you telling the compiler: trust me, I know it's okay; if I write to that address, it'll print out, it won't crash. And the reason it's nice is that, generally, your program only has small amounts of unsafe. So, if you have a team, you can review the unsafe code much more thoroughly than you can the entire program, compared to C, where every single thing you ever do is unsafe. In Rust, hopefully, you only have little snippets that are unsafe. And in reality, you don't actually write unsafe a lot; you hide it behind helper functions and wrappers, so you rarely write unsafe code directly. And then, we do the same thing as the C: we just set the lowercase w to an uppercase w. Again, it's unsafe because we're accessing a raw pointer. So, Rust in embedded, I guess that was the topic of the talk. Rust has a strong focus on embedded: there's an embedded working group, there's an embedded book, all these things. I was telling someone earlier this week about embedded Rust and they said, oh, but why bother, right? Don't all the cool Rust features need high-level things?
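The UART example and the wrapper idea can be sketched like this; the register address and the function names here are made up for illustration, not from a real data sheet:

```rust
use core::ptr;

// Hypothetical UART transmit register address, as a data sheet might give it.
const UART_TX: *mut u32 = 0x4000_0000 as *mut u32;

/// The single unsafe volatile write lives inside this helper,
/// so callers never write `unsafe` themselves.
fn write_reg(reg: *mut u32, value: u32) {
    // SAFETY: the caller guarantees `reg` is a valid, writable register
    // address. This is the "trust me, I read the data sheet" promise.
    unsafe { ptr::write_volatile(reg, value) }
}

/// Looks completely safe from the outside.
fn uart_putc(byte: u8) {
    write_reg(UART_TX, byte as u32);
}
```

Calling `uart_putc` on real hardware would print a character; on a desktop it would crash, which is exactly the knowledge the compiler doesn't have and the wrapper encodes.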
I think people get confused because Rust has these helpful things, like Vec, so you can create a dynamically allocated array, and hash tables, the kind of helpers you'd expect from a modern language but that C doesn't have. But that isn't a core part of the language. The language doesn't need it; it's actually just a library on top. Rust has three main libraries. There's the core library, libcore, and that runs everywhere: on your bare-metal system, on your POSIX computer, anywhere you want. It has things like Option and Result, but not too much else. Then there's the allocator library, liballoc, which gets you Vec and more complex things like that, but it needs an allocator. On your desktop it just works; on a bare-metal system you can implement your own allocator and it will work, or you can say, no, I don't want an allocator, it's too high-risk or too much work, and just not use that library. And then there's the standard library, libstd, which has file I/O, networking, stuff like that; you need a real desktop-class system at that point. So, the embedded Rust project that I've been working on is called Tock. It wasn't written just by me; there's a community of people behind it, and it's all Rust. It's not an RTOS, but it is a small embedded system. It's designed for systems without an MMU, so think of Cortex-M3s and Cortex-M4s, or RISC-V RV32I-type systems. And it's designed to protect both the kernel and the applications it runs from attacks. So, this is a big diagram of how the architecture works. Down the bottom, we have the hardware: it's an embedded system, so we have a CPU, peripherals like SPI and I2C, that type of thing. That's not a Rust thing, it's just any hardware, it doesn't matter.
Then, we have our drivers; you can see them in orange. We consider our drivers trusted, so they can use the unsafe keyword, and this is kind of by requirement, right? A driver can really do anything. It's going to set up a DMA and copy, say, your SPI data over DMA into memory. So, if it's malicious or buggy, it can set the wrong DMA address and, you know, delete your code. That's inherently an unsafe thing and you have to write it correctly. So, that's considered trusted. Then, we have the core kernel as well, with a scheduler and process management; it handles syscalls and it has these hardware interface layers, which I have on the next slide. That's also all trusted, so again, it can use unsafe. It's not a lot of unsafe, but it can. It's by design the core part of the kernel and we have to trust it. And what gets exciting now is the capsules. The capsules are untrusted. All of this code is running at the same hardware privilege level: it's all running in secure mode on ARM or machine mode on RISC-V. But using Rust, we can forbid unsafe in capsules. That means the capsules, where we put all the logic and the complex parsing, don't have any unsafe code. For example, this is where you might put your BLE stack or a virtualization layer. Say we have multiple applications and they all want to print, so they all syscall in. Then, we have a capsule that takes those prints, puts them all together, and writes them all out to the one UART. That type of stuff is complex: there's buffer management, we've got to move things around, we've got to return errors or success values. And all of that is done in a capsule, with no unsafe; we forbid it in the language. So, if there is a bug in there, it's unlikely to be a memory safety issue where we overflow a buffer or something like that.
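The "forbid unsafe in capsules" mechanism is an ordinary Rust lint attribute; a toy sketch (the merging logic here is my own illustration, far simpler than a real console capsule):

```rust
// Any use of `unsafe` inside this module is a hard compile error
// that code inside the module cannot override.
#[forbid(unsafe_code)]
mod capsule {
    /// Toy "virtualizer": merge output from several apps into one stream,
    /// as a console capsule would before writing to the single UART.
    pub fn mux(outputs: &[&str]) -> String {
        let mut combined = String::new();
        for out in outputs {
            combined.push_str(out);
        }
        combined
    }
}

fn main() {
    let merged = capsule::mux(&["app1: hi\n", "app2: hello\n"]);
    print!("{}", merged);
}
```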
It's more likely a logic error, where we return the wrong value or something. But we're not going to let an attacker overflow our BLE stack because they say the size is five and we only have a two-byte buffer, because Rust will catch that; Rust will help us there. Then, we use the hardware to separate processes. Processes run at a different privilege level: user mode on RISC-V or unprivileged mode on ARM. And we use hardware like the MPU and the PMP to enforce that, so they actually can't do anything malicious. Applications can be written in anything: C, assembly, Rust, it doesn't matter. They are completely constrained by the hardware and all they can do is system call into the kernel. Even if a malicious application wants to take down the system by spinning in a while(1) loop and using all the CPU, we'll preempt it eventually and swap in other applications. So, no application can take down the system, and we again try to offload as much logic as we can there, because if that's taken over or crashes, it only affects one process instead of taking down the entire device. So, I talked before about Tock HILs, the hardware interface layers, and Rust generics. Hope you can see this. On the left-hand side is a HIL; they're expressed as traits in Rust. So, it's a pub trait called Hasher, and it has a const size, so the output size is always a constant. This is kind of like a header file in C: we're saying these functions exist and you need to define them. And why this is nice is that we can use it to connect the bottom layer, where those drivers are, with the top layer, where those capsules are. We can say: this driver implements Hasher, so it's going to accept all of these functions and handle them correctly. And the capsule can say: I need a driver that implements Hasher, otherwise I won't work.
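A stripped-down sketch of what such a HIL trait and its use might look like; the names and the XOR-checksum "driver" are illustrative, not Tock's real Hasher HIL:

```rust
/// The layer-crossing interface: like a C header, but checked at compile time.
pub trait Hasher {
    /// Output length is a compile-time constant.
    const OUTPUT_LEN: usize;
    fn add_data(&mut self, data: &[u8]);
    fn finish(&mut self, out: &mut [u8]);
}

/// A trivial "driver" implementing the trait (XOR checksum, for illustration).
struct XorHasher {
    state: u8,
}

impl Hasher for XorHasher {
    const OUTPUT_LEN: usize = 1;
    fn add_data(&mut self, data: &[u8]) {
        for b in data {
            self.state ^= *b;
        }
    }
    fn finish(&mut self, out: &mut [u8]) {
        out[0] = self.state;
    }
}

/// A "capsule" that refuses to compile unless it's given some Hasher,
/// instead of hoping a function pointer was wired up at runtime.
fn checksum<H: Hasher>(hw: &mut H, data: &[u8]) -> u8 {
    hw.add_data(data);
    let mut out = [0u8; 1];
    hw.finish(&mut out);
    out[0]
}
```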
And at compile time, we can ensure that the driver's there and the capsule's there and everyone understands what's going on, instead of looking at weird function pointers and wondering, does this function pointer call here, and what happens if it's not there or not implemented? So, it's another way Rust lets us statically enforce everything at compile time instead of at runtime. And on the right-hand side is the implementation. It's actually a screenshot of the wrong function, but it's the same concept: it's implementing the trait. We take a reference to ourselves, we take some data. I can use this pointer, so it's in the recording. If we're not busy, then we do the work and we can modify the registers: we're setting the start bit, clearing interrupts, enabling interrupts, and then processing the data. This is the unsafe code happening under the hood: we're reading and writing MMIO addresses, but it doesn't look unsafe to us; we're just modifying values, right? Also, this is embedded, so we need inline assembly, right? We can't write the entire thing in Rust; we're going to have to have some assembly code, and Rust just handles that. Here we have an example of the startup code for RISC-V. We can place it with the linker script; we can specify where we want it. We declare it as a naked function, so LLVM doesn't generate any prologue or epilogue for it. And then, the same way you would in GCC, we say we have assembly and we write it out. And this is some RISC-V assembly. Again, it's inside the unsafe keyword, because the compiler is saying: I don't know what's in here, it's up to you to keep it safe.
And so, there are a few hundred lines of assembly in the entire project, and you can heavily scrutinize that; it's very well commented, and the compiler has to trust you with the unsafe. We also use, and I mention this partly because I really like RISC-V, the RISC-V PMP, the physical memory protection, to isolate the applications. On RISC-V, we can actually also apply this to the kernel, so the hardware is at all times enforcing write XOR execute: all memory is either writable or executable, never both. The stack, for example, can be read and written but can't be executed. So, even if there's a bug in our unsafe code, or the Rust compiler doesn't catch something, hopefully the hardware kicks in as well and makes it even harder for an attacker to get some commands onto the stack and trick us into executing them. And then, we do the same for processes: a process can only access its own memory, not anyone else's, with read/write for its stack and data and read/execute for its text. Oh, that was a little quicker than I expected, I guess. So, that's the end of why Rust is so amazing, and now I have a few pain points that I at least ran into, if anyone's ever started using Rust. One is lifetimes. I don't know if anyone's done a lot of Rust, but, yeah, I see some nods, you quickly run into lifetimes. This is an example: it's a really basic function, we're just implementing something and calling down to the layer below it. But you can see that tick-a is everywhere, and that's a lifetime, 'a.
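In isolation, a lifetime annotation like that looks something like this (illustrative names, not the actual Tock code):

```rust
/// The 'a says: the borrowed buffer must live at least as long as the
/// struct that holds it, and the compiler checks that statically.
struct Transmitter<'a> {
    buffer: &'a [u8],
}

impl<'a> Transmitter<'a> {
    fn new(buffer: &'a [u8]) -> Self {
        Transmitter { buffer }
    }

    fn len(&self) -> usize {
        self.buffer.len()
    }
}
```

In real driver code every struct, impl block, and function signature that touches the buffer repeats those annotations, which is where the clutter comes from.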
And so, it's a way of telling the compiler that as long as our struct exists for this long, the buffer will also exist for this long, and then the client will exist for this long. So, it makes sense, and when you learn Rust you slowly come to understand it, but it's a hard thing to understand coming from C, and it just looks clunky and long; there are other examples scrolling off the page because there are just so many lifetimes. I don't really have an answer or a specific complaint, it's just kind of annoying. But one thing you really do have to be careful of with embedded Rust is hidden panics. Rust has this concept of panics. It's like an assert: you can say, panic, unsupported feature. In embedded, you don't use them much, maybe only for something fully unrecoverable. Say you get a hard fault and you don't know what to do, you can't really keep going, so you panic: print some information and exit. If you explicitly call the macro named panic, that's fine, right? You understand what's going on, you know what's going to happen. But Rust will also insert hidden ones. An example is an array access. If you iterate through a buffer with buffer[i] for i equals zero, one, two, three, and so on, then on each of those accesses, Rust is going to check how long the array is and that your access is inside it. If it's outside, Rust will panic. And if you're writing an embedded kernel, it's not great that user space can send you a five-byte buffer, tell you it's 10 bytes, and have you iterate through it. Rust won't let you get a vulnerability there, because it will panic, but it panics and your whole kernel dies. That's not great for reliability, and in embedded it also costs a huge amount of code space, because a panic carries a string.
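The hidden bounds-check panic and the recoverable alternative can be sketched like this (my own example):

```rust
fn main() {
    let buf = [10u8, 20, 30, 40, 50];
    let i = 10; // out-of-bounds index, as if user space lied about the length

    // `buf[i]` here would compile fine but panic at runtime: a hidden
    // bounds check, plus a panic message string stored in the binary.
    // `.get` makes the check explicit and recoverable instead:
    match buf.get(i) {
        Some(v) => println!("byte: {}", v),
        None => println!("index {} out of range", i),
    }
}
```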
And so, now you have that string stored in your binary. And it's not clear or obvious, when you start working with Rust, that it's going to do this to you. You don't have to do it this way: there's a method called get, and you can write buffer.get(i). That won't panic; it returns an Option, and you can check it. You won't get a bug, and you'll handle the error. If you want to panic, you can still panic, but maybe you just return an error to the application or something. The problem is there's no easy way to catch hidden panics once you've built the binary. This comes up a bit with the Rust-inside-Linux work too, because it's the same thing: no one wants the Linux kernel to crash because user space did something weird. There are no tools for this at the moment, but the hope is that in the future there will be a way to print warnings or errors when something that isn't obviously a panic is going to generate one. So, that's something to keep an eye out for; it's not obvious when you start working with Rust. There's also some overhead from things like dynamic dispatch, which is a vtable mechanism; if anyone knows C++, C++ has the same kind of thing. You just have to be careful about how you use it. There are ways to write Rust where you don't get it, but then the code is normally more complex, confusing, and harder to follow. I won't go into it too much, but if anyone's interested, there's an issue discussing all the trade-offs. And the other thing is virtual function elimination. I don't know if anyone knows C++, but it has the same problem, and again it's to do with vtables. In C++ it comes from the object-oriented side; in Rust, it comes from the traits I talked about before and the way we implement traits. And it becomes really hard for the linker to remove unused code.
And so, we've seen examples of this: in your debug build you have a really helpful function that, if there's an error, dumps all the CPU state information and prints it all out. That's really useful, but it takes up a lot of code size, because of all those strings. Then you build in release mode and you don't want it, so you just never call the function. But at the moment, Rust's LTO is not good enough to remove those functions, so you'll have functions in the final binary that take up a lot of space and are never used. For now, we have to use cfg attributes, which are like #ifdefs in C, to compile them out. It would be nice if the Rust compiler got smarter and could remove them automatically. So that's it from me. Are there any questions or thoughts? Oh yeah, I just have to read out the question when you ask. [Audience] I think there is work on panics that are recoverable; there might be an RFC for that. Okay, so I just have to repeat it for the recording: the statement was that there might be work on panics that are recoverable. I guess it depends, though, because there are some things, especially in embedded, you can't recover from, right? If you get a bus fault, what do you do? You just have to give up and say that's an error, maybe reboot. So I think there are different levels of panics, and it would be nice if we could turn them on and off, which things like Zig let you do: you can turn debug mode on and off. [Audience] There actually is work on that. Okay, that's good, so you're saying there is work in Rust for that. [Audience question about build times] So the question was, what are the compilation times compared to C? Yeah, I hear a lot of people complaining about Rust compilation times, but our final binary is very small.
It's an embedded application, so it doesn't bother me. I think the compilation time complaints come more from building, say, a web server in Rust that pulls in a hundred different dependencies and crates, building all of them and then the final giant binary; that can be slow. But it normally builds in under 20 seconds for me. And it does cache, so it depends what you edit, but I don't think it's very slow in embedded. [Audience] In a kernel, a kernel panic would raise an exception, like a BUG() in the Linux kernel, right? So it's unrecoverable because you've exited your task and you don't know what to do. Is that correct? Yeah, okay. So, trying to repeat the question: it was about panics, and what happens when a user space abort goes into the kernel. My point was: if user space panics, I don't care; that's user space's problem. If an application panics, it can tell the kernel, it doesn't have to, but it can, and the kernel can say, okay, you panicked, I'll restart you, or I'll just kill you, it doesn't matter. And that doesn't affect reliability: if one application panics, the others keep running. The problem is if the kernel panics, like with that buffer. If you have one of those buffer[i] accesses, and user space passed the buffer in saying it's 10 bytes when it's only five, then the kernel will panic on accessing it. [Audience] The panic would be a synchronous one, right? Well, yeah, Rust won't actually do an invalid access. Rust will do a bounds check, and that will call the panic handler, so the hardware won't trigger an exception; there's no actual overflow happening in Rust. So, for example, in Tock, right?
So the panic handler is implemented in talk and it will just exit and print UART, and like dump as much state as it can over the UART. So you could, I mean, you can try and handle it, but you generally don't have unwinding, which is another thing Rust has. But you don't have unwinding and embedded because there's nothing really to unwind and code size is important too. So, yeah, so in Rust, I guess the question was in debug mode, the overflow would cause a panic. In Rust, there's not really in some, in a lot of that stuff, there's no difference between debug and release. So a release mode build will also panic if you overflow that buffer. What do you mean that the overflow? It checks the overflow in runtime, yes. Okay, so the comment was, oh, wow, I'm worried about the overhead. Yes, but I think there's two ways to look at it because if you aren't checking the length, size of your buffers and C at runtime that you're getting from a user space application, you're also doing something really bad. So, Rust does the check for you, but you also do the check and see anyway, right? Like, oh, the integer overflow. There are different, there's ways to do unchecked and checked integer overflow. You can do whichever one you want, depending on your situation. Yeah. Yeah, so the question was, if you do the check beforehand, is the compiler smart enough to know that you checked and not do a check? Yes, so there's a few things you can do. There are unchecked ways to access a buffer, so if you, I don't think it's a good idea, but you could do the check manually and then do the unchecked ways and just ignore the checks. But in the experience, I've seen if you do a check, the compiler will pick up that it's already checked and it won't double check again. So, you can just use dot get, which is the good way to get them, like index into array for example, and if you did a length check beforehand, the compiler will remove all the checks and the actual generated code. 
So, you lose all the overhead of accessing it. Yeah. That's a good question. The question was: I've been talking about Tock, is it used anywhere and what's its status? Tock is an open source project. It wasn't started by Western Digital or anything; it's an existing open source project with lots of interest at the moment. Lots of people are looking at it. The Google OpenTitan project, which is an open-silicon root of trust, is using Tock and developing with it. There are lots of academics, and Western Digital is also contributing publicly. [Audience] You mean in production? No, I don't know if it's in production anywhere. [Audience] You said that drivers use unsafe, but is that really needed? Couldn't you, like for MMIO, make a crate that handles the MMIO, and then the drivers use that crate? Yeah. So, the question was: drivers use unsafe, is that really needed? Can't you use a crate that does the MMIO access and not expose the unsafe? And yes, that is what we do. The actual driver generally doesn't have the unsafe keyword in it; there's a wrapper below that does the unsafe accesses, and the driver itself doesn't have unsafe. But the idea is that a driver is still allowed to use it, because sometimes, like with DMA, it comes up every now and then that a driver needs to do something special. Sometimes there's a special config register you need to hit before you touch the other ones, and it's easier just to do a raw pointer access. So, generally no, it's not a good idea, but we allow it. The capsules, on the other hand, actually have a Rust language feature that says we forbid unsafe, so you can't use it there. I mean, you can delete that line if you're debugging, but in actual committed code you cannot turn it off. So, any more questions? [Audience] Does Rust have any approach towards built-in functions like the ones known in LLVM or GCC, or what have you?
The question was: does Rust have any built-ins, like the built-in functions in LLVM or GCC, the GNU extensions, that type of thing? I guess that's a good question; I don't know. At the moment, Rust only uses LLVM, although there's a lot of work on a GCC front end as well. They probably use the intrinsics internally, but I don't know if they're exposed the way they would be in Clang. The Rust language just is what it is: if it's in there, it's part of the language, and if it's not, it's not; there are no special extensions you add. They might be using the LLVM extensions under the hood, though, I'm not sure. Okay. So, if there are no more questions, then that's the end of my talk.