Our next speaker is Jamie Sharp, and it's about rocket science, almost literally, because he worked on code for an open source rocket. I got scared a bit, so I'm going to be serious for now. He also created a website to connect people with online comics that has over 4 million pages in almost 500,000 comics. 50,000, sorry. I told you, serious stuff here. Once he fixed an e-mail bug that had existed since he was three years old. Well, that's an impressive thing. And he's a Nintendo contractor who just wants to make computers reliable and safe for us. Welcome, Jamie.
Hopefully I turned this on all right. OK. So I'm going to talk to you about translating C source code to Rust. But first, there is a bit of a tradition in cybersecurity talks, where first I come and try to scare you all. I point out there are all these SCADA systems that are controlling flood control dams or the electric grid or whatever. And they're all computerized. And this should be absolutely terrifying, because we depend on all this critical infrastructure, up to and including our elections. So cybersecurity is important.
I'm here in the Mozilla Dev Room. The particular approach that I want to talk about cybersecurity from is, of course, Rust. Rust is an interesting tool for trying to address cybersecurity issues, among others. For one thing, there are the memory safety features of the language: Rust, through its type system, addresses many common security vulnerabilities, buffer overruns, all these sorts of things. Rust is also an interesting tool in terms of enabling better programming language techniques. It's as simple as having a proper tagged union type in the language. It can let you structure your code in a way that makes it easier to maintain and easier to get right. So there's a lot of people who are kind of excited about Rust. Many of them are probably in this room.
And there's a bunch of people who are trying, you know, embedding Rust in existing projects or translating parts of existing projects to Rust. Firefox is almost a boring example, since the language was practically invented for the purpose of doing Firefox better. But more exciting for us, a month ago as of today, I think, librsvg shipped their first bit of Rust. And this is a piece of software that, you know, you may be running on your laptop now. It's part of the GNOME folks' collection of libraries. And the big reasons why the librsvg folks were interested in doing this included having a better way of doing testing. Unit testing is a giant pain if you're working in C. So it's just some of these basic engineering things: the Rust community is really focused on usability of the language and these broader software engineering issues.
There's other projects that have been experimenting with migrating parts of their existing C code to Rust, like Emacs being incrementally ported to Rust, because we didn't have enough operating systems projects in Rust yet. There's rusl, the experimental port of the musl C library that was done as an educational exercise just to see how it would go. There's folks rewriting coreutils. There's a bunch of these projects, right? You can dig through Reddit or whatever and find a long list.
So maybe what we should do is try to rewrite everything that's in C. This is clearly a great language. Let's just rewrite everything in Rust. Except, if you stop and actually think about it, maybe this is not such a great plan. In small projects, we've got maybe thousands of lines of C, ranging up to bigger projects with millions of lines of C. If you go and try to hand rewrite all of that code, you're not going to get it right. You're going to introduce new bugs that were never in the C version. And at best, even if you did get it right, it's still going to be a super tedious process. You don't want to do this.
So maybe the answer is let's not do this by hand. Maybe the answer is we should automate this process of translating C to Rust. So I'd like to tell you about this project I've been working on for a little less than a year now called Corrode, which is all about automating the process of translating C code to Rust.
So, some particular principles that Corrode follows. Corrode aims to preserve safety and correctness. Given that we're starting from C source, which is not particularly safe and often not particularly correct, preserving its existing level of safety and correctness is not a very high bar. So the output of Corrode is not safe Rust. It is Rust in the unsafe subset of the language. That means that it might crash in spite of the usual promises that Rust provides around memory safety. But, you know, it is exactly as safe as the input was. And another important point regarding safety and correctness is that people always ask me, what do you do about undefined behavior in the C source that you're translating? And the answer is that I try, whenever I reasonably can, to pick an unsurprising definition for undefined behavior. Given that Corrode is acting as the C compiler, undefined behavior in C means that the C compiler gets to do whatever it wants to. And so I try to pick what I'm doing as something that will be least surprising to the programmer.
Another principle of Corrode is preserving maintainability to the extent possible. So if you started with code that you could actually maintain, then the output of Corrode should be Rust that you can actually maintain. If you started with utter crap, I can't help you. So in order to accomplish that, a big part of how Corrode works, or a big part of its goals, is to preserve variable names, preserve the structure of the code as much as possible, preserve as much as possible about how the program was originally written. And finally, Corrode aims to preserve ABI.
So if you run Corrode on a piece of C source, the output Rust should still link against other C code that the original C code was able to link against. It should be compatible at the level of binary linking. So all of these together enable, ideally, Corrode to be used as a way of making Rust be a drop-in replacement for as much or as little of the existing C source that you're starting with.
A key part of the idea here, and the reason why I don't feel like it's a problem that Corrode doesn't try to do anything clever about making code more safe than the original or more correct somehow, is that it is easier to refactor safe Rust from unsafe Rust than it is to refactor safe Rust from C. If you've got a tool that does a lot of the work for you, all that's left is to take this unsafe Rust and clean it up some and make it be idiomatic Rust and make it use safe borrows and all these other things. That is a much smaller problem, a much smaller amount of work that you have to do by hand than what people are doing today. So the whole point of this is to enable incremental manual improvement. The fun way to do this, for instance, is run Corrode, get a giant pile of Rust, and then say, you know what, this function, this one function has been pissing me off for years. I am so excited that I finally get to rewrite it. That's the one I'm gonna rewrite in Rust first. Or maybe you wanna pick a function that is the most safety critical or something. Whatever approach you wanna take, you can do it incrementally by using Corrode.
So all right, this is good. We have some tool that gets... (Can we hold questions to the end? Sorry. All right, thanks.) Some tool that will automatically get us from C to Rust. So now we can go translate everything, right? Well, should we? Is this a good idea? I'm gonna suggest some principles here. First of all, if you just want to translate something from C to Rust in order to learn something, by all means, absolutely do this.
It is an excellent exercise, you will learn a ton. I highly recommend it. I have certainly learned a ton just building Corrode. But if you want people to actually use your translated Rust, I would like to recommend three principles.
First, can you demonstrate that the Rust is equivalent to the original C? There are a few different ways that you might do that demonstration. One would be that, assuming Corrode is trustworthy, if you can say, well, I used Corrode, then yes, that's all you need to prove, right? These must be equivalent because Corrode did all the work. Corrode is not yet trustworthy, but I'm working on it. Another approach might be if you're working with a project that has a test suite. Assuming that it passed the test suite before, if it still passes the test suite after switching to Rust, that's pretty good evidence, right?
Another principle is asking whether Rust's advantages are actually going to help with the kinds of bugs the project faces. So if you're looking at a project that has a significant component where it's interacting with the network, interacting with untrusted data, things like that, that might be a really strong argument for why switching to Rust is a good idea. If it's just something where you don't deal with untrusted data at all and you're barely touching memory or whatever, maybe Rust's advantages are not so helpful to you.
Finally, of course, is the project community actually interested in accepting patches that switch to Rust? So if I go off and throw a Mesa patch at these guys sitting in the front row, are they going to actually take it? I'm seeing a lot of head shakes there. So don't do that, right? Don't waste your time on patches that are not actually going to get accepted.
I'm going to switch gears a little here and talk about a case study I did of actually trying this approach. What happens if we go and try to automatically convert a giant pile of code to Rust?
What I was looking for when I went to pick a project to do this case study on was, first of all, something that was an open source project but is not actually maintained. Because if I'm going to write a bunch of patches, I didn't actually want to think about whether anyone was going to take them. If it's unmaintained, I don't have to think about that at all. Nobody's going to take them no matter what, so it's fine. I also wanted projects that are written in C, because Corrode only works with C. Projects with security implications, because otherwise it's not terribly interesting to switch to Rust, or not as interesting. And projects that are still in use, so someone might actually get some value out of this. So this is kind of a great position to be in, right? It's unmaintained. People don't care about it enough to maintain it, but they care about it enough to use it.
So what project did I pick? The Concurrent Versions System. CVS. There's a bunch of source code out there that's still in CVS. Despite the fact that everyone has moved to Git or Mercurial or whatever, there's still 6% of Debian users with CVS installed according to the Debian Popularity Contest. Because there's all this source code out there where this is the only way you can get at it. CVS does rely, sorry, is usually used unauthenticated and unencrypted, so if there are remote code execution vulnerabilities, they're that much easier to actually exploit. And yet, despite the fact that this is in use and has security implications, the last release of it was in 2008.
It's also not a trivial code base, right? If I build a tool that can translate 50 lines of C, that's not terribly exciting, right? But if I can translate 50,000 lines of C, well, it's not a huge code base, but that's not trivial. It's an interesting exercise. CVS also relies on many corners of C that other projects may have managed to avoid.
The original release of it, the original C release anyway, was from 1989. So there are still K&R-style function prototypes and things like that. That was kind of exciting to implement in Corrode. So the idea is, if we can translate this, it seems like we can probably translate all sorts of things. As a nice bonus, CVS does have an extensive test suite. So that's turned out to be really useful.
So let me tell you about where I've gotten so far on this case study. 6.4% of the source lines of code I've translated to Rust, which doesn't sound like much, but basically I went for the things that I could do in a couple of days and the rest of my time was spent fixing Corrode bugs. So that's over 3,000 lines of code in the sloccount sense, so not counting blank lines and comments and that sort of thing. And that's 10 out of the 68 source files that are the core of CVS.
There's a bunch of reasons why that number, that 6.4%, is still fairly small. Corrode is absolutely a work-in-progress tool. The biggest issue is that control flow is hard. Rust does not have goto statements or C-style switch statements. And the number of wacky things you can do in C, let me just say Duff's device, for example, that you can't directly express in Rust means that getting the translation of arbitrary C control flow to Rust is tricky. That's the biggest thing that I'm working on right now, actually. There's also other things, you know, there's a bunch of different little corners of the C specification that I just haven't gotten around to implementing yet in Corrode. There are a couple of features in C that don't have any direct correspondence in Rust, notably variable length arrays or bit-fields in structs, at least in terms of being ABI compatible.
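To make the control flow point concrete, here is a minimal hand-written sketch of one way a C switch with fallthrough can be re-expressed in Rust. The function `classify` and its C original are hypothetical, invented for illustration; this is not Corrode's actual output or algorithm, only an example of the kind of restructuring a translator has to find.

```rust
// Hypothetical C original, shown for comparison:
//
//     int classify(int n) {
//         int score = 0;
//         switch (n) {
//         case 0: score += 1;   /* falls through into case 1 */
//         case 1: score += 10; break;
//         default: score = -1;
//         }
//         return score;
//     }
//
// Rust's `match` arms never fall through, so the shared tail of
// cases 0 and 1 has to be expressed explicitly.
fn classify(n: i32) -> i32 {
    let mut score = 0;
    match n {
        0 | 1 => {
            // Only case 0 runs the first statement...
            if n == 0 {
                score += 1;
            }
            // ...but both cases run the statement they share.
            score += 10;
        }
        _ => score = -1,
    }
    score
}

fn main() {
    for n in [0, 1, 7] {
        println!("classify({}) = {}", n, classify(n));
    }
}
```

This works for a simple fallthrough chain. Arbitrary goto graphs, or a switch interleaved with a loop as in Duff's device, need a more general restructuring than this pattern gives you.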
The other big issue: I mentioned, you know, Corrode should preserve maintainability, but unfortunately, in order to quickly get started on Corrode, I took an approach where it looks at only pre-processed C source, which means by the time Corrode sees it, all of the comments are gone and all of the macros are fully expanded. So you've lost the comments, and the macros mean that if you used macros in a significant way, the generated code looks like crap. So there's a bunch of stuff to be done still, right?
But I can at least demonstrate that I can use this partially rustified version of CVS to clone the CVS source tree from CVS. So that's cool. Maybe that's not very convincing, right? Maybe more convincing is that the CVS test suite, which takes like an hour to run, actually still passes. I'm gonna pause for applause here.
So I talked about the stuff that Corrode still needs, but let's assume that's done, right? It's just a matter of programming. We'll get there. What should happen on top of that? The one big thing is Corrode makes no effort to produce idiomatic Rust. So for example, if I write this C for loop, just a standard C idiom of I wanna make i go from zero to 10, not counting 10. The idiomatic Rust version of this is for i in 0..10, right? The Corrode-generated version of this is not nearly so pleasant. There's a bunch of details in here about how C is specified to deal with unsigned wrap and all sorts of things, right? Ideally, there would be a tool that would recognize this kind of non-idiomatic Rust and suggest maybe you wanted to write it the other way.
There are other sorts of things that it would be nice to have tools for. Every time you have a pointer in your C source, Corrode generates Rust that uses raw pointers. It would be nice to be able to recognize, in an at least partially automated way, that those raw pointers are being used in such a way that you can replace them with safe Rust borrow types. This is non-trivial.
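The two gaps described above, the C-shaped loop versus for i in 0..10, and raw pointers versus safe borrows, can be sketched side by side. The code below is hand-written for illustration and is not Corrode's actual output. All function names are hypothetical, and using wrapping arithmetic for the signed sum is an assumption, one possible "unsurprising" choice for what C leaves undefined.

```rust
// C-shaped loop, roughly what a mechanical translation of
//     for (i = 0; i < 10; i++) sum += i;
// might look like with unsigned operands: explicit counter,
// `while`, and wrapping arithmetic to mirror C's unsigned wraparound.
fn sum_c_style() -> u32 {
    let mut sum: u32 = 0;
    let mut i: u32 = 0;
    while i < 10 {
        sum = sum.wrapping_add(i);
        i = i.wrapping_add(1);
    }
    sum
}

// The idiomatic form: a range-based `for` loop.
fn sum_idiomatic() -> u32 {
    let mut sum = 0;
    for i in 0..10 {
        sum += i;
    }
    sum
}

// Raw-pointer version, mirroring a hypothetical C signature
//     int total(const int *p, size_t n);
// The caller must guarantee that `p` points to `n` readable ints.
unsafe fn total_raw(p: *const i32, n: usize) -> i32 {
    let mut acc: i32 = 0;
    let mut i: usize = 0;
    while i < n {
        // Assumption: wrap on overflow (C leaves signed overflow undefined).
        acc = acc.wrapping_add(unsafe { *p.add(i) });
        i += 1;
    }
    acc
}

// Hand-refactored safe version: the pointer/length pair becomes a
// borrowed slice, so the bounds are carried by the type.
fn total_slice(xs: &[i32]) -> i32 {
    xs.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}

fn main() {
    let data = [1, 2, 3, 4];
    // The unsafe call is sound here because `data` really has `len()` elements.
    let raw = unsafe { total_raw(data.as_ptr(), data.len()) };
    println!("{} {}", sum_c_style(), sum_idiomatic());
    println!("{} {}", raw, total_slice(&data));
}
```

In both pairs the behavior is the same and only the shape of the code changes. That is what makes a lint-style tool that recognizes the first form and suggests the second plausible.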
If anyone's looking for a PhD thesis project or something, let me know. But the key thing I wanna say about this is that there's already a bunch of tools that Rust developers can use. Notably Clippy, which you should definitely check out if you haven't seen it, which is in the category of lint tools for checking different kinds of code style issues. And rustfix, which is a tool for trying to automatically apply the fixes for you. So, improving those tools to recognize these kinds of non-idiomatic Rust and ideally suggest fixes for them. Improving those tools is useful for all Rust developers, not just people using Corrode. So I think this is gonna be valuable work for the community as a whole.
To wrap up here, Corrode is already proving the concept of making it feasible to incrementally migrate from C to Rust. There's a lot that it can already handle. I think, with the control flow stuff taken care of, it's gonna be very close to handling most of the C source that's out there. But there's plenty left to do in terms of making the resulting generated Rust source better. I wanna particularly thank Don Marty of Mozilla and the Mozilla Open Innovation team for supporting my work on Rust the last quarter of last year. And a whole bunch of people who've made contributions to Corrode. Feel free to grab Corrode from this GitHub repo, check out my blog for a bunch of detailed posts, and follow me on Twitter if you like. And with that, I think I have a couple minutes for questions.