You can hear us? Okay, excellent. Thank you for coming today. I'm Chelsea, and this is isis, a colleague and friend. Today we're going to be talking to you about integrating Rust into Tor: some of the successes we had, and also the challenges and the things that we learned.

Hi, I'm isis. I do cryptographic design and implementation, often in Rust, and I also do security and privacy engineering. I worked for the Tor Project from 2010 until last month, July 2018, and I very proudly no longer work there. If you want to know more about that, you can talk to me after.

And I'm Chelsea. I've worked on distributed systems and applied cryptography, and I do research, design, and implementation at the intersection of distributed systems and cryptography.

We wanted to start off this talk by thanking the people who helped us get to where we are, with both the material and the implementations we've done. First, we want to thank the members of the Tor network team who provided feedback on these issues and contributed to this effort: Tim Wilson-Brown, Nick Mathewson, Taylor Yu, David Goulet, George Kadianakis, Alexander Færøy, and Mike Perry. We also want to thank a whole bunch of Rust people who worked with us through challenges, answered all our questions, and were in general super helpful and kind, including Alex Payton, Manish Goregaokar, Nika Layzell, withoutboats, Steve Klabnik, Alexis Beingessner, Patrick Walton, and many others.

Here's what we'll be talking about today: a quick introduction to what Tor is, then how we started on this effort, where we are currently, where we're going, what we've learned from all of this work, and finally what we hope to see. So, a very brief introduction: what is Tor?
So Tor is an anonymity tool that allows people to use the internet anonymously. Tor runs on a client machine, and there's also a network of what we call relays that makes up the Tor network. Essentially, client traffic is routed through the Tor network, which provides anonymity for that request to the end server. It can also provide anonymity to the end server as well. So Tor is both the client and server software, plus all of the relays that make up the intermediate network.

With that said, little-t tor is something we distinguish: it's the core code base for this anonymity software. It's important to note that little-t tor has been around since 2002, so it's a very old code base, and as you can see here, it's about 300,000 lines of C. There's a lot of C code, and it's gone through a lot of evolution, with all the challenges you might expect from something like that. But it's interesting to see that Rust is now the second most-used language in tor, and all of this has come about in the last year.

Okay, so that was a very brief introduction. Now we're going to talk about how we started with Rust. I think it's interesting that we actually didn't start out saying "we're going to go to Rust." Rust was the end of a long conversation, and that conversation was: how do we start to use memory-safe languages in Tor? We really want to protect end users, and C is very challenging, very delicate and fragile. So as much as we could, we wanted to do things that would protect end users and help prevent bugs.

The way this actually started is that we had a meeting at one of our semi-annual developer conferences about how to begin moving to memory-safe languages. When we were talking about this, we identified some goals. One of them was to do no harm.
We didn't want the code we deployed to be a liability to the user; we wanted to have confidence in what we shipped. Other goals were developer friendliness and productivity: we wanted to make sure it didn't take us a year or a really long time to deploy some of these things, we wanted to get up and running quickly. And cross-platform compatibility. So these are some of the things we talked about.

And I think it's amazing. I don't know if you've ever been in a group of developers debating what the best language to pick is; consensus is usually never achieved, and there are a lot of really strong differing opinions. But what was amazing is that after we did this, Rust emerged as pretty much the language everyone said we should experiment and prototype with. At the end, we were all looking at each other like, are we all agreeing? Is this really happening? We had all agreed never to say the name of any language at the beginning of this discussion. Then someone said "Rust," and then someone else said "Rust," and then everyone was like, wait, are we just saying "Rust" over and over? We saw this as a very good sign, and we said, let's try this out as quickly as possible before we scatter and start disagreeing again. It was a really special moment, I think.

With that said, we had a lot of questions. This was a new language for many of us; isis had done quite a bit of Rust before outside of Tor, but a lot of us hadn't. So we identified where we were at that moment, and the kinds of things we would need to find out before we could say "yes, Rust is a first-class supported language" or "no, this isn't going to work for us." We essentially laid out a timeline and started to think about all the things we needed to know, identify, and try before we could say: okay, this is a good idea.
Let's go for it. We had some critical questions that we didn't know the answers to at the time. One critical question: how could we integrate Rust into the Tor build system? This code base has been around since 2002; there's a lot of old tooling and a lot of complexity, and we didn't want to have to rewrite our entire build system. So one thing we wanted to know was how to just drop in and start integrating Rust right away.

Another question: what is the overhead to implement existing or new submodules in Rust? We didn't want a rewrite-the-world approach; we really couldn't do that. So we were thinking, how much work is this going to be? Do we have to reimplement everything? Is this going to be really hard?

Another question: is Rust supported on the platforms that Tor supports? Diversity is a good thing in the Tor network; we don't want a single bug on one operating system to take out the entire network. So we wanted to make sure we could continue to support this ecosystem diversity.

And finally: can we reproducibly build Tor with Rust enabled? This is something we still need to investigate, but reproducibility is really important to us, both for user confidence in the binary being distributed and for our own ability to reproduce bugs that have been reported.

Okay, so that was how we started and the questions we had at the time. Now we're going to talk about where we are with this effort and some of the work that has been going on. One of the first things we started with was an experimental submodule rewrite. The question we wanted to answer was: can we rewrite an existing submodule with little overhead or code change? Basically, can we take something that's already there, port it to Rust, and see what happens?
What we did is we chose one submodule with limited dependencies and a simple interface. isis is going to talk more about the consequences after we did this, and about stabilizing it and bringing it to production. The reason we chose what we did at the time was minimal impact: we thought that with a simpler dependency graph we could actually do this rewrite. And we were successful; I think we didn't have to change any of the calling C code when we did this. So it was a nice validation that, with what we have, we can actually slot something else in.

The overall takeaway, though, is that refactoring would have been really helpful before this kind of porting exercise. The experiment was useful, and we did get something in, but in the future we really want to refactor our C code and make things as simple and as un-complex as possible before we port more things to Rust. One thing we're doing right now is a modularization effort. This is all on the C side, but we feel this modularization will enable moving more functionality to Rust in the future, with fewer challenges managing complexity on both sides, which we'll go into more as well.

One of the issues we ran into, once we had redone one of our modules in Rust, was linking issues with tests. The problem is that building C code and Rust code as static libraries using the same sanitizer (for example UBSan or ASan) doesn't currently have a configurable way to pass the same sanitizer options to the linker, which causes problems for unit test code where the Rust code is actually calling C code. For doc tests we need a similar way to pass arguments to the C linker when the Rust code in a doc test is wrapping some C somewhere.
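One way to dodge these link failures in tests is to write the higher-level code against a trait and substitute a deterministic pure-Rust implementation in the test build, so no C objects (and no sanitizer runtimes) get linked in. This is only a hypothetical sketch with invented names, not tor's actual code:

```rust
// Hypothetical sketch: higher-level code depends on a Randomness trait,
// not on the FFI wrapper directly, so the test build can swap in a
// pure-Rust implementation and never link the C library.

pub trait Randomness {
    fn fill(&mut self, buf: &mut [u8]);
}

/// Deterministic, test-only stand-in (xorshift64); no C linkage needed.
/// In real code, a type wrapping OpenSSL's PRNG over FFI would be
/// selected for non-test builds, e.g. with #[cfg(not(test))].
pub struct StubRng {
    state: u64,
}

impl StubRng {
    pub fn new(seed: u64) -> Self {
        StubRng { state: seed.max(1) } // xorshift must not start at zero
    }
}

impl Randomness for StubRng {
    fn fill(&mut self, buf: &mut [u8]) {
        for b in buf.iter_mut() {
            self.state ^= self.state << 13;
            self.state ^= self.state >> 7;
            self.state ^= self.state << 17;
            *b = self.state as u8;
        }
    }
}

/// Example of "higher-level cryptographic code": generic over the trait,
/// so it compiles against either implementation.
pub fn random_nonce<R: Randomness>(rng: &mut R) -> [u8; 16] {
    let mut nonce = [0u8; 16];
    rng.fill(&mut nonce);
    nonce
}
```

The test binary then exercises `random_nonce` with `StubRng` and never touches the C wrapper at all.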
Originally I thought this meant we would need a multi-stage build process, because we had Rust calling C and C calling Rust, back and forth. So maybe if we refactored everything into smaller static libraries and then linked them all together at the end, it would be okay; but that still didn't fix the issue with ASan. This has resulted in us stubbing out test versions of Rust code which are pure Rust, essentially re-implementations of code that wraps C code. For example, I wrapped our usage of OpenSSL's PRNG and hash digests with code implementing the rand Rng trait and the Digest trait, but we then had to substitute pure-Rust implementations during testing to avoid the linker errors, because the higher-level cryptographic implementations, like signatures, which need to get a hash or get randomness from the RNG, were calling this code that wrapped OpenSSL, and then the linker issues came back again. So this is one of the hard problems we've hit, and we're not quite sure what to do yet. It's the ticket no one wants to touch.

Seeing as that was such a wonderful thing to go through, we thought we would go further and, as Chelsea mentioned, rewrite this module: not just wrapping the C code, but maintaining two implementations of a bitwise and behaviorally identical binary parser, in both C and Rust, at the same time. This was largely because we weren't quite ready to say "we're doing Rust, we can write Rust and ditch the C." We wanted to do Rust but have it side by side, so that you could compile with either one. It turns out to be a really, really bad idea. A very bad idea. Don't do this. Really, don't do this. In maintaining this behaviorally identical binary parser, we found bugs in both implementations. Lots of bugs. Some of the bugs got CVE numbers assigned to them. Bugs, bugs. Unicode bugs.
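Hunting those divergences by hand amounts to manual differential testing. A toy sketch of the property involved (both "implementations" here are stand-ins written for this example; in tor, one side would call the real C parser over FFI, and a fuzzer would supply the inputs):

```rust
// Toy differential check: both "implementations" parse a comma-separated
// version list. parse_versions_c is a stand-in for an FFI call into the
// real C parser.

fn parse_versions_rust(input: &str) -> Result<Vec<u32>, ()> {
    input
        .split(',')
        .map(|s| s.trim().parse::<u32>().map_err(|_| ()))
        .collect()
}

fn parse_versions_c(input: &str) -> Result<Vec<u32>, ()> {
    // stand-in: pretend this crosses the FFI into the C implementation
    input
        .split(',')
        .map(|s| s.trim().parse::<u32>().map_err(|_| ()))
        .collect()
}

/// The property a differential fuzzer would assert: identical output on
/// identical input, *including* agreeing on failures ("5,,6" must be
/// rejected by both, not parsed by one and rejected by the other).
pub fn implementations_agree(input: &str) -> bool {
    parse_versions_rust(input) == parse_versions_c(input)
}
```

Writing hundreds of such cases by hand is exactly the pain the speakers describe; driving `implementations_agree` from a fuzzer would automate it.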
It's a really good way to drive your developers right up the wall, as they basically manually fuzz for edge cases that produce different behaviors. In the context of an anonymity system, it's really important that if one person compiled with Rust and another compiled with the C, their behaviors aren't different. And this goes down to a really low level. For example, if the Rust version does more allocations, then if you can hand it a string for this parser to parse, it might expand and take up more memory than it would in the C version. So you might be able to learn things about what kind of machine someone is running, or what code they're running, and thus tell the difference between clients. We found some of the bugs: a memory exhaustion attack, two different remote crashes due to null pointer dereferences, and a DoS attack triggered by an infinite loop. So that was all great. It worked really well. It was lots of fun to write hundreds of unit tests.

And so we decided we'd do it with cryptographic implementations. This actually turned out to be okay. We already had support for several different implementations of the ed25519 signature scheme, and the way we did that was by registering the external code via structs containing function pointers. So to integrate ed25519-dalek, which is a pure-Rust implementation of the signature scheme, along with its underlying curve library, curve25519-dalek, all we had to do was create a file which presented the same interfaces as those defined in the function pointers. This was a lot easier to implement and test, and it should prove easier to maintain in the future. That's probably largely because, if you're writing cryptographic implementations, you have to match test vectors, and if you don't, something's horribly, horribly wrong with your math.
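A hypothetical miniature of that registration pattern: a `#[repr(C)]` struct of function pointers, with a Rust `extern "C"` function slotted in. The struct layout and names here are invented for illustration, and the "signature" is a toy XOR so the sketch stays self-contained; the real table entries would call into ed25519-dalek:

```rust
use std::os::raw::{c_int, c_uchar};

/// Hypothetical mirror of the C-side registration struct: an ed25519
/// backend is selected by filling in a table of function pointers.
#[repr(C)]
pub struct Ed25519Impl {
    pub sign: unsafe extern "C" fn(
        sig: *mut c_uchar,
        msg: *const c_uchar,
        msg_len: usize,
        seckey: *const c_uchar,
    ) -> c_int,
}

/// Rust function presenting the C ABI the table expects. The "signature"
/// is a toy (XOR with the key) purely so this sketch is runnable.
pub unsafe extern "C" fn rust_sign(
    sig: *mut c_uchar,
    msg: *const c_uchar,
    msg_len: usize,
    seckey: *const c_uchar,
) -> c_int {
    for i in 0..msg_len {
        *sig.add(i) = *msg.add(i) ^ *seckey.add(i % 32);
    }
    0 // success, in the C convention
}

/// The table the C code would receive and call through.
pub static RUST_ED25519: Ed25519Impl = Ed25519Impl { sign: rust_sign };
```

Because the C side only ever calls through the table, swapping backends needs no changes to the calling code, which is why this integration went so smoothly.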
Moving up the stack from bad ideas to better ideas, we then decided it would be okay to design a new feature in Tor and do it only in Rust. This work was done by our colleagues Tim Wilson-Brown and Nick Mathewson, and it was based on the work of our other colleagues, Rob Jansen and Aaron Johnson. As I said, it's the first time we integrated something into Tor that was Rust-only.

In cryptography, a differentially private system aims to maximize the accuracy of queries to a statistical database while minimizing the chances of identifying specific records. In Tor, PrivCount does this: it's aimed at safely gathering anonymized metrics on servers, in a manner that is resistant to the servers colluding with each other to influence the results or to learn things about the metrics being gathered. They do this by adding a bit of noise to the metrics they're gathering, and they use a homomorphism, when everything is put back together, to remove the blinding factor and retrieve metrics over the whole system, not about any particular user or server. And this uses Shamir's secret sharing for robustness, with the PrivEx algorithm developed by our colleagues and fellow researchers Tariq Elahi, George Danezis, and Ian Goldberg.

So now we're going to talk a bit about where we're going in the future, and more about what we've learned. I think it's important to note that Rust is a really great language and we really like it, especially for the safety it provides. It's when you use Rust across an FFI boundary that there's a lot of complexity, and that's where we found things challenging. One thing we're excited to see in a recent Rust release is the ability to use a global allocator. Before this, we were managing memory on both sides of the language boundary.
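To illustrate the blinding-and-aggregate idea just described, here is a heavily simplified sketch: the real system also adds calibrated noise for differential privacy and uses Shamir secret sharing for robustness, both of which are omitted, and the function names are invented. All that survives here is additive blinding, where shares look random individually and cancel on aggregation:

```rust
// Heavily simplified sketch of additive blinding: a counter is split
// into shares that look random individually and only reveal the true
// value when every share is summed (mod 2^64).

/// Split `value` into `n` shares summing to `value` mod 2^64.
/// `next_random` stands in for a cryptographically secure RNG.
pub fn share(value: u64, n: usize, mut next_random: impl FnMut() -> u64) -> Vec<u64> {
    assert!(n >= 1);
    let mut shares: Vec<u64> = (0..n - 1).map(|_| next_random()).collect();
    let partial = shares.iter().fold(0u64, |acc, s| acc.wrapping_add(*s));
    // the last share cancels the random blinding once everything is combined
    shares.push(value.wrapping_sub(partial));
    shares
}

/// Aggregation: wrapping addition of all shares recovers the total.
pub fn combine(shares: &[u64]) -> u64 {
    shares.iter().fold(0u64, |acc, s| acc.wrapping_add(*s))
}
```

No individual share reveals anything about the counter; only the full aggregate does, which is the property that protects individual relays' metrics.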
And we were trying to do this in a way that was less prone to developer error, but it was pretty challenging, and we had a lot of discussions about, you know, do you allocate in C and free in Rust, and when do you do this, and how do you do it safely. So this is something we're going to be moving to, and we're really glad to see this feature released.

Another thing we want in the future is some kind of automated mechanism to keep C and Rust shared types in sync. isis is going to talk more about this in a bit, but we've really tried to keep our Rust interfaces pure Rust and our C interfaces pure C, with a translation layer in between, and we do this largely for safety. What this means, though, is that we have duplicated types across the language boundary, and keeping those in sync has proven challenging; we're worried about bugs there in the future. So we've been looking at how to build some kind of automated mechanism to keep them in sync.

Okay, so that's where we started and where we're going, and now we're going to talk about a lot of lessons we learned along the way. As I was just mentioning, complex objects across FFI boundaries are difficult. We're doing this manually: if you look in our code base, there are comments where we say "this is coupled across the language boundary," and really it's just mental overhead to keep track of what is on both sides. This can be constants, structs; we do this with enums. We do it so that in Rust we don't have to have unsafe C-handling code everywhere, but it does introduce duplication and complexity across the boundary. And anywhere you have duplication without static checking is something to be worried about.

We also have extra copies when we convert types across the boundary. In Tor's C code there's a concept of a smartlist.
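The global allocator hook mentioned above looks like this in use. As a self-contained stand-in, this sketch wraps Rust's system allocator and counts bytes; tor's actual goal would be to delegate to the same C allocator the rest of the code base uses, so memory can safely be freed on either side of the boundary:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative use of the #[global_allocator] hook: every Rust
// allocation in the program is routed through this type.
pub static ALLOCATED_BYTES: AtomicUsize = AtomicUsize::new(0);

pub struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATED_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        System.alloc(layout) // tor would call the C side's allocator here
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;
```

Once both languages share one allocator, the "allocate in C, free in Rust" question stops being a correctness hazard.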
It's essentially a vec of arbitrary types, so when we pass it across the language boundary, we convert it into a pure Rust vector. We do this with strings right now, and it does introduce extra copies. When we start to do this with structs or other types (we haven't yet), we'll make extra copies to ensure safety, and that does have some overhead.

As an example of how we do this right now: essentially we have an enum on both sides of the language boundary, and since enums are represented as ordered integers, we can pass that integer through and then, on the Rust side, translate it into a Rust enum. Don't do this; this is bad. We've been thinking about how to do it in a way that's safe, so if you have any suggestions or lessons learned, please let us know. This was our early attempt at doing translation where we could keep Rust pure but still pass complex types across the language boundary.

All right, so another lesson we learned: it was really important to keep a very Rusty API separate from the FFI, and we didn't do this at first. We wrote a lot of very suspiciously C-looking Rust code. We were basically implementing the same behavior in this binary parser again, but then exporting the same interface as the C. That very C-like thing where you pass a pointer to a buffer as the first argument, and that's actually your return value, because you don't have a return type... yeah, it's just bad. It looked really bad, and we've slowly rewritten it a bunch of times, and now the Rust APIs are really good. We ended up defining coding standards which said that the pure Rust implementations needed to present intuitive, easy-to-use, Rusty APIs, and that the code for FFI and interfacing with C needed to live elsewhere, in a separate module.
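A sketch of a somewhat safer version of that integer-to-enum translation: mirror the C enum's discriminants by hand (the very duplication discussed above) but convert with an explicit match, never a transmute, so an unexpected value arriving from C becomes a recoverable error rather than undefined behavior. The enum name and variants here are illustrative, not tor's actual types:

```rust
use std::convert::TryFrom;
use std::os::raw::c_int;

/// Rust-side mirror of a hypothetical C enum. The discriminants must be
/// kept in sync with the C header by hand, which is exactly the
/// duplication being warned about.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConsensusFlavor {
    Ns = 0,
    Microdesc = 1,
}

impl TryFrom<c_int> for ConsensusFlavor {
    type Error = ();

    /// An explicit match, never a transmute: a stale or corrupt integer
    /// coming across the FFI becomes an error, not undefined behavior.
    fn try_from(raw: c_int) -> Result<Self, ()> {
        match raw {
            0 => Ok(ConsensusFlavor::Ns),
            1 => Ok(ConsensusFlavor::Microdesc),
            _ => Err(()),
        }
    }
}
```

The match still has to be updated whenever the C header changes, which is why an automated sync mechanism remains on the wish list.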
So we have crates for each C module that we've rewritten so far, and each one is supposed to have its code in its own module, plus a separate ffi.rs module, so that the FFI code is very distinct from the Rust API. This should work better moving forward, once we have more Rust code calling more Rust code, because then we can have types which nicely implement the traits that make them work well together; but it does seem slightly confusing to new contributors in the meantime. Our Rust coding standards also require that Rust-to-C FFI be kept separate from C-to-Rust FFI, and that our Rust-to-C FFI all live in one big crate, very far away from the other crates, to avoid code duplication. If I'm writing Rust code which is calling C, you don't want that FFI defined eight times in eight different crates, where someone makes a mistake in a type somewhere or calls the wrong function.

Another thing we learned is that it was better to write new features in Rust while also rewriting things in Rust, and if we had to do this again, we'd opt for writing new features while rewriting the old code. By "rewriting in Rust" here we mean replacing. Don't maintain two separate implementations of something, and if you have to, don't do it for any substantial amount of time. More specifically, maintaining these two bitwise and behaviorally identical binary parsers was bad, and it led to badness and sadness. Shipping new features in Rust only, as well as rewriting things, was awesome and great, especially making new features in Rust, and you should toss all your old C and technical debt out the window, for Ferris's sake.
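As a concrete (and entirely hypothetical) miniature of that layout: the Rusty API lives in its own module with ordinary types, while the C-facing symbol sits in an `ffi` module that only translates types and guards the boundary, including catching panics so they can never unwind across `extern "C"`. The function name and version policy are invented, not tor's real ones:

```rust
use std::os::raw::{c_int, c_uint};
use std::panic;

/// The pure, Rusty API: ordinary types, no raw pointers, free to panic
/// like normal Rust code.
pub mod version {
    pub fn is_supported(version: u32) -> bool {
        (2..=5).contains(&version) // illustrative policy only
    }
}

/// The C-facing wrappers, quarantined in their own module: they only
/// translate C types, catch any panic (unwinding across the FFI
/// boundary is undefined behavior), and delegate to the Rusty API.
pub mod ffi {
    use super::*;

    #[no_mangle]
    pub extern "C" fn protover_is_supported(version: c_uint) -> c_int {
        match panic::catch_unwind(|| super::version::is_supported(version)) {
            Ok(true) => 1,
            Ok(false) => 0,
            Err(_) => -1, // error sentinel for the C caller
        }
    }
}
```

Keeping the wrapper this thin means the interesting logic stays testable in pure Rust, and the unsafe surface area is confined to one place per crate.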
Another thing we found helpful, and thankfully we did this from the beginning (because, as Chelsea mentioned, I was somewhat familiar with Rust and some of my team was not), is that we started a running coding standards guide. It included helping developers find the documentation: here are the instructions for rustup, here's the Rust programming language book, here's the Nomicon. We walked people through the standards from the outset, pointed to them often when people were confused, and revised them as people were confused in new ways. We found it was important to give developers very clear, concise explanations of both what to do and what not to do, along with code snippets, and to explain the reasons why we made these choices.

Part of our code standards for safety has a bunch of examples (there's a link in a second that I'll show you), and it seeks to educate developers on things like: what undefined behavior is in Rust, and the fact that it still exists, because some people switching from C didn't realize this; avoiding unwinding across the FFI boundary; avoiding panics; maintaining type safety; avoiding unsafe and unwrap whenever you can, and documenting it when you can't; whitelisting the only C-ABI-compatible types allowed to cross the FFI boundary (that enum example back there is, I think, not okay according to our standards, but it still exists in there anyway); not abusing unsafe to muck around with lifetimes; how to avoid memory leaks, and C string usage; performing allocations to copy buffers across the FFI boundary, as Chelsea mentioned; avoiding enums if you can; not doing anything weird with floats; and a bunch more dos and don'ts. So if you want to look at these guides and they're useful for your project, feel free to take them, and if you have more guidelines you'd like to contribute, we'd be happy to take pull requests.

And with that said, there are a few things that we would hope to see
from Rust in the future. One of them: with this binary parser, Chelsea and I essentially sat there writing hundreds and hundreds of unit tests trying to find all the edge cases. If you put two commas in a row, what happens? Does a different thing happen in C? Does a different thing happen in Rust? If I say that I support version "5-5,6", does that do a different thing in Rust and a different thing in C? We had to write all of these edge cases by hand, and we kept thinking, this is what a fuzzer is for, why are we doing this? But we didn't really have an easy, go-to way to hook up AFL or anything, through cargo-fuzz or another similar project, and say: hey, please take the same fuzzing input, give it to both the C function and this Rust function, and then test that the output is exactly the same.

One thing I also wanted to note, which we experienced as we were going through this work, is the work of bringing along an entire team in this effort. Just to bring it back to what isis was saying about our coding standards: one thing we worked on a lot was working with our team and bringing people along as we were doing this.

As a side note, one thing we're hoping to see out of Rust as well is better FFI documentation. A lot of the coding standards isis was just talking about were things where, as we were going along and doing code review, we'd realize, oh, this is really unsafe across an FFI boundary. But one thing that wasn't immediately clear, when you go and learn how to write pure Rust, is how that actually differs from writing Rust that will be called across the FFI boundary, and what safety guarantees you have when you're writing pure Rust versus something that's called from C. We've done a lot of documentation in our coding standards about that type of thing, but differentiating the different worlds of pure Rust versus not would be helpful to people who are
just getting into this work in the future.

Another thing we ran into is that we tried to use bindgen in some cases, partly because our code was kind of a tangled mess. For example, we had this one header (we don't have it anymore) called or.h, and it was roughly 30,000 lines of code and essentially contained every struct you might ever use anywhere in this entire code base. It was bad; we got rid of it. But when we ran bindgen on that, bindgen said, "I will generate everything," this whole 30,000 lines, and we said, no, no, I know that file is sourced there, I just wanted the one module, not the whole world. It was really hard to figure out how to configure bindgen not to do this. We ended up looking at the ways Firefox was using bindgen, and the point where you're reading your browser's code to try to figure out how to call a tool differently in your own code was very confusing and hard. So we haven't been able to use bindgen that much, and it would be great to have better documentation on how to use it for different cases.

And thanks!