which is maybe weird, because we're here at a Ruby conference, although for some reason Rust has been a pretty popular topic lately, not just among Rubyists but among Node people and all kinds of communities. I don't actually have a good explanation for exactly why that is, but I know that people want to hear about it, so I'm here to deliver. My talk is basically a link-bait talk with a link-bait title, which I came up with last week. I actually came up with the title while I was working with Godfrey (I'll talk a bit more about this later), where I encountered this phenomenon on my own without planning for it. I had been working on my talk about Rust up until that point, and after running into this phenomenon I rebuilt the beginning of the talk to focus on the hook.

So first of all, I work on Skylight, which is a tool you can use to keep an eye on your Rails application's health. One of the things we do to make that work is put an agent in your Rails application, just like any of our competitors. Very early in the lifetime of our product, we realized that if we wrote all the code that did the instrumentation and talked to our server in Ruby, which was how we initially prototyped it, there was going to be a limit on how many things we could collect. That's something we noticed all our competitors ran into as well: there was a limit on what they could collect. At some point we decided we wanted to collect a lot more things, because the more things we collect, the more information we can give you and the more data crunching we can do. And Carl, one of our co-founders, decided that what he was going to do was rewrite the agent in C++. I think he did it over a weekend or something; I don't know exactly, I just know that one day he came in and handed me a fully formed C++ program that did a lot of what the agent was already doing, and it did indeed work. And I looked at it and said: I'm sure you understand this C++ code, and I actually understand it too, and I'm sure you'll be able to maintain it. But I don't feel comfortable with me and the rest of the team maintaining this big blob of C++ code. This was about a year and a half ago, when Rust was already on Hacker News a lot but wasn't really anywhere near complete; I think Servo was really the only project using Rust at the time. And I said, you know, give me a week, I'll try to do this in Rust, because Rust seems like a good fit for this kind of project. And we went from there. From then on, a big chunk of our agent, the part that collects data, has been written in Rust, and I hope today I'll help you understand why that came to be.

I think, from the big picture, the thing that makes Rust really successful, or at least successful for me, is that it takes people who are good programmers, really good programmers even, but who look at a bunch of C code and say: I understand what this does, but I'm scared of this code. For those people, the idea that "well, if my Rails app is too slow, I'll drop down to C" is a nice thing we can pitch, a nice thing we can say to people, but it doesn't really end up turning out to be true, right?
So there's this idea that with Rust you can finally achieve that promise: if a piece of your app has a bottleneck that's too slow and you want to speed it up, maybe you can actually do it in Rust. The idea is that you, in your job as a Rails developer or a front-end developer or whatever, you also can be a systems programmer. And I'll talk a little more about what it exactly means to be a systems programmer in a few minutes.

But first I want to start with a story. How many people here know Sam Saffron, or Discourse? Okay, I'll try to give him some credit here. Sam works on Discourse. Discourse is awesome; it's, of course, the forum software that a lot of people use. Sam came in not from the Rails world but as a Microsoft-shop guy, but Discourse decided at the beginning that they were going to use Rails and Ember as their stack. They did that basically because they wanted to find a way to get more contributors, and they figured that open source software built on a Microsoft stack was going to get far fewer open source contributions. So they decided to base it on Rails. But of course Rails has a well-earned reputation for being slow, and Sam is a speed junkie. So Sam has done a lot of good work to make pieces of Rails faster: by reporting slowness, by actually submitting patches, or, in this case, by doing something unorthodox.

One day Sam was looking at a flame graph (he has some tools that make it easy to look at these), and you can see he inked some circles around the thing that he reported. He said: it looks to me like the blank? method is taking up a lot of time. It's taking up like 4% of the total time in this whole program. 4% is a lot of time for a Rails app, right? Maybe your request takes 100 milliseconds; that could be like 40 milliseconds. So that's a lot of time. That is totally the wrong thing, that's 4 milliseconds. I should be able to do percentages, right? I'll blame it on jet lag. Okay, anyway, the point is it's a big chunk. 4% is a big chunk, and he said: let me try to make it faster.

At first (this is the changelog entry about this thing) he tried to submit some patches to make it faster in Ruby. If you want, you can follow the rabbit hole; it's interesting reading. He basically tried to just make the method faster. I think the first thing that happened is he submitted a patch and somebody replied "seems legit", and then tenderlove replied "does not seem legit", because it was wrong: it wasn't compatible. So eventually Sam just decided: I'm just going to rewrite it and see.

To get an idea, this is something I grabbed out of his test suite. This is the Ruby implementation (named blank2 here, because he's testing it against his faster blank in this particular situation). You can see it's a pretty simple implementation: it's a regex that checks whether the string is all whitespace, very simple. You might not look at this and feel like it should obviously be slow, since the matching basically all runs inside the regex engine, but what you should be able to realize by looking at it is that it exercises some very generic parts of the system. It's a regular expression, so unless the regular expression engine happens to be very optimized for this exact scenario, you're not going to get the best performance out of this implementation.
So he basically went and wrote a C extension. You do not need to read this; the snip over there represents like another 20 lines or so. And it's sort of what you come to expect when you look at code that is designed to make things faster. It's a function with a bunch of gunk around Ruby stuff: you can see it's pulling out pointers, making pointers, getting things out, et cetera. And then there's just a big switch statement; of course switch statements are expected to be fast. And you can also see that the loop is just right in there. There are no abstractions for the loop; literally everything is right there. And as perhaps expected, if you actually run this code, what you discover is that it is much faster. So Sam made this and released it as a gem. You can install it as fast_blank, and it will basically take care of itself and make your blank? method much faster, which has an impact on some percentage of your Rails app's requests. So people should definitely do this.

Additionally, Sam said in the README: "this gem allocates no strings during the test, making it less of a GC burden." And this is something (I actually wrote a blog post about this last week) that I think people have a hard time really understanding. When you allocate a lot of objects, the allocation itself is very cheap: if you benchmark allocating a thousand objects in Ruby, it will be pretty fast. The thing that makes it a problem is that eventually somebody has to clean them up, and the more objects you allocate, the more time the GC has to spend. But the GC runs kind of randomly throughout the program, so it's very hard to trace GC pauses back to the original thing that triggered them. And allocating a bunch of strings just to do a match to see whether a string is blank allocates unnecessary strings. So a nice side effect of this implementation is that it allocates no strings. This is pretty awesome, and I think you can basically say: case closed, people should probably use this gem. Maybe there are other things like this that we could do; maybe there's a general collection of gems in here. This was pretty low-hanging fruit, and it's pretty great. So I personally saw this as a pretty good success story for the whole notion of dropping down to C to deal with bottlenecks. Obviously, not everyone is going to be able to write that C code, but still, it's pretty great.

So when Godfrey started working with me a couple of weeks ago, we discussed the fact that he was really interested in learning Rust. This is Godfrey Chan; he's on the Rails core team. He had dabbled with Hello World tutorials in Rust, but he was interested in really learning it. And he came up with the idea: maybe we should try to re-implement the fast_blank method in Rust. The goal was not actually to do a better job. The goal was just to ask: what does the infrastructure look like to implement in Rust the same thing that Sam implemented in C? And just to get an idea: obviously we're not going to want to use a regular expression engine in Rust. There is a regular expression engine, but we don't want to use it. We want to start by asking: what does the loop look like at a high level?
And at a high level, the loop looks like string.chars, which could return an enumerator, or maybe it just gives you back an array of characters. And then you loop through all of them and check whether all of them are whitespace. So this is a high-level implementation. If you try to run this in Ruby, it will be much, much slower even than the regular expression version, but you can get a sense for what the constituent parts of the problem are.

So the next thing we did is we basically transcoded that code to Rust. I'm leaving out a bunch of surrounding C-extension gunk, just like I left out all of Sam Saffron's C-extension gunk earlier. (I want to say we already posted it publicly, but I think that's not actually true yet.) It's pretty straightforward to do this. So we implemented this thing called fast_blank. It takes a buffer, which is basically just a string that you can transfer over the boundary, and then it does pretty much what the Ruby code did. First, it converts the buffer into a Rust string slice. Then it gets an iterator over all the characters. Then it calls all on that, passing a closure which asks: is this character whitespace?

Let me walk through what is happening here. First of all, the buffer object has a method on it, a method like in Ruby, and that method creates a new slice object. Then that slice object has a method on it called chars, and that method creates a new iterator object. An iterator is kind of like an enumerator in Ruby; pretty equivalent, I would say. Then there's another method called all, which you run on the iterator, and that all method is implemented for all iterators, just like the all? method in Ruby is implemented for all enumerables. Then we pass a closure, which is like a block in Ruby. And finally there's another method call on each character to look up is_whitespace.

So this is pretty high-level code, and you might look at it and say: well, obviously that's going to be pretty slow. There's a reason I say "naive Rust implementation." This is basically the first thing you would write; it uses essentially all the high-level niceties you would use to implement something like this in Ruby. Godfrey and I effectively wrote this code just so we could get to the point where the rest of the infrastructure was working. And then we said: okay, now that we have this implementation, let's benchmark it, and when it's slow, as expected, we'll go make it faster. So we benchmarked it. And it was faster. Which is what led me to the title of this talk.

So first of all, it is no longer the naive Rust implementation; it is just the Rust implementation. It doesn't actually help to do something different. In fact, many things that you might think would be faster than this in Rust would be slower, and I might talk about that later. And interestingly, look at this: typically, with this kind of high-level abstraction, we're using methods all over the place, we're creating a bunch of different objects, and we have a loop, and loops are usually the place where you should really, really be on the lookout for extra costs. Somehow this is not slow. How is that? That's what the rest of my talk is about.
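To make that concrete, here's a minimal sketch of the core check. This is just the Rust side, with all of the C-extension glue omitted, and is_blank is my name for it, not necessarily what our gem calls it:

```rust
// The whole "not-so-naive" implementation: iterate over the characters of a
// string slice and ask whether every one of them is whitespace.
fn is_blank(s: &str) -> bool {
    s.chars().all(|c| c.is_whitespace())
}

fn main() {
    assert!(is_blank(""));
    assert!(is_blank(" \t\n"));
    assert!(!is_blank("  a "));
    println!("ok");
}
```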
So hopefully the mind-numbing parts of the rest of my talk will be well motivated by this surprising result at the beginning. First of all, before I get into the details, I want to talk about abstractions. The goal of an abstraction, of course, is to hide details. If you use a function, the whole point is that you're not looking at the inside of the function to understand how it works; you're looking at the API documentation. And normally, abstractions also hide performance details. Normally, if you use all on an enumerable, you're not supposed to think about whether there's a specially optimized all written for Array or anything like that. You're supposed to say: I assume someone did the right thing here; it'll do the job. So in practice, when you use a high-level language like Ruby or JavaScript, you're supposed to think of abstractions as abstracting not just implementation details but also performance characteristics. You're supposed to assume people did a good enough job, and that if they didn't, it doesn't really end up mattering that much in reality.

And interestingly, here's a big chunk of documentation. It's actually half of the documentation for one function, intern, in the utils of a JavaScript library I work on called Ember. You can see the top line says "strongly hint runtimes to intern the provided string." Interning is kind of like how in Ruby you can use a symbol, and those symbols can be compared very quickly or used in hashes very quickly. In JavaScript there is no such thing, but you can basically wink at V8 with a bunch of hacks to make it do effectively the equivalent thing. And if you look at this documentation, it explains sort of what's going on, but it also says: basically, don't use this function; it doesn't really end up mattering. But of course, when you're writing the kernel of something like Ember, it sometimes does end up mattering, and you end up writing these hundred-line docs about what exactly is happening behind the scenes in today's implementation of V8. So in practice, the fact that these abstractions in JavaScript abstract over performance details is a win in like 98% of cases, but occasionally, in a small percentage of cases, usually in the kernel of your framework or library, it ends up mattering a lot. And John-David Dalton, who works on Lodash, basically made his career out of exploiting these details that you're not supposed to think about, and making Lodash really fast. So there are cases where abstracting over performance details ends up being bad.

And so, really, what a systems programming language is, is a programming language that lets you look at an abstraction which hides implementation details but does not hide performance details. I'll give you a series of examples of what that means. First of all, I showed you that there were a bunch of methods in our example. So how does a method work in Rust? I'm going to start with an example, and let me walk through it carefully. First, we have a struct called Circle. You can think of a struct in Rust as being like a C struct, in that it lists a bunch of fields, but when it comes to the implementation it's also like a class in Ruby or Java.
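Reconstructed from the narration (so this is my sketch, not literally the slides), the code looks roughly like this:

```rust
struct Circle {
    radius: f64, // a 64-bit float
}

impl Circle {
    // An associated function, called as Circle::new(10.0)
    fn new(radius: f64) -> Circle {
        Circle { radius }
    }

    // A method, called as c.diameter()
    fn diameter(&self) -> f64 {
        self.radius * 2.0
    }
}

struct Square {
    side: f64,
}

impl Square {
    fn new(side: f64) -> Square {
        Square { side }
    }

    // The diagonal of a square is its side times the square root of two
    fn diagonal(&self) -> f64 {
        self.side * 2.0_f64.sqrt()
    }
}

fn main() {
    let c = Circle::new(10.0);
    println!("{}", c.diameter());

    let s = Square::new(10.0);
    println!("{}", s.diagonal());
}
```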
So you can see that we have a struct with a field called radius, which is a 64-bit float, and then we have an implementation for Circle, which has a new function on it that creates a new Circle. The line inside is basically just saying: make a new Circle with the radius that you passed in. We also have a method called diameter, which returns the diameter given the radius. This is not very different from how you might write it in another programming language, with perhaps the exception that the implementation is in a separate area from the actual definition of the fields, and we'll see why that matters soon.

So now let's talk about how you actually use this. I've made a main function here, and the first thing we do is say Circle::new(10.0), and then we print the diameter. That println! is how you do string interpolation in Rust, basically. So we created a new circle and printed its diameter, and both Circle::new and c.diameter are methods, right? Just like methods in Ruby. In Ruby, when you use a method, the performance characteristics sort of don't matter; in Rust, they do. But before I get to that, let me just show a different implementation: Square. A Square has a side, which is an f64. Again, you write an impl for Square. We do the same sort of thing with new (you should be getting the hang of the new method by now), and we have a diagonal method, which basically multiplies the side by the square root of two, which you may or may not remember from sixth-grade math. I didn't actually remember it; I had to look it up. So now we do the same thing: we have a main function, we make a new Square of 10.0, and we print its diagonal. Same deal.

And what's interesting here is that both examples look a lot like doing the same kind of thing in Ruby, right? You're making a new thing, you're calling a method; new is a method, the methods are methods. Looks kind of the same. But there's actually a really big difference, which is that in Rust, every single time you call a method (with very minor exceptions that are very explicit), the compiler knows, at the time it compiles, exactly what method you are talking about. So let me go back there. When you say Square::new, that isn't saying "at runtime, go figure out what method that is and call it." And when you say s.diagonal(), it's not saying "at runtime, go figure out what that method is and call it." The compiler points directly at the place where it's going.

At first glance, that might seem only a little bit useful, because you've probably heard people say that dynamic dispatch is pretty fast; we have really good techniques these days for making dynamic dispatch fast. And indeed that's true. Go's implementation of dynamic dispatch, for example, is reasonably fast. But there's an important detail, which is that when you statically dispatch to something, when you know at compile time exactly where you're going, you can also inline the function. And you might be thinking: function inlining seems like an important optimization, but maybe it doesn't end up mattering that much. So I went to the function inlining Wikipedia entry, which is of course where you go to learn anything.
And if you read the function inlining section, it has a bunch of hemming and hawing about how inlining may or may not actually be the thing you want to do in any given situation. But it says: "The primary benefit of inline expansion is to allow further optimizations and improved scheduling." And basically the idea here is that inlining is a gatekeeper optimization. If you have a function call and you don't know where it goes, you're basically done optimizing; you have to wait until runtime to know where you're going, and maybe a JIT can sort of recover by figuring things out at runtime. But if you can inline at compile time, you can go further. You can say: okay, now I see the whole picture. And maybe if you inline a bunch of times, you see the whole picture; maybe you see the whole program. So you can do things that really require more expensive, whole-program knowledge. So again: I think of inlining as a gatekeeper for a lot of other optimizations, and static dispatch is what you need to get inlining. Basically, if you don't have static dispatch, you don't have a whole bunch of really important optimizations that end up mattering. And when we eventually circle back around to the motivating example at the beginning, we'll see why that ends up being true. So that's step number one: static dispatch. In Rust, you write code that looks a lot like the code you would have written in another programming language with dynamic dispatch, but you get really important optimizations out of the box.

So that's one important vector. The second important vector is allocation. You might have heard the word allocation in various contexts, and you probably know that you should try to avoid allocation. And that's true: the more allocation, the more time you have to spend deallocating things. So let's look at how it actually works in Rust. First of all, here is the code I was looking at before, and keep in mind for the next few slides that the square made by Square::new obviously has to go somewhere, right? We have to actually make a new square and put it somewhere. But before we explain where it goes, let's first look at how you would do the equivalent thing in Ruby. Here we have a Square class, and we initialize it. Hopefully people in this room, who are here for a Ruby conference, can read this code. You probably would not have a def main, but whatever.

What happens in Ruby is that when you say you want to make a new Square, it goes on the heap, and internally, in Ruby's C code, that thing is called an RObject. So basically you make a new Square, and an RObject gets put on the heap. The way you should think about the heap is that it's just a place where you put stuff. It has to be managed, so when you want to get rid of something, you have to do extra work to get rid of it. But the fact that things are on the heap means that you can have many pointers to the same thing, and you don't normally have to worry about exactly where things are. So when you type Square.new(10.0) in Ruby, you're basically saying: this is somebody else's job. It actually gets allocated somewhere in a giant filing cabinet, basically. Then you print out the diagonal, and when you're done, the object is still on the heap, but eventually the garbage collector will come around, stop the world, and sweep up the RObject, and then the RObject is cleaned up.
Of course, that takes some amount of time: walking the entire space, et cetera. But this is basically how garbage collection works. There are various strategies to make it more efficient, and Ruby actually has a pretty good garbage collector these days. But the basic idea is that when you say Square.new, what you're saying is: make a new Square and put it somewhere else. So that is heap allocation. And just to remind you what systems programming normally looks like: if you want to do heap allocation outside of Ruby, in C, it looks something like this. Yes, it should be scary. It's very easy to make mistakes. You could forget to free. You could malloc the wrong thing. You could malloc a pointer instead of the actual thing. You have to remember to line up all the pieces.

So now that we understand that Ruby does heap allocation, which basically means "take it and put it somewhere else", and we've seen how you do that in C, let's look at an alternative kind of allocation, which is called stack allocation. This is again some C code, and what you see here is a struct, the equivalent of the struct we made in Rust. The exact details are not so important here; I'll just walk through the basic idea and then go into more detail with the Rust example.

So the first thing we do is enter the main function. And the key trick with stack allocation, one of the main "one weird tricks" of C and C++, is that when we compile the main function, we can actually look at it and say: okay, we know how big a Square is, we know what the diagonal function does, and we know how much space we need for it. So what we're going to do is leave some space in the stack when we enter the main function; we basically create some space for local variables, enough space to hold all those objects. And why does that matter? Remember, before, I said that when you put things on the heap, somebody eventually has to go clean them up. Well, what happens if you put things on the stack? First we make a square, we fill it in, we print, we call the diagonal function. The diagonal function makes another place on the stack; it basically pushes another frame onto the stack, which has enough space (I'm eliding the details) for whatever the diagonal function needs to do. But the important detail here is that when the diagonal function returns, it can just move the stack pointer back up. It's basically saying: okay, I'm done with this frame. And then nobody has to free anything. Because we knew ahead of time how big the intermediate variables were, we don't have to free anything; simply by exiting that frame, we got rid of all the garbage. So we can basically create the garbage in a place that we know about, and make cleaning it up part of the process of calling functions. And then, equivalently, over here, when we're done with the main function, we leave that stack frame, and all the things we made for that stack frame go away.

So this is a much better way of creating things that are only needed temporarily. Obviously, if you need something for a longer period of time, something that outlives the lifetime of that function, you may still need heap allocation. But under many circumstances, if you look at regular code, at any function you write, a huge number of the things you make are temporary. They're only needed for the lifetime of that function.
Sometimes you need to call other functions, but at the end of the day you basically have a lot of temporary things, and the nice thing about stack allocation is that the stack pointer, which has to exist anyway and is already moving up and down, is responsible for freeing memory. There's a small caveat, which is that the exact size of each function's frame has to be known at compile time. And this is why it's hard to do this kind of thing in Ruby or JavaScript without a JIT: you can look at a bunch of local variables in Ruby, but you can't tell, just by looking at them, how much space you need. Maybe you could leave enough space for a pointer, but you can't leave enough space for the whole Square object, because it has other stuff in it. So it ends up being really hard in dynamic languages without a JIT, but in static languages we get a lot of the information we need to do this optimization.

So let's go back to Rust. I hope that when I showed you this example in the first place, you thought it looked pretty similar to what you might have written in Ruby; obviously not the syntax, but the memory management part of it. And just like in the C code we wrote before, when you make a new Square here, it actually goes into the space we created at compile time. We made some space at compile time for the Square, and we put the Square into that space. When we call diagonal, that function has scratch space for its stuff; when it returns, the frame pops and all of that gets collected. When main returns, its frame pops and gets collected too. And the interesting thing is that, again, if you think about the code you normally write, a huge amount of it is creating variables that are only used in this function or in functions it calls directly, and all of those variables can be stack allocated. It really adds up a lot.

So this is another trick. If you can eliminate heap allocation for all the temporary things that don't outlive the function call, you eliminate more than just garbage collection passes. Even plain freeing has costs: if you think about how the allocator has to work, when you free something it has to go find the thing and pull it out, and when you allocate something it has to find the slot to put the next thing in. So there's a cost to using malloc that comes purely from the fact that putting things into a big filing cabinet and taking them out again requires bookkeeping, and the bookkeeping actually adds up quite a bit. So stack allocation is an important trick. And now we have two things: we can statically dispatch and inline, and we can stack allocate. These are already very big improvements over what you might have written in Ruby.
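Here's a small sketch of that point. The Square part follows the talk; the Box line is my addition, to show that heap allocation in Rust is something you opt into explicitly:

```rust
struct Square {
    side: f64,
}

impl Square {
    fn new(side: f64) -> Square {
        Square { side }
    }

    fn diagonal(&self) -> f64 {
        self.side * 2.0_f64.sqrt()
    }
}

fn main() {
    // The compiler knows the size of Square, so `s` lives in main's stack frame.
    let s = Square::new(10.0);
    println!("{}", s.diagonal()); // diagonal's scratch space is its own frame

    // Heap allocation only happens when you ask for it explicitly:
    let boxed = Box::new(Square::new(10.0));
    println!("{}", boxed.diagonal());
} // main's frame pops here: the stack value vanishes, and the Box frees its heap memory
```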
However, at this point you've probably looked at structs and said: well, structs are interesting, but I'm used to being able to do inheritance in Ruby, I'm used to being able to share code. What if I showed you an example of a circle and a square, and I wanted to have some shared implementation for all shapes? In a traditional object-oriented language, you would inherit from a Shape; this is kind of the hello world of OOP. Rust recognizes that you want to be able to do this kind of thing, but it implements it a little differently, and the way it implements it is, in my opinion, both more powerful than simple object-oriented programming and also much faster.

So let's look again. We have our struct Circle and our struct Square, and now we're going to create a new trait. You can think of a trait like a mixin in Ruby, except that it may not come with all the default implementations; some methods you are required to implement yourself. The Enumerable mixin in Ruby would be easy to describe as a trait in Rust: the each method would be a mandatory method without a default implementation, and then you would have many other methods with default implementations that call the each method. So a trait is effectively a list of methods, some of which are mandatory and some of which are derived from the things that you provide as an implementation. In this case I don't have any default implementations, but you are able to do that.

And now we're going to implement Shape for these two structs, and this is where it comes in handy that the implementation is not in the same block as the original struct. When you implement a particular struct, there are typically multiple implementation blocks: you have an implementation block for what are known as concrete methods, the methods specific to that struct, and then you have implementations of various traits. Those traits, again, are sort of like mixins in Ruby in that they represent cross-cutting concerns, and they aren't required to be in a strict hierarchy, just like Ruby mixins are not required to be in a strict hierarchy. So what we have here is a function called area: the Circle implementation of area returns pi times the radius squared, and the Square version multiplies the side by itself.

So now that we've implemented the trait, let's actually go use it in our main function. Here we make a new Square and a new Circle, and we call s.area() and c.area(). And again, just like before, because Rust knows that we have a Square and a Circle, it's going to point at the exact implementation of the trait that we have. It's not saying "well, Shape is an interface, so we'll just dynamically dispatch through that interface at runtime." It knows at compile time exactly what thing we want to dispatch to, and it dispatches to that. So far, so good: we still have static dispatch, and we still get inlining and all the optimizations.
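A sketch of that trait setup, reconstructed from the description (I've left out the new constructors for brevity):

```rust
trait Shape {
    fn area(&self) -> f64; // mandatory: no default implementation here
}

struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

fn main() {
    let s = Square { side: 10.0 };
    let c = Circle { radius: 10.0 };
    println!("{}", s.area()); // statically dispatched to Square's impl
    println!("{}", c.area()); // statically dispatched to Circle's impl
}
```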
But you might be thinking: that println! is sitting right there in my main function. What if I wanted to pull it out into a different function? Doesn't that mean I'm going to end up with dynamic dispatch again? Doesn't the other function have no idea what concrete thing I'm actually calling it with? To handle that, Rust has a thing called trait bounds, and the way you talk about it is a little different than perhaps what you might have done in another language with interfaces. Basically, what you're saying here is: I want to have a function called print_area, and it takes any type, as long as that type implements Shape. And that allows you to use methods from Shape inside that function.

From a high-level perspective, you look at that and it's just polymorphism, right? It says: I don't care what kind of shape it is, just call area. But what the compiler actually does when it sees that annotation is make a new version of print_area for every single type that actually gets called with it. So up top, where we call print_area with s and c, the compiler says: okay, make a special version of print_area for circles and a special one for squares. And now we're back to static dispatch, and now we can inline that implementation. So you can start writing abstractions that are pretty high level, in the sense of Ruby or Go, where you say "I take anything that implements this interface", but because of the particular way it's implemented in Rust, you still get static dispatch. So you can start building things up, and because of the typical, idiomatic way you write things in Rust, you end up with big trees of code that are completely transparent to the compiler.

And in my opinion, traits are where the magic is. Up until now, if you just looked at structs and impls, before I talked about traits, it was definitely an improvement over the equivalent C code, because the C code is pretty gnarly no matter what. But it might feel like: if I can't really write anything higher level, if I get stuck as soon as I need multiple implementations of something, then I won't be able to write the functions I need; I'm going to end up writing very concrete code. Trait bounds are the way out. Again, here is a function that takes any implementation of Shape, but what it's saying is not "go figure out at runtime what kind of thing you need"; it's saying "figure it out at compile time." That allows you to create many abstractions that, however deep you go, end up with the compiler being able to see the whole picture. So you get to write methods that support many different types, and you still get static dispatch and inlining.

And if you remember, at the beginning I showed you string.chars().all(...) and so on, and that all method is actually implemented as a generic method on all iterators. But at compile time, the compiler says: oh, I know that you actually have the chars iterator here, I know this is not just any random iterator in the world, so I am able to inline that actual code directly into the program. And maybe you're starting to get a sense for where all this is going. Now, if you want to learn more about how traits fit into the abstraction story, there's a really great article on the Rust blog called "Abstraction without overhead: traits in Rust". It's actually the third part of a three-part series: memory safety without garbage collection, concurrency without data races, and abstraction without overhead, all of which sound like contradictions in terms, and all of which are real in Rust. I would encourage you to definitely check it out.
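Here's a sketch of that trait-bound function, reusing the Shape trait from the previous sketch:

```rust
// Generic over any T that implements Shape. The compiler monomorphizes this:
// it emits a specialized print_area for Square and another for Circle, so the
// .area() call inside is statically dispatched and can be inlined.
fn print_area<T: Shape>(shape: &T) {
    println!("{}", shape.area());
}

fn main() {
    let s = Square { side: 10.0 };
    let c = Circle { radius: 10.0 };
    print_area(&s); // compiled as print_area::<Square>
    print_area(&c); // compiled as print_area::<Circle>
}
```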
So what we have here so far is a way to start writing code that feels high level, that feels very generic, that allows you to make good use of code reuse, but that doesn't actually slow your program down. And the last thing I want to show you is closures. Interestingly, you actually already know everything you need, from what I've explained, to understand how closures work in Rust, but I'll go slow.

First, I'm going to show you tap. Ruby has a thing called tap; it started off in Rails but ended up in Ruby. The basic idea is that it's a method on all objects, so it's implemented on Object, and it takes a block. That block receives the object, but the tap method itself returns the outer object. So it basically abstracts over the pattern of "x = something; do some stuff; return x", which you've probably written a lot of times, and it can be much nicer to use tap.

And you can implement tap in Rust. I'm not going to go into the details of how it's implemented; it's like five lines of code, and a little dense. But you can implement the equivalent thing, and the way you use it in Rust is actually very similar to how it looks in Ruby. So I made a vector here: I started off with the string "Rails", I tapped it, then I extended it with all the args from the environment (that's like ARGV), and at the end I pushed another string onto it, "-h". And then v gets the value of that whole chain of taps; it now holds the vector I've been working on the entire time. The reason you need tap here is that extend and push don't actually return the original object. If you were to just say v = vec.extend(...).push(...), somewhere along the line (at least in Rust) you'd get a compile-time error; it's not going to do what you want. tap lets you do what you want here.

So that's great, but I think if you're used to these kinds of abstractions in Ruby, you've kind of come to be wary of them, because every single time you use an abstraction like this, you have to ask yourself: is it really worth it? Maybe I should have just written the three-liner. And you'll see blog posts out there that say: don't bother with stuff like this, it's confusing, but more importantly it's slow, so just write the three-liner; it's fine, everyone knows what it means, et cetera. But interestingly, in Rust, this program will actually perform equivalently to the longer version. In Rust, this idea is called zero-cost abstractions, which is what that blog post I showed before goes into in more detail. And you may already have an intuition for how that works with traits and structs, but how can it work with closures? How does what we've learned so far help us understand how closures end up being free in Rust?
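For reference, here's a hedged sketch of what a tap like that can look like in Rust. This is my reconstruction, not the five lines from the slide, so the details may differ:

```rust
// A tap trait with a default method, plus a blanket impl so every type gets it.
trait Tap: Sized {
    fn tap<F: FnOnce(&mut Self)>(mut self, f: F) -> Self {
        f(&mut self); // let the closure mutate the value...
        self          // ...but hand the value itself back to the caller
    }
}

impl<T> Tap for T {}

fn main() {
    let v = vec!["Rails".to_string()]
        .tap(|v| v.extend(std::env::args())) // append all the CLI args
        .tap(|v| v.push("-h".to_string())); // then push one more string
    println!("{:?}", v);
}
```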
So the first thing is that closures are normally stack allocated in Rust. If you think about how you use closures in Ruby, 99% of the time you're making a block to pass to a function that's just going to use it and return right away, right? So there's no actual reason for those closures to be heap allocated; they don't need to outlive the lifetime of the function you're using them in. So normally, when you use the closure notation in Rust, the closure is stack allocated, and that eliminates all the heap costs we talked about before. That's number one.

Number two is that when you write the closure syntax in Rust, it automatically implements one of three different traits (Fn, FnMut, and FnOnce) behind the scenes; you can go learn more about them, but I'm not going to get into the details. What that means is that the actual closure is going to be invoked statically. The function that takes the closure says "I take a closure" and calls it, but because it takes the closure as a trait bound, just like we saw before, the actual closure gets invoked statically, which means the closure itself can be inlined. So basically the consequence is that, without a lot of special closure-related magic in Rust, if you call a function like tap that takes a closure, then by the time it's done being compiled, the body of the closure has been inlined into the thing that called the closure, and that whole thing has been inlined into the thing that called tap. So it looks exactly the same to the compiler as if you had written the longer, more annoying version, because everything along the way is transparent: things are not dynamically dispatched, they're not heap allocated, they're transparent to the compiler.

And I should be clear: when I say transparent to the compiler, I don't mean this is something you have to think about all the time. The difference between Rust and Ruby here is not that in Rust you have to spend all your time thinking "oh, I'm writing a closure, that means it's stack allocated." The difference is that the idiomatic way of writing the equivalent code in Rust ends up telling the compiler enough information to do what it needs to do. And unlike JITs, it does so in a way that is guaranteed: every time you see a function like tap, if you look at the signature of tap and you look at the code that calls tap, you can predict, with a 100% guarantee, whether things are going to be stack allocated and whether they can be inlined.
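A tiny illustration of that guarantee (a hypothetical example of mine, not one from the talk):

```rust
// apply_twice takes its closure via a generic Fn bound, so the compiler
// monomorphizes it for this exact closure type: the calls to f are statically
// dispatched, and the closure body can be inlined all the way through.
fn apply_twice<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(f(x))
}

fn main() {
    // The closure is a stack value; nothing here touches the heap.
    let result = apply_twice(|n| n + 1, 40);
    println!("{}", result); // 42
}
```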
So, going back to the beginning and our not-so-naive Rust implementation: remember that what we have there is a bunch of things that look like high-level constructs, and at the beginning it was perhaps unclear how that could end up being so fast. Let me enumerate what's happening. First of all, the slice, the iterator, and the closure are all allocated on the stack, which means there are no special heap costs. The intuition you get from writing Ruby or JavaScript is that when you make things, they have cost; the intuition you should have in Rust is that when you make things, they mostly don't have cost, unless you ask for them to have cost. Second of all, the methods and the closure (the block that we made) get statically dispatched and inlined. And the consequence of all of that is that, at the end of the day, the whole program (the thing that makes the slice, the thing that iterates over the iterator, which internally is a while loop) is visible to LLVM. Basically, once Rust is done making all the functions, it hands everything off to LLVM, and LLVM says: oh, you've told me exactly what function this is calling, you've told me exactly what closure this is calling, so okay, cool, I can convert this whole high-level thing into a loop. And the consequence of that is that basically all the time, when you write anything to do with loops in Rust, you can write high-level code, the functional style we like so much in Ruby, with filters and maps, and the actual emitted assembly looks a lot like the crazy thing you would have handwritten in C, the thing you normally have to write if you want to get performance out of C. And really, my message today is that this is what's awesome about Rust: I just gave you a lot of details about how things work under the hood, but all you have to know is that one-liner I showed at the beginning. You can say "I have a hotspot here in my Ruby program and I want to write it in Rust", you can use all the high-level tools you're used to, basically the modern 2015 programming language tools, and you can get really good performance.

[Audience question] So, the question is: from a performance perspective, is there anything in Rails that would benefit from moving into Rust? I'll answer that in two parts. First of all, if you had asked me that question with regard to C, I would have just said no. When I worked on Merb, which was a performance-sensitive re-implementation of a big part of Rails that eventually got merged into Rails (we merged, as a project, into Rails), writing stuff in C was a serious consideration. But the problem is that C is so hard to reason about for most people who are likely to contribute to a project like Rails or Merb that those pieces would end up maintainable by only a very small number of people. And worse, if somebody comes in thinking they know what they're doing and makes a mistake, now everybody's Rails program segfaults. That seems very bad. So if you ask me that question about C, the answer is: suck it up; do the things Aaron likes to do, like moving things into constants and hoisting things out of loops. That gives you a lot of benefit without the kind of cost that writing stuff in C brings.

However, Rust actually bends the curve quite a bit, as I showed you here. So you could imagine writing chunks of things in Rust. A good example would be the static file middleware. The static file middleware is not doing a lot; it's a pretty simple middleware, and a few years ago I would have said: whatever, it's development mode, it doesn't matter. However, basically every Heroku app, like 95% of them, is using the static middleware to serve files, and you may not want that blocking your entire Puma thread or your entire process. So perhaps moving that middleware into Rust would be good. And I think that's the structure of my answer: the most obvious wins would come from moving parts of ActionDispatch, which is the whole dispatching layer, into Rust. And I think, for sure, it should at minimum start off as an opt-in gem.
Although, at some point, there could be benefit to getting the blessing of the core team, because if you're re-implementing stuff in Rust and you don't really know exactly what might change (the core team might change things at any time), you then have to figure out how to re-implement it in Rust all over again. So you probably want to focus on things that are extremely stable, things whose APIs are extremely stable. The good news is that Rails has been around for 10 years, and there's a lot of stuff that's extremely stable. So I would say probably the dispatching library, and a lot of things that have to do with strings, make a lot of sense. There's one thing I didn't actually mention about this example, which is that Rust uses UTF-8 as its default string encoding, so it's very, very convenient for Rails apps, which also use UTF-8 as the default string encoding. But you would have to do a lot more work if you wanted to make a gem written in Rust work with arbitrary encodings, if you didn't want to transcode things into UTF-8 before calling into Rust. So I think that makes Rust a really good fit for Rails, which already pretty much hardcodes UTF-8, and perhaps a less good fit for, say, re-implementing MRI's string.c in Rust. Awesome.
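(As a footnote to that UTF-8 point, here's a small sketch of mine showing why the boundary matters: a Rust &str is guaranteed to be valid UTF-8, so bytes coming from Ruby have to be validated or transcoded first. The helper name is hypothetical:)

```rust
// Bytes arriving over an FFI boundary must be valid UTF-8 before Rust will
// treat them as a &str; other encodings would need transcoding first.
fn blank_from_bytes(bytes: &[u8]) -> Option<bool> {
    let s = std::str::from_utf8(bytes).ok()?; // fails for non-UTF-8 input
    Some(s.chars().all(char::is_whitespace))
}

fn main() {
    assert_eq!(blank_from_bytes(b"  \t"), Some(true));
    assert_eq!(blank_from_bytes(b"hi"), Some(false));
    assert_eq!(blank_from_bytes(&[0xff, 0xfe]), None); // not valid UTF-8
}
```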