placement after Aja and the program was very strategic. I recently met an extremely proud quadcopter owner. He wouldn't stop droning on about it. 30 minutes of this, folks. So I just want to say happy Friday. It's actually Monday. I'm sorry. So welcome, everybody. I'm really excited to be here at Cascadia. I'm speaking here for the first time. This is my first full-length talk here. And I need to move it along because I have 229 slides. But this is very exciting. I was researching what the conference is about and what I should talk about. It's really exciting that there is a conference that's all about waterfall development. I really, really enjoy waterfall development. Because if you've ever been in a waterfall, you'll know that it's super comfortable. And there are a lot of really neat waterfalls, especially in movies like Jurassic Park. And I really like that movie, too, because they have Unix in there. And, like, I love that. I love this movie. It's so good. Anyway, also, I really like waterfall development because Gantt charts are amazing. I mean, look at this thing. You can tell that stuff is happening in here. I mean, look at that. That is awesome. I mean, you can tell. You can tell something is happening. That's progress. And the thing is, I really, really don't like extreme programming. And I'll tell you why. One of the reasons is the name of it; that really pisses me off. Look at that. Why is the X big? XP? That doesn't make sense. Just spell it right, seriously. It should be EP. The other problem with it is it's totally extreme. So whenever I'm doing programming, I've got to wear my official extreme programming helmet. Because I don't want to get hurt while I'm doing development. So I feel like this is why I'm really excited to be at a Waterfall Development Conference, because I just really want to slow things down. So I was checking out the conference website and taking a look at it. I thought it was really cool.
So I checked out the source of the web page, and I thought this is cool. I don't know if you guys noticed this, but they use CSS, which stands for Cascadia Style Sheets. So I was really inspired by the talks earlier today. All right, everybody, what time is it? Tell me what time it is. No, it is business. It is business time. That's how I answered it. Nobody else seemed to answer it that way. Anyway, I thought that talk was really, really exciting, and it made me think about Han Unification. He didn't even touch on that. Take a look at these. Those characters are all the same. The characters in each column are the same. This is Han Unification. For Unicode, they took these characters and glommed them into one. So each row is one code point. But what I thought was really interesting about Han Unification is, if you're taking all these characters and turning them into one character, does that mean it's actually a Han Solo? We'll do something productive, I promise. Anyway, this talk will be extremely boring, so I hope you find it to be awesome. I was really worried about whether or not people would like this talk, so I put a bird on it. I don't know why anybody hasn't made any Portlandia jokes yet. But I guess that's because Portlanders don't like reality TV. Did I mention I'm from Seattle? Anyway, thanks for having me here. I'm really excited to be here. I'm on the Ruby core team and the Rails core team. You can find me on Twitter as Tenderlove. I'm also on GitHub as Tenderlove. Instagram is Tenderlove, and I'm also on Yo as Tenderlove. So if you want to contact me there, you can send me some Yo's. I'm the number one Rails contributor. I have a lot of internet points there. That's a lot of internet points. But I'm going to give you the secret to getting a lot of internet points, OK? Are you all listening? This is the secret: revert commits count too. So more mistakes equals more points.
So go for it. You know how to win. Anyway, I'm a short stack engineer. I enjoy pair programming. This is a close-up action shot of pair programming. The hard part about setting this up is that the TTY is kind of sticky. I have a cat. His name is Gorbachev Puff Puff Thunder Horse. I have another cat. Her name is SeaTac Airport YouTube. This is a close-up shot of her. My wife said that we had to get two cats so that I would stop looking at pictures of cats on the internet. What she didn't realize is that's just not how it works. Now we have two cats and I look at pictures of cats on the internet. Anyway, this is my other cat in her natural habitat, on top of my laptop. Recently, I've been getting into Node.js, and that's so that I can be closer to the metal. So this is me. I'm getting close to the metal there. Super close. I have my own consulting company, Adequate. We do everything adequately. I'm trying to come up with a new logo and music, but this is good enough, I think. Recently, we've been working on a lot of groundbreaking technology: this. We break a lot of ground with that. Actually, we've got a website called recruitersfam.com. You should check out this website. We are collecting data for the lulz. And I want to talk a little bit about Markov chains. A Markov chain is what you get if you take a corpus of something, say, unwanted emails from recruiters, and build a chain out of it. And what Markov chains are for is taking a bunch of text and turning it into something new. So you have a giant corpus, you start out with this corpus of text, and you say, I want to generate something new from this using the patterns involved here. So we're going to look at how to do Markov chains. Let's say this is our corpus, just some sample data. What we do is we parse this and turn it into a tree that looks something like this. And these are the different nodes.
The nodes are the words, and the edges on each node are the probability that we'll move from one word to the next. So if you're on node "i", you have a 25% chance of going to "tender" or a 75% chance of going to "love". And when we generate new text, we start out on a node, grab a random number, and figure out where we should go next based on these particular weights. So we can come up with new sentences based on this particular graph. A new sentence that we might come up with from that particular corpus is "I tender love you too", even though that sentence didn't exist in the original corpus; it's a brand new one. What I actually store in the data is occurrence counts. So the edges are actually the number of times I've seen "i" go to "tender" or "i" go to "love". So I saw that transition three times. And if you look at the data in Ruby, this is the data structure that I use. It's just a hash. The key is the starting node, and the values are where it could possibly end up, with their counts. And taking the recruiter emails that I have, this is a sample of the real data. The thicker the line, the higher the probability that it will go from one node to the next. So basically, whenever we're at one of these nodes, we have to pick a random child to go to next. And I want to talk a little bit about picking random children. The way that I do that is using a binary heap. A binary heap looks like a tree data structure like this. It's a binary tree. And the properties of this tree are that each node has two children. The tree is complete, meaning all the children are always filled, unless it's the very bottom row, where maybe only one of them is filled; but the next node that we add will fill it. And each node is greater than or equal to its children.
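As a sketch, the occurrence-count hash and the weighted walk can be written in a few lines of Ruby. This is my own toy version, not the talk's actual code, and the helper name `pick_next` is made up:

```ruby
# Build the occurrence-count hash described above: keys are words, values
# map each possible next word to the number of times that transition was seen.
transitions = Hash.new { |h, k| h[k] = Hash.new(0) }

corpus = "i tender love you i love you i love me"
corpus.split.each_cons(2) { |from, to| transitions[from][to] += 1 }

# Pick a random child weighted by its count: fill up with a random amount
# of "gas" and walk the children until the gas runs out.
def pick_next(counts, rng = Random.new)
  gas = rng.rand(counts.values.sum)
  counts.each do |word, weight|
    return word if gas < weight
    gas -= weight
  end
end

# Generate a short new sentence by walking the chain from "i".
sentence = ["i"]
4.times do
  children = transitions[sentence.last]
  break if children.empty?
  sentence << pick_next(children)
end
```

Every hop in the generated sentence is a transition that was actually seen in the corpus, but the sentence as a whole can be brand new.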
Or you can do a binary heap in the opposite direction and flip that conditional. But in this one, I'm saying each node is greater than or equal to its children. So let's say we want to add a new node to this. Let's say we're adding a 15. That 15 is too big. It's bigger than the 8. So the algorithm for adding this is we swap the two. And then 15 is still bigger than 11. So this breaks our rules. So we swap those two. And now our tree is where we want it to be. So if we look at the heap for transitioning from "i", we want to choose different nodes based off of "i". This is what the heap looks like. "love" is at the top; that is the highest-probability node that we'll go to. And you can see each of these counts represents the weight for that particular node. The way that we select one of these using a binary heap is that we take a journey of love. We pretend that the heap is like a road that we're driving on. And we're driving a car, and our car is going to land on one of those particular nodes. So we calculate the total trip cost, which is the total cost of all the nodes, and we generate a random number between zero and that total cost. We say we have a random amount of gas, and we see how far that gas gets us. And wherever we land, that's what our random child is going to be. So this is what it would look like in practice. We have our trip total cost of seven. We generate a random amount of gas. We have Kit here going on a date. And he's a very irresponsible dater. So he fills his tank up with a random amount of gas. And then he walks along here. So we start out with six. At the top one, we subtract three, visiting that node. We go down one. We subtract one. Our gas is two. Down one, our gas is one. We go back over here. Keep walking because we still have gas. Now our gas is zero.
And that's the random child that we go to. So we'll stop in a different place, depending on how much gas we filled the car with. What I think is really cool about this data structure is that we can store a heap in an array. So let's say we have our heap that looks like this. If we move all these nodes so that they're next to each other, we can actually store it in an array that looks like this. That's what it would look like if all the nodes were lined up next to each other. And what's really interesting about this is that we can then calculate our parents and our children using the index into the array. We say the index divided by two is the parent, two times the index is one child, and two times the index plus one is the other child. And the only way this math actually works out is if that nil is at the beginning there. So it has to be a one-based array. Let's say we're here at the very top. Our index is one. We multiply that by two, and that means that the children are at index two and index three. Or let's say we're down here at index three and we want to calculate the parent. We say three divided by two. Well, that's one, because we're doing integer arithmetic; in Ruby, if we were doing floating point math, you would have to floor this. Anyway, you can see the output of this Markov chain generation at horse recruiter. This isn't complete without some samples. I'm not going to read these out loud. These are some of the ones that I liked. Humor, if you are receiving LinkedIn. This one's my personal favorite, Bauer. So now on to the actual stuff, which I will have to hurry through. I'm going to talk about speeding up your code, speeding up Rails. We're going to look at a bunch of benchmarking libraries. And we're also going to look at how I use these benchmarking libraries to speed up Rails, and how you can use those benchmarking libraries to speed up your code as well.
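The array-backed heap and its index math can be sketched like this. This is my own toy max-heap of integers, not the talk's code:

```ruby
# Toy max-heap in a 1-based array: index 0 holds nil so that i / 2 is the
# parent of i, and 2 * i and 2 * i + 1 are its children.
heap = [nil]

def heap_push(heap, value)
  heap << value
  i = heap.length - 1
  # Sift up: while we're bigger than our parent, swap with the parent.
  while i > 1 && heap[i] > heap[i / 2]
    heap[i], heap[i / 2] = heap[i / 2], heap[i]
    i /= 2
  end
  heap
end

# Adding 15 last: it's bigger than 8, then bigger than 11, so it bubbles
# all the way to the top, like the example in the talk.
[11, 8, 5, 3, 15].each { |n| heap_push(heap, n) }
```

After those pushes, `heap[1]` is 15, and every node at index `i` is less than or equal to its parent at `i / 2`, which is exactly the heap property the talk describes.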
But first, I want to talk about some stuff that's on my mind. I spend a lot of my time thinking about weird stuff and things that annoy me, if you can't tell from my previous slides. For example, people always say "separation of concerns". And this really, really bothers me. And it's not for a good reason. The only reason it bothers me is because I think of this face doing this. And I'm like, how does this have anything to do with anything? All I think about is two concerned faces getting separated. I can't focus on what is going on. So whenever anyone says separation of concerns, I'm like, OK, move on. Anyway, let's talk about some pro tips. ruby -d: does anyone use ruby -d? One person. Well, you should use ruby -d, and I'll tell you how to use it. Just tack on -d. The reason I really, really like ruby -d is because it puts Ruby into debugging mode and it prints out anywhere that an exception was raised. This is very handy if an exception gets swallowed. So if you're seeing some weird exception get bubbled up and you don't know where it came from, and you go to the code and it's not actually coming from there, you can run the program with ruby -d. And you'll see these debug lines come out that say, hey, an exception occurred at this particular line, and you can go check that one out. It prints them out regardless of where they came from. So you can find swallowed exceptions. That's just one thing I've been using for debugging. The other thing I've been thinking about recently is Rack. And I want to say, so Rack, it's over. Reality TV. I am a jerk. Sorry, anyway. So this is the Rack interface. Hopefully most of you are familiar with it. You have to implement a method called call, which takes an environment hash. And then you proxy this environment hash down to other Rack middlewares. And you have to return an array back up the stack.
And what really, really annoys me is that the only way you can pass data between your Rack middlewares is, guess what, this LOL global data. Let's just shove crap in our hash. That really, really bothers me about the Rack API. But also, oh, I should have mentioned I'm on the Rack core team as well. So, there will be no Rack 2.0 (asterisk). There will be a Rack 2.0. But probably, I want to release a Rack gem that's 2.0; there may not be a Rack spec that's 2.0, although I want there to be a Rack spec that's 2.0. So I'm thinking, for a Rack 2.0 I want to do something like drop Ruby 1.8.6. I feel like maybe we don't need to support that anymore. So go to something that's maybe greater than or equal to Ruby 2.0. But of course, we can only do that in a major version, and Rack is already at one, so obviously there has to be a Rack 2.0. But I've been thinking a lot about the next generation of web servers and the type of protocol that we need to have in order to move forward. And I want something like this. I want a Rack spec that looks something like this, where basically we have an environment hash which is just the CGI environment, and we have an input and an output stream, where the input is the post body and it can't be rewound. So anytime you need to take data, like say somebody's uploading a file or they're doing a form post, we read off of this IO. And the output IO is just an IO-ish type thing where we can write stuff to it or set headers. It's very similar to an IO. And what's cool about that is the web server, if it wants to, can wrap up that IO and provide something to you that actually chunks stuff back out to the socket. The goal of this for me is basically to steal from Node. Just because one thing that really annoys me about Node is people are like, oh, look at all the stuff that Node can do, Ruby sucks, it can't do that. We can totally do that, we just don't have the APIs.
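The interface and the shove-data-in-the-hash pattern being complained about look roughly like this. A minimal sketch with no gems needed, since any object responding to `call` works; the `"example.user"` env key is made up for illustration:

```ruby
# A minimal Rack-style app: call takes the env hash and returns the
# [status, headers, body] triple back up the stack.
app = lambda do |env|
  body = "hello #{env["example.user"]}"
  [200, { "content-type" => "text/plain" }, [body]]
end

# A middleware's only channel to the next app is stuffing keys into env --
# the "LOL global data" pattern.
middleware = lambda do |env|
  env["example.user"] = "tenderlove"
  app.call(env)
end

status, headers, body = middleware.call({})
```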
I want to steal and do that. So these are just my thoughts, random thoughts that I just want to write down and share with people. If you have thoughts about this, please come talk to me about it. We need to push Rack forward. We need to make the Ruby web server space even better than it is today. And the only way that we can do that is by improving this API. So, on to performance tools. I'm gonna talk about a lot of performance tools, gathering data from Rails applications, what we're doing on... I keep saying we; whenever I say we, just gsub that in your mind to I. It is me doing it; there is no we. I also think that's weird when people are like, oh, just report that bug to such and such open source team. And I'm like, you know that team is just one person, right? Can it be a team if it's just one person? I don't understand. Anyway, it seems like if you're a team of one, you're probably talking to yourself frequently. So maybe I am a team of one. Anyway, I'm not crazy, really. So I wanna talk about raw performance, measuring the raw performance of a particular algorithm. Davey touched on some of this stuff earlier today, so hopefully I can move through it quickly, because I only have like 12 minutes left. I like to use this library called benchmark-ips, and I'm gonna compare this to the standard library benchmark tool. The standard library benchmark tool looks something like this, where we say, okay, create a benchmark and time some particular function: we run that function n times, and then it tells us how long it took to run that function n times. The problem, though, is when you're writing a benchmark like this, you're like, well, how big should n be? You don't know what that should be, and oftentimes you'll run a benchmark and the output will look something like this and you're like, wow, my code is crazy fast. Took like zero time.
So probably what this is saying is your n wasn't big enough, but you don't know how big to make that n, and that's where benchmark-ips comes in. I think benchmark-ips is very, very handy this way. You just give it a block of code and it runs that code as fast as it can in five seconds. So it's like, okay, how many times can I run this particular block in five seconds? And it reports to you in iterations per second. So the higher the iterations per second, the better the algorithm is, or the better that code is. For example, here we have two benchmarks: one is accessing a set and one is accessing a list, and we all know which one is gonna be faster here. But you can see the output here; those are the iterations per second. The set include was some, I don't know, big number per second, and the array was much smaller. So with iterations per second, higher is better. The other cool thing is that it gives you a standard deviation. So when you're running your benchmarks, and obviously you're listening to Rebecca Black on YouTube and you've got iTunes going at the same time and probably there's some Netflix as well, well, that's gonna cause some deviation in your benchmarking script, so you'll see those standard deviations there. And that's handy for you to know: okay, this algorithm will run so many times per second, plus or minus a particular standard deviation, and you wanna get that standard deviation as low as possible. Now, I wanna cover another thing with benchmarking. In this particular example, we have a hash and a set, and we say, okay, we're gonna do a set include and we're also gonna do a hash include. And you think, okay, well, a set include and a hash include, those should be approximately the same, right?
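benchmark-ips is a gem, but the core idea can be hand-rolled with just the standard library. This is a simplification I wrote for illustration; the real gem also does a warmup phase and reports the standard deviation:

```ruby
require "benchmark"
require "set"

# Run a block as many times as possible in a fixed window and report
# iterations per second; higher is better.
def rough_ips(seconds: 0.5, batch: 1_000)
  iterations = 0
  elapsed = 0.0
  while elapsed < seconds
    elapsed += Benchmark.realtime { batch.times { yield } }
    iterations += batch
  end
  iterations / elapsed
end

set  = Set.new(1..1_000)
list = (1..1_000).to_a

set_ips  = rough_ips { set.include?(500) }
list_ips = rough_ips { list.include?(500) }
# The Set lookup manages far more iterations per second than scanning the Array.
```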
A set is probably implemented in terms of a hash, so access times between these two should be about the same. But if we run this code using the standard library benchmark, we might see an output that looks like this, and we'll see that, well, the set access is slightly faster than the hash access. This may happen. Now, if we run this test again with benchmark-ips to see how many iterations we can do in five seconds, we'll see our output looks like this. I'm not gonna make you read those; we'll look at a nice graph. The graph looks like this, and we can see that set access is actually slower. It's lower than the hash access. And the reason for that is because a set actually does wrap up a hash, so there are other method calls involved. When you do a set include, it's proxying methods over to the underlying hash implementation, so it's slightly slower. This probably won't actually matter in the grand scheme of things, but this is just an example of how the standard library benchmark could lie to you. The other reason I like using these tools is for black box testing. So, I have a confession to make. When I'm working on Rails, many times I actually have no idea how any of it works. This is true. I might be the number one committer, and remember, many of those are reverts, but I don't know what's going on, and I'll use benchmarking tools to try and figure it out. So for example, let's say we have two cache implementations. We're doing cache access on both of those, and we wanna measure how fast an access is as the cache size grows. So we can collect all the reports from benchmark-ips. We say, okay, I wanna populate the cache to size 10, then 1,000, then 100,000, and do a report at each size. We grab each report down here and compile all this data into a CSV.
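The compile-into-a-CSV step is just arithmetic on the per-size results. A standard-library sketch, with invented numbers standing in for the real benchmark-ips reports:

```ruby
require "csv"

# Iterations/second measured at each cache size (numbers made up for
# illustration, not real benchmark results).
results = { 10 => 1_200_000.0, 1_000 => 900_000.0, 100_000 => 150_000.0 }

csv = CSV.generate do |out|
  out << ["cache size", "seconds per 10,000 iterations"]
  results.each do |size, ips|
    # Flip iterations/second into seconds/iteration, then scale up.
    out << [size, (1.0 / ips) * 10_000]
  end
end
```

Dropping that CSV into a spreadsheet gives the cache-size-versus-time graph described next.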
So I take all of those reports and turn them into a CSV. I change it to seconds per iteration, because I actually care about how long each iteration takes. Then I multiply that by 10,000, because I wanna know how long 10,000 iterations take, and then I graph that. So I turn that into a CSV, throw it into Numbers, and I can see what the graph looks like. Along the x-axis there is the number of elements involved; along the y-axis is the amount of time that it took for 10,000 iterations. And we can see that the blue one stays constant and the green one goes linear, or possibly quadratic, we're not sure. But it's actually linear if you go look at the actual implementation. This graph makes sense, because one of these is implemented with a hash, which we know will have constant-time lookups, and one of these is implemented with an array, which we know will be linear time. Where I used this in real life, a real-world example, is routes in Rails. I wanted to understand: does the size of the routing table impact how long it takes to generate an A tag? So I said, all right, let's create a bunch of routes. We'll draw some resources. This is an example with one route. Then I did it again with 10, timed it, did it with 100, did it with 1,000, and then graphed how long it took to generate a link for each of those sizes. And if you look at that, I actually went all the way out there, and we see that we get about a straight, flat line. It looks weird; we have a large standard deviation. But along the x-axis is the number of routes involved, the y-axis is time per link_to, and you can see it's about constant. So we know that adding more routes to the routing table does not impact how long it takes to generate a link_to. Now, the next thing I wanted to understand was: does the length of the route impact link_to? So, how long that href is going to be?
Does that impact how long it'll take to generate an A tag? In order to do that, I wrote another benchmark that matches against GET /a. So right here we have get "/a", and then get "/a/a", get "/a/a/a", et cetera. So we do a length of one, a length of 10, a length of 100, et cetera. And if we plot that and look at the performance, we'll see, again, along the x-axis is how many segments were in that link, and along the y-axis is the amount of time that it took to generate that link_to. And you can see, as we grow the number of segments, the amount of time it takes to generate an A tag gets longer. So we can say that implementation probably has some sort of linear data structure; maybe there's an array involved, we're not sure exactly. So now that we understand what exactly is slow, how can we figure out where our time is spent? The way we do that is with a tool called StackProf, and this is a sampling profiler. The way that we use this is we say, okay, run this code inside of a block, and it'll sample every so often: where are we? What call frame are we at? And the idea behind that is that the longer you spend inside a particular function, the more likely it is that a sample will find that you're there, right? So this dumps out a data file, and we can use this command line tool to dump out what the actual stack profile is. And this is the output I got from it. And you'll see at the very top, that's where we're spending the most time. It's in this method called url_for. We're spending 26% of our time there. So that's where we want to focus our efforts on doing performance improvements. The next thing I want to understand is the number of objects that we're generating.
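StackProf itself is a gem, but the sampling idea just described is simple enough to fake with a thread. A toy I wrote for illustration, nothing like production quality; the method names are made up:

```ruby
# Toy sampling profiler: a background thread periodically grabs the main
# thread's backtrace; methods you spend more time in show up in more samples.
samples = Hash.new(0)
main = Thread.current

sampler = Thread.new do
  loop do
    sleep 0.001
    stack = main.backtrace || []
    frame = stack.find { |f| f.include?("slow_work") || f.include?("fast_work") }
    samples[frame[/slow_work|fast_work/]] += 1 if frame
  end
end

def slow_work
  500_000.times { Math.sqrt(rand) }
end

def fast_work
  100.times { Math.sqrt(rand) }
end

20.times { slow_work; fast_work }
sampler.kill
# samples is dominated by "slow_work" -- that's where to focus optimization.
```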
I use GC.stat to do that. So I say, tell me how many objects have been allocated in the system. This returns a value that is the number of allocations that have happened in the entire system, ever. So it's a number that's always increasing. What I did is I said, okay, find an object from ActiveRecord. Tell me the number of objects that have been allocated in the system. Warm up our cache. Get the number of objects allocated. Do our actual benchmark n times. Get the number of objects that were allocated after that. Subtract the two and divide by n, and then we know how many objects per run we allocated. A real-world example where I'm using this is figuring out where our objects are being allocated with regard to views. In order to do this, I generated a fake request and sent it to our application. So this is doing requests against books/new, and it's doing it count number of times. Right here we set up the Rack environment. This is a hash that we just send to our middleware, that hash of stuff that I totally, totally hate. Count up the number of allocated objects, exactly the same thing we saw in the previous slide. And if we graph this between 4.0-stable, 4.1-stable, and master, our test results look like this. They're actually going down big time. But there's one thing I want you to notice about this graph: the lower left-hand corner actually starts at 2,000. So this is an example of how to lie with graphs. If we set that lower left-hand corner to zero, the graph looks more like this, and then we are sad, because it's not very much. But you have to realize that even though this graph does not look very good, it's actually a 19% reduction in objects since 4.0-stable, and a 14% reduction in objects since 4.1-stable. Now, this is interesting because we can see the overall number of objects that are allocated, but you want to know, say, where the strings are being allocated. Oh, shit. Three minutes.
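The GC.stat trick just described can be sketched directly. The helper name is mine; the `:total_allocated_objects` key is real and only ever increases:

```ruby
# Difference in GC.stat(:total_allocated_objects) across a block, divided
# by n, gives objects allocated per run.
def allocations_per_run(n)
  GC.start
  before = GC.stat(:total_allocated_objects)
  n.times { yield }
  (GC.stat(:total_allocated_objects) - before) / n
end

# A string literal allocates a fresh object every pass; a symbol doesn't.
string_allocs = allocations_per_run(1_000) { "a brand new string" }
symbol_allocs = allocations_per_run(1_000) { :not_a_new_object }
```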
Okay. AllocationTracer. This is the tool that I'm using for that. AllocationTracer will tell you what objects are being allocated. It counts up forever and ever, like our previous one, but it tells you the types. So here's an example. We run this a thousand times and we get the output from that. The output was a little bit longer, but I trimmed it up for the slides. So I wanted to look at speeding up helpers, reducing object allocations inside of our helpers. And I did this by profiling request and response, the test I showed previously, but this time we're generating that hash again and using StackProf in order to figure out where we're doing object allocations. And if you look at the output, you'll see that this is using up 9% of our CPU time. And I want to talk a little bit about SafeBuffer. So I wanted to see where SafeBuffer#initialize was being called, and the way I did this is with the TracePoint API. So look up TracePoint; it ships with Ruby. This TracePoint code says, okay, fire this particular block anytime there's a C call, which is a call to a C method, or anytime there's a normal Ruby call, and inside the block we check for ActiveSupport::SafeBuffer where the method is initialize. So I get access to the call stack, and I can see exactly where those are being called. And you can see down here at the bottom we have two particular calls, and the output of this program shows me that one of those is actually inside the html_safe method. Okay, this will become interesting later, hopefully. The other one is inside of the main function, where we saw that we just allocated that string. Where this is actually happening inside of Rails, if we look at that call stack, is there's a tag_options method.
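The TracePoint shape just described generalizes to any method. Here's the same pattern watching `String#upcase` instead of `SafeBuffer#initialize`, so it runs anywhere; the `shout` method is made up for the example:

```ruby
# Fire on Ruby calls (:call) and C calls (:c_call); when the traced method
# is hit, record the call stack so we can see where the calls come from.
hits = []

trace = TracePoint.new(:call, :c_call) do |tp|
  if tp.defined_class == String && tp.method_id == :upcase
    hits << caller.join("\n")  # full stack at the moment of the call
  end
end

def shout(word)
  word.upcase
end

trace.enable do
  shout("hello")
  shout("goodbye")
end
```

Each recorded stack points straight at the caller (`shout` here), which is exactly how the two `SafeBuffer#initialize` call sites were found.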
This tag_options method is used inside, like, say, form tags, anywhere where we generate attributes of tags. So, you know, href equals whatever, or method equals post: any of those places, we're using this particular helper. Now, where it was actually happening is right here, this ERB::Util.h. So I want to talk a little bit about sanitization. Oh god, I'm not going to make it in time; let's go quickly. We're going to talk about safe buffers. This is an ordinary string. This is HTML safety handling in Rails: a normal string is not HTML safe; we consider that not to be safe. If you tag it, if you say .html_safe, we return to you a SafeBuffer that actually just tags it. So html_safe just tags it, saying that this is safe; we consider this to be something that's okay to write out over the wire. So ERB::Util.h, if you look at the implementation of that, what it would do is say: if it's not HTML safe, we're going to escape it with this gsub, and then we're going to call html_safe on that. What this actually did is the gsub generated a new string object, and the html_safe generated another string object. So we're actually creating two string objects per call whenever we call this method. So if you look at tag_options, we call ERB::Util.h and assign that to value. That value variable is actually just interpolated back into a freaking string, which means that what's happening here is we're going from a string to a string to a SafeBuffer back to a string again. So this SafeBuffer is a waste, and we want to eliminate it. The way we did that is to extract this into another method: we have an unwrapped HTML escape which does not tag the string, it just escapes it. And then we create this extra one down here so we stay backwards compatible. So we update our callers. Now this will only create one string, and we get interpolated down here again. So now we're just doing string to string to string. And this may seem very sad, except that
it saved over 200 allocations per request. So we run this benchmark again, again with books/new, for a certain amount of time. I wanted to look at the types of objects and where they were being allocated. And if we look at that, this is a breakdown by type: string, array, hash. We'll see there's 4.0-stable, 4.1-stable, and master, and we're actually dropping with all of these. Again, it doesn't look super impressive, but remember: 19% reduction, 14% reduction. And again, your mileage may vary. Shouldn't that be kilometers? Shouldn't we be using the metric system? It depends on how your ERB templates are used. I have, like, a lot of slides left. I'm not sure if I should continue. Ben, should I keep going? OK. I just want all of you to get your money's worth. Seriously, I'm not always speaking at Cascadia, and you don't live in my house; my wife hears all of this every day. So anyway, I wanted to look at string object reduction. I wanted to do this even more. In Ruby we have mutable strings. So if you run this code five times here and print out foo's object ID, you'll see every time we execute that block it actually allocates a new string, even though that string never changed each time we went through it. And what's cool is in Ruby 2.1 there's a new optimization where if you freeze a string literal, it's actually constant; we don't allocate a new object. So if you say "foo".freeze and print out the object ID, you'll see it's always the same. Now, let's look at implementations of ERB templates. This is an ERB template, and this is the code that it actually generates. I want all of you to read this very closely. I'm just kidding. Anyway, if we zoom in on part of this compiled template, we'll see we have this string here, and that string came from your ERB template. It's an HTML literal. This is an HTML literal that came from your template. But what's interesting is the HTML template literals can't change. Those strings cannot change. You can't get access to
They cannot be modified, so we added freeze to those HTML literals, and now we just freeze them. Again, this brought our allocations per request down drastically. The next thing I wanted to do is speed up output, and I have to warn you, this is a work in progress. The way we're speeding this up hasn't actually landed in master yet, but it is on my laptop. Hopefully my laptop does not burn up. It's all up here, right? It's fine. It's fine. The way that I was speeding up output is by using the Law of Demeter. I think this is a weird name; I think it should actually be the Suggestion of Demeter. It's a law, but you can't get arrested for violating it. Except here in the United States, where you can get arrested for it. One thing I was wondering, though: if you get arrested for violating the Law of Demeter, does that make you an arrested developer? Yes. When I thought of that, it blew my mind. OK, anyway. The Law of Demeter is interesting. I don't actually understand any of the explanations of it. The way I think about it is that it's not about the dots; it's about the types that you handle in your functions. It doesn't matter how many dots you have in your functions; it matters what types you handle inside of your function, and the fewer types you handle, the faster and easier your code is. Now, again, let's take a very close look at this compiled template. Read it very closely. Zoom in again, and here we have our old friend safe_append=, where we're appending a string literal, an HTML literal that cannot be modified. What's interesting about this is that it's generated; remember, it's generated by our ERB compiler. So let's go look at the output buffer object and the implementation of safe_append=. This is the implementation of safe_append=, and the first thing I notice about it, and this happens to me all the time when I'm working on Rails stuff, is: why?
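As a rough sketch of what a compiled template and its buffer look like with the HTML literals frozen. The class and method bodies here are illustrative stand-ins, not Rails' actual source:

```ruby
require "erb"

# Minimal stand-in for Rails' output buffer (illustrative only).
class OutputBuffer
  def initialize
    @buf = +""
  end

  # Template-supplied HTML is trusted and appended as-is.
  def safe_append=(str)
    @buf << str
  end

  # Dynamic values get escaped before appending.
  def append=(val)
    @buf << ERB::Util.html_escape(val.to_s)
  end

  def to_s
    @buf
  end
end

Post = Struct.new(:title)

# Sketch of a compiled template after the freeze change: the HTML
# literals carry .freeze, so every render reuses one shared String
# instead of allocating a new one per request.
def render(buf, post)
  buf.safe_append = "<h1>".freeze
  buf.append = post.title
  buf.safe_append = "</h1>".freeze
  buf
end

puts render(OutputBuffer.new, Post.new("Hello & goodbye")).to_s
# => <h1>Hello &amp; goodbye</h1>
```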
Why are we checking for nil? This doesn't make any sense. The ERB compiler guarantees that that method is going to be called with a string. We know in advance that's being taken care of for us, so we don't need to check whether or not it's nil. That line can go. So we're not dealing with nils anymore. This function dealt with nils and it dealt with strings, but now we're just calling super with value.to_s: let's make sure that thing is a string, it's got to be a string, to_s, really really a string. But remember, the ERB compiler guaranteed it was a string for us. We know in advance it's a string, so we can get rid of that to_s. Now we know we're just calling super with the value, so that can go away too; since we're just passing all those parameters up, that method can be implemented like this, or it could just go away. So this is how you speed up Rails. Now, I'm not sure whether that was really the Law of Demeter, or why that code was the way it was. I think it may have been defensive programming, but my jokes wouldn't have worked if I called it that. What we have to do when we're dealing with this stuff is limit the types that we're dealing with. It's very powerful for us to say: in this particular function, I only handle strings. Now, if we look at the superclass, we'll see that it has to check whether or not self is html_safe. But we know that that output buffer can't be mutated, and the only way our output buffer can fail to be html_safe is if somebody mutates it. This conditional is only ever true when that output buffer is mutated. But how can you mutate the output buffer? I guarantee none of you know how. Well, I hope you don't; otherwise I have broken your code.
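The refactoring just described, trusting the compiler's guarantee instead of re-checking it, looks roughly like this. A sketch with made-up method names, not the actual Rails diff:

```ruby
# Before: defensively handles values the ERB compiler will never send.
def safe_append_before(buf, value)
  return buf if value.nil?  # can't happen: the compiler always passes a String
  buf << value.to_s         # to_s is a no-op: it's already a String
end

# After: handle exactly one type, String, and both checks disappear.
def safe_append_after(buf, value)
  buf << value
end
```

Both behave identically for the only input the compiler ever generates, but the second handles one type instead of two, which is the "fewer types equals less code equals faster code" point.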
Who mutates the output buffer? I think nobody does that, so I think we could actually reduce the superclass down to this and increase our performance that way. So the first thing we need to do for speeding up our code is eliminate our objects. Remember, and I have no idea who said this: no code is faster than no code. Really limit the types that we're dealing with. Fewer types equals less code, and less code equals faster code. The other thing is: report performance issues to us on the Rails core team. Please open tickets. You don't need to say, "Aaron, my app is super slow." Just say, "Aaron, I think that finding a record with Active Record should be faster; it's too slow." We need to know this particular stuff, because we don't really have a good suite for preventing performance regressions. I'm not sure how to do this, so please help if you can. If you ever see any performance issues, report them to us. If you see something in, I don't know, Node or some other framework that's very fast compared to something we do in Rails, report that. Say, "Hey, this other framework does something equivalent to what we do in Rails; why is theirs faster?" It's probably a bug. And before you decide to freeze all of the strings: measure, measure, measure. Seriously, don't go around freezing all the strings in your code base; your coworkers will hate you. Only do it in places where it's actually a bottleneck, where you're actually seeing performance issues. So, to conclude, I know that I'm over time: Rails 4.2 will be the fastest Rails ever. Thank you. That is all.
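The closing advice, measure before you freeze, can be sketched with a small allocation counter. The helper below is our own illustration, not a Rails or Ruby API:

```ruby
# Count how many T_STRING objects a block allocates, using the core
# ObjectSpace heap statistics (GC is paused so counts stay stable).
def string_allocations
  GC.disable
  before = ObjectSpace.count_objects[:T_STRING]
  yield
  after = ObjectSpace.count_objects[:T_STRING]
  after - before
ensure
  GC.enable
end

plain  = string_allocations { 100.times { "foo" } }         # one String per pass
frozen = string_allocations { 100.times { "foo".freeze } }  # shared frozen String
puts "plain: #{plain}, frozen: #{frozen}"
```

Only when a measurement like this shows a real hot spot is the `.freeze` worth the noise in your code base.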