Our first morning keynote this morning, and that is Aaron Patterson. Come on up, tenderlove. One sec, I gotta do this real quick. All right. For posterity. Thank you. Thank you. No. I'll give you a hug, but only in person. Okay. Okay. Okay. Time for me to turn into a nervous wreck. Should I start dropping F-bombs now? Is this it? My race car. Okay. Okay. All right. There we go. All right. So Ted Cruz and John Kasich both dropped out, so we have to come to grips with reality now, which is to make Rails great again. Brad said he wouldn't make any PHP jokes, but I will. Okay. All right. Hello. Hello. Hello. I'm Aaron Patterson. I'm going to introduce myself because I noticed many people here are new, so in case you don't know who I am, my name is Aaron Patterson. I go by tenderlove on the internet. That is my name everywhere, mostly. And maybe you've seen this name or my avatar. I don't look like my avatar. This is, that is me. That's actually me. It's really me. I promise. Anyway, I'm on the Ruby core team and the Rails core team. I'm also on the Ruby security team and the Rails security team. So that's, that's fun. I thought I'd try this out for all of the security nerds here. This is my PGP fingerprint. I'm not going to read it to you. I am the number two Rails committer. There is the list right there. I'm number two. The reason I'm number two is because I took my commit points and I traded them in for stuffed animals and those, you know, those plastic soldiers with the plastic parachutes. I love those things. Anyway, I work for a small startup company in San Francisco called GitHub. I recently started working for this company. So thank you for your patronage using GitHub. I really appreciate it. The company is very legit. I like to give hugs on Fridays.
I'm not going to do that here with so many people, but if you come up to me later, I will give you a hug. Anyway, so I started working at GitHub a little over a month ago, and I was looking through all the stuff we have, and they're like, yeah, we've got slide templates and they're really cool. So I started using the slide templates, and they are very cool. They're way too cool for me. So I'm going to switch to something more my style, which is just default slides. So, just in case you might not know this, this is actually a sponsored talk. It is brought to you by one of my coworkers. I have no tea. Please tweet at her, because she's paying me for the slot in emoji, and she said she wouldn't pay up unless I proved that it was actually in my slide deck. So I want you to tweet at her and say hello. This slide deck is also brought to you by my cats. This is, I love this face. It's like, this is the face I make when I'm staring at code. This is SeaTac Airport Facebook YouTube. We call her Choo Choo for short. That's her resting face. This is Gorbachev. He thinks he is hiding. Here's a better picture of him. It's Gorbachev Puff Puff Thunderhorse. And I have a bunch of stickers of my cats. I also have GitHub stickers too. So if you want to come up and say hello to me, I will give you stickers of my cats and GitHub stickers if you would like them. Also, the talk is brought to you today by my wife. So thank you, wife. Thank you. So I want to talk a little bit about my job at GitHub, or, the way I like to say it, 'jit hub. Legit hub. Anyway, GitHub. So one thing that really, really inspired me is that when I was looking through their slide templates, I noticed at the bottom it said, GitHub: how people build software, and I was very inspired by that keynote footer.
Anyway, so I actually, I take that statement very seriously and that is what I am doing at GitHub. Like we develop software, you all develop software. So what my job is at GitHub is basically bringing GitHub application development to everyone. And that means taking things like our development style or anything we use for, you know, performance features, security features, anything that is not core to GitHub itself, anything that's not core to our product, I'm trying to take that and push it upstream into Rails or Ruby, any of those types of things, trying to make them public so that we can all benefit from that and build, all of us can build better applications. So essentially what my job is to do is Rails core development and also Ruby core development. So please buy our products, help support me. So I want to talk a little bit about my career goals. I've been thinking about my career goals for quite a while now and I'm a 35-year-old software engineer and I don't want to admit that but we're not in San Francisco so I think it's okay. So when I first started my career I thought to myself, you know what, I want to get rich. I thought about this. I'm like, I want to get rich. I'm going to retire by the time I'm 30. I won't have to work anymore and then I got a little bit older and realized that, you know, that's going to be very difficult and I started thinking, okay, why do I want to get rich? Like what is the point? Why do I actually want to do this? And the reason, after much introspection, the reason I wanted to get rich is so that I could do what I want to do all day, which is making Rails great again. So I came up with another solution for this. There is a different solution. So think about this. 
If you're completely independently wealthy and you could do whatever you wanted to do all day, and what you wanted to do is make Rails great again, what would the difference be between just drawing a salary for that, being a cog in the machine, versus being totally rich? So what I want to do with my career is I want to become a cog. I want to be the lost cog in the 'jit hub so that I can work on making Rails great again. So this is actually my career goal: to become a cog. Unfortunately, I'm giving a keynote, so you can tell I'm failing at this job. My career is not going so well. Anyway, let's talk a little bit about what's new in Rails 5, alright? So, new in Rails 5. First off, Rails is in its prime. This is the first time that Rails has been in its prime since Rails 3. We no longer do XML sit-ups. We do JSON burpees. Thank you, Justin. I want to talk a little bit about threading, the new threading features in Rails 5. And you may have noticed, unfortunately, that DHH is not here. We have had a race condition. He could not make it. But I guess that's okay. He's racing in Le Mans, and I heard that that is essentially just NASCAR for Europe. So, alright, I know many of you did not laugh at the previous threading joke. And you're thinking to yourself, I came to the opening keynote and I heard that joke there. Well, let me tell you, we have a little bit of Ruby drama today. I am extremely angry at Jeremy. I want to take you all back to a time, back to a previous place. Back to Friday, April 22, 2016 at 7:57 p.m. I am going to take you to my chat history, right there. Now, you see, my normal shtick is essentially, you know, DHH gives his keynote and then later on I give mine, and for the beginning portion of my presentation I basically just give him a hard time.
And I thought to myself, this is going to be really difficult this year because, well, I really like Jeremy and I don't want to give him a hard time on stage. So I was trying to figure out, what am I going to do with the opening part of my presentation? And then this happens. Even back, so this is, what is this, 2011, five years ago, he did a double dream hands on stage with me. I got everybody on the Rails core team at the time to do that except for DHH, so I had to give Jeremy a lot of respect. But then he does this to me. Alright, so I'm going to give him a little bit more of a hard time. Five. Five dollar. Five dollar foot long. What even is this? Where did he get all these slides? Come on. He's not giving 110%. He's just repeating the same thing over and over again. I feel like he got this from the surplus of slides. I'm just kidding. I love you. I love you. Jeremy, you're the best. Alright, so I want to talk about some major changes in Active Record. We're going to talk about some major changes. And to talk about this, I want to talk about SQL. I want to talk about SQL, so I've prepared a statement. Please notice my advertisement thoughts. Alright, actually we've got a new thing called ApplicationRecord. If you generate a new application today, you'll see that we have this new class, ApplicationRecord, in your new application. It basically inherits from ActiveRecord::Base, and you can put methods in it that you want to have in all of your models. The other thing that we've started doing is, like, I want to appeal to startups. I heard in Justin's talk that we're giving way to Node.js, so that worries me. So I want to start appealing to startups, people who want to start their businesses on Rails. So what we've done is, if you take a look at the new application generation, so here I am. I know you can't read it. It's fine. I'm generating a new application. I've named this application OMGLOL2.
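The ApplicationRecord idea he describes is small enough to sketch in plain Ruby. Since loading all of Active Record is beside the point here, this sketch uses a stand-in base class; in a real Rails 5 app, ApplicationRecord inherits from ActiveRecord::Base, and the Post model and audit_label method below are hypothetical examples, not from the talk:

```ruby
# Stand-in for ActiveRecord::Base, just so the sketch runs on its own.
class FakeBase; end

# app/models/application_record.rb -- the Rails 5 pattern: shared model
# behavior lives here instead of being monkey-patched onto the base class.
class ApplicationRecord < FakeBase
  # Any method defined here is available in every model of the app.
  def audit_label
    "#{self.class.name} record"
  end
end

# Models inherit from ApplicationRecord rather than ActiveRecord::Base.
class Post < ApplicationRecord; end  # hypothetical model

Post.new.audit_label  # => "Post record"
```

The benefit is that app-wide model concerns no longer pollute ActiveRecord::Base itself, which other libraries also subclass.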
And that is because I have multiple OMGLOL applications on my machine. And you'll see here there's a new file, which you can't read, but it's okay because I'm going to zoom in here. And if you look at it, under the models directory we have a business.rb. And this comes with all new Rails applications. And if you open this up, it's actually a business model for you. So this is how you can make tons of money. So we generate a new business model, and this is actually a Hacker News compliant business model. So here you go. All right. So I love surprises. Surprises are great. I mean, they're never bad, right? When you get surprised, it's always, always, always a good thing. You go home and there's a surprise party for you, you're super happy. They never turn out to be terrible things. And last year we had some surprises. Like we had a surprise Action Cable announcement. It was a brand new surprising thing, even to me. So I want to take the opportunity this year to also announce a surprise. This is going to be in Rails 5. And that is PHP template support. And I figure with this, so we're appealing to startups, and I think, well, maybe if we combine forces with the PHP community, we could take out Node.js. So what I've done today, I want to give you an example here. If you look under app/views, we have an index.php, and we print out phpinfo(). And if I run the Rails server and then access it, you'll see, okay, here's our nice Rails 5 intro screen. And then we go to the users page, and there's our phpinfo(). I want you to look around you. What do you see? Rails 5 running PHP 5. So this isn't enough. You're like, oh, whatever. It probably just shelled out to PHP and it's printing out the thing. This is no big deal. Well, let's go a little step further. So here's our phpinfo() thingy. I'm going to change this to access a variable, @hello_world. You'll see that @hello_world there. And if we go into the controller, come on here and work, Vim. You can do it.
You'll see we have an instance variable named @hello_world with the time now. And we print that out from PHP. And then hopefully I go over here and reload the page, and you'll see the actual time that was made in Ruby and goes into your PHP template. This is not a joke. But we need to go further. This isn't enough. You're thinking, oh, he's probably just printing that out. It's just a string. Who cares about that? He's printing that out and sending it, you know, shelling out to PHP and it's doing some magic trick. No. No, no, no, no, no, no, no, no. So what we're going to do is we're going to access our Active Record models here. I'm going to go in and access User.all and get rid of that instance variable. And then, in my horrible PHP, you can see here this is a traditional PHP template. This is what you'll find in most PHP files, including my terribly broken HTML. Look at that. Look at that indentation. Good job, Aaron. So we reload it. And of course there's nothing there, because we didn't put anything in our database. So I'll go over to users/new and add some users. I think I sped this up a little. I'm just adding users. It's normal, you know, normal stuff. So add some users. And then we go back to the index, and there they are, with my horrible HTML. And then I think I fixed it. I don't know why I needed to show you this. I was like, I can't get my lists right. Add the traditional indentation. There it is. Look at that. So, yes, PHP 5 in your Rails 5. You'll have PHP templates. I don't think I mentioned this at the beginning of my talk, but some of the things I am saying to you might be a lie. It is up to you to decide which is true and which is not. All right, so let's get down to business. The actual title of my talk. We're done with the fun part of the presentation. Let's get down to business. So I typically do the reverse-mullet style presentation, where we have fun in the front and then business in the back. So let's get down to business.
The title of my talk is: what have you done for me lately? And, you know, I have done a lot of stuff for you lately. I really have. But I know that you're all thinking to yourself, yes, but what have you done for me lately? So that's the real title of my talk. What have you done for me lately? Really lately. I want to talk about performance. Surprise, performance. I'm talking about performance. I love performance. I love fast code. And the reason I love it is because it's part of my plan to become a cog. Essentially, if all of your code gets faster, you feel happy, right? It's like I'm Patrick Swayze from Ghost coming over and helping you do programming, and you're like, everything's faster. It feels good. I'm not sure what's going on. You look over your shoulder and nobody's there, right? That's me. I want to be that cog. So we're going to talk about boot time performance, runtime performance, and memory efficiency. And when I do performance work, I always have to think that performance is always about tradeoffs. When you're doing performance work, it's always about tradeoffs. You're always trading something, whether it's memory for speed, or, you know, anyone who caches something understands this. You want to have a faster page, you cache it. You cache the page. But that cache had to go somewhere. So you're paying some sort of price for that speed. Or maybe concurrency for memory. This one's an interesting one that isn't so common. You give up concurrency so that you can save memory. Or complexity for memory. Like maybe you want to reduce the amount of memory that you're using, so you increase the complexity of your code. And we'll look at some of that later. But the point here is that performance is never free. It's never free. You can't just have free performance. So it's very important for you, when doing performance work, to understand the constraints that you're working under. How fast does the code have to be?
How much memory should it consume? All these particular constraints, you need to figure out before you do any performance work. So I'm going to talk about boot time performance. And I think this is interesting because... You're eating my advertisements. I think it's interesting because it impacts running tests, it impacts server deployment, restarting in production. I like restarting stuff. That makes me happy. I'm not really a Spring user. So boot time performance is important for me. When I look at that, basically I think about boot time performance from the very beginning, when I do bundle exec rails s. This is it. I look at this whole thing, and I think about the different parts that are involved here. If you look at each part of this command: at the very beginning we have the bundle command. That bundle command is actually a bin stub. This bin stub is installed and controlled by RubyGems. So if you care about the speed of just that bundle part, you need to look at RubyGems. If you care about the speed of the exec part of this command, you have to look at Bundler. That is where you need to look. With the rails command, you need to look at Rails. And also with the s command, you need to look at Rails as well. Now, if you think about the boot process and we extend those lines out a little bit, it's essentially a timeline of what your code is doing at any particular point in time. So we can extend that out. And we know that as soon as we hit enter, essentially we're going to be spending our time in RubyGems, then Bundler, then Rails. That's the timeline of our boot process. And a lot of people say, okay, well, you know, speeding up Ruby speeds up everything. Why don't you just speed up Ruby rather than speeding up those particular things? And I do like working on speeding up Ruby. I think that's a fun task.
But just because speeding up Ruby speeds up everything, that doesn't mean we need to write slow code. We can write faster code and have speed today. So I'm going to talk about two optimizations at the very beginning of the boot process. I'm going to talk about Ruby, sort of, and I'm going to talk about RubyGems as well. So the other day I was running an empty program. I do ruby -e and I give it a blank string. And I see that it takes 100 milliseconds, and I think to myself, OMG, Ruby is slow. And anybody that runs this program and sees that it takes 100 milliseconds, they say Ruby is slow. I can't blame them for saying that. Everybody should think this is slow. But if I disable RubyGems and I run it again, I see that it takes maybe 50 milliseconds. Now, I want to make two points with this slide. First is that measuring the amount of time that RubyGems takes at the very beginning of the boot process is a fairly difficult thing. The way I do this is measurement by elimination. Essentially, what I do is say, okay, well, let's just look at the boot time of Ruby, then remove RubyGems and look at the difference between those two, and that's how we can kind of gauge how much time we're spending in RubyGems. So we know that we have to optimize in that particular place. The other point that I want to make is that placing blame is difficult. When you're running that ruby -e blank string, how do you know that it's actually RubyGems' fault? I would not blame anybody for looking at that and saying, wow, Ruby is slow, right? Even though, unbeknownst to you, under the hood half that time is spent in RubyGems. It's not actually Ruby's fault. So why is it slow? The reason it's slow is this file, gem_prelude. This is a file that is in Ruby's source code itself. If you go look at Ruby's source code, you'll find a file called gem_prelude.rb. When you install Ruby, you won't see this file anywhere. It's only part of the source distribution.
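The "measurement by elimination" approach he describes can be sketched by timing an empty Ruby program in a subprocess, with and without RubyGems. The absolute numbers depend entirely on your machine, so treat this as illustrative:

```ruby
require "benchmark"
require "rbconfig"

# Path to the currently running Ruby interpreter.
ruby = RbConfig.ruby

# Time an empty program with RubyGems enabled (the default)...
with_gems = Benchmark.realtime { system(ruby, "-e", "") }

# ...and again with RubyGems disabled entirely.
without_gems = Benchmark.realtime { system(ruby, "--disable-gems", "-e", "") }

# The difference approximates RubyGems' share of boot time.
rubygems_cost = with_gems - without_gems
```

Note that this measures whole-process wall time, so it includes fork/exec overhead too; run it a few times and compare medians rather than trusting a single sample.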
So if we look at the inside of this file, this file gets loaded every time you start up Ruby. The first thing it does is it loads RubyGems, right here. So why are we loading RubyGems? Why does this thing load RubyGems at the very beginning of your process? The reason is because back in the bad old days of 1.8, you had to require RubyGems before everything. Today, in Ruby 1.9+, you don't have to do that anymore. It's built in. Ruby does this for us, and that's what the gem_prelude provides to us. So we load RubyGems, and then you'll see here, this line loads the did_you_mean gem. This is new in Ruby 2.3. So what does the did_you_mean gem do? If you make any typos, it tries to suggest to you what they should be. So here I've made a typo. I wrote object_ip, and then it says to me, did you mean object_id? So it tries to suggest the correct method for you. This is essentially the Clippy for Ruby. So did_you_mean is a gem, and it is shipped with Ruby as a gem, which means that RubyGems has to load it. RubyGems is responsible for doing that. So right here, RubyGems loads the did_you_mean gem. So what exactly does the require method do? This is essentially a TL;DR of the require method. You don't need to read it super closely, but essentially what RubyGems' require method does is iterate over every gem on your system, and it says, hey, gem, do you have this file? If you have this file, I will activate you and then try requiring the file. So that's essentially this loop right here. That loop is O(3N), or just O(N), because it's actually testing for three particular files. It's saying, hey, do you have the file with no extension, the file with the .rb extension, or the file with the .so extension? So what this means is that the more gems you have installed, the slower require gets. And you'll notice that we always do require at the beginning of the Ruby boot process. So what this means is that the more gems you have installed, the slower Ruby gets.
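The scan he describes can be simulated in a few lines. This is a toy model of the loop, not RubyGems' actual code; the gem names and file lists are made up for illustration:

```ruby
# RubyGems probes each candidate file under three extensions.
EXTENSIONS = ["", ".rb", ".so"].freeze

# Toy model of the RubyGems require scan: walk every installed gem and
# ask whether it owns the requested feature. O(N) gems times 3
# extensions -- the O(3N) loop from the slide.
def find_owning_gem(installed_gems, feature)
  installed_gems.each do |name, files|
    EXTENSIONS.each do |ext|
      return name if files.include?(feature + ext)
    end
  end
  nil
end

# Fake "installed gems" (name => files it ships), alphabetical like the
# real scan:
gems = {
  "aaa_rails"    => ["aaa_rails.rb"],
  "did_you_mean" => ["did_you_mean.rb"],
}

find_owning_gem(gems, "did_you_mean")  # => "did_you_mean"
```

Every gem you install adds another iteration to the outer loop, which is exactly why more gems means a slower bare require.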
And we can actually see this in action. So this is a dtrace script to watch all the stat calls for a particular process. What this command is doing: I'm running as root. I have to be root to run it. The -q flag makes it not print out stuff I don't care about. Here, I want to look at all stat calls on my system, and then here I print out the file that's being statted. And then down here is the command that I'm actually running. I use rbenv for my Ruby management. You'll have to figure out a different command for yours. And then I'm just running the empty program there. So when I run this, we can watch this in action. If I run this and just count the number of stat calls with RubyGems on my system, we see about 298 stats just on boot. Without did_you_mean, if I deactivate the did_you_mean gem, we only see 12. And without RubyGems or the did_you_mean gem, we see five file stats. And what's neat is, if you look at all these stats, you'll see a printout that looks something like this. This is a sample. This isn't the whole 300 of them or so, this is just a sample of them. What you'll see is that it's trying to go through every single version of every single gem on your system looking for that file. By the way, just a side note, I think this is really interesting: it's statting files that don't have an extension, and even if that file exists, Ruby won't actually require it. So it's okay, I have some good news. There's good news, everybody. Good news. The did_you_mean gem starts with D, and D is pretty early in the alphabet. And these are sorted alphabetically, so it'll only go to the Ds. So we've got that going for us. So one good way... one good way that I would propose we speed up the Rails boot process is that we rename Rails to AAA Rails. Reminds me of phone books. All right, so let's make an improvement. We can actually speed this up by using the gem command. There's a gem method.
So that top program is just a normal require, requiring the did_you_mean gem. The next one is a different program that does gem "did_you_mean" and then requires did_you_mean. And if we compare the stats on those, you'll see a bare require took about 300 stat calls, where doing gem plus require only took 16. So why is this faster? The reason it's faster is because, if you look at that gem thing, which I know you can't read, it's fine, the gem command actually mutates the load path. And the first thing that require does is ask Ruby, hey, can you require this file? Is this file in the load path? And if it is in the load path, it'll just require it and we continue on with our lives. So we have an O(1) lookup of the gemspec. We know the name of the gemspec. We just look it up. And then we have an O(1) require of the file. It's not actually O(1), it's just very small. So to fix this, we just patch the gem prelude, and this is the patch I applied to the gem prelude. Just a two-line diff here. We went from a bare require that's O(N) to a gem plus require that's O(1). So in this particular case, we are trading off complexity for speed. Okay? We're adding more complexity to the code, but we're gaining speed out of this. Now, complexity does have overhead. And even this patch has overhead. And you might be thinking, hey, that's only one line. It's one line. It doesn't matter. Yeah, we'll add that. There is no cost to that. But if you look at these commands, you'll see, I want to go back to this command. There is an optimization in RubyGems. When you do gem install bundler, and it installs the bundle bin file, RubyGems knows that, hey, when you run the bundle bin file, we probably want to activate the bundler gem. It came from the bundler gem. We know that it came from the bundler gem. So let's activate the bundler gem as soon as you call bundle. And you could see this. Before RubyGems 2.5.2, if you looked at those bin stubs, you would see right here we have those two lines: gem "bundler", and then do the load.
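The gem-plus-require pattern looks like this in plain Ruby. The json gem is used here only because it ships with Ruby, so the example runs anywhere; any installed gem name would do:

```ruby
# Fast path: activate the gem by name first. This is an O(1)-ish spec
# lookup that pushes the gem's lib directory onto $LOAD_PATH...
gem "json"

# ...so this require finds the file directly on the load path instead
# of triggering the O(N) scan over every installed gem.
require "json"

JSON.generate({"ok" => true})  # => '{"ok":true}'
```

This is the same shape as the pre-2.5.2 bin stubs he shows: `gem "bundler"` followed by a load, which is exactly the line the later RubyGems releases dropped.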
This is for all bin stubs. I'm only picking on Bundler because I use it all the time, and it's great. Anyway, so we have an O(1) require here. This is what it is. And then if you look at RubyGems 2.5.2 to 2.6.2, that gem call is actually gone. So we've gone to O(N) time here, which means that all of your bin stubs now get slower as you install more gems. So the point here is that even though that was a one-line change, even RubyGems maintainers can miss this too. So you do have to think about these trade-offs when you're looking at code complexity. All right. So let's move on. We did RubyGems, and now I want to talk about Rails. I don't want to talk about Ruby and RubyGems so much. Let's talk about booting up Rails. And I'm excited to talk about this because of new technology that is not actually new, because I went to Koichi's talk yesterday and he just told you all of this. So if we take a look at startup times, it'll break down something like this. These are not actual times. You have a pie chart here. These numbers are all, oh, should I swear on stage? These numbers are all bullshit. Oh! I said a swear word. All right. So these numbers, these are just made up. What I'm trying to say here is that these are not actual times. These are just things that you do as you're booting your Rails process, and they take some amount of time, and we don't know what those times are. We don't know what they are. So how can we measure them? For garbage collection, sometimes I'll just disable the garbage collector and compare the benchmarks and see what happens. So we can eliminate GC from the boot time process. We can measure its impact just by doing these simple things. So we can cut that out, and if we cut that out, we see it doesn't really impact boot time so much. So we can eliminate that. We have another thing in here: load path searches.
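The GC-elimination measurement mentioned a moment ago can be sketched directly: run a workload with the garbage collector on, then off, and compare. The allocation loop below is just a stand-in workload, not anything from the talk:

```ruby
require "benchmark"

# An allocation-heavy stand-in workload.
work = -> { 200_000.times { Object.new } }

# Baseline: workload with GC enabled (start from a clean heap).
GC.start
with_gc = Benchmark.realtime(&work)

# Elimination: same workload with GC disabled. The gap approximates
# GC's share of the runtime for this workload.
GC.disable
without_gc = Benchmark.realtime(&work)
GC.enable
```

The caveat is the usual tradeoff: with GC disabled the process just keeps growing, so this is a measurement trick, not something to run in production.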
At GitHub, we actually have code for caching these lookups, so we don't do load path searches when we boot our Rails application. But you can do this as well. If you get this gem called bootscale, you can also cache these lookups, so you don't have to do load path searches at boot time for your application. So we can eliminate that from the pie as well. And like I said, we eliminated GC earlier, so basically all we're really left with is compilation and execution. So, the amount of time it takes to compile your Ruby code, and then whatever execution we're doing at boot time. So let's tackle compilation. And to learn about this, I want you all to go to Koichi's talk, which was yesterday. So I'm going to talk a little bit about it now. Your normal program flow looks a little something like this. We take source code. We compile that source code into bytecode, and then we execute the bytecode. So the idea is, we want to take that beginning part, that source-code-to-bytecode translation, and cache it in a file somewhere, so that rather than doing this translation at boot time, instead we'll do something like this, where we read the bytecode from a file and just evaluate that bytecode. So we don't actually have to do that compilation at boot. So here's an example of a compilation script. Some of these methods are new in Ruby 2.3. All this does is take an input file and compile it to an output file. And we can run it, and you'll see when you run this, compiling hello.rb, there's just some binary data in there. So our first half is done. We've completed our first half. The next part is we need to be able to load and execute this bytecode. We evaluate the bytecode, and then we assume that the Hello class exists. And we knew that it existed in that bytecode. So this works. We know that Hello is available. If we run it, it works. So we're able to load the bytecode. We're able to write it to the disk and load it back in. So the second half is done.
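The compile-and-reload round trip he demos uses the RubyVM::InstructionSequence APIs that landed in Ruby 2.3. A minimal version of both halves, using a temp file as a stand-in for hello.rb, might look like this:

```ruby
require "tempfile"

# Stand-in for hello.rb: some source code on disk.
src = Tempfile.new(["hello", ".rb"])
src.write("class Hello; def world; :hi; end; end")
src.close

# First half: compile source to bytecode and dump it to a binary file.
iseq = RubyVM::InstructionSequence.compile_file(src.path)
bin_path = src.path + ".bin"
File.binwrite(bin_path, iseq.to_binary)

# Second half: read the bytecode back and evaluate it -- no parsing or
# compiling happens on this path.
loaded = RubyVM::InstructionSequence.load_from_binary(File.binread(bin_path))
loaded.eval

Hello.new.world  # => :hi
```

The class defined in the original source exists after evaluating only the cached bytecode, which is the whole trick: pay the compile cost once, skip it on every subsequent boot.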
We are 100% done, but Rails isn't faster. And this is because I always give 110%. So this is something I don't really want to do, but we have to do it to get that last 10%. So let's look at the require process. We have three different files: A, B, and C. And they depend on each other. And if we look at the dependency graph, it looks something like this. We have A, which requires B; B requires C; C goes back to B. So we have this kind of weird dependency tree. And if you look at the actual logic of the require process, essentially we acquire a lock for A and start requiring B, acquire a lock for B and start requiring C, C tries to use B, B is already locked so we can't use it, we finish requiring C, release the lock for C, release the lock for B, et cetera, et cetera. So this sucks. The point I'm making with this slide is that there are too many rules and I don't want to figure them out. So essentially what I did is patch Ruby such that it would call a callback when it needed to look up a file, and we could just have it do whatever we wanted to at that point. We essentially add a require hook here that says, okay, when we want to load foo, instead of looking through the file system for it, it will actually hit this method, and we can do whatever we want with it. And you can go look at that on my fork of Ruby at GitHub. Yes. Got to get that money. So we have an example here that loads compiled code. This is a lazy compiler. GitHub. So if we compare the two, if we boot Rails with and without precompiled code, we see this example. Before, it took about 1.8 seconds; after, it took about 1.2 seconds, so we're about 30% faster. And I'm actually really excited about 30% faster. Koichi said in his talk, oh, it's not that much faster. It's only 30% faster. And I'm like, no, 30%! Better than what we have today. We should do this. But what I think is really cool about this is that the more code we have to load, the bigger the impact this will make.
And on our application at work, we have a lot of code. So this will help us out a lot. For future work, I'd like to upstream this new callback if I can someday. I think it would be interesting if we compiled code on gem install, so we could just have that available. But that's cache invalidation, so I'm not sure about that. Need to figure that out. So next we'll look at runtime performance. I talked a bit about boot time performance; now I want to look at runtime performance. I wrote a patch a while ago for doing polymorphic inline caching, and I want to talk a little bit about this. So what is inline caching? There's a cache in your code and you don't really see it. This cache lives anywhere there's a dot. So that cache is right there, and it says, hey, this object is of type Hello. Where is the foo method? You have to look up the foo method when this code is executing. And you don't see this cache because this cache is inline. Hence, an inline cache. Yes. So basically, as soon as that code executes, the cache contents will look something like this table, where we have a key, which is our Hello class, and a value, which is where the method is located. So the second time this executes, the VM says, hey, I know this is Hello and I know it's for the method foo. The cache hits. I can just go directly to that method. So, our next thing: polymorphism. If we have two types here, it sees that first type, Hello, and we get a miss, but it populates the cache. Then the second type comes through, and we get a cache miss there; the key is wrong, so it misses. And what this means is, in MRI, our cache size is one. We can only cache one value, and we call this a monomorphic inline cache. It's one value, one type stored in this cache. So what if we had a cache size of two? If we had a cache size of two, then we could say, okay, we missed on the first one, we missed on Hello, but that gets put into the cache. But the subsequent Hello hits, and the subsequent World hits as well.
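The monomorphic and polymorphic call sites from the slides can be written out directly. MRI's cache state isn't observable from Ruby code, so this only shows the shape of the code the VM is caching; the class and method names are the same toy ones from the slides:

```ruby
class Hello; def foo; "hello"; end; end
class World; def foo; "world"; end; end

# One call site: the single `obj.foo` dot below carries one inline
# cache, keyed by the receiver's class.
def call_foo(list)
  list.map { |obj| obj.foo }
end

# Monomorphic: the call site only ever sees Hello, so MRI's one-entry
# cache hits after the first call.
call_foo([Hello.new, Hello.new])

# Polymorphic: the same call site sees two classes. With a cache size
# of one, every change of receiver class is a miss; a two-entry
# (polymorphic) cache would hit on both.
call_foo([Hello.new, World.new])  # => ["hello", "world"]
```

The key point is that "the call site" is the dot in the source, not the object: the same method body flipping between receiver classes is what turns a cheap cache hit into a repeated lookup.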
So a cache size of two or more, we can call that a polymorphic inline cache. It's caching multiple types. This particular optimization pays off when these call sites see multiple types: if you see many different classes at a particular call site, it pays off. And you can go see, I have this polymorphic inline cache patch on GitHub, so you can check it out and try it out yourself. But unfortunately, the TLDR is it didn't help us. We ran it on our application and it didn't help our application at all. And the reason is, if you count the number of types seen in the application — this is the test application that I used — only 3% of the call sites had two or more types, which meant that we would only see maybe, possibly, a 3% performance improvement. This particular optimization is a trade-off of complexity and memory for speed: we're using more memory, we're adding more complexity, and we're gaining speed. But unfortunately, that speed is only 3%, so for our application this patch was not worth it, and it probably won't get upstreamed. Now, the last call site had 1,600 types — 1,600 different types of objects passed through one method. I thought that was very interesting, and where that actually came from is someone calling instance_eval on an object. Whenever you call instance_eval on something, it creates an anonymous subclass of Hello, or an anonymous subclass of whatever it is, and attaches it. You probably know this as the metaclass. That means that in this particular case, the receiver at this foo call site is always an instance of a fresh anonymous subclass, which means this is always a cache miss. So when I was counting types in our application, that's what I was seeing: 1,600 anonymous classes coming through. So it's an interesting question to ask: what class is this?
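Here is a small sketch of that singleton-class behavior (the `Hello` class is a made-up example): calling `instance_eval` forces an anonymous subclass into existence, even though the object still reports its class as `Hello`. Note also that no methods get added to the singleton class, which matters for the optimization discussed next:

```ruby
class Hello
  def foo; :foo; end
end

obj = Hello.new

# Calling instance_eval forces the singleton class into existence.
obj.instance_eval { }

# The object still reports its class as Hello...
p obj.class                              # => Hello
# ...but it now has an anonymous subclass of Hello attached, and that
# subclass is what the inline cache sees as the receiver's "type".
p obj.singleton_class.superclass         # => Hello
p obj.singleton_class.singleton_class?   # => true
# No methods were added to it; it exists purely as an attachment point.
p obj.singleton_class.instance_methods(false)  # => []
```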
When you look at var, what class is var? And you think, oh, well, it's an instance of Hello. But I actually like to think of this more as an instance of an anonymous subclass of Hello, and that anonymous subclass is lazily populated. It's also known as the metaclass. And this is actually okay because of the Liskov substitution principle. It's fine. It's a VM optimization that we go around pretending this is Hello most of the time. So when does this singleton class get instantiated? Anytime we call instance_eval or singleton_class, or add a singleton method to something. There are other cases where this happens, but these are the main ones that I saw. Now, I know you're wondering: if I'm a Rails dev presented with this info, why should I care? Well, the reason you should care is that some libraries do this, and unfortunately, they are popular libraries like RSpec or EventMachine. So the solution is: don't write code like that. Just don't do that. Please. And if you really, really think you have to, don't. Seriously, just don't. But if you really, really, really must do that, we can speed it up. What I found interesting in RSpec is that these singleton classes are classes with no new methods added to them. RSpec didn't add any methods to the singleton classes; it just used the singleton class as a storage location. And a singleton class with no methods is exactly the same as its superclass, meaning that those two things can share a cache key. So what I did is, I made a patch for Ruby — I haven't upstreamed this — that would actually share those cache keys, and that's what it looks like. Again, GitHub. And here's a benchmark for it, comparing essentially polymorphic call sites between the singleton class version and the non-singleton class version. If we look at it, the original took about 2.5 seconds to run this benchmark.
And after sharing those cache keys, we see it go down to 1.3 seconds, so it's about 45% faster. In this case, we're trading complexity for speed, and it's probably worthwhile: that's a good performance improvement, and the patch is relatively small and innocuous. Another thing I thought was interesting about doing this work is that I felt really, really bad about it, because I subscribe to the Just Don't Do This school of performance improvement. So I tried to refactor RSpec to not use singleton classes anymore, and it turned out it was actually easier for me to just optimize Ruby. I don't mean that as a dig at RSpec — RSpec's great. I'm just saying, that code, man. All right. So, memory efficiency. We've done boot time, we've done runtime; let's do memory now. I want to talk about copy-on-write, a.k.a. COW. We use copy-on-write a lot. You probably use it too. First, I want to talk about impacts to copy-on-write, and then about heap layout optimizations. Those are the two topics we're going to discuss here. Copy-on-write, essentially, is when you have a parent process that points at some memory, and a bunch of children, and those children don't get a copy of that memory — they actually just point at the parent process's memory. So they share that memory. When you have 10 processes, it doesn't use 10 times the memory; it maybe uses only one process's worth of memory. They all point at that one particular bit. Now, if any of these children writes to that bit of memory, the parent process shouldn't see that write happen, so what the operating system does is copy that memory down to the child process. That's why we call it copy-on-write: when you write to a particular memory location, the operating system copies it for us. And when that copy occurs, that's what's called a page fault.
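That copy-on-write behavior can be observed directly with `fork`. This is a small sketch, assuming a platform that supports fork (e.g. Linux or macOS): the child's write triggers the page copy at the OS level, and the parent's copy stays untouched.

```ruby
str = +"shared"   # unary + ensures the string literal is mutable
rd, wr = IO.pipe

pid = fork do
  rd.close
  # This write makes the OS copy the affected page for the child;
  # the parent never sees the mutation.
  str << " (modified in child)"
  wr.write(str)
  wr.close
  exit!(0)
end

wr.close
child_view = rd.read
Process.wait(pid)

p str         # => "shared"                      (parent copy untouched)
p child_view  # => "shared (modified in child)"  (child got its own page)
```

The same isolation is why a fleet of Unicorn workers can share the parent's memory until they start writing to it.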
So, when there's a copy-on-write page fault, the operating system copies some of the memory. It doesn't copy the entire block like my diagram showed; it copies the page where you wrote, in operating-system-page-size blocks. If we look at Ruby's memory layout, it looks a bit like this: Ruby will allocate a page, and then all of our Ruby objects are allocated inside that page. As we allocate Ruby objects, those objects fill out those pages, and if we need a new page, Ruby allocates one whole new page, and we start allocating objects inside of that. Now, when GC happens, the garbage collector will free up any dead objects, and we have these objects that get freed in here. If you look at this, you can think of it a little bit like Swiss cheese: it has holes in it. There are these little slots holding freed objects, and if an entire page is free, the garbage collector will actually remove that page. Now, what would be cool is if these two objects here could move to that other page — then we would have a free page here, and we could free that page as well. That would be really nice. Unfortunately, in MRI's garbage collector, objects don't move today. They don't move. So we end up with a heap that looks something like this, with a bunch of free slots. I like to call it Swiss cheese, because it's just got a bunch of holes in it. Now, this actually wastes memory when we fork. The parent is pointing at these particular pages; then we fork, and the child is also pointing at this same memory layout. Now the child allocates an object, it goes into one of those free slots, we get a page fault, and the operating system copies an operating-system-size page to the child process.
But that operating-system-size page is larger than a Ruby object, which means we're actually going to copy multiple things from the parent process to the child process. For example, on OS X the default operating system page size is 4K, a Ruby page — that block I was showing you — is 16K, and a Ruby object is about 40 bytes. So one Ruby page is about 400 objects, one Ruby page is four OS pages, and one OS page is about 100 Ruby objects. If there are any free slots inside of any of that 100-object area, then we're probably going to copy too many objects. Now, this only impacts Ruby code that forks. So what forks? Unicorn forks — we use Unicorn in production, and probably many of you do too; most MRI web servers fork. Now, this section of my presentation has no code, unfortunately; I've just been thinking about it, because it's fun to think about. If we look at how a page fills up, allocation moves something like this: we go left to right, allocating new objects, filling up that page. What would be interesting is if we could predict which objects were going to be old. What if we knew that these particular objects were probably going to be old, and these other ones we don't know about — maybe they'll be old, maybe they'll be new, we don't know. But there are some objects we might know are old. So if we know that, we could fill one side of the page with probably-old objects and the other side with possibly-new objects. That way, we would end up with a page where one side looks like Swiss cheese and the other side looks more like Gouda. We'd have a more solid page, and those probably-old objects won't get copied to the child process. Or maybe we could even have pages that are strictly dedicated to probably-old objects. So you might be thinking: what are probably-old objects? We can see these.
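As an aside, the page-size arithmetic above is easy to sanity-check. The 4K, 16K, and 40-byte figures below are the approximate numbers from the talk, not exact for every platform or Ruby version:

```ruby
os_page_size   = 4 * 1024   # default OS page size on OS X at the time: 4K
ruby_page_size = 16 * 1024  # one Ruby heap page: 16K
ruby_obj_size  = 40         # one Ruby object (RVALUE) was roughly 40 bytes

p ruby_page_size / ruby_obj_size  # => 409  ("about 400" objects per Ruby page)
p ruby_page_size / os_page_size   # => 4    (OS pages per Ruby page)
p os_page_size / ruby_obj_size    # => 102  ("about 100" objects per OS page)
```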
For example, when you create a new class, that class is probably not going to get garbage collected. In Ruby, our classes are objects, and that class probably isn't going anywhere. Neither is a module, neither is a constant. There are other things, too, that we can say are probably going to get old. So, trade-offs. What are the trade-offs for this? Unfortunately, I don't know yet, because I haven't implemented it — nobody's implemented it yet. I basically wait for Koichi to do it and then go to his presentation and learn that he did it. So maybe it's complexity for memory. How complex is it? I don't know. How much memory is it going to take? I don't know. We need more tools for introspection, I think. So I built a tool that I thought was really fun. I call it heapfrag. This is it. What it does is show you the fragmentation and layout of your memory. Is the video starting? Okay, yes. So, on the left side there, you see the layout of the memory. The black parts are free slots — those are where we can allocate objects. The red dots are old objects, and the green dots are new objects. You'll see, as I'm allocating objects in IRB, it's filling out that page. And as soon as it does a garbage collection, it's actually wiping those objects away. So I'm going to disable the garbage collector and allocate a bunch of stuff, and we can actually see the heap expand. It gets wider and wider. And then, when we garbage collect those, they all go away. Go, Aaron. Yay. So we GC a bunch of times. All right, I'm almost out of time, so I'm going to wrap it up. And I want to wrap it up with five lessons: What do I value? What should I measure? How can I measure it? And does it provide shareholder value? Five lessons. Let me repeat those: What do I value? What should I measure? How can I measure it? And does it provide shareholder value? All right, everybody.
Let's make Rails great again. Thank you so much.