I guess it's time. I'm 10 seconds early according to this clock, but let's do this anyway. I have a lot of slides, 251 according to Keynote. So this is the interactive portion of my presentation. Everyone unlock your phones, take out your phones, take a look. I'm just going to get this out of the way real quick here, okay, just two minutes of your time. All right, everyone, okay, okay, let's move on with our presentation. Sorry, this is for me, this isn't for you, this is for me. Okay, thank you, thank you for coming to my presentation about selfie sticks. There are two purposes to selfie sticks. The first important purpose is to help you identify people who like fun stuff and like to have a good time and a fun time. The other purpose of a selfie stick is to let you know who doesn't like to have fun or fun things. So I recommend that you all get a selfie stick. By the way, have any of you seen this photo? Yeah, that's me. I've been sending AirDrops to everybody here, and unfortunately some people decline me, which makes me sad. I like to do this wherever I go. I was actually at an airport once, and there was a group of older people traveling together, coming home from vacation, and they're like, let's share all of our photos, and they're like, how are we gonna do it? And I hear one guy say, let's just AirDrop them to each other, take out your phones and we'll send them, and I was like, yes. So I send them photos of my cat. One guy is like, what is this cat photo? If you're gonna do this though, make sure to name your phone something nondescript. Like, my phone is named iPhone. Refreshing coffee. Let me show you, I just wanna show you a few of the photos that I have received. This is, that is, that is cute. We got another one here, and then I got this. It's a good one. I got this one, and I love it. I love it when I hear people saying, who is doing all of this?
Like, I thought about that. I'm like, yes, success. I have done it. So today we're gonna talk about methods of memory management in MRI, so this is actually an MRI talk; many, many Ms. My name is Aaron Patterson, or Tenderlove. That is my nerd code at the bottom there, so if you want to send encrypted things to me, you can. I noticed that we are having a cheerleading competition here, which I think is really amazing, but I like to think that it's actually probably a mockumentary filming, and I really wanna go see. This is what I look like on the internet. I look different in person, so you might recognize that icon more than me in person. I'm from Seattle, which is not Ohio. But I wanted to share with you: my wife is from Japan, and we like to practice Ohio culture, which is essentially every morning we say Ohio to each other. Yes, sorry, I only have 30 minutes for this talk. I work for a company called GitHub. It is a small startup out of California, but I don't live in California; as I said, I live in Seattle. This company is the first legit company I've ever worked for. I love using Git, but I'm not gonna force push it on all of you. So, the room is slowly, slowly turning against me. I'm a GitHub certified engineer, which means I really enjoy bare metal. My name on there is Tenderlove, and it was weird starting at GitHub because everybody refers to everybody else by their nicks. So when I showed up at the office, and just to give you a little background, I've been working remotely for about the past six years, so I don't really interact with many co-workers IRL. So I go into the office and everybody is calling me Tenderlove, and I'm like, it's awkward. Please call me Aaron. Tenderlove is fine too, you can call me that. Actually, I'm gonna tell a story, I don't care. We're gonna have to move very quickly, but I will tell the story.
My parents are both engineers, so it's not weird to them that I sit at a computer and do my job all day. I tell them everything I do, everything about myself, except for this one little thing, this name. They didn't know this name. I was invited to speak at a conference in Salt Lake City, which is where I was born. My parents still live there. And I said, okay, I'll speak at this conference, but only if you give me two free tickets for my parents. And of course, the organizers absolutely would love to do that. So I show up at the conference with my parents, we meet the conference organizer, and I'm like, hey, here's my parents. The organizer says, okay, great, you're gonna be up real soon, but we've reserved three seats for you down at the front. So he takes us down to the front, and there's three seats at the front, and there's three signs. The first sign says Tenderlove. The next sign says Tendermom. The next sign says Tenderdad. And I'm just like, no. Not now, not now. So I had to tell him, look, this is just a name, people know me by it, be cool, don't worry about it. And if they ask you about it, just, you know, don't worry. He didn't know, and we've never talked about it since, so it's fine. Anyway, I love cats, cats are the best. This is one of my cats. This is Chuchu. She isn't as famous as Gorby. This is Gorby, Gorbachev Puff Puff Thunder Horse. He is hiding here. He thinks he's hiding, which I think is adorable. Chuchu likes to sit on my desk. She's sitting on my desk, and she makes this face that I really love. This is the face. It's the same face I make when I'm programming. It's just like, just nothing. Staring at a screen, just, ugh. So I have stickers of my cat. If you'd like a sticker of my cat, come say hello to me. I also have GitHub stickers. I ordered some from our online store, and it said, oh, order some sticker packs.
Order as many as you want. I see there's a drop-down, which is kind of funny, because the drop-down was the numbers one through 200, each one, and they're assorted stickers. So I assumed that if I ordered one, that would be like one pack of assorted stickers, right? Because why would you order one? If it's a random sticker, why would you order one and have them mail you one? It doesn't make sense. So I'm like, well, I'm not sure how many are in a pack, I'll order five, that seems fine. So I order five, and then five stickers show up. And unfortunately, I forgot to bring those five stickers. But one of my coworkers, who is actually speaking right now as well, brought some stickers and was nice enough to give me some, so I have some of those. I was recently in the news; I approved a pull request. I'm also very much into keyboards. This is one of my keyboards. I love mechanical keyboards, so if you wanna talk about that, come talk to me. The interesting thing about this keyboard is that it's backlit, but it's backlit with ultraviolet LEDs, so that I can get tan hands. Why? Because I don't go outside very much, so I figure I should do that. I wanna talk a little bit about some new Ruby features, especially the ones that Matz was talking about in his keynote. He was talking about typing, and I wanna talk a little bit about typing with Ruby. So this is soft typing. Let me show you soft typing. Then we have dynamic typing, which is a bit more like this: you're moving your hand around the keyboard a lot. And then there's one more, which is static typing. The way static typing works is that you don't actually move your hand; the keyboard comes up to your hand, like this. So I'm really excited for those new features in Ruby 3. So today we're gonna talk about GC, which is a garbage collector. Let's get serious. I have 20 minutes to present 200 slides. Let's do this.
We're gonna talk about, I'm gonna talk about some memory terms here. We're talking about memory, and there's two main places we talk about: the stack and the heap. Stack memory holds temporary variables for each function. So when you call another function, you store some of that memory on the stack; each function, as you go down, stores its memory on the stack, and as you pop back up it gets released. Heap is unmanaged memory where we just say, hey, let's go allocate some memory, and it's stored in one particular place; it doesn't matter what functions are being called, any function can access that memory. We're not gonna talk about stack memory because we don't really deal with that so much in Ruby. The virtual machine does, but in your Ruby code everything is heap allocated. So we're gonna talk about the heap. Inside the heap, there's really two types of heaps that I wanna talk about. There's one heap in terms of your machine, but inside that heap is another heap, which I would call the Ruby heap. The Ruby heap is where Ruby objects live, and those Ruby objects can actually point to other places inside the machine's memory, so, other places inside the machine's heap. So the Ruby heap, when we're talking about Ruby objects, is actually a subset of the actual heap that's being used in your Ruby program. So we're gonna talk about the GC in MRI, and you might be wondering to yourself, why should I learn about Ruby's GC? And I would tell you it's because you're at a Ruby conference that you should learn about Ruby's GC, but hopefully some of you have apps in production. Like, is that a thing?
So we have apps in production, and it's important because when you have apps in production you might encounter scaling issues or tuning issues, and it's at those particular moments that you need to learn about garbage collection, because maybe it's impacting the performance of your application. So what I want you to take away from this presentation: if you don't know much about GC, I just want you to learn the terms that we're gonna use throughout. I'm gonna bring up and point out various terms. Just learn those terms, take those away, and you will know a lot more about garbage collection. If you already know the GC terminology, then pay attention to the algorithms that we're gonna talk about. If you already know those algorithms, I'm also gonna share some new stuff that I've been working on in Ruby's garbage collector. And if you already know the new stuff that I'm gonna talk about in Ruby's garbage collector, please come give this talk for me, because I'm nervous and would rather watch. So let's talk about GC algorithms in MRI. I'm specifically gonna talk about the algorithms that MRI uses. There's two sides of the GC: the collection side and the allocation side. So actually a garbage collector isn't just responsible for reclaiming memory, it's responsible for allocations as well. We're gonna cover the collection algorithms first, which is kind of counterintuitive because you probably wanna allocate some memory before you collect that memory. But we're gonna cover the collection algorithms first because I think those are the most complicated, and they're not where I'm working right now; I wanna save that part for the end. Then we're gonna talk about the allocation algorithms that we use in Ruby's GC. Finally we'll cover some introspection APIs, so some actual Ruby code that you can go home and use, and then also tuning variables, environment variables that we use at GitHub for tuning our application.
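[Editor's note: as a small preview of the introspection APIs mentioned here, the main entry point in MRI is `GC.stat`, which returns a Hash of collector counters. The exact keys vary between Ruby versions, so treat the ones below as examples rather than a stable contract.]

```ruby
# GC.stat exposes the collector's counters as a plain Hash.
stats = GC.stat
puts stats[:count]        # how many GCs have run in this process so far

GC.start                  # force a full collection
puts GC.stat[:count]      # the counter goes up after the forced GC
```

This is the kind of thing you'd check first when debugging a production tuning issue: how often GC runs, and (on recent Rubies) how many live and free slots the heap holds.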
So, collection algorithm. Before we get into the collection algorithm, here's a picture of my cat just to relax you a little bit. Little hands. So what type of collector is MRI's collector? There are various types of garbage collectors; we're gonna talk about MRI's. MRI's collector is a mark and sweep collector, it's generational, and it's incremental, and we're gonna talk about what those things mean. First though, at a high level, what is a garbage collector? Well, I'm excited about this conference because many of the speakers have been talking about linked lists and trees and things like that, and we're gonna talk about exactly the same thing here. When you think about a GC, if you think about objects in your Ruby code, they actually form a tree. So for example, in this Ruby code on the left, we have an array that's got a hash, and inside the hash we have some things, and if you look at those object relationships, they actually form a tree data structure. We have a root, which is a magical node, it's just a magical thing at the root, don't worry about that, but the root references that A variable, which is our array, and that array references a hash, and the hash references a symbol and a string, so you can see how this forms a tree data structure. Now let's say we change that code a little bit: we say A is equal to nil, and we cut that line away from the root. Now we should be able to collect this entire tree that's below the root, so that goes away. The garbage collector's responsibility is to find all of those nodes that are no longer linked to the root and free those nodes up. At a basic level, that's its job. And when we're talking about GC algorithms, we're basically trying to figure out ways to speed up finding those nodes that we can eliminate. So an important word from this is root set. The root set is that special node at the top; those are variables that are referenced from the top level.
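[Editor's note: you can actually see the reference edges he's describing from Ruby itself, using the `objspace` extension that ships with MRI. This isn't from the talk, just a way to poke at the object tree on the slide.]

```ruby
# Inspect the object graph for the slide's example: an array holding
# a hash. reachable_objects_from returns the objects that an object
# directly references (plus internals such as its class).
require "objspace"

a = [{ foo: "bar" }]

refs = ObjectSpace.reachable_objects_from(a)
refs.any? { |o| o.is_a?(Hash) }   # => true: the array references the hash
```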
Garbage is objects that are no longer referenced from the root node; those are things that we're able to free. Live data are nodes that are actually referenced from the root node; that's data that we're actually using. So the GC's job is to find these unlinked nodes and then free them. How do we actually find these unlinked nodes? One process is mark and sweep. This is what Ruby's garbage collector does, and this strategy is very easy to implement. There are two distinct phases in mark and sweep: the mark phase and the sweep phase. The way the mark phase works is it looks like this. Let's say we have objects that form a tree that looks like this. The mark phase starts at the root and then follows these arrows. It'll follow those edges, mark the object, and continue following them recursively until we've marked everything. Once it's reached all of the objects it can reach and marked all the objects it can mark, we know that any objects that are unmarked can be freed. So we go into the free phase, or the sweep phase, and we sweep those unlinked nodes away. Those get swept, then we unmark everything so they all go back to blue, and we start this process over again. So this mark and sweep garbage collection is very easy, but unfortunately it's too slow for us, and one of the reasons it's too slow is that it actually has to stop the world. What stop the world means is that as your program is executing, all of a sudden it'll just stop, walk this entire tree of objects, collect all the ones that are unlinked, and then continue. You may have noticed this: if you're writing a program that outputs a bunch of stuff, you'll see at some point it'll just go, you're outputting, outputting, and then all of a sudden it just stops for a second and then continues on later, and that's what is happening here.
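[Editor's note: the two phases he just described can be sketched in a few lines of Ruby. The `Node` struct and the graph here are made up for illustration; a real collector works on C structs, not Ruby objects.]

```ruby
# Toy mark and sweep over a hypothetical object graph.
Node = Struct.new(:name, :children, :marked)

# Mark phase: start at the root and recursively follow the edges.
def mark(node)
  return if node.marked               # already visited
  node.marked = true
  node.children.each { |c| mark(c) }
end

# Sweep phase: anything left unmarked is garbage; unmark the rest
# so the next cycle starts fresh (everything "goes back to blue").
def sweep(all_nodes)
  live, garbage = all_nodes.partition(&:marked)
  live.each { |n| n.marked = false }
  garbage
end

# root -> a -> b, while c is unreachable.
b    = Node.new("b", [], false)
a    = Node.new("a", [b], false)
c    = Node.new("c", [], false)
root = Node.new("root", [a], false)

mark(root)
sweep([root, a, b, c]).map(&:name)   # => ["c"]
```

Note that both phases have to touch every object, which is exactly the stop-the-world cost described above.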
We're stopping the world so that we can actually deallocate those objects. Unfortunately it has to visit every single object, every time, so it's walking through this entire tree on every garbage collection phase, and this can be kind of slow. There are algorithms that help us deal with that, and the way that we can deal with that and walk fewer objects is through a generational algorithm, which I will explain now. The idea behind generational algorithms is that objects typically die young. So the idea is that most objects are going to die young, but if we took objects that were old and objects that were new and divided them into two places, maybe we could get some performance benefits out of this, and that's where generational collection comes in. We'll look at how this actually speeds up our GC. So this is an example of a generational algorithm. Let's say we have two generations, zero and one. We start at the root node and we do our typical mark and sweep, so we mark B and D. A and C don't get marked because they're not linked to from the root, so we free up A and C, and now that we've gone through this collection phase we say, okay, B and D, you're now promoted to generation one, so we move those into generation one. Now your program executes again and allocates some new objects, and those new objects get allocated in the young generation. Then the program stops again, we do a garbage collection phase, we do a mark and sweep, so we mark F; we don't go and mark B because we know that B is an older object, so we only mark F and G, and then that marks D. We do the same sweep phase and then promote those to the older generation.
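[Editor's note: a minimal sketch of the minor GC just described, with made-up `GenObj` nodes and an `old` flag standing in for the generation. Skipping old objects is the speedup, and it's also exactly why old-to-young references are a problem, which is where the write barrier he gets to next comes in.]

```ruby
# Toy generational (minor) GC: mark from the roots, but never descend
# into old objects, then promote the survivors.
GenObj = Struct.new(:name, :refs, :old, :marked)

def minor_gc(roots, all_objects)
  mark = lambda do |o|
    next if o.marked || o.old          # skipping old objects is the win
    o.marked = true
    o.refs.each { |r| mark.call(r) }
  end
  roots.each { |r| mark.call(r) }

  live, dead = all_objects.partition { |o| o.marked || o.old }
  live.each do |o|
    o.old = true if o.marked           # survivors get promoted
    o.marked = false                   # reset for the next cycle
  end
  dead
end

b = GenObj.new("B", [], true,  false)  # survived a previous GC: old
f = GenObj.new("F", [], false, false)  # young, reachable from a root
g = GenObj.new("G", [], false, false)  # young, garbage
minor_gc([f], [b, f, g]).map(&:name)   # => ["G"]
```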
So what we've saved here is we've saved looking at B. You'll notice from our previous mark and sweep algorithm that we had to visit every node every time, but with a generational collector we don't have to visit every node every time, so we can reduce the number of nodes that we have to visit. Oh, and then we go and unmark them. So the performance benefit here is that we didn't have to touch B; we don't have to touch older objects every single time we do a GC. But there's one slight problem with this. Now let's say we have a memory layout that looks like this: we have B and D in memory, and they're in the older generation, and somehow, the program runs, we don't know how, just somehow, a new object gets allocated, and it gets allocated into the newer generation, but it's referenced by an older object. Now, if you remember from our generational algorithm, we don't look at older objects, we don't look at the references that they point to, so that arrow is actually unused right there. Which means that when we go through our normal mark and sweep phase, E never gets marked, because we don't consider those old objects, which means that we'll free that E, and we've actually freed live data. We have an arrow pointing at that E, and we've now freed it. This is a problem; it means our program can get a segfault or something, because we've freed memory that the user is actually using. So what we have to do is put what is called a write barrier here. Whenever we write one of these arrows from an old object to a new object, we put that object into what is called a remembered set, so we remember that that object exists, and that way, when we do this GC mark and sweep phase, we not only look at roots, but we also look at objects that are inside this remembered set. We remember that, which means that we mark the E, the E moves on to the older generation, and everything is okay. So the important words from this are write barrier: all a write barrier is, is just a
piece of code that executes when an object is being written to. So when that arrow gets written, we execute some code, and the code that we happen to execute is something that adds an object to a remembered set. Remembered set: all it is, is just a place where we can store stuff to remember for later. I found this term to be very, I don't know, scary, but it's actually not; it's just a place where we put stuff to look at later. So generational collectors are faster. They're not so easy: we have to install this write barrier, and the algorithm is a little bit more complicated. Unfortunately, we still have to stop the world in order to do this collection, so we stop and then do a collection. This is where we can actually speed things up by introducing another algorithm, called incremental garbage collection. So I'm gonna introduce this as quickly as possible. Nine minutes, alright. The way that we do incremental marking is through what is called a tricolor algorithm. What we do is we have three different colors for objects: white, black, and gray. White objects are objects that'll be collected. Black objects are ones that have no references to white objects but are referenced from the root. And gray objects are referenced from the root, but we haven't considered anything that they reference. So the algorithm is essentially: pick an object from the gray set, move it to the black set; for each object that that object references, move those objects to the gray set; and then we just repeat steps one and two until the gray set is empty, and then we can free up any white objects that are left over. This might be difficult to visualize just from words, but if we look at a graph of this, you can see. Okay, we'll start out with memory that looks like this. We color the root black; anything it references is gray. Then we pick objects from the gray set, A and F, we look at their references, and color those references gray. Then we pick from those and color them black; they don't actually reference anything, so we're
done. Now we can sweep away any objects that are white, so everything is marked black, and we're finished with this algorithm. If you wanna visualize this, what you can imagine is essentially a wave moving through this graph, and I'm very proud of this slide, it took me a long time to do. Imagine a wave moving through the graph where the front of the wave is gray objects and the back of the wave is black objects. So what is the benefit of this algorithm? Like, what is the point? The important thing about this algorithm is that we can interrupt any of the steps. We can stop at a step; we don't have to perform all of these steps, we don't have to go through the entire tree each time we do a GC. We can perform each of these steps incrementally, which is why it's called an incremental garbage collector: we can do a little bit at a time and then let your program run again, and then do a little bit more and let your program run again. What this means practically is that halting time is reduced. We get more throughput, because we're able to say, hey, let's stop for just a little bit, let your program continue, stop for a little bit, let your program continue. So our halting time is reduced and your throughput is increased. But there is one slight problem with this. Let's say we have a graph that looks like this. We're in the middle of doing a GC, so we've started the mark phase, then your program is allowed to run again, something happens, and we continue on with the GC, so A and F become black; B, C, and D are gray. Then we let your program run again, and somehow, somehow, your program allocates a new object, G, so F points at G. But we remember from this algorithm that we don't consider any references that black objects have. Black objects are done; we don't care about those anymore. So we have another problem here, where when we finish the mark phase, this G object isn't marked, so we might free it, and we have exactly the same
problem we were seeing before, where we have live data that we've accidentally freed. So how do we deal with this? We install a write barrier. Any objects that are colored black, we can install a write barrier there and put G into another remembered set. So important words from this algorithm: incremental, we want to increase throughput and decrease halt times; again, we're using a write barrier in order to deal with these objects that are colored black; and another remembered set, so we're just putting those objects into a remembered set. The benefits of this garbage collection algorithm are that we are trying to minimize tracing. Tracing is following those arrows; we're trying to minimize the number of arrows that we look at each time we do a GC, and we're trying to decrease halting. So I want to talk a little bit about things that MRI's GC is not, things that our garbage collector does not do. Our garbage collector is not parallel, so it does not run in parallel with your program. It does run concurrently, so our incremental steps are running as your program is running, but it is not done in parallel. It is also not real time. Real time garbage collectors try to guarantee a certain amount of time that your GC will run in. So if you have a function and you say, I need this function, it is mission critical that this function execute in, I don't know, 50 milliseconds or whatever, there are GCs out there that will allow you to make that guarantee. Our GC is not one of those, but you can find GCs that will do that. Our GC is not compacting, which means that it will not move objects around in memory, and we're going to talk a little bit about that in the four minutes that I have remaining. Four minutes and 100 slides. So let's talk about allocation algorithms. Cats, that's me with the cat. I have 19 minutes? Oh my god, thank you, I thought this was a 30 minute slot. Okay, let's have a drink of refreshing coffee. Ah, thank you Brandon. Okay, 19 minutes, I got 19 minutes left. Okay. Alright.
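[Editor's note: the tricolor scheme from the last few paragraphs can be sketched like this. The `TriObj` nodes are made up; the point is that the gray set is a work list, and each pass through the loop is one "increment" where a real collector could pause and let the program run.]

```ruby
# Sketch of tricolor marking.
TriObj = Struct.new(:name, :refs)

def tricolor_garbage(roots, all_objects)
  white = all_objects - roots   # candidates for collection
  gray  = roots.dup             # seen, references not yet scanned
  black = []                    # seen, references fully scanned

  until gray.empty?
    obj = gray.shift            # step 1: pick a gray object, blacken it
    black << obj
    obj.refs.each do |r|        # step 2: its white references turn gray
      next unless white.include?(r)
      white.delete(r)
      gray << r
    end
  end

  white   # anything still white was never reached: garbage
end

c = TriObj.new("C", [])
a = TriObj.new("A", [c])
f = TriObj.new("F", [])
g = TriObj.new("G", [])   # allocated but unreachable
tricolor_garbage([a, f], [a, c, f, g]).map(&:name)   # => ["G"]
```

In this sketch the whole loop runs at once; the incremental collector interleaves those loop iterations with your program, which is why a new object hung off a black node needs the write barrier described above.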
Heap layout. So we're going to talk about Ruby's heap layout; we're actually going to look at how objects are allocated, the allocation algorithms. We've looked at the collection algorithms that MRI uses; we're now going to look at the allocation algorithms that it uses. Now, when Ruby allocates an object, it doesn't do a malloc every single time it allocates a new object. We actually allocate a chunk of memory and then divvy that up into Ruby objects, and the reason that we do that is because malloc isn't free. Slow clap, yeah. I'm sorry, I shouldn't drop puns in the middle, I should really just put those at the beginning and continue on. Anyway, calling malloc isn't a free thing for us to do; it actually costs us CPU time. So what we try to do is say, okay, well, let's allocate a large chunk of memory, and inside that chunk of memory we'll divvy it up into individual Ruby objects. So we allocate a large chunk, a page, or a slab. I like to call them slabs; the reason I like slab better I will cover later, but they're called pages. Page memory is contiguous, it's just a contiguous chunk of memory, just some place on the heap, and we say, okay, inside this chunk of memory we actually hold a linked list. So there is a linked list inside that chunk of memory, and nodes in that linked list are called slots; each slot is a Ruby object. So imagine this page is in memory. Inside this page we will store many Ruby objects, and those Ruby objects are stored as a linked list, and this list is what we call a free list. In order to allocate a new object, all we have to do is find the first open slot, which is just the next pointer in that free list. We just say, okay, here's your new Ruby object, every time we do an allocation, and then we just bump that pointer forward in the linked list, so we move that pointer to the next
space in the free list. This is called bump pointer allocation, because we're just taking a pointer and bumping it forward; we call it bump pointer allocation just because we're bumping that pointer forward. In the case that we have a full page, for example, let's say you allocated a bunch of objects and it fills up that page, then when you try to allocate a new object, Ruby's GC says, oh, we've run out of space, we've run out of pages, let's just allocate some new pages. So it allocates a few new pages, and then you have more space to allocate Ruby objects. Eden pages are the pages that are searched for free slots; the Eden is what we call the pages where we're gonna actually allocate new objects. And yes, these terms are very, I don't know, they're interesting, I suppose. Objects are born and die, and I guess the metaphor works. Anyway, let's say we do a GC, and we have a page that looks like this. When a GC happens, it'll actually pull an object out of that page, and let's say all the objects in that page are freed, let's say we lose all of them. Then the page will actually be destroyed, so it'll get reclaimed. That page is actually put into a tomb, and later we free up the pages in the tomb. So important words: a slot is just a place for a Ruby object; a page is a place where we allocate a bunch of Ruby objects, so we have a bunch of contiguous Ruby objects together; Eden is the place where we go look for somewhere to allocate a Ruby object; and a tomb is where things go to die. So I'm going to get a little bit distracted here, because I have 15 minutes, and I want to look at some interesting allocation hacks. It's not necessarily related to allocation or freeing, I just love this technique and I think people should know about it. You may or may not know that not every object in Ruby requires an allocation. Integers and floats, those things don't actually allocate any objects, and I'm going to talk about how that's
accomplished. To talk about that a little bit, I'm going to have to talk about the sizes of the pages. One page in Ruby is about 16K, so every time we allocate a new page, it's 16K. One object in Ruby is 40 bytes. Now, pages in Ruby are what are called aligned, and what that means is that when we allocate some memory, we say, hey, give me back a chunk of memory, but I want the address of that memory to be divisible by some particular number. So we don't do a malloc, we do an aligned malloc. We say, hey, operating system, give me some memory, and the address of that memory, I need it to be divisible by some magic number, by some multiple. And in our case we're going to choose a multiple of 40, so 40 is our multiple. Now let's take that number 40; 40 happens to be the size of a Ruby object. If we take that number 40 and say, okay, well, let's start out with 40 and then multiply it by one, two, three, four, et cetera, we know where each of those addresses are. If we do this in Ruby and then print out those address numbers in binary, you'll see something that looks like this. I've done it here in Ruby, and it's easier to see the pattern if we just take these binary numbers and combine them together. If you stack all those binary numbers together and look at them, you'll see a pattern emerge, and that pattern is that the last three digits are always zero. So the last three binary digits of any Ruby object's address are going to be zero. What that means is, if you're looking at an address in memory, you can know whether or not that address is actually a Ruby object: a Ruby object's address must end with three zeros. So what if we said, okay, well, let's just flip one of those bits to one? Then we know, oh, that address isn't actually a Ruby object, it's just something else. We can apply meaning to these binary numbers. We use these three bits to add meaning to those addresses, which means that we can
actually represent integers without doing allocations. The way that we would represent an integer without doing any allocation is, let's say we have a flag: one is gonna be our flag. Now, for example, we have the number two, and in binary, two is one zero. So what we'll do is we'll take that number two, shift it over one bit, and then apply our integer flag to that number. Now, any time we encounter a one zero one in Ruby code, we know that that's not a pointer to an actual heap object; we know that it's an integer. So what we can do is just decode it: we'll take that binary number, which is five, shift it over one, and we have our two back. We can actually see this in action in Ruby 2.3; you won't be able to see this in Ruby 2.4, but you can see it in 2.3. We can calculate what our biggest Fixnum value is. If we say two to the 64 minus one and print that out in binary, you can see it's 64 ones, but unfortunately, that number is actually a Bignum. If we go to two to the 63 minus one, it's still a Bignum. If we go to two to the 62 minus one, that's a Fixnum, but if we add one to that, it becomes a Bignum. And why two to the 62? The reason is we have one bit for our sign, and, as we just said, we have to use one bit for our encoding. My machine is 64 bits, so we take away one bit for the sign and another bit for the encoding, and now we're down to 62 bits. So we can calculate the size of the largest Fixnum: the biggest integer before we actually do any allocation is going to be two to the 62 minus one. Now, unfortunately, you can't see this anymore. You were seeing Bignum, Bignum, Fixnum, Bignum; you actually can't see those classes in Ruby 2.4, because we've made it better. You shouldn't know about those things; you should just use your Ruby and be happy, which is what this is doing. So if you look at the class output from these, you'll just see it's always Integer, and you have no
idea. But you can still tell, and since this is being recorded, I'm going to tell you a secret. Don't tell anybody else, and hopefully we can just censor this out of the video. Fixnums are singletons, and you can tell they're singleton objects by looking at their object IDs. If you print out the object ID of 2**62 - 1, you'll see it doesn't change. But as soon as we do 2**62, the object ID actually changes, because we're doing allocations. And that object ID (this is the part that needs to be bleeped out of the video) is actually based on the location in memory where the object lives. This is a specific implementation detail, but even though everything is an Integer in 2.4, if you look at the object IDs you'll be able to tell which values are getting allocations and which aren't.

This technique is called a tagged pointer: we take a pointer, and because we know a particular bit pattern, we can add meaning to those bits; we tag the pointers. In Ruby, some of the tagged objects are Fixnums (which are now just Integers), floats, true, false, nil, and some symbols. So not everything is tagged, and I think there might be more, but I'm not sure. All of this is clearly documented in some C comments, so go look at the source code.

So let's talk a little bit about allocation problems. Unfortunately, with these Ruby objects we have an issue with poor reclamation. Let's say we have three pages that are full of objects, and when we run a GC, some of those objects get freed. It would be nice if we could take the surviving objects, move them around, and rearrange them so that one page becomes completely free; then we could actually free that page. Unfortunately, objects in MRI do not move, so we cannot do that. It means we have holes in our pages that look like this. We could have freed that space.
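The alignment pattern and the tagging scheme just described can be sketched in plain Ruby. This is an illustration, not MRI's actual code (MRI does all of this in C), and `tag_fixnum`/`untag_fixnum` are made-up names for the demo:

```ruby
# Multiples of 40 (one Ruby object) printed in binary: the last three
# bits are always zero, because 40 is divisible by 8.
(1..8).each do |i|
  puts (40 * i).to_s(2).rjust(12, "0")
end

# Those spare low bits let MRI tag small values instead of allocating.
def tag_fixnum(n)
  (n << 1) | 1   # shift left one bit, set the low bit as the integer flag
end

def untag_fixnum(tagged)
  tagged >> 1    # shift right one bit to recover the original value
end

puts tag_fixnum(2).to_s(2)       # "101": 2 (0b10) shifted, with the flag bit
puts untag_fixnum(tag_fixnum(2)) # 2
```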
But we can't, because objects won't move around. So we've got this potential space that could be freed, and we can't free it. Unfortunately, this causes copy-on-write problems, which we've run into in production.

This is where we get to why I don't like calling Ruby pages "pages": one Ruby memory page is not one OS memory page, and this is important when we're dealing with copy-on-write. One Ruby page is 16KB; one OS page is 4KB; so one Ruby page is about four OS pages. Any time copy-on-write gets violated, that is, whenever a write forces an actual copy into a child process, the OS copies an entire page. It doesn't copy just the bytes that got written; it copies one page. A Ruby page holds around 400 objects, so every time one OS page gets copied, we're copying about 100 Ruby objects. Say we have a parent process and a child process pointing at the same page, and the child writes something to it; now the OS has to copy that entire page. We wrote 40 bytes, but we got four kilobytes copied.

I've been thinking about different ways of dealing with this, and one is to group old objects together. Let's say we had a special place where we could say, "I have an idea: these objects are probably going to be old, so why don't we allocate all of them into a specific location and group them together?" So we have two page types: a "probably old" page and a "no idea" page. If we encounter something that will probably get old, we allocate it into the probably-old page; anything we don't know about goes into a regular page. So what is going to be old? We can use heuristics to figure out what might be old. For example, we know the Foo class is probably
not going to get garbage collected, because it's a class. Same with the Bar class; maybe that constant won't get collected. Frozen strings, things like that: we can look at the source and, through heuristics, figure out what may or may not be old. We could also statistically determine which objects are going to be old. For example, if you know that 80% of the time a Foo object you allocate becomes old, then just allocate Foo objects into the probably-old space; we can keep statistics for that. This helps us efficiently reduce space, reduce our GC time, and it's also copy-on-write friendly.

I've implemented some of this at work. GitHub pays me to do stuff, so I'm doing GC work, and I've implemented the technique that says, basically, "We know that classes and modules will become old, so let's allocate those into their own space."

Let's look at a cat photo. That's my cat; her tongue is very long. My laser pointer doesn't work, but that is an extremely long tongue; I think she's licking her eyeballs. You can see it again there. Her tongue doesn't fit in her mouth, so it's always hanging out; I'm afraid it's going to dry out. Oh, I think I'm down to five minutes and 40 slides left. You should have let me stress out more.

So, okay: let's say we take these classes and modules and allocate them into their own particular location. Here's our key: red dots are objects, green dots are classes or modules, and white dots are free space. If we take a Ruby heap and print it out with this key, we get something that looks like this. Each of those vertical lines is a page. This is default MRI with no patch applied, and you may or may not be able to see the little green dots sprinkled throughout; those green dots are classes or
modules. Now, with my patch applied, you'll see all of them get grouped together: there's a green line in there, and those are all of the classes and modules grouped together on one particular page. If we compare those two, which I have done very scientifically here, the one on the bottom has my patch applied and the one on the top is trunk with no patches. You'll see that mine is actually wider, which means there are more pages allocated in mine than in the unpatched version. That's bad: my patched version is actually using more memory.

But if we take a Rails application, this is what it looks like. You can see a bunch of green dots in there; this is an unpatched Ruby. Now compare that to the patched version: you'll see a bunch of green lines, and those are pages dedicated entirely to classes and modules. If we compare the two together, the patched version is actually smaller than the unpatched version: around a 17% smaller heap with the patch applied.

I want to show you one other great way to reduce the heap size of your application. Let's see, I've got a video of it here. Basically, what you do is take your mouse, click over there, and if you drag it, it's very small. Look at that. Amazing.

Anyway, you can check out these patches on our fork of Ruby at GitHub; we open source all of our work on this, so go there and you can find it. As for future work, the next thing I'd like to work on is actually moving objects. It would be really cool if we could, remember that graph, move objects together, because we'd like to get rid of that
page, but we can't do that. What I would really like to do is apply a patch that makes that possible, and I think that's something we can actually accomplish. It would also be very much in line with the stuff that Koichi is working on with guilds. Through stack scanning, forwarding pointers, and other barriers, I think it's possible to do this.

All right, let's wrap this up very quickly with some GC introspection. You can get GC information with GC.stat, and I know Nate was talking all about this, so you don't need to read any of this, for two reasons: one, because he talked about it; and two, now that you've watched this entire presentation, you know how these algorithms work and you know the words behind them, so when you look at these keys, the key names will just make sense to you.

You can also check GC performance with GC::Profiler. This is available in Ruby: you just call GC::Profiler.enable, and that enables it; you can then get some statistics, and this prints out a report.

You can also do heap introspection, which is one thing we do a lot. You can use ObjectSpace.dump_all, and that gives you a JSON representation of your entire heap, so you can see all the objects stored in memory. Unfortunately, this method does not show you the blank spots you saw in my charts earlier: dump_all only dumps live objects, so you never get to see the free slots. You'll have to go look at our patches on GitHub to see how we're printing those out; I actually had to patch it to print the blank spots. What's cool, too, is that you can call ObjectSpace.dump and get information about one particular object: just give it an object and you'll see information about it, like
the size it takes in memory (40 bytes) and the flags set on it. What else is cool is that we can see how the GC affects objects using this ObjectSpace.dump API. Say we allocate a new object: we can see it's write-barrier protected. And if we allocate x at the top there, run a GC, and look at the output, you'll see that on the third GC it becomes old. What this means is that for an object to become old, it has to survive three garbage collection cycles. It's considered young until we've GCed three times; once it's survived three generations, we say, okay, it's now an old object.

So go check out all the methods on ObjectSpace, including trace_object_allocations, which is another one I really enjoy; read the documentation on that one. It helps you find where objects are being allocated.

Here are some GC tuning variables we use at work. We tweak the environment variable that controls the number of free slots after a garbage collection; we tweak heap init slots, the number of slots available when your Ruby program starts; and we also tweak how quickly the heap grows. When you're tuning your Rails application, for example, the way we tune it is so that while we're serving a request we don't actually allocate new pages, because page allocation is expensive. We want to make sure the heap is large enough to deal with any request that comes in.

And I think I'm actually out of time, so thank you very much.
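The introspection calls mentioned in this talk are easy to try yourself. A minimal sketch, runnable on MRI; the exact fields in the dump output are version-specific implementation details, so treat the comments as a guide rather than a guarantee:

```ruby
require "objspace"
require "json"

# GC.stat: counters whose names map onto the algorithms discussed above.
p GC.stat[:count]            # total number of GCs run so far

# GC::Profiler: enable it, force a collection, and print the report.
GC::Profiler.enable
GC.start
puts GC::Profiler.result
GC::Profiler.disable

# ObjectSpace.dump: JSON for one particular object, including its GC flags.
x = Object.new
3.times { GC.start }         # objects are promoted after surviving three GCs
info = JSON.parse(ObjectSpace.dump(x))
p info["type"]               # "OBJECT"
p info["old"]                # promotion state; the talk demos this
                             # flipping to true after the third GC
```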