 As Megan just said, my name is Colin Fulton, all my contact info is up there. And we have a lot to go over today, so let's just jump right in. You are a developer at a Ruby conference, and you decided to take a very, very brave approach to go to the conference. You decided to go to a talk whose description mostly said it was going to be about talking raccoons. Now, this is an interesting choice. Good for you for deciding to do something a little bit different. And now I have a choice for you. Would you like for me to explain what the format of this talk is going to be, how we're going to go through things today and give you an idea about how we're going to progress and learn about Ruby's garbage collector? Or do you just want to figure it out as we go? We can just jump right in and you'll just figure it out as we go. How many of you would like to hear an explanation of what this talk is if we get a, you know, people raise their hands? And how many people just want to jump right in and figure out how it goes? Cool. Luckily, you just made your first choice and that's the main thing I was going to explain. Alrighty. So, we'll skip past that. All right. So, once upon a time, you're working at your job, you're just working at your desk and you encounter this problem on the server where, you know, processes were shutting down, you're getting all these out of memory things and you've never had to deal with memory before. You're a Ruby programmer. You heard that there's this thing called the garbage collector and it just takes care of memory for you. And then you go on, you look at some documentation, some blog posts and they have all these weird terms like heap and malloc and you're not quite sure what's going on. And so, you decide to take a little break, go out to lunch. So you exit the office, you turn the corner and right before you go off to your favorite coffee shop, you notice that right behind your office in the back alley, you see a little motion. 
You turn around and there you see a raccoon. Now raccoons are common in your city, but this raccoon is wearing a top hat and an embroidered vest, which you don't see very often, and so it catches your attention. You and the raccoon lock eyes and the raccoon says, oh my goodness, no one is actually supposed to notice me, and then it runs into a little tiny doorway that you've never noticed in the back of your office. Now, you are kind of hungry, but at the same time, it isn't every day that you see a talking raccoon. So you decide to chase after it. You crawl in through the little doorway in the back of your office and you go through this weird little tunnel and eventually you come out into this giant room with raccoons running about everywhere. It's filled with tables and all those tables are covered with pastries. And there, brushing himself off, you see that raccoon, and the raccoon looks at you with a smile and says, I'm so glad you followed me, and you say, what? He says, you're a developer who doesn't understand how Ruby's garbage collector works, and you say, how did you know that? He says, well, you understood me, and whenever a human understands an animal that's talking, that means that human needs to go on some kind of allegorical adventure where us animals explain to you how that thing works. You say, seems like you skipped a lot of exposition there. He says, well, I did, because we have a lot to go over and you guys probably want to go to lunch. Now, my name is Patter. I'm a raccoon and I'm going to be your guide to Ruby's garbage collector, sometimes called the GC. You can call me the MC of the GC. Now, a couple of things I want to go through first. We're going to have to simplify things a lot. We're not going to dumb things down. Everything that I'm going to tell you is real, but it turns out that Ruby has grown a lot over time and it's been improved a lot and so there are lots of little tiny edge cases and things. 
So don't worry about the fact that we're simplifying things. I just want to give you an overview of how things work so you can better understand stuff and then you can dig in later. Now, what do you see before you right now? And you say, well, I see all these tables with raccoons right now and there's just pastries everywhere. So he says, OK, these pastries you're seeing are probably because you're hungry. Those are Ruby objects. We're inside the memory of a computer running Ruby. Each of those tables is an area of memory that's been reserved for us and all those pastries are objects that we see. You say, hold on. Now, I'm a smart developer. I want to see how it really works. I don't want all these fun, cute abstractions. I want to see how it really works. And Patter says, OK, and he snaps his fingers and all of a sudden you feel yourself shrinking down until you're smaller than an atom. And you just feel yourself surrounded by silicon atoms with all these electrical fields moving around you, virtual photons being exchanged. You say, hold on, what's going on? Where are all the ones and zeros? And you hear Patter's voice up above saying, well, this is how a computer really works. It doesn't deal with ones and zeros and logic gates and all that stuff. Computers are really physical devices with electrons and charges moving around. They deal a lot with quantum mechanics. You say, well, that's probably not going to be very helpful for me. He says, exactly. When you talk about ones and zeros in a computer, when you talk about C programs, all those things are just abstractions to help us understand stuff. So when you're looking at pastries and things, don't think of that as dumbing it down. That's just an abstraction to help you understand things better, because no one can understand how all this stuff works. And Patter snaps his fingers and you're brought back into the room full of delicious pastries and raccoons. 
Now what I'm going to explain to you today is enough to kind of appreciate what the garbage collector does, what its job is. We're not going to go too much into details, but if you want to go into details, we can dig in a little bit. But the reason why I'm going to do this is so that it empowers you to learn more. You can go to another conference talk that talks in more detail about Ruby's garbage collector and have a better understanding of what they're talking about. Or if you encounter a bug and you need to read the documentation or a blog post, you'll understand the concepts and ideas, and you'll be empowered to learn more on your own, because that's really what you need as a computer programmer. You don't need to know everything, you just need to know how to learn more. And now we come to our first choice. If you don't know what a garbage collector is, or maybe you've heard of it, but you're not really sure exactly what its job is, we can very briefly talk about what a garbage collector is. If you want to learn about how computer memory works and what goes into actually making a garbage collector, maybe you already know what it does, but you want to know how it does that, we can talk a little bit about that. So how many people want just a basic, very brief overview of what a garbage collector does? Raise your hands. And don't worry, more than happy to talk about it. So only a couple people. Don't worry, you can come see me afterwards if you want to see any part of this talk that I don't go over. And how many people want to learn about how computer memory itself works? All right, so most people want to go there. And so you turn to Patter and say, OK, I want to kind of understand how you raccoons, like, what is this place? How does this memory work? I'm not quite understanding what pastries and tables have to do with computer memory. 
And so Patter points to the room and says, OK, well, each of these tables that you see before you, that's a region in memory that's been reserved for a program. Each of those pastries is just a bit of data that's been stored in that memory. And so there are two things in memory. There's an area of memory, and then you can put objects or data in it. Now you may notice that every single spot at this table has a number on it. And that number is what's often called an address. And we can have pointers to those addresses. All that an address is, it's a unique ID to locate something in memory. So if you don't have the actual data for something, you can hand a function an address to say, hey, go out to this address and get whatever data is there. When we take one of these index cards and write down one of these numbers to reference something, we call that a pointer. So you can think of pointers like indexes into an array. If your computer memory is just one giant array, which is one way to think of it, a pointer is just an index into some particular part of that array. So it says, if you go to this index, there will be some data there. Now you can put any data underneath that pointer. That pointer has no idea what's underneath it. It just gives you a location in memory in the same way an index just gives you a location in the array with no hint about what's actually there. Now in the C programming language, there are two basic ways that data gets added to memory. One is with local variables. So this will be something that you're just going to use in the context of a function. If you have a C function, you just quickly want to print out hello world. Maybe your entire function just prints out hello world to standard out. You'll create a string that says hello world, and then the C compiler will figure out how to store that in memory. And then as soon as your function ends, it will clean that up and remove it from memory. 
Because local variables are only used for the duration of a function call. You can think of this like a pastry that's just like a quick little snack, maybe a crepe. It doesn't need to last very long. And so you just eat it, and then as soon as you're done, you just quickly throw it away. You're not going to keep it around for a very long time because crepes don't keep very well. Now let's say you have some memory that you want to keep around for longer, something you want to last for more than just the call of your function. For example, Ruby objects. We want them to potentially last for the duration of our entire program. When we want something in C to last for a longer period of time, we generally call a function called malloc, which stands for memory allocate. It takes an argument, which is the size of the amount of memory that we want. When you call malloc in C, the C program will go out to the operating system and it'll say, hey, I need a stretch of memory that is this big. You don't say where it is going to be in memory. You just say I want a stretch of memory that's this big. Then malloc will return a pointer to the start of that bit of memory. Now note it only returns a pointer to the start. You have to remember and keep track of how big the area of memory associated with that pointer is. This is one reason why C programming is so difficult. You have to keep track of where things are in memory and how much memory they take up. So you can think of this as, when you call malloc, you're asking for a table of a certain length, and you can store more and more things the longer that table is. And so then the operating system will bring out a table and you can place objects there, and it will tell you where the table starts, but you only know where the table starts and how long it is, and so you have to calculate out where the end is. 
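The "memory is one giant array, a pointer is an index" picture can be sketched in a few lines of Ruby. This is only a toy model under stated assumptions: `memory`, `next_free` and `toy_malloc` are hypothetical names made up for this sketch, and real C and real operating systems behave quite differently (in C, touching memory past the end of what you asked for is undefined behavior rather than a tidy overlap).

```ruby
# Toy model: memory is one long array, and a "pointer" is just an index into it.
# memory, next_free and toy_malloc are hypothetical names for this sketch only.
memory = Array.new(16)   # 16 numbered spots on one long table
next_free = 0            # where the next reserved stretch will start

# Reserve `size` spots and return only the index of the first one,
# the way malloc returns only a pointer to the start.
toy_malloc = lambda do |size|
  raise "out of memory" if next_free + size > memory.size
  start = next_free
  next_free += size
  start
end

table_a = toy_malloc.call(3)  # the caller has to remember the size (3) itself
table_b = toy_malloc.call(2)

# Writing inside the stretch we asked for is fine.
memory[table_a + 0] = "crepe"
memory[table_a + 2] = "cake"

# The pointer says nothing about what is stored there or how long the stretch is.
# In this toy model, one past the end of table_a is the first spot of table_b.
past_the_end = table_a + 3

puts "table_a starts at #{table_a}, table_b starts at #{table_b}"
puts "one past the end of table_a is address #{past_the_end}"
```

Nothing in `table_a` itself records the length 3, which is the bookkeeping burden described above.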
If you try and place a pastry past the end of the table, it's gonna fall on the ground, the pastry's gonna get destroyed, and your program is probably gonna end. So that's how malloc works. So now we've allocated memory, but how do we free that memory up so that other things can use it? If we only allocated memory over and over and over again, creating more data, you'd eventually run out of memory in your machine. So there's also a function in C called free. You give it a pointer to the start of memory that you've malloced, and then that tells the operating system I am done with this memory, and you can go ahead and take that table and remove it, and then that memory can get used by some other program. And so in C you need to keep track of what memory you're using, and then when you're done with it, you have to make sure to free it so that other programs are free to use that memory. Now it's worth noting that Ruby's garbage collector doesn't call free on garbage collected objects. Ruby will reserve a long stretch of memory, a big long table which it calls a page, which is confusing because an operating system also has a thing called a page. But this page is just a stretch of memory where Ruby will store some of the objects. When it needs more space, it will allocate another page. It will malloc another table inside of memory where it can store more pastries. Now when the garbage collector frees objects, it doesn't actually remove the whole table because there may be other objects that are still active on there. And Ruby knows that if you had, you know, let's say two megabytes of pastries stored in memory, you're probably at some point gonna make a similar amount. And so it wants to keep that because malloc and free take time to execute. So Ruby will malloc memory, reserve it for use, and then it'll just put objects there. When the garbage collector decides to delete them, it keeps that memory and will just overwrite that spot with a new object every single time. 
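If you want to peek at these pages and slots yourself, CRuby exposes some of its bookkeeping through `GC.stat`. The following is a rough sketch, assuming CRuby; the exact numbers depend on the Ruby version and platform, so none are shown, and CRuby may in some cases hand empty pages back, so the page count is not guaranteed to stay put.

```ruby
# A rough look at CRuby's own bookkeeping through GC.stat.
# Exact numbers vary by Ruby version and platform, so none are shown here.
before = GC.stat

# Keep 200,000 small objects alive so Ruby has to find slots for all of them.
pastries = Array.new(200_000) { Object.new }
full = GC.stat

# Drop the only reference and ask for a collection.
pastries = nil
GC.start
after = GC.stat

[["before", before], ["full", full], ["after", after]].each do |label, stat|
  puts format("%-6s pages=%d live_slots=%d free_slots=%d",
              label,
              stat[:heap_allocated_pages],
              stat[:heap_live_slots],
              stat[:heap_free_slots])
end
```

On a typical run you would expect the live slot count to drop after the collection while many of the pages stay reserved, showing up as free slots instead.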
So this way Ruby programs tend to only grow in the amount of memory they use. They'll allocate as much memory as they need to run the program, and then they'll keep that memory so that the next time you try and create an object, it will just try and fill up whatever space it has and clean up that space as it goes. All right, oops. All right, so now we have a couple choices for where we could go next. You could learn how everything I just told you is actually a lie. Computer memory is way more complicated than that. And we're not gonna really go into the technical details, but I find it really fun to find out that computers just barely work. I mean, they work really, really well, but if you see how complicated computer memory is, you'll be surprised that you're even able to turn the computer on. Or we can skip past that and go right ahead to learn how Ruby figures out what to delete, and don't worry, with these earlier choices, you'll still have an opportunity to get to these later choices. So if you wanna learn about RAM, we'll still have a chance to learn about how Ruby figures out what to delete, but if you want, we can just skip right ahead to that if you don't wanna figure out how computer memory actually works. And so those are gonna be your two choices. So how many of you wanna learn about how everything I just told you is a lie? Raise your hands. And how many people wanna skip ahead to learn how Ruby figures out what to delete? Okay, so don't worry, we'll still get to that. So you turn to Patter and you say, okay, so this seems pretty simple. You just malloc regions of memory, and then you free them, and Patter pulls you down and says, okay, let me let you know a little secret. It turns out Ruby memory isn't actually just a bunch of raccoons running around putting pastries on tables. It's actually way more complicated than that. 
So let me just give you a little glimpse into how computers really work, because if everything works the way I just told you, whenever you ask for a memory, the operating system would go out and get it. Whenever you freed it, it would go delete it. If everything actually worked that way, your computer would actually run much slower. And so a lot of little tricks have been added over time to make it faster. You can think of computer memory kind of like a giant game of telephone. So when you create a new object in Ruby, your Ruby program goes out and it tells Ruby itself, hey, I just created this object, put it in memory. If there's enough space in one of those tables that Ruby has already reserved, it'll put it there. If there isn't enough space, Ruby has to go out and talk to malloc and free to ask them for more memory. It has to go malloc more memory. Now malloc, as I told you, is a function. There are different implementations of malloc. Sometimes when you ask for a region of memory, depending on the implementation you're using, it may ask for even more memory from the operating system, and then just give you a small slice of that. Because again, it takes time to malloc memory. If it mallocs more than you asked for, then it means next time you ask for memory, it can give you more. And those two tables will be right next to each other because they're actually a part of a much longer stretch of memory. And there are certain performance reasons that might be nice. Similarly, when you free memory, it may not actually go ahead and return it to the operating system. Depending on the implementation of malloc you have, it may actually hold on to that table and say, yeah, I'll go ahead and delete it. And it just keeps that table there. And then next time when you ask for a bit of memory, it says, oh, hey, I have this bit of memory right here. I just asked for it from the operating system and totally didn't keep it around from last time. 
Here, you can go ahead and use it again. So this helps it run faster. But malloc and free are asking for memory from the operating system. And operating systems are really big, complicated beasts, and they're doing all of the same tricks. They're dealing with memory in larger chunks rather than these little tiny bits that we ask for so that they can run more efficiently. When you ask for memory, they may secretly give you more memory than you asked for, or reserve more memory than you asked for, to be more efficient. When you free stuff, they may not necessarily delete it. Also, in the case of an operating system, it's important to keep in mind that the operating system is controlling your computer, but it's talking to the CPU. And the CPU is a totally different beast than everything we've talked about. Modern CPUs are basically built on lies. If you tell a CPU, do this multiplication, then do this addition, then store this stuff in memory, that CPU may actually execute those steps in a completely different order. It may execute more than one step at the same time. It's trying really hard to make your code run faster, and so it's gonna just mix things up. And as long as it gets the same result, you don't care. One example of this is if you have a branch statement, an if statement. If the CPU were to wait until that if statement evaluates to figure out whether it should do the if block or the else block, that would take a lot of time. So CPUs these days have what are called branch predictors. They'll actually guess whether or not that if statement is gonna go in the true direction or the false direction, and they'll start evaluating ahead of time. If it made the guess correctly, that's awesome. It means that your program will run faster. If it mispredicts, if it guesses the wrong thing, it has to rewind back, undo all the calculations it did, and then start on the other branch and go over. Now, we normally don't see this. 
There have been some security vulnerabilities that have come up related to this, and I could do an entire talk on that. But the main thing to keep in mind is just that CPUs are horrendously complicated. Anytime someone tells you this is how a CPU works, that's probably a lie. It's probably way more complicated and interesting than that. But no one really fully understands how they work because they're so complicated. So those little lies that we tell ourselves, those abstractions, make it a lot easier to think about it. Now, the CPU is going to try and store stuff in memory when you ask for it, but when you try and read stuff in memory, it will store it in local caches. It actually takes a long time to go out and talk to RAM and ask for stuff in memory. So it may cache more than you actually asked for, so that if you're asking for a region of memory and you want the first thing in it and then the second thing and then the third thing, it'll get that entire table, so that the cost of getting the whole table happens up front. And then as you ask for each little pastry on that table, it has a local copy of it and it'll just return those one by one. Similarly, when you ask to write something, the CPU may say, cool, I'm gonna go out and write that, and it just caches it and waits until there's a convenient time that it can go talk to RAM. Which brings us to memory itself. There's dynamic memory, RAM, in your computer, and then most computers today have flash memory too, where things will sometimes get stored if there isn't enough space. RAM is gonna store stuff in a completely different order than you wanted. It's also gonna cache things. It's doing its own stuff to make stuff go faster. And if you're storing stuff in flash memory, flash memory is just ridiculous. 
If you ask flash memory to store the number seven, it may store the number 552 and it just somehow remembers that well, 552 is actually a seven here and it's doing this because there's a lot of incredibly complicated physics that has to do with the fact that flash memory is based on a lot of really broken technologies that they just kind of had to add a lot of duct tape to to make it work. Now it does work and it works really well and it goes really fast because of all this stuff but it means it's a total hot mess on the inside. So you may be wondering now, how do computers actually work? Everyone's lying to each other. Like nothing is what it seems. Well the way that all works is everything has been designed by very good engineers. We've spent a lot of time debugging this and we've been through a lot of bugs and problems and each one of these lies, even though something may say I'm gonna store something in memory and then it doesn't actually do it, it may actually, as long as it remembers that it was supposed to have stored that memory there, it could store it in a different location as long as the next time you ask for it, it goes out and gets the same thing. So that's a brief introduction to the crazy game of telephone that goes on inside your computer. And if you're worried that your computer isn't gonna work now and that it's all built on magic and lies, don't worry, that's why computers are really fun. No one understands how all this stuff works so don't feel too bad. All right, so now we have a couple choices. We can learn how Ruby figures out what to delete. We can figure out how these raccoons doing the garbage collector do their job, you know the main subject for this talk and I don't know why you wouldn't wanna learn about this, we have more cute raccoon illustrations coming up and that's kind of the main thrust of this talk, learning how the garbage collector works. 
But if you don't wanna do that, we can skip ahead and talk more about implementation details inside the garbage collector if you're already familiar with something called the mark and sweep algorithm, which is what CRuby uses. So how many people wanna learn how mark and sweep works and how Ruby deletes objects in memory, and how many people wanna skip ahead to sort of implementation details? All right, cool. Thank you to the couple of people who raised your hand with that one for being honest. Honesty is really appreciated here. All right, so you turn to Patter and say, okay, this is cool, I'm very confused about how computer memory works, but okay, you say that a job of Ruby's garbage collector is to delete things once we're done with them. I know that I create new objects that get stored in memory and at some point they get deleted, but I never tell Ruby when to do that, I never call free, so how does Ruby figure that out? And Patter says, well, let me take you over to my friend Mark. He brings you over to a raccoon that's dressed up in a little banker outfit and he's running around all these pastries, putting flags in them and connecting them with strings and following them around and running around in just sort of a crazy pattern that's really hard to figure out. And Patter says, hey, Mark, come over here. And Mark says, well, okay, I can pause for a little bit, but I'm really busy because this process is running right now. And Patter says, okay, can you explain to my friend over here what you're doing? And Mark says, well, there are two parts of the Ruby garbage collector. It's a mark and sweep algorithm. I mark, so I do the mark step. My job is to figure out what bits of memory you can delete so then later we can sweep up that memory and just delete it. The sweep step is easy. We just say whatever memory has been marked for deletion by me, Mark, we can go ahead and delete. So how do I do my job? 
I'm actually gonna pull up, well, first let me give you a little bit of an analogy. What objects can we delete? This cake over here, this cake is for a two-year-old's birthday party. Now, I don't know if you've ever been to a two-year-old's birthday party before, but it's a little bit chaotic, a lot like a program. And if you were to throw away the cake midway through a two-year-old's birthday party, all hell would break loose. That would be a very, very bad situation. So we know for a fact that this cake should not be thrown in the trash until this birthday party is over. There may also be pieces of that cake that have been cut off that are still connected to the cake, they're still part of it, but are stored elsewhere in memory. So we need to go out and figure out, are there pieces of cake that are still in people's hands? We wouldn't wanna throw those out, because if you've ever tried to pull away a piece of cake from a two-year-old, I personally haven't done it, but that's because I don't have a death wish. If you pull away a piece of cake from a two-year-old that's still eating it, bad things are gonna happen. Similarly, if someone put down a piece of cake and then maybe they're gonna use it later, we don't wanna delete it because even though they're not using it now, it's part of a birthday party that's still going on. So it might be needed later. The same thing is true of Ruby objects. There are some Ruby objects that we obviously shouldn't delete because they're part of a currently active process or those Ruby objects are associated with something that should never get deleted. For example, constants in Ruby. We're never gonna garbage collect a constant because it can be accessed at any point in your program and we never know when that's gonna happen. So whenever you assign something to a constant, we'll mark it so that we don't delete it. There may be objects in memory that are associated with a method that's currently being run. 
They're available in the local scope. We don't wanna delete those because again, there's a method that's currently running. Someone's holding on to that piece of cake and we don't wanna get rid of it. Finally, there are objects associated with those two things. It may be an element in an array that maybe you don't reference anywhere, but because it's part of an array that's still being used, we wanna make sure we save it because someone might use it later. So Mark pulls out a blackboard and says, let me show you how this works in Ruby. He draws out a very simple bit of Ruby code and then he draws out a bunch of boxes. Now this Ruby code, I'm sure you're familiar with, just Ruby code, it's a very, very simple program, and then these boxes are different slots in Ruby's memory. You can think of these as places on the table where we can store objects. So let's go through and evaluate this program. First we define the method greet and so that will then get, you know, let's just imagine it gets parsed and interpreted now even though Ruby is actually a little bit more complicated. Well, let's imagine it gets parsed and interpreted, Ruby figures out how to run the instructions, then stores them: okay, now I know how to do greet, I'm gonna put that over here. These are the instructions that I should run whenever we call the greet method. Then we get to this line here. We're creating an array called people and then that array has one string in it. So first Ruby is gonna evaluate that string. This is an object, we just created a new object even though we didn't call .new on the string class. When you create a string literal, that's generally gonna create a new object and so we have to store that in one of the open slots in memory. 
Next we have this array which is wrapping it and so we need to create an array in memory and inside that array we add a pointer to say this array contains this particular object, and so that way we know that this array is connected to that object that said Rubyist, there's a connection between the two. So then we go on to line seven, we call the method greet passing in this array, so we go to that method. Now the first thing that greet does is it pushes on the string and friends. Now this is not necessarily a good thing to do in your program, mutating things that you get in as arguments could potentially cause a lot of problems, but for sake of argument let's say that we push on this string that says and friends. Well that string is a new string that's created so we need to store that in memory, and then we're gonna output hello and then a comma separated list of all these different names that got passed in. So this is an interpolated string so we actually have to go on the inside, we have to store that little tiny comma, even that has to get stored in memory, and then we call names.join, well that's going to return a string so we have to go ahead and store the result of that in memory, and then we have to go ahead and evaluate that expression, that string interpolation, that creates another string and then we have to store that in memory, and now we're done calling the method greet. So finally we get to line eight and now we want to output the string y'all are great and here's where we encounter a problem. All the slots are taken up now, there's nowhere in memory to store this. Now Ruby could go out and ask for more memory but some of these objects that we allocated aren't actually needed anymore, so let's figure out if we can delete any of these objects in memory so that we can free up space to store this next thing. 
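The blackboard code itself isn't captured in the transcript. Going by the walk-through (greet pushes and friends onto its argument, the array is created on line six, greet is called on line seven, and the last puts is on line eight), it probably looked something like the reconstruction below. The exact strings and the parameter name `names` are assumptions, and a real Ruby interpreter may allocate somewhat differently than the trailing comments suggest.

```ruby
def greet(names)                      # the instructions for greet get stored
  names.push("and friends")           # a new string object, added to the array
  puts "Hello #{names.join(', ')}"    # the ", " string, the joined string, the final string
end

people = ["Rubyist"]                  # line six: one string object plus one array object
greet(people)                         # line seven: run the method with that array
puts "Y'all are great!"               # line eight: one more string, but no slot is left
```

After line seven, only the array and the two strings inside it are still reachable from the running program; the comma, the joined string and the full greeting are not.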
So now your program stops and Ruby's garbage collector starts, and so Mark is gonna run around, and the first thing that he's gonna do as part of the mark and sweep algorithm is he's gonna mark every single object with a white flag, a little mark in memory that says okay, this particular object in memory has not been looked at yet by the garbage collector. Again, the white flag means that we haven't evaluated whether or not this object should be garbage collected yet. Now the next thing that we wanna do is we want to mark all the objects that obviously shouldn't get garbage collected, things like constants, things that we know should never go away. We mark those with a gray flag. So in this case, that first array in memory we're gonna mark with a gray flag because puts y'all is in the same lexical scope as where that array was created, line six, which means it might get used later in the program. So we're gonna go ahead and mark that with a gray flag. The gray flag means we should not garbage collect this object but we don't yet know if it's connected to other objects in memory, and if they're connected we do not wanna garbage collect those objects. In the case of this array, we'll take a look at the array so we'll say okay, now we're gonna evaluate what's connected to this array because it has a gray flag on it and we don't know what's connected to it. We'll see that Rubyist is connected to it because it's inside of that array. So we mark Rubyist with a gray flag to say we can't garbage collect that. It's part of an array that we don't wanna garbage collect and so it's kind of, it's a viral thing. If you're inside of something that can't be garbage collected, you also can't be garbage collected. Now we also pushed on that and friends string there and so we're gonna go ahead and mark that with a gray flag to say okay, this is also in an array that can't get garbage collected so we won't garbage collect that either. 
Now we've looked at all of the contents of that array. We've looked at all objects associated with that array. So now we mark it with a black flag. Like a gray flag, a black flag says this object is one that we cannot garbage collect, exact same thing as a gray flag, but it also means we have evaluated everything connected to this object, so the garbage collector no longer has to look at it. We now know that everything is good. Now what Ruby will do is start the cycle over again. It'll pick a random object that has a gray flag on it and find all the connections to it. And so here we have Rubyist, which has a gray flag on it now. There isn't anything connected to it. It's just a string that sits on its own, so we can mark it with a black flag. And then we have this string here, "and friends". Again, no objects connected with it, so we can mark it with a black flag. You'll now notice we've gone through every single item in memory that had a gray flag, and along the way we had to mark some additional stuff with gray flags. We've recursively gone through, and now everything in memory has either a black flag or a white flag. Remember, the black flag means we've looked at that object already, we've looked at every object connected to it, and we know that we can't garbage collect it. And the white flag means we haven't looked at it. Now if we haven't looked at an object and there are no more gray flags left, it means that all those objects with white flags are not connected to anything that we can't garbage collect. Therefore we can delete all these objects. Nothing is gonna care if they go away, because nothing has a connection to them anymore. And so we can go through and delete everything marked with a white flag, and then we can go ahead and start evaluating our program again. Now that we have some free slots in memory, we can go ahead and store that string in memory.
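The white, gray and black flags above can be sketched as a toy simulation in plain Ruby. This is only a model for illustration: Slot, mark_and_sweep and the labels are made-up names, and the real collector works on C-level slots inside the interpreter, not on a Ruby array.

```ruby
# Toy model only. Slot and mark_and_sweep are hypothetical names for this sketch.
Slot = Struct.new(:label, :refs, :flag)

def mark_and_sweep(heap, roots)
  heap.each  { |slot| slot.flag = :white }  # nothing has been looked at yet
  roots.each { |slot| slot.flag = :gray }   # must be kept, connections not checked yet

  # Keep picking gray slots until none are left.
  while (slot = heap.find { |s| s.flag == :gray })
    slot.refs.each { |child| child.flag = :gray if child.flag == :white }
    slot.flag = :black                      # kept, and everything it points to was seen
  end

  heap.reject { |slot| slot.flag == :white } # sweep: white slots are unreachable
end

rubyist  = Slot.new("Rubyist", [])
friends  = Slot.new("and friends", [])
names    = Slot.new("names array", [rubyist, friends])
comma    = Slot.new("comma", [])
joined   = Slot.new("joined names", [])
greeting = Slot.new("greeting", [])
heap = [rubyist, names, friends, comma, joined, greeting]

survivors = mark_and_sweep(heap, [names])
p survivors.map(&:label)  # prints ["Rubyist", "names array", "and friends"]
```

Only the array and the two strings inside it survive. The comma, the joined string and the greeting never get a gray flag, so they are still white when the marking stops and their slots are freed.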
Now let's say that we didn't manage to create enough space to store that object. Well, then we have to go out and allocate more memory. We'd have to go get another table, and then we'd have even more slots. So next time we run the garbage collector, we'll have even more slots that we have to look at to run through this algorithm. So this is the mark and sweep algorithm. We go through and mark all of the objects, we figure out all the connections between them, and then any objects that we haven't marked, any objects that aren't connected to things currently being used by the program or that might be used in the future, we go ahead and delete. So now you have a couple of choices. Do you wanna learn how a Ruby program could leak memory? You may have heard about things called memory leaks in languages like C and C++, but they can happen in Ruby too. Even though Ruby has dynamic memory management, even though it has a garbage collector, it can still leak. Do you wanna learn whether the garbage collector really needs to be so complicated? We just spent a good amount of time talking about this mark and sweep algorithm to figure out how it works, but is there a simpler way that we could write a garbage collector? Or we could skip ahead and talk about how Ruby's garbage collector has been getting faster. There are some really cool and fun algorithms with some really cute raccoon illustrations. I mean, everything has a cute raccoon illustration. There are some really cool algorithms that have been added to Ruby over the years to make it run even faster, because if we just ran what I showed you right now, Ruby would be a lot slower than what you see today. So, how many people wanna learn about memory leaks in Ruby? Raise your hand. Cool. How many people wanna learn about a simpler garbage collection algorithm? And how many people wanna skip right ahead to learn about how Ruby's getting faster? All right, so it's a close tie.
We're gonna briefly talk about how a Ruby program could leak, and then we'll go ahead and see if we can fit a little bit more in. So, as you're walking along talking to Patter, you see a table tip over and all the pastries just spill out onto the floor. You point and say, ah, that's a memory leak. I've heard of these things. And Patter says, no, if objects that are stored in memory just randomly start spilling out everywhere, that's a serious issue. That's an operating-system-level thing. That's not a memory leak. Something really went wrong. Could someone go sweep that up? You say, okay, well, if a memory leak isn't when things just spill out of memory, what is a memory leak? Patter goes on to explain. A memory leak is very simple. It's just when you forget to delete something. If a C program keeps malloc-ing memory over and over again and never frees it, or if it forgets to free memory that it's creating in a function, then every time you call that function, more and more memory will be used. Even though you're not using those objects anymore, they aren't getting deleted, and so your program will take up more and more memory. That is a memory leak. Now in a Ruby program, a leak could happen if there's a problem inside of Ruby itself. That doesn't happen very often, but every so often there's a memory leak inside Ruby core. The way a memory leak in a normal Ruby program works is that you accidentally make a reference to some transient object, one that you only care about in passing but don't care about for the whole program. If you make a reference to one of those objects from an object that will never get garbage collected, well, then your object will also never get garbage collected. So if there's a connection from an object that will never get deleted to an object that could get deleted, well, then that object that could get deleted will never get deleted.
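To make that concrete, here is a minimal sketch of a long-lived object holding on to transient ones: a constant array that records every message it is handed. The names MESSAGE_LOG and print_message are assumptions made for this sketch.

```ruby
# Hypothetical names; a sketch of the pattern rather than code from the talk.
MESSAGE_LOG = []  # referenced by a constant, so the array itself is never collected

def print_message(message)
  MESSAGE_LOG << message  # the array now points at this string for the rest of the program
  puts message
end

3.times { |i| print_message("request #{i} handled") }
puts MESSAGE_LOG.size  # prints 3, and only ever grows unless the array is cleared
```

Each string would normally be garbage as soon as puts returned. Because the constant's array still points at it, the marking phase reaches it every time and it is never swept.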
So if you accidentally build up these references over time in your program, you're gonna start leaking memory. Ruby will take up more and more memory every time you do this. So let's look at a very simple memory leak. This program is theoretically fine. All we have is a constant with an array in it, and that array is just a log of all the messages that have been passed into this particular method. Then we have a method called print message. It stores that message in the array, so we have a log of all the messages we've ever printed, and then it outputs that message to standard out. Now this is a very simple Ruby leak, because we're adding each message to that array, and that array is in a constant, so it'll never get garbage collected. If we call this function, let's say, 10 million times with different strings, we'll have added 10 million strings to Ruby's memory, and those objects will never get deleted unless we clear out that array. It's very common for logging applications to hold on to references to objects that are transient, because you wanna log info about them, so you store info about them in the log. Well, if you don't make copies of that stuff, the objects that you're trying to make logs about will never get deleted, because they're stored in this log. They'll never go away. Now this memory leak is very simple. Arguably it isn't a memory leak if you don't care about that extra memory being used, and you can see the array it's stored in. So let's look at a slightly more tricky memory leak. Ruby has this method called ObjectSpace.define_finalizer. If you've never used this method, it means you're probably just an ordinary programmer. This is not a method that gets used very often. However, I dare you to find a single person who has ever used this method and not created a memory leak. This right here is a memory leak. The way that ObjectSpace.define_finalizer works is it takes two arguments.
The first argument is just any old object in Ruby memory, and the second one is a proc. Whenever that object gets garbage collected, whenever it gets deleted, that proc will get called. So this lets you call a little bit of code whenever something gets deleted. You can log that it's deleted. You can keep info about it. There are all sorts of clever little things you can do with this. Now the reason why this leaks memory is right here. Even though that proc doesn't look like it has a reference to the instance that's created in the initializer, it does. Whenever you create a proc in Ruby, all locally available variables get stored in that proc because they might get used later. You'll notice that when you define a local variable, you can reference it in a proc. That's the reason why this works. They have to be kept, because whenever that proc is called you need those local variables available. Now we haven't used self inside that proc, but self is still in that local space. We're using it right there. Self is available. That means that proc has a reference to self. That means the object you want garbage collected, the one you want that proc called for, is referenced by the function that will only get called when that object is garbage collected. Now we obviously can't get rid of that function until the object is garbage collected, and you may have noticed we have this weird circular loop. We will only call the function when the object is garbage collected, but that function references that object. So there's always gonna be a reference sitting around, and we can never garbage collect this object, and thus the puts of "deleted" and the name will never get called. This happens to every single person the first time they use ObjectSpace.define_finalizer. It has happened to me multiple times, because I keep forgetting about this. Now, there is a very simple way around this.
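Since the slides are not in the transcript, here is a hedged sketch of both versions: the leaking one just described, and the workaround that builds the proc in a class method. The class names LeakyPastry and Pastry and the method name finalizer_for are invented for this sketch.

```ruby
# Hypothetical classes; only ObjectSpace.define_finalizer is a real Ruby API here.
class LeakyPastry
  def initialize(name)
    @name = name
    # This proc is created inside an instance method, so it captures self
    # (this very instance) along with the local variable name.
    ObjectSpace.define_finalizer(self, proc { puts "deleted #{name}" })
  end
end

class Pastry
  def initialize(name)
    @name = name
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(name))
  end

  # Created in a class method: self in here is the Pastry class, not an instance.
  def self.finalizer_for(name)
    proc { puts "deleted #{name}" }
  end
end
```

The only difference is where the proc is created. In Pastry the proc still captures a self, but that self is the class, so nothing in the finalizer points back at the instance being finalized.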
We simply have to create that proc in a different context, one that doesn't have access to the object that we're creating the finalizer on. So here, we call a separate method that's defined on the class. So self will be the class, not the actual instance. That method takes in the name and returns a proc, and that proc still has a reference to self, but now that self is the class, because this is a class method. It has self dot on it. So this means that now this proc doesn't have a reference to the object that's gonna be garbage collected, and now you've gotten rid of the memory leak. Now those objects can actually get deleted. Now you turn to Patter and say, well, this is all good, but I'm getting kinda hungry and I have to go to lunch. And Patter says, okay, well, there are lots of things that we could have talked about, and you can come see me afterwards if you wanna hear about any more of these things, but before I go, I want you to go talk to my son over there. He has something interesting to tell you. And at that moment a raccoon walks up on a really weird-looking cat, and you say, who are you? Well, he says, my name is Aaron. Aaron, Patter's son. I'm one of the raccoons who works on a lot of improvements to Ruby's garbage collector, and I've been doing a lot of really cool work on something called a compacting garbage collector. The idea is that when things get stored in memory, they may get stored all over the place. I wanna try and bring them in so they're all tighter together, so that way there are more tables that don't have any objects on them. If you wanna hear more about this, I may be in the Biltmore after lunch giving a talk about this. And so Aaron, Patter's son, leaves you with that knowledge. And then somebody who kinda looks like a raccoon but a little different walks in, and you say, well, who are you? Well, my name is Koichi. I'm a tanuki. You say, what's a tanuki? Well, it's kinda like a Japanese raccoon.
One of the really cool things about Ruby is that there aren't just raccoons working on it. We also have us tanuki over in Japan working on improving it, because we tanuki were actually the ones who originally created the garbage collector, and we do a lot of the work on it. You say, well, what are you doing balancing all these pastries? Well, I've been working really hard to make it so that Ruby can do more than one thing at the same time. It's really hard to do. I keep dropping all these pastries. I'm still figuring it out. But one of the things that makes it really hard is the garbage collector. And so if you wanna learn more about that, in the Biltmore, in the same room later today, you can see maybe not a tanuki but someone named Koichi talk about the work that they've done on the garbage collector. Thank you all for coming. I wanna give a special thanks to Caitlin Cashin, who did all of the illustrations for this talk. Can you please give her a round of applause? And again, if you have any questions, feel free to tweet me or email me or come up to me at any point in this conference. I have a lot of raccoon illustrations if you wanna just look at the raccoon illustrations, or we can talk more about Ruby's garbage collector. Thank you.