Hi again. I'm going to talk about the GIL, the Global Interpreter Lock, in Ruby. I do appreciate that you all came here on the third day, when everyone is usually tired already, so I will do my best to make this talk as engaging as possible. This talk is intended to be junior friendly, but even if you are an experienced developer, I do believe you will find something new today. So if you are a junior, or experienced, or pretend to be experienced, this talk is for you.

I have worked in our industry for a long time. In Ruby alone I have worked for almost 10 years, and I absolutely love Ruby. I love it so much that even the text on my wedding cake was written in Ruby, even though my wife is a Pythonista.

Just like many of you, I didn't bother researching how exactly multithreading works in Ruby, or what exactly the GIL does. I had to do this research because a disaster happened to me and my project three years ago. My project is called Vico, Vico.com. This is our team. I have been building it for five years already; I started to build it even before it was officially founded. Now, in 2017, we are thriving and blooming and we have every chance to take over the world, but three years ago, in 2014, we were at the point of death because of a mistake in our thread safety. I will tell this story in full detail later, in the middle of this talk, but to explain what happened to me and my project, I had better start with the simplest example of what a race condition is.

I will start with an example of how to earn a million dollars in Ruby. In Ruby, you have to start small. You have to start with your first 10,000. So if you get a bank account and add a buck to it 10,000 times, you will get 10,000. If you repeat this cycle 100 times, you will get a million. And trust me, after 15 years in this industry, I can assure you this is the only way you can earn so much money in Ruby. I will even add a test here, which will assure us that we get a million.
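The slide itself is not part of this transcript, so what follows is only a minimal sketch of the example as described. The class name BankAccount, the method names add_dollar, earn_million and report, and the exact wording of the output are assumptions made for illustration.

```ruby
# Hypothetical reconstruction of the example; all names are assumptions.
class BankAccount
  attr_reader :value

  def initialize
    @value = 0
  end

  # Add one dollar to the account.
  def add_dollar
    @value += 1
  end
end

# Add a dollar 10,000 times, and repeat that cycle 100 times.
def earn_million(account)
  100.times do
    10_000.times { account.add_dollar }
  end
  account.value
end

# "\e[32m" and "\e[31m" are ANSI escape codes that color console output
# green or red; "\e[0m" resets the color.
def report(actual, expected)
  if actual == expected
    "\e[32mOK: #{actual}\e[0m"
  else
    "\e[31mERROR: expected #{expected}, got #{actual}\e[0m"
  end
end

puts report(earn_million(BankAccount.new), 1_000_000)
```

Run sequentially like this, the check always passes, because nothing else touches the account between one increment and the next.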
Please don't be afraid of those weird symbols in the test. They are there only to highlight characters in the console with color.

Now the tricky part. Imagine that instead of repeating the cycle 100 times, I ask 100 threads to do the job. Please don't try to learn the exact syntax of how to spawn threads in Ruby from this example; it is irrelevant to your understanding. Just trust me that this code will run 100 threads, each of which will run this cycle of 10,000.

First, I will try to run this code with a Ruby implementation which does not have a GIL. And maybe the lovely audience will tell me which Ruby implementation doesn't have a GIL? JRuby, right. JRuby, thank you. That was exactly in my script. So I ran this with JRuby and we got an error. It's nowhere near a million. We didn't get our million. We got a problem. Why? Because it's a race condition, and one which is actually quite easy to understand and explain.

The reason for this race condition is a single line of this code. This one simple line is solely responsible for all the weirdness we're getting here. Let's look at it more closely. If we think about it carefully, we can expand this line into three separate operations. First, we read from the instance variable of the bank account. Then, in memory, we increment it by one. And then we write it back to the instance variable of the bank account.

Now imagine that two threads are doing this at the same time. For the record, the local variable, which is called value here, belongs to each thread separately. Each thread has its own copy of the local variable value, but they share the common instance variable of the bank account. So imagine the first thread reads the value from the bank account. Say it's currently five. The second thread does the same: it reads the value from the bank account, five.
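A rough sketch of the threaded version and of the three-step expansion just described. BankAccount and earn_with_threads are hypothetical names, not taken from the slides. The total depends on the interpreter and on thread scheduling, so no particular result is claimed here.

```ruby
# Hypothetical names throughout; this is a sketch, not the speaker's code.
class BankAccount
  attr_reader :value

  def initialize
    @value = 0
  end

  # What `@value += 1` amounts to, written out as three steps.
  def add_dollar
    value = @value      # 1. read the shared instance variable
    value = value + 1   # 2. increment the thread-local copy
    @value = value      # 3. write the result back
  end
end

# Spawn `threads` threads; each one adds a dollar `per_thread` times.
def earn_with_threads(account, threads: 100, per_thread: 10_000)
  workers = Array.new(threads) do
    Thread.new { per_thread.times { account.add_dollar } }
  end
  workers.each(&:join) # wait for every thread to finish
  account.value
end

total = earn_with_threads(BankAccount.new)
puts "expected 1000000, got #{total}"
```

If a thread switch lands between step 1 and step 3, another thread's write is overwritten, which is why the total can only fall short of the expected value, never exceed it.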
The first thread increments its value: it's six. The second thread increments: it's six. You probably already know where this is leading. The first thread saves six to the bank account. The second thread saves six to the bank account. So what has happened? Each of the two threads should have added a dollar to our bank account, but only $1 has made its way there. Instead of $7, we got only six. This is called a race condition.

And now I will try to run the same code with Ruby MRI, which does have a GIL. By the way, ladies and gentlemen, this is going to be the first appearance of the GIL in this talk. So, lo and behold, I'm running the same code with Ruby MRI. And it's correct. It is reliably correct. No matter how many times I run it, it will be okay. So it seems that the GIL saved us from the race condition. Yay? Not so much. It is too early to celebrate.

Now imagine that a junior developer comes in and refactors the code. This talk is junior friendly, so I'm blaming juniors first. What basically has been done here is that two methods have been extracted: the operations of reading from the bank account and writing to the bank account. It is actually quite a meaningful and sensible refactoring. Uncle Bob and Martin Fowler would be proud of it. I personally have no objections to such a refactoring, but the GIL does. The GIL doesn't like it. Look, I'm running it again. And it's an error. Why is that?

Well, actually we've got two questions here. The first question is why it happened at all: why did the GIL allow a race condition to happen? And the second question: why did it start to happen only when the junior developer allegedly extracted methods? Good questions, and I will answer them, and I still owe you my story about what happened to my project. But first I need to step aside and fight one very common misconception about multithreading in Ruby. This laptop has eight cores.
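The lost update and the refactoring described above can be replayed deterministically without any threads. In this sketch the method names read and write are assumptions (the talk only says that reading and writing were extracted), and the two "threads" are simulated by interleaving the calls by hand.

```ruby
# Hypothetical reconstruction; names are assumptions.
class BankAccount
  def initialize(value = 0)
    @value = value
  end

  # The two methods extracted by the refactoring.
  def read
    @value
  end

  def write(new_value)
    @value = new_value
  end

  # After the refactoring, adding a dollar goes through two method calls.
  def add_dollar
    write(read + 1)
  end
end

# Replay the unlucky interleaving by hand, without real threads.
def lost_update_demo
  account = BankAccount.new(5)
  first  = account.read       # first thread reads 5
  second = account.read       # context switch: second thread also reads 5
  account.write(first + 1)    # first thread writes 6
  account.write(second + 1)   # second thread writes 6, overwriting the first
  account.read
end

puts lost_update_demo # => 6, although two dollars were added
```

With real threads the same thing happens whenever a context switch falls between the read and the write inside add_dollar.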
I will now turn off seven of the eight cores and run the same code with Ruby MRI, with the same GIL, again. Now, over to you. What do you think will happen when I turn off all the additional cores and run the same code again? Will the race condition go away or not? Okay, yes and no. We have collected all possible opinions. Well, let's actually try it.

Just in case, I'm checking that it still fails. Sometimes, you can see, you will get correct results, because after all it's all about probability. But think about it: our code has become unreliable because of a simple refactoring which seemed innocent. And now I will turn off the cores and run it again. I'm turning off cores. I'm checking that really only one core is available. I'm running it again. I'm running it several times. Error. The race condition didn't go away. Our code is still unreliable. So, just in case, I'm turning my cores back on, and now I have three questions. The third: why the hell didn't the race condition go away when we turned off the other cores?

Good question, and I will start answering from the very last one, the third one. Why didn't the race condition go away when we turned off the other cores? The answer is: because parallelism is not concurrency. They may seem like synonyms, but they are not. I'll explain it in detail now.

Look, imagine we have two threads. The first thread's goal is to turn from yellow to red, and the second one's only purpose is to finally become blue. When people hear that the GIL allows only one thread to run at a time, they sometimes imagine it like this: threads coming one after another, forming an orderly line as if they lived in Great Britain. But this is not the case. This is not how it is done. This is neither concurrent nor parallel. Also, we wish it was like this: we wish each thread occupied its own core and ran truly simultaneously. This is called concurrent and parallel.
You really get that in JRuby and Rubinius, but again, this is not the case with Ruby MRI. So the previous picture was neither parallel nor concurrent. This one is concurrent and parallel. But with the GIL you get something else: you get concurrent but not parallel. In this case, the second core is always empty. No matter whether it's available or not, the GIL will restrict your ability to use more than one core. You will never be able to run your Ruby code on more than one core with the GIL; it will not allow you.

But what's happening on the first core is that both threads are constantly kicking each other, fighting for the right to run on the only available core. Ruby will allow the first thread to run for a number of milliseconds, then switch to the second thread and run it for a number of milliseconds, then switch back, and so on and so forth until they are both done. This is how it happens. This actually answers our last question: why didn't the race condition go away when we turned off the other cores? Because other cores are irrelevant. No matter whether there are other cores or not, the GIL will restrict your ability to use them, and it will still run everything concurrently, even though not in parallel.

So let's agree on one thing here today: the moment where MRI switches between threads is called a context switch. It's a simple definition, which we will use a lot in this talk.

But before I go further, I need to make a little side note. When I'm talking to juniors about multithreading in Ruby, they often ask me: if this is the case, if we are not allowed to use more than one core, if everything is so bad, why would one use multithreading in Ruby at all? Well, I'll give you one little example. Imagine you have to talk to a remote API, a very slow remote API, and you have to make 25 requests. Instead of making those requests and waiting for them one after another, you could make 25 threads to wait for those 25 responses. Waiting does not consume CPU resources.
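The waiting argument can be sketched with sleep standing in for a slow remote API. The names slow_api_call and fetch_all are hypothetical, and the 0.2 second delay is arbitrary; this is an illustration under those assumptions, not a benchmark.

```ruby
# `slow_api_call` is a hypothetical stand-in for a remote request:
# `sleep` plays the role of waiting on the network.
def slow_api_call(id, delay)
  sleep(delay)
  "response #{id}"
end

# Start one thread per request and collect all responses.
# Returns the responses and the wall-clock time taken.
def fetch_all(count: 25, delay: 0.2)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = Array.new(count) { |i| Thread.new { slow_api_call(i, delay) } }
  responses = threads.map(&:value) # value joins the thread and returns its result
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [responses, elapsed]
end

responses, elapsed = fetch_all
puts "#{responses.size} responses in #{elapsed.round(2)}s"
```

Done one after another, 25 calls of 0.2 seconds each would take about 5 seconds; with threads the waits overlap, so the total stays close to a single delay.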
So even one core can easily handle 25 threads which just wait for a response from a slow API.

So, back to our thing. I now want to answer the question of why the race condition happened only when the junior developer allegedly extracted methods. The answer is that Ruby MRI can switch context when you call a method or return from a method. So what the junior developer did was insert a method call into the most vital part of our algorithm. No wonder we got a race condition.

Now we have answered our second question, and I'm going to ask another one. If this is the case, then why was the GIL invented in the first place? Why did the developers of MRI put a GIL into Ruby? To answer that, here is another example, very similar to the previous one, but instead of inflating the same bank account, these hundred threads are trying to populate the same array. And in the end, I check that we really get a million elements in this array. This method call, array.push, is quite complex inside. Many things can break there, many things can be corrupted inside if you run them concurrently.

Now I will run it with Ruby MRI, and it is correct. It is correct because the GIL does protect internal built-in methods written in C in Ruby MRI. It will actually also protect your own methods which are written in C, unless they call back into Ruby, but that's not the case here. Right now we're talking about the reason why it was invented. It was invented because it protects the integrity of Ruby's internal data. When you're using an array in Ruby, under the hood the structure of an array is quite complex. It is concealed from you, but it is very fragile if you try to run operations on it concurrently, and the GIL protects you from that.

Let's now try to run the same code with JRuby, which doesn't have a GIL. If I run it with JRuby, you see an error.
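The array example might look roughly like the sketch below. The helper name fill_array is invented for illustration, and the claim that the size always comes out right applies to MRI only.

```ruby
# Hypothetical sketch: many threads push into one shared array.
def fill_array(threads: 100, per_thread: 10_000)
  array = []
  workers = Array.new(threads) do
    # Array#push is implemented in C; on MRI the GIL is held for the
    # whole call, so the array's internal structure stays consistent.
    Thread.new { per_thread.times { array.push(1) } }
  end
  workers.each(&:join)
  array
end

array = fill_array
puts "expected 1000000 elements, got #{array.size}"
```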
The error says: "Detected invalid array contents due to unsynchronized modifications with concurrent users." The interpreter's internal data has been corrupted in JRuby because we ran things concurrently. This can never happen in MRI, because the GIL protects from it. So the main conclusion from here, the one thing which I hope you will take home today, is that the GIL is here not for your convenience, but for the convenience of MRI developers. It's not about you.

To illustrate that, I will amend this example. Basically, I have added another check, which checks not only the array size but also the array contents. And I run it again, and no matter how many times I run it, the array size is always correct, because the GIL is here, but the array contents are always wrong. Although the GIL protects the operation of insertion into an array, it doesn't protect the code around this insertion. It doesn't protect your Ruby code. So this actually answers our first question: why did the race condition happen at all? Because the GIL isn't meant to protect your code from race conditions.

Now, I also have to tell you another important thing about context switching. But before I do that, let me finally fulfill my promise and tell you what happened to me and my project three years ago. Just like many of you, I thought that this thing was irrelevant to me and my work. I thought: I'm not going to use multithreading, I'm never going to be hit by a race condition. I was wrong.

Again, I remind you that my project is called Vico. It's a platform for e-commerce retailers. It does many things, but one thing it does is take all the e-commerce orders of a particular retailer, from Amazon, from eBay, from Shopify.com, and show them in one place. You, developers, understand that we achieve that by talking to remote APIs. We talk to remote APIs in the background, and for that we use a framework. And maybe the lovely audience will tell me what the most popular background job processing framework in Ruby is?
Sidekiq, right. Very true, Sidekiq. We are using Sidekiq to talk to Shopify. Now, in order to talk to the Shopify API itself, we are using what any one of you would use if you were us: the official Shopify API gem. The Shopify API gem, under the hood, uses Rails Active Resource. Three years ago, in 2014, Rails Active Resource wasn't thread safe. There was a bug in its thread safety.

So one morning, on the day when we should have had a Christmas party, a disaster happened. Our users started to see the e-commerce orders of other users. If you were selling t-shirts, it should have looked like this. But instead, it looked like this. The most sensitive data of any e-commerce business, the live orders, had been exposed to the wrong users. Now, three years ago we didn't have many users. We detected the problem quite quickly. We fixed it reasonably fast. So we survived. But if it happened again today, when we have hundreds of large users, when we process tens of thousands of orders per day, when we have billions of background jobs, the reputational damage from such a disaster would be so large that we would definitely have to shut down our business. Just think about it: one mistake in thread safety can cost you a business.

Now that I have scared you enough, let's go further. One thing to remember here: be smart, don't be like Daniel, learn this stuff in advance. Because otherwise you will be in my situation: while the rest of the team were enjoying this view from the restaurant, we developers had our heads down, trying to do surgery on the database.

Now I am back to my thing about another important aspect of context switching in Ruby. Ready for a show? Look here. This is the same very simple example before any refactoring, the very first example. And I remind you that it is thread safe. I even increased it to 10 million so that we really don't get any false negatives here. Now watch very attentively. I'm taking this line and I'm adding "if true" to it.
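The three variants in this demonstration (the original line, the one with if true, and the unless false version that comes next) could be sketched as follows. The class and method names are assumptions, and whether any variant actually loses updates depends on the MRI version, so the sketch only reports the totals. The talk used 10 million increments; a smaller count is used here to keep the run short.

```ruby
# Hypothetical sketch; all names are assumptions.
class BankAccount
  attr_reader :value

  def initialize
    @value = 0
  end

  # The original line.
  def add_plain
    @value += 1
  end

  # Same line with a condition that is always true.
  def add_if_true
    @value += 1 if true
  end

  # Logically identical, written with `unless`.
  def add_unless_false
    @value += 1 unless false
  end
end

# Run one variant from many threads and return the final balance.
def total_for(variant, threads: 100, per_thread: 10_000)
  account = BankAccount.new
  workers = Array.new(threads) do
    Thread.new { per_thread.times { account.public_send(variant) } }
  end
  workers.each(&:join)
  account.value
end

%i[add_plain add_if_true add_unless_false].each do |variant|
  puts "#{variant}: expected 1000000, got #{total_for(variant)}"
end
```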
That shouldn't change anything, it shouldn't change the behavior, it is the same. I'm running it again, and again it's correct. As the next step, I'm converting it to "unless false". Again, it shouldn't change anything, it shouldn't change the behavior. I'm running that. Boom, race condition. Out of nowhere, unpredictable. How? What is that?

Well, to be honest, the only reasonable way to answer that is to say that the exact points at which Ruby MRI switches between contexts are an undocumented internal part of Ruby. You should never learn them. You should never rely on them. You are not supposed to know where they happen. They are undocumented and internal; they are very private parts of the interpreter. They can even change from version to version without warning. Look, now I am on Ruby 2.3. I'm switching to Ruby 2.4, running it, and the error went away; it's okay.

There are 3,000 commits in Ruby between 2.3 and 2.4. This commit is the one responsible for the change in this behavior. It actually has nothing to do with the scheduling of threads, I assure you. To be honest, it is an optimization which cuts out all obviously unreachable parts of Ruby code. Here, with unless false, we know what is obviously unreachable. So if I play a little trick and pretend that I'm actually calculating something here, the error comes back. And in the next version it can disappear again, and then it can appear again and again, because these are undocumented peculiarities of the interpreter. So another take-home thing today: assume context can be switched at any line of your Ruby code. It is the only safe thing to do.

So now you may ask me: what to do about it? Well, there are many things you can do to protect yourself from race conditions, and they are all far beyond the scope of this talk. But I will tell you two things. First, I will tell you what we did to recover from our problem, from our disaster in production.
First, in Sidekiq, we reduced the concurrency of workers to one, so we removed all concurrency. Then we manually fixed the mess in the database, which was the hardest part. And then we upgraded to Rails 4. In Rails 4, Active Resource is thread safe.

The second thing is guilds. Guilds, if they are implemented in Ruby 3.0, won't help on their own. Inside each guild you will have a GIL, the same GIL and the same threads. Inside each guild you will have all the same problems we have discussed today. Moreover, guilds are an implementation of the actor model, and to utilize the actor model you have to change your algorithms. You have to change the way you think. You have to change your code. You will never get anything effortlessly. You have to deliberately change your code to be thread safe. Guilds won't save you.

Now, just to reiterate several things which I hope you will take home today. Only Ruby MRI has a GIL; the others don't. The GIL isn't supposed to save you from race conditions. Parallelism is not concurrency. The GIL is here not for your convenience but for the convenience of MRI developers. Assume Ruby can switch your threads at any point of your Ruby code. There will never be a magic tool to avoid race conditions or to use multiple cores.

Now this talk is gradually coming to its end. I won't dare to take questions from you because I will not understand your lovely American accents. But catch me in the corridor. I am very friendly. I will do my best to help you. Alternatively, follow me on Twitter or email me. Again, I am very friendly. I will try to help you. Thanks, that's all.