All right, so 30 minutes on concurrency is obviously a vicious combination — it's a huge topic, and we just spent half an hour talking about one specific aspect of it. There's no way I'm going to cover the entire thing. So what I'm going to do instead is tell you a little bit of a story of my own exploration of this topic, because concurrency is something I've been interested in for a long time, going back to university. It was one of the courses I enjoyed the most. And of course, what I learned there was, among other things, forks, locks, mutexes, detection algorithms for deadlocks and all the rest. Then I went out into industry and realized that nobody cared about any of those things — in practice it was all something very different. Now, if you pick up a few books, or do a quick search, you'll find that there are probably over a million pages that have been written on this topic. So clearly this is something a lot of people care about. But nonetheless, a lot of it is of the same flavor. You can find a book on concurrency for virtually any language, and I picked up a few. We have books on patterns, books on Java, books for just about anything and everything in concurrency. But nonetheless, when people think concurrency, the first thing that comes to mind is threads. Of course, it's not all about threads — no, it's all about events! So I'm here to tell you: forget threads, it's all about events. Well, not quite, right? It turns out — and I've spent a lot of time pouring gasoline on this debate myself; it's a religious debate — it turns out we actually need both. And to understand why we need both, we once again need to take a step back and look at the hardware. This is something I didn't appreciate until I actually put all of the software stuff aside and looked at the hardware.
So first of all, here's a very simple architecture of a modern CPU core. You have the core itself, you have a couple of caches — we'll see why we need those caches — and then there's RAM and other I/O devices attached to this thing. On most architectures today, and these are rough numbers, it takes about 100 nanoseconds to go from the core to RAM. So if you're trying to pull in the instructions to execute your code, that'll take 100 nanoseconds. That's pretty quick, but not quick enough. That's the reason we started adding all these caches — and in fact, if you get really into this topic, you'll find that there are L1 caches, L2, L3, some architectures even have L4 caches, and there are different designs for how the cache is constructed and how data is shared between multiple cores and all the rest. To go to L2 takes about seven nanoseconds, which is an order of magnitude better than going to RAM. Going to L1 is about half a nanosecond. So that's pretty good. Now, let's do some math. Take a couple-of-gigahertz CPU today, and one clock cycle is about half a nanosecond. That means we can actually fetch the next instruction from the L1 cache in about one operation. But the converse of that is: any time you have to go to RAM — and oftentimes you do — you're wasting about 200 CPU cycles. So to combat this, the hardware industry for the last, I'm going to say, 30, 40 years has been doing all kinds of crazy stuff that we don't even think about when we write software. There's prefetching, there's branch prediction, pipelining, out-of-order execution, speculative execution — all kinds of crazy stuff that we don't really have to think about, because it all happens under the hood. If you're interested in this sort of topic, there's a link at the bottom. So here's a very interesting example that I came across recently.
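To make that arithmetic concrete, here's a tiny Ruby sketch using the rough latency figures above. The numbers are the ballpark estimates from this talk, not measurements of any particular chip:

```ruby
# Ballpark latencies from the talk; real values vary by architecture.
CLOCK_NS = 0.5    # one clock cycle on a ~2 GHz core
L1_NS    = 0.5    # L1 cache hit
L2_NS    = 7.0    # L2 cache hit
RAM_NS   = 100.0  # round trip to main memory

# How many clock cycles a given latency costs us.
def cycles(latency_ns)
  (latency_ns / CLOCK_NS).round
end

puts "L1 hit: #{cycles(L1_NS)} cycle"    # ~1 cycle
puts "L2 hit: #{cycles(L2_NS)} cycles"   # ~14 cycles
puts "RAM:    #{cycles(RAM_NS)} cycles"  # ~200 cycles stalled
```

So an L1 hit is essentially free, while every trip to RAM stalls the core for a couple of hundred instruction slots — which is exactly why all the caching and prediction machinery exists.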
This is an example from a presentation given by Joshua Bloch — he's one of the lead developers of a whole bunch of the JDK libraries. He gave this example of two code blocks and asked: which of these do you think is faster, one or two? Well, your intuition is probably correct, but it turns out that we actually don't know. And the reason we don't know is that both the JVM and the hardware are crafty enough that oftentimes the two perform the same today. But in some cases — depending on how the two operations interact — the first one is significantly slower. What happens here is, once again, we have branch prediction, speculative execution and all the rest. So the runtime can actually execute both of the branches in parallel at the same time, and you don't get any benefit. So while our intuition says the second one should be faster, we actually don't know. His message was: we have to measure. And in order to measure performance today, you actually have to do statistical tests. You don't just run your code once — you run it 10 times or 100 times, build a histogram, look at the histogram, get your error bars and everything, and only then can you draw some sort of conclusion. And even that depends on the hardware and the JVM version and all the rest. So basically the answer is: we don't know. Coming back, what this means is we have hardware parallelism — it's just embedded in the hardware. In order to make our software run faster, the hardware has to do all kinds of magic tricks: prefetch instructions, execute them in parallel, and all these things. And layered on top of that, we have software parallelism: we have processes, threads and events. And I'm not saying those are exclusive, or in fact even very different — they're all basically the same thing.
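The "measure, don't guess" advice can be sketched in a few lines of Ruby. The method names here (`measure`, `summarize`) are my own, not any benchmarking library's API — it's just the shape of the idea: time a block many times and look at the distribution, not a single run:

```ruby
# Time the given block `runs` times; return the array of samples in seconds.
def measure(runs = 100)
  runs.times.map do
    t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
  end
end

# Reduce the samples to the kind of summary you'd put error bars on.
def summarize(samples)
  mean = samples.sum / samples.size
  var  = samples.sum { |s| (s - mean)**2 } / samples.size
  { mean: mean, stddev: Math.sqrt(var), min: samples.min, max: samples.max }
end

samples = measure(100) { 10_000.times { |i| i * i } }
stats   = summarize(samples)
puts format("mean %.6fs stddev %.6fs", stats[:mean], stats[:stddev])
```

A real benchmark would also need warm-up runs (especially on the JVM) and ideally a histogram, but even this much is already more honest than timing a single execution.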
How you schedule events — what your scheduling algorithm is, whether you have a run loop or something else — is a whole different topic entirely, and I'm not going to go into that. What I think is interesting, though, is that because we have this software parallelism, what we've done is invent a whole bunch of libraries like pthreads, which were mentioned already, epoll, kqueue and all the rest, and we've just bolted all these things onto our operating systems. These concepts are not specific to any particular language — they're just there as part of the operating system. And then what we've done is taken all those libraries and exposed them in all of our languages. So it's not even the case that when you design a language, you actually think about concurrency. These things are basically an afterthought, where we just say: oh yeah, the operating system gives me all this stuff, so we're going to bind to this library and off we go — we've got our concurrency stuff, all good. Recently I've been making my way through this book by Bruce Tate, Seven Languages in Seven Weeks. Has anybody here read it? I highly recommend it — very, very good book. Within this book, one of the seven languages is Ruby, and one of the questions the author asks Matz is: if you could go back in time, what is the one thing you would change about the language? His answer was: well, I would remove threads and maybe add actors or some other more advanced concurrency features. Of course, the question you should be asking yourself now is: what the hell is an advanced concurrency feature? Because certainly when I was taught concurrency, nobody even mentioned actor models or anything like that — we were talking threads and all the rest. What this implies, though, is: sure, we have pthreads, epoll, kqueue and all the rest, but there's something missing. There's something in between that we have skipped.
And that's what these advanced concurrency models are. There are actually quite a few of them — and this is anything but a complete list, despite the fact that I can fit it on a slide. There's dataflow, there are futures, actor models and all the rest; follow the link at the bottom there, it points to a very good collection of articles around this topic. In the interest of time, I'm going to focus on just two. One of them you're probably fairly familiar with at this point: the actor model has been making the rounds and a lot of people have been playing with it. And then there's CSP, which I find very interesting just because it's actually very similar to the actor model, yet also significantly different. Before we get into the specifics: I think the value of a good tool or model is first in what it enables us to do, but also — and this is often overlooked — in the constraints the model imposes on us. A good model gives us something we couldn't do before. Maybe it gives us a way to express something; maybe that's a language feature that lets us cleanly express it. It can dictate structure. Think about Rails, for example: the first time you created a Rails app and saw those 15 folders, you were like, what the hell is all this for? But later, once you're into your development cycle, you're like, oh man, it actually makes sense — now I know where to put this stuff and I don't have to reinvent the structure every single time. Whether you actually like the pattern or not is a whole different conversation, but that's an example of what a model can do. Likewise, it can dictate a style — functional programming, for example, dictates a style, and that has its benefits and its downsides. So those are the things a model can enable. But likewise, a model can disallow certain behaviors.
All right, so for example, it can impose restrictions on the language such that we just cannot make a certain type of mistake. And the best thing about that is somebody can think hard about what that mistake is, embed the restriction into the language, and then we don't have to think about it, because the language will just naturally not permit us to make it. It can implicitly make the right choice for you — even just the next time you choose a concurrency framework, it may be as simple as the right set of defaults. And finally, as we already mentioned, it can eliminate a whole class of errors. So, the actor model — how many people here are familiar with the actor model, or know what it means? OK. A lot of these concurrency models date back to the early 70s, which is in itself very interesting to me, because there's been very little conversation around this stuff outside of the academic community. It's only now that it's making its way into software development as we know it, in a practical sense. One of the first papers, published in 1973, was just the proposal — the idea of the actor model. And to be clear, this was not an API or a programming model at that point. It was basically mathematical notation: they were after a process calculus, as they called it, where you could prove certain things — that this program will actually finish without a deadlock, that it will exhibit certain behaviors. All the things we complain about when we say it's really hard to test concurrent code — they were after formal proofs, and ways to design languages that would let us actually express all this stuff. Later, in 1975, we had the first kind of practical application, where somebody took the mathematical notation and turned it into something that looked like something you could write code for.
And since then, there have been literally dozens and dozens of papers around this stuff, and some actual languages. The one most of us are familiar with is Erlang, which dates back to 1986; it wasn't officially released until 1993, and it wasn't until the 2000s that most of us really heard about it. There's Scala, and there's a whole bunch of frameworks that have been built around this model — Akka is a good example, and that's on the JVM: not a language, just a framework that runs on top of the JVM. Needless to say, this has been around for a while. And the basic concept is very simple. I'm not going to bore you with the mathematical notation, but the practical aspect of it is two things: every process has a name, and every process has a mailbox, and they communicate by messages. When I say process, that's a deliberately ambiguous word: it could be an operating-system process on your machine, it could be a thread, it could be a different machine entirely. Effectively, you have different things that have names, and you communicate between them with messages. So what does that give us? Well, it gives us a message-centric view. It allows communication between all kinds of entities, whatever they may be, and as you can see, it's a very natural fit for distributed programming: you can take this model and scale it both within one machine and between multiple machines. There's nothing in the model that prevents you from doing so, which is in itself very interesting — that's not something you can say about threads and shared memory. So that's a good example of what a good model gives us: it has certain limitations — a message-centric view — but it gives us other things in the process. Constraints: no side effects, right? We're not sharing variables here, we're passing messages. There are no race conditions. We don't need locks anymore.
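Those two ingredients — a name and a mailbox — can be illustrated with nothing but Ruby's standard Thread and Queue. This is just a sketch of the idea, not Erlang or any real actor library; the class and method names are my own:

```ruby
class Actor
  REGISTRY = {}  # every actor is reachable by name

  def initialize(name, &behavior)
    @mailbox = Queue.new           # the actor's private mailbox
    REGISTRY[name] = self
    @thread = Thread.new do
      loop do
        msg = @mailbox.pop         # block until a message arrives
        break if msg == :stop
        behavior.call(msg)         # process one message at a time
      end
    end
  end

  def send_message(msg)
    @mailbox << msg                # the ONLY way to interact with an actor
  end

  def stop
    @mailbox << :stop
    @thread.join
  end
end

results = Queue.new
Actor.new(:doubler) { |n| results << n * 2 }
Actor::REGISTRY[:doubler].send_message(21)  # address the actor by name
Actor::REGISTRY[:doubler].stop
puts results.pop  # => 42
```

Notice there's no lock anywhere: the mailbox serializes all access to the actor's state, which is the whole point of the model.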
There are no semaphores. There's never a case where we're sharing pointers to the same thing — forget it, we don't need that stuff here. So that's the actor model. The CSP model is a little bit later: it dates back to 1978, when the first seminal paper on this, "Communicating Sequential Processes" by Tony Hoare, was published. Later this evolved into a whole family of related systems. Language adoption actually lagged even that of the actor model — there was Limbo, and most recently there's Go, Google's Go language. How many people here have played with Go? A few — OK, that's actually more than I expected. Go brings CSP back, and I'm going to say it's the only language with CSP built in that's actually interesting to play with today. So what's the difference between the actor model and CSP? Well, unlike the actor model, processes are anonymous. Instead of giving processes names, we give the communication channel a name, and you communicate over these named channels. The best analogy I can think of is the Unix pipe: you create something on the system, it has a name, you can shove stuff into it, and you can get stuff out of it. That is really the only difference between the actor model and CSP, but it yields very different results. Likewise, it's a message-centric view: no shared variables, and it allows communication between threads and processes. The distributed case is a little bit trickier, because now you have this pipe sitting on your machine and you have to somehow expose it to a different machine — but we can think of ways to do that, so nonetheless, distributed programming is a pretty natural fit. Once again: no side effects, no race conditions, no semaphores and all the rest. Very simple, but also very different. So what are the advantages, what are the features of this model?
So I'm calling that arrow a named channel — let's call it A — and we have multiple processes. Because we have A, we can have multiple processes all pushing into the same pipe. We don't need to know who to send anything to; all we need to know is the destination we should be sending the messages to. So you can have 10 workers all pushing into the same pipe and not worry about who's on the other side receiving the data. And you can do some crazy stuff. What if you could send a channel over a channel? Then I could create this scenario where originally the blue process is talking to the orange process, but then we have this guy at the bottom, and the blue process sends the channel through which it's communicating with the orange guy down to the guy at the bottom and says: hey, I'm talking to this pipe — take that pipe and talk to it as well. So we can actually delegate work to a different process. Or you can create a chain: I could do the same thing and say, hey, orange process, do something and then talk to this other guy on the other end. So you can create very interesting flows of data using this model. To explore this, I was playing with Go, looking at the source code, trying to really wrap my head around what this actually means — because I find that, personally, until I actually write some code, I really don't get it; it's a mental block of sorts. So I started basically porting the Go concurrency model into Ruby. You can play with this yourself: gem install agent. Let's take a look at an example. This is the hello world of the concurrency world: the producer-consumer setup. What am I doing? I'm declaring a named channel — I'm going to call it increment, or incr here — and I'm going to give it a type, because channels in Go are typed channels.
You have to declare what you're going to be sending over the channel — here, I'm going to be sending integers. And here's where the magic happens: there's a special keyword called go, which in the Go language starts what's called a goroutine. In Ruby, in this case, what it actually does is take the code block I'm specifying and start it on a background thread. Now, you don't actually see any of the threading or synchronization code, because you don't need to — the model I'm implementing here takes care of all of that for you; you don't even have to think about it. Effectively, I'm defining a keyword in the language called go; go takes a channel, and inside this block we loop forever and just keep sending messages into the channel, incrementing our number every single time. And then we consume the results. To consume the results, we simply call receive on the channel, and that's all there is to it. There are two threads running here: one is producing numbers, one is consuming numbers. There's no synchronization, there's nothing else. Very clean, very simple — and actually more compact than probably anything you could write with raw threads. So let's take a look at a harder but much more interesting example. I'm going to implement something that resembles a multi-threaded web server. First, I declare a new struct, which is going to be a Request. The reason I need to do that is that the channels are typed: I'm going to transport Request-typed objects over my channel, and the Request will contain an argument and a reply channel. Then I declare a new channel — I'll call it the client-request channel. I'm declaring the type, and I'm actually saying that the size of the channel should be two. What that means is: by default, channels are unbuffered, so when you send a message, you block until somebody calls receive on that channel.
When I declare a size, I'm saying you can push up to this many messages before you block. We'll see why that's interesting in a second. Then I create the worker process. All I'm doing here is declaring a new proc — regular Ruby — which will loop forever. It takes a request object and calls receive on it; as you can guess, that's going to be a channel. Basically, we have a worker that sits there, listens, and receives. It'll sleep for one second after receiving something, and then it'll push a message back onto the request's reply channel, where it'll print the current time, take the argument, and add one. Pretty simple stuff: sleep, increment, and add a timestamp. Then I start two of these guys. I'm just using some Ruby syntax here: instead of calling go with two different blocks, I'm capturing the block and passing it in, which is why I declared the proc at the top. So I'm starting two of these workers, and now we have two threads in the background. Next, I create two incoming requests. Here I'm simulating this by hand, but you can easily picture how you would attach a network socket to this and create these things that way. So there's request one and request two: the first one's argument is one, the second one's argument is two. The second field of our request object is a channel — a new channel attached to it — and the type of that channel is going to be String. Then I take my client-request channel, which is the channel the workers are listening on — both of them are running in the background right now, and both have called receive on this channel, so they're both waiting for a message. All I'm doing here is taking the two requests I've created and shoving them into that queue. And then I call receive.
And what happens is: because the client-request channel is sized at two, the moment I put both requests into the queue, both workers pick up a message, sleep for one second, and print. Not surprisingly, if you look at the timestamps that come back, we see they both executed at the same time, and we get our results back, which are two and three. So what we've done here is: we have a main loop, we have two workers, and we have a way to receive messages in a thread-safe way. And there's not a single synchronization point in this code. Nowhere do I declare a join or a wait, or guard some shared state, or anything like that. It's much simpler to write, much simpler to reason about, and much easier to test, in fact — because the model just does not permit you to make certain sorts of mistakes. I can tell you that while I was writing this — I was implementing all of it with threads underneath — at moments I ran into exactly those edge cases myself. But once you have the model, writing these things is very, very easy. So of course, the question now is: this is Ruby — what do we do about all this stuff? We have many different kinds of Rubies. We have JRuby, which doesn't have a global interpreter lock; we have JVM threads, which is great. There's in fact a lot of existing work that's been done on the JVM that can be leveraged in JRuby — there are frameworks, Akka and others, which have already implemented the actor model on top of the JVM. So there has been some work on how we could expose those primitives within JRuby, and I think that's really interesting. But I'm yet to see something that is easy to use — even right now, in order to use any one of those libraries, you still have to go through a page's worth of boilerplate Java code to get something up and running, which just feels plain wrong.
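For readers following along without the gem, the shape of that worker example can be approximated with nothing but Ruby's stdlib — a SizedQueue standing in for the buffered channel and plain threads standing in for goroutines. This is not the agent API, just the same pattern in bare Ruby, with the sleep and timestamps dropped to keep it short:

```ruby
Request = Struct.new(:arg, :reply)  # an argument plus a reply channel

requests = SizedQueue.new(2)        # "buffered channel" of size 2

workers = 2.times.map do
  Thread.new do
    loop do
      req = requests.pop            # receive on the shared channel
      break if req == :done
      req.reply << req.arg + 1      # push the result onto the reply channel
    end
  end
end

r1 = Request.new(1, Queue.new)
r2 = Request.new(2, Queue.new)
requests << r1                      # both workers race to pop these;
requests << r2                      # note: no locks appear anywhere

puts r1.reply.pop  # => 2
puts r2.reply.pop  # => 3

2.times { requests << :done }       # shut the workers down cleanly
workers.each(&:join)
```

The per-request reply channel is what replaces the usual join/condition-variable dance: the caller simply blocks on `reply.pop` until its own answer arrives, no matter which worker produced it.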
But that's one of the interesting areas to explore — I think JRuby is a great platform to experiment with some of these ideas and concurrency models. Rubinius is also an interesting one, because there's work being done on the hydra branch to remove the global interpreter lock. And Rubinius actually has built-in channels and other such primitives — a lot of the internals of Rubinius are built around this channel idea. I don't have hands-on experience with it, but it seems like this would be a very natural fit. In fact, lately there's been a lot of talk about using Rubinius for building other languages on top, so I think it's very fertile ground for experimenting with these ideas. Maybe it's an extension to Rubinius, maybe it's a new language entirely — I don't know, that's up to you guys. MacRuby has Grand Central Dispatch. I see that as halfway there: it makes thread handling and all that stuff a little bit easier, but it doesn't expose an actually better concurrency model. So it simplifies things somewhat, but it's not the answer. I do think that at a certain point we will see MacRuby on iOS, so maybe there's a reason to invest in that ecosystem as well — I think you could take threads plus Grand Central Dispatch and build a higher-level API into something really interesting. And finally, unfortunately, I think MRI Ruby is kind of the big loser in this race right now, because it has a global interpreter lock. There is some talk about MVM, a multi-VM implementation, but as far as I understand that's barely at the research phase at the University of Tokyo — I'm not sure how much real work is being done in terms of getting it into the language, and getting it into the language would be a whole other proposition. I have a feeling that's just not going to happen anytime soon. Maybe there's room for libraries like agent, the one I built. Do I use agent in production today? No, I don't.
Do I think it's actually feasible to use it? I think so. But you wouldn't get the multicore benefits — you would get much cleaner code, testable code, safe code, but not the multicore aspect, so that's not as good. Of course, I don't think we should limit ourselves to Ruby. In order to really understand what these advanced concurrency models are, I would encourage everybody to go beyond Ruby at this point and explore some other possibilities. I've had a lot of fun working with these languages. There's Io, which is a fun language — honestly, it's a weekend project for you guys, because it's a prototype-based language, very similar to JavaScript if you're familiar with that concept; you can read through the documentation in a couple of hours. Its primitives are built around the actor model, so I'd encourage you to play with it. There's Go, of course, which some of you have already picked up — I'd encourage you to look at goroutines and just work through the examples there. In fact, I've taken the examples from Go, put them into agent, and basically re-implemented all of them in Ruby as well; that was my way of learning all of these concepts. So you can pull up the repo, look at the Go code and the Ruby code side by side, and just play with it — look under the hood, look at how it's implemented. Scala is very popular nowadays. Clojure has some interesting concepts — it's a functional language, it has software transactional memory, all these kinds of things. So I think before we make any sort of advances in the Ruby ecosystem, we really need to understand what's happening elsewhere, because there are trade-offs in every model and you need to find what works for you first. So, in summary: there is hardware parallelism. It's not about threads versus events — and as I've said, I've poured enough gasoline on that topic myself — it's not one or the other. You need all of them.
You need threads, you need events, you need epoll, kqueue and all the rest. But what we're missing right now is that extra layer — the actor model, CSP and all the rest — and I think that's where we should invest our time. I don't think threads and events are the right API for building concurrent programs in the future. I think they expose an API that is error-prone, that is easy to screw up — easy to shoot yourself in the foot with, basically. We need something better. There are a couple of blog posts I've put up around this, because I've been exploring this topic, and I would really encourage you to go out and follow some of the links in this presentation — I think it's a very important topic for us moving forward. And finally, do take a look at agent, play with it, see if it makes sense. There are examples in there, and I would encourage you to look at the specs, because they specifically spell out all the behaviors these primitives should provide — you can walk through those and wrap your head around the semantics. So that's it. I don't know if I have time for questions — a few minutes? [Audience question about shared resources.] So the question is: if you have shared resources, how do you solve that problem when you still need to share them? Right. The constraint this model places on you is that you communicate with messages. So instead of passing a pointer around, there's never a point where two entities — whatever the entity is, a process or a thread — have access to the same resource at the same time; the model basically disallows it. Sure, you could probably create a synthetic scenario where you try to do that, but that's kind of going against the grain. Now, I should mention that Go is designed to be a systems language, so they've actually biased it towards building high-performance systems within the same host, not towards the distributed case.
So what happens is: when you pass a message, they don't necessarily copy the message — they pass a reference, which is obviously faster. But conceptually, you're passing ownership to the other process: once you pass it to somebody else, you don't have it anymore. That's the abstraction put in front of you. So sure, you can probably still create scenarios with deadlocks and all the rest, but it's — I'm going to say — exponentially harder to do so. [Audience question about books.] So the question is: I've recommended Seven Languages in Seven Weeks — are there any books specifically about concurrency? You know, one of my first slides had a whole array of different books, and honestly, I can't think of any one book that covers this specifically. I wish there was a book that talked about all these different models. I wish my university had actually mentioned the actor model, or something beyond threads — that's just not the case. I think we're too focused on the hardware and software parallelism, too focused on the pthreads and the epoll and the kqueue stuff. There's a ton of material around that, and not enough material on the layer above it, which is the actor model, CSP and the rest. I think the state of the art in that field right now is actually the languages being invented at the moment — Go, Io, Clojure and all the rest. So the best way to learn is to actually pick up some of the code samples and run them today. I hope we'll fix that soon and get more material on this — that's certainly my aim in giving this talk: to open this topic up and encourage you guys to go out and play with this stuff. [Audience question: is agent tied to any one interpreter?] It runs on MRI. It runs on JRuby, and it's able to take advantage of the absence of a global interpreter lock in JRuby.
So there's actually a benchmark example that will run faster on JRuby, because it can use the extra cores. It doesn't work today on MacRuby, because their threading support is not fully implemented — it basically just hangs MacRuby; the library actually exercises quite a few features. Rubinius I haven't tested, but in principle I'm not using anything specific to any one implementation — it's just the basic thread primitives underneath, so it should be able to run across all of the platforms. [Audience question: you said that agent blocks over a certain size?] Right — channels can be buffered or unbuffered. When you declare a channel, you can specify the number of messages it can hold — say, you can keep up to 50 messages in the queue. When you try to push the 51st message, it'll say: wait, hold on, stop — it'll pause that thread and keep it there. The semantics of that depend completely on your application. By default, a channel is unbuffered, but you can declare it to be something else. So, for example, if you want a work queue with a fixed set of workers, you could say the size of my channel is ten. And that's it — thank you.
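That buffered-channel behavior can be sketched with Ruby's stdlib SizedQueue, which blocks the sender in exactly the way described once the buffer is full (this is an analogy, not the agent gem's channel implementation):

```ruby
channel = SizedQueue.new(2)            # buffered "channel", capacity 2

channel << :a                          # fits in the buffer, returns immediately
channel << :b                          # buffer now full

sender = Thread.new { channel << :c }  # this push blocks...
sleep 0.1                              # give the sender time to hit the limit
puts sender.status                     # => "sleep" (paused on a full channel)

channel.pop                            # a receive frees one slot...
sender.join                            # ...and the blocked send completes
puts channel.size                      # => 2 (:b and :c remain)
```

Swapping SizedQueue for a plain Queue gives you the unbounded case instead, where sends never pause the producer — which is exactly the back-pressure trade-off the channel size controls.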