This is Concurrent Programming with Celluloid. I work on the architecture team at LivingSocial. I wrote this library called Cool.io, which is, like, Node.js a year before Node.js, in Ruby, if you haven't heard of it. I also made an actor library entirely based on fibers called Revactor. That had fibered I/O two years before EM-Synchrony. I also made a language called Reia on the Erlang VM. That's where I first started playing with concurrent objects, and I kind of stopped working on it after José Valim made a very similar language called Elixir, which was, like, ten times as awesome as Reia. But Reia was really where I got my start with concurrent objects. I thought about talking about either Revactor or Reia when I was here last in 2008, and, you know, I just had doubts, right? So after moving on from Reia, I was still a huge fan of Erlang. I think those guys were onto something earlier than everybody else. But if I can't get the Erlang to the Ruby, perhaps the Ruby can come to the Erlang. So one thing I would really love to change is this idea that Rubyists just don't like threads, that they would rather use processes. Can I get a quick raise of hands of people who really enjoy doing multithreaded programming in Ruby? Yeah, that's about what I was expecting, all right? So, I mean, this isn't something we can ignore, right? We've hit the power wall: getting the CPU clock higher is hard now, but adding more cores for parallelism is comparatively easy. We now have 16-core AMD Bulldozer CPUs. So multicore is the future, and threads are important. And we in the Ruby community are blessed to have two Ruby implementations that don't have a GIL and give you full thread-level parallelism. You can still do parallel blocking I/O with YARV. So even with Ruby 1.9.3, Celluloid is still useful for parallel I/O, but it won't get you parallel computation.
So my main question to everyone here: when we have 100-core CPUs, are you going to run 100 Ruby virtual machines, or are you going to run one? One of the projects that's gotten a lot of attention lately, and I think has really driven this point home, is Mike Perham's Sidekiq. The tagline there says it all, I think. And if you weren't aware, it's built on Celluloid. So let me dig a little bit into Revactor, because I kind of went and built this thing before everybody else, and I thought it was pretty cool. It predates NeverBlock, and Dramatis, which was another actor framework for Ruby. rack-fiber_pool seems to be the main way people were using fibers, just to try to use fibers with Rails. And then there was EM-Synchrony, which did fibered I/O. But Revactor was inspired by a lot of other libraries. Obviously Erlang; MenTaLguY had the Omnibus concurrency library; and there were two libraries for doing sort of fibered I/O in Python. But Revactor was still single-threaded, and it had a bad API. I'll let you take a look at this code for a little bit and judge it. To quote Gary Bernhardt there, my reaction is: WAT. I'll let you look at it again and maybe form a few more opinions. But it feels really procedural to me. Erlang would like for you to call it functional, but it uses a lot of side effects, so it's more procedural to me. I think it's just ugly, and, yeah, what's that T thing, right? So I wanted to do an actor framework that was less like a Ruby DSL for Erlang and just more Ruby-like: use object-oriented programming. I think we can make a much better actor library for Ruby. So, what is Celluloid? Celluloid is a pure Ruby concurrency library. Let me give you a contrived example here. This is basically a nested hash data structure. You see it lazily initializes new levels of nesting using ||=, which is pretty dangerous to do in multithreaded programming.
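The slide code isn't reproduced here, but a hand-mutexed nested hash along these lines illustrates the starting point; the class name and structure are my reconstruction, not the actual slide code:

```ruby
# A lazily-nesting hash guarded by a single mutex. Every method has to
# remember to lock -- exactly the boilerplate Celluloid lets you delete.
class NestedHash
  def initialize
    @hash  = {}
    @mutex = Mutex.new
  end

  # Lazily create intermediate levels with ||= -- safe here only because
  # the mutex serializes all access.
  def []=(*keys, value)
    @mutex.synchronize do
      level = keys[0..-2].inject(@hash) { |h, k| h[k] ||= {} }
      level[keys.last] = value
    end
  end

  def [](*keys)
    @mutex.synchronize do
      keys.inject(@hash) { |h, k| h.is_a?(Hash) ? h[k] : nil }
    end
  end
end
```

The pitch in the talk is that with Celluloid you delete the Mutex, `include Celluloid`, and keep the same interface.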
So we can just throw mutexes around everything, and that's okay, right? And there's a little hypothetical shell session of using it there. What Celluloid lets you do is just delete all those mutexes, include Celluloid, and you're done, and you end up with a much clearer program without the need to manually lock everything. So how does that work, right? You might think it does some sort of automatic locking system, that we can just automatically throw locks around every single method and it'll be fine, right? But that doesn't work. There's a set of classical concurrency problems centered on this very issue; you can't solve a lot of these complex problems with just a lock alone. So the solution Celluloid uses is to combine object-oriented programming and concurrency. This is my favorite quote from Alan Kay here. It's at the top of the Celluloid project, and I've used it on several other projects, including Revactor and Reia. There's this idea that objects are kind of like network servers, and that really strikes home with me. So what Celluloid does is combine the object-oriented tools, like classes, inheritance, and messages, with the concurrency tools, like threads, locks, and queues, and when you combine them together, you get concurrent objects. I will dig into this slide in depth a little bit later, but there's a general overview of how it works: you make calls to a proxy object, and those get translated into message objects that get sent through these mailboxes. The mailboxes are the basic exchange point between threads. So there's this idea of communicating sequential processes, and the actor model is extremely similar. This has been used to build concurrent software for a long time, and the neat thing about it is you don't need immutable state.
So immutable state is great, and there are a lot of functional languages doing great stuff with it, but there's this other idea of encapsulation: if we encapsulate state and use messages, we don't need immutable state, and we can actually have local mutable state. So concurrent objects are more of an abstraction on top of communicating sequential processes, and they provide this sort of generalized deadlock-free synchronization system. I mean, that's a tall order, right? It's not going to fix every deadlock, but the basic theoretical model avoids deadlocks. And this is something that neither Erlang nor Akka can do for you. So I thought this was an original idea of mine, and then I discovered Python did it. I found this paper, Concurrent Object-Oriented Programming in Python with ATOM, and it just blew my mind. They had built a very similar system, and they had done it in 1997. I don't know how much you remember of 1997, but computers kind of sucked back then. We've got our basically 233 MHz Pentium there, where they're running benchmarks, in 256 colors. These guys were way ahead of their time: they were writing concurrent software before we had multicore CPUs. In 1999, I was off revolutionizing the state of X11 CD players. So, I mean, this whole idea of concurrent distributed objects was huge in the early '90s. We had things like CORBA, right? And NeXT had Portable Distributed Objects. And then the web showed up, and it's like someone sucked all the air out of the room. This entire field of research practically vanished. I think we got distracted by the web. The web is a very powerful system, but at the same time, I think there are simpler ways we could do things, simpler ways we could build distributed and concurrent software without using web technologies.
That would be a lot simpler. So, how does Celluloid work? This is a shell session with that concurrent nested hash I showed you earlier. Pay particular attention to the Celluloid::Actor there. What Celluloid does is hijack the new method. When you make a new object with Celluloid, what it's actually doing is putting that object inside one of these actors and handing you that proxy object I was telling you about earlier. It actually does that even before initialize is called; every single method is handled the same way, including initialize. It supports synchronous calls. Alan Kay talks a lot about late binding, and Celluloid takes it to an extreme. When you send messages to the proxy object, it's actually translating them into these crazy thread-synchronized calls that go out to a completely different thread, which will calculate the value and send you the response back. Here, doing inspect, for example, is a synchronous call. Celluloid also does asynchronous calls. With asynchronous calls, we just don't wait for the result; it's a straight-through message to the actor. "Fire and forget" is the typical way Erlang describes this. The way Celluloid implements this is by putting a bang onto any method. In this case I'm doing send, but it doesn't have to be send, right? It could be inspect. This is kind of like the next-tick concept from EventMachine or Node.js, if you're familiar with either of those: it lets you schedule work inside of an actor, but it doesn't run immediately. So, let's say you wanted to do that, but then you want to get the return value back, right? Celluloid provides this third capability here, called futures. What futures let you do is get a handle to the return value, and then you can go off and do whatever computation you want there. So you could do some more awfully large integer arithmetic there.
And then, when you're done, you just call value on the future object, and you get the result back, just like you would if you'd made a synchronous call. So, yeah, these are the basic ingredients of Celluloid: we have the normal synchronous calls, the async calls, and futures. So I'm sure you're wondering, what's the secret sauce? How do we prevent deadlocks? Erlang couldn't really solve this problem either. Erlang does everything with messages, but rather than deadlocking, Erlang does timeouts instead. So they didn't really solve this problem; they kind of punted on it, and they actually gave up on the whole idea of remote procedure calls in general. So this is something Erlang can't do. So, why do deadlocks happen, right? What we're actually doing when our code deadlocks is waiting for an event that never fires, instead of, you know, whatever event is actually happening that we should be handling. So what we really need is a way to wait for everything simultaneously. So, how can we do that, right? What? "Do, or do not. There is no try." Yoda. So we have this idea I was talking about before, communicating sequential processes, and then we have fibers, this cheap, suspendable, resumable execution context. If we put them together, we can have concurrency that does one thing at a time. So rather than blocking anywhere, for example when an actor makes a call to another actor, it just suspends to the scheduler. There are a handful of other things that do this, including directly receiving asynchronous messages, sleeping, and... I forget the third thing, Rick Perry style there, all right. All right, so here is an example program where a deadlock might occur.
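The three ingredients listed above (synchronous calls, async calls, futures) can be sketched with nothing but stdlib threads and queues; this ToyActor is an illustration of the mailbox idea, not Celluloid's actual implementation:

```ruby
# Toy actor: a thread draining a mailbox Queue. The mailbox is the
# exchange point between the caller's thread and the actor's thread.
class ToyActor
  def initialize(obj)
    @obj = obj
    @mailbox = Queue.new
    @thread = Thread.new do
      loop do
        meth, args, reply = @mailbox.pop
        result = @obj.send(meth, *args)
        reply << result if reply    # sync/future calls get a reply back
      end
    end
  end

  # Synchronous call: send a message, block until the reply arrives.
  def call(meth, *args)
    reply = Queue.new
    @mailbox << [meth, args, reply]
    reply.pop
  end

  # Asynchronous "fire and forget" call: no reply channel at all.
  def cast(meth, *args)
    @mailbox << [meth, args, nil]
  end

  # Future: returns a handle now; pop it later to get the value.
  def future(meth, *args)
    reply = Queue.new
    @mailbox << [meth, args, reply]
    reply
  end
end
```

A quick session: `actor = ToyActor.new([1, 2, 3])`, then `actor.call(:size)` blocks for the answer, `actor.cast(:push, 4)` returns immediately, and `actor.future(:max)` hands back a queue you can pop whenever you're ready.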
It might be kind of hard to see how this would cause a deadlock, but basically we have two concurrent objects here, and as you see, inheritance also works, which was one of the major sticking points of concurrent objects in the past. But they're trying to call each other. So if you look at Mike, right, Mike is trying to greet the other actor. He's greeting Joe there, and then Joe calls name back on Mike. So, I mean, that's the output there, right? Like, cool story, bro, but it is a cool story, right? What we have here is a circular call graph: Mike's calling greet on Joe, Joe is calling name on Mike, and communicating sequential processes can only do one thing at a time. Since Mike is calling Joe here, he can't answer the request for his name. So we get a deadlock, right? I'll let you look at this here. Celluloid has a special magical way to fix this. I'll let you look at that for a little bit longer and see if you can think of the path through the code, but it ends up looking a little bit like this. What Mike can do when he calls Joe is actually suspend back to the scheduler, and then when Joe needs to make a re-entrant call back into Mike, Mike can actually create a brand new fiber, handle that call, complete it, and then get back to the original method. By doing this, Celluloid manages to avoid deadlocks. The waiting tasks suspend themselves, and the ready tasks run. So every single method call to a Celluloid object creates a fiber. That might sound a little bit scary, might sound slow, but actually I think it's fast enough to be useful. So, here are some numbers here. JRuby and Rubinius end up going a lot faster than YARV there, so having real concurrency actually helps a concurrency framework. But, you know, on JRuby there I'm getting 50 microseconds, so you can make about 20 calls in a single millisecond.
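The fiber trick just described can be sketched with bare fibers: the waiting call suspends, the re-entrant call runs in a fresh fiber of its own, and the original resumes afterwards. This is a minimal illustration of the mechanism, not Celluloid's actual scheduler:

```ruby
# Each method call runs in its own Fiber. When a call has to wait, its
# fiber yields back to the scheduler, so a re-entrant call can be handled
# in a brand-new fiber in the meantime.
events = []

# Joe calling name() back into Mike gets its own fresh fiber.
reentrant = Fiber.new do
  events << :reentrant_call_handled
end

# Mike's original greet() call: starts, suspends while waiting on Joe,
# and finishes only after the re-entrant call has been serviced.
original = Fiber.new do
  events << :original_call_started
  Fiber.yield                        # suspend back to the "scheduler"
  events << :original_call_finished
end

# A minimal scheduler: run the task until it suspends, service the
# re-entrant call in its own fiber, then resume the original task.
original.resume
reentrant.resume
original.resume

# events == [:original_call_started, :reentrant_call_handled,
#            :original_call_finished]
```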
So that's kind of slow, but I think it's fast enough to be useful, right? So, what if an actor crashes? This is something that Erlang, again, handles very well. Erlang has a really simple way of solving this: if your code is broken, you just crash everything and resume in a clean state. So these are the features of Celluloid for handling fault tolerance. It has supervisors, so if any actor crashes, you can restart it. The basic idea is you don't want to handle errors; you want to just crash and restart in a clean state. And pretty much everywhere I've deviated from Erlang, it's ended up kind of burning me, so I'm pretty much trying to do exactly what Erlang does everywhere. So the second little part of this is Celluloid::IO. This is an evented I/O binding, kind of similar to EventMachine or Node or something like that. I think Celluloid provides a great abstraction for threads, but what if we're doing I/O, right? A lot of people are asking me, "How do I use Cool.io or something with Celluloid?" What I actually say is you probably want to be using blocking I/O most of the time. Blocking I/O is way easier to reason about; it's simple, it's straightforward, and it's code you can read. There are no callbacks, that kind of thing. You can do blocking I/O inside of any Celluloid actor, and it's just fine. What I wouldn't recommend is indefinite blocking calls, where you don't know the call is going to complete, either with success or an error. Say you have a listener socket, right? That can block indefinitely; you probably don't want to do that. The other place this can be a problem, which I've seen people doing, is talking to external services using blocking I/O, particularly ones that use locks. If you have a deadlock in MySQL, right, it's going to spread out into Celluloid there.
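The crash-and-restart idea behind supervisors mentioned above can be sketched in plain Ruby; ToySupervisor and FlakyCounter are hypothetical names for illustration, not Celluloid's actual Supervisor API:

```ruby
# A counter that "crashes" (raises) after two bumps -- a stand-in for a
# buggy actor.
class FlakyCounter
  def initialize
    @count = 0
  end

  def bump
    @count += 1
    raise "boom" if @count > 2
    @count
  end
end

# Toy supervisor: instead of handling the error, discard the broken
# instance and restart it in a clean state, Erlang-style.
class ToySupervisor
  def initialize(klass, *args)
    @klass, @args = klass, args
    @instance = klass.new(*args)
  end

  def call(meth, *args)
    @instance.send(meth, *args)
  rescue StandardError
    @instance = @klass.new(*@args)   # crash -> restart in a clean state
    nil
  end
end
```

So `sup = ToySupervisor.new(FlakyCounter)` bumps to 1, then 2, then the third bump crashes and the supervisor quietly swaps in a fresh counter that starts over from a clean state.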
So, okay, that's great, right, but maybe you want more than blocking I/O can do for you. Here's what evented I/O is actually good for: a large number of connections, and I'm saying greater than a thousand. You can actually do a thousand with blocking I/O if you want, but I think that's about the number where you might start considering this. It's for mostly idle connections; if you're directly servicing clients, you probably want to give them a thread. And it's good for I/O-bound problems: if you're proxying, that kind of thing, it's great, and I think WebSockets are the ideal case. I think this is one of the reasons Node has taken off in recent history. So these actors are actually event loops, right? They can do a lot of this evented stuff; normal Celluloid actors have timers built in, for example. A normal actor looks like this: you have the actor object inside, there's a mailbox, and what it's actually waiting on for events is a condition variable. To get a Celluloid::IO actor, what we do is swap out the mailbox, and instead of using a condition variable, it actually has a reactor inside of it. That reactor is built on another project I made called nio4r. For nio4r, I basically took the Java NIO API, massaged it so it isn't quite as ugly as it is in Java, wrote native extensions in both C and Java, and there's also a pure Ruby reference implementation. So what Celluloid::IO gives you is duck types of the standard Ruby core IO objects. Just as a method call in Celluloid suspends to the scheduler, when you try to do an I/O operation that would block, it suspends to the reactor, in the case of a Celluloid::IO actor. So these can do evented I/O and threaded I/O, right? You can hand these objects around.
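The reactor sitting inside a Celluloid::IO mailbox is, at heart, a readiness loop. Here is a minimal sketch of one turn of such a loop, using only stdlib IO.select and a pipe; this is the general idea, not nio4r's or Celluloid::IO's actual code:

```ruby
# One turn of a reactor loop: IO.select waits on many IOs at once and
# returns whichever are ready, so a single thread can wait on
# "everything simultaneously" and then dispatch.
reader, writer = IO.pipe
received = []

writer.write("hello")

# Wait (up to 1s) for readability, then service only the ready IOs.
ready_readers, = IO.select([reader], nil, nil, 1)
ready_readers.each do |io|
  received << io.read_nonblock(5)   # safe: select said io is ready
end

# received == ["hello"]
```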
If it's inside of a Celluloid::IO actor, it'll be evented, and if it isn't, if you're using a normal thread or a normal Celluloid actor, it will do blocking I/O for you. Oh, sorry. So there are additional classes here that Celluloid::IO provides as replacements for TCPServer and UDP sockets; I haven't done Unix sockets yet. Let me show you an example here. Can you see that? Yeah, I guess it's okay. So, you know, when we come into this, it looks like a normal threaded actor server that you would write. It's making a TCP server there, but since we included Celluloid::IO, it's actually pulling that in from the Celluloid::IO namespace. So that's actually the evented server there. And then it's calling run with a bang. As I said earlier, that's an asynchronous method, so it schedules work in the reactor, initialize returns, and then the actor starts running that event loop. It accepts connections, and every time it does, it makes an asynchronous handle_connection call to schedule work in the actor again. And from there, it works just like a normal threaded server. So, yeah, it's like slide 150 here. Sorry about the poor artwork. But, you know, what I think is: we have this big ecosystem of TCP-socket-based libraries, right? And what the EventMachine advocates would want you to do is rewrite every single one of those in the async form. But what happens? That doesn't happen, right? Most people are writing normal synchronous libraries. So EventMachine ends up with a much smaller ecosystem of libraries, and then EM-Synchrony wants to take those libraries and have you wrap them all in fibers, so it ends up with this really tiny ecosystem of libraries. I think that's a bad idea. I think we should be able to use all of these TCP-socket-based libraries in an evented manner without rewriting them all from scratch. And the way to do this is through dependency injection.
So what we need to do to make this work is hand any of these libraries a Celluloid::IO::TCPSocket instead of the normal TCPSocket. And that's the kind of API you need to do that, right? There's no ceremony there; you're just passing it an option. Easy peasy. All right. So Reel is a web server I wrote using Celluloid::IO and http_parser.rb, which wraps the Node HTTP parser. Here's a hello-world benchmark. I'm not really a fan of these, actually, but I think they give you a decent idea of latency if nothing else. All right. So, I mean, it's decently fast, right? And to compare, here's Goliath. Reel actually ends up being faster than Goliath, but, you know, Thin is about 50 percent faster, and Node is about twice as fast. So it can also do 0MQ. This is like that Celluloid::IO actor I showed you before, but instead of a Celluloid::IO actor, we have a Celluloid::ZMQ actor. I built this on top of the ffi-rzmq library. It actually exposes probably the best 0MQ API you can get in Ruby today. That's no fault of ffi-rzmq; it's trying to be a low-level library. On top of Celluloid::ZMQ, I built another library called DCell. That's distributed Celluloid. DCell is mostly ready to use, but I ran out of time making these slides, so I think it probably deserves its own talk. The basic idea is it's an asynchronous DRb, and where DRb uses threads, DCell uses actors, so you don't have to worry about mutexes like you would with DRb. And then there are all these asynchronous patterns you can do with distributed programming that you can't with a synchronous system like DRb. So, Lattice is my sort of hypothetical vaporware web framework for Celluloid. What I'd like to do with it is reuse parts of Rails. I'm good friends with some of the Rails core people, and they're willing to help me figure out how to do this, so that's pretty cool.
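The dependency injection approach described above, where a library takes its socket class as an option, can be sketched like this; the :socket_class keyword and ToyClient are a hypothetical API for illustration, not a real library's option:

```ruby
require 'socket'

# A client library that accepts its socket class as an option can run
# blocking or evented without any rewrite: same code, different socket.
class ToyClient
  def initialize(host, port, socket_class: TCPSocket)
    @host, @port, @socket_class = host, port, socket_class
  end

  def connect
    # Whatever duck-typed socket class was injected gets used here.
    @socket_class.new(@host, @port)
  end
end
```

Inside a Celluloid::IO actor you would pass Celluloid::IO::TCPSocket as the option; everywhere else, the stdlib default applies and the library does ordinary blocking I/O.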
One of my big complaints with Rails right now is there is no multithreaded development mode, and because of this, I think people are afraid to write thread-safe applications, because they're running them differently than how they were developed. So it's definitely a problem I want to solve. It's pretty hard, but I have some ideas. And then, I think this can really help: if any of you saw Raffi Krikorian's talk, he claimed Ruby can't do scatter-gather programming, which is completely ridiculous. I want to make it very, very easy for people to build service-oriented architectures in Ruby, and to be able to aggregate services in parallel and give you responses faster than going to services one at a time. So, that's it. Bye.
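The scatter-gather pattern mentioned above can be shown with plain stdlib threads (Celluloid futures would look much the same); fetch_service is a stand-in for a real service call, not an actual API:

```ruby
# Scatter-gather: fan out to several "services" in parallel, then gather
# the results, instead of calling each service one at a time.
def fetch_service(name)
  sleep 0.05                 # simulate network latency for one service
  "#{name}-result"
end

services = %w[users orders inventory]

# Scatter: one thread per service call, all running concurrently.
threads = services.map { |service| Thread.new { fetch_service(service) } }

# Gather: Thread#value joins each thread and returns its result, so the
# total wall time is roughly one latency, not three.
results = threads.map(&:value)
```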