Good afternoon, everybody. I've always wanted to give this talk because I worked with Erlang and Elixir for about a year. At Housing.com I was part of a team that developed a set of communication applications. We leveraged the MQTT protocol (there's a broker written in Erlang called EMQTT) and we built a set of real-time apps which consumed off the MQTT feeds and responded to user analytics events. For the past year I've been at Amazon, and my day job is to write Spark jobs that look at the Amazon catalog of selection data, figure out the relationships between items, figure out duplicates: your typical big-data Spark applications. So I figured I'm at a very interesting spot where I've seen both ecosystems, and I thought: apart from the surface-level similarities, where do these systems differ? And if you were to make a choice for your particular application, what would you look into, and how would you benchmark or examine it further? That's basically the whole point of this talk. I'm going to divide this talk into two main sections. The first section is a quick summary of the actor model. It's important because even though the actor model is pretty intuitive, there are certain things in it that we need to watch out for, things we can compare in both implementations. Once we have that down, we'll go into the four characteristics of both systems. So let's start with the actor model. We're all aware of why we need concurrency and parallelism these days. The clock speed of our CPUs has reached a sort of plateau, and multi-core systems are the new norm. You need to be able to model your programs to exploit the capabilities the hardware now has. So yes, we've always had threads; we've always had the processes that the OS provides.
But at a higher level, when you want to actually achieve some task, say a chat system which is fair to all users and maintains a certain tail latency, you want higher-level primitives than a raw thread or a process. That's basically the whole motivation behind looking into the actor model and similar models like CSP. Ever since we've had Linux, and even before that with Unix, we've had this notion of processes and threads. That's pretty much the closest possible abstraction to the hardware below us. And it actually surprised me that up until 2008 or 2009, this was the prevalent model. It's only now that multi-core systems are really the new norm that these languages have gained some traction. So yes, we've always had POSIX threads, we've had locks, we've had the compare-and-swap instructions the CPU gives us. But we wanted to know what more we could do. Assuming we have a higher-level language that lets us tackle these problems, what would we look for in that language, in that model? The first thing, which is very obvious, is throughput: if you have multiple cores, you want to get more stuff done. Once you have throughput, you don't want a certain kind of work getting starved; you don't want the tail latencies of the system being impacted. So the second thing you want is fairness in latencies. The third thing you want is data integrity. When you have multiple threads accessing a single location, a single cache (and you saw in the first talk on compute architecture how many levels of cache there are and the complexity involved), you want very safe heuristics for when you can access shared memory and shared resources. So you want safety, data integrity.
And the fourth thing that's important to me is explainability. I should be able to explain why this doesn't deadlock, why this livelocks, why this is actually concurrent and not serial, not sequential. So when I'm looking at any actor implementation, at any technique to achieve concurrency or parallelism, I'm looking at these four things. To go further, let's take a concrete problem. You've probably seen this in college; it's a classic. Anyone? The dining philosophers, right? I picked this problem because it encapsulates pretty much what I set out in the last slide, and if we implement a solution to it, we'll be able to figure out how the two systems differ. To briefly summarize what's happening here: there are five philosophers sitting around a circular table, with adjoining philosophers sharing forks, and each philosopher needs to eat. Each philosopher needs access to both of his forks, and once he's eaten, he goes and does some work. For our purposes, that work could be of three types. He could be doing some IO, say reading a book. He could be using up CPU cycles, say tokenizing the string contained in that book. And finally, he could make a network call, where he would be summarizing the book and posting it to some blog. So basically two things: first, he waits for two forks to be available (that's just a synchronization point), and once both are available, he sets about doing his task. This is also a good fit for what we're going to discuss, because each philosopher is independent.
You would ideally want each philosopher to be doing his own thing, and if you have a multi-core processor, you want all the philosophers to actually be parallel, not just concurrent. So let's use this example. First, let's look at what an actor is theoretically. Like I said, an actor is an independent entity that can only communicate with the external world through message passing. It has certain private state, it can accept messages, and it can spawn other actors based on some message it received. Basically, you can think of an actor as something that reacts to a stimulus, which is a message. All state is self-contained; that state is not exposed anywhere else. With that, let's see how the dining philosophers problem maps to an actor world. Intuitively, the philosophers are going to be actors, because they do some work and they accept messages. One solution to the dining philosophers problem, the easiest to explain right now, is to have a central arbitrator, a waiter, who decides which forks go to which philosopher. So you can think of the philosopher as an actor who asks the waiter: give me two forks. The waiter looks at which forks are free and either hands them over or says no forks available. The critical thing happening here is that the arbitrator needs to synchronize access to the forks. He cannot give out one fork while another philosopher is asking for two; you cannot have that kind of data incoherence. So the arbitrator decides who gets the forks, and the philosophers do the work. If you map this to the actor model, you can see that the arbitrator is actually enforcing serialized access to the forks.
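The waiter-as-arbitrator idea can be sketched in plain Java. This is a single-threaded toy with illustrative names (`Waiter`, fork indices), not code from either framework: every request lands in one queue, and processing that queue sequentially is exactly what serializes access to the forks.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Toy arbitrator: all fork requests go through one mailbox and are
// processed one at a time, so fork state can never be seen half-updated.
class Waiter {
    private final boolean[] forkFree;
    private final Queue<Integer> mailbox = new ArrayDeque<>(); // philosopher ids

    Waiter(int seats) {
        forkFree = new boolean[seats];
        Arrays.fill(forkFree, true);
    }

    // A philosopher asks for his pair of forks (left = id, right = id + 1).
    void request(int philosopher) { mailbox.add(philosopher); }

    // Process one queued request; true = both forks granted, false = denied.
    boolean processOne() {
        int p = mailbox.remove();
        int left = p, right = (p + 1) % forkFree.length;
        if (forkFree[left] && forkFree[right]) {
            forkFree[left] = false;
            forkFree[right] = false;
            return true;
        }
        return false; // "no forks available"
    }

    // The philosopher is done eating and hands both forks back.
    void release(int p) {
        forkFree[p] = true;
        forkFree[(p + 1) % forkFree.length] = true;
    }
}
```

With five seats, philosopher 0 and philosopher 1 share fork 1, so if 0 is granted first, 1's queued request is denied until 0 releases.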
So basically the arbitrator has a queue, every philosopher can put in a request, and those requests are processed sequentially. Similarly, the philosophers themselves have state, and once they're done, they tell the waiter: okay, I'm done, take back these forks. So you're reducing the general arbitrariness of locks and scheduling to the queues that each individual actor possesses. That's the intuition behind the actor model and how it applies to a lot of problems we see today. I think with that we're pretty much done with the general actor model, and we've figured out that the queue is important: it's what serializes access, and it's pretty much what triggers action. Now for the second part, the actual meat of the talk. We're going to see how the implementation of the queue differs in both systems, how scheduling differs, and the other things that come up when you look at this practically and not just as a theoretical concept. First among them is messaging and mailboxes. When a philosopher says "give me two forks," how does he send the message, and how does Akka pass that message versus how Elixir or Erlang passes it? You have two ways to send a message: you can either copy the message into the mailbox of the actor you want to talk to, or you can send a reference to it, and both have trade-offs. What Erlang does is a direct copy. Unless it's a large binary, it does a direct copy of whatever message you're sending into the mailbox of the receiving process. Akka doesn't do that; Akka passes a copy of a reference to whatever you're sending. But first, let's look at the pros of sending by copy. The first thing is very simple process isolation.
The process that received the message has everything required for its execution. It doesn't need to refer to a central heap; it has the whole message. Very simple to understand: process isolation. And if you want garbage collection, which like I said is a practical constraint, you need a copy of the message with each actor. You can't have per-process GC if messages point into a central heap location; you have to actually copy the message. And what is the cost of copying a message? Allocation: you allocate whatever space the message takes up. But when Erlang was being designed, allocation was treated as a solved problem. You could allocate a buffer up front from the OS and carve out space in it to copy the message into. So the gamble they took was: allocation is very common and very cheap, so it's worth the benefits, however big the message you have to copy is. And the cons are easy to reason out: memory consumption. If you're sending a huge record across, it actually has to copy that record, plus there's the actual cost of allocation in case the heap is full. Those are the cons. If you pass by reference, again it's pretty simple to reason out. You don't have the memory consumption; it's just a pointer to where the message is stored in the heap. But when you run the GC and want to clear out a message after it's processed, you need to reach across all your actors, figure out everywhere that message is referenced, and clear it out. And it's faster: just passing a reference is faster. So in Erlang, like I said, messages are copied, everything is immutable, everything can be pattern-matched. Pretty simple and straightforward.
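The copy-versus-reference trade-off can be sketched in Java (illustrative names; Java here stands in for both runtimes): sending by reference means a later mutation by the sender is visible to the receiver, while sending by copy isolates the receiver's view at the cost of an allocation.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of the two message-passing strategies discussed above.
class MessagePassing {
    // "Akka style": the receiver gets the very same object.
    static List<String> sendByReference(List<String> msg) {
        return msg;
    }

    // "Erlang style": the receiver gets its own copy (allocation cost).
    static List<String> sendByCopy(List<String> msg) {
        return new ArrayList<>(msg);
    }
}
```

If the sender mutates the message after "sending" it, only the by-reference receiver observes the change; the copied message stays isolated.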
And in Akka, a message send is ultimately just a method call, and method calls on the JVM pass a copy of the reference; that's what the receiving actor gets. It's pretty much the default Scala method call. One thing to note here: there is no enforcement of the immutability of that message. You can initialize a var and just pass it; there's no compile-time check that enforces immutability. You can imagine the problems this can lead to: you initialize a var, you pass it, you then edit that var, and the message that was passed really has no integrity anymore. But the convention is to pass case classes with your message in them. So in Akka, we stick to convention. That's messages. Now I think the critical part is the mailbox. Instead of having locks all across your system, you've concentrated the contention points into the mailbox; that's how the actor model works. So it's very important that the mailbox is efficient. It should have efficient enqueue and dequeue operations. Typically your enqueue operations are multi-producer (a lot of other processes are going to be writing to your mailbox) and there's a single consumer. That's the classic N-producer, one-consumer problem, and there are well-known solutions to it. We'll look at how both systems solve it. Erlang mailboxes are unbounded, and from what I've read, it's very possible to crash an Erlang application if you just keep sending messages to a mailbox that go unhandled. That's a fairly common failure mode. The second thing to note about Erlang mailboxes, which I think matters day-to-day, is that selective receive is supported. So you have your whole mailbox, and in your handle_call and handle_info you can write a receive clause with priority, right?
So what the BEAM virtual machine does is take each message, see if any of the receive clauses match, and if none match, put it in a different queue. It does that for the entire set of messages queued up in the mailbox. While this is really great for programmer expressiveness (you can say: I want to read a high-priority message first, or I want to set the status of something before I process any other message), it comes at a cost, because it's actually transferring all the unmatched messages into a separate location and then transferring them back once the receive is done. This can make your receives really slow if you have a lot of unhandled messages. Akka is very simple here, and this is one of the things I like about it: they have the constraint that it has to work on the JVM, that it has to work with whatever the Java ecosystem already provides. So for the mailbox, they just go with what's in java.util.concurrent: either a LinkedBlockingQueue or one of the other queues in that package. On top of that, it provides you some behavior. You can say the mailbox should be bounded to some size, or that beyond a certain size it should just reject messages. So there are settings you can apply on top of these queues, but underneath it's just one of the standard java.util.concurrent queues. And you have these configuration options: bounded, the typical unbounded, and then a priority queue, which is interesting: you tag every message with a priority, and messages are received based on that priority. You also have something called the dead letter mailbox.
That's where a message ends up if the target is gone: if a process, or a plain thread, sends a message to an actor that is dead, the message lands in the dead letter mailbox. Then you have things like capacity and the timeout. The timeout is interesting; correct me if I'm wrong, but I don't think there's any concept of a timeout on an Erlang message send. When you send a message to a process mailbox and there's a lot of contention on that mailbox, it can take a lot of time to get enqueued, and there's no concept of a timeout there. Here, you can specify a particular timeout, and after that the message send will just fail. You also have a capacity, which is helpful in preventing overload. So basically two queues and the different variants of them: LinkedBlockingQueue and ConcurrentLinkedQueue. So we have messages down and the mailbox down. What's left? I mentioned selective receive: if you want that functionality in Akka, you have to build it yourself. One thing you could do is buffer the messages from your mailbox in your internal state and then act on them. The other thing you could do is use become, which lets you install a different receive definition, a different receive partial function, to handle messages until your high-priority message arrives. So with the combination of become and stash: for any message that doesn't fit your priority, you just stash it, and once you get the message that does, you become whatever behavior handles the low-priority messages and unstash the buffered ones.
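The become-and-stash pattern just described can be sketched as a toy state machine in plain Java (this is not Akka's API; all names are illustrative): until the status message arrives, everything else is stashed; when it arrives, the actor "becomes" its processing behavior and replays the stash.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy version of stash()/become()/unstashAll(): messages that arrive
// before initialization are buffered, then replayed in arrival order.
class StashingActor {
    private boolean initialized = false;
    private final Deque<String> stash = new ArrayDeque<>();
    final List<String> processed = new ArrayList<>();

    void receive(String msg) {
        if (!initialized) {
            if (msg.startsWith("status:")) {
                initialized = true;           // become(processing)
                while (!stash.isEmpty()) {    // unstashAll()
                    processed.add(stash.poll());
                }
            } else {
                stash.add(msg);               // stash()
            }
        } else {
            processed.add(msg);               // normal processing behavior
        }
    }
}
```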
This is a pattern I've seen in a lot of Akka code, and it's far more intuitive in Elixir or Erlang. It's pretty clear what's happening here: I'm processing some messages, and I want a status code before I handle any of them. So I wait for the status code; only when it comes do I loop into the actual message-processing logic. If it hasn't come, I just stash. Something that's very simple in Erlang takes this machinery in Akka. Right, so the next important thing that comes up is scheduling. You have immutable messages, and you have the mailbox, which is the contention point between the different actors trying to write to it. So what's next is scheduling. This is something I mentioned earlier: you need to schedule a process once there is some stimulus for it. If there's no message for the process to act on, there's nothing to schedule. Both systems rely on a message hitting a mailbox to actually schedule that process. Before we go into the implementation of how scheduling works, here's a brief overview of how a typical Akka actor is defined, in case you've not seen one. On the Elixir side I've used GenServer for the comparison because, for the purposes of this discussion, it's the most relevant. This is a philosopher. It accepts state in its constructor arguments: it takes a waiter (the actor acting as the arbitrator, whom you ask for forks) and it has a particular name. When you extend the Actor trait, what you need to define is the receive function. The receive is a pattern match, pretty similar to what you'd write in Erlang. Here, the first case I pattern-match on is the forks being available.
Then I begin eating, I tell the arbitrator (the sender) that I'm done eating, and then comes the activity I mentioned: some file IO, some CPU usage, some network IO. And in case I ask the waiter for forks and none are available, I just go back to waiting. It's similar in Elixir, where you have two handle_info clauses that correspond one-to-one with what's happening in Akka. So like I was saying, messages are what determine when a particular actor should be scheduled. And if you look at the implementation, this is not something I expected: it's actually the mailbox that gets scheduled, at least in Akka. It's not the actor, as I would have thought. Just to repeat: messages can be thought of as stimulus; they are the trigger to actually schedule something. And when an actor gets a message, it should be scheduled as soon as possible in the interest of that actor. So what is the implication of this? How you schedule determines the two things I mentioned earlier. It determines throughput: do you keep one actor scheduled and drain all the messages in its mailbox? That gives great throughput if nothing is blocking. If you schedule one actor and there's no dependency, no blocking call (suppose the philosopher is just calculating prime numbers or something like that), you get great throughput, a lot of work done, but at the cost of the other actors being starved. If you don't schedule the other actors fairly, you have the latency issue I talked about. Scheduling is the key to getting an optimum of both: the more fairly you schedule, the lower your peak throughput may be, but the fairer the system is overall. It depends on what you want.
So we're trying to optimize for both throughput and latency, and you'll see how this differs between Akka and Elixir. The constraint is that both of them have to build on what the OS provides: the OS gives you threads and IO callbacks, and both have to build up from there. Again, one thing I like about Akka is how it uses something very familiar. If you've used Java, you've typically used a thread pool, an ExecutorService, and you submit tasks to it. Akka basically uses different types of thread pools to do its scheduling based on the task at hand. The term for the class that does this in the Akka world is a dispatcher. It decides which actor is going to get time on the CPU. By default, when you set up an actor system (which is sort of a root supervision tree that starts or spawns other actors), that's where you configure which dispatcher is going to handle it, and by default any actor spawned in that supervision tree gets that dispatcher. And for a number of reasons, which we'll go into later, you can specify a different dispatcher for a particular actor. So what does it have to work with? Again, the java.util.concurrent classes: you have the thread pools and the runnables, and you have to make an actor system out of that. So what do you have? When an actor processes something, it's a task. There's some behavior, encapsulated in the actor; there's state, also in the actor; and there's the message coming in. These three things together encapsulate a task: essentially the runnable.
And you have a pool of threads to run on, following the thread pool interface. The key is to load-balance across these tasks and ensure fairness. So our task at hand is to convert the actor paradigm into something that can run on top of runnables and thread pools. What Akka does, like I mentioned earlier, is model each actor's mailbox as a runnable. If you think about it, an actor is basically a closure over some functionality: instead of taking the message to a particular function, you take the function to the mailbox. So you have a queue that the thread pool maintains, and any actor with an incoming message, whose mailbox is non-empty, gets queued there, in the order the messages came in. And what does the actor actually do? It executes the receive statement. When a message is dequeued from the mailbox, it takes that message and the current state and applies the receive function to them. That's pretty much what Akka is doing. So sending a message to an actor, something very commonly done and encouraged in actor systems, is really two things. The first is appending to the actor's mailbox, contending with other threads, in a safe, synchronized manner. The second is putting that mailbox onto a task queue that one thread in the thread pool can pick up. And that should do it. That's basically the translation of what we theoretically want into the Akka model. So let's go back to the example and take a look at this; there's one part I had commented out. We're doing the three things I said: some file IO, some CPU hogging, and some network IO. And as you can expect, these are blocking.
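The send path just described (append to the mailbox, then put the mailbox itself on a thread pool as a task) can be sketched as a toy in Java. This is an illustration of the mechanism, not Akka's actual implementation; all names are made up.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Toy "mailbox is the Runnable" actor: tell() enqueues the message and,
// if the mailbox isn't already scheduled, submits it to the pool, where
// it drains messages through the actor's receive function.
class MiniActor {
    private final Queue<Integer> mailbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final Consumer<Integer> receive;
    private final ExecutorService pool;

    MiniActor(ExecutorService pool, Consumer<Integer> receive) {
        this.pool = pool;
        this.receive = receive;
    }

    void tell(int msg) {
        mailbox.add(msg);                         // 1. append to the mailbox
        if (scheduled.compareAndSet(false, true)) {
            pool.submit(this::drain);             // 2. schedule the mailbox
        }
    }

    private void drain() {
        Integer msg;
        while ((msg = mailbox.poll()) != null) {
            receive.accept(msg);                  // apply receive per message
        }
        scheduled.set(false);
        // Re-schedule if a message raced in after the last poll.
        if (!mailbox.isEmpty() && scheduled.compareAndSet(false, true)) {
            pool.submit(this::drain);
        }
    }
}
```

Because only one drain holds `scheduled` at a time, the receive function runs single-threaded per actor even though senders are concurrent.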
If I just submit a runnable with blocking activities to a thread pool, it's going to make that thread go off and block. And once all the threads of the thread pool are similarly blocked, basically no work is happening. So any blocking call is going to take one thread out of the thread pool, and that's going to hurt the latency of some other actor. Imagine a very IO-heavy workload, like this one, where each actor is writing to disk or something like that: very quickly you're going to exhaust the thread pool, and you're going to have a starved system. Even a CPU-intensive activity is going to hog the CPU, and on a system with limited cores the other threads won't get scheduled back onto that core. So this is a very glaring issue, and it's the main thing we want to solve. Even one occasional blocking call can have a very serious latency impact that some other actor has to pay for. The other thing is context-switching costs. When a thread gets scheduled onto a processor, a lot of registers get set, and the whole actor state is encapsulated in that runnable. Every time it blocks for IO and gets rescheduled, you pay the context-switch cost that comes with the thread. Ideally, you want threads that stay on the CPU with tasks constantly being submitted to them; that's where you get maximum throughput, and that's where you want to get. So how do you go about it? One solution, and it's mentioned in the Akka documentation, is to segregate your actors: look at exactly what your actors are doing, and put the IO-heavy actors on their own dispatcher.
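The starvation problem described above can be demonstrated with a plain java.util.concurrent pool. This is a contrived toy (latches stand in for blocking IO): two blocking tasks occupy a two-thread pool, and a third, cheap task cannot run until a blocked thread is freed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Demonstrates one blocking call per thread starving an unrelated task.
class BlockingDemo {
    // Returns { cheap task was starved while pool blocked,
    //           cheap task ran once the blocking calls returned }.
    static boolean[] run() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch quickDone = new CountDownLatch(1);

        // Two blocking "IO" tasks occupy the whole pool.
        for (int i = 0; i < 2; i++) {
            pool.submit(() -> {
                try { release.await(); } catch (InterruptedException ignored) { }
            });
        }
        pool.submit(quickDone::countDown); // a cheap task, stuck behind them

        try {
            boolean starved = !quickDone.await(200, TimeUnit.MILLISECONDS);
            release.countDown();           // the blocking calls finally return
            boolean ranAfter = quickDone.await(5, TimeUnit.SECONDS);
            pool.shutdown();
            return new boolean[] { starved, ranAfter };
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```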
You could use something like a cached thread pool, which basically spawns a thread for every IO activity. A thread in Java costs, I think, 512 KB or 1 MB of stack, so assuming you don't have a lot of such active actors, you can make the system work without overreaching on your memory requirements. That's one thing you could do. But this has two costs. One is the memory cost of spawning a thread which, if you think about it, is not doing much; it's just waiting on IO, periodically checking for IO. The other cost is actually figuring out what is happening in which actor in your system. On a typical Akka code base, you could very easily call into some Java library that makes a blocking call, and you won't know until it hits production, until you actually test it out. So this is hard. If you're designing a system from the beginning and you know exactly what kinds of actors you're going to be using, it's doable; otherwise I'd find it pretty hard. Ideally, I'd want the language, the framework, to solve this for me. The second solution, which is what's mostly used, is futures. Future is a Scala construct that has been adopted into the Akka framework with some tweaks (I'm not sure whether Akka's version still diverges from Scala's these days). You're familiar with a future, right? It's a result that will arrive in the future. You don't block on the call; you just hold a reference to the result, which can get fulfilled later. It can succeed or fail, but it arrives in the future and you don't block on it. So in the philosopher example, we had to read a file, process the page, and do some network activity.
So you can compose that as futures. You wrap the file read in a future, you wrap all three steps in futures, and this for-comprehension unwraps each result when it arrives, when a thread actually gets down to doing the work. You get syntactic sugar over callback-driven programming. That's pretty much how you'd solve it in Akka: you have the result, and once you're done thinking, you tell the waiter: okay, I'm hungry again, you can give me more forks. [There was a correction from the audience here about the slide: it was awaiting inside the receive, but the yield actually returns a Future, so it should be composed rather than awaited.] So the whole promise of futures is that they compose. You can map over them, you can write for-comprehensions over them, so you don't have to hand-write callbacks. And futures are guaranteed to give a result: they either time out, succeed, or fail. One more thing to note: a future is executed on the dispatcher configured for the actor in which it's created. Ultimately it is some work that needs to be done, and it's done on the dispatcher of the actor where the future is created. Now consider that we had one receive task that actually split out into three different tasks: it created three futures, one to read a file, one to do some CPU work, and one to send out a mail, some network activity. Can we do better? Can we do better with our thread pool? That question is what leads to something called the fork-join pool, which is what's used most commonly.
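A Java analog of that composition uses CompletableFuture in place of Scala Futures; `thenCompose` chains the steps the way a Scala for-comprehension would. The three functions are stand-ins for the file read, the tokenizing, and the network call, and nothing blocks the caller until the final `join()`.

```java
import java.util.concurrent.CompletableFuture;

// Composing the philosopher's three asynchronous steps without callbacks.
class FutureCompose {
    static CompletableFuture<String> readBook() {               // "file IO"
        return CompletableFuture.supplyAsync(() -> "a book about actors");
    }
    static CompletableFuture<Integer> tokenize(String text) {   // "CPU work"
        return CompletableFuture.supplyAsync(() -> text.split(" ").length);
    }
    static CompletableFuture<String> postSummary(int tokens) {  // "network call"
        return CompletableFuture.supplyAsync(() -> "posted " + tokens + " tokens");
    }

    static CompletableFuture<String> pipeline() {
        // Each thenCompose flattens the next future, like flatMap in Scala.
        return readBook().thenCompose(FutureCompose::tokenize)
                         .thenCompose(FutureCompose::postSummary);
    }
}
```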
If the thread pool just executed tasks on threads, here's what would happen. You have some configuration, something called the core pool size, and as long as you're under it, a submitted task is immediately scheduled on a thread. If you cross that pool size, tasks get queued in the internal queue of the thread pool and wait until some thread gets down to finishing its work and dequeues them. We want to do something better, especially in cases like this where one receive, one task, generates three further subtasks. You don't want that one thread to be loaded up while another thread lies empty. That's the whole promise of the fork-join pool: it's capable of work stealing. In the earlier example, the tasks corresponding to the three futures would get scheduled on the thread that executed the receive statement, and the other threads in the pool would immediately steal them, so you'd actually be utilizing all the resources in the system. For tasks which are not really long-running, and where you don't have a lot of IO, you can use a fork-join pool to limit the problem. So we discussed one approach, a separate dispatcher for blocking actors, and the middle ground, which is a dispatcher based on the fork-join pool with futures to compose over it. And Akka has kept dispatchers configurable. There are three or four types of dispatchers; mostly you'll be using the default dispatcher. There's one called the pinned dispatcher, which I thought was interesting because it creates one thread pool (of a single thread) per actor.
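The work-stealing behavior just described can be illustrated with the textbook RecursiveTask pattern on a ForkJoinPool: one task forks subtasks, and idle worker threads steal them instead of lying empty. The range sum here is only a vehicle for showing fork/steal/join.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sums [lo, hi) by splitting in half until chunks are small enough.
class RangeSum extends RecursiveTask<Long> {
    private final long lo, hi;
    RangeSum(long lo, long hi) { this.lo = lo; this.hi = hi; }

    @Override
    protected Long compute() {
        if (hi - lo <= 1_000) {              // small enough: compute directly
            long s = 0;
            for (long i = lo; i < hi; i++) s += i;
            return s;
        }
        long mid = (lo + hi) / 2;
        RangeSum left = new RangeSum(lo, mid);
        left.fork();                          // subtask available for stealing
        long right = new RangeSum(mid, hi).compute(); // keep this thread busy
        return left.join() + right;           // join the (possibly stolen) half
    }
}
```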
So if you really want something that requires immediate CPU, that requires a dedicated thread, you'd create it with a pinned dispatcher. This is typically how the configuration looks: basically the stuff I mentioned, you're configuring the thread pool this dispatcher is going to run on. You give your core pool size, and that multiplied by the factor gives you that many threads. Until three times two, that is six, tasks are in the system, it will just spawn a thread and hand the work out; more than six tasks and it creates an internal queue and keeps the runnables there, up to the maximum of twelve threads. You can also specify the mailbox each actor in the system gets: whatever mailbox type you want, and a mailbox capacity of minus one means unbounded. And the throughput setting, this again is tunable. I mentioned that when you process a message, what you're really doing is dequeuing from the mailbox, creating a task out of it, and running it. Instead of dequeuing one message at a time, if you want higher throughput you can dequeue any number of messages and keep that actor running. So throughput is configurable. Again, this introduces skew into your system: it can result in certain actors hogging resources that could have been fairly distributed. But it's something you can do. I'm going into all this detail on Akka because Erlang, although it looks pretty similar, is very different underneath; you'll soon see why. So, certain gotchas with futures. The first is about what you close over when you create a future: don't capture anything stateful inside it. For example, you have the sender() method available in your receive block, which gives you the ActorRef of the actor that sent the message currently being processed.
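This gotcha can be sketched with stdlib futures alone; the var below is just a stand-in for mutable actor state such as sender(), and the fix shown is the usual convention of capturing into an immutable local before going asynchronous:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Stand-in for mutable actor state like sender(): by the time a future
// body runs, the var may already refer to a different actor.
var currentSender = "actor-A"

// The fix: capture the value into an immutable local *before* the
// asynchronous block, and close over that instead of the var.
val replyTo = currentSender
val reply: Future[String] = Future { s"replying to $replyTo" }

currentSender = "actor-B"   // the actor has moved on to the next message

println(Await.result(reply, 5.seconds))   // still "replying to actor-A"
```

Had the future body read `currentSender` directly, the reply target would depend on scheduling, which is exactly the data-correctness hazard being described.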
If you capture that sender reference inside a future, the future might run at a different point in time, and you might send the reply to the wrong actor; it defeats the whole point of having data correctness. The second thing is mutable state. In your actor you're allowed to have var fields; not everything needs to be immutable. But if you have mutable fields and a future that runs at a different instant in time, it breaks the actor model as we know it, where one message causes a state transition and that state is what's visible to the next message. That promise is lost. So there are these future gotchas you have to be aware of; nothing enforces them for you, it's something you need to watch out for yourself. Now, Erlang. It turns out it's very similar, yet different; you'll see why. Again, like the fork-join pool, when you start up the Erlang virtual machine, it starts one scheduler thread per core, and each scheduler has a run queue, much like the run queue we had with the fork-join pool in the Java world. And these schedulers steal work: if one sees tasks in another scheduler's queue, it just steals them. The real implementation is quite complicated, but in this respect it's very similar to what happens in the fork-join pool. So what's different? What's different is that because the Erlang virtual machine was geared from the outset toward such concurrent, and later parallel, activity, it has annotated each call, each built-in function, with how much resource it's going to consume and whether it's blocking or non-blocking. That's something that exists neither in the Java world nor in the Scala world. So in the example that we saw there:
If I do file:read, the virtual machine knows it's a blocking call, yet it's not going to block the scheduler: I can just write file:read and go ahead as if it were a synchronous thing. With any network activity, with any disk activity, the VM automatically knows the call blocks, and that the process needs to be de-scheduled and something else put on the run queue. So something we had to do by hand using futures in Scala happens automatically, because every function is annotated. And that much is perhaps understandable; the thing I find amazing is the concept of reductions, which also applies to CPU activity. Say you're crunching through some numbers, calculating a prime: after some amount of computation, you need to yield to the system and get de-scheduled. So for any CPU activity, and I think this is the only system currently in existence that does this, you get the effect of preemptive scheduling. And this shows up in simple things like sending a message to a mailbox: if that mailbox is under high contention, it costs you more reductions, which nudges you toward sending fewer messages to it. The same happens with garbage collection: if a process has taken up a lot of heap memory, it will only do part of the garbage collection at a time, because the VM counts how much work has been done and de-schedules the task. Same with regular expressions. Everything that takes up some resource is covered by the concept of reductions, and typically you get a budget of about 2,000 reductions per turn. This is basically the whole crux of Erlang scheduling: you get really fair scheduling, and no actor is going to get stuck. And it doesn't have to be something like IO, where it's obvious it needs to be de-scheduled; it's as mundane as plain CPU activity, or even writing to a mailbox, or even GC.
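As a toy illustration of the reduction idea (a single-threaded sketch, nothing like BEAM's real scheduler): give each process a budget of steps per turn, and push it to the back of the run queue when the budget runs out, so a long computation cannot starve a short one:

```scala
import scala.collection.mutable

// A "process" is just a bag of unit steps; each step costs one reduction.
final case class Proc(name: String, var steps: Int)

// Each turn, a process may burn at most `budget` reductions before it is
// pre-empted and re-queued at the back of the run queue.
def schedule(procs: Seq[Proc], budget: Int): Vector[String] = {
  val runQueue = mutable.Queue(procs: _*)
  val trace = Vector.newBuilder[String]
  while (runQueue.nonEmpty) {
    val p = runQueue.dequeue()
    val used = math.min(budget, p.steps)
    p.steps -= used
    trace += s"${p.name}:$used"
    if (p.steps > 0) runQueue.enqueue(p)   // budget spent: back of the queue
  }
  trace.result()
}

// The long-running cruncher is interleaved with the short process:
val trace = schedule(Seq(Proc("cruncher", 5), Proc("short", 2)), budget = 2)
println(trace.mkString(" "))   // cruncher:2 short:2 cruncher:2 cruncher:1
```

The real VM attaches reduction costs to individual operations rather than abstract steps, but the fairness mechanism, spend your budget and go to the back of the line, is the same shape.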
So that was scheduling, and it creates a huge difference between Akka and Erlang. Next are common things like behaviours. Actually, one thing I forgot to mention about IO: you can specify the number of async IO threads the VM starts, at the time you boot the virtual machine. So the VM does automatically something we had to do manually in Akka, where we had a separate dispatcher for IO-heavy tasks. For any kind of IO, it puts those tasks onto the IO queue and gets back to them whenever epoll or kqueue or whatever the OS implementation is says, okay, this is done. And with normal CPU reductions, the process just goes to the back of the run queue and the next thing gets processed. So the scheduler thread itself, to answer your question, never gets de-scheduled; it's always running. (It's one o'clock, I should hurry.) Behaviours. This is something I found a little frustrating, because I'm used to the convenience of the GenServer behaviour we typically use: you have a very clear distinction between sync and async calls, you can do state maintenance while you answer those calls, and you get init, cleanup, and supervision all baked into it. Those are things I need for nearly any task, so I'm used to them. Akka has slightly more primitive ways to do this; it expects you to build something on top, though that's also more performant. What Akka enforces through convention is two things. If you want sync-style behaviour, you use the ask pattern, and what ask returns is a future.
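The shape of the ask pattern can be mimicked with plain stdlib pieces; everything here is made up for illustration (real Akka spins up a temporary reply actor under the hood): the request carries a Promise, the receiver completes it, and the caller narrows the untyped result, just as you would with mapTo after an Akka ask:

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// A request that carries its own reply channel, like ask does.
final case class Ask(payload: String, replyTo: Promise[Any])

// Stand-in for an actor's receive loop: it just completes the promise.
def receive(msg: Ask): Unit =
  msg.replyTo.success(s"echo: ${msg.payload}")

def ask(payload: String): Future[Any] = {
  val p = Promise[Any]()
  Future { receive(Ask(payload, p)) }   // "deliver" the message asynchronously
  p.future
}

// The reply is untyped (Any), so the caller narrows it; scala.concurrent's
// own Future.mapTo is the analogue of what you do after an Akka ask.
val reply: Future[String] = ask("hello").mapTo[String]
println(Await.result(reply, 5.seconds))
```

Notice the receiver has no idea whether anyone is blocked on the promise, which is exactly the asymmetry with handle_call discussed next.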
You then need to map that future to whatever type is required, because there's nothing like a typed return here. That's ask, and it's pretty much the only way a non-actor system can talk to an actor system: you ask the actor system and you get a future back. This can also be used blockingly: once you have the future, you can await on it, which gives you a blocking call on the client side. Then you have tell, written everywhere with the exclamation mark, which is a plain asynchronous send, very similar to what lands in handle_info in Elixir. One thing I noticed: if I'm an actor processing a message, I can't really tell whether the client is waiting on me or not. With Elixir, when I'm in handle_call, I know the client is waiting on me; if it's something that takes time, I can return a no-reply, process the message later, and send the reply then. But in Akka, I can't tell whether the client is waiting; in both cases I just get a normal message and I reply the normal way. Whether the caller blocks is purely based on how they called me, ask or tell. I haven't seen a lot of codebases, but I'd assume your reasoning about programs differs greatly because of this. Next, for certain frequent tasks there's something pretty similar to gen_fsm in the Erlang world: here you have become and unbecome.
Become takes a partial function, basically a receive block, and the thing is you can stack these up: based on certain messages, you can swap out a behaviour or layer on new behaviour. This is what I showed you in the selective receive example earlier: before a certain flag was set, I would watch out only for that flag, and once I had it, I would become a processor and wait for messages to actually process. There are also lifecycle hooks, which you typically need, like preRestart and postRestart for when a supervisor restarts you. Final topic: garbage collection. What the actor model does is constrain your programming style. Instead of taking locks wherever possible, instead of spinning up thread pools wherever possible, you're restricted to the mailbox as your synchronization point and to the receive blocks scheduled in each actor. And if you're going to the length of doing all that and maintaining immutable state inside the actor, you should reap some benefit from it in garbage collection, is what I thought. I don't know what the designers intended, but the question is: can the GC exploit the fact that an actor model is running here and be more efficient about it? Erlang actually does; the JVM, not particularly, though I'd like to be corrected if that's wrong. On the JVM there's one common heap for all actors' state, because state is just like any other reference in Scala, and a common heap for all mailboxes, because a mailbox is just a concurrently accessible queue. At no point in time do you know which actors are polluting the heap or which actor needs garbage collection; it's just one heap, and it's growing.
The thing with this is that most of these messages actually get cleaned up pretty fast. So with a generational garbage collector, most of it gets collected in the young generation, which is pretty cheap, and that gives you decent performance. But if you have long-running actors, it can lead either to the process taking up a lot of memory or to full GC pauses once you hit a certain limit. And again, it's not tied to any particular actor: you could have one actor, say a TCP acceptor accepting a lot of connections, generating most of the garbage, but the GC still runs for the whole system and stops the world. And it's not just stop-the-world: the GC is one more thread taking up a CPU core, and it causes tail latencies to go up. I would have expected some hooks to exist, a way to say, collect this particular process because it's generating a lot of garbage, but I couldn't find any. So P95 latencies can take a hit when the GC runs. It requires tuning, because you need to figure out when the GC should run and how much memory to give it, and tuning GCs is quite the black magic. On the brighter side, like I said, most of these messages get consumed really fast, so you don't really have a lot of garbage beyond what remains in the state. Erlang's GC is something I think you're familiar with; this is a GC built for actor systems. What happens in a garbage collection? You start from the roots of your object graph,
and then you traverse down and see what's reachable, what's in scope. That tree is very, very small when you look at it per process; you don't have to look at the whole heap space. The other thing you can do is trigger a GC for a particular actor: if you figure out one actor is generating a lot of garbage and causing a lot of churn, you can say, if the heap for this particular process, because even the heap is per process, exceeds a certain size, trigger a GC just for it. And like I said, even GCs that run per process are subject to reductions, so this is about as fair as you can get: even when it has to do GC, the VM figures out, okay, clear out this much, then the process gets de-scheduled, goes to the back of the run queue, and the rest of the GC runs on its next turn. So hands down, it prioritizes latency over throughput. The other thing, and I mentioned messages at the start: messages are copied by value, so you don't have a central heap space to clean up. It's all per process; the whole architecture is per process, and each process maintains its state in its own heap. There's nothing else to look at, nothing shared to depend on; it's a true actor system. So the conclusion is pretty simple, I guess. If throughput is your concern, if you want to get a lot more work done but you don't really care about predictable tail latency (you might have good average latency, just not predictable tail latency), you should stick to Akka, because it's geared straight toward that. And the other things that are there, about the futures, about configuring dispatchers and so on, those are things you can work with quite easily.
If throughput is what you want, the other benefit is that you have the whole JVM ecosystem to live off, which is quite the big point, and Akka actually provides a very nice way to interop with it. Like I said, ask returns a future, so you can have perfectly sequential-looking code calling into an actor, kicking off some computation, and getting the result back, because you have futures. So it fits a very particular niche. Erlang, on the other hand: preemptive scheduling, incremental GC, per-process GC, and true immutability. And the whole avoidance of the future, which I think is a programmer win. So I think you can see very clearly what is prioritized over what. Both are real engineering marvels in their own right. So, that's the conclusion. One thing I want to add: when I proposed this talk about a month back, I set about writing some benchmarks to figure out what actually happens. There are a couple of very common benchmarks. There's the ring benchmark, where a token is passed around a ring of processes. There's the ping-pong benchmark, which gets touted a lot in Akka, where two actors just exchange messages between them; you scale the number of ping-pong pairs and measure what the latency and the throughput look like. When I ran those tests, I really couldn't interpret the results, because there are too many things happening at once, and even if I did interpret them, it would hold for only one particular use case. So if you're making a call about using one of these systems, you should really look at what your actor model is going to look like: which actors are going to block, and what's really important to you. Is it fairness? Is it low latency? Is it just throughput?
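For what it's worth, the skeleton of such a ping-pong measurement, stripped down to two plain threads and blocking queues as a stand-in for the actor version (absolute numbers from something this small mean very little):

```scala
import java.util.concurrent.ArrayBlockingQueue

// Two "actors" bounce a token through two queues; we time N round trips.
val ping = new ArrayBlockingQueue[Int](1)
val pong = new ArrayBlockingQueue[Int](1)
val rounds = 10000

val ponger = new Thread(() => {
  var i = 0
  while (i < rounds) { pong.put(ping.take()); i += 1 }
})
ponger.start()

val start = System.nanoTime()
var received = 0
var i = 0
while (i < rounds) {
  ping.put(i)            // send ping
  received += pong.take() // wait for pong
  i += 1
}
ponger.join()
val elapsedMs = (System.nanoTime() - start) / 1e6
println(f"$rounds round trips in $elapsedMs%.1f ms")
```

A real comparison would scale the number of ping-pong pairs past the core count and look at the latency distribution, not just the total time, which is where the scheduling differences discussed above actually show up.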
So I think it's very application-specific, and very developer-specific, because you'll probably want to utilize the ecosystem that's already present. (Question from the audience.) When you have records, when you have TCP messages being passed around, when you have large binaries, which in Erlang are kept by reference in a shared heap, you have very specific behaviours to account for. What you can do is look at the tooling behind both systems. Akka used to have something called the Typesafe Console, which let you figure out how much garbage is being generated by which actors, how often actors are being starved, and things like that. Erlang also has a ton of tracing tools. So yes, it's very involved, very application-specific work. (Question: how many actors can you create in Akka?) In Akka, if I remember correctly, each actor takes up, I think, 64 KB by default; that's the empty actor, without any state, and it also depends. So with a couple of gigabytes of memory, you can imagine. It's much bigger than an Erlang process, which I think is around 32 words or so; Akka actors are noticeably heavier. (Are they based on OS threads?) No, JVM threads. The actor itself is just a closure, so that part doesn't matter, but it runs on JVM threads; and Erlang processes are not based on OS threads either. (But underneath you have the same primitives to work with, right?) No, see: if you compare the actor model between Erlang and anything Java-based, the JVM currently doesn't have, what should I say, a scheduler built into the VM, so it relies on the scheduler of the OS. The Erlang VM, in comparison, has its own scheduler.
So you have the OS thread primitive that the JVM builds a lot of stuff on; the thread you get in the JVM is definitely an abstraction over the raw OS thread. Similarly there's an abstraction in Erlang, and the scheduler thread is likewise based on an OS thread. (Can you preempt an Akka actor?) You don't preempt an Akka actor. If your actor is doing a heavy task, it's sitting on an OS thread, it's a blocking computation, and it never gets de-scheduled by Akka. That's also one more problem: it hogs the core. And it's definitely not preemptible, because nobody has tagged each function on the JVM to say, okay, this costs this much work, so there's nothing to preempt on. (How can I send a remote message?) Remoting is not something I considered here at all, and it throws up a lot of interesting questions. In Scala, when you send a remote message, you're obviously not copying by value within one heap; are you sending a reference across? That's something I didn't go into, because I thought the topics here were more interesting to dive deep into. But Akka advertises location transparency; I think they use a protocol layer where they serialize and send it. I don't know how data integrity is maintained there. (Regarding futures: futures do wait, right?) Yes, when you await on a future, it does wait. (Comment from the audience:) With anything on the JVM, you can't really compare the JVM and the Erlang VM; it's only the actor model you can compare, and Erlang actors and JVM actors are totally different. If you had to compare them at all, the JVM would need fibers. But even then, you can't get the Erlang-style properties, because green threads, fibers, are cooperative. What are green threads? Green threads are M-to-N threads: you have M user-level threads multiplexed onto N OS threads, and it depends on the VM. So if you have a fiber... So, I'd like to interject at this point.
The lecture is only going to be an hour long; I'm sure Prana will be available for discussions afterwards. In short, it's cooperative multitasking versus preemptive multitasking; they're just very different, so you can't really compare them directly.