So, our next speaker is Devani Wu. She has been an active member of the Elixir community for a long time. I remember personally seeing her speak at Elixir London 2017, and then ElixirConf 2019, and now here at FOSDEM 2020. She also has a number of open source projects in Elixir. And, yes, she's going to talk about processes and grains, so let's give it up for Devani. Thank you. So, thank you very much. I think I talked too much, so today I'll try to keep it very short. As for myself, I basically write programs, and sometimes they break, so I fix them. Or I botch them. And that's my Twitter account, which you could use to discuss things with me if needed. So, today's topic is Orleans, especially within the context of Elixir applications. So, what exactly is Orleans? If the framing of the Orleans paper is correct, it starts with a problem brewing inside Microsoft, related to the management of state that doesn't fit on a single machine. Usually when you write applications, you start with one server, and that server is also the database server. So you have a single server; it's a pet. The state lives in the database. But then you may end up with something like this, where one server is no longer enough because it gets overloaded. So now you have one database server and many stateless application servers. In this case, the state is still stored in the database. But now, in order for this state to be processed and brought to the end user, the data has to travel through two tiers of networking: first from the database to the application server, which then works out the results and sends them to the customer. So there's basically no data locality. So, basically, sorry, I pressed twice. Basically, in this situation you end up with a lot of network traffic, and sometimes you still get overloaded.
So, usually, you will add some kind of cache server at this point. Since the database is so overloaded, you add some replica nodes, and they might still not be enough, so you add a cache server or multiple cache nodes, and you end up with no consistency, because now you have to handle cache invalidation. This is a problem that plagues the web community, and has plagued it since the beginning of time. In Rails, for example, cached objects traditionally carried some kind of cache key, where the key included a hash or just a timestamp of the object. This was actually revised, I think, in a more recent version of Rails, because that scheme pollutes the cache: if an object is modified, the old version that has been cached doesn't go away; a new version is generated and put into the cache. Now you have two objects in the cache, one old, one new, and the old one isn't used anymore. So you introduce a lot of cache pressure, and so on. I think in the most recent version of Rails, it has been updated to the point where the cache key only contains the object's identity, and the version lives alongside the cached content, so if the object needs to be updated, the key is reused, and the content linked with that key is updated in place. Anyway, there's a lot of management you have to do in this kind of setup, because now you have two places that potentially store state for you. You have to consult the cache; if it's not in the cache, consult the database, and remember to update the cache. Remember to invalidate the cache when the database has been updated again, and so on, and so on. There is no consistency, and it's horrible. So, yeah. Basically, two issues.
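To make the two cache-key styles concrete, here is a minimal sketch in Elixir. The module, the record shape, and the key format are my own for illustration; this is not Rails code, just the idea behind it.

```elixir
defmodule CacheKey do
  # Sketch of the two cache-key strategies discussed above.
  # The record shape (%{id: ..., updated_at: ...}) is a hypothetical example.

  # Old style: the key changes whenever the record changes, so every update
  # leaves a stale, never-read entry behind in the cache.
  def versioned(%{id: id, updated_at: ts}), do: "customer/#{id}-#{ts}"

  # Newer style: the key is stable (identity only); a version is stored
  # next to the cached value instead, so updates overwrite in place.
  def recyclable(%{id: id}), do: "customer/#{id}"
end
```

The first style accumulates one dead entry per update and builds cache pressure; the second reuses the key, at the cost of keeping a version alongside the value.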
You have too many computers, and you have the wrong kind of computers: the computers you have don't do the thing you want, and the computers that do the thing you want are not the ones you have, and so on, and so on. So you have infrastructure sprawl, and you have invalidation issues. At least that was the framing of the paper: since this was the issue, a stateless middle tier doesn't provide data locality. You have to move data from one place to another, so there's a lot of transportation going on for that data. So they looked at the actor model, and the actor model looks appealing, because you can write a function and put that function anywhere, running it local to the data. In a more modern way of saying it, it's basically data gravity, but in Microsoft speak, it's basically this. They continue on to say that actor models, such as Erlang and Akka, burden developers with many distributed-system complexities. The key challenges are that you need to manage the life cycle of actors: you have to create them in one place, you have to communicate with them, and when you're done with them, you have to tell them to go away, otherwise they will stay around, and so on, and so on. So the developer must be a distributed-systems expert, unless Orleans comes into play; that's essentially what their paper was about. Their result was basically that each actor is represented as some kind of grain. You implement code for a grain like you implement any kind of object in .NET, and the grain has four states, where persistent means that there is only information in some kind of data store, like a database, and there is actually no computation going on, no process running on your server, but you can still address the virtual actor by its name.
Let's say, for example, a customer pays an invoice. In this case you actually have two actors: one is the customer, the other is the invoice. So when the customer pays an invoice, you call upon the virtual actor representing the customer, and you call upon the virtual actor representing the invoice; they will each read from the database, do the right thing, and then you tell the customer: you have paid this invoice. That's essentially how the virtual actor works, and then you don't have to worry about stuff such as, okay, now I'm done with the customer, I close down the customer actor, and I close down the invoice actor. Essentially, the Orleans runtime does that for you. So that's why you have these four states. So, looking at the constructs of Orleans: a grain basically has an identity, which is an identifier, such as an email address, a customer number, or an invoice number, and so on. There is some kind of behaviour linked with that kind of grain, which, well, in .NET is basically your class, and in Erlang or Elixir, that would be similar to how you use GenServers: you always do "use GenServer", so you comply with the GenServer behaviour, but you actually implement different callback modules. So that's the behaviour, and the state, obviously, is state. Well, and obviously, Orleans being Orleans, if you look it up on the web today and look for its documentation, it'll tell you: okay, this is how you deploy. So, well, wait a second here. So, all your grains basically represent single-threaded processes. One. Two, they can be configured to allow re-entrant calls, which you may or may not want, but this is going to be important later on. And they are scheduled cooperatively, so unlike in Elixir or Erlang, they are never preempted.
So, if any grain takes a long time within Orleans, then everything gets slow, or things will block, and the justification provided in the paper was that this was actually better for disciplined teams, because it provides better predictability in multitasking, but you have to keep in mind the context. Now, the good part. These virtual actors are automatically activated when there are messages. If there are no messages, they are not activated, and also, when they're done, they stick around for a bit and then go away automatically. That's the good part, because in Elixir or Erlang applications, well, you've got supervisors, if you want to use them; you've got distributed supervisors and registries built upon things like lasp_pg, or Swarm, or Horde, if you want to use them; but you have to create your own processes yourself, you have to name them yourself, and you have to tell them to go away on your own. So, Orleans hosts, or silos in this regard, are responsible for hosting grains; you can't just take a grain and run it anywhere, because in the context of .NET there is no such thing as a preemptive scheduler already implemented. So they went on and implemented a scheduler, which is responsible for executing the work that these grains are asked to do, and there was no out-of-the-box clustering, distribution, and so on, so they implemented this part as well. And there's a global way of scanning the whole world and finding out where the grains are, if you want to do a listing, but the paper also acknowledged this is kind of expensive to do. Well, so, obviously, this is what happens whenever you have something like that: Orleans basically resembles an Erlang paradigm, so obviously somebody has ported it back to Erlang. A fellow, Mr. Sloughter, actually ported it to Erlang a few years ago, and I think he might be here at the event. I don't know where he is, but I think he's based here.
So, it's called Erleans, which is basically Erlang Orleans, and the premise is basically: okay, each grain is single-threaded, so each grain is one process. For process registration in Erlang or Elixir, well, you can use pg2, Process Group 2, which is the updated version of Process Group in the core Erlang distribution, bundled as part of the kernel application, or you can use Swarm or Horde to do process registration. But in this case he chose to use lasp_pg for it, because lasp_pg, well, it's more modern, it does provide better scalability, and it works with Partisan if you want. There is another point, which is that grains can be stateful, meaning they have state: if you work with a grain, it has some kind of state, and when it gets deactivated, that state is stored elsewhere, so when it's activated again, the stored state comes back up, is rehydrated, and put into that grain again. Erleans has an ETS-backed, that is, Erlang Term Storage-backed, provider that does this for you; it's available out of the box, you can use it, or you can choose not to. And obviously grains are implemented in a way where you implement callback modules, so it's very similar to GenServer. So, yeah. And one last thing: if you don't want to use Erleans, but you have a similar problem, for example you already have a system in place running Elixir or Erlang that already has a lot of processes, and you just want them to go away when you're done with them, then use a state machine, or more precisely, use the state machine behaviour from OTP, gen_statem.
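As a minimal sketch of that idea (the module name, the API, and the 30-second timeout are mine, not Erleans or Orleans code): a process that answers calls, restarts its idle timer on every message, and stops itself normally when nobody has talked to it for a while.

```elixir
defmodule IdleGrain do
  @behaviour :gen_statem
  # A process that deactivates itself (stops normally) after a period of
  # inactivity, similar in spirit to Orleans' automatic grain deactivation.
  @idle_timeout 30_000

  def start_link(id), do: :gen_statem.start_link(__MODULE__, id, [])
  def call(pid, msg), do: :gen_statem.call(pid, msg)

  @impl :gen_statem
  def callback_mode, do: :handle_event_function

  @impl :gen_statem
  def init(id), do: {:ok, :active, %{id: id}, [{:state_timeout, @idle_timeout, :expire}]}

  @impl :gen_statem
  # The timeout fired with no intervening message: go away quietly.
  def handle_event(:state_timeout, :expire, :active, _data), do: {:stop, :normal}

  # Any call is answered, and the idle timer is restarted.
  def handle_event({:call, from}, msg, :active, data) do
    {:keep_state, data,
     [{:reply, from, {:ok, msg}}, {:state_timeout, @idle_timeout, :expire}]}
  end
end
```

The state_timeout action is the key piece: it is cancelled and re-armed by each new event, so the process sticks around while busy and evaporates when idle, with no framework involved.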
gen_statem is provided as part of the OTP distribution, and it allows you to say something like: okay, transition my machine from this state into that state, but add a timeout; invalidate the timeout if I get any message, otherwise continue. And this allows you to very easily build something that goes away automatically if not bothered, but sticks around to process more events, without having to use a framework. Yeah, so I figured it would be better for me to show you some examples using Erleans, and my chosen example is this: Conway's Game of Life. If you look at the Game of Life, you can find some interesting things. Basically, it kind of resembles nature. Each block is a cell, and there are rules around when cells become alive, when cells become dead, and when cells remain alive or remain dead. So, obviously, the implementation here is that each cell is a grain, and the grains either talk with each other, so each grain talks with its neighbors to find out whether it should die or become alive, or, alternatively, each grain can query a database, and the database contains the state as to whether grains should be alive or dead. You start with a randomized state, and you end up with something like this. So this is just a video, but I have some live versions here, I think. Okay, so I'm just going to start an interactive Elixir console here, and this version is the one where grains ask each other about what's going on. I put in some kind of helper function, so it's easier to try. Okay, so I'll explain what you have just seen. Basically, when I started the helper function, what it did was: okay, I want to first find a reference to a particular grain, in this case the big grain, the game grain. The game grain is in charge of creating the game and maintaining the game, and more importantly, making sure the cells are updated on precise ticks, and not just randomly or whenever they want to.
So: find the game grain, tell it to start building its cells, and when it's ready, because it's a call, so the call will block, when it's ready, start a timer. And this is because although the Game of Life specifies how the cells come alive, become dead, or remain alive or remain dead, it doesn't specify how you should implement it. And if you have a lot of cells, and they can all make decisions that affect not only their own states, but also future states and the states of their neighbors, you want to make sure they make these decisions in an orchestrated manner. So in this implementation in particular, each cell can be told to tell its neighbors about its own state; this is the first thing the cells can be told to do. And the second thing the cells can be told to do is, according to how many neighbors have told it that they were alive, figure out whether it should die or not. So this is how this version works. Yeah, so it also has modules; this is basically the callback module implementing the Erleans grain behaviour. You specify there is a provider for saving state; the provider has a short name, in memory, which is configured elsewhere. And the placement of the grain is preferably on the local host; this is because, again, in Erleans as in Orleans, the placement of the grain can be local or anywhere in the cluster. You choose that as a developer, but you also choose it depending on whether the grain should be stateless or not, because in grain design in Orleans you can actually use stateless workers, although in an Elixir application you'd probably want to use GenStage or something like that, which is higher level and closer to the spirit of Elixir. So when the cell starts, it gets an ID, and this ID is basically its coordinate. There are some management functions, but the cells can be told to prepare or to commit.
In prepare, it counts its neighbors, and according to the current status determines whether the upcoming status is alive or dead; and in commit, it updates the future state: it grabs the upcoming status, puts it into the current status, and clears the upcoming status. As for the neighbors, basically each cell is started with eight possible neighbor fields; this is the state held by the cell, and the game grain actually makes these cells and sets the neighbors. So the typespec is incorrect, because actually there are eight possible neighbors and I just haven't updated the typespec. Anyway, in this version the cell just asks each neighboring cell, but there is a problem with this implementation, which is that the game grain can tell all the cells to update, but in this case the game grain will be making calls to each cell, and if the game grain were to make a call to a cell grain, then that cell grain is going to be busy, and it wouldn't be able to answer any other calls. So in this implementation the update is basically single-threaded, because you cannot have one cell answering anything else when it's already answering something. So yeah, it's not very good. Alternatively, and also in a version that's closer to the spirit of Orleans, there's another version, which is also simpler, it's got fewer files, and I shall show you this one. So in this case, again, I first basically start everything up, and it works, but internally it's entirely different, and I'll explain. In this version, when I start the game, I actually make a game entity and store it in a data store; then I grab the grain representing the game, where the ID of the game is baked into the grain's identity, and I tell it: start. And when a game grain starts, well, it just starts the timer; it doesn't do anything else, it doesn't do any kind of preparation. Why?
Because that logic is in your cells. But more importantly, when the game grain wants to update on the timer tick, it asynchronously calls all the grains at the same time to first prepare and then commit, and these cell grains actually do not call each other anymore. Instead, what they do is, well, first, when a cell starts up, it loads from the database; and when they want to count their neighbors, they do repo queries. So, each cell has eight possible neighbors, and in this case the game is actually run on a torus, so the top edge is basically linked with the bottom edge, and there are no boundaries in this version. The cell just basically looks at the database and determines how it should behave, so in this case there is no more communication dependency between the cells, and you can run this program with very high levels of parallelism; I think all available cores are being pegged at the moment. Yeah, so all the cores are being used. I think this is the most inefficient implementation of the Game of Life you will ever see. Basically, like that; in this case it's like 30 by 30, something like that, so close to a thousand cells. Obviously, this implementation imitates nature, and a better implementation wouldn't have done it like that. A better implementation might have just allocated a two-dimensional array and done it with a single function, where the function works on the array in an iterative manner; that would be much faster, and it would probably use just a little fraction of the available processing power. But if you want to use Orleans, I think this shows you how Orleans could be used.
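To make the cell logic concrete, here is a pure Elixir sketch of the two pieces this version relies on: Conway's rules for the prepare step, and neighbor lookup on an n-by-n torus. The module and function names are mine; the real implementation does this with grains and repo queries rather than plain functions.

```elixir
defmodule Life do
  # Conway's rules: a cell's upcoming status, given its current status and
  # how many of its eight neighbors are currently alive.
  def next_status(:alive, n) when n in [2, 3], do: :alive
  def next_status(:dead, 3), do: :alive
  def next_status(_status, _live_neighbors), do: :dead

  # The eight neighbor coordinates of {x, y} on an n-by-n torus: edges wrap
  # around, so every cell has exactly eight neighbors and there are no borders.
  def neighbors({x, y}, n) do
    for dx <- -1..1, dy <- -1..1, {dx, dy} != {0, 0} do
      {Integer.mod(x + dx, n), Integer.mod(y + dy, n)}
    end
  end
end
```

Because the neighbor lookup depends only on coordinates, each cell grain can answer "prepare" with nothing but a data-store read, which is exactly what removes the communication dependency between cells.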
Again, each isolated grain controls its own state, reading from the data store, persisting to the data store; it doesn't really communicate with other grains much, especially not at the same level. It may communicate with a parent, it may communicate with a child, but not at the same tier, so as to work around the same problem that the Microsoft people use re-entrancy to solve. Yeah, so I guess there's only one last question left, which is: should you use it in your Elixir applications? I think it depends, but keep in mind that Orleans, in itself, provides you with a framework that you can base your application design upon, and in any case, if you're creating a lot of actors, you ought to have a way to make them go away at the end of their useful life. All right, thank you very much. Any questions? So the question is: do grains provide ways to asynchronously message each other? And the answer is yes, because grains are processes in Erlang and Elixir, and processes can send each other messages. The only problem is, if one grain is busy doing one thing, and another grain is trying to call it, now you have two busy grains, because the second one cannot get a response until the first one becomes free. This is why, in the .NET implementation of Orleans, there is re-entrancy, so you can call into the grain again, and the scheduler does its magic for you. But in this case, I think there might be a rather convoluted way of doing it, which is, for example, you don't use calls anymore, everything is a cast: a grain sends a message asynchronously to another grain, without waiting for a reply, and then carries on, something like that. If you move everything to casts only, then yeah, it's doable. Or you can redesign your architecture so there is no need for such kinds of communication. Please. So, you see a build_neighbors function; that is basically a service-discovery question.
It's a node-discovery question, and I am not familiar enough with lasp_pg to answer whether, when a new node is added to the cluster, it picks up that node automatically, but I think it does. lasp_pg, according to the documentation, uses either its in-house distribution algorithm, or falls back to Erlang distribution, and if you add a node to Erlang distribution, obviously that node is there. Related to that, there is a question of load balancing, which is: if I have a lot of grains running in one silo, and I add another silo, does that silo automatically take over 50% of the grains? I think the answer is no, it doesn't. However, if you use a similar library called Swarm, written by bitwalker, Swarm does do that. Swarm moves processes around based on available resources and available hosts. So if you spin up another host, I think it does move work towards it for you. But if you were to try and implement your system in a way that load is automatically balanced, then the best way would possibly be to make sure you don't have long-running grains. So grains do their work and return to idle, and when they return to idle, they will eventually be recycled, so they no longer really exist. When a grain is needed at a later time, it will just be spun up anywhere else. Yes? So the question is: is this distributed Erlang only? The answer is no. I wrote it on a single node, and it ran on a single node. Can you run it on two nodes with distributed Erlang? The answer is yes, and no. If you have two nodes and they don't communicate with each other, then you cannot have the grains on one silo talk to the grains on another one, so it wouldn't make sense. If you're asking whether, beyond distributed Erlang, you can use another technology to achieve the same function, so you have grains that are linked, but you don't use distributed Erlang, the answer is yes. lasp_pg itself supports using something other than distributed Erlang.
Does that answer the question? Anything else? Well, if there are no further questions, then thank you very much for coming today.