Welcome to my talk. My name is Gyanind and I work for GreyOrange. So what do we do at GreyOrange? In a nutshell, we are into warehouse automation. In any warehouse there is always material movement, the logistics, and we automate that. To do it we use robots, which we call Butlers. The alternative to automation is a manual warehouse, where a person goes to different racks and picks up items. In our case, the racks are brought to a particular place, which we call a PPS, a put and pick station, and there is a PPS operator, a human being, who picks the items. As a result you can improve your put and pick operations multi-fold. We did some benchmarking, but we don't have any hard numbers right now.
Let's look at a typical warehouse operation. Initially, and from time to time, you replenish the stock. This is the put operation, and it involves a very complex algorithm, because you have to know where to keep things, based on affinity and many other factors. There are lots of business rules as well; for example, there are products that cannot be kept together. Similarly, the pick operation processes customer orders, as in typical e-commerce. This is also a very complex algorithm, because when a rack comes to a PPS we want to make the maximum number of picks from that single rack; we are not talking about one order, we are talking about multiple orders.
So we have a put task and a pick task, and there is a task manager. The put task and pick task only decide which rack to pick and where to bring it. Eventually someone has to assign a butler, a robot, to perform the task. It is like when you take an Uber.
A particular car is assigned to you, depending on which one is nearest. The car does not have to be free: if you are at the airport and a car is dropping someone off there, you might get assigned that car. We have a similar kind of algorithm to assign a particular butler to a task, and that is the job of the task manager. Once that is done, the most important thing is navigation. The butler has to go to that particular rack, bring the rack to the PPS, and once all the pick or put operations are over, take it out of the PPS and bring it back to the same location or to another location.
Let's say this is my typical warehouse. You can see there are certain locations which are empty and certain locations which have racks, and you have a number of PPSs and butlers. The normal ratio of butlers to PPSs is five to one, so if you have 100 butlers, you will have about 20 PPSs. We have seen that in a manual warehouse a person can pick about 12 items per hour on average, whereas in an automated warehouse this can be in the hundreds. So there can be a multi-fold improvement.
Now let's talk about the navigation task. What you see here is that the task manager has assigned a butler to this task. Our criterion is that we want to maximize throughput, which means the PPS operator should be able to pick as many items as possible. To do that, I want to always have a rack in front of the PPS operator, and once all the picks are done, that rack is immediately removed and taken to some other place so that the next rack can come in front of the PPS operator.
If I can achieve that, I know I can maximize my throughput. Suppose we say it doesn't matter, I can keep bringing butlers here so that I always have a rack here. But you can see that around this area there is a lot of activity going on: butlers are coming in and butlers are going out. If I have too many butlers here, it may happen that there is no place for this butler to go out, because there are already butlers standing there. And if this butler cannot go out, other butlers cannot come in. So I cannot have too many. If too many is not a good solution, let's have too few. But if I have too few, I cannot ensure that there is always a rack in front of the PPS operator. What I need is to ensure that we have just enough. What is that number? It is decided by the type of warehouse, the size of the warehouse and various other things. If I can have just enough navigation tasks that they do not create clutter here, and yet almost ensure that there is a rack in front of the PPS operator, I think we are home.
So this is our requirement, our real-life use case, and to achieve it we use GenStage. GenStage is not really a part of Elixir; it comes as a dependency. If you install Elixir, GenStage will not be there, but you can add it as a dependency through Mix. Let me give you some idea of GenStage, and then we'll see how we use it to solve our problem.
Normally this is how we create a data pipeline. You have a producer; in Erlang and Elixir, everything is a process which holds its own state. The producer initiates the data flow, and then you can have intermediate processes.
These are optional. In case you want to make any transformations, you can have several of them, and ultimately the data comes to a consumer. In this kind of scenario we are not really controlling the amount of data that gets pumped into the pipeline. But suppose we change the direction: the consumer decides whether it is ready to process data or not, and the producer sends data based on the consumer's demand. It is just a shift. If we do that, we can regulate the amount of data coming to a consumer. And this is basically GenStage: the consumer controls the number of events that flow from the producer to the consumer.
But how does this really behave? At every stage you can have multiple producers, multiple producer-consumers in between, and multiple consumers at the end. Each one is making a demand, and based on the demand from its various consumers, each stage dispatches events. When I started reading about it, I found it very complex, and that was my WTF moment. So I took a break. I'm telling you this through my personal journey. Then I said, okay, I don't understand it, that's fine; let me write some code and see how it behaves. That is what I usually do. Whenever I get stuck, I write code, because at times you cannot infer everything from the documentation, and when you write code you can target a specific use case. So I started writing code.
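The shift from a push pipeline to a demand-driven one can be illustrated with a small sketch. This is a plain Python model, not GenStage itself: the `Producer` and `Consumer` classes and the `handle_demand` and `run_once` names are hypothetical and chosen only to mirror the idea that the consumer states how much it can take and the producer never sends more than that.

```python
from collections import deque


class Producer:
    """Holds pending events and hands them out only when asked."""

    def __init__(self, events):
        self.pending = deque(events)

    def handle_demand(self, demand):
        # Return at most `demand` events; never push more than was requested.
        batch = []
        while self.pending and len(batch) < demand:
            batch.append(self.pending.popleft())
        return batch


class Consumer:
    """Asks for a fixed number of events each time it is ready for more."""

    def __init__(self, producer, demand):
        self.producer = producer
        self.demand = demand
        self.processed = []

    def run_once(self):
        # The consumer starts the exchange by stating how much it can take.
        batch = self.producer.handle_demand(self.demand)
        self.processed.extend(batch)
        return len(batch)


producer = Producer(range(10))
consumer = Consumer(producer, demand=3)
sizes = []
while True:
    n = consumer.run_once()
    if n == 0:
        break
    sizes.append(n)
print(sizes)  # [3, 3, 3, 1]
```

However many events the producer holds, the consumer only ever receives what it asked for, which is the property the rest of the talk relies on.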
What I found was this. I have a consumer. The consumer sends a demand to its immediate producer. This intermediate process acts as a producer as well as a consumer: it is the producer for this consumer, but it is a consumer of this producer, so it also sends a demand. But the demand sent by this consumer to this producer has no correlation with the demand sent by the intermediate process to its own producer.
Second, let's look from the top. The producer gets some events. Each producer has a dispatcher, which is a strategy; we'll talk about dispatchers in more detail a little later. Based on the dispatcher it is associated with, this producer sends events to this process, which is acting as a consumer. Once this process has received events, it becomes a producer, and when it acts as a producer it really doesn't matter how it received those events. So for all these intermediate processes, there is no correlation between the role as a consumer and the role as a producer. They are independent. That means that to understand the whole thing, I can treat this producer and this consumer as an independent unit, then this producer and this consumer as another independent unit, and similarly this producer and this consumer. So my light went on and I was happy, but you have to write code to really understand this.
I think I have already explained this slide, so let me skip it. This is the same thing I was explaining: if I try to understand these units separately, it will probably be easier. Now here I have multiple consumers and multiple producers. Each consumer is going to send demands, or what we call a subscription.
Each consumer sends a subscription to each producer, and these subscriptions, like the demands, are independent. A consumer does not need to make the same demand to all the producers. Similarly, each producer receives a subscription, or demand, from each consumer. They are independent, and it is up to you: based on your use case, you decide what kind of demand and what kind of dispatch you want. Another important thing is that these two producers are not aware of each other. Similarly, the consumers are not aware of each other. A consumer knows both producers, and a producer knows all its consumers, but producer one does not know what producer two is doing, and consumer one does not know what consumer two is doing. So for all practical purposes I can treat them as two separate units: this is producer one with its consumers, and this is producer two with the same set of consumers. Because they are independent, I can always do that.
Now let's talk about just one of them. The whole complex structure can be broken into a very small structure which I can easily understand. What is the producer's characteristic? Because there are multiple consumers, we always associate a dispatcher with the producer, which dispatches events once the producer receives them. The dispatcher is a strategy, and based on it the producer dispatches events to its consumers. A consumer, in turn, is configured at subscription time with what we call a maximum demand and a minimum demand. These are the two things you need to specify. If you don't specify them, the maximum demand is 1000 and the minimum demand is 500.
This is what is sent to the producer when the consumer subscribes, and the producer maintains the subscription of each consumer, including its maximum demand, separately. Another thing is that the consumers subscribe one by one. Let's say consumer one subscribes first, then maybe consumer three, then consumer two. This is called the subscription order, and it plays a role.
Let's assume we have subscriptions from all three consumers. On subscription, the maximum demand is sent to the producer as the first demand. So if consumer one's maximum demand is 10, the demand from consumer one to the producer will be 10. Let's say for consumer two it is 20, so the producer gets a demand of 20 from consumer two, and for consumer three it is 30, so the producer receives a demand of 30. That is the initial demand. Whenever events come, the producer sends them to these consumers based on the dispatcher.
But there are certain rules. The number of incoming events could be very large; the producer can receive hundreds of events together. The first rule is that it will never send any consumer more events at a time than max demand minus min demand. Let's say in this particular case the max was 10 and the min was four. Then at any one time it will not send more than six events to consumer one, and likewise for the other consumers. The second rule concerns the current demand. Initially the current demand of consumer one was 10. Let's say the producer has sent it five events, so the current demand of consumer one reduces to five. A little later it receives another three, so the current demand becomes two.
Whenever its current demand drops to the minimum demand or below, the consumer sends its next demand, and the value of that demand is maximum demand minus current demand, which in this case is eight. That is how it operates. Any questions here? No, so I'll keep going.
Now we are going to look at the different kinds of dispatcher. GenStage has a behavior called Dispatcher, which has about six callback functions, and it ships with three implementations: the demand dispatcher, which is the default, the broadcast dispatcher, and the partition dispatcher. We'll look at these three one by one.
With the demand dispatcher you have multiple consumers, and the current demand of each consumer decides which one gets events. The consumer which is greediest, the one with the largest current demand, gets the events. Let's say the current demand for consumer one is 10, for consumer two it is six, and for consumer three it is again six. Consumer one gets the events because it is the greediest right now. If all three have the same current demand, it goes by subscription order; like I said, subscription order is important.
There's no max and min? What do you mean by that? He's saying that there is a constant demand. I'm still not sure I understand your question. When you do a subscription, you have to give a max demand and a min demand. Yes, that is mandatory. If the consumer doesn't set them, it takes the default values, where max is 1000 and min is 500. So I don't think we can avoid it.
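The demand bookkeeping and the demand dispatcher's choice of consumer can be checked with a small model. This is a Python sketch of the rules as described in the talk, not GenStage's implementation: the `Subscription` class, `deliver`, `batch_limit` and `pick_consumer` are hypothetical names, and the assumption that a new demand is sent once the current demand reaches the minimum or drops below it is what makes both worked examples in the talk (10/4 and 10/9) come out as stated.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    name: str
    max_demand: int = 1000   # default quoted in the talk
    min_demand: int = 500    # default quoted in the talk
    current: int = 0         # outstanding demand as tracked by the producer

    def __post_init__(self):
        # On subscription, the first demand equals max_demand.
        self.current = self.max_demand

    def batch_limit(self):
        # Largest number of events sent to this consumer at one time.
        return self.max_demand - self.min_demand

    def deliver(self, count):
        # Record a delivery and return (events delivered, follow-up demand).
        count = min(count, self.batch_limit(), self.current)
        self.current -= count
        if self.current <= self.min_demand:
            ask = self.max_demand - self.current
            self.current += ask  # the producer now owes max_demand again
            return count, ask
        return count, 0


def pick_consumer(subscriptions):
    # Largest current demand wins. max() returns the first maximal item,
    # so a list kept in subscription order breaks ties by that order.
    return max(subscriptions, key=lambda s: s.current)


c1 = Subscription("consumer_1", max_demand=10, min_demand=4)
print(c1.batch_limit())  # 6
print(c1.deliver(5))     # (5, 0): current demand is now 5
print(c1.deliver(3))     # (3, 8): current demand fell to 2, so 8 more are requested

subs = [Subscription(f"consumer_{i}", 10, 4) for i in (1, 2, 3)]
subs[1].current = 6
subs[2].current = 6
print(pick_consumer(subs).name)  # consumer_1, the greediest
subs[0].current = 2
print(pick_consumer(subs).name)  # consumer_2, tie broken by subscription order
```

The list order stands in for subscription order here; a real dispatcher keeps its own record of who subscribed first.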
So this is how the demand dispatcher works, which is the default. Now let's see the second one, broadcast, which is fairly simple. Like I said, all the consumers send their subscriptions, and you have multiple consumers. Let's say consumer one is the first consumer to send its subscription. You can think of it as a master consumer, and you can treat this as a producer with a single consumer. It works just like the demand dispatcher, except that whenever the producer sends events based on consumer one's maximum and minimum demand, the same events are also sent to the other consumers. It's a broadcast; that is the difference. So the behavior is dictated by the producer and consumer one. The other consumers, consumer two and consumer three, are like siblings or parasites. They don't play any role; they just receive the events. What is the use case for this? I don't know. I've not used it, so I can't tell you right now.
Now the partition dispatcher. This is interesting, and this is what I used in real life. With the partition dispatcher, the producer has a function, and using that function it can partition all your events into a set of mutually exclusive partitions. In our case the partition is the PPS. A PPS is a coordinate, which is unique for each PPS, so I can partition all my events based on that coordinate. Let's say I have a set of events. I partition it and create subsets, each belonging to one partition. So effectively my main producer is nothing but a set of small producers, each belonging to one consumer. And after that, it behaves like a demand dispatcher.
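As a rough illustration of the partitioning step, the sketch below groups navigation tasks by their PPS coordinate. It is a Python model only; the `partition_events` helper, the task dictionaries and the coordinates are made-up examples, not data or functions from the talk or from GenStage.

```python
from collections import defaultdict


def partition_events(events, key_fn):
    """Split events into mutually exclusive subsets, one per partition key."""
    partitions = defaultdict(list)
    for event in events:
        partitions[key_fn(event)].append(event)
    return dict(partitions)


# Hypothetical navigation tasks; "end" is the PPS coordinate, unique per PPS.
tasks = [
    {"id": 1, "start": (4, 7), "end": (0, 2)},
    {"id": 2, "start": (9, 3), "end": (0, 9)},
    {"id": 3, "start": (6, 6), "end": (0, 2)},
]

by_pps = partition_events(tasks, key_fn=lambda task: task["end"])
print({pps: [t["id"] for t in ts] for pps, ts in by_pps.items()})
# {(0, 2): [1, 3], (0, 9): [2]}
```

Each subset can then be served to its own consumer according to that consumer's demand, which is the "set of small producers" picture from the talk.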
That means this producer creates a subset of events which should go here. How many events out of that subset will go? It depends on the consumer's maximum demand, minimum demand and current demand, just like with the demand dispatcher.
That's a good question. The question is, how do you get fault tolerance? There are two aspects to this. First, OTP: GenStage is built on top of GenServer, so you always make it part of a supervision tree, and a stage can be restarted. But that's not the complete answer. Let's say it has not restarted for some reason; say I mark it as transient in my supervisor. Let's assume this consumer has died and there are events meant for it. They will not be lost. The producer will keep them. If that particular consumer is not available, its events are not thrown away. That is the complete answer.
Now let's see our real-life example. We have a navigation task. A navigation task consists of many parameters, but I'm going to talk about two: a start location and an end location. You can see that this is my start location and this is my end location, and each PPS has an end location which is unique. In my navigation task I need to specify these so that we can calculate the path and the butler knows how to reach here. This is my producer, and it gets bombarded with all the navigation tasks. Using the partition dispatcher, it sends each task to a particular PPS. Now how do I manage this? The number of tasks here could be very large. How do we ensure that this PPS, or any of these PPSs, is not bombarded with too many navigation tasks? Because otherwise we're back to the same thing.
Near this PPS we would have too many butlers, and that can create a deadlock. So what we do is find the appropriate value of maximum demand. Let's assume this maximum demand is 10; it all depends on the type of warehouse, the type of orders we are getting and so on. And if the maximum is 10, my minimum demand, although I didn't mention it explicitly, will be nine. Basically what I'm saying is: just give me one navigation task at a time.
Like I said, these stages are all GenServers. So in the state of this one I keep the maximum demand as one parameter, and also the number of navigation tasks that are in progress. Initially that is zero. Now I get the first task. This process is really a kind of PPS coordinator; the actual navigation task is performed by another process, so it just offloads the task to that process. This is done in a particular function called handle_events. The moment you come out of handle_events, since my minimum demand is nine, this consumer sends another demand for one. If there is another task for this particular PPS, the navigation coordinator sends it, and it keeps sending them immediately.
Think about what happens in a navigation task. The butler has to go to a particular location and bring that rack to the PPS, and after that the PPS operator has to pick all the items. On average that can take 30 seconds to a minute, because it's a physical device which has to move. The interaction between this PPS server and the navigation coordinator, on the other hand, takes a fraction of a millisecond. So I get the first task, it is offloaded, and immediately I send another demand; I get the second task, and that is offloaded too. All these tasks are happening almost concurrently.
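The effect of choosing a maximum demand of 10 and a minimum demand of 9 can be traced with a small model. This is a Python sketch under the same assumptions as the talk's description, not GenStage code: the `PpsConsumer` class and its `receive` method are hypothetical, and appending to a list stands in for handing the task to a separate worker process.

```python
class PpsConsumer:
    """Models a per-PPS consumer that takes one navigation task at a time."""

    def __init__(self, max_demand):
        self.max_demand = max_demand
        self.min_demand = max_demand - 1   # leaves a batch size of one
        self.outstanding = max_demand      # the first demand equals max_demand
        self.in_progress = 0               # navigation tasks currently running
        self.offloaded = []

    def handle_events(self, tasks):
        # Stand-in for offloading each task to a separate worker process.
        for task in tasks:
            self.offloaded.append(task)
            self.in_progress += 1

    def receive(self, tasks):
        """Deliver tasks in batches and return the follow-up demands sent."""
        demands = []
        batch_size = self.max_demand - self.min_demand
        for i in range(0, len(tasks), batch_size):
            chunk = tasks[i:i + batch_size]
            self.outstanding -= len(chunk)
            self.handle_events(chunk)
            # After handle_events returns, top the demand back up if needed.
            if self.outstanding <= self.min_demand:
                ask = self.max_demand - self.outstanding
                self.outstanding += ask
                demands.append(ask)
        return demands


pps = PpsConsumer(max_demand=10)
print(pps.receive(["task_1", "task_2", "task_3"]))  # [1, 1, 1]
print(pps.in_progress)                              # 3
```

Each task arrives on its own and is followed by a fresh demand for exactly one more, so the consumer gets a chance to decide after every single task whether it wants the next one yet.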
So my maximum demand was 10, and I keep track of how many navigation tasks are in progress. At the same time, for each navigation task I can heuristically compute how much time it is going to take to complete; I have some heuristic data from which I can compute that. When I get the tenth navigation task, so that I have 10 in progress, I do not return from my handle_events function right away. I wait for the minimum time required for one of those tasks to complete. I expect that task to be over by then; even if it's not over, that's fine, and I'll explain why. Let's say the minimum I have is about 20 seconds. Then after the tenth navigation task I wait for 20 seconds and only then leave the function, at which point another demand is sent to the navigation coordinator. This is how I control the number of navigation tasks being performed here.
Why do heuristics work for me? We are talking about 10 navigation tasks. If instead of 10 I have 11, it doesn't matter; I'm still not making this area too crowded with butlers. If instead of 10 I have nine, I can still almost ensure that there is a rack in front of my PPS operator. But if instead of 10 I have 20, there is a serious problem with my heuristics, which I need to improve. And if I have five, I cannot ensure that there is always a rack in front of the PPS operator, and again it's a problem with my heuristics. So this is how, using this particular abstraction, GenStage, I can control the activity around the PPS.
So the question is, how do we differentiate different tasks?
When I'm talking about GenStage, I'm implementing GenStage only for this kind of task. Any other task is handled by a GenServer, not by GenStage. I just need to regulate the to-PPS tasks. The from-PPS task, from the PPS to a location, should be executed almost immediately once I have done all the picks or puts; I don't need to regulate it. Is that correct? So I'm creating GenStage only for these to-PPS tasks, that's it. GenStage is an abstraction over a GenServer; internally, GenStage is nothing but a GenServer.
So anyway, thank you. If you have any questions, I'm here, and we are always hiring, so if you are interested, we have a booth outside. Thank you very much.
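The throttling heuristic from the talk, holding back inside handle_events once the number of in-progress navigation tasks reaches the maximum demand, can be written down as a small decision function. This is a Python sketch under stated assumptions: `throttle_delay` is a hypothetical helper, and the per-task time estimates are invented numbers standing in for the heuristic data the talk mentions.

```python
def throttle_delay(max_demand, remaining_estimates):
    """Seconds to hold before returning from the event handler.

    remaining_estimates holds one heuristic completion estimate, in seconds,
    for every navigation task currently in progress at this PPS.
    """
    if len(remaining_estimates) < max_demand:
        # Below the limit: return immediately so the next demand goes out.
        return 0.0
    # At the limit: wait until the quickest task is expected to finish.
    return float(min(remaining_estimates))


# Hypothetical estimates for ten in-progress tasks at one PPS.
estimates = [45, 30, 20, 60, 35, 50, 25, 40, 55, 30]

print(throttle_delay(10, estimates))      # 20.0, the tenth task triggers a wait
print(throttle_delay(10, estimates[:9]))  # 0.0, nine tasks are below the limit
```

Because the wait is only an estimate, the count near a PPS may drift to nine or eleven, which the talk argues is acceptable; a drift to five or twenty would mean the estimates themselves need work.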