Welcome to this new Kalix webinar. I'm Renato Cavalcanti, principal engineer at Lightbend, and today I'll talk about choreography sagas. There are two types of sagas, choreography and orchestration. Today I will only cover choreography sagas; in an upcoming webinar I will talk about orchestration sagas. In both cases it's about implementing a long-running process in a distributed system: a process that spans more than one transaction.
When we think about Kalix entities, they live in their own transaction boundary. You cannot mutate two entities at the same time. You mutate the first one and then you have to move to the second one. Each one lives isolated in its own transaction boundary.
Now, when we are building a choreography, the abstractions that Kalix gives us are the entities on one side, your strictly consistent domain model, and subscriptions on the other side. Subscriptions are actions to which you add a subscription annotation, telling Kalix either that you want to receive the events from an event-sourced entity or that you want to receive state changes from a value entity. In both cases you use an action, you add a subscription, and you tell Kalix: I want to receive updates, the facts that happened in my application. You do that because you want to react to the facts, let them flow through your system, and mutate other entities. So you can imagine that you modify one entity, and then you listen to the events from this entity. When an event is delivered to your action, you can send a command to another entity. That way you connect them and make sure that when one mutates, the other will be mutated as well.
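As a rough illustration of the idea just described, here is a minimal sketch in plain Python. It is not Kalix code: `Entity` and `run_subscription` are hypothetical stand-ins, assumed only for this example, that show one entity being mutated first and a subscriber then turning its events into a command for a second entity.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Entity:
    """Hypothetical stand-in for an entity: it owns its own state and journal."""
    entity_id: str
    state: List[str] = field(default_factory=list)
    journal: List[str] = field(default_factory=list)

    def handle(self, command: str) -> None:
        # One command is one transaction, scoped to this single entity.
        self.state.append(command)
        self.journal.append(f"{self.entity_id}:{command}")


def run_subscription(source: Entity, on_event: Callable[[str], None]) -> None:
    """Deliver every event of the source journal, in order, to the handler."""
    for event in source.journal:
        on_event(event)


a1 = Entity("A1")
b1 = Entity("B1")

# First transaction: only A1 is mutated.
a1.handle("update")

# The subscriber reacts to A1's events and sends a command to B1,
# which is mutated afterwards, in its own transaction.
run_subscription(a1, lambda event: b1.handle(f"react-to[{event}]"))

print(b1.state)
```

The point of the sketch is only the shape of the flow: the two entities never change in the same step, and the subscription is the pipe between them.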
The two entities are mutated in different transactions, but the subscription guarantees that this fact will traverse your system. So in a nutshell, we can say that choreographies are about building the pipes that allow facts to traverse your application.
Let's see a more graphical example. Imagine that here we have a Kalix service with three nodes in production, and three kinds of entities spread over my nodes. I have the A type, depicted here with a yellow circle; the B type, which is yet another kind of entity; and the C type. In my business, whenever something happens to A, I need to listen to that and generate some commands that will hit entities of type B. So for example, something happens in A1, I'm listening to those events, and I generate a command that I send to B1. And when something happens in B1, I want to generate something for C1. Here we can see that those entities are on different nodes. They could be on the same machine; it doesn't matter much for us, because spreading the entities is something that Kalix takes care of for you. The abstraction that we need here is the subscription, which gives us the guarantee that when something happens, I will get notified and can then move forward and execute another action. In this case, something happens in A1, I'm listening to that using the subscription, and then I send a command to B1. B1 is now producing more events that I'm also listening to, and I react and send another command to C1. So there are different steps here: from entities to subscriptions, and from subscriptions back to entities.
Let's see another, more concrete example. I have here a shopping cart. The user already added two items and then removed an item. So that's what I have in my journal: item added, item added, and item removed.
The content of the payload of those events is not important for our example. At some point, the user decides to check out the shopping cart. So a checked-out event is emitted, and I have in my system a checkout subscriber that is listening to events from the shopping cart and taking actions based on them. The use case here is: whenever a checked-out event is emitted, I want to create an order. So now I have one individual entity, the shopping cart, that got checked out, and this event will trigger an action in my system. Under the hood, the subscriber will receive this checked-out event and will create an order. Of course, the order will also emit its own events, and I have another action here, another subscriber, receiving events from the order entity and also taking some action. In this case, whenever the checkout is done, the checkout subscriber gets notified and creates an order. Whenever an order is created, the order subscriber gets notified and initiates a payment. So as we can see, there are three entities, and one action that happened in the shopping cart cascaded into a few other commands down the road. We start with the checkout, we react to that and create an order, we react to that and initiate a payment.
But when we are working with subscriptions, it's important to understand how they work behind the scenes, so you are prepared for the possible errors that we can have when subscribing to events. One important error is what we call a head-of-line error. Here I have some events: event one, two, three, four, and five. It doesn't matter much what is inside those events. They belong to an entity called user entity, and the letters next to them are the identifiers. So I have three different instances here, the events are all together in the journal, and I'm subscribing to those events.
So I have an action with a subscription annotation saying: I want to receive events from the user entity. You define your action, you have your event handler which contains your business logic, and you deploy that into Kalix. What Kalix will do behind the scenes for you is start to read the journal. It will pick event one and deliver it to your event handler, and when you finish processing this event, Kalix will save the offset. As such, if something happens, if you restart that machine for some reason or you redeploy, it will start on the next event, because event one was already processed. We save the offset; we record that event one has already been delivered to your event handler. Then this process keeps going. It keeps consuming the events and delivering them to you in the order that they are found in the journal.
But imagine that when Kalix tries to read event three, it fails to deserialize it. The event is persisted in storage, but for some reason you changed the format, you changed your code, and now what is in storage cannot be deserialized back to the type, to the model that you have in your code. So now it fails to deserialize. This is a kind of problem that you can only solve by fixing your deserialization, bringing back the type that you had before. If you need a migration of your format, you have to add it, because otherwise Kalix won't be able to read this event back; it will always fail. And when the subscription fails, it crashes, waits a little bit, and starts over again. When it starts over, it goes back to event three and fails again. So if you have a serialization issue when reading events from your journal, you have to fix deserialization. In such a situation, you are blocked.
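The offset handling and the deserialization failure described above can be simulated in a few lines of plain Python. This is a sketch under assumptions, not how Kalix is implemented: the `Subscription` class, the dictionary-based journal, and the two deserializers are all hypothetical names introduced for illustration. Event three is stored in an older format (`mail` instead of `email`), so the strict deserializer fails on it and the offset never moves past it until a migrating deserializer is deployed.

```python
from typing import Callable, Dict, List, Tuple

Raw = Dict[str, str]
Event = Tuple[str, str]

journal: List[Raw] = [
    {"user": "a", "email": "a@example.com"},
    {"user": "b", "email": "b@example.com"},
    {"user": "a", "mail": "a2@example.com"},  # event three: older format
    {"user": "c", "email": "c@example.com"},
    {"user": "b", "email": "b2@example.com"},
]


def strict_deserialize(raw: Raw) -> Event:
    # Raises KeyError on the old format, like a broken deserializer would.
    return raw["user"], raw["email"]


def migrating_deserialize(raw: Raw) -> Event:
    # The fix: also accept the field name used by the old format.
    email = raw["email"] if "email" in raw else raw["mail"]
    return raw["user"], email


class Subscription:
    def __init__(self, journal: List[Raw],
                 deserialize: Callable[[Raw], Event],
                 handler: Callable[[Event], None]) -> None:
        self.journal = journal
        self.deserialize = deserialize
        self.handler = handler
        self.offset = 0  # number of events already processed

    def run(self) -> bool:
        """Process from the saved offset; return False if the run crashed."""
        while self.offset < len(self.journal):
            try:
                self.handler(self.deserialize(self.journal[self.offset]))
            except Exception:
                return False  # offset not saved: the same event is retried
            self.offset += 1  # save the offset after successful processing
        return True


delivered: List[Event] = []
sub = Subscription(journal, strict_deserialize, delivered.append)

first_attempt = sub.run()    # fails on event three
second_attempt = sub.run()   # restart: resumes at event three, fails again
blocked_offset = sub.offset
delivered_while_blocked = len(delivered)

sub.deserialize = migrating_deserialize  # deploy the fix
after_fix = sub.run()
```

Events four and five are perfectly valid, yet they are not delivered while event three keeps failing. That is the head-of-line effect.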
Kalix won't be able to continue delivering events to your event handler because it is stuck on event three. So you have to fix it.
Another kind of error that you also have to fix is when your event handler has a programming error and fails whenever it sees event three. Kalix picks the event from the journal, deserializes it successfully, and delivers it to your event handler, but then your event handler has a bug and it fails. Again, this subscription is failing. Kalix will keep trying to deliver event three to you and it will keep failing. So here again, you have to fix your code and redeploy.
The third kind of error is when we read the event from the journal, we deserialize it, we deliver it to your event handler, you do your processing and return, and then Kalix needs to save the offset. For some reason, such as network issues, or because you shut down the application at that moment, undeployed it, or removed all your nodes, Kalix is not able to persist that offset. When this process comes back to life, it will try to redeliver event three, because for Kalix this event was never delivered. We missed the chance to save the offset, so we deliver it again to your event handler. That's why, when you're building subscriptions, you have to make sure that your event handlers are idempotent. Whatever you do inside, you need to be sure that you can receive this event more than once without it impacting your data. So if you're receiving the event and sending commands to some other entity, that entity must be able to detect that this command was already sent.
Now, of those three cases, the first and the second are blocking, because they block your flow.
The subscriber cannot move forward until you fix your code. The third one is more of an intermittent error. If it happens, the subscription will just recover, but with the consequence that your event handler may see event number three more than once. This is an important aspect when you're building choreography sagas, because subscriptions are how we build them.
I have a little demo so we can navigate through the code and you can see how it works in a real application. As a use case, I will show you how you can solve one of the major challenges in event-sourced applications, called set-based consistency validation. It goes like this. Let's say that we have a user entity that is an event-sourced entity. In an event-sourced system you don't have a table with columns for user ID, name, and email. What you have is a journal with events. Let's say that the user ID is the unique identifier for this entity, and you have the name, fine. And you have the email address, but you want the email address to be unique across all the user entities. You don't want two users using the same email address. So how can I build this uniqueness constraint if all that I have is a journal with events? I don't have an email column in a table where I can declare a unique constraint. I cannot have that in an event-sourced system. And by the way, all the entities live in their own transactional boundaries. So when I create an entity, I cannot say: before creating it, have a look at all the other entities in the system and check whether any other user is using this email address. It's not possible to do such a thing. So we need to think of our application in another way.
We need to solve this problem using something quite different from what we would usually do with a relational database. For this example, we're going to use a choreography saga. What are the pieces that we're going to have? First, we're going to have a user that will be an event-sourced entity. Then I will have another entity, in this case a simple one, a value entity, that will act as a kind of barrier for me. It will be just a unique email. The ID of this value entity will be the email address, and I use it as a barrier to track whether the address is already in use or not. If it was already created, if it's in Kalix, in the storage, I will detect that and I will be able to say: no, I cannot create this user because this email is already in use. Otherwise, if it was not yet created, I will be able to create a value entity for this email address and then move forward and create the user. And I will have two actions with subscriptions, subscribing to the events from the user entity and to the state changes from the unique email value entity.
I also use a timer, because there is an error situation, which we will simulate, where we reserve an email but we don't create the user. I reserved the email, that's one transaction, and I need to create the user, but let's imagine that the user is not created for some reason. Now I have a reserved email that I need to free. I will use a timer for that.
So let's first have a look here. We're going to have an application controller that will get incoming requests from the outside to create a user. But before creating the user, it will try to reserve an email address. For that, we're going to have this unique email entity that we call first. If it doesn't exist, we create it for the first time, we reserve it, and then we create the user.
Let's have a look at the code. Here is my application controller, in which I receive a request to create a user. Here's the command, but before creating the user, I send a command to this other entity, the unique email entity. I send this command, which is reserve email. That's the ID, the email, and that is the user ID. Here I use the Kalix client to talk to this entity, and the unique identifier for this entity is the email itself. That gives us the guarantee that there is only one unique email entity with this email address. In my whole Kalix system, there is only one associated with this email address.
Let's have a look at the reserve method. First, let me scroll back here. This is my value entity. It's not an event-sourced entity; it's like a key-value entity, and it holds the state of a unique email. The unique email has an address, which is its unique ID, a status, and eventually an owner. The status can be not used, reserved, or confirmed. When I start, it has an address, because that's the unique ID. When I start it for the first time, it's not yet in use and therefore it does not have an owner. When I hit the reserve method, I check: if it's already in use and assigned to another owner, then I emit this error. If it's assigned to this same owner, it means that this email was already reserved for this owner, so I have nothing to do. The last case is that I'm making this reservation for the first time. So I take the email and update the state of my value entity, saying: that's the address, the status is reserved, and that's the user that is trying to use this email. So this owner ID is reserving this email address.
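The three branches of the reserve logic described above can be sketched in plain Python as follows. The code shown in the webinar is not reproduced here: `UniqueEmail`, `Status`, `EmailAlreadyInUse`, and the `reserve` function are names assumed for illustration, and the state is modeled as an immutable value rather than a Kalix value entity.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class Status(Enum):
    NOT_USED = "NOT_USED"
    RESERVED = "RESERVED"
    CONFIRMED = "CONFIRMED"


@dataclass(frozen=True)
class UniqueEmail:
    address: str                      # also the entity's unique ID
    status: Status = Status.NOT_USED
    owner: Optional[str] = None       # set once a user reserves the address

    def is_in_use(self) -> bool:
        return self.status is not Status.NOT_USED


class EmailAlreadyInUse(Exception):
    pass


def reserve(state: UniqueEmail, owner_id: str) -> UniqueEmail:
    # Case 1: in use by somebody else, so the reservation is rejected.
    if state.is_in_use() and state.owner != owner_id:
        raise EmailAlreadyInUse(state.address)
    # Case 2: already assigned to this same owner, nothing to do.
    if state.owner == owner_id:
        return state
    # Case 3: first reservation, record the owner and mark as reserved.
    return replace(state, status=Status.RESERVED, owner=owner_id)


free = UniqueEmail("john.doe@example.com")
reserved = reserve(free, "001")
same_owner_again = reserve(reserved, "001")

try:
    reserve(reserved, "002")
    rejected = False
except EmailAlreadyInUse:
    rejected = True
```

Because the address is the entity ID, there can be only one such state per email address, which is what turns this small entity into a uniqueness barrier.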
Back in the controller, I execute this reserve call, and when it completes, I go further and create the user. What I want to achieve here is: first reserve the email, then create the user, and then have a subscription listening to user events that will see that a user was created and that an email was assigned to this user. There are two choreography sagas here. The first one is the one that we are seeing now. This small saga will say: I saw that the email was assigned to this user, so it's time for me to go to the unique email entity and confirm it.
Let's see the code. Here I have a user event subscriber. It's an action subscribing to events from the user entity. For the moment, we will look only at this method. Whenever an email-assigned event is emitted by the user entity, this code will be called. What we do here is say: an email was assigned to this user, so let me hit the value entity using this email and confirm it. Let's see what this method does. Here I'm back in the unique email entity, and I call this confirm method. If the status of my email is reserved, it's time to confirm. In my subscriber, I just got the information that this email got assigned to a user, so I can confirm it. Here I change the status of my email to confirmed, I save the state, and I'm done. Now, if for some reason, when this method is called, my email is not reserved, if it has some other status, I just ignore it and move forward, because I don't want this method to fail. Remember the idempotency that I was talking about: this event might be delivered more than once to this method. So either I confirm or I do nothing, but I don't let it fail.
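A minimal sketch of that idempotent confirm step, again in plain Python with hypothetical names (`confirm`, `on_email_assigned`, and an in-memory `store` standing in for the entity storage). The same event is delivered twice on purpose, to mimic a redelivery after a lost offset.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    NOT_USED = "NOT_USED"
    RESERVED = "RESERVED"
    CONFIRMED = "CONFIRMED"


@dataclass(frozen=True)
class UniqueEmail:
    address: str
    status: Status = Status.NOT_USED
    owner: Optional[str] = None


def confirm(state: UniqueEmail) -> UniqueEmail:
    # Only a reserved email moves to confirmed.
    if state.status is Status.RESERVED:
        return replace(state, status=Status.CONFIRMED)
    # Any other status is ignored: the method never fails.
    return state


def on_email_assigned(store: Dict[str, UniqueEmail], address: str) -> None:
    """Subscriber side: react to the event by confirming the address."""
    store[address] = confirm(store[address])


address = "john.doe@example.com"
store = {address: UniqueEmail(address, Status.RESERVED, "001")}

on_email_assigned(store, address)   # first delivery
after_first = store[address]
on_email_assigned(store, address)   # redelivery of the same event
after_second = store[address]

untouched = confirm(UniqueEmail("free@example.com"))
```

The second delivery leaves the state exactly as the first one did, so a duplicate event cannot corrupt the data or crash the subscription.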
Now, back to the slides. This is the sunny-day scenario. Everything worked as expected. But of course, life is not like that. Sometimes it will happen that we reserve the email address but fail to create the user. What can we do here? For that, we need to build another small saga that will make sure that this email gets unreserved. We create yet another action, but this time the action subscribes to unique email changes. Whenever it sees that the email was created and is in status reserved, it schedules a timer saying: if it stays like that, I will unreserve it.
Let's have a look at this code now. Here is the other subscriber I have, yet another action, subscribing to the value entity of this type. Here I won't get events; the full state of this value entity is delivered to me. The first thing that I do is create a timer ID using the email. The email is unique, as we know, and I just prefix it with timer to make it easier to follow in the logs. If the email is in status reserved, I schedule a timer. Look here: I'm creating a call, but I'm not calling it yet. It's a deferred call in Kalix. It's not being executed; it's lazy. I create this call and ask Kalix to create a timer for me using this ID, this delay, and this call. For the demo, I set it to 10 seconds, but you can set it to five hours if you want. You can use a very large number, so that the email gets unreserved far in the future. You don't want to unreserve it immediately, because you want your system to be able to process the request. So what we are telling Kalix is: this is the timer ID; in 10 seconds, please come and call this unreserve method. Let's have a look at the unreserve method.
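Before moving on to the unreserve method, the timer scheduling just described can be sketched like this. It is a plain Python simulation with an explicit fake clock, not the Kalix timer API: `TimerScheduler`, `schedule`, `advance_to`, and `on_unique_email_change` are hypothetical names, and `functools.partial` plays the role of the deferred call, something that is built now and only executed when the timer fires.

```python
from functools import partial
from typing import Callable, Dict, List, Tuple


class TimerScheduler:
    """Hypothetical in-memory timer registry driven by an explicit clock."""

    def __init__(self) -> None:
        self.timers: Dict[str, Tuple[float, Callable[[], None]]] = {}

    def schedule(self, timer_id: str, delay: float,
                 call: Callable[[], None], now: float) -> None:
        self.timers[timer_id] = (now + delay, call)

    def advance_to(self, now: float) -> None:
        # Fire and remove every timer whose due time has been reached.
        due = [tid for tid, (at, _) in self.timers.items() if at <= now]
        for tid in due:
            _, call = self.timers.pop(tid)
            call()


unreserved: List[str] = []


def unreserve(address: str) -> None:
    # Stand-in for the command sent to the unique email entity.
    unreserved.append(address)


def on_unique_email_change(scheduler: TimerScheduler, address: str,
                           status: str, now: float) -> None:
    timer_id = "timer-" + address  # prefix makes it easy to spot in logs
    if status == "RESERVED":
        deferred_call = partial(unreserve, address)  # built, not executed
        scheduler.schedule(timer_id, 10.0, deferred_call, now)


scheduler = TimerScheduler()
on_unique_email_change(scheduler, "john.doe@example.com", "RESERVED", now=0.0)
calls_right_after_scheduling = list(unreserved)

scheduler.advance_to(5.0)
calls_after_5s = list(unreserved)

scheduler.advance_to(10.0)
calls_after_10s = list(unreserved)
```

Nothing is called when the timer is registered or after five seconds; the unreserve call only runs once the ten-second delay has elapsed.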
In the unreserve method, I say: if the status is reserved, I move it back to not used, the empty state that I have. If the status is confirmed or not in use, I don't touch it; I just return done, because I don't want this method to fail either. Either I'm in the reserved state and I bring it back to the initial state, or I just move on. The empty state here just erases the user: there is no owner anymore and it's not used. That covers the situation where we fail to create the user. After some time, the timer that we scheduled will unreserve this email address so it can be used by another user.
Now, what we really want is this: we create the unique email, and then we create the timer to eventually clean up. Then we create the user, and when we create the user, the user event subscriber confirms the email address. When the email is confirmed, the unique email subscriber also sees this change. It sees that an email was updated and is now confirmed. At that point it can say: I had a timer that I scheduled to unreserve this email, but it's already confirmed, so now I can remove the timer.
So let's go back here. This part here is where I scheduled the timer. And here is where I receive yet another update of my unique email and say: it's confirmed, good, I can now cancel the timer. It's important to understand what happens if you don't do that. Imagine that you just return here. I could even comment out this part and say: if it's reserved, I create the timer; if not, I just ignore it. That will also work. Imagine that you don't cancel the timer; at some point, because you registered a timer before, Kalix will run this timer and call this method.
Your email address is confirmed. Kalix calls this method and it will say: it's not reserved, it's confirmed. So it moves to the else branch and says: I have nothing to do. I don't need to change the state of my unique email address, so you just move on. You can leave the timer there; it will fire and your system will still be in a correct state. But it's good practice to clean up timers that are not necessary anymore. If it's an obsolete timer, just remove it, because you are saving resources. You are saving calls in your system, and all that has a cost. So it's better to just remove it.
What else can I show here? Let me see. Yes, that's the flow. We can see here that I have two subscriptions, not one. I can say that we have two tiny sagas. One makes sure that data flows from the user back to the unique email entity to confirm it. The other is based on a timer, in which I listen to something and take a decision: I have an email that is reserved, and in a few minutes I will unreserve it. If it's already confirmed, the unreserve will be ignored. If it's still reserved, I will free that email address.
So let's go back to the calls. I want to run it to show what happens in this application. I want you to see the saga being executed in the background. The application is running. The first thing that I want to check is this call here: what is the status of this email address? Basically, I hit this email address and ask for the status, and it says it's not in use. There is no owner. This email address is free. So let's create a user. This one, 001, John Doe, living in Belgium, and it will use this email address.
If I check here, I can see that the saga didn't complete yet, because this email address is in status reserved. Let's go back to our diagram. We are at this point here. We created the email address. I believe that at that point the email subscriber already got the notification that a unique email address was reserved, and it scheduled a timer to unreserve it. The user was created, but when I did this call, the user event subscriber had not yet received the information that the user got an email assigned. So it had not yet confirmed my email address. That is the point in time we were at. Of course, while I talk, everything has completed already. I can now check this one and, as we can see, it's confirmed.
If you check the logs, you'll see that the application controller got a request. Then it tried to reserve the email address: it hit the unique email entity to reserve it. Then it moved forward and created the user. Again, that's me here, interacting with the controller. The email subscriber received the information that an email was reserved and scheduled a timer to unreserve it. In the meantime, the user event subscriber received the information that an email was assigned to a new user, so it went back, hit the unique email entity, and confirmed that it's in use. Then the unique email subscriber received the information that this email is confirmed, meaning that it doesn't need the timer anymore. So it says: email is already confirmed, delete the timer, and the timer is deleted. That flow happened and we end up in this situation: the email got confirmed, we marked the unique email as confirmed, and then we canceled the timer. So the two sagas are working in cooperation with each other.
Now, let's simulate an error. I have this other email address here.
It's not in use. I call it invalid@acme.com, not in use, and I try to create an invalid user. Here I'm using a random ID and, as you can see, the payload is missing the name. If I do this call, what's going to happen is that we first reserve this email address. But then when we try to create the user, it will fail because I'm missing the user name. I have a condition when I create my user that the name needs to be filled. I did that on purpose so we can simulate an error. So if I go back here and try to create this user, let's see what happens. Let's check. It's the invalid address, and it's reserved for this random ID. And here is the error, because I had a wrong payload for my user. In the meantime, the timer kicked in, and now the email is unreserved. Let's check the status of the email again: it's back to not used. So what happened is this situation: I reserved the email address, I failed to create the user, and then that other saga, the unique email subscriber, unreserved it.
I have here another user. We remember that this user was created and its email is confirmed to be in use by this user, user 001. Let me try to create another user using the same email address. That will fail. If I go back to my controller, what happens is that I try to create the second user, I try to reserve the email, and this method fails because this email is already in use, and not by the user that I'm trying to create. So I return an error and, as such, this call fails. Because this call fails, this block here won't be executed, and I move to the exception handling, where I emit a message: this email is already reserved.
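The three scenarios from the demo (successful creation, failed creation followed by the timer cleanup, and a duplicate email) can be replayed with a small in-memory sketch. Everything here is assumed for illustration: the dictionaries stand in for entity storage, the function names are hypothetical, the subscribers and the timer are invoked by hand instead of running in the background, and the addresses are made up.

```python
from typing import Dict

emails: Dict[str, Dict[str, object]] = {}  # address -> status and owner
users: Dict[str, Dict[str, str]] = {}      # user ID -> name and email


class EmailAlreadyInUse(Exception):
    pass


def reserve(address: str, owner_id: str) -> None:
    entry = emails.get(address, {"status": "NOT_USED", "owner": None})
    if entry["status"] != "NOT_USED" and entry["owner"] != owner_id:
        raise EmailAlreadyInUse(address)
    if entry["status"] == "NOT_USED":
        emails[address] = {"status": "RESERVED", "owner": owner_id}


def confirm(address: str) -> None:
    # Idempotent: only a reserved address changes.
    if emails.get(address, {}).get("status") == "RESERVED":
        emails[address]["status"] = "CONFIRMED"


def unreserve(address: str) -> None:
    # Idempotent: a confirmed or unused address is left alone.
    if emails.get(address, {}).get("status") == "RESERVED":
        emails[address] = {"status": "NOT_USED", "owner": None}


def create_user_entity(user_id: str, name: str, email: str) -> None:
    if not name:
        raise ValueError("name is required")
    users[user_id] = {"name": name, "email": email}


def create_user(user_id: str, name: str, email: str) -> str:
    """Controller: reserve the email first, then create the user."""
    try:
        reserve(email, user_id)
    except EmailAlreadyInUse:
        return "error: email already reserved"
    try:
        create_user_entity(user_id, name, email)
    except ValueError as err:
        return f"error: {err}"  # the reservation is left for the timer
    return "ok"


# Sunny day: the user is created, then the user event subscriber confirms.
r1 = create_user("001", "John Doe", "john.doe@example.com")
confirm("john.doe@example.com")

# Error case: the name is missing, so the email stays reserved ...
r2 = create_user("random-id", "", "invalid@example.com")
status_after_failure = emails["invalid@example.com"]["status"]
# ... until the timer fires and frees it.
unreserve("invalid@example.com")
status_after_timer = emails["invalid@example.com"]["status"]

# Duplicate: a second user tries to take a confirmed address.
r3 = create_user("002", "Jane Doe", "john.doe@example.com")
```

Note that in the failure case the controller does not try to undo the reservation itself; the cleanup is left entirely to the timer-based saga, which is the design shown in the demo.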
Now there is another call here, and I leave that for you if you want to explore a little bit more: how can I change an email address? It's a similar situation. Here I'm trying to modify user 001 by changing its email address, and it's the same. I first need to reserve the new email address, and if I can reserve it, I can move on and change my entity. Once I change my entity, I need to unreserve the previous email address. The previous one is Doe at Acme and I want to move to this other one. So I first reserve this one, I change the user, and then I unreserve the previous email address so it can be used by someone else. There is also saga logic for that in this project, but I won't show it. I will leave it for you to explore.
So let's go back here for a small recap of choreography sagas. We build choreography sagas by composing entities, be they event-sourced entities or key-value entities. We listen to events or to state changes using subscriptions, and we let the data flow through our applications. One important characteristic of a saga is that once it triggers, it's a flow that propagates through your system, like a wave of commands, events, and subscriptions that are chained to each other and produce a final result. So it's something that runs in the background for you. You have to encode everything. Something triggers the first action, and then the first command creates a wave of changes in your system.
It's very important to make sure that each part, each interaction, is well tested and that you have thought about all the possible error conditions. Since it's a background process, you don't want it to keep failing in the background, because when it starts to fail in the background, it will block your saga, and it may block other sagas as well. If you have a head-of-line error in a subscription, it's not only that single event that is blocked: all the events that happen after it will also be on hold, because your subscription first needs to finish processing the event where it's stuck. So head-of-line errors block everything, and you have to be aware of that and make sure that you don't let them happen.
Now, what are the drawbacks of choreography sagas? I think it's already implicit in what I just said. There are two things here. They are hard to debug, and as a background process they don't give you total visibility of what is happening. You can have your tracing, you can have your logs, but you don't have a place where you can go and see: my saga is at this point. The fact that it's hard to track is what makes it difficult to debug. So if you want more fine-grained control over your sagas, it's better to implement them using orchestration, and that's what we're going to see in the next webinar. In the next webinar I will talk about orchestration sagas, and I'll use exactly the same use case, but I will implement it using Kalix workflows, which allow you to build orchestration sagas.
If you want to check out the source code and explore a little bit more, you can scan this QR code. It points to a GitHub project where you have the source code. Okay, thanks for your time. We invite you to join our Slack channel. If you have more questions, you can reach out to us there. Otherwise you can also scan that other QR code and discover more about Kalix. Thank you very much.