So today I'm going to talk about the actor model and, basically, our journey: how we built systems based on the actor model, deployed them to production, what we learned, and what problems you run into. Let me start with a short game with the auditorium. Who has heard about the actor model? Who has used the actor model in toy projects or some one-off things? Who has built production systems on it? Did you notice that I didn't raise my hand when I was talking about toy projects? That's exactly what happened: as we say in Russian, "on a blue eye" — naively — we just built something on a technology we had never worked with before, deployed it to production, and it sort of worked. A quick, more detailed introduction: this is myself about ten years ago. Eleven years of software engineering, used to be full stack, made single-page applications before it was mainstream, then turned backend — with Redmart for about three years now — building Scala, functional programming, response-time-sensitive, high-load applications. This is where to find me. So what's the problem we're solving? This screenshot shows the delivery grid — I adapted it a little bit; it's an actual screenshot from the Lazada application and web browser. Technically it's a bit wider, because here it only shows three days, but we actually show seven. How many of you have ordered something on Redmart or Lazada? Okay, for those who haven't: we give customers two-hour delivery slots, that slot is our promise to the customer, and it is very bad to violate it. So we do everything we can, from the technology perspective, to not violate that promise. And that means we need a very strong consistency model to support it. In particular, we decomposed the problem so that we need linearizability and serializability: on the right of that consistency-model map are the single-object models, on the left are the models that span multiple objects. So in our case we basically needed the strongest consistency model short of strict serializability, and we used a lot of Akka and actor-model features to achieve that. Taking a step back: what is the actor model? It's a model of computation. The theoretical groundwork for the actor model was laid in the early 1970s, and closely related ideas showed up in the Smalltalk language. There's an anecdote that Alan Kay, who is considered the father of the term object-oriented programming, actually meant something much closer to the actor model than to what we call object-oriented today, because in his definition everything is an object, objects exchange messages, objects run concurrently, and so on — which is quite close to what the actor model actually does. There's also an argument on the internet that this was the definition of Smalltalk before Smalltalk-80, and that Smalltalk-80 is object-oriented, but before that it was an actor model. Anyway: actors run concurrently, and actors exchange messages.
Messages are basically the only way of communicating with an actor. You're not supposed to hold a reference to the actor's internals, you're not supposed to send the actor itself over the network, you're not supposed to serialize it — you only send messages. Actors encapsulate internal state: the only way to access that state is to send a message to the actor and wait for a response if necessary. While it's not strictly enforced in Akka or in most other actor-model implementations, the best practice is to only return read-only views of the internal state when you have to. And one actor is a pretty boring actor system, so actors usually come in hierarchies. Hierarchies are one of the mechanisms for improving reliability, by compartmentalizing, handling, and recovering from errors. In Scala, the actor-model implementation is provided by Akka. A few words about Scala, if you don't know it: Scala is a multi-paradigm language running on the JVM. It supports object-oriented programming, it supports functional programming, it supports the actor model, and even if you mix them all together, it doesn't make your eyes bleed. Here's a short example that mixes all three paradigms, or programming styles. In Akka, with the latest version — 2.5, if I'm not mistaken — they finally released the typed actor APIs, which had been in a sort of beta state for quite a long time. But I'll start with the simple, classic actors and illustrate the differences afterwards. Simple actors receive and send messages. Handling of each message is free of concurrency concerns: an actor does not pull the next message from its mailbox until the current one has been processed. Actors can swap their receive function, which is the entry point for messages, and there are multiple applications for that. The better-known ones are modeling finite state machines and, more generally, changing actor behavior. The other notable use is that you can model mutable state using immutable data structures: the only mutable part of your actor is the changing context, the changing receive function. One of the caveats is that for simple actors, receive is a partial function that takes Any as input, so there is no guarantee that a message you send to an actor will be processed in any way. In fact, a common pitfall is that you send a message and expect a response or an action, but the message is either totally unknown to the actor or cannot be processed in its current state — there is no case for it in the partial function currently in use — so nothing happens. If you wait for a response, you won't get one; if you expect some action, nothing happens. That's one of the reasons for the Akka Typed infrastructure, which tries to address two things: receive being a partial function, and type safety in general. In Akka Typed you have to define the type of messages the actor can handle, and your receive function becomes a total function that takes that trait as input and produces the next behavior.
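As a rough sketch of the classic style just described — illustrative names only, nothing from the actual codebase — here is the partial-function receive, the context.become behavior swap, and the sender() reply that Akka Typed does away with:

import akka.actor.{Actor, ActorSystem, Props}

// Hypothetical counter: state is an immutable Int carried by the
// receive function itself and swapped with context.become.
class Counter extends Actor {
  import Counter._

  // Receive is a PartialFunction[Any, Unit]: messages that match no
  // case are silently dropped, which is exactly the caveat above.
  def receive: Receive = counting(0)

  private def counting(n: Int): Receive = {
    case Increment => context.become(counting(n + 1))
    case Get       => sender() ! n // classic actors reply via sender()
  }
}

object Counter {
  case object Increment
  case object Get
  def props: Props = Props(new Counter)
}

object CounterDemo extends App {
  val system  = ActorSystem("demo")
  val counter = system.actorOf(Counter.props, "counter")
  counter ! Counter.Increment // fire-and-forget; nobody waits here
}

Asking for the count from outside an actor would normally go through the ask pattern; the point here is just the shape of receive and become.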
So the big difference is that simple actors have a sender() helper method that you can use to respond to the actor that sent you the current message. In Akka Typed they made the decision to remove that and to explicitly include the reply address in the message, so that type safety can be preserved. And that took much longer to arrive than I expected. On top of basic actors, Akka has multiple more sophisticated features. A few notable ones: Akka Streams, which is an implementation of Reactive Streams on top of actors and gives you backpressure within your service; Akka HTTP, a library that lets you handle HTTP requests and, again, gives you backpressure between services; Akka Persistence, which gives you event sourcing; and one of the most powerful but also most complex to handle, Akka Cluster, with various plugins such as Cluster Singleton, Cluster Sharding, Distributed Data, and so on. Cluster gives you the ability to do application-level clustering. A classic mix is Cluster plus Sharding plus Persistence, which gives you a cluster of persistent entities that are transparently rebalanced between your nodes, scales almost horizontally, persists messages as events, and so on. This brings us closer to what we actually did. Here's a high-level overview of our solution. We have a gateway service that serves as the entry point to all the rest, with the capacity service being the orchestrator. I'm not showing full details here — there are other services that capacity interacts with. One notable one is the capacity cache-all service, also known as the "god of cache", which caches every bit of infrequently changed data from other services, such as restrictions and parking locations, things like that. And there is the transport capacity service, which is the stateful service that gives us all the consistency guarantees. Then we have Cassandra and Redshift: Cassandra is the main data store and Redshift is the data store for analytical queries. If we zoom into the transport capacity service, which is, I guess, the most interesting part: we have a gRPC interface as the entry point, and I've sliced it into three use cases, or three requests. We have the get-availability request, which is a read request asking in which slots we still have capacity, so that we can show them to customers — the grid on the earlier slide is served from the get-availability response. Then, when a customer performs a checkout, we receive a reserve request, which goes through the cluster sharding mechanism, possibly over the network, lands on one particular shift actor, and causes a state update. Cancel is basically the same. Separately, we initially tried to use another Akka Cluster plugin called Distributed Data, which gives you a collection of replicated, conflict-free data types, CRDTs. But in our observation Distributed Data performed pretty poorly and didn't hold the request rates we were anticipating. Surprisingly, just mindless broadcasting to all the actors worked much better.
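To give a feel for that routing, here is a hedged sketch of what wiring a sharded shift entity can look like with classic Akka Cluster Sharding — illustrative names only, not the actual Redmart code, and it assumes an ActorSystem configured with akka.actor.provider = "cluster" and a formed cluster:

import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings, ShardRegion}

// Hypothetical command: reserve one delivery slot on a given shift.
final case class Reserve(shiftId: String, orderId: String)

// Minimal stand-in for the real shift actor.
class ShiftActor extends Actor {
  private var reserved = Set.empty[String]
  def receive: Receive = {
    case Reserve(_, orderId) =>
      reserved += orderId
      sender() ! "reserved"
  }
}

object ShiftSharding {
  // Which entity (shift) a message belongs to.
  val extractEntityId: ShardRegion.ExtractEntityId = {
    case msg @ Reserve(shiftId, _) => (shiftId, msg)
  }
  // Which shard that entity lives on.
  val extractShardId: ShardRegion.ExtractShardId = {
    case Reserve(shiftId, _) => (math.abs(shiftId.hashCode) % 100).toString
  }

  // Every node sends Reserve to the returned shard-region ActorRef;
  // sharding guarantees at most one ShiftActor per shiftId in the cluster.
  def start(system: ActorSystem): ActorRef =
    ClusterSharding(system).start(
      typeName = "shift",
      entityProps = Props(new ShiftActor),
      settings = ClusterShardingSettings(system),
      extractEntityId = extractEntityId,
      extractShardId = extractShardId
    )
}

The real entity would be persistent and carry the actual slot state; the point here is just the two extractor functions and the single shard-region entry point.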
Okay, I realize I skipped one important slide. We structured this whole thing around Redmart's model of operation: deliveries are performed by shifts, and each shift has a few drivers and a few vans. And the most important part is that those vans and drivers are not shared between shifts. So the shift actor represents one last-mile shift, and all the capacity we have for deliveries is nicely encapsulated in a shift. By representing it as a shift actor, we got for free the guarantees that Akka provides. Because Akka guarantees that no messages to a single actor are processed concurrently, we essentially have a single-threaded compartment for executing and handling every message. The actor also encapsulates its internal state, so there is no way to access that state from outside, no way to mutate it, and no way to have race conditions. On top of that, since our shift actors are sharded, Akka makes sure there is — well, not exactly one — at most one instance of a sharded actor running somewhere in the cluster, and it handles rebalancing, recovery from crashes, and so on transparently for us. These two things together give us, first, a completely concurrency-free execution model and, second, a single writer for any given last-mile shift. And because failures, node crashes, and network splits are handled transparently by Akka, there is no single point of failure in the system: we can add or remove nodes from the running cluster at any time, and the load is nicely rebalanced between them. So what did we learn? The serialization protocol is very important. Initially we chose Kryo, and that was a very poor choice. The thing is, serialization happens not only for messages sent between the nodes in the cluster — all the events persisted by your persistent actors are serialized as well. And Kryo basically has no support for backward or forward compatibility. In canonical event sourcing, every event should be kept essentially forever and the application should be able to process it, and with Kryo it was quite challenging to perform schema migrations on the events. So the lesson learned is that a binary protocol is more efficient, but you really need a plan for schema evolution, and you should choose a mechanism that supports forward and backward compatibility. As I mentioned earlier, Distributed Data let us down — not all the Akka plugins are equally good. Sharding is awesome, except it has quirks: basically, as soon as you are doing any Akka Persistence you need Akka Sharding as well, to keep a single writer per entity, and then you need to figure out how to solve split-brain problems. Lightbend has a paid plugin for that, or you can roll your own implementation. One last thing: a best practice we figured out is to separate message handling, persistence, recovery, and so on from the actual entity behavior. Initially we kept everything as actor state — plain variables like maps and integers — and it was quite challenging to test the business logic in isolation from the actor. When we introduced an explicit entity and separated handling messages from doing the business logic, it became much better.
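A hedged sketch of that separation, with made-up names rather than our production code: a pure entity that can be unit-tested with no actor system at all, and a persistent actor that only does the plumbing (it assumes an Akka Persistence journal is configured):

import akka.persistence.PersistentActor

// Pure domain entity: no actors, no side effects, trivially unit-testable.
final case class Shift(capacity: Int, reservations: Set[String]) {
  def handle(cmd: Shift.Reserve): Either[String, Shift.Reserved] =
    if (reservations.size >= capacity) Left("no capacity left")
    else Right(Shift.Reserved(cmd.orderId))

  def applyEvent(event: Shift.Reserved): Shift =
    copy(reservations = reservations + event.orderId)
}

object Shift {
  final case class Reserve(orderId: String)  // command
  final case class Reserved(orderId: String) // event
  def empty(capacity: Int): Shift = Shift(capacity, Set.empty)
}

// The actor only validates via the entity, persists the resulting event,
// and applies it to get the next state.
class PersistentShiftActor extends PersistentActor {
  override def persistenceId: String = "shift-" + self.path.name

  private var state: Shift = Shift.empty(capacity = 10)

  override def receiveCommand: Receive = {
    case cmd: Shift.Reserve =>
      state.handle(cmd) match {
        case Right(event) =>
          persist(event) { e =>
            state = state.applyEvent(e)
            sender() ! "ok"
          }
        case Left(reason) =>
          sender() ! reason
      }
  }

  override def receiveRecover: Receive = {
    case e: Shift.Reserved => state = state.applyEvent(e)
  }
}

The handle/applyEvent pair can then be covered with plain unit tests, which is exactly the isolation being described.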
As a result, we do have the consistency model we needed, and each service runs in highly available mode so that we can restart them at will. Our production fleet is basically eleven t2.micro instances, and it handles about 660 requests a second under 100 milliseconds. And that's basically it. Okay, we have another game to play. There is some swag from our lovely HR department, and they asked me to play this game: the one who asks the best question gets the best swag we have, and other questions will also be rewarded. Come on, who wants a tumbler? Do you want to speak through the microphone, maybe? Yeah. Yeah, that's one of the topics I intentionally omitted to shorten my presentation, but the Lightbend documentation on Akka lists four strategies you can employ. The simplest one is static quorum: you know you will be running X instances, and you make sure that any node that observes fewer than half of them terminates itself. That's exactly what we did, because we knew we would be running three, and we figured we'd just go with the simplest one. It worked well because we don't need to autoscale. The more sophisticated strategies include dynamic majority, which recalculates how many members the cluster has. That one is more dangerous, because membership changes might not be propagated in time: if your cluster accidentally grows and then shrinks dramatically in a short amount of time and then splits, the two regions of your former cluster might not agree on what the largest membership was. So that's riskier. Another one is keep-oldest, or something like that: the part of the cluster that contains the oldest node is kept, and everyone else terminates. And the fourth one: you designate a particular arbiter node, and all the nodes that cannot see that arbiter terminate. That's basically it. In our case static quorum was totally enough, and we built our own implementation — it's really just one additional actor, maybe fifty lines of code. It was not too hard to build, but much harder to test. Okay, anyone else? "How do you test this?" Test what? "The system. Anything." So we have a collection of unit tests. Let's go bottom-up. At the lowest level we have our business entity. That entity doesn't depend on anything, so we just unit test it in the obvious ways — it's almost side-effect free, it just logs, so it's almost functional programming. Almost. Then, for actors, Akka provides the Akka TestKit, which gives you the ability to send messages, assert on and expect messages, and so on. On top of that there is a set of integration tests. I'm not 100% sure we test at the HTTP level for transport capacity, which is the bottom of the stack, but for the capacity service we definitely have tests that exercise the entire thing. And then we obviously have end-to-end tests maintained by our QA. But if you allow me to rephrase the question as "how do you test actors": for plain actors, there is the TestKit, which runs your actor system and gives you the ability to spin up actors, monitor them for termination, send messages, expect messages in return, and so on.
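A minimal sketch of what that looks like with the Akka TestKit and ScalaTest — assuming a recent ScalaTest, and reusing the hypothetical Counter from the earlier sketch rather than one of our actors:

import akka.actor.ActorSystem
import akka.testkit.{ImplicitSender, TestKit}
import org.scalatest.BeforeAndAfterAll
import org.scalatest.wordspec.AnyWordSpecLike

// Spins up a throwaway actor system, sends messages, expects replies.
class CounterSpec
    extends TestKit(ActorSystem("CounterSpec"))
    with ImplicitSender
    with AnyWordSpecLike
    with BeforeAndAfterAll {

  override def afterAll(): Unit = TestKit.shutdownActorSystem(system)

  "A Counter" should {
    "reply with the current count" in {
      val counter = system.actorOf(Counter.props)
      counter ! Counter.Increment
      counter ! Counter.Get
      expectMsg(1) // ImplicitSender makes the test probe the sender
    }
  }
}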
The most powerful is probably the so-called auto-pilot: you create a test actor and configure what it should respond with, and then it just engages in communication with the other ones. As for the more advanced features of Akka: for Streams, there is the Stream TestKit. In the degenerate case, you can just create a stream, send one item into it, and assert on what comes out the other side — pretty standard, Specs2 or ScalaTest or whatever can do that. Cluster is probably the most hilarious, because there is a multi-JVM sbt plugin that allows you, within a single test, to spin up multiple JVMs running multiple actor systems, join them together into a single cluster, put in synchronization points — all systems need to reach this state for the test to proceed — and then send messages and assert on them. We did have a few multi-JVM tests. It was a pain: it runs slowly, takes a couple of minutes just to start, and it's very hard to maintain. So we basically only used them to test our self-implemented split-brain protection strategy, and that's it — and then we removed them. Thank you for asking. Hey, Damien. Oh, well, dependency injection with actors is sort of interesting. In order to create an actor, you need to create so-called actor Props, which is a recipe for the actor system to create your actor; it's supposed to encapsulate everything the actor needs. There are basically two sides to dependency injection here: one is how to create your actor, and the other is how to inject your actor into the other actors, services, or components that need to use it. To create an actor, you use a dependency injection container that supports functions as factories, and your factory is basically your Actor.props, whatever it is. Play has support for that, Guice has support, MacWire has support, and so on. For injecting actors into other components: you are not supposed to inject the actor itself, you are supposed to inject actor references. For that, you first create the actor, obtain an ActorRef, and then inject that normally. Again, Play has support, MacWire has support if you know how to do it — basically everything does. "You guys are using MacWire?" Yeah, we are using MacWire, which is a compile-time dependency injection library for Scala. "Is that complexity necessary?" You wouldn't believe how many times I've tried to convince our business users and management that this level of consistency is not required and eventual consistency would work. I failed, and that basically means we needed that level of consistency, and this collection of technologies was a more or less straightforward way to achieve it. Okay, I guess we should wrap up. "Looking back, would you do anything differently?" I would not use Distributed Data in the first place. And in the capacity service, which is the orchestrator, we used Akka Streams as a way of wiring everything together; I would probably not use it for that again. But we're currently developing a system that will use Akka Streams, and I'm pretty confident it's the right tool this time.
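Coming back to the dependency-injection point for a second, here is a hedged sketch of the two-step approach just described — build Props from the actor's dependencies, then hand out only the ActorRef — without tying it to any particular DI container; all names are made up:

import akka.actor.{Actor, ActorRef, ActorSystem, Props}

// Hypothetical dependency and an actor that needs it.
trait RateLimiter { def allowed(key: String): Boolean }

class ReservationActor(limiter: RateLimiter) extends Actor {
  def receive: Receive = {
    case key: String => sender() ! limiter.allowed(key)
  }
}

object ReservationActor {
  // The "factory" a DI container would call: dependencies in, Props out.
  def props(limiter: RateLimiter): Props = Props(new ReservationActor(limiter))
}

// A component that depends on the actor gets an ActorRef, never the actor.
class ReservationEndpoint(reservations: ActorRef) {
  def reserve(key: String): Unit = reservations ! key
}

object Wiring extends App {
  val system   = ActorSystem("demo")
  val limiter: RateLimiter = _ => true // stub dependency
  val ref      = system.actorOf(ReservationActor.props(limiter))
  val endpoint = new ReservationEndpoint(ref)
  endpoint.reserve("order-42")
}

With a container like MacWire or Guice the idea is the same: the container calls the props factory, and everything downstream only ever sees the ActorRef.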
"Do you use any message broker, like Kafka, between applications?" So, between nodes in the cluster, Akka has a mechanism to transparently send messages: from the perspective of an actor, sending a message to another actor running on another node changes nothing. It still uses the same ActorRef, and Akka transparently serializes the message and sends it over. As for using Kafka, RabbitMQ, or whatever message broker: there are adapters that can consume messages from, or send messages to, Kafka, RabbitMQ, and so on, convert them to actor messages, and deliver them somewhere. But in our case we didn't need that. There was another question. "Do you have problems architecturally designing the whole actor system — how the messages should flow — and then maintaining and scaling it? How do you manage that?" I guess the honest answer is that this whole thing was built by two senior software engineers, so there wasn't a lot of communication required and we easily understood what happens. When we started onboarding more junior folks, there was some learning curve, for sure. What saved us, I guess, is that we built on raw actors without streams or anything on top: there is just one type of actor, which is sharded, there are no hierarchies, and they're supervised by Akka's built-ins. So the message exchange is more or less straightforward — you only have one actor. Then we have a convention: you put all the messages in the actor's companion object, you put all the events in the companion object, and you put them under sealed traits, so messages are here, events are there, this is the structure of your code, and so on. With that convention it's really easy to follow the flows within one actor and its immediate surroundings, and we didn't have the problem of an enormous actor system with multiple types of actors constantly exchanging messages. Theoretically, what I would do in that case is keep a really good collection of diagrams describing what kinds of exchanges happen during the processing of a request. That's probably not a great answer to your question, but it's the best I have. I think Akka Typed might actually help on that front a little bit, because then you have a pretty well-defined interface of messages, you have to explicitly say where to respond, and you can use the normal navigation support in IntelliJ or Eclipse or whatever. "One of the things we find is that everything is so dynamic that you can never reproduce the same scenario — you cannot predict it, because messages arrive at different times and every run plays out a little differently." Well, I think at that scale even a classically built solution would be hard to navigate as well. It's not a silver bullet; complexity at that scale is unavoidable.
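For what it's worth, the companion-object convention mentioned a moment ago — messages and events grouped under sealed traits — looks roughly like this; the names are illustrative, and the replyTo field shows the Akka Typed flavor where the reply address is part of the message:

import akka.actor.typed.ActorRef

object ShiftProtocol {
  // Everything you can send to the actor lives under one sealed trait...
  sealed trait Command
  final case class Reserve(orderId: String, replyTo: ActorRef[Confirmation]) extends Command
  final case class Cancel(orderId: String, replyTo: ActorRef[Confirmation]) extends Command

  // ...and everything it persists lives under another.
  sealed trait Event
  final case class Reserved(orderId: String) extends Event
  final case class Cancelled(orderId: String) extends Event

  sealed trait Confirmation
  case object Accepted extends Confirmation
  final case class Rejected(reason: String) extends Confirmation
}

The sealed traits let the compiler, and the IDE's navigation, enumerate every message and event for you, which is what helps with following the flows.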
Okay, thank you, guys. Yes, we have two winners: this gentleman, and — I'm hesitating — do you want a cup? Just kidding. Okay, and the gentleman at the back. Thank you for asking good questions. I don't know how the rest of the swag is supposed to be given out, but I'll ask around, and everyone who asked questions will get something. So thank you, guys.