...and then hand over to the guys who actually know what they're talking about. So, my name is Nino, I'm the CTO of StashAway. Since we're a fairly young company, I just want to quickly say a few words about what we do at StashAway. StashAway is essentially a fintech company, an online digital wealth manager. What we do is a goal-based approach to savings: as a customer you sign up and say, okay, I want to save for retirement. We first help you think about your saving goals (how old do you think you'll be when you retire, how much money will you need when you retire, and so on), and based on this configuration we create a portfolio for you. We also use this portfolio to invest your money, so you set up, for example, a recurring payment: every month you transfer, say, a thousand dollars and we invest the money for you. It's a so-called robo-advisor, as the industry term goes.

Just a few words of context on our tech stack. We're using Scala and Lagom in the back end for the trading side of our application: it connects to the bank, it connects to the broker. Then we have a front end that consists of a Node.js API layer, MongoDB and a React front end, so a relatively standard front-end stack. On the back end we decided, because of the transactional nature of our business, that it makes sense to go with an event-sourcing approach, and that's how we ended up with the Lagom framework. Jew and Mick are going to tell you a bit more about how we're using it and what our learnings were in doing so. And with that, over to them.

Hello everyone, I'll quickly make a start by introducing myself. My name is Jew, and I've been a programmer for about eight and a half years. I mostly worked in investment banks and a hedge fund building electronic trading systems. I've been using Scala for the last two years, including the Akka library at work, and I'm still learning a lot. So that's a quick intro about myself. Now let me begin to talk about Lagom. Actually, I didn't have enough time to memorize the script, so please focus on the screen rather than on me.

So, what is Lagom? Lagom, in a nutshell, is a reactive microservices framework developed by Lightbend. It supports both Java and Scala APIs; the Java version was first released last year in March and the Scala version last year in December. It is open source, but commercial support from Lightbend is also available. It's still very early days; as I said, it's less than one year old.

Let me introduce some features that make Lagom a bit unique as a framework. Lagom favors distributed persistence patterns, in contrast with traditional centralized databases, so it encourages an event-sourced architecture for data persistence. The default pattern for persisting entities takes advantage of event sourcing with command query responsibility segregation (CQRS). I will talk about event sourcing and CQRS in more depth later on. Also, the framework emphasizes that finding the right boundaries between services, and aligning them with bounded contexts, is the most important aspect of architecting a microservice-based system, so it has a very strong domain-driven tendency. Lastly, Lagom assembles a collection of existing and proven technologies and adds value on top of them; it uses Play, Akka, Cassandra and Kafka to build the framework. I'll talk about this in detail later on.

We were asked questions like: why did you guys choose Lagom when it's less than a year old? It's true, not many people want to be the first to use a new framework in production, and it's a fair concern to have, so let me explain our reasons for choosing Lagom as our back-end framework despite those concerns. There were basically four things I wanted to achieve in this new trading platform: productivity, performance, concurrency and scalability. First, productivity: being able to code in Scala was a very good start for me; the strong type system and concise code were definitely a strong plus. Also, Lagom provides some handy features to run and test Lagom services locally on your machine in sbt, so you get cool stuff like incremental compilation and hot reloading of services. Performance-wise, I think the JVM is one of the best virtual machines out there, so that was enough. Concurrency: I just didn't want to go back to using threads and locks, so the fact that it's based on Akka was a huge plus; I'll talk more about Akka in a moment. Scalability: although I didn't exactly know what distributed persistence meant at the time, it sounded to me like Lagom had prepared a lot for linear scalability at the fundamental layer. So it actually met all four categories really well.

Let's talk about the underlying technologies of Lagom. Lagom's backbone is composed of many industry-proven technologies such as Akka, Cassandra and Kafka, and Play of course. Lagom did not reinvent the message broker or the persistent storage, and neither did it reinvent the actor model; they're all proven technologies. So to me Lagom came as an abstraction that is strongly opinionated on how to combine all of these together to form a microservices system on the JVM.

Another reason we chose to trust Lagom was that we agreed with its key ideas. Although I didn't have experience with event sourcing or CQRS previously, we liked Lagom's underlying ideas, which include event sourcing and CQRS, and also its tendency toward reactiveness. We thought there would be a lot of advantages to gain from these key concepts; even if it takes extra effort to understand them and get used to them in the beginning, we believed it would be a good investment for the future. Also, I felt that Lightbend must have seen a lot of common patterns in the way people build successful distributed systems using Akka. So when Lagom came out, I thought: finally, after seeing all the good practices and bad practices in the industry, they've come out with an open source framework that collects these best practices. Also, we weren't afraid to get our hands dirty and dig into any problems we encountered while using it. In fact, a few weeks ago we encountered a bug in the code and I actually committed a fix for it, so we were not blocked by it. This is one of the benefits of an open source framework: you can modify the framework and fix the problem yourself if you need to. And lastly, the support from the Lightbend team and the community has been really great. In the Lagom Google group I asked about 150 questions over the last six months, and none of them were left unanswered; that's excluding the email communication I had separately with community members to exchange ideas and ask for help. Of course, some freebies from Lightbend also helped us stay loyal.

Okay, so I mentioned Akka as one of the important reasons for choosing Lagom, and since many of you here are familiar with Akka and the actor model, I thought it would be good to explain a little more about why Akka was the confidence booster for me. As I said, I spent many years developing enterprise systems in the domain of financial trading, where performance was often one of the foremost requirements of the project. We normally developed multi-threaded programs with threads and locks, and as many of you would agree, expressing concurrent programs using threads and locks is just not easy. Yes, there are facilities developed to help with the issue, such as concurrent collections, latches, thread pools and all that, but it's still relatively low-level and prone to various kinds of errors like deadlock and starvation. As a programmer I can usually reason very well about the behavior of a sequential program, a small unit of function, but as soon as multi-threading comes into the picture it becomes difficult to reason about. My seniors advised me to study the book Java Concurrency in Practice; I studied it a number of times, but it was still very hard, and the problem is that not everyone reads it. I was actually hoping I could spend more time implementing and thinking about business features rather than worrying about data corruption and the state of the threads; I wanted to spend more time worrying about the state of an aggregate root, for example. That is when I started learning about Scala and Akka. I just totally loved how Akka can provide lock-free concurrency, parallelism and an extremely effective way of managing state. Also, its message-driven nature seems to fit very well with distributed computing. It was a paradigm shift for me: it simplified concurrent programming by shifting my focus from how to achieve the goal to what goal to achieve. As I became more familiar with the actor model, I found that an actor can be very suitable for representing an aggregate root in a domain-driven design context, with messages representing concepts in the ubiquitous language, and actors acting as aggregates. Also, an actor naturally applies the principle that only one aggregate should be modified in one transaction.
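The "one command at a time per aggregate" idea above can be illustrated without any framework. This is a plain-Scala sketch, not actual Akka code: all the names (Deposit, Buy, PortfolioState and so on) are invented for illustration. Commands go through a single handler in strict order, like an actor draining its mailbox, so the aggregate's state never needs a lock.

```scala
// Commands the aggregate accepts and events it emits (illustrative names).
sealed trait Command
final case class Deposit(amount: BigDecimal) extends Command
final case class Buy(symbol: String, cost: BigDecimal) extends Command

sealed trait Event
final case class FundDeposited(amount: BigDecimal) extends Event
final case class SecurityPurchased(symbol: String, cost: BigDecimal) extends Event

final case class PortfolioState(cash: BigDecimal, positions: Map[String, Int]) {
  // Event handler: a pure function from (state, event) to a new state.
  def update(e: Event): PortfolioState = e match {
    case FundDeposited(a) => copy(cash = cash + a)
    case SecurityPurchased(s, c) =>
      copy(cash = cash - c,
           positions = positions.updated(s, positions.getOrElse(s, 0) + 1))
  }
}

// Command handler: validates a command against the current state, emits events.
def handle(state: PortfolioState, cmd: Command): Seq[Event] = cmd match {
  case Deposit(a)                   => Seq(FundDeposited(a))
  case Buy(s, c) if c <= state.cash => Seq(SecurityPurchased(s, c))
  case Buy(_, _)                    => Seq.empty // rejected: not enough cash
}

// Like an actor's mailbox: commands are applied strictly one after another,
// so there is no concurrent access to the aggregate's state.
def run(initial: PortfolioState, mailbox: Seq[Command]): PortfolioState =
  mailbox.foldLeft(initial) { (st, cmd) =>
    handle(st, cmd).foldLeft(st)(_.update(_))
  }
```

For example, `run(PortfolioState(BigDecimal(0), Map.empty), Seq(Deposit(BigDecimal(1000)), Buy("AAPL", BigDecimal(200))))` leaves 800 in cash and one AAPL position.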
So I Think MIG will briefly mention about actor in his session actually so And but that there were still a difficulties I mean so or though I mean because I didn't have experience in building a You know microservices system from the stretch in especially in domain-driven design domain-driven development style myself I intuitively felt that you know Like Akka can somehow help Akka can somehow provide me a stepping stone to do all that So I had this concept But I believe that you know one can make good use of actors message-driven nature to tackle the challenges of event-driven Systems and just distributed system and it also seemed like it fits very well for representing a aggregate route in DDD as I said But I just needed some guidance I felt that actor alone may not be sufficient enough to build a distributed microservices Systems because first of all actor in Akka framework are not type safe And also I wasn't quite sure how I can ensure the delivery of the message can be guaranteed with eventual consistency I did take a look at persistent actor But but to for me to just combine all this concept together to implement the production level Production quality microservices system. 
I just I wasn't just Confident enough or experienced enough to do that and that is why I was so glad when Lagom came out and realized that it Contains all the ideas that I had and provided provides the guard railed approach with very good default Okay, so Now let me really talk about lagoon framework As you probably have noticed it's a pretty pretty big framework It includes a lot of ideas and technologies, so I may not be able to cover all the details in the session I'll cover only the three important APIs in in Lagom so First one is service API which provides a way to declare and implement services to be consumed by the clients This is one of the most straightforward API's out of three, so I'll make it very quick and There is message broker API which provides a distributed pop-up model that services can use to share the data through topics Lastly, there is a precision API that provides event source precision entities. So let me go through them one by one Lagom so service API to begin with Lagom services are described by a service descriptor So This interface defines the two things first one is how the services are invoked and implemented Second is the metadata that described how the interface is mapped to underlying transport protocol. 
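As a rough idea of what such a descriptor looks like in Lagom's Scala API, here is a minimal sketch loosely modeled on the portfolio endpoints discussed in this talk. The type names, paths and fields are invented for illustration, not our actual code, and it only compiles with the Lagom and Play JSON dependencies on the classpath.

```scala
import akka.NotUsed
import com.lightbend.lagom.scaladsl.api.{Descriptor, Service, ServiceCall}
import com.lightbend.lagom.scaladsl.api.Service._
import com.lightbend.lagom.scaladsl.api.transport.Method
import play.api.libs.json.{Format, Json}

// Illustrative request/response types; a real service would define richer ones.
case class CreatePortfolioRequest(accountId: String, baseCurrency: String)
object CreatePortfolioRequest {
  implicit val format: Format[CreatePortfolioRequest] = Json.format
}

case class PortfolioSummary(portfolioId: String, totalValue: BigDecimal)
object PortfolioSummary {
  implicit val format: Format[PortfolioSummary] = Json.format
}

// The trait lives in the api project; its implementation lives in a separate
// impl project, which is the loose coupling described in the talk.
trait PortfolioService extends Service {

  def createPortfolio: ServiceCall[CreatePortfolioRequest, String]
  def getPortfolioSummary(portfolioId: String): ServiceCall[NotUsed, PortfolioSummary]

  override def descriptor: Descriptor =
    named("portfolio").withCalls(
      // The metadata: how each method maps onto the HTTP transport.
      restCall(Method.POST, "/api/portfolio", createPortfolio),
      pathCall("/api/portfolio/:id/summary", getPortfolioSummary _)
    )
}
```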
Let me show you an example. This is a sample taken from our portfolio service descriptor, just a portion of it. It shows two endpoints currently: one is createPortfolio and the other is getPortfolioSummary. I think it's a pretty common pattern in almost any web framework to have some kind of service descriptor to declare your endpoints, so I won't go into too much detail, but I'll mention one feature that makes a Lagom service descriptor quite unique: the service descriptor interface and its actual implementation are kept in different projects, to achieve loose coupling. So as you can see, the service descriptor, which lives in PortfolioService, belongs to the API project, and the actual implementation belongs to the implementation project. That's about the service API.

Let's now talk about the second API, the message broker API. The message broker API provides a distributed publish-subscribe model that services can use to share data. A topic is simply a channel that allows services to push and pull data to and from each other. In Lagom, topics are strongly typed, hence both subscriber and producer know in advance what the expected data will be. In order to publish data to a topic, a service needs to declare the topic in its descriptor, like this. In this example our pricing service is declaring three topics: a price topic for price updates, an FX topic for FX updates, and a dividend topic for dividend events. Lagom is using Kafka as the message broker; Mick will talk more about that. Kafka will distribute messages for a particular topic across many partitions so that the topic can scale. Messages sent to different partitions may be processed out of order, so if the ordering of the messages you are publishing outside your service matters, you need to ensure that the messages are partitioned in a way that preserves order. Typically you achieve this by ensuring that each message for a particular entity goes to the same partition all the time. Lagom allows this by letting you configure a partition key strategy; in this example, which I took from the Lagom website, it is telling Lagom to use the post ID as the partition key.

The primary source of messages that Lagom produces is the persistent entity events, which I'll talk about in the coming slides. It takes a stream of events and adapts it to a stream of messages, which are sent to the Kafka message broker. In this way you can ensure at-least-once processing of events by both publishers and consumers, which allows you to guarantee a very strong level of consistency between your services. This point on inter-service consistency will be elaborated in more depth by Mick later.

Okay, let's now take a look at how we can subscribe to a topic. To subscribe to a topic, a service just needs to call subscribe on the topic of interest; in this example the portfolio service is subscribing to the FX rate update events coming from the pricing service. When you call subscribe you get back a Subscriber instance. In this code snippet we have subscribed to the FX topic using at-least-once delivery semantics. That means each message published to the FX rate topic is received at least once by the portfolio service, but possibly more than once as well. So as you can see, it subscribes to the closing FX rate updated messages from the pricing service and applies each update to the portfolio to update positions and prices.

Okay, now let me move on to the persistent entity API; this is quite a big one, so bear with me before we go into the details.
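Before moving on, one consequence of the at-least-once semantics just described is worth spelling out: because the same message may be delivered more than once, the consumer has to be idempotent. This is a plain-Scala sketch of that idea, not Lagom's actual subscriber API; the event shape and names are invented, and here duplicates are detected by tracking offsets.

```scala
// An FX rate update as it might arrive from a topic. The offset stands in
// for whatever unique identifier the broker attaches to each message.
final case class FxRateUpdated(offset: Long, pair: String, rate: BigDecimal)

// An idempotent consumer: redelivered messages (same offset) are ignored,
// so processing a message twice leaves the state unchanged.
final class FxRateConsumer {
  private var processed = Set.empty[Long]              // offsets already handled
  private var latest    = Map.empty[String, BigDecimal]

  def onMessage(msg: FxRateUpdated): Unit =
    if (!processed.contains(msg.offset)) {             // duplicate? then skip
      latest += (msg.pair -> msg.rate)
      processed += msg.offset
    }

  def rate(pair: String): Option[BigDecimal] = latest.get(pair)
}
```

Delivering the same message twice, as at-least-once allows, produces the same result as delivering it once, which is exactly the property the subscriber needs.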
I have to explain three concepts first. The first one is the aggregate root: each persistent entity in Lagom corresponds to an aggregate root in domain-driven design, so it's important to understand what that is. The second and third are event sourcing and CQRS: Lagom favors decoupled, distributed persistence over traditional data storage, and event sourcing and CQRS are the techniques that Lagom promotes to achieve this goal.

When we design our systems we often think in terms of models: we try to model real things in our code, and in any system with persistent storage of the model there must be a scope for a transaction, so that changes to the model happen consistently and atomically. An aggregate is a cluster of associated objects that we treat as a unit for the purpose of data changes. Each aggregate has a root and a boundary: the boundary defines what is inside the aggregate cluster, and the root is the only member of the aggregate that outside objects are allowed to hold a reference to. In this example the portfolio is the root of an aggregate that consists of stock positions, cash balance and orders. By clustering these entities into one aggregate we achieve transactional consistency between them. For example, we must have a guarantee that when a buy order is settled for the portfolio, two things happen: one, the stock position of the purchased stock increases, because we just bought more; two, the cash balance is reduced by the amount we spent buying that stock. So while treating stock position and cash balance as their own separate entities, it makes sense to cluster them together in the same aggregate, along with the orders entity, so we achieve the transactional consistency I just described. And by making the portfolio entity the root of the aggregate that represents this cluster, we can feel safe about giving the external world a reference to this cluster, using the portfolio as a handle. So now we have defined portfolio as our aggregate root, and I'll stick to it in the coming slides as an example.

Let's now talk about event sourcing, using portfolio as our aggregate root. Event sourcing is the practice of capturing all the changes that happen to an aggregate as events, which are immutable facts about things that have already happened. In this example, when the customer first creates an account and opens a portfolio with StashAway, that can be stored simply as a PortfolioCreated event; of course that event carries all the required information, like account ID, base currency and so on. Rather than the complex interaction that would take place in a traditional CRUD application (begin transaction, commit this and that), we simply store a PortfolioCreated event. And when the customer makes a deposit of 50k, for example, that also gets captured as a domain event, a FundDeposited event. Any state changes like this, fund deposits, security buys and sells, whatever, we capture simply as events and store them in order of occurrence, and we use these accumulated events to explain the current state of the portfolio. Over the lifecycle of a portfolio you will probably see a lot of events accumulating like this: the portfolio gets created, then all the events that happened to the portfolio follow, and a fold-left over all the events becomes your current state.

Let me show a real-world example. This screen shows actual events that occurred to a portfolio persistent entity in our system. The left side represents the current state of the portfolio, and on the right-hand side the table is the event log, which shows all the events that occurred to this state, with a sequence number and a timestamp. The very first event that happened to this entity was a PortfolioCreated event, and if you want you can also look at the content of each event in the event log, like this.

There are a couple of advantages of event sourcing that we really like. Event sourcing guarantees that the reason for each change to an aggregate instance will not be lost. This is a major difference from the traditional CRUD-based approach: when serializing the current state of the aggregate to the database, we are always overwriting previous states. We just don't do that in event sourcing; we append. Retaining the reason for every change that happened to an aggregate since its birth can be of great value to the business, for the following reasons. First is reliability: it provides a 100% reliable audit log of the changes made to a business entity. Second is near- and far-term business intelligence and analytic discovery, because it emits a lot of data, facts about what happened to the business.
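The fold-left idea above, current state as a left fold over the event log, can be sketched in a few lines of plain Scala. The event and state names are invented for this sketch, not taken from our actual model.

```scala
// Current state = foldLeft over the event log: the core of event-sourced replay.
sealed trait PortfolioEvent
final case class PortfolioCreated(accountId: String, baseCurrency: String) extends PortfolioEvent
final case class FundDeposited(amount: BigDecimal) extends PortfolioEvent
final case class SecurityPurchased(symbol: String, qty: Int, cost: BigDecimal) extends PortfolioEvent

final case class Portfolio(accountId: String = "",
                           cash: BigDecimal = BigDecimal(0),
                           positions: Map[String, Int] = Map.empty)

// Pure event handler: each event moves the state forward deterministically.
def applyEvent(state: Portfolio, event: PortfolioEvent): Portfolio = event match {
  case PortfolioCreated(id, _)    => state.copy(accountId = id)
  case FundDeposited(a)           => state.copy(cash = state.cash + a)
  case SecurityPurchased(s, q, c) =>
    state.copy(cash = state.cash - c,
               positions = state.positions.updated(s, state.positions.getOrElse(s, 0) + q))
}

// Replaying the full log yields the current state; replaying only a prefix of
// the log yields the state as of that point in time.
def replay(events: Seq[PortfolioEvent]): Portfolio =
  events.foldLeft(Portfolio())(applyEvent)
```

Because events are immutable facts, replaying the same log always reconstructs the same state, which is what makes the recovery and point-in-time views described here possible.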
You can make use of analytical Analysis on them later on if you want so and the full audit log as I mentioned This is a feature that is particularly beneficial for a company like us Nowadays the regulatory requirement for financial institutions are very heavy and having this full audit log as a feature Which basically shows every state changes that according to the platform can play play very handy and save us a lot of times over the time for for regulation and compliance purpose and Traceability debilitating you can trace all the events and and you can actually debug through it replay ability You can replay the events to recover the state so even if you lose the current state you can always go back to events to to refer current state and If you want you can also use it in a way that you can you only replay back to up to certain point in the time to to see what the state was like at that point and Lastly event because events are appended events are persisting a pen only fashion. It's extremely fast So in this and this is how a portfolio persistent entity looks like as you can see I have overwritten the three abstract type members which are command event and type Command event and state Sorry, so command is the the superclass of the command that this portfolio entity will accept and Event refers to superclass of the events that portfolio entities entity will emit and state is the state of the portfolio Which is is the placeholder of other entities that I just described in the previous slide like a positions and cash balance and so on and As you can see the initial state there is an abstract method that represent the initial state of entity when it's first created when the precision entity is first created and The behavior is the method that returns the behavior of the entity which I'll explain So the function that process incoming commands are registered using on command method of the actions and you should define One command handler for each command class that 
portfolio entity can receive I'll show you so so so the usual Workflow that's going to happen on our entity will be like this so command enters and it emits event and it becomes a new state So let's take a closer look at the not traded variable which represents the behavior of the portfolio when entity of the Portfolio when it's not yet created So it has one common handler that handles the created portfolio command and has one event handler that handles portfolio created events So in the command handle book you you will see that ctx dot then persist command This then persist command will persist the portfolio created band to our message log Sorry to our event log in Cassandra and when this event has been persisted successfully only then the current state needs to be updated By applying the event to the current state the function for Updating the state are registered within own event block So here we are registering the details of the portfolio to the portfolio state and returns a new state So it goes like this create portfolio command comes in it becomes portfolio created events and it becomes a new portfolio state And then you will see this kind of event in your Cassandra message event log So there are a few drawbacks with event sourcing firstly when you don't have deep understanding of the business Domain that you are building for it can be quite difficult to to reason about the potential events that can happen and Also since it's relatively recent concept to most people there is a cost and the risk of introducing this to your tech team and But the biggest disadvantage of event sourcing for our specific use case in in such a way is the information retriever So imagine if you have 1000 portfolios in the system and each portfolio has on average say 500 events and we want to period periodically query the system to find out how many Apple Stokes does the stage away currently have It will be difficult for us to query the event logs to do this We have to query the 
event log to get all the events that's related to our security position And then we have to actually go one by one and sum them up and and with the database like Cassandra This kind of group by operation is not supported So and this is exactly why we are using security pattern to tackle this problem, which I'll explain later So what is security? So security tells us that our application can be divided into two data models command model and the query model So we have already seen the command model in our previous slide So the this side we have already seen the command enters and it emits the event So so the update data store refers to events being emitted in our case But in the core side we have API that is read only so and this read sign will subscribe to the changes that happen on the right side and will reflect that changes on its read only on on its read side storage and Normally that table or the view is optimized to so serve a query business query So I'll so this is an example of read side processor that we have in Portfolio portfolio service. 
So So as you can see it extends the read side processor and you have to declare the type of event that that Yes Somehow Sure, sure, okay, so so out of this part so this Cassandra session read side and Are given by the lagom framework and this put position DB handler is the one that I built Read side processor is given by lagom so so read side processor with Which we declare the event type that we are interested in subscribing to and In this example what I'm doing is when security purchased event arrives at this processor It's for that event to the position database handler to update the total security holdings table Just to to serve Precisely that query that I just described like how many Apple stocks we have currently in stash away Position DB handler is nothing but just the interface to Cassandra database storage So it will execute the insert statement like this when when it's called and And then you create the repository interface that connects to that table that I just show you To to solve any business queries. So so in this way We we just achieve the optimized read side view from the static table compared to dynamically event source to view Okay, so That's about it in this talk. I just touched on why we chose lagom and I described three API and And in doing that I I gave you very high-level overview of aggregate route and event sourcing and CQRS Now Mick will elaborate how lagom helps to achieve the consistency within your microservice systems. Thank you So hello hi there So if I can briefly introduce myself, okay, thank you. Hello, okay Okay, hi, my name is Nick. So briefly about myself. I worked at oh, okay. Thank you So my name is Nick and I worked in red mark before so we have a bunch of red mark people here And we're in the Lazada so don't forget to sign up for live up and Now I work in stash away and we have openings too. 
So if you like I'll talk So I want to talk about maintaining consistency in microservice system So one of the hard part of a microservice when you have like things decoupled is that how do we maintain consistency of data as you Throw flow from one domain to the other so until separate the talk into so In the columns wise we have the separation within the microservice and outside So that means within your domain and across the domains and Then I want to look into the consistency model that each of these part take So particularly the lagom persistent and see that you talks about which is our main actor Actor like a main main guy, right? That it actually has a strong consistency and we get linearizability out of it and At the last roll we have eventual consistency world where We have the read side which is the query part of the CQRS and we can use it to implement the long-running business process and When we go into the eventual consistency part of the outside world, then we have Message pops up that Kafka out of the box from like on health. Excuse me So, okay, let's talk about first them within microservice a strong consistency and we have our persistent entity of like on So how is it achieved in this framework? So as we know we have actor so actor is known to be share nothing model and It's seriously process the message. So let's say you have Alice as we built our domain of a customer and So we might have a lot of concurrency concurrent commands that's come in but Alice will only process things one at a time because we have a nice mailbox and once we finish one message We'll continue to consume the other so locally we have no concurrency issue But well now we're in distributed world, but we have things in a cluster, right? So if suppose we have to Alice's and Alice receive multiple messages multiple commands at the same time How would it be handle it? 
So actually it's a starting without any replication that means within our cluster We might have multiple JVMs multiple applications that are running But Alice only live on one of them So let's say in your system. You have a set you have like all your customer IDs They will be equally distributed into your different application. So the club though So we have stayed through application in this way So that means when Request that comes in and wants to talk to Alice if the set if the request go to note one service Wonder it has joined Bob. It'll be redirected to Alice So in this way We can be sure that there only exists one Alice in the whole cluster and that's how we get strong linearizability in persistent entity and I mean, of course with the event that get emitted and How it's applied to the state that you has described and that's the the right that we actually operating on top of our Up on in our actor So it looks so good But what's the cause of this linearizability and how we can minimize it? So as we as we have starting with no replication So when one note leaves what happened when the note that Alice lives on died? 
So in Lagom these entities the IDs are redistributed to the to the run to to the applications that's still running and of course There's a delay that you know you have to reconstruct your events so that you can have a living entity of Alice But this is optimized because the we have snapshots that we can only replay We we can replay the set of events that we can still apply to snapshots So basically we don't have a long list of events that we have to replay in order to bring that bring Alice up even when she moves from one when she moves from staying in one application to the other from from one note to the other and We also have latency so we get availability in the sense I mean we get less delay because we have optimized read view and we are able to redistribute our entities But when we have network partition so because we are Starting with no replication and that means when our cluster got split a set of our customer a set of our entities might not be reached Right and then now we read we are not able to serve our clients until Actually, we this partition is healed whether it's by a cluster management tool or it's by human intervention that we can Heal this partition. So this is the downside, right and So now I want to reason about the Lagom this model of distributed system So a lot of us know captain right like CAP consistency availability and petition tolerance And we say that we can choose two out of three whether it's and see CPE or AP and It receives some criticism that it's too binary. I mean it doesn't allow for granularity of Let's say we look at availability. 
For availability, we might be able to look at it in terms of delay instead. So I would like to invoke the delay-sensitivity framework that Martin Kleppmann talks about; it's a very good paper. The idea is that the stronger the consistency model we choose, the more sensitive our system becomes to delay, and the paper lays out the different models very neatly. If we look at the first one, linearizability, which the persistent entity actually gives us, we have strict, strong consistency, so our system is sensitive to delay on both the write side and the read side, because all these operations have to happen as if they happen within the same unit of time, atomically. Then we have sequential consistency and causal consistency, which are less relevant here, and I'm also afraid I'd explain them incorrectly. But to briefly mention them: sequential consistency is where, when you do writes, you make sure the read side sees your writes in order, and causal consistency is where only operations that have a causal relationship have to be ordered. With these weaker models of consistency, our system becomes less sensitive to delay. Think about sequential consistency: say we have a Mongo setup with a primary and secondaries. The writes have to be coordinated, and that means if there's a lot of delay in the network, the write side will be sensitive to it, because all the writes have to be coordinated. But on the read side, a client is able to keep reading; it sees data stale by as much as the replication lag, but it can always read the values. Now I want to add another axis. This is just my own thought and might be incorrect; it has nothing to do with Martin Kleppmann's paper. The axis is scope. That means: if we know that our system
is sensitive to delay, how do we reason about that delay? I think maybe we can think about it in terms of scope, the scope of operations. If you think about distributed transactions, and you want strong consistency there, we know it's very expensive; it requires a lot of network coordination, and if the network is slow, we'll have a very slow system. Now narrow the domain step by step. Think of the first case as a monolith, where we execute things across domains. Then we move to microservices, where each service owns its own domain, but we still have table locking, because the underlying representation of the domain is separated into different tables. And then we have an append-only log plus in-memory operations; what is that? It's an actor. It doesn't need a lot of coordination: events are appended to the log, and those events are applied to your in-memory representation of the state, so it's very quick. So basically you have a very small scope, but you still suffer from delay sensitivity, because you want strong consistency in your persistent entity. But can we do better? What if we want a replicated persistent actor? As I mentioned, we are in the model of sharding without replication, and as a result we suffer on availability, because if part of the shards are unreachable, we become less available. So then there's the replicated persistent actor, and I think some of you might know about it
because I heard that you might know about it. So, replicated persistent actors: I think there's a framework called Eventuate, written by the same author who wrote akka-persistence; it's at an early stage. The idea is that we can have replicated event logs. Compared to Akka persistence, where actors cannot share their events, here the events can be distributed across different locations, and with replication all the actor's replicas can share the event log, so we become more available. But how do we coordinate concurrency, if one call goes to the Alice on node one and another goes to the Alice on node two? If we define our actor state as a CRDT, a conflict-free replicated data type, the replicas can be reconciled without network coordination; they will eventually arrive at a deterministic state. And I think Eventuate also has a nice API that lets you resolve these version conflicts yourself. But why would one still choose the stronger consistency model that Lagom provides? I think if the business requirement is that you need strong consistency, then we have to. And I think it corresponds more strongly with the share-nothing model of actors, as well as with the strong consistency boundary of the aggregate root concept in DDD.
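To make the CRDT idea concrete, here is a minimal grow-only counter (G-Counter) in plain Java; a sketch of the general technique, not Eventuate's API:

```java
import java.util.HashMap;
import java.util.Map;

// Grow-only counter CRDT: each replica increments its own slot, and merge
// takes the per-replica maximum, so merges in any order converge to the same
// value without any network coordination.
public class GCounter {
    final Map<String, Long> slots = new HashMap<>();

    public void increment(String replicaId) {
        slots.merge(replicaId, 1L, Long::sum);
    }

    public long value() {
        return slots.values().stream().mapToLong(Long::longValue).sum();
    }

    public GCounter merge(GCounter other) {
        GCounter out = new GCounter();
        out.slots.putAll(this.slots);
        other.slots.forEach((id, v) -> out.slots.merge(id, v, Math::max));
        return out;
    }
}
```

Because merge is a per-replica maximum, it is commutative and idempotent, which is exactly why the replicas can be reconciled without coordinating.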
That strong aggregate boundary means we must be able to maintain it: if you use a replicated persistent actor, when you receive a command and you consume the events, you might get events that were generated as a result of a command on another instance. So, that was linearizability in the domain we care about; this is the write side, the C part of CQRS, the command part. Now we go to the read side, so we are in the eventual-consistency world, and Lagom has a nice abstraction here, the read-side processor, which was mentioned earlier. To put it graphically: from the bottom up we have the transactional boundary, where we have strong consistency. That is where our events get emitted, and when events get emitted, that means things actually happened and we have to obey them. Those stream up into the event processor, and eventually they go into your read-side table. But what's important to note here is that T is not equal to zero: we are outside our strong consistency boundary, and there will be a delay that we have to embrace at the read side. To be eventually consistent, we not only embrace the delay; we must also make sure that the events we apply to our tables are applied with exactly-once semantics: we don't want duplicate events, and we don't want to miss any events. So how do you do that? Lagom has its own offsets: each event comes with an offset, and the event processor keeps track of which offset it has processed. We both update the read-side table we care about and persist the offset, and they happen atomically, so we do have a transaction there. Lagom uses Cassandra lightweight transactions for this, so it's atomic: all succeed or all fail. But what about performance?
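The offset-tracking idea just described can be sketched in plain Java with illustrative names; a real Lagom read-side processor commits the table update and the offset in one database transaction:

```java
import java.util.ArrayList;
import java.util.List;

// The projected table update and the processed offset are "committed" together,
// so after a crash the processor resumes from the last offset and each event
// lands in the read-side table exactly once.
public class ReadSideProcessor {
    public long committedOffset = -1;
    public final List<String> readTable = new ArrayList<>();

    public void process(long offset, String event) {
        if (offset <= committedOffset) return; // redelivered event: already applied, skip
        readTable.add(event);                  // update the read-side "table"...
        committedOffset = offset;              // ...and the offset, atomically in a real store
    }
}
```

The duplicate check plus the joint commit is what turns at-least-once delivery into an exactly-once effect on the read side.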
As for performance, it's okay, because we already said that T is not equal to zero; T is an unbounded but finite time. Once our business requirement embraces this eventual-consistency model, it's okay to pay a performance cost on the read side. Right, exactly: it's a time-based UUID, and it's given out of the box by Lagom. When you emit an event, it gets persisted together with a time-based UUID, so the events will not be out of order; it acts as a clear sequence number. And because things get processed in an actor and the events get persisted there, you know the commands, and therefore the events, come in the order you expect. Underneath, Lagom I think uses Cassandra quorum reads and writes, which means that before the call returns you're guaranteed the events will be there. You can also use this to achieve long-running business processes. If you think about a business process of side effects you want to create, something where you're okay with a delay, you can have the same kind of mechanism, where an offset tracks which events have been further processed. But because the side effect here is not a persistent write that we can commit together with the offset in one atomic Cassandra transaction, what could happen is this:
We create the side effect, and then our system crashes before we could commit the offset. So when the system comes back up it says, ah, I haven't seen this event, and we replay the event again. So we do have a nice guarantee that our events will reach whatever operation we want to continue with, but it will be at-least-once delivery semantics, so the consumer side has to be idempotent, which I'd assume a lot of consumers, even in REST implementations, should be anyway. Now we go to the Wild West, well, not really: outside our domain. What do we do when we communicate with other services? When we talk to another service, I would say they only ever care about, at most, the events that you care about. Have you had colleagues come to you saying, hey, can you expose a GET endpoint for us, because we need to enrich our domain with your domain? How can we solve this? Because with this synchronous GET communication we are time-bound with the other service: the two services have to be up together, so we are more fragile to failure. When you call a GET endpoint, you are asking for a static view of the domain you care about. But if you can emit the events, your client can build a local, materialized view; basically there's a materialized view of your service on their side. Some people would say that having a duplicate source of data might lead to inconsistency, but I'm saying it will be consistent, because we have a nice mechanism ensuring that this event stream will eventually reach your client. And what about when your friends say, hey, can you call this endpoint when you finish this application,
this process; or, can you raise an event when you finish something? So let's say you persist something into your database, so you're done with your operation, but your friends still care about some event that they want you to keep propagating. What guarantee do we have? If we crash right after we finish our operation, can we replay the events again to make sure we can still call our friend? Otherwise our system can become inconsistent when it crashes at exactly the place you don't want it to. Now, if we see that events are the only source of change in the system, then I would say these propagations are sufficient for whatever outside communication would care about, because your state can only be changed by events: if you say you want something out of me, I should be able to derive it from my events as well. I think this is similar to some implementations where people stream the replication log of a database. If you think of a database also as the state of an application, then when it wants to replicate itself, it streams its oplog to its replicas to recreate the same state, right?
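The local materialized view that a consuming service builds from a published event stream might look like this; the event names are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// A consuming service folds another service's published events into its own
// local, materialized view, instead of calling a synchronous GET endpoint on
// every lookup.
public class LocalCustomerView {
    public final Map<String, String> namesById = new HashMap<>();

    public void onEvent(String customerId, String eventType, String payload) {
        if (eventType.equals("CustomerCreated") || eventType.equals("CustomerRenamed")) {
            namesById.put(customerId, payload);
        } else if (eventType.equals("CustomerDeleted")) {
            namesById.remove(customerId);
        }
        // any other event types are simply ignored by this view
    }
}
```

The consumer keeps only the fields it cares about, and the two services no longer need to be up at the same time.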
The oplog analogy is similar: those database operations are the events that our system, and other systems, care about. But here we have events that are more expressive: we raise the level of abstraction out of the database, not tied to the database, and they are semantically rich; we can put a lot more into our events, with clear business logic embedded in them. [Audience question] Right, so let me qualify that statement. You can adapt your events: you can transform your events into the message that you put on the Kafka topic, so you do have the ability to abstract away details you don't want your clients to care about. That's one thing. The other is that this is only for the case where you actually want your system to be less time-bound; I don't advocate this as the solution for everything. Say you have an actual consumer client, say in RedMart a customer wants an item: I'm not saying we cannot expose GET endpoints to serve clients items; it's not like the front end should keep its own cache and rebuild these states. But internally, if you want some metadata to enrich your objects, and your business is okay with this delay, and with the data possibly not being the most up to date, I think it's a valid case that can decouple your systems in time. So yes, that's correct, but they can do just enough work to satisfy the requirements. A lot of the time,
I think when we expose a GET endpoint, we have the problem that we expose too much. What one could do is define endpoints that are very specific to each client, giving only the information that client cares about, but that is not general enough; whereas here the clients can build their local view to retain only the information they care about as well. [Audience question] Yeah, as I said, I think this is a trade-off you can decide on. It gives you an alternative that allows your system to be less fragile, because if you still continue to... I think I am repeating myself. The reason I favor other services subscribing to my service over exposing a GET is, again, that GET is synchronous. [Audience: the bigger problem I see is a complete loss of abstraction.] Right, that's a very valid point, and indeed we can filter. Before we publish the events we have full control to filter out the events we don't want to expose and to convert them into a different message type, and that is as easy as converting one case class to another before we put it on the queue. So it doesn't have to be every event that has ever happened inside your system. In fact, what we do is that the services expose different topics for different types of messages, so you can subscribe to whichever you want. The persistent events coming out of persistent entities are not the ones that are actually sent out to other services; we have another layer of abstraction. That's why Lagom prompts you to separate the API project from the implementation project: the API project is where you define the public contract to the world, and there you can have a different case class with fewer fields, for example.
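That API-versus-implementation split can be sketched like this; the class names are illustrative, not Lagom's actual message API:

```java
// The internal event carries a field the outside world should not see; we map
// it to a slimmer public message before publishing to the topic.
public class EventAdapter {
    public static class InternalDepositEvent {
        public final String accountId;
        public final long cents;
        public final String internalLedgerRef; // implementation detail

        public InternalDepositEvent(String accountId, long cents, String internalLedgerRef) {
            this.accountId = accountId;
            this.cents = cents;
            this.internalLedgerRef = internalLedgerRef;
        }
    }

    public static class PublicDepositMessage {
        public final String accountId;
        public final long cents;

        public PublicDepositMessage(String accountId, long cents) {
            this.accountId = accountId;
            this.cents = cents;
        }
    }

    // Drop internalLedgerRef on the way out: subscribers never see it.
    public static PublicDepositMessage toPublic(InternalDepositEvent e) {
        return new PublicDepositMessage(e.accountId, e.cents);
    }
}
```

The public type lives in the API project as the contract; the internal event type stays free to evolve inside the implementation project.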
Right, yeah, and I think the ownership worry is a very valid point: when different people keep a local view of your data, as a service owner you can get a little worried, like, I don't want to have to inform everyone of everything when things change; I want to keep ownership of my data. But I think we can put appropriate abstractions over it. So, yes: we get at-least-once delivery when we publish to Kafka topics, and it uses a similar mechanism of offsets. And similarly, because we cannot atomically publish to the Kafka topic and commit the offset in one operation, we have at-least-once delivery, and clients should be prepared to be idempotent consumers. Okay, so in this section I explained how we can reason about our Lagom system: where things have strong consistency, in the blue area; how things propagate to the read side; and how we can think of the green box, whether it's the outside world or your own read side, as just a materialized view, achieved through event streams. Thank you. [Audience question: what if an event is missing, or, in the worst case, your server suddenly shuts down?] Right, are you referring to this part, where there's conflict resolution? Actually this is not Lagom; this is an alternative way one could implement it, with a weaker consistency model, which I put here purely as an alternative. So I would not be able to answer how those version conflicts get resolved, because in Lagom you will never have a conflict: everything is linearized and you have full control. If you want to reconcile your data, maybe you have to have some external logic to handle that. Well, we do in fact use reconciliation on the business side, for example:
you will have a read side, for example, that tracks the transactions that lead to a certain balance state. Right. So, sure, any questions? [Audience question] Right, so yes: when we have inconsistent data and you're not able to process it... I don't think there's a nice abstraction for now to put such messages in a dead-letter queue, and we are actually still exploring how to resolve that problem. But one way of resolving it is this: if you know what could potentially go wrong in your system, you can model those failures as events that have actually happened. They don't have to be events that mutate your state; they can be events that inform, that notify that we do see some inconsistency. [Audience: like a failure, some invalid data correction or something?] To give you an example, recently we had a case where a command came in that was trying to reduce the cash balance. So it published the cash-balance-decreased event, and according to our logic that event needed to be applied for your state to actually reduce the balance; but due to rounding, my portfolio state was short by 0.1 cent. So it couldn't actually apply that event to the state. When that happened I was a bit panicked, like, what is going on here? But what we did was let that event just keep hanging in there, just keep trying to apply itself to the state, keep trying, keep trying; and while this guy was busy retrying, I released new code to deliberately corrupt the state. So now the state goes below zero, a negative balance, which is theoretically wrong, but we then corrected it by appending a fix event, an increase-cash-balance event. So that's one way of handling a poison message.
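The underlying rule at play in that story, validate the command but apply a persisted event unconditionally, can be sketched in plain Java; the names are illustrative, not the Lagom API:

```java
import java.util.ArrayList;
import java.util.List;

// Commands may be rejected before anything is persisted; once an event is in
// the journal it is a fact, and apply() must accept it unconditionally.
public class AccountEntity {
    public long balance;
    public final List<Long> journal = new ArrayList<>(); // persisted balance deltas

    // Command side: validate first; nothing is persisted on rejection.
    public String handleWithdraw(long amount) {
        if (amount > balance) return "InvalidCommand: insufficient funds";
        persistAndApply(-amount);
        return "OK";
    }

    public void handleDeposit(long amount) {
        persistAndApply(amount);
    }

    void persistAndApply(long delta) {
        journal.add(delta); // "persist" the event first...
        apply(delta);       // ...then update the in-memory state
    }

    // Event side: no validation here, it already happened.
    void apply(long delta) { balance += delta; }
}
```

Had the rounding case above been caught at the command-validation step, the event that could not be applied would never have been emitted.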
So so one way of handling the poisonous message, sir Yeah, just that actor it's the failure is isolated to that actor instance. Yes I think Lagom does think about that and it provides API to actually deal with event migration But it is yeah, so it does have support for event migration That means for the events from the history that you have you can specify at which generation at which version which kind of Functions you want to apply up to the most recent version of the event that you have Char somewhere to be Right so Right sorry the question from the back. Yep Right side So let's say if the right side fails it could fail because of let's say Cassandra the tables are the persistent Right, so I think that's is purely configurable by your underlying database because what Lagom provides is basically a way for you to update your read side and then your actual API that you actually care about you can implement it in your own way and Therefore if it fails it will be the matter of the underlying persistent technology that you choose to implement this read side Are the right Right, so The right side and you're referring to when the message comes into the persistent actor that we care about right? So if that notes fails Lagom will automatically rebalance the entities to the existing running applications So let's say in the example where we have Alice So if the note that Alice died Alice and her friends would be distributed to the remaining notes and Then then the client relies on a cut clustering to do that, but like that is an area where we're still investigating further So that's a good point I just cashed the ID like like so I looked at the this book of like Enterprise architecture whatever that explains how to handle the idempotency issue and I looked at it It just says to cash So like that's what I'm doing and then so I think like over the time if we cashed too much for like I don't know what like 1000 IDs or whatever. 
I think we'll need some kind of mechanism to trim out the first 500 or whatever messages to free up the memory space, but I don't see any other way to handle that. Yeah. And it's actually not the offset; it's more at the business layer. So for example, if a deposit request comes through, it will carry a deposit ID with it, and then I check: by the way, I already have this message in my processed-deposit-ID sequence; the sequence of UUIDs already has it; so I just ignore it and log a warning saying, oh, I got a duplicate message. And I mean, that would be the case for a long-lived object. If you have a short-lived object, you can also design your domain in a more granular way. For example, some of our systems are very time-related, so some of us implement the domain such that the entity is specific to, say, this month, and then it only has to cache that much data. And, as you mentioned, some of our entities actually have these IDs as a business value already; it's not irrelevant data that we keep just for the sake of idempotency; you actually have that
So yeah Oh Right just to clarify this duplicate event is not the event that got stored as the main source of event That we used even source that the message here stick the Kafka message that that one business room so the damage if we're too corrupt anything It's purely within your entity and you have expressive power to actually program it in the way that you like So so actually that was the biggest challenge for us when we first adopted this lagom and actually you have to think very differently now Like I mean so so I'm coming from the background where I always Begin everything with begin statement and do so many insert and update and either you so that's how you achieve the Tomicity either you do it all or fail or but actually what you have to do in it. So I'll give you a good example. So You actually have to Think in you have to actually think as closely as the real business Activities for example, so say okay When I first designed my lagom trading system I was like I designed in a way that My trader service just go to the market like assuming that portfolio has cash balance And then just buy it and then and then and then ask like ask for portfolio service to accept the purchase security But there was a scenario there was a case where it's portfolio doesn't have enough money Like why did you go and buy it? You know, they kind of scenario. So actually you I actually Don't realize it. Oh, okay You have to actually think very close to how actual stock brokers would operate what they would probably do is they're probably Cored the portfolio owner and then and then they were like, okay I'm about to make 1000 buy order for Apple Do you have money and then if he goes like, okay, I have money Can you reserve it? So you have to actually reserve it and then and then like By after confirming that you actually have the enough Reservation in the cash balance you go ahead and buy and settles it. 
So that reserve-then-buy flow is how we solved the distributed transaction: by modelling the real world more closely. Yep. [Audience question] Right, so the reservation: I think we can do it that way as requests come in, but as you can see it requires multiple trips; multiple events get generated, and the services get a little chatty. But in the eventual-consistency world, one could say we can look at this as apology-oriented. That means you can go ahead and assume that most of your operations will succeed, and if one fails, you create a compensating transaction. In the real world, for example, take a warehouse, where a customer wants something we hold in the warehouse: we can do as much as we like with transactions in our database, in our technology, but what if someone in the warehouse breaks the thing? Then what's in the database is no longer consistent with what's in the real world, and there has to be a valid business flow to handle that case anyway. So, thinking in an eventually consistent way, transactions are not something that actually mimics the real world; we can assume most things will go fine and then use compensation, which has to be implemented in a well-functioning business anyway. It would be, like, a new business flow: refunds. Right, and the ATM machine example is a classic example in distributed systems: what if the ATM is disconnected from the world, can people actually overdraw? I think banks have some implementation for that; I'm not from traditional finance, so you can correct me.
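A minimal sketch of the reserve, confirm, compensate flow from the story above, in plain Java with illustrative names:

```java
// The portfolio validates and reserves cash first; if the downstream purchase
// fails, a compensating release puts the reservation back.
public class Portfolio {
    public long cash;
    public long reserved;

    public Portfolio(long cash) { this.cash = cash; }

    // Validate the command; on rejection nothing changes.
    public boolean reserve(long amount) {
        if (amount > cash) return false;
        cash -= amount;
        reserved += amount;
        return true;
    }

    // The broker filled the order: the reserved cash is spent.
    public void confirmPurchase(long amount) { reserved -= amount; }

    // Compensation: the buy failed downstream, so undo the reservation.
    public void releaseReservation(long amount) {
        reserved -= amount;
        cash += amount;
    }
}
```

Each step maps to one message between the trader and portfolio services, which is the "chatty but realistic" trade-off discussed next.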
It's okay. So yeah, there's no such thing as two-phase commit as far as Lagom is concerned. I think there's a pattern called Saga for distributed transactions, which is about pairing your commit with a rollback, but I just haven't had the necessity to use that yet. And actually, I never had to worry about transactionality while building this framework, which involves so many financial transactions and trading. [Audience question] No, there's no such thing as suspension in an actor as far as Lagom is concerned; it just keeps retrying. But take the portfolio example: say somebody already reserved 90% of its cash for trading, so you have 10% remaining, and another command comes through asking to reserve 11%, but this guy has only 10% left. So what do you do? Because of time I couldn't touch on this part, but when the command enters the persistent entity, we have to validate first, because we cannot argue with an event that has already happened. An event is something that already happened; it is a fact; we cannot argue with it. The only way to prevent the portfolio from going to a negative balance is to validate the command: just check whether you have enough money, and if you don't, just tell the sender no. Yes, the command, yeah. And actually in Lagom there is a ctx.invalidCommand method which will just return the error message back to the sender. [Audience question: if there is significant latency between the two sides, how would you treat it; what about your trading strategies?]
We're not building a real-time trading platform, so the performance aspects, which are certainly a concern if you have eventual consistency, are not that relevant for us. We trade once per day; it's a long-term investment strategy, and we actually execute once per day for our entire customer base, so we have more than enough time. But in fact, I still think Lagom is a good candidate for building a real-time live trading system as well. The reason is that it's very tightly integrated with Akka Streams, for example. I showed you the message topics in my screenshot before, and there I was only showing the endpoint types; but your endpoint type can be a stream, for example: you can actually open up a stream of communication through a WebSocket. And Kafka is extremely performant. So I think it's a very good candidate for live trading as well. [Audience question about a single heavily loaded entity] Yes, and that is partly why we're not using Eventuate, right. So, if you have a business model where a single entity receives a lot of load: right, I understand. One way would be to think about whether we can split it into its own set of entities and implement our own resolution following the real business flow. If you have one bank teller and this teller has a lot of customers, then in the real world you should have two or three bank tellers; that's one approach. The other is that if you have a lot of entities, then with Akka cluster, as you increase the number of nodes, there will be fewer entities living in each application instance, right?
And that means it actually divides the load. And as far as Alice is concerned in Lagom, if there are two of her at any given point in time, that is a system failure for us; we have some error going on in our system if there are two, because in fact they're not replicated. Right, so yeah, I think you can scale like this, but it depends on the business problem. For us this is not really an issue, because the load is quite nicely distributed among all our customers. It depends on what the use case is: why do you have so much load on this single entity? And then, as Michael was saying, you need to find a way, depending on the business problem, to break it down. Why does one entity receive, I don't know, say 90% of your traffic for some reason? Why is this happening? And then find a way to break it down at the business-logic level, right?
And then you scale horizontally by breaking it apart; that means you have to redesign your actor, you have to redefine your aggregate roots. We actually had a similar kind of situation with rebalancing load. Say we have 28 different risk profiles, with different allocation targets, in our situation. But then, say, because Singaporeans are more conservative investors, they all go into the risk profiles between 8 and 10. So if we used this risk-profile number, 1 to 28, as the partition key, the ID of the actor, then it's very likely that the actors with IDs between 8 and 10 would get most of the traffic. So you have to partition it differently; I haven't actually thought it through yet, but you have to choose your partition key wisely to distribute the load. Yeah, right, and to add to that: in a lot of highly distributed systems, whether it's Cassandra or Kafka, one of the keys to horizontal scaling is choosing the right partition key, right?
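A small, hypothetical demonstration of why the partition key matters: hashing a skewed key (most customers in risk profile 9) onto three nodes leaves one node with almost all the load:

```java
import java.util.HashMap;
import java.util.Map;

// Count how many keys land on each node when keys are hashed onto the nodes.
public class PartitionDemo {
    public static Map<Integer, Integer> loadPerNode(int[] keys, int nodes) {
        Map<Integer, Integer> load = new HashMap<>();
        for (int k : keys) {
            int node = Math.floorMod(Integer.hashCode(k), nodes);
            load.merge(node, 1, Integer::sum);
        }
        return load;
    }
}
```

With the risk-profile number as the key, 90 of 100 customers hash to one node; with a high-cardinality key such as the customer ID, the same hashing spreads the load evenly.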
And I think the same should apply when modeling your business domain as well.

But now you're saying before the fact, right? You're asking, what if we discover it after the fact? Yeah — so I think it should start out right, and that's why domain-driven design (DDD) is a very crucial part of designing a successful event sourcing/CQRS system. Otherwise... yeah.

Say you partition by the individual user's surname, for example, and then you deploy that in Korea. The actor that handles one particular surname will get killed, because half of Koreans share it, you know? So you have to choose your partition key really well, so that the load is evenly distributed.

The whole point of actors is that you get linear scalability, right? You can distribute the load just by creating another instance. So I guess you model your actors so that everything goes to the aggregate root, and depending on how you model it, you might distribute work to child actors. But whatever you do, that aggregate is, I guess, your single source of truth for any information. The actor is the point where you handle concurrency: you have an actor so that when thousands of requests come in, this guy is the one that makes everything sequential, so that it's fine, right?
So one day you realize that one actor is overloaded with requests, so you'd probably create two actors living in two JVMs?

That to me sounds more like a stateless actor — like you just create an actor to process and distribute the requests, and the actor doesn't necessarily hold state. For your example, I think looking into something like Eventuate might be a reasonable choice, because it gives you a replicated event-sourced actor and it has nice APIs to let you resolve conflicts. And if it is a counter, that's even simpler, because you can model it as a CRDT that's add-only — monotonically increasing. So if that is one of your problems, maybe someone can give a talk about Eventuate next time.

Of course, this is not one-size-fits-all, right? Like one actor transferring money to another actor — even in the real world you need a broker in between, so I'd probably come up with the concept of a broker actor, or transfer actors, that actually controls these transactions.

Akka Cluster just ensures availability: when an actor goes down, the cluster keeps just one instance, you know, running somewhere across multiple nodes. That's it. If you need performance, that's a different problem.

Yeah, I've got this question — what is your experience with Akka Cluster, anything non-trivial? For me it's still a new world, I'll be honest with you. I'm still learning Akka Cluster, and the way these actors form across different nodes, I'm still actually learning that. But, I mean, we're not live yet, right?
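The add-only, monotonically increasing counter the speaker mentions is the classic grow-only counter (G-Counter). Here is a minimal sketch in plain Scala — not Eventuate's or Akka Distributed Data's API, just the idea: each replica increments only its own slot, and merging takes the per-replica maximum, so concurrent updates converge regardless of merge order.

```scala
// Minimal G-Counter CRDT sketch. `counts` maps replica ID -> local count.
final case class GCounter(counts: Map[String, Long] = Map.empty) {
  // A replica only ever increments its own slot.
  def increment(replicaId: String, by: Long = 1): GCounter =
    copy(counts = counts.updated(replicaId, counts.getOrElse(replicaId, 0L) + by))

  // The observed value is the sum over all replicas.
  def value: Long = counts.values.sum

  // Commutative, associative, idempotent merge: element-wise max.
  def merge(other: GCounter): GCounter =
    GCounter((counts.keySet ++ other.counts.keySet).map { id =>
      id -> math.max(counts.getOrElse(id, 0L), other.counts.getOrElse(id, 0L))
    }.toMap)
}

// Two replicas update concurrently, then exchange state.
val a = GCounter().increment("nodeA").increment("nodeA") // nodeA -> 2
val b = GCounter().increment("nodeB")                    // nodeB -> 1
val merged = a.merge(b)
println(merged.value) // 3, in either merge order
```

Because merge is idempotent and commutative, replicas can gossip their state at any time without coordination — which is exactly why a counter is the "even simpler" case mentioned above.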
So we're running this; we're getting close to launch. I'm sure once we're live we'll learn a lot more, a lot faster. Maybe we can invite you back then to share our experience.

I have a feeling that the simplistic way to resolve these questions around Akka Cluster is just buying the ConductR subscription. Probably that's why there are not many resources available online.

I mean, PayPal is routing billions of transactions per day on just a handful of JVMs running a couple of threads each. We've just spent a lot of time implementing business logic, so we haven't had the chance to investigate Akka Cluster in depth. But yeah, we would love any sharing about Akka Cluster in this meetup; I think it would be helpful for the community.

One problem: when you take a cluster node down, it starts reporting "node down, node down" — total panic. I'm panicking. But it's promising; anyway, it's promising. And I honestly don't see many other alternatives out there. And that is why we're looking for seniors.

Question: is Lagom locked in with Cassandra? No, I think you can still use — yeah, they allow you to use JDBC for the persistence layer. But for the message broker layer, I think you probably could swap it, but for now the API support, with the offset maintenance, is only for Kafka. I can imagine a custom-built solution, because the pub-sub side is a little more distant from the actual core of CQRS and event sourcing. Okay, thank you.

Yeah, the underlying framework is Scala, but they do provide both APIs, and the Java one came out first. Yep — I was shocked, actually.

Question about your domain boundaries. Yeah, so it sounds like you have to get your boundaries right. Yes — but how do you feel about the flexibility to extend it in the future?
For example, your customers in the future could have different tiers — yep, basic, gold, right — and maybe some would have priority for their commands. How confident are you that you could extend it?

I think within its own model it allows for very easy extension, because basically you just define a richer representation of your domain, and if your feature is actually an add-on, it shouldn't interfere with past events. You can think of it as: once you have the new feature, you have new types of events in the stream.

But I see a potential problem with that, actually. Let's say we have a customer service, and we've already defined the customer entity, identified by its own ID. Then later on we introduce memberships — gold customers, silver customers. If this differentiation introduces a huge change in the customer state itself, I think it's going to be problematic, because that means we'd have to build another entity that represents the different type of customer, like a gold-customer entity. And then there could be potential duplications in IDs, and especially if these features are added later in time, it could be a real problem. But if it's just an attribute that describes a characteristic of this customer entity, I don't see why we can't just introduce it as an enum field in our state — fill it in, and implement the different logic in our command handlers. A separate entity per tier would be a really bad design from the get-go, right?

Actually, you know what, if that's the case I'd create a new service. I mean, I'd plan to retire this old entity that has hastily defined event sets — maybe keep it running — but I'd create another entity with a better design, and then, same thing, migrate.
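The "enum field plus new event type" approach described above can be sketched as a plain-Scala event fold — not Lagom's PersistentEntity API, and all names here are illustrative. The key properties: the new event type never appears in old journals, and the new state field has a default, so replaying historical events still produces a valid state.

```scala
// Hypothetical customer domain, extended later with a membership tier.
sealed trait Tier
case object Basic extends Tier
case object Gold  extends Tier

sealed trait CustomerEvent
final case class CustomerCreated(id: String, name: String) extends CustomerEvent
final case class NameChanged(name: String)                 extends CustomerEvent
// Added later as a pure add-on: old event streams never contain it,
// so replaying historical journals is unaffected.
final case class TierUpgraded(tier: Tier)                  extends CustomerEvent

// The new `tier` field defaults to Basic, so states rebuilt purely
// from pre-existing events remain valid.
final case class CustomerState(id: String, name: String, tier: Tier = Basic)

def applyEvent(state: Option[CustomerState], event: CustomerEvent): Option[CustomerState] =
  (state, event) match {
    case (None, CustomerCreated(id, name)) => Some(CustomerState(id, name))
    case (Some(s), NameChanged(name))      => Some(s.copy(name = name))
    case (Some(s), TierUpgraded(tier))     => Some(s.copy(tier = tier))
    case (s, _)                            => s // ignore events that don't apply
  }

// An old journal, written before tiers existed, still replays cleanly:
val oldJournal  = List(CustomerCreated("c-1", "Alice"), NameChanged("Alicia"))
val replayedOld = oldJournal.foldLeft(Option.empty[CustomerState])(applyEvent)

// A new journal simply appends the new event type:
val newJournal  = oldJournal :+ TierUpgraded(Gold)
val replayedNew = newJournal.foldLeft(Option.empty[CustomerState])(applyEvent)
```

This is the cheap case the speakers describe; if the new feature instead reshapes the state wholesale, the "new entity, migrate, retire the old one" path applies.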
I don't know — I think that's the same kind of major migration you'd have to do with a traditional database anyway; any huge schema change is always a pain.

Right, so, to the first part: events should be something that has already happened. That means that when events mutate the state, they shouldn't create more sources of divergence — when the event was persisted, your final answer should essentially already be there, because it's an immutable thing that has already happened.

"Your program has changed, right? So instead of having 1.5..." — how could that happen? I mean, the event is immutable. So you're asking what happens if the event itself gets corrupted? Yes. Okay.

Sorry, just one last question: what's the meaning of life? Just, uh... just for lunch. Joining StashAway, I think, would help you philosophize about life and eventual consistency.