We're going to discuss some event-driven architecture right now, and I've introduced it myself before. Tip number one: this is the book that I published three years ago, Migrating to Microservice Databases. I've been discussing microservices for the past five or six years. And the number one question that everybody always asks me is: OK, I know we need to break our monolith into smaller pieces, microservices, but everybody's talking about the behavior, which means the code. The code, we got it. It's not easy, but we got it. But what about my data? And I didn't have a proper answer for that. That's why I started researching, and after approximately two years I was able to publish this book, Migrating to Microservice Databases, thanks to Red Hat Developers, who bought the rights to the electronic version. If you go to this URL, or to the developers.redhat.com website, or to my Twitter, the pinned tweet on my profile has a link to the book too. You can get the electronic version for free. So that's the first tip. And from this book I am getting this quote: code is easy, state is hard. Because, as I mentioned before, it doesn't matter how complex the behavior inside your microservices is, splitting code is much, much easier than splitting distributed data. And why is that? In distributed systems, we have an interesting analogy. We like to say that data suffers from gravity, which means that data has mass. What does that mean when we're trying to distribute an architecture? It means that the more data you have in one point of your system, the more data it attracts. And why is that? Well, in complex enterprise information systems, we have a lot of complex relationships between the data. We have a lot of correlation. And it's much easier for you to establish this correlation if the data is stored close together in the same endpoint. So the more data you have in one endpoint, the greater the temptation to put more data together.
And then you start to create big microservices, or small monoliths, around this distributed data. Very common pattern. Another implication of gravity, of data having mass, is that the more data you have in one single place, the harder it is for you to get this data and move it to other points. So for replication, distribution, and everything else, the more data you have, the harder it is to distribute. These are some concepts that we need to keep in mind. When we discuss the very traditional solution, as I mentioned in a previous talk, the traditional way for people to try to distribute the data when they have a microservices or distributed architecture is: well, let's just expose the information through a REST endpoint and access everything over HTTP. And some of the main issues with that are, for example, latency. This data used to be stored locally, in the same database, so it was very close to where it was being used; but now I have this latency issue. Whenever I need to request the data, it takes some time to go back and forth over the network to get the data that I need. It also has availability issues. As we mentioned before, in this case we have temporal coupling and uptime coupling: both services need to be up at the same time, or else my system is not working properly. And because of that, I also have performance issues. It's going to be much harder for me to get proper performance because of the network between the two endpoints. And we know that REST over HTTP will never perform as well as a local database connection; it will be at least one order of magnitude slower than a local connection. So we have performance, latency, and availability issues. And to solve these problems, one of the first things that comes to mind is to add a caching solution. Because we have a performance problem; well, let's try to add some caching.
And caching also, to a certain degree, can help you solve the availability issue, because if the remote endpoint is down, maybe we only need the data that is already cached. So you'll be able to work for some time with the cached information. But caching also introduces some issues by itself. And of course, if you're using caching, you will probably want to tie it to some polling strategy. So how often are you going to fetch the information? Once you fetch the information, for how long are you going to keep it until it expires? Five minutes, 10 minutes, 40 seconds? It depends on other requirements. For example, how often is the data changed on the other endpoint? And how long can you afford to have outdated data in your local endpoint? And also, if you're using caching in your system, it's very likely that the data that you're storing locally is stored in a different format from the cached data. It's very typical for enterprise information systems: you are storing your local information using a relational database, and if you're using an HTTP cache, you're caching JSON or XML; or if you're caching in the application layer, if you're using Java, you're caching Java objects and not the data. And to correlate this data, in the past you used to just issue a join to your database; all the data was correlated. Now you have to perform in-memory joins in your application between the objects that you have here and the information that you're going to fetch from the database. You have issues no matter what solution you embrace. And of course, if you're adding a cache, you have the problem of eventual consistency. In my book, I discuss these issues with eventual consistency. In enterprise information systems, we don't use pure eventual consistency, because in a pure eventual consistency model we might have data conflicts.
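The polling and expiration trade-off described above can be sketched as a tiny TTL cache wrapper. This is a minimal illustration, not any real caching library; the RemoteCache name and the remoteFetch function are made up for the example.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal sketch of a TTL cache in front of a remote lookup.
class RemoteCache<K, V> {
    private record Entry<V>(V value, Instant expiresAt) {}

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final Function<K, V> remoteFetch; // the slow HTTP call
    private final Duration ttl;               // how long stale data is acceptable

    RemoteCache(Function<K, V> remoteFetch, Duration ttl) {
        this.remoteFetch = remoteFetch;
        this.ttl = ttl;
    }

    V get(K key) {
        Entry<V> e = entries.get(key);
        if (e != null && Instant.now().isBefore(e.expiresAt())) {
            return e.value(); // possibly outdated: this is eventual consistency
        }
        V fresh = remoteFetch.apply(key); // latency + availability cost paid here
        entries.put(key, new Entry<>(fresh, Instant.now().plus(ttl)));
        return fresh;
    }
}
```

The TTL is exactly the question from the talk: how long can you afford outdated data on the local endpoint?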
Usually, we're not allowed to have data conflicts in enterprise information systems. So we always want strong eventual consistency, which means I have a single point of write in my system, which is the remote endpoint. But when you talk to EIS developers, enterprise information system developers, who are facing the customer and business analysts and so on, if we ask them, do we need strong consistency, they will always answer: yes, I need strong consistency. I need to be consistent all the time. But in the real world, you don't need strong consistency all the time. Think about it: the most important business decisions are usually taken with a report in hand. And this report was printed six hours ago, or one week ago. So they're taking strategic decisions based on information that is already outdated. The information is correct, but it's outdated. Well, that's eventual consistency. So instead of fighting against eventual consistency, we need to start embracing eventual consistency in our daily activities as developers. Eventual consistency will be the standard consistency model in our distributed systems architectures, in our microservices architectures. But before I present the solutions for how we can deal with distributed data in an event-driven architecture: I know event-driven architecture is a very large subject, and I wouldn't be able to cover everything in just one hour. So I chose to cover only the data distribution strategies for event-driven architectures. And maybe it's a good choice, because that's the point most people struggle with. So before digging into the solution, let's go to our time machine, look 10 or 15 years into the past, and analyze how data was managed back then. What were the technologies that we were using 10, 15 years ago? First, Java EE 5 was released in 2006. So 13 years ago, we were still using EJB 1.1 or 2.0.
So yes, we were still fighting with entity beans. We had to create super specialized classes, more than one, actually, to be able to persist and carry the information from our databases, and they wouldn't scale very well. This kind of technology was very hard to create, very hard to deploy, very hard to maintain, very hard to test. And then we also had the release of Java SE 6. We also had the release of Hibernate, for example; Hibernate 3, I believe, was released in 2005 or 2006. In 2006, we had the release of JPA 1.0. So we started to change the way that we were persisting our information, at least in the Java technology. But if you look at other platforms, too, after the creation of Hibernate, a lot of other technologies started to see that maybe this ORM thing might be a good solution for some problems. So yes, we thought that Hibernate would be a very nice solution to migrate away from entity beans. We also had some SQL mappers, but ORMs have their advantages over SQL mappers, too. And with Java 5, we got a feature added to the Java language called annotations. In the past, we had all of the metadata of our applications stored in XML files, and we had to maintain them separately: you maintain your Java code, and you need to keep your XML files in sync. So we basically replaced the XML hell with annotation hell, because now all of our metadata is stored in annotations in your code. And some people might argue that, since both are hell, both options are bad; but annotations are better than XML. And why is that? We learned that metadata is much more useful if it's stored close to the data, close to where it's used. In the past, with the Java code here and the XML file over there, it was hard for you to establish the correlation. The metadata still needs to exist, because we need this information, but now it's stored very, very close to where it's used, so it was definitely an improvement. And it changed the way that we developed Java applications.
And another good point to emphasize here with Hibernate is that entity beans were super specialized classes to store and retrieve information. With Hibernate and JPA 1.0, we were allowed to use plain old Java objects as the structure of the information. So now we're storing POJOs, retrieving POJOs. And in the beginning, the practice that spread through the Java world, and unfortunately this is still true today, was to create what we call an anemic domain model. What is that? Our POJOs are simply DTOs. They don't have any behavior; they are just containers for information. The typical enterprise Java application architecture is that we save POJOs to and retrieve POJOs from the database, and all of the business logic is contained in service classes. The service classes contain the business methods, and they usually hold the transaction demarcation of your application. And to process any business logic, you'll be querying something from the database, getting all of the information out of your POJOs using getters, processing that, and then putting everything back using setters. Well, that's not really object encapsulation, right? You're exposing all the information and putting it back. But at least we had an alternative to the super specialized classes: POJOs. And if you ever studied domain-driven design, well, maybe you could be adding some behavior to these POJOs. I'm not arguing that POJOs are bad. For example, if you have a very simple CRUD architecture, like the one that I showed when demoing Quarkus, maybe POJOs and an anemic domain model are a perfect fit. But as I said before, in our world of enterprise developers, we usually deal with very complex business domains. And these very complex business domains might benefit from richer domain models, which means that our classes might benefit from having behavior, too.
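The contrast between an anemic model and a richer domain model fits in a few lines. OrderDto and Order here are hypothetical example classes, not from any framework:

```java
// Anemic model: a bare data container; all logic lives in some service class
// that pulls values out with getters and pushes them back with setters.
class OrderDto {
    private double total;
    public double getTotal() { return total; }
    public void setTotal(double total) { this.total = total; }
}

// Richer model (domain-driven design style): the entity owns its behavior
// and protects its invariants instead of exposing raw setters.
class Order {
    private double total;

    public void addItem(double price, int quantity) {
        if (price < 0 || quantity <= 0) {
            throw new IllegalArgumentException("invalid line item");
        }
        total += price * quantity;
    }

    public double total() { return total; }
}
```

With the richer model, a service class can no longer put the order into an invalid state: the rule lives with the data.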
So that's one of the reasons why I think that studying domain-driven design is very worthwhile for enterprise developers. Also, 10, 15 years ago, we coined this term called event sourcing. Event sourcing itself is much older, like 30, 40 years old, but we only started naming it event sourcing approximately 10, 15 years ago. Some of you already know event sourcing; you might be using event sourcing in your systems. But I have to explain very briefly how it works. If you're working in the banking industry, you know that you don't store the balance of an account in a single row, in a single column, in your application. You don't have an account ID, a customer ID, and the current balance in a single row. We call that approach of storage snapshotting. And why is it snapshotting? Because you only know how much money you have in the bank account now. You only know the current version of the information in your system. That's snapshotting. I'm not arguing that it's bad. Again, if you have a very simple CRUD use case, this is probably the best way to store your information. But for other types of data, maybe we have better approaches. For example, in the banking industry, you store the amount of money that you have in your bank account as transactions. And the true definition of event sourcing is that the state of the data is a stream of events. These events don't necessarily need to be persisted, but it's very common to persist those events so you can reuse them later. That's the case in the banking industry: the state of the data is a stream of events, and you want to persist these transactions. So how does it work? If all of the bank accounts start with a zero amount of money, and you just apply the debit and credit transactions one after the other, you can compute the add and subtract operations, and you know how much money you have in your bank account.
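The bank account example can be sketched directly: the state is a stream of events, and the balance is computed by replaying them from zero. Account and Transaction are illustrative names for this sketch:

```java
import java.util.List;

// Event sourcing sketch: the balance is never stored, it is computed by
// replaying the stream of transaction events from zero.
class Account {
    // A transaction event: positive amount = credit, negative = debit.
    record Transaction(long amountCents) {}

    // Replay all events from zero to get the current balance.
    static long balance(List<Transaction> events) {
        long balance = 0;
        for (Transaction t : events) {
            balance += t.amountCents();
        }
        return balance;
    }

    // Free time machine: replay only the events up to some point in the log.
    static long balanceAfter(List<Transaction> events, int upToEvent) {
        return balance(events.subList(0, upToEvent));
    }
}
```

The same replay loop gives you the audit log and the time machine mentioned next.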
Some advantages of this pattern are that you have very fast writes, because your persistence mechanism only needs to be an append-only file, for example: you're just writing to the end of the file. You have a free audit log. If you think, well, maybe my balance is wrong, just start from zero and apply everything again: you have a free audit log. You also have a free time machine. If you want to know how much money you had in your bank account exactly one year ago, just start from zero and apply all of the transactions until that point in time, and you know how much money you had in the bank account. And as I said about auditing, it's so efficient that banks, for example, run exactly that routine every single night. They have a snapshot of all of the money that you had in your bank account the night before. So tonight, they're going to run all of the transactions that happened during the day, and they're going to check if the snapshot plus the transactions is equal to the current amount of money that you have. If anything is different, they're going to flag that account, and a person is going to verify the next morning what happened in the system. So don't worry: the bank never loses money. And if you think, oh, I have more money than expected, well, maybe something's wrong; the bank will take care of that extra money that you think you have. That's how event sourcing works. So that's the basics of event sourcing. And what are the benefits of event sourcing? Well, one of the benefits is that event sourcing, and events in general, enable you to think in terms of the events that happen in the system. Let me try to explain why that's important. When I learned how to model and how to develop applications, it was many years ago; I don't know how many of you learned it that way. But many years ago, we used to study something called DFDs. We had the entities. We had the repositories of information.
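The nightly routine just described, yesterday's snapshot plus the day's transactions compared against the current balance, can be sketched like this (illustrative names, not real banking code):

```java
import java.util.List;

// Nightly reconciliation sketch: yesterday's snapshot plus today's
// transactions must equal today's balance, otherwise the account is
// flagged for manual review the next morning.
class Reconciliation {
    static boolean accountIsConsistent(long yesterdaySnapshot,
                                       List<Long> todaysTransactions,
                                       long currentBalance) {
        long replayed = yesterdaySnapshot;
        for (long t : todaysTransactions) {
            replayed += t; // replay only today's events on top of the snapshot
        }
        return replayed == currentBalance;
    }
}
```

Replaying only on top of the latest snapshot is what keeps the check cheap enough to run every night.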
That was the usual approach. And we also learned how to use databases and, later, SQL databases. So it was very common for business analysts, and for me as a developer and an architect, too, to interview customers, the users of the system, and say: OK, let's try to discuss the system. What does the system have? The system has a customer, has an order, has an invoice. That was the first question. The second question was: what does a customer have? What does an order have? Oh, an order. It has an ID, it has a date, it has a customer ID, which I'm going to relate to the customer table. What else? Well, it has a payment option. The customer has an ID, a social security number, an address, a telephone. When you're asking these questions, you are more worried about the structure of the information than the behavior of the system. And this kind of modeling was very appropriate 10, 15, 20 years ago. Today, when we're living in this microservices world, maybe other types of approaches are a better fit. I'm not saying that the structure of the data is not important; it is very important. But in 2019, maybe other things in the system are more important for our modeling. One of the things we have to learn is that in 2019 it is much more important to think about the behavior of the system. And events allow us to think about the behavior. If you ever practiced some domain-driven design, you know that in domain-driven design we have a technique called event storming, which is how we try to model some concepts of the system. In event storming, we just start a discussion with all of the stakeholders in our application. And we don't talk about structure; we only talk about the events of the system. So, what is happening in the system? OK, a product is added to the shopping cart. OK, that's an event. The customer pays for that. Well, that's an event.
So you just discuss the events, and later you start to model the other parts. It might sound silly, but if you change your mindset to events, you'll see that it completely changes the way you model your applications and the way you separate information between your endpoints. So events should be very important to the way that we model our applications. And another concept from the past: CQS. CQS was coined by Bertrand Meyer in his book about the Eiffel programming language. Bertrand Meyer released the book, I think, in the 80s, and I read it some years, or many years, later. He proposes something called CQS, command query separation. And the quote that I have from the book is: asking a question should not change the answer. When he explains CQS, when he says asking a question should not change the answer, it means that you need to have separate methods for reading and writing. You can't have a method that reads the information and updates the information at the same time. So if you're using CQS, you would never find a method like find-and-increment. No: you have a finder, and you have an update method or an increment method, never both. And if you use CQS, it means that not only should a method never do both operations at the same time, but when you're designing your service interfaces, you should split your interfaces into write methods and read methods. So if you're coding in Java, you would have an interface for the write operations and another interface for the read operations. At the time, I thought that sounded very silly, because in 99% or 100% of the cases back then, we would be implementing both interfaces in the same class. So why waste time declaring them as separate interfaces? OK. But Bertrand Meyer, of course, was much wiser than me and other people.
And 30 years later, we realized: oh, maybe it's a really good idea to implement them separately. Because if you implement write methods and read methods in separate interfaces, you can choose different implementations for your read and write operations. And I'm not even talking about distributed systems or microservices. It's not an unusual use case: many applications these days prefer to write information to a relational database, but when you're querying the information, you're using an Elasticsearch database. That's a perfect example of CQS. You have separate interfaces for read and write operations in your Java code, and you're implementing them in separate classes. It wasn't the case many years ago, but today it's not unusual to choose this type of implementation. So that's why it's very useful to have separate interfaces. With this in mind, 10, 12 years ago, Greg Young coined the term CQRS. CQRS stands for Command Query Responsibility Segregation, which is a very fancy name. If you're thinking, I can't get what Command Query Responsibility Segregation means, don't worry. When Greg Young coined the term CQRS, he just wanted it to look like CQS, but be different. So he added an additional letter and, later, well, picked the words. So now we have an acronym which is like CQS, but different, and stands for Command Query Responsibility Segregation. That's how you give a meaning to some letters; as I said, developers are very good at naming things. CQRS: and I'm pretty sure that you have already created a CQRS pattern for your persistence, but you didn't know that it was called CQRS. So how does CQRS work? CQRS means that, yes, you will have CQS: you will have different Java interfaces for read and write operations. But you will also be using different models for the read and the write operations.
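The CQS split of interfaces looks like this in Java. CustomerWriter, CustomerReader, and JpaCustomerRepository are hypothetical names for the sketch; the point is that today one class may implement both, but the two interfaces leave the door open for separate implementations later (say, relational writes and Elasticsearch reads):

```java
// CQS at the interface level: commands and queries are declared separately.
interface CustomerWriter {
    void save(String id, String name);   // command: changes state, returns nothing
}

interface CustomerReader {
    String findName(String id);          // query: asking does not change the answer
}

// Today both may still live in the same class; the in-memory map stands in
// for a real data store in this sketch.
class JpaCustomerRepository implements CustomerWriter, CustomerReader {
    private final java.util.Map<String, String> table = new java.util.HashMap<>();
    public void save(String id, String name) { table.put(id, name); }
    public String findName(String id) { return table.get(id); }
}
```

Callers that only read can depend on CustomerReader alone, so swapping the read implementation later does not touch them.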
In a traditional CRUD architecture, you use the customer class for writing, for inserting into and updating the database, and for querying: whenever you're querying your database, you're retrieving customer objects. But if you're using CQRS, it means that you still have the customer class for writing, but when you're reading, you're using a different class. What does that mean? Say this is a CRUD architecture: I have a customer table. If I'm inserting into customer, I use all the fields. When I'm selecting from customer, I get all the fields. But if I decide that I need to create a custom query for performance reasons, say I need a report and I only need these three fields, I'm going to create a custom query: select ID, name, phone from customer. Very likely I'm going to store that in a DTO with only these three fields, yes? That's a different read model. So I'm writing with customer, and I'm reading with this DTO. Later, I need another report, which has ID, name, and address. I create another custom query, and it's going to return me another DTO, which is a different representation of customer. And you realize that when you have CQRS, usually you have one write model and multiple different read models. Whenever you write information in one way and retrieve it in a different way, you are using CQRS. And why did we do that? Well, the primary reason in the past was performance: 10, 15 years ago, we were creating CQRS for performance. And an improvement over CQRS was that, well, we are not constrained to read and write from the same database. The relational database and Elasticsearch example is a clear example of a CQRS pattern with separate data stores. I write to one data store in one format, in one model, and I'm reading my information from another store with another model. That's a typical CQRS pattern with separate data stores.
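The one-write-model, many-read-models idea can be sketched with plain classes. All names here are illustrative; the comments show the custom queries each read model would come from:

```java
// Write model: the full row, used for INSERT and UPDATE.
class Customer {
    String id, name, phone, address;
}

// Read model 1: e.g. SELECT id, name, phone FROM customer
record CustomerContactView(String id, String name, String phone) {}

// Read model 2: e.g. SELECT id, name, address FROM customer
record CustomerAddressView(String id, String name, String address) {}

class CustomerQueries {
    // Each custom query projects the write model into a narrow read model.
    static CustomerContactView contactOf(Customer c) {
        return new CustomerContactView(c.id, c.name, c.phone);
    }
    static CustomerAddressView addressOf(Customer c) {
        return new CustomerAddressView(c.id, c.name, c.address);
    }
}
```

One write model, several read DTOs: that is CQRS in its simplest, single-database form.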
And I want you to pay close attention to this picture, because this is the model that we're going to use to distribute our data in a microservices architecture in enterprise information systems. This is the current recommended architecture; this is the one that we're going to use. Other examples of CQRS with separate data stores: if you ever created a custom query against a view or a materialized view, you were using CQRS with separate data stores, because you were writing to one set of tables and querying a separate set of tables. OK, views are not really separate tables. But if you use a materialized view, then it really is separate tables: you are reading and writing from different parts of your storage system. That's CQRS with separate data stores. And I had to explain all of that to get to CQRS and event sourcing. When Greg Young coined the term CQRS, it was because people were discussing that, for some business use cases, event sourcing is a very nice fit. But everybody was used to creating CRUD architectures. How do we bridge the gap between event sourcing and CRUD, since they are very, very different? Well, we might need an intermediate step. And what we realized later was that CQRS and event sourcing work extremely well together. Getting back to the bank account example: if your data store is still this one where we're writing the transactions, you realize that it's very fast for writing but very slow for reading. If you want to know how much money you have in your bank account, you start from zero and apply all of the transactions over and over again. It's slow, and it gets even slower the more transactions and the more customers you have in your system. So what do we usually do in production systems? We are going to create a different table in our database. And this table is going to be our CQRS read representation. What does that mean?
We are going to write the information to the transaction table, but we're going to read the information from the balance table. And we're going to find a way to update the read information from the write information. If we're using a SQL database, it's very likely that you're going to apply both operations in the same transaction. If you don't have transactions, it's very likely that you're going to apply some sort of asynchronous notification, so that you'll be able to take these transactions and update the balance that you have in your bank account. And remember: if you're not using serializable transactions in this operation, it means that you have eventual consistency. You are already embracing eventual consistency. And when some customer or business analyst answers, yes, I need strong consistency, you can say: well, the money that you have in your bank account doesn't use serializable transactions. Do you really need them? Because the banks are already eventually consistent. So the transactions are the write model, and the balance is the read model. And the secret of using CQRS with event sourcing lies in this arrow. What we're going to discuss right now is: what are the restrictions, what are the requirements, and what are the possible technologies that I can use to implement this arrow from one point to the other? So why were we using CQRS 10, 15 years ago? The number one reason for implementing CQRS was performance. I have a query; I don't want to fetch the entire database into memory. I'm going to create a custom query, an optimized query, to fetch only the information that I need. And with that, I was creating a separate read model. So the number one reason was performance. In 2019, we are using CQRS for other purposes too: for example, distribution, availability, integration, and analytics. The first three are very useful and very common in microservices architectures. And why are we doing this?
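That arrow between the write model and the read model can be sketched in memory: each write appends to the transaction log and then updates the balance table. In a real system these would be two database tables, and the second update might be asynchronous; this sketch just shows the shape, with illustrative names:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// CQRS + event sourcing sketch: writes append to the transaction log
// (write model), then the balance table (read model) is brought up to date.
// If these two steps are not one serializable transaction, readers see
// eventual consistency.
class AccountStore {
    private final List<long[]> transactionLog = new ArrayList<>(); // {accountId, amount}
    private final Map<Long, Long> balanceTable = new HashMap<>();  // read model

    void append(long accountId, long amount) {
        transactionLog.add(new long[] {accountId, amount}); // fast append-only write
        balanceTable.merge(accountId, amount, Long::sum);   // "the arrow": sync read model
    }

    long balance(long accountId) { // reads never replay the log
        return balanceTable.getOrDefault(accountId, 0L);
    }
}
```

Fast writes on one side, fast reads on the other, and the interesting design decisions all live in how `append` propagates to the balance table.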
Well, as I mentioned before with the HTTP example, my endpoint needs the customer information, and the customer information is in the customer microservice. In the traditional way, I'm creating a REST representation and fetching it through HTTP. It doesn't scale: temporal coupling, uptime coupling, and performance problems. This is a typical pull approach: I fetch the information when I need the information. But as we learned before with reactive systems and reactive programming, our systems perform much better if, instead of using a pull approach, we change to a push approach. And we reduce the coupling in our system, because right now I need to know that the customer information comes from the customer microservice. I need to know where the customer microservice is. I only work properly when the customer microservice is alive and well. That's the type of coupling that we have. But if, instead of pulling the information, instead of fetching the information, I receive the information, then I have a push approach. Whenever customer data changes over there, I will receive a message, a notification, saying that the customer information changed. And with the data that comes with the message, I'll be able to update my local copy of the information. So what is that? Well, if the customer data is always being updated over there, at the write endpoint, that is the CQRS write representation. And if I'm consuming the customer information from my local copy, which is high performance and everything else, where it's easy to do joins, for example, then this is a local CQRS read data store. And if any other endpoint in your system needs the customer information too, it is going to do the same: it is going to store a local copy of the customer information, and it will receive notifications whenever the customer data changes. And another benefit that we have here: if we use REST plus HTTP, everybody will be fetching the same thing.
So everybody needs to know the customer microservice. But if you're using a push approach, the customer microservice doesn't even care about who is consuming the information. And you, the one consuming the information, don't even care where this information comes from. I don't care: I'm just going to listen to a customer channel, and the events are going to come to me. Maybe this information is coming from multiple sources, but I don't care. And maybe they can even change the source of the information, but again, I don't care. You have much looser coupling if you're using this kind of push architecture. And if you're reading locally, instead of fetching the information, performance is much better. You're distributing the data, because you have a local copy, and it's always available. Oh, my local database is not available? Well, then your microservice is not available. It's not unavailable because something else is unavailable: you're unavailable. Your entire service is unavailable if your local database is unavailable. And some people argue: oh, I need to provide circuit breaking in case my local database is not working. If your local database is not working, you have a much bigger issue. Can you imagine providing fallbacks for all of the operations on your database? It doesn't work that way. Usually, we fight the problem on the other side: we need to make sure that our database is always available to our microservice. And I understand that for some unicorn scenarios you can change the way you deal with the problem, but for enterprise information systems, I've never seen one that works properly when the database is down. And now that we have flows of information coming through our system, now that every time the data changes over there I generate an event and it's propagated through a channel, very likely through a message broker: why can't I have real-time analytics? And what is analytics? Whenever I wanted analytics of my data in the past, I would create a custom query.
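The push approach can be sketched as a listener that maintains the local read copy. CustomerChangedEvent and LocalCustomerReplica are made-up names for the sketch; in a real system the events would arrive from a message broker:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// A change event pushed on the customer channel by whoever owns the data.
record CustomerChangedEvent(String customerId, String name) {}

// The consuming microservice's local CQRS read store: it doesn't know or
// care where the events come from, it just applies them to its local copy.
class LocalCustomerReplica implements Consumer<CustomerChangedEvent> {
    private final Map<String, String> localCopy = new HashMap<>();

    @Override
    public void accept(CustomerChangedEvent event) { // invoked by the broker
        localCopy.put(event.customerId(), event.name());
    }

    String nameOf(String customerId) { // local read: no network round trip
        return localCopy.get(customerId);
    }
}
```

The replica only depends on the channel and the event shape, which is exactly the looser coupling described above.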
I would spend some time creating a query, I would execute the query against my database, and I would get the results. So I would have the analytics, the information that I wanted, from the snapshot of the data that I had right now. Then later we realized: well, some analytics are much more powerful if I'm able to get the historical information of my system. So we created something called data warehouses, which were basically multiple snapshots of the data. You would take the information, not everything, of course, but the strategic pieces of information, and copy them from your database to your data warehouse. And you would do that, for example, every night or every week. So later, when you would issue queries against your data warehouse, you would have the historical information: you would have how much per day, per month, with that data already processed. You wouldn't have a single snapshot of information, but the historical information. But again, in 2019, if we're using a push approach, if we're broadcasting the information changes through a channel, why can't I listen in real time to the data changes and use something like Apache Spark to get real-time information? Oh, I want to know in real time how many orders are being processed, or how much money I'm making right now, in the past five minutes, without issuing multiple queries like before. Or, if I'm a ride-sharing company, I want to know how many ride shares are happening right now, and I want to compare this information to the baseline that I had one week ago at the same time. You can do that, and it's much easier: you don't need to be creating these very complicated queries. You can use the real-time tools that we have these days to gather that information. So you start to have other possibilities for using the data once you change your approach from pull to push. This scenario is the one that I described before.
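The "how many orders in the past five minutes" idea can be sketched with a sliding-window counter in plain Java. A real deployment would use a stream processor such as Apache Spark or Kafka Streams; this sketch only shows the concept, with illustrative names:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Plain-Java sketch of windowed real-time analytics: count the events that
// arrived within the last windowMillis milliseconds.
class SlidingWindowCounter {
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    void record(long eventTimeMillis) { // called for every event on the channel
        timestamps.addLast(eventTimeMillis);
    }

    long countAt(long nowMillis) { // evict events that fell out of the window
        while (!timestamps.isEmpty()
                && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }
}
```

No query against the database at all: the answer is maintained incrementally from the event stream.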
You have one endpoint which is the owner of the information, and you have multiple endpoints in your system which will be consuming that information, which are read data stores. And you will find a way to update that information remotely, and you will be creating events for that. What type of events? We are talking about event-driven architecture, and there are multiple different types of events. There are notifications, for example. We have all of this context in our minds because I was just explaining what was created 10, 15 years ago, but these concepts are very, very useful today. One of the funny things is that when I took my master's degree 20 years ago, I was studying distributed systems. And one day I was talking to a colleague and we said, you know all of this stuff that we're studying here? We're never going to use this in the real world. And he said, yeah, this stuff is too complicated. Who, 20 years ago, would want this complication? So when people started discussing microservices, well, it's just distributed systems. Do people really want to go into this kind of complexity? Apparently, yes. So back to the present, 2019: what would be the best approach for us to solve the data distribution problems of our architecture? Again, event-driven architecture is much larger than this, but I'll focus just on this piece of the architecture, this piece of the solution. What is an event-driven architecture? Well, an architecture that is based on events. And what are events? Events are things that happen in your system. And when I say things that happened, it also implies that events are things in the past. And we also learned that you can't change the past, so events must be immutable. It happened. You're not going to change it. That changes the way we model our systems, because in traditional systems we try to implement operations. Basically, you have a class.
And when you want to implement behavior, you add methods to that class, and the method names are verbs, and those verbs are always in the present tense. When you have events, no. Very likely the most common method that you'll have in all of your entities will be a method called accept. And you don't have a choice: you have to accept an event. Why accept? Because the event already happened. Saying "I don't want to accept the past" doesn't change the fact that the event already happened. And here's an example of this kind of behavior. Say you implemented a cashier system, and somebody goes to the cashier with a bag of potato chips. I want to buy this bag of potato chips. You scan the barcode, the customer pays. That's an event: somebody bought a bag of potato chips. You generate the event, and it's propagated to the message bus. And when it reaches the inventory system, the inventory system says, I can't accept this event, because the inventory for potato chip bags is zero. Well, somebody had a bag of potato chips, because somebody bought that thing. So who is wrong: the real world or your database? It's not up to the inventory system to decide whether the event is right or wrong. The event already happened and it is a truth of the system. So if you think that inventory is zero, well, maybe you need other procedures to compensate or to find out what happened. And what very commonly happens, because I've worked in the industry too, in this case where somebody bought a bag of potato chips and the inventory was zero, is that somebody else had picked up the bag earlier, gave up on it, and put it back on the counter, where it was available to purchase again. It's a very common situation.
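This "accept the past" style can be sketched in a few lines of Java. The ItemSold and Inventory names are illustrative, not from any library; the point is that the entity cannot reject the event, and an impossible-looking state triggers compensation rather than an error:

```java
import java.time.Instant;

// Events are immutable facts about the past, so a Java record fits well:
// all fields are final, and there are no setters to rewrite history with.
record ItemSold(String sku, int quantity, Instant occurredAt) {}

class Inventory {
    private int onHand;

    Inventory(int onHand) { this.onHand = onHand; }

    // No validation, no rejection: the sale already happened. If stock goes
    // negative, that is a real-world discrepancy to compensate for later
    // (miscount, theft, a returned item), not a reason to throw an exception.
    void accept(ItemSold event) {
        onHand -= event.quantity();
    }

    int onHand() { return onHand; }
}
```

Contrast this with a present-tense `sell(int quantity)` method that throws when stock is zero: that models the database's opinion, not the fact that the purchase took place.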
And it also happens that somebody steals the potato chip bags, so the inventory is positive but you don't have anything on the counter. That's how systems are supposed to be modeled in the real world, but we as developers think about the perfect world. We don't think these situations are going to happen. But if you're using an event-driven architecture, you don't have a choice: your system needs to accept whatever event has happened, okay? So you need to be prepared to deal with that. Once you accept that events are going to happen and your objects are going to accept whatever event happened in the past, you have a discussion: should I model my events as low-level events, or should I use domain-level events? If you ask some fancy architects or you read a domain-driven design book, you'll think, oh, domain-level events are the best. Well, it depends. And why is that? I'll tell you that you have different requirements and consequences depending on the type of events that you choose. Let's start with low-level events. The benefit of low-level events is that they're super simple. You only have three different types of semantics. With low-level events you have, say, order created, order updated, and order deleted. The semantics never change, because these are the basic operations that any entity in your system can have. But what can change? Do you still have coupling? Yes, you do. What can change in this type of event is the schema of the information. The semantics will never change, but the schema will. So you need to prepare for changes in the schema, because you're going to add information to the event, you're going to remove information from the event, you're going to change the structure, you're going to change the types of the data in the event. You only need to be prepared to handle the changes in your schema.
But the complexity of your architecture when you're using low-level events is on the consuming side. Keep that in mind, because now I'll switch to domain-level events and explain the dichotomy here. Domain-level events are very high-level events; they are rich in semantics. You're not creating a low-level event like a customer created event or a customer updated event. You will receive something like a customer address changed event. That is a domain-level event, and it carries a lot of semantics. A customer phone number changed event: lots of semantics. And when you're processing this, suppose that I'm the shipping service and I'm interested in customer address changed events, because if I have a pending shipment, maybe I want to be notified about these events, because maybe I can change the address to which I'm shipping my package. So I am interested in customer address changed events, and only those. I don't care about the telephone; I only care about the address. And now you realize that when you have domain-level events, the complexity of the architecture is in the endpoint generating the events, because now it has to know what changed and what type of event it needs to create, and it's probably going to broadcast these events on different channels. But on the other hand, if you have this complexity on the sending side, receiving the events is much easier, because I can ignore all other events and only listen to the ones I'm interested in. For example, the shipping service only listens to customer address changed events. And what's the problem with this approach? It sounds wonderful because you have a lot of semantics? Exactly. The problem with domain-level events is that they have a lot of semantics, and the semantics of systems change over time. The knowledge of the system changes, the requirements of the system change, the semantics change over time.
And when you're changing semantics in your code base and you have a monolith, you just apply a refactoring and it changes everywhere that thing is used. When you have a distributed system, well, now you have maybe 50 different code bases, and you need to know who is consuming these events. If you have a very small company with very few teams, you can go around asking, are you using this information? Now imagine a very big organization with 50 distributed teams. Are you really going to be able to manage this kind of change? Say I change the semantics, or I introduce a new type. I have the customer address changed event, but now I have more complex requirements, so I need to change the semantics and introduce new event types. Now, if they changed the address within the same state, that's one type of event. If they changed the address to another state, but the same country, that's another event. And if they moved to another country, that's yet another type of event. You see, I changed the semantics. And I still have the possibility of changing just the schema of the event, the structure of the information. But since now both can change, you can see that it becomes much more complicated to maintain the remote endpoints, because you still have a very tight coupling. The coupling in a distributed system that uses messages or events is in the types and semantics of the events. So how do you deal with that? As I said, domain-level events sometimes are great; the problem is when you change the semantics. And remember I said that with low-level events, the complexity is on the consuming side, and with domain-level events, the complexity is on the sending side. And why is that? Well, I'm the shipping service. I need to be notified when the customer address changes. But these are low-level events.
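The semantic split just described (same state, new state, new country) can be sketched as a small classifier on the producing side. The Address record and the event names are hypothetical, just to make the point concrete:

```java
// As the domain grows richer, one "address changed" event splits into several
// finer-grained domain event types, and the producer carries this logic.
// Every consumer that pattern-matched on the old event name is now coupled
// to this choice.
record Address(String street, String state, String country) {}

class AddressEventClassifier {
    static String classify(Address oldAddr, Address newAddr) {
        if (!oldAddr.country().equals(newAddr.country())) {
            return "CustomerMovedToAnotherCountry";
        }
        if (!oldAddr.state().equals(newAddr.state())) {
            return "CustomerMovedToAnotherState";
        }
        return "CustomerAddressChanged";
    }
}
```

Introducing the two new event types is exactly the semantic change that cascades across every code base listening to the old one.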
So I only know about customer created, customer updated, and customer deleted. Okay, I want to know about address changes, so I'm going to listen to all customer events, or at least to the customer updated events. A customer updated event arrives. Which piece of information changed? I have no idea. I have to parse all of the information to check which part changed. But if I want to know what changed, I necessarily need to know the old version of the information. So I need to store the old version. And when I retrieve the old version and compare it to the event that I received, I see that the address has changed. Now I can process the event. So when you have low-level events, you're not able to filter for the events that you want. You have to listen to everything, because you don't know what changed. So with low-level events, it's very easy to generate the events, but the complexity lies on the side that is receiving them. Also, with low-level events, the semantics never change; you only have changes in the schema. When you have domain-level events, it's hard to generate the events but very easy to consume them. But if you have a change in the semantics of your system or your events, then you have a huge cascading change across your system. And depending on the size and the structure of your organization, it might be super hard for you to contain this coupling between your events. When would I recommend one versus the other? I would only recommend domain-level events if your business domain is super stable and you're absolutely an expert in the field. I've been doing this for the past three years, I know everything that can happen in the system, this thing didn't change in the past three years, so I'm going to use domain-level events, because that covers everything I've seen in the past three years. And I know a customer, for example, that did exactly that. They said: we want an event-driven architecture.
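This consuming-side burden — store the old version, diff it against every update — can be sketched like this. The names are illustrative; in a real system the local copy would live in the service's own database:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// With low-level events the shipping service cannot subscribe to "address
// changed": it must keep its own copy of every customer and compare fields
// on each CustomerUpdated event to discover what actually changed.
record CustomerUpdated(String customerId, String name, String address) {}

class ShippingService {
    private final Map<String, CustomerUpdated> localCopy = new HashMap<>();

    // Stores the new version and reports whether this update changed the
    // address (the only field the shipping service cares about).
    boolean addressChanged(CustomerUpdated event) {
        CustomerUpdated previous = localCopy.put(event.customerId(), event);
        return previous != null
                && !Objects.equals(previous.address(), event.address());
    }
}
```

With a domain-level CustomerAddressChanged event, all of this bookkeeping disappears from the consumer and moves to the producer.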
We are going to aim for domain-level events, because there are only 59 types of events that can happen in our industry. And they did it right the first time, it's working properly, and they implemented this architecture in six months. On the other hand, say you don't know the domain that well and you want to produce an MVP. You're new to the business domain, you don't know exactly how to model it, and the requirements are going to change over time. Maybe it's safer for you to stick to low-level events. You won't have the power of domain-level events, and consuming events will be much harder, but with low-level events you are on the safer side: the semantics will never change. And maybe later you can evolve, because I would say it's much easier for you to evolve low-level events into domain-level events than the opposite. Once you've gotten your domain-level events wrong, it will be very hard for you to roll back to the previous architecture. And why do I say that? Remember the created, updated, and deleted low-level events, and that here I want customer address changed events? Well, what prevents you from creating another microservice whose only responsibility is to watch that information? It has a local copy of the data, and its only job is to filter the information and create specialized event types. So I'm checking: oops, the address changed. I consume the low-level message and I broadcast a customer address changed event. And you have an added benefit: if you think that this type of event is not going to change, you consume the domain-level events from this microservice. If you think that, well, maybe it's risky, just keep consuming the low-level events, so you are on the safer side. So domain-level events are absolutely great, maybe the perfect representation of the events in your system, but they're risky. Have that in mind when choosing which types of events to use.
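The translator microservice suggested here — consume low-level events, broadcast specialized domain-level ones — can be sketched with a callback standing in for the outgoing channel. The names and the Consumer-based "channel" are my assumptions, not a specific broker API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Consumer;

// Sits between the low-level customer events and downstream consumers:
// it keeps a local copy of the data, detects address changes, and
// re-publishes them as "CustomerAddressChanged" domain events.
class AddressChangeTranslator {
    private final Map<String, String> lastAddress = new HashMap<>();
    private final Consumer<String> domainChannel; // stand-in for a broker topic

    AddressChangeTranslator(Consumer<String> domainChannel) {
        this.domainChannel = domainChannel;
    }

    // Called for every low-level customer update; emits a domain event
    // only when the address actually changed.
    void onCustomerUpdated(String customerId, String address) {
        String previous = lastAddress.put(customerId, address);
        if (previous != null && !Objects.equals(previous, address)) {
            domainChannel.accept("CustomerAddressChanged:" + customerId);
        }
    }
}
```

Downstream services that trust the new event type subscribe to the domain channel; cautious ones keep consuming the low-level stream directly.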
And now that I've gone through the discussion of how you model your events, let's discuss the possible technologies that you can use to implement the solution. Let's start with the brokers. For example, you might use a traditional message broker like ActiveMQ, or you can use the new kid on the block, Kafka. What are the requirements and implications of each of these solutions? For example, ordering of data. Do your messages need to be consumed in the same order that they were created? That's one implication. Well, both ActiveMQ and Kafka support that. But how many endpoints are going to consume this ordered data? If you know beforehand, in your architecture, which endpoints need to consume this information in an ordered way, then you can use ActiveMQ. If you have no idea who is going to consume your information, maybe it's better to use Kafka. And how do I know if I need ordered information? Well, remember event sourcing? There is a super specific case of event sourcing when we model events, and this is a distributed systems thing. Some types are what we call CRDTs, Conflict-free Replicated Data Types. Basically, it means that the order of the events doesn't change the end result. In a banking system, transactions, the add and subtract operations, behave like CRDTs: it doesn't matter in which order they're applied, the result will always be the same. But created, updated, and deleted are not like that, because their order does matter. So it really depends on how you're modeling your system. If you need ordering, you can choose either, but then you have other implications. For example, many people don't know that if you're using a traditional broker like ActiveMQ, you can have an ordering guarantee, with the added benefit that ActiveMQ, like most message brokers, has a once-and-only-once contract: they guarantee to you, we are going to deliver your message, and it's going to be delivered only once.
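The ordering point is easy to demonstrate: additions and subtractions commute, overwrite-style updates don't. A pure illustration, no broker involved:

```java
import java.util.List;

class OrderingDemo {
    // Bank-style deltas: applying them in any order gives the same balance,
    // so the consumer does not need an ordering guarantee from the broker.
    static int applyDeltas(int balance, List<Integer> deltas) {
        for (int d : deltas) balance += d;
        return balance;
    }

    // Update-style events: each one overwrites the value, so the last event
    // wins and delivery order changes the final state.
    static String applyUpdates(String value, List<String> updates) {
        for (String u : updates) value = u;
        return value;
    }
}
```

If your events look like the first method, out-of-order delivery is harmless; if they look like the second, you need the ordering guarantees discussed here.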
If you're using Kafka, you don't have this guarantee. So the burden when you use a message broker is in the broker; the burden when you're using Kafka is in the consumer application. And I think it's much easier for you to decide which technology is better if you think about Kafka not as a message broker, but as a persistent, distributed log. What are you doing? Well, you're writing to a log, and this log is automatically distributed. And when I'm writing messages to a log, I can consume those messages whatever way I want. For example, with Kafka you can replay the messages. With a traditional broker, once the message was delivered, it's gone, because that was the contract. With Kafka, maybe I've already processed all the messages, but I want to reprocess them. Why? For example, you have a big data application and you're doing customer profiling. You process all the messages, you create a set of profiles, and you separate the customers by profile. But then you tweak the implementation. Maybe my implementation is better now, and I want to check which one is better. Well, Kafka, send me all the messages again. I process all of them and create a separate set of profiles. Then I can maintain both and compare which one is better. So this is a typical use case in which Kafka is clearly the winner, a much better solution than a message broker. Because if you don't use Kafka, can you replay all the messages? Yes, but then you have to store all the messages yourself; you have some implications. With Kafka, the messages are already stored. Other implications? Yes. As I said, with ActiveMQ you have an ordering guarantee if you know the endpoints, if the connection is persistent, if the queue is durable, and if you don't have a dead letter queue. Whenever you have a dead letter queue, you don't have an ordering guarantee, because a message that fails goes to the dead letter queue, and when it's reprocessed, it's out of order.
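The "persistent distributed log" framing can be modeled in a few lines. This toy is not the Kafka API; it just shows why replay is free when reading does not consume, and each consumer merely tracks an offset into data that is never deleted on delivery:

```java
import java.util.ArrayList;
import java.util.List;

// A toy persistent log. A traditional queue deletes a message once it is
// delivered; a log keeps everything, so any consumer can re-read from
// offset 0 ("replay") or resume from wherever it left off.
class ToyLog {
    private final List<String> records = new ArrayList<>();

    void append(String record) { records.add(record); }

    // Reading never removes anything: two consumers (or the same consumer
    // twice) can read the same records independently.
    List<String> readFrom(int offset) {
        return List.copyOf(records.subList(offset, records.size()));
    }
}
```

In real Kafka the log is partitioned and replicated, and consumers commit their offsets back to the broker, but the reprocess-for-a-new-profiling-algorithm use case works because of exactly this property.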
So if you meet all of these requirements, you have an ordering guarantee in a message broker. And if that's enough for you, maybe ActiveMQ is the better solution. With Kafka, for example, if I may only process each message once, you have to implement the enterprise integration patterns that we've known for years, like the idempotent consumer, to guarantee that each message is processed only once. You have to implement that in your code; with message brokers, it's in the broker, okay? And then comes Camel. Camel is the most popular library that implements the enterprise integration patterns. So if you want to filter, you want to distribute, you want to aggregate, if you need this complexity, Camel is very likely the best solution. For example, the example I gave before, where you want to consume low-level events and generate domain-level events: with Camel that's super easy. And if you're interested in microservices and data distribution, and you can't or won't use distributed transactions, maybe you want to implement the Saga pattern. I don't know how many of you know Sagas, but a Saga is a compensation strategy for distributed systems where you don't use distributed transactions. And Camel has Sagas out of the box. Oh, this step failed, so I need to roll back all of the other operations. So I need to keep track: which operations are running, which ones were successful, what do I need to roll back? It can become very complicated. Camel already handles all of this, okay? Another technology that is very interesting is Debezium. Sometimes you have a legacy system and you want to use EDA, event-driven architecture, but you don't want to touch the legacy code base. Or you have a system where multiple applications are writing to the same database, so there is no single point in your system where you can intercept the events and broadcast them to the bus. Debezium, which is a CDC (change data capture) tool, connects directly to the database transaction log.
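The idempotent consumer pattern mentioned here (which Camel provides as a ready-made EIP) boils down to deduplicating by message id. A bare-bones sketch, not Camel's actual API:

```java
import java.util.HashSet;
import java.util.Set;

// With at-least-once delivery the same message may arrive twice. The
// consumer remembers the ids it has already processed and skips repeats,
// so the *effect* is as if each message were processed exactly once.
class IdempotentConsumer {
    private final Set<String> seen = new HashSet<>();
    private int processed = 0;

    void onMessage(String messageId) {
        if (!seen.add(messageId)) {
            return; // duplicate delivery: ignore
        }
        processed++; // the real business logic would run here
    }

    int processedCount() { return processed; }
}
```

In production the seen-id store must survive restarts (a database table or a cache), which is exactly the kind of plumbing Camel's idempotent repository handles for you.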
So whenever a table changes, with an insert, update, or delete, Debezium reads that from the transaction log, generates a message, and broadcasts it through Kafka, okay? So Debezium is a very nice solution. And of course, I have to say that Quarkus and MicroProfile with Reactive Streams are a very good fit if you want to implement a reactive system using an event-driven architecture. All of this information is part of the material that I'm collecting, because I'm writing an event-driven architecture tutorial. I've been studying this type of architecture, and all of this knowledge about event-driven architecture is usually scattered across a lot of different books. And I think it should be easier for us to learn and understand it. So I'm writing a book, and the initial estimate was December. But I'm a developer, and I'm very good at estimating, so maybe you shouldn't trust that. The initial code that I'm using in my book is in this GitHub repo, and I intend it to be the reference: once my material, my research, is finished, I'm going to publish all the code examples there. So if you want to check, it's a work in progress, but the initial result is already there. I intend to evolve it. I want to have a demo application with an example of an event-driven architecture, and I want to add all of the technologies and all of the types of events there. But as I said, it's still a work in progress, and I've been traveling so much lately that I haven't touched it in the past four months. But, well, December is still the deadline, because once we reach the deadline, we move it forward. That's how I work. I'm not late yet, okay? But eventually the work will be finished. And that's what I had to share with you today about event-driven architecture. I hope the information was useful.
And if you have any questions for me, I'm available. Thank you very much. [Audience question] You said analytics comes into the picture. Do you use any predictive analysis or business intelligence tools, like Tableau or Power BI, to analyze events? Okay. Yes, you can use these BI tools for analytics, but usually they are not real-time, okay? They use an approach where they need to fetch from or query the online database to process the data. And what I mean by real-time is, let's say, a completely different approach, without overloading the read data store. Okay, thank you. [Audience question] The Debezium you talked about for CDC, is that a paid offering? Is it a beta? It's open source, 100% open source. But Red Hat recently added support for it. So if you're using it in production and you want Red Hat to support it, there's an option available. [Audience question] You bring the events into Kafka. Can we use event streams to have the messages in memory, like Kafka Streams or KTables? If you want to use Kafka Streams for that, yeah, no problem. And the second question I have is: suppose I want to write to both Kafka and a database. Is it a good pattern to write to both? Because in my scenario, Kafka can be down or the database can be down. Is it a good practice, or is it a common scenario? So you have the information, you need to write the snapshot of the information to your database, but at the same time you need to broadcast the message, because you're using an event-driven architecture. The problem is that Kafka doesn't participate in your database transaction. So you might have scenarios where I've written to my database, but the message to Kafka failed, or, even more common when sending in the same unit of work, the message was written to Kafka, but I didn't write it to my database, okay? There's no easy solution if you try to handle this on the Kafka side.
And one very nice solution to this problem: if you go to the Debezium blog, Gunnar, the lead of the project, wrote a post about using the Outbox pattern with Debezium, which I consider to be very likely the best approach for solving this problem of writing to Kafka and writing to the database at the same time. With it, you have the guarantee that you won't have these failure scenarios. Another possible solution is to use a message broker with transactions, but that implies you're not using Kafka, okay? But take a look at the Outbox pattern. It's a very nice solution to this problem. Okay, great, thank you. Thank you.
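The Outbox pattern recommended in this answer can be sketched as follows: the business row and the event row go into the same local database transaction, and a separate relay (Debezium tailing the outbox table, in the blog post mentioned) later publishes the outbox rows to Kafka. This simulation uses in-memory lists for the two tables; it illustrates the atomicity, it is not a real JDBC transaction:

```java
import java.util.ArrayList;
import java.util.List;

// Both inserts commit together or not at all, so the database state and the
// pending outgoing event can never disagree -- the dual-write problem with
// Kafka disappears, because Kafka is fed from the outbox table afterwards.
class OutboxDemo {
    final List<String> ordersTable = new ArrayList<>();
    final List<String> outboxTable = new ArrayList<>();

    void placeOrder(String orderId, boolean failBeforeCommit) {
        // Stage both writes as one unit of work.
        List<String> stagedOrders = new ArrayList<>(ordersTable);
        List<String> stagedOutbox = new ArrayList<>(outboxTable);
        stagedOrders.add(orderId);
        stagedOutbox.add("OrderPlaced:" + orderId);
        if (failBeforeCommit) {
            return; // "rollback": neither write becomes visible
        }
        // "commit": both writes become visible together.
        ordersTable.clear();
        ordersTable.addAll(stagedOrders);
        outboxTable.clear();
        outboxTable.addAll(stagedOutbox);
    }
}
```

The relay then delivers outbox rows to Kafka at least once, which is why the idempotent consumer pattern discussed earlier still matters downstream.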