My name is Eberhard Wolff. I'm with INNOQ in Germany. We are a consultancy, so we mostly do software projects. I do consulting, a lot of architecture reviews and architecture work, and some trainings. I also wrote a few books: a book about microservices as an architecture approach, with a free booklet that sums that book up, and another book about technologies for microservices, again with a free booklet that gives a short overview of the subjects that book covers. And last but not least, my colleague Hanna and I wrote a booklet about service meshes, a nice new technology for microservices.

What I'm going to talk about now are things that I saw out in the wild that made projects fail, or at least run into problems. I will start off by giving a name to each of these issues or challenges, together with a quote from a project that might run into that problem.

The first one is the common data model. The quote is: "The services need some common model to communicate." Let's start off with some microservices: one for order processing, another that takes care of delivery, and, last but not least, one that writes invoices. It seems to make sense to come up with a common data model for the common business objects that you have, such as a customer or an item. You would use that model for communication only, because in the database there will of course be different databases for each of these microservices. So we use it for communication only, and each microservice can still have a different internal model. That's the first decision, and it seems sound. The next decision is: there is this common data model, so why would we implement it separately in each of these microservices?
So we build a common library that implements that data model, which also seems like a wise decision. But now all of the services need the latest version of that library. If we have a new version of the common data model, we have a new version of the library, and we need to update each of the microservices to that new version. So congrats: whenever there is such a change, we need to redeploy all of the services. We don't have decoupled deployment anymore. I would argue that we have created a deployment monolith, which means that all of this stuff needs to be deployed together whenever something in the common data model changes. But we also still have the microservices challenges: we have separate Docker containers, or whatever it is our microservices are built of. So we have the complexity of microservices plus the monolithic deployment that comes with a monolith. That's how we end up with such a bad decision. And as I said, this is something that one team actually ran into. I would argue that the individual decisions are not that stupid; you can sort of understand how the team came up with each of them, and yet they led up to this problem.

There is a different, closely related challenge. Let's imagine we build one of these new fancy architectures where all of the microservices communicate with events, and we store those events in Kafka, for example. Kafka is a nice message-oriented middleware, and its huge advantage is that it can store large amounts of data, so you can do event sourcing. The idea is that you can rebuild the local state of each microservice from the events in the centralized data storage. So the events are in Kafka, and a new order comes in.
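That rebuild idea can be sketched in a few lines. This is a minimal, in-memory stand-in, not real Kafka: a plain list plays the role of the event log, and all names here are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class OrderPlaced:
    """An event as it would sit in the centralized event log."""
    order_id: int
    total: float


# The "centralized storage": an append-only event log, here just a list.
event_log = [OrderPlaced(1, 19.99), OrderPlaced(2, 5.00)]


def rebuild_invoicing_state(events):
    """Recreate the local invoicing database from scratch by replaying events."""
    state = {}
    for event in events:
        state[event.order_id] = event.total  # apply each event to local state
    return state


# Drop the local database entirely; it can be recreated from the log.
local_db = rebuild_invoicing_state(event_log)
```

The local database is fully derivable from the log, which is exactly why it behaves like a cache.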
And if you delete the state, the local database for delivery or invoicing, it can be recreated by processing those orders again. So essentially you have created a system where the local database of the delivery or invoicing service is just a cache for the events stored in Kafka: if you remove those local databases, you can recreate them from the events. What you have come up with, essentially, is a shared database. It's just that this database is in Kafka. And it really is the one database, the one source of data that matters, because your local databases are just caches. I thought we wanted to stay away from those centralized data models; we want separate data models in each of these microservices. So I would argue that there is something wrong here.

There is also a general problem: those data models for communication have many dependencies. In my example, there is one microservice that issues events, the order process, and there are two microservices, delivery and invoicing, that consume those events. In a real-world application there might be a lot more, and maybe you don't even know who is processing the events, because consumers are anonymous. In that situation, the event data models are hard to change: if you want to remove some data from the common data model, some of your services might fail, and you don't know which ones. And if you can't remove any attributes from your common data model, the model is bound to grow, because you can only add data, never remove it. So what's the cure here?
One idea is to have separate local data models and no global data model for communication. This is basically taking the idea that each microservice has its own database model and applying it to communication too. So instead of one common data model for communication or events, you have a separate data model for each pair of microservices that communicate with one another. In my example, you would have one data model with the data of an order that is interesting for delivery, and a different data model with the data that is important for invoicing. I would argue this is probably what you want to do, because those are quite separate concerns: for a delivery, you want to know how many items were ordered, their weight and size, and where to ship them, while for the invoice you need the billing address and the prices, information that you don't need for delivery. So splitting it up into separate data models makes a lot of sense, because, thinking about it, these really are two different views of the order.

Now you could argue that you will end up with a lot of data models, some kind of data-model inflation. And I would argue there is a trade-off here: separate data models give you independence, while one data model might be simpler because there is just one of them. It's a trade-off between independence and simplicity, and there is no one single best solution.
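Those separate per-consumer models could look like this. The class and field names are hypothetical; the point is only that each consumer gets its own event model with exactly the order data it cares about, instead of one shared model for everybody.

```python
from dataclasses import dataclass


@dataclass
class DeliveryOrder:
    """What the delivery service needs to know about an order."""
    items: list           # what to ship, including weight and size per item
    shipping_address: str  # where to ship it


@dataclass
class InvoicingOrder:
    """What invoicing needs to know about the same order."""
    prices: list          # per-item prices
    billing_address: str  # where to send the invoice


# The order process publishes two different events for the same order:
delivery_event = DeliveryOrder(items=["book"], shipping_address="Berlin")
invoicing_event = InvoicingOrder(prices=[19.99], billing_address="Hamburg")
```

Removing a field from `InvoicingOrder` now only affects invoicing; delivery consumers never see it.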
So I'm not advising you to always use many small data models for communication. I'm just saying there is a trade-off, and you should think about whether you want one data model for the events or separate data models, depending on your circumstances.

That's the first thing you might run into. The next one is a flaky system, and the quote here is: "What is resilience?" It's like dominoes: if the first microservice fails, the next fails, and the next, and the next, and then you have a problem. In a microservices system there are just so many more chances for something to go wrong, because you have a lot more servers, you have network communication between the microservices, and you have many services running on all of those servers. If you have a deployment monolith running on one machine, there is just one machine that can fail, and just one process that can fail, while in a microservices system there is the network, a lot more servers, a lot more services.

Now, microservices will depend on one another, and I would argue there is no way around that: at the end of the day there should be one system made out of microservices, and you can only create that system if you have those dependencies. For that reason, one failed service might make another service fail, and that one might make yet another one fail, and so on, just like dominoes. And this is a real challenge. Here is something that happened in Germany: a fly sat down on one domino piece, that piece fell over, and it ruined a domino record attempt.
That is what can happen in your microservices system, too: one of them fails, then the next one depending on it might fail, and the next one, and so on.

So what's the cure? If you want to avoid such situations, you can build your system to be resilient, which means your microservices will still somehow continue to operate even if another one fails. It is very unlikely that a microservice will keep working as if nothing had happened, because then the question is why it talks to that other microservice in the first place. But maybe there is something like: up to some limit, we can still process orders even if we can't check whether the customer will actually be able to pay. Whether such a limit exists, and what it should be, is really a question of domain logic; those are things you have to think about when you think about the domain. At the very least, you can provide a sensible error message: this microservice currently doesn't work at all, because some other service doesn't work. What you should avoid is making callers wait forever, because then the callers might lock up waiting. It's fine to return an error message, it's fine to provide a limited service, but you should not just lock up, do nothing, and block all the requests.

So what can you do? I'm actually a huge fan of asynchronous communication. I would argue that asynchronous communication has a sensible default for failure.
The sensible default is: you process that message later. Asynchronous communication means you send some message and it will be processed eventually, at some point in the future. So if the receiver is currently unavailable, the message is just processed later; that is built into the system. There are limits to this. If the security service that checks the authorization of users fails, it's probably not the best idea to build a system that still works, because then there would be unauthenticated or unauthorized access to some functionality. So there will be limits to resilience, but you still have to think about the trade-off, and as I said, it's also a decision about business logic.

The question around resilience and flaky systems might seem obvious, but I included it in the presentation in particular to prepare the next thing, and that's synchronous calls. Something that a team doing this might say is: "We do microservices the Netflix way." What Netflix actually does is synchronous calls between microservices, and quite a lot of them. The problem comes when you have cascading synchronous calls. This is something that is easy to understand for developers, and I believe that's the main reason why we see it out in the wild: it is very natural to have a microservice calling another microservice. That's what we're used to. We have classes, classes have methods, and classes call methods on other classes. So why wouldn't microservices just call other microservices? It's something people are used to, in particular from non-distributed systems. What you end up with is a call graph that looks like this: there is a call to some microservice, this microservice in turn calls another microservice, which again calls another one, and this one eventually returns.
Then another one is called and returns, and then there is another one that talks to three more microservices, and then this returns, and so on. The problem is that there are one, two, three, four, five, six, seven calls across the network. Those are seven chances for the network to fail. It also means that seven times you pay the overhead of the network: serializing and deserializing data, the latency of the network, and so on. So you end up with performance issues, because your calls go through the network, and latency issues, because the latencies of all those network calls add up.

You could try to work around that by doing calls in parallel. The first microservice called two microservices, the second one called three; obviously you can try to parallelize that. If you do, the total time it takes to process those calls is just the time of the slowest call. Latencies don't add up anymore; the total is bound by the call that takes the longest. However, I would like to add that if you do so, one of the benefits of the simple model is gone: you now have to think about parallel processing, and you can't have the very simple model where everything is processed one step after the other.

You might also end up with flaky services, because it's probably hard to compensate for those failures. It depends on what you're actually doing, but if one of those services never returns, you have to think about what to do, and in some cases that is hard. That is why I'm advocating asynchronous communication: then you get resilience pretty much for free.

So what's the cure here? I would argue that you could try asynchronous communication, and that might even be quite natural if you do business events and your system is built around them.
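That store-and-process-later style can be sketched in a few lines. This is a deliberately tiny, in-memory stand-in for a real message broker, with a boolean standing in for "is the receiver currently up"; all names are invented.

```python
import queue

# The "broker": senders enqueue messages and never wait for the receiver.
messages = queue.Queue()


def send(event):
    """The order process emits an event regardless of the receiver's state."""
    messages.put(event)


def drain(processed, receiver_up):
    """Process whatever is queued, but only while the receiver is available."""
    while receiver_up and not messages.empty():
        processed.append(messages.get())


processed = []
send({"event": "order-placed", "order_id": 1})

# Receiver is down: the message simply stays in the queue, nothing fails.
drain(processed, receiver_up=False)
assert processed == []

# Later, the receiver comes back and the message is processed after all.
drain(processed, receiver_up=True)
```

The sender never blocked and never saw an error; "try again later" is the default behavior rather than extra failure-handling code.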
So that's one way of doing it. However, to me it is sort of the wrong level of cure. What I would try to understand first, if I had a system with that kind of problem, is whether the microservices are too dependent on one another. Is there a problem because the domain logic is so intertwined that all those microservices are really one thing? Maybe we should split the system into microservices differently; maybe that's the real problem. And if we do that split, the problem just vanishes: we don't have so many calls, and we can even keep synchronous communication, because there is not so much dependency anymore. That's what I would try to come up with.

There is another related thing, which I would probably call an anti-pattern, and that's entity services. The idea is to start off with some kind of object-oriented modeling, look at the domain, and make each domain object its own microservice. If you do that, you would have entity services in my example as follows: there is the order process, delivery, and invoicing, and then there is a service that takes care of the domain object customer and another one that takes care of the product, or the item. That probably means we end up with a centralized data model, the first anti-pattern I was talking about: one data model that tries to represent an item or a customer for everyone. We just introduced that here, because those two services obviously are centralized data models. We will also have synchronous calls, because as soon as I want to process an order, I probably need information from the customer service and the product service. So there will be a lot of communication, and it will be synchronous, because there is no way I can deal with the order without that information about the customer or the product.
Which also means that probably every call goes through three services, order, customer, and maybe item, which again gives me performance problems, latency issues, and resilience issues. Talking about resilience: if the service that takes care of customers fails, what's the solution? What's the fallback? Is there a default customer? If some order comes in, it's probably an order by Eberhard, because he does a lot of orders anyway? I don't know. It's hard for me to come up with any sensible resilience approach at all. So probably, as soon as the service that takes care of customers fails, all of the other services will fail too, and we end up with the flaky system anti-pattern.

There are closely related variations. Maybe you have a common database instead. In that case we are not implementing the entity services as microservices; instead there is one database, and in that database there is all the information about a product and all the information about a customer, and that information is retrieved, used, and changed by all the different microservices. That is again a centralized data model, the very first anti-pattern. In that case performance and latency are not an issue, because it's all in one database, and it shouldn't be flaky, because the database is probably always there. But it's still a problem, because there is no way to evolve the data models of the microservices independently of one another. They are bound together; they meet in the database, and that's not a good idea.

So what's the solution? I would argue that you should have as much of the information about the customer as possible inside the order process itself, and the same for delivery and invoicing, and the same for the product. Most likely that will be different information in each service.
For delivery, what I'm interested in concerning the customer is: how am I going to get the stuff to the customer? What's the shipping address, and those kinds of things. For invoicing, I'm probably interested in how the customer is going to pay, and maybe the country he or she lives in, because of VAT or whatever. It's similar for the product: for delivery I'm interested in how large it is and how to get it to the customer, maybe it needs to be cooled or something like that, and for invoicing I'm interested in the price and how much VAT there is on it. So it's separate data for each of these different processes, and it makes sense to store that data in each of the separate microservices, in their local databases. I would argue that microservices should have their own data models. That's something domain-driven design talks about, with bounded contexts, and that's a subject for a completely different talk. You might share a database, that's fine with me, but then you should have separate schemas: each of those microservices should have its own schema, so that they don't depend on one another in the database and there is a clear separation there.

Here is another pattern. It's something that I actually see at customers, and what a team doing this might say is: "We want a flexible and maintainable system, so we are going to use microservices, and then we will have such a flexible and maintainable system." Here is an architecture diagram with a bad structure of some deployment monolith; and here is a diagram of the same bad structure with microservices. It's the same thing. Microservices just mean that the modules you implement are microservices. That doesn't mean you come up with a different, better structure.
Microservices are just how you implement those modules, and for maintainability and so on, the problem is the structure, not how the modules that make up that structure are implemented. Microservices are just a different way of splitting your system into modules: instead of Java packages or C++ namespaces, you use microservices. For that reason, microservices won't fix modularization. Instead, you will end up with a distributed big ball of mud that is not maintainable or changeable.

Actually, if you take such a bad structure and use microservices, it gets worse. The reason is that microservices have something I call extreme decoupling. There is a lot of decoupling in a microservices architecture: the microservices can be deployed independently, there is network communication between them, you can scale them independently, and so on. Usually decoupling just means that if you change one module, the other modules are not affected. With microservices, if you scale up one microservice, the others are not affected; if you deploy one microservice, the others are not affected; and so on. So there is a lot more decoupling going on. But if you have that bad structure, you probably need to do multiple coordinated deployments, because a change will affect a lot of different modules, and therefore a lot of different microservices, and somehow you need to deploy all of them. For that reason, bigger changes might become even harder. And because your microservices depend so much on one another, they might talk a lot to each other; they become chatty.
So you will have those network performance problems, latency problems, resilience problems, and so on. If you have a bad structure, microservices make the problem even worse. Just moving to a microservices architecture because you think that will fix your structure is probably a bad idea: if you don't actually fix the structure, you will have much worse problems than before.

So what's the cure? I would try to decouple the logic, and again the solution I propose is domain-driven design and bounded contexts, because then you have separate, loosely coupled domain models in each of the microservices, and you will probably have less communication. You can also use a migration approach where you migrate one bounded context after the other, a stepwise migration. For that reason, I would not reuse an existing structure for a migration: chances are the structure is bad, and if you migrate it into a microservices system, it becomes even worse. So my advice is: if you want to fix the structure, microservices won't help. If you want to fix the structure, fix the structure, fix the architecture, and don't just migrate the whole thing to microservices.

As a conclusion: I think there has been a lot of microservices bashing lately, and I think that's actually sort of unfair, or I should rather say it limits people, because they think microservices are a bad idea, so why would they use them? I would argue that there is a trade-off. Microservices are a solution for specific problems. They solve some problems; they don't solve others.
So for that reason, you need to decide in your architecture what the problem at hand is and come up with the right approach, and that is what makes software architecture so interesting and challenging, and that's why I still enjoy it after all this time.