I was fortunate to get involved in one of those career-defining projects, a project that was just fascinating in its complexity, and we did some really cool things. What we were trying to do was build one of the world's highest performance financial exchanges. As part of that exercise, we developed a fairly unusual approach to the architecture of our systems to get that working. We were driven by the ridiculously high performance demands of the system, and later got involved in writing a thing called the reactive manifesto on the back of that work, which defines systems like this a little bit. So what I want to do is go through and describe some of the properties of these kinds of systems and talk a little bit about that. The tagline for this, for the reactive manifesto in general, is that you don't solve 21st century problems with 20th century software architectures. We are moving on, the game is moving on, and this is meant to capture that a little bit. In 2005 a large system would probably consist of tens of servers, it would need to respond in seconds, it would have a few hours for offline maintenance, and it would be processing gigabytes of data. A large application now is probably thousands of different points in a network, millisecond response times, 100% uptime and petabytes of data. These are very, very different. The hardware: we tend to get divorced from the hardware in our industry. We tend to forget the absolutely unbelievable, phenomenal advances in computer hardware. This is just one dimension of that kind of performance increase. These pictures are all related to storage. There's a cost-per-gigabyte chart over there. This picture here, that's an 8 gigabyte SD card sitting on top of what's called a ferrite core memory unit. I'm not quite old enough to have programmed any computers with ferrite core memory units. That thing is memory; each one of those little magnetic rings is one bit. That's the technology of mainframes in the 1960s and 70s, and we went on from there. This picture over there, the one in the bottom right-hand corner: again, two SD cards. The one from 2005 is 128 megabytes. The one from 2014 is 128 gigabytes. That's a thousand-fold increase in density for about the same price point. There are these massive changes. I want to take you through a little exercise to try and position in your minds how staggering modern hardware has become. Let's imagine for a minute that one CPU cycle in one of our computers, commodity hardware, my Mac here, most things, took one second instead of the rate at which it actually operates. What would that mean in terms of performance? One CPU cycle is actually about 0.3 nanoseconds; at around 3 gigahertz, that's how it plays out. If that was equal to one second, a level one cache hit in the processor would take the equivalent of three seconds. A level two cache hit, nine seconds. A level three cache hit, 43 seconds. We're still on the chip. We're still in the processor. But this is the difference in performance. Now we're going to go for main memory access. This would take six minutes. Computer to computer, over 10 metres of fibre, 18 hours. Solid state disk IO, four days. Spinning rust, a conventional hard disk, about six months. Internet transfer from London to Australia, 19 years. Computer reboot, particularly if you're running Windows, 31,000 years. It's just trying to put this into some kind of human scale.
We forget the phenomenal capacity of these devices that we program. Those advances should impact and make us think somewhat differently about the nature of the systems that we build and the architectures of the systems that we build. As I mentioned, I got involved in this thing. This is called the reactive manifesto. The idea of the reactive manifesto is to try and... We'd learnt some of these things about building different kinds of systems, and we wanted to try and capture those in some way. Fundamentally, it describes the behaviours of reactive systems: they are responsive, resilient, elastic, and they're message-driven. This is a hyper-microservice kind of architecture. That's one way of thinking about it. I want to get into a bit more detail and explain what we're talking about. First of all, reactive systems are responsive. They respond in a timely manner. This is the cornerstone of usability, but it also means that we can build very, very, very high-performance systems. Just to give you a flavour of that: when we were building our exchange, from a message, a packet, coming into the edge of our network, with two network hops and processing in the matching engine to do a trade, to the packet hopping back out: 40 microseconds. This is staggering, that kind of performance. Reactive systems are also resilient. They remain responsive in the face of failure. This depends on replication, containment, isolation, and delegation to be able to achieve these kinds of things. They're elastic. They can scale up and down. They remain responsive under varying workloads. They respond to change by increasing the resources that are allocated to delivering behaviours. These tend to be very decentralised architectures, so they tend to duck contention points and bottlenecks in the processing of information. They're message-driven, and more specifically, they're asynchronous message-driven. I want to talk about asynchrony in a bit more detail as we go through this, because I think that's one of the more unusual aspects of programming these kinds of systems. Asynchronous message-passing is the foundation for many of these properties that I've described. It gives you loose coupling, isolation, location transparency, temporal decoupling, because with the asynchrony you're no longer held to time demands, and the ability to delegate errors to other parts of the system, which is one of the things that makes them so resilient. So here's a list of properties that we can consider for reactive systems. They're flexible, loosely coupled, scalable, easier to develop, which may surprise you, but I hope to convince you that I'm right when I say that. More tolerant of failure. They respond gracefully when things go wrong, and they're very responsive to users. Again, that might surprise you when you think of asynchrony, but actually, in an asynchronous system, you do less work. I'll come back to that. This is kind of a fractal architecture with the kinds of systems that I'm describing. They can be incredibly large distributed systems running in a cloud or something like that, but this is also actually the way that our hardware works. There's something fundamental in computing about the way in which this operates. Large systems tend to be composed of smaller ones that are built up from these kinds of structures. They tend to depend on the reactive properties of the parts lower down the stack, and the systems tend to benefit at all scales from these sorts of things.
Just to try and describe what I'm talking about, this is a schematic diagram of a modern Intel processor. This is a network of compute units. They communicate through asynchronous messaging over the QPI message buses internally, and they coordinate their activities this way. This is how you get really, really high performance at the hardware level, and it's also how you get really, really high performance at the software level. Let's start digging into these a little bit. I think for all of us, when we first learn to write computer programs, after we've done the hello world, we might write something where we're going to call a function and we're going to wait for a response of some kind. We grow up as programmers with the knowledge and the assumption of synchronous interactions in our coding, but if you stop and think about that for a minute, the whole idea of synchronous communication is really, really very weird. If you and I have a conversation, if I'm talking to you and I ask you a question, while you're thinking about it and responding to me, I don't freeze, I don't stop, my brain doesn't cease processing while I'm waiting for a response from you. But that's how our software works on the whole if we're building synchronous systems. This is weird. I would suggest that most of the universe, most of reality, is asynchronous. It's not synchronous. If you want to model the real world in our systems, then asynchronous is probably a better fit. It also starts to give us some tools to really overcome some of the complexities, particularly in distributed systems. Distributed systems are much, much more complicated than local systems. Because things can go wrong in different parts, you have to worry about that. You have to think about that in the design of the systems, and so they become much more complicated. Asynchronous programming gives us some tools that simplify that process. Here's one example of what I mean. Let's imagine that we've got two components and they're communicating with each other. Something triggers component A, and that means it talks to component B. Now, there are a number of places where that conversation can go wrong. There might be a bug in component A. There might be a problem establishing the connection to component B. Something bad might happen while the bytes are in flight and we might drop bytes in the communication. Maybe there was a problem establishing the connection at the component B end. Maybe there's a bug in component B, and then there's the same set of failures on the return journey for the message. If it's a distributed system, all of those points are points of failure. Now, the problem is that in reality, component A can only detect some of those failure cases. It can't detect all of them. It can only detect a bug in its own code and a failure in its own ability to communicate with the other thing. For everything else, we've got to write complicated code to cope with the case that there might be a failure, somewhere else in this distributed system, that we can't detect. The other way in which this breeds complexity: let's say we invoke component A, that starts a process going, and that results in a communication to component B. Now the process is frozen while it's waiting for a response from component B, and then when it gets the response back from component B, it carries on processing. That's fine until you now want to scale this up a little bit. How do you make that go?
What if, you know, you want to do more work while you're waiting for the response from component B? Typically, either we just wait until these things are free, or we do this: we start writing multi-threaded code, and then we just wander into one of the most difficult parts of computer science, concurrency. This is a much, much more complex programming model now. We have to wait for all those things to finish, we have to be very careful about shared state and all that kind of stuff. The other thing is that if there's any shared state between these things, that's incredibly slow, it's incredibly expensive. Concurrency is an extremely costly thing in terms of performance. If you want to maximise the throughput of a piece of software, run it on one thread, because as soon as you go off onto another thread, you incur a massive penalty. Depending on the nature of the concurrency techniques you use, it may be as much as three orders of magnitude slower than single-threaded performance. A thousand times slower, because you're waiting for the concurrency synchronisation and so on. Synchronous comms increase coupling in time and in location. So what could we do instead? Sorry, there's one more thing. If there's something downstream from here, in this case, the situation is even worse, because now we've got those failure cases as well, failures that might happen that we can't see, that we can't detect. So what could we do instead? Well, we could start using asynchronous messaging instead. So instead of doing these things synchronously and waiting for responses, we could run our logic on a single thread in each of these components, so there's no concurrency, and then we could just do work. The same failure modes are in place. But the difference is that the two that we can detect are the only two that matter in this case. If it fails downstream, we don't care, because we're not blocked, we're not waiting for a response. We're not trying to co-ordinate the state of the data between the two places. We can fire the message, forget it, and not worry about what's happening anymore. Sometime later, maybe in the next microsecond, maybe next year, a response will probably come back. This is a much, much more scalable approach to solving problems, to writing software. And if these things are single-threaded, as I said, it's also incredibly efficient, because you're not paying the cost of concurrency and co-ordination of state between different threads of execution and so on and so on. So here's a little example. Let's imagine that we've got a bookstore and an inventory that we're trying to implement. And we're going to place an order for a book called Continuous Delivery. All my examples are buying books called Continuous Delivery. We're going to place an order. That's going to reserve Continuous Delivery from the inventory so that we can use it. If we're doing this in a synchronous way, we're now blocked while we're waiting for a response back. So the bookstore is now stopped, unless we do the multi-threaded thing. We're blocked. So what we could do instead is do this as an asynchronous model. So here's a message coming in, ordering Continuous Delivery. That's going to send a message on to the inventory to reserve the Continuous Delivery book. And at some time later, maybe a microsecond, maybe months, there's going to be a response that comes back, and we can communicate that response back upstream.
A good way of thinking about it, one of the nice patterns for implementing these kinds of systems, is to treat these little blobs of behaviour as little blobs of domain logic and as little state machines. So what happens, oops, sorry, I zoomed away. Let me try that again. So we can place the order. We can set the state of the book to Reserving while we've sent the message on to the next one. At some time later, we can process the other books. Those are also in the Reserving state. Some time later, we get a response back. We can change the state of the book to Ordered instead of Reserving and we can manage that. It's a really, really simple pattern. So what happens if the inventory was in a data centre that got hit by a meteorite and it's not coming back for weeks? It doesn't matter. The bookstore bit can still function. It probably can't forward the orders, but it can take the orders, it can make progress, and we can then write code to analyse and say, tell me all of the books that are in the Reserving state, because those are the ones that are problematic, and we can decide what to do. So we have all of the information that we need. It's a great way of thinking about how we organise the logic. This kind of pattern, I think, is really quite... It's really simple and it's really useful. It uses domain-level message semantics, so we're talking at the level of the problem domain. That's the nature of the messages that flow between these services. We can migrate the state of a domain model based on message inputs, so we can keep these little bubbles of logic stateful to understand what's going on. And in response to a state change in the model, we can generate a new event and we can move on from there. So let's just walk through an example again. Same example, pretty much, just to make sure that it's clear. So here's the bookstore. We're going to order the Continuous Delivery book. That's going to do some processing. Change the state of the Continuous Delivery book to Ordering. That's going to forward the message on to reserve the Continuous Delivery book in the inventory. That's going to do some processing. We're going to change the state of the Continuous Delivery book to Reserved in the inventory. That's going to send a new message back saying that the book's reserved. We're going to change the state of the book in the bookstore to Ordered and we can forward the message on saying that the book's been ordered. So it's a really nice, simple little model for the interaction. If the inventory is not there, then that second part of the work is lost. But we're still in a reliable, consistent state as far as the bookstore is concerned. Sometime later we can come in and say, tell me all of the books that are in the Ordering state, tell me what's going on, so I can list those and decide what to do about them because the inventory is broken. Later on, when the inventory comes back, the messages will be delivered, the Reserved message will come back, and processing continues as it was before. This is a really, really resilient way of building a system. All of the pieces of the system don't have to be up and running at the same time. As long as the messaging system is going to safely get you the right messages in order, it's all going to work. At this point, maybe if you've been thinking about this, you're thinking, okay, so what about getting the messages in order? So I just want to reassure you. You shouldn't have to write this kind of stuff.
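Before getting to the messaging infrastructure itself, here is roughly what that little state machine might look like in code. This is only a sketch under hypothetical names of my own (Bookstore, onPlaceOrder and so on, nothing from the real system): the order is marked Reserving the moment the message is fired off, and it only moves to Ordered when, or if, the reply eventually comes back, so we can always ask which orders are stuck.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// A toy bookstore service: domain logic driven purely by messages, run on a single thread.
// The state lives inside the service; the only way in or out is a message.
public class Bookstore {
    enum OrderState { RESERVING, ORDERED }

    private final Map<String, OrderState> orders = new HashMap<>();
    private final Consumer<String> sendToInventory; // fire-and-forget channel to the inventory

    Bookstore(Consumer<String> sendToInventory) {
        this.sendToInventory = sendToInventory;
    }

    // Handle an incoming "place order" message.
    void onPlaceOrder(String title) {
        orders.put(title, OrderState.RESERVING);    // record our intent locally first
        sendToInventory.accept("RESERVE " + title); // fire the message and carry on; no blocking
    }

    // Handle the (possibly much later) "reserved" reply from the inventory.
    void onReserved(String title) {
        orders.put(title, OrderState.ORDERED);
    }

    // If the inventory never answers, we can still ask which orders are stuck in Reserving.
    List<String> stuckOrders() {
        return orders.entrySet().stream()
                .filter(e -> e.getValue() == OrderState.RESERVING)
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        Bookstore store = new Bookstore(msg -> System.out.println("-> inventory: " + msg));
        store.onPlaceOrder("Continuous Delivery");
        System.out.println("stuck in Reserving: " + store.stuckOrders());
        store.onReserved("Continuous Delivery");    // the reply arrives some time later, maybe months
        System.out.println("stuck in Reserving: " + store.stuckOrders());
    }
}
```

The point of the sketch is that the service never blocks waiting for the inventory; its own state records what it is still waiting for, which is exactly the meteorite-proof property described above.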
Now, that reliable, ordered messaging is the kind of behaviour you can buy off the shelf. But just to describe one pattern that you can use to get a reliable, ordered, idempotent set of messages, idempotent meaning, in effect, that each message is processed only once. So we have some behaviour here. We have a ring buffer, and component A is going to send a message. So it places the message in the ring buffer; there's a cursor on the ring buffer to say where it's going to put the message, and it's going to populate that entry in the ring buffer. Component A can carry on doing that to its heart's content. The infrastructure underneath is multi-threaded; that's hiding the concurrency complication in the implementation here. So the thread that actually transmits the message is going to kick in at some point. It's going to read the messages from the ring buffer and it's going to transmit them over the wire to the other component. The other component is going to put that into its buffer and it's going to move its cursor on as it populates a few more messages. So let's imagine now that we send a message and the message fails somehow in transmission. When we try to send the following message, component B can say, you just sent me message 4, I was expecting message 3. So it can send a message back saying, I was expecting 3. Now component A can rewind its cursor and it can resend all of the messages from 3 onwards. We built a system like this to give us this kind of resilient messaging in support of our stuff. Sorry, I'm having problems with latency in my presentation today. So, isolation. What this kind of approach gives you is that it decouples the components of your system in both time and space. So you don't need things to be available at the same time, or running at the same rate. And actually you don't really care where they are. If the messaging is pub-sub style messaging or broadcast style messaging, you don't care where they are, you just send the message and something's going to process it. This kind of architecture depends on a few base principles. One of them is that you don't share anything other than via messaging. There are no backdoors, there are no shared databases or anything like this. This is where they're very strongly aligned with the microservice kind of model. And the whole system is architected around the inter-component communication over these protocols, these exchanges of information, these business-level conversations that are happening between the components of the system. So let's talk a little bit about isolation. So here are our components, and here are two components sharing a database. That's what we're trying to avoid. We don't want these back channels of communication in the system that are going to compromise the determinism of the system. What we'd like is to be able to completely replay the stream of events and get the service back into exactly the same state that it was in whenever we replayed those events. We can't do that if we've got these back channels. What we can do instead is give each one of these things its own data store, or whatever it needs, and put that within the boundaries of the service. If one of these services needs a graph database, we can implement a graph database. If one of them needs a column store, we can implement a column store. It doesn't matter. It's down to the service. We can do more than that as well, though. But we need to aim to share nothing, and the boundaries of the service are inviolate.
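To pin down the shape of that "I was expecting 3" conversation from the ring buffer pattern above, here is a toy sketch of the receiving side, with names I've made up rather than anything from the real exchange: every message carries a sequence number, a gap triggers a request to rewind, and retransmitted duplicates are dropped so processing stays effectively once-only.

```java
import java.util.function.LongConsumer;

// A toy receiver for sequence-numbered messages. Not the real exchange code,
// just the shape of the resend conversation described above.
public class SequencedReceiver {
    private long expectedSeq = 0;                  // the next sequence number we expect
    private final LongConsumer requestResendFrom;  // channel back to the sender: "rewind to N"

    SequencedReceiver(LongConsumer requestResendFrom) {
        this.requestResendFrom = requestResendFrom;
    }

    void onMessage(long seq, String payload) {
        if (seq < expectedSeq) {
            return;                                // a retransmitted duplicate: ignore it, stay idempotent
        }
        if (seq > expectedSeq) {
            requestResendFrom.accept(expectedSeq); // gap detected: ask the sender to rewind its cursor
            return;
        }
        process(payload);                          // exactly the message we expected, in order
        expectedSeq++;
    }

    private void process(String payload) {
        System.out.println("processed: " + payload);
    }

    public static void main(String[] args) {
        SequencedReceiver receiver = new SequencedReceiver(
                n -> System.out.println("NAK: please resend from " + n));
        receiver.onMessage(0, "a");
        receiver.onMessage(1, "b");
        receiver.onMessage(3, "d");   // message 2 was lost in transit, so this triggers the NAK
        receiver.onMessage(2, "c");   // the sender rewinds and resends from 2
        receiver.onMessage(3, "d");
    }
}
```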
The only route in and out of these services is via messaging. The next thing to think about here is the idea of back pressure. So if we've got all of these messages, queues in effect, of information flowing between these services, what happens if one of them can't keep up? I think one of the underpinning ideas is this: you can't isolate stress. If a service is under stress because it can't keep up, you can't solve that problem in that service, because there's too much stuff coming in. So the system as a whole needs to respond in some way. It's no good for that service just to fail, because even if it fails and then you start up a new instance, you're going to be back in the same problem as you replay all of the messages through. The other thing to think about, which is a slightly weird thought, is that queues represent an unstable state if you think about it. In a real-world system, a queue is nearly always either completely full or completely empty. It's not stable to have them perfectly balanced; we'll talk about that a little bit more in a minute. So the idea of back pressure, being able to signal back up the stack to inform the sources of information, is important. But let's look at this unstable state thing. So here's component A talking to component B. If component A, the producer, is slightly slower than the consumer, component B, then the message queue between them is always empty. So that's not very problematic. This one, though, probably is. If it's the other way around, if component B is slightly slower, then what happens is that over time the queue is going to get full. In fact, it's essentially always going to be full. At which point, what do you do? Usually what happens is that the thing falls over. You could try to solve this by making the queue bigger. But actually, this is a dynamically unstable state. The queue is still going to fill up. So maybe we're going to drop messages on the floor, or the system's going to fall over. This is not a stable position. What usually happens is something more like this. So what we need to do is have a strategy to cope with this kind of stress. We can't solve the stress at the point at which the system is under stress. So what we need to do is signal upstream saying, slow down, I can't keep up. Ultimately, that might have to go all the way back to the user, but you've got to signal all the way back and you've got to build that into the system. Some of the tooling and infrastructure around reactive systems has built-in back pressure mechanisms that allow you to do this kind of thing. Elastic systems need to react to changes in demand, and nearly all computing these days is increasingly distributed. And this means that there's no real difference between cluster-style scalability and vertical scalability. Components need to be mobile. We need to be able to move behaviours around in the system and scale them out and scale them back down and all that kind of thing. So from a programming point of view, a good approach is to have one programming model. From the services' point of view, it makes no difference whether the services are co-located. They just talk to each other via messaging, asynchronous messaging. The implementation of the messaging might change. It might be using something much more efficient if they're co-located than going over the network, but if they're distributed, you can go over the network.
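Just to make the back pressure idea concrete before moving on, here is a toy sketch. It isn't any particular framework's mechanism, just the principle: put a deliberately bounded queue between producer and consumer, and treat "the queue is full" as a signal the producer has to react to (slow down, shed load, push back towards the user) rather than something to paper over with a bigger buffer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// A toy back-pressure demo: a bounded queue between a fast producer and a slow consumer.
// offer() failing is the back pressure signal; the producer must propagate it upstream
// rather than letting the queue grow without limit.
public class BackPressureDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(8);   // deliberately small

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String msg = queue.take();
                    Thread.sleep(10);               // the consumer is slower than the producer
                    System.out.println("consumed " + msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();

        for (int i = 0; i < 100; i++) {
            boolean accepted = queue.offer("msg-" + i);   // non-blocking: don't just pile work up
            if (!accepted) {
                // Queue full: react to the back pressure instead of growing the queue forever.
                System.out.println("back pressure: slowing producer at msg-" + i);
                Thread.sleep(20);                         // crude response: slow down and retry
                i--;
            }
        }
    }
}
```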
Whether the messaging goes over the network or uses something more local is something you can determine at deployment time or even at runtime. And then how do we scale these kinds of things? So we can start to think about scaling through sharding. We've got component A talking to component B, and we've got a stream of messages coming in. And if that's not fast enough, we can start thinking about how we can speed that up. One way in which we can do it is to build some slightly smarter infrastructure which is able to tell the difference between different invocations. And so now we can route some traffic to one instance of component B and other traffic to another instance of component B. So this is another of the dimensions of scalability for these kinds of systems. So what does all this look like when you put it together? You tend to have a bunch of services doing different kinds of roles, and they're talking to each other. Nearly always what happens with one of these services is that some message comes in, they do some work, change some state, and generate one or more messages coming out the other side. And then other services might react to those messages. And you might have multiple services reacting to one message and so on. You might have messages going back the other way. You have gateway services at the edges of your system that will translate into and out of the logic of your system, to keep the logic inside your system clean of external dependencies. One of the other wonderful things that you can do, and I realise now that I think I've missed a slide from this deck, is that you can externalise a whole raft of the accidental complexity around the system. So for example, you've got this stream of messages coming into a service. You don't really need to store those in a database, because you could just log the messages. You could just record the stream of messages and then replay those messages when you wanted to rebuild that state. So you could move these services around to different places, replay the stream of messages and so on. The system of record in our exchange was the in-memory state of two of these services, these rich domain models in memory, and we could do clustering, persistence, scalability, all of those things external to the services themselves. The service didn't change, but it could be clustered, scaled and so on. Here's a picture of our system. This is an example. We divided our services up into three different groups. We had what we called core services. These are the seriously high performance bits of the system, and these are the things where the in-memory state was the system of record. Then we had what we called general services. These were more like traditional microservices kinds of things, where they might be backed by a database or a file system or something like that. Outside of the edges of that, we had gateway services. The job of the gateway services was only really protocol translation. Security and protocol translation. So they're going to translate between any external communications mechanism and our domain model, which is represented in the core services and the general services. We built our entire enterprise system to work this way, and it gave us this massively high performance because of the efficiency that we talked about. The thing that really landed with me, the reason that I'm talking about this today, the reason that I think this is important, is that it's the nicest programming model that I can remember for a system of this complexity.
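Going back to the sharding step for a moment: that "slightly smarter infrastructure" can be as simple as hashing on a message key, along the lines of this sketch (the names are mine, not anything from the exchange). All messages about one book land on the same instance of component B, so they stay ordered relative to each other, while different keys fan out across instances in parallel.

```java
import java.util.List;
import java.util.function.Consumer;

// A toy sharding router: picks an instance of component B from the message key.
// Messages with the same key always go to the same instance, so ordering is
// preserved within a shard even though the shards run in parallel.
public class ShardingRouter {
    private final List<Consumer<String>> instances;

    ShardingRouter(List<Consumer<String>> instances) {
        this.instances = instances;
    }

    void route(String key, String message) {
        int shard = Math.floorMod(key.hashCode(), instances.size());
        instances.get(shard).accept(message);
    }

    public static void main(String[] args) {
        List<Consumer<String>> instances = List.of(
                msg -> System.out.println("B[0] handles " + msg),
                msg -> System.out.println("B[1] handles " + msg));
        ShardingRouter router = new ShardingRouter(instances);
        router.route("Continuous Delivery", "ORDER Continuous Delivery");
        router.route("Refactoring", "ORDER Refactoring");
        router.route("Continuous Delivery", "RESERVE Continuous Delivery"); // same shard as the first
    }
}
```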
It's the nicest programming model because, when you're writing the code, all you're worrying about is what the core problem domain needs to do here. Because inside these little bubbles, these little services, that's all you need to worry about. You don't need to worry about persistence or scalability or all those sorts of things. Those are externalised from the system. That's a wonderful property to have. I made that presentation a little bit shorter than I anticipated, which means we've got time for questions. Are there any questions? ...to process the messages faster. How do you ensure ordering of messages in that case? Effectively, you're partitioning the messaging. They're going to be ordered within the context. The little green blobs are all going to be in order. The little yellow blobs are all going to be in order. But there's no ordering between the green and the yellow blobs. That could be concerning, right? If I'm working with a transaction system, where I really want my message A to get processed first and then message B second. I think that's one of the kinds of things that you have to think about in these sorts of systems: trying to define the boundaries of the isolation so you don't care about those sorts of ordering problems. It's kind of the eventual consistency model. One take on the eventual consistency model, I suppose. Let's say that they were orders for books, as we were talking about before, going to those different services. If each one of those things is concentrating on ordering a book, you don't really care whether they're in order or out of order. If you do care, then you don't shard on that boundary. You shard on a different boundary. The approach that we took to eventual consistency and distributed consensus kinds of things was that we didn't want to try and build distributed consensus protocols that would define consistent state across the whole picture. We tried to architect our system so that we didn't need that. Within a service, things would be completely deterministic, but between services they didn't need to be. That's about drawing the lines of your bounded contexts to isolate the services appropriately, and that kind of level of design, I think. There's one over here. We'll come back to you. Yes? Yes. How important is this message-driven paradigm for a system to be reactive? I think it's essential. Asynchronous message-driven. I don't think it counts as a reactive system if it's not message-driven in this sense. For us to apply this paradigm to real transactional systems, there could be systems which have zero tolerance for latency; with message-driven systems and asynchrony being part of the exchange, does it have an impact on the latency for such kinds of distributed systems? It does have an impact on the latency. It's dramatically faster. If you think about what's going on in a synchronous system, at some level you're going to formulate a message of some kind, you're going to send the message, you're then going to block a thread, and then you're going to process the message, and then you're going to send a response, and then you're going to unblock a thread. The asynchronous system is simpler than that because you don't have to do the blocking and unblocking of threads, which are relatively costly exercises. The most high-performance systems in the world are asynchronous systems. The asynchronous system is a more efficient way of exchanging information, but it's a slightly unusual way of thinking about the programming.
Really, that's the point I'm trying to make. One of my friends who was involved in this project talks about synchronous programming being the crack cocaine of software development, because we've all got kind of addicted to this idea of synchronous programming. But actually, for distributed systems, if you do away with it and you just start going with the asynchronous system, it's a much, much simpler programming model, it's much less complex in terms of the resilience and the scalability and all that kind of stuff, and in terms of performance: our system was, for a little while, the highest performance financial exchange in the world, as far as we could tell, and it was all built on these principles. There's no compromise in terms of performance here in doing this kind of thing. Now, it does mean that if you want to do that kind of thing, really high performance, you've got to have really, really good messaging systems, and some of the common ones, RabbitMQ and stuff like that, aren't that fast, so you use faster ones. But the model still holds for those messaging systems too, if you're not looking for ultimate performance. Thank you. There was a gentleman over here, and then I'll come back to you. You said something about maintaining the state of domain entities in the services, right? Now, whenever the workflow triggers, the entire transaction spans across two things. One is the database, because of the persistence, and the second is the publishing of the event or a message to the broker. So how do you maintain that distributed transaction? Because over there, we still need to ensure the distributed transaction and maintain the consistency of that, right? So one of the things that you give up in these sorts of systems, and to be honest in any distributed system of any scale these days, is distributed transactions. Those aren't really a scalable solution. So the kind of approach that you buy into instead is really this eventual consistency model. And part of the design of these things, and it's not as hard as I'm making it sound, I don't think, but part of the design of these things is to design your services along boundaries where you don't care whether that thing over there is up. So I can give you a concrete example. In our exchange, let's say an order was coming into our exchange to be traded. We needed to process that in our exchange to see whether it matched a price that was already in there and all that kind of stuff, and send a message out. We also needed to archive that in a data warehouse so that we could do management information queries and all that kind of stuff. So what we had was one service that was listening to these messages coming in, an order coming in, and that was the matching engine that would match it. The persistence of that was externalised. There were no databases, because databases were too slow. Then there was another service somewhere else, running on a different machine altogether, that would just listen to the same events and would persist them to our data warehouse. So you get to break the problem up into these different pieces, and each one of those pieces is simpler. This is the classic kind of microservices thing, but the difference here is that they're strictly asynchronous and very high-performance. We rolled our own. We built our own because the performance demands of our system were so high. We evaluated some of the third-party messaging systems and we just made our own.
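A rough sketch of that shape, with purely illustrative names rather than anything like the real exchange code: the matching engine and the warehouse archiver are simply two independent subscribers to the same stream of order events, with no distributed transaction tying them together; each one moves to its own consistent state in its own time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// A toy event bus with two independent subscribers: the matching engine reacts
// to the order, the archiver persists it to the warehouse. Neither knows or
// cares whether the other one is up; consistency is eventual, per service.
public class TradeEventDemo {
    static class EventBus {
        private final List<Consumer<String>> subscribers = new ArrayList<>();
        void subscribe(Consumer<String> s) { subscribers.add(s); }
        void publish(String event) { subscribers.forEach(s -> s.accept(event)); }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        bus.subscribe(event -> System.out.println("matching engine: match " + event));
        bus.subscribe(event -> System.out.println("archiver: persist " + event + " to warehouse"));
        bus.publish("ORDER 100 shares of ACME at 42.00");
    }
}
```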
On the subject of rolling our own: I was the head of software development on this project, and my friend Martin Thompson was the CTO. Martin's now got an open-source project where he's writing a messaging system that has some of the same kinds of behaviours that we're talking about. It's called Aeron. That's the highest performance way of exchanging information between computers that we have at the moment in terms of messaging. So you can do a lot; as I keep saying, this is a high-performance solution, not a low-performance solution. There we go. Okay. Kind of a two-part question, I guess. So you've clearly been around for a while in the industry. How is this thing that we're all talking about and doing, my company we also do, to some extent, how is it different from all the SOA stuff and the shit that was sold 20 years ago? How is it actually fundamentally different? It kind of looks like the same thing to me. You alluded to my age, so being a grumpy old man, I can say: actually, it's not very different. We were doing some of these kinds of things 20 or 30 years ago with a different take. We used different names. We called it component-based programming, stuff like that. If you were doing that stuff well, it looked a bit like this kind of stuff, but there were ways of doing it less well. I think one of the things that's changed is that we've got better ways of talking about it, better ways of describing it now, that make it a bit easier to do it well. My second part of that question: what ended up happening back in those days is you had all this enterprise architecture governance API documentation bullshit which kind of slowed everyone down and we all started hating it, and that's why things like Ruby on Rails came about and everyone's like, yes, screw all that, let's do this the easy way. Then we went to REST and then we came back here again. Now I'm a CTO at a company with 30, 40 microservices, and while the architecture makes sense from a purely abstract standpoint, managing a large team working on different services and deploying them and event schemas and all that stuff, sometimes I wonder how much it's worth it, or when this should be applied and when we should smoke the crack, because it's just an easier way to get things done. I think that's fair, but the distinction for me is that I am now so opinionated about this stuff that I think that if you're building a distributed system, you should do it this way. Now, if you can get away with not building a distributed system, that's a lot easier. So if you haven't got a distributed system, then fine, don't do this. But if you've got a distributed system, I think this is easier. To make it work used to be kind of tricky, though; the infrastructure was problematic. But now we're starting to see commercial offerings and open source offerings that provide this kind of infrastructure that we can reuse. So for building complex systems, high performance systems, or not even high performance but scalable systems, I wouldn't want to do it any other way than this, because I think this is the easiest. I've been around long enough to have tried most of the different ways of doing this stuff. Part of my specialism, I suppose, over the last 30 years has been distributed systems, distributed computing. And this is the easiest way that I know of doing it. Distributed computing is always hard, but this is the easiest way that I know of doing it. But it's not as easy as running stuff on one computer. Yeah, yeah. So we did duck one of the problems.
One of the ways in which our system, our exchange, was not a microservices architecture was that we had it all in one repo and we deployed it as a monolith, which meant that we could test it all together and we could verify all of it. We weren't too worried about the protocols changing because we could test them all. There was a lady over here that had a question. Hi. One of the products that I've worked on had an enterprise service bus. We called it a transaction broker. But it was repeatedly a point of failure for us. What can we do to make that more resilient? So I think one of the failures of the ESB, the enterprise service bus model, was trying to put too much smarts in the communications. I think that the communications infrastructure around this is actually fairly lightweight in some ways. So all you need is some basic publish and subscribe messaging, and then you need a collection of various services that you can mix in. You can get this stuff off the shelf. For one of my home projects at the moment, I'm using Kafka to do the same kind of stuff. It's not as high performance as the stuff that we built, but it's absolutely fine. So those sorts of things... If I'm honest, I think that's largely a solved problem. But if you are taking the step of externalising the accidental complexity in your system and handing that off to the infrastructure, you've got to be confident that you know what it's doing and that it's working. Time for one more. This gentleman was waiting for ages to ask a question. Hello. It's a very small question. In one of the slides you showed that you have got components which are independent and communicating with each other, and below that you mentioned that they need to have independent data stores. At this point I could not really understand. I just want to know, when you say independent data store, what do you exactly mean? Because many of the data stores have some parts related to each other. So how does that work? So the level at which I would care about independence in this kind of architecture is that the services are not communicating with one another through the data store. So there's no shared state between components. Okay. Reads are allowed, but not updates or edits, something like that? Is that what you're saying? No change of state in a shared environment; they can just retrieve the information, but no change of state, that kind of thing? The way that I think about it is that there are no back channels of communication between the services. The only allowable route for communicating between services is the messages. Okay. I want to thank you. A very beautiful presentation, and yours is the only presentation that I've seen which has got so many questions, and we really don't have time for more. Thank you very much. Thank you. Thank you.