I'm excited to have a guest speaker. I've known Barry for a while. Barry is the ex-CEO of NuoDB, but he's still heavily involved in the company. Every time I go visit there, Barry's there. I met Barry when I was in grad school at Brown, back before it was called NuoDB, when it was called NimbusDB. I actually found the NimbusDB M&Ms in my office. So I invited Barry to come give a talk when I was in grad school, because we didn't actually understand what NuoDB was. And when he came, everything really clicked for me, and I understood what they were doing and why it was cool. So I'm really happy to have Barry here now to talk to you guys about a full system that encapsulates all the things we talked about this semester, one that's actually running in the wild and solving real problems. So Barry's going to give a talk; feel free to stop him and ask questions all along. Okay, awesome. Thanks guys. Thanks Andy. Good to be here, guys. So as Andy says, I've been involved in databases for a long time, and the story here is a little bit boring, so I'll tell it quickly. I was kind of retired and done with running companies and starting companies and all that sort of stuff, happily running a school and an energy company and whatever else. And Jim Starkey, my co-founder, came to me and he said, Barry, I've solved this holy grail problem to do with databases. And I said, what is it? He said, I can do efficient distributed transactions. In fact, I can do them dynamically, in a scale-out fashion. And I said, no, you can't. I know enough about this stuff to know that lots of people have tried to solve that problem. There aren't really that many solutions, and all the solutions are highly compromised. Sorry, Jim, that's not true. And he spent a few weeks persuading me, talking to me about the things I'm going to talk to you about today, and I said, my word, that's quite amazing what you just described.
That's the most elegant thing I've seen in a long time. I shut down all the other things I was doing, jumped in, and we patented what I'm about to talk to you about, which, by the way, we got in less than a year with no office actions. They said there's no prior art; no one's ever done anything like this before. And we now have backers who include the founder and CEO of Ingres, who is also the founder and CEO of the Postgres company, Illustra. The founder and CEO of Informix. All of these are investors. The former CEO of Sybase. Pretty much all of the big guys from the relational database companies, other than Larry Ellison, and I don't think I'm gonna be able to get him on board. So this is a big thing. All of those people went through the same thing I went through, which is at first they said, no, no, no, no, no, you're either fooling us or you're fooling yourself or something. And it turns out that what Jim's come up with is a really interesting set of trade-offs around distributed transactions. No one else is doing it this way. Where we are today is we are displacing the major relational database companies in major corporations around the world. The big banks, the big stock exchanges, lots of the big technology companies are ripping out the relational database products and replacing them with a cloud-style relational database, which is what this is. We call it a durable distributed cache. I just had a conversation with Andy. He likes to think of it as a shared-disk system. We probably disagree on that. I'm not gonna show you the demo. Let me just jump straight into the question of what this is all about. And by the way, this presentation is not a super academic presentation, so hopefully it's relatively lightweight, but I think it will be quite challenging for you technically. So we'll take it step by step. The first point is just: what are we talking about? We're talking about a database system where you add nodes, it goes faster.
And you take nodes away and it doesn't go as fast. Simple as that, okay? That's really hard to do with a relational database system. What you're looking at is actually quite an old chart, but it makes the point. It's going from one node to two nodes to four nodes to eight nodes to 16 nodes. And that goes from like 60,000 transactions per second to 800,000, something like that. Fairly linear scale for this particular workload. Your mileage will differ. Down the bottom left you'll see the latency, database-level latency. It's running at a lot less than a millisecond in terms of latency. That's because, as you will see, it's basically an in-memory system. And so this is what it's about. By the way, you could take away any of those nodes and it's just gonna go slower. And I really mean any of those nodes, including that first one or the first four or whatever. These are peer-to-peer nodes. You add them, it goes faster; you take them away, it goes slower. It's resilient to failure and so on. That's what elastic SQL is. When you look today, you're seeing more and more other companies starting to try and build this. You'll be familiar with Google Cloud Spanner. It's a version of this using a very different approach, much more of a kind of heavy-handed approach to try and get there. There's Cockroach and there's others that are now trying to get into the space. We've been out there for a few years and been very successful. So, excuse me, I put this up here as a kind of a point, which is I find sometimes it's quite hard to explain the system to database people. They're the worst people to try and explain the system to because they already know everything. And I think of it, I wasn't there, but I'm guessing that the guys that first introduced jet engines came along and they said, I've got this engine and it's phenomenal.
It can fly at 60,000 feet without a turbocharger, and it can fly faster than the speed of sound, and it can accelerate a 300-ton jetliner to take-off speed in 40 seconds, and the other guys go, no, you can't. To do that, your valves would float or your piston frequency would be too high or you're gonna blow your bearings or something. And the jet engine guy goes, no, no, no, you don't understand, we don't have valves, we don't have pistons, we don't have any of that stuff. Yeah, we do have some things that are similar. In this case, for example, we do have a compression phase and we do have an exhaust phase and all the other things that you have in a traditional engine, but it's different. It achieves the same end, but it's different. And so why do I say this? Because I really want you today, for the first few minutes, to forget everything that Andy's told you, okay? Much as it will be useful later in the presentation, please start fresh and work with me on what this thing really is. By the end of it, trust me, you will understand it, and then you'll have much smarter questions. Do ask questions along the way, because you will have some, but I think with a lot of them I'm going to say, hold on to that question. So what are the fundamental principles in the system, or the fundamental components in the system? Two things: things we call engines and things that we call atoms. If you don't understand those, there's not much point listening to the rest of what I've got to say. So let's try and figure out what those are. Engines very simply are processes, Linux processes, if you want to think of it that way.
An engine is a Linux process. A database consists of some number of those things running; they're all peers of each other, there's no central anything, there's no supervisor, there's no authority anywhere, there's no central name server. It's just a bunch of processes, peer processes, that are in some sense collaborating with each other, okay? When customers think about it, they think, well, what we're talking about is a machine or a VM or a container or something. Yeah, but it really is these processes. You could run them all on one machine, not a good idea. You can run them on lots of machines; you can split them between some number of machines. The nodes in the system are these engines that are basically processes. They're very asynchronous, both internally and externally; they're very highly multithreaded; they're very, very busy things, these engines. And you'll see how they fit in a moment. When I say they're peer-to-peer, they're also autonomous. So unlike a traditional top-down kind of design, these aren't systems where somebody says, these are the things that must all join up to be a database. Like when you think about partitioning a traditional database, you think, well, I've got four machines and I need to divide the database between the four machines; that's a top-down process. This is a bottom-up thing. These things can just join in and leave, right? An engine can join the database anytime it wants, and it can leave anytime it wants. We can put some constraints on that for management purposes, but architecturally, engines are dynamic. They come in, they go out, that's how we scale out. That's the whole point about scaling in and out. And then what they do is delegate as much as they possibly can to the next thing I'm gonna tell you about, which is called atoms.
And so what we're getting to is basically an actor pattern, but let's put those two things together in a minute. So, atoms. The most important thing to understand about this architecture is that it's all built around these things, which are basically container objects, smart objects that contain data. Any data in the system, not just your user data, your tables and your fields and everything else, but your indexes, your metadata, your everything. Everything that's persistent state, or even in some cases not persistent state, is one of these containers. The containers are intelligent in multiple ways. They can certainly serialize themselves to disk. They can serialize themselves to networks. They can replicate themselves. They can keep replicas up to date. They can do all sorts of things. And there are all these kinds of atoms. And so, just to talk about why that's very, very important, and you've probably heard this a zillion times in the context of Docker, but I'm gonna say it again: there's nothing interesting about shipping containers. They're just big steel boxes. Okay, what's interesting is that the fact that they're standardized means we can have a vastly more efficient ecosystem around those containers. Okay, so it's not the container that's changed the world's shipping industry. It's the fact that these containers can be loaded and unloaded from ships very, very fast, that they can get onto trains and out of the port as quickly as possible, that they can be handled in these highly efficient ways. And we've now built our world's ports and our cranes and our trucks and our trains and everything else to be around these containers. None of those systems need to understand what's in the container, obviously; that's the point. And doing that as an internal architecture of a database is radical.
It was one of the biggest breakthroughs that Jim came up with: to say, let's get away from thinking about tables in memory or tables on disk or anything like that. These are just container objects, and all of the interesting things that happen in the system are really about these containers, what we call atoms. You'll hear me talk about atoms quite a lot. So, what are the atoms? This is not supposed to be exhaustive. I'm certainly not gonna go through and explain what every one of them is, but just so you can get a feeling for it: there are a variety of kinds of atoms. All of them, obviously, are atoms. They're all handled by the system in the same way. They're instantiated, they're deleted, they're copied, they're stored, whatever else, in the same way. If you look at this kind of dark blue one on the bottom, it says what it is. That's where your user data would be, for example. But there are some more exotic ones. If you look at the ones that are called catalogs, these are the atoms that are used to find other atoms in the system. So, atoms are used as the mechanism for doing essentially a name service or a directory service. I just kind of put that up so you get an understanding. Everything, literally everything in the system is atoms. The maintenance of consistency between all those atoms is done in the same way, whether they're user data atoms or index atoms or any other kind of atoms. Everything's an atom and, therefore, the whole system turns into, basically, an atom processing system. At a deep level, NuoDB is just a bunch of these engines that are loading and saving and instantiating atoms. So, as we build up, that's what we're trying to do here, so that there's an understanding of how it all works. I want to just talk about atom life cycles. And to do that, I'm going to, for a moment, say to you: please forget everything you know about persistence.
We're going to talk only in-memory, and we're going to talk only read-only, because it's easier to understand that step before we try and understand the really hard stuff. So, what you've got here is five engines. You've got a bunch of atoms. There's only three in each one, but it could be any number. And I think of these as in-memory instantiated objects, container objects. Okay, that's basically what they are. And some of them are replicated. If you look around, you'll see that some atoms — number 12, for example — are on both engine one and engine two. There's nothing about the system that requires these things to be singletons, right? They can be replicated across the system, and we'll find out more about how that works in a minute. So, you've got engines containing atoms, and now we start getting into some of the slightly strange stuff. Traditionally in a database, if you're running Postgres or MySQL or Oracle or any of these things, if you're wanting to load data into DRAM, you're loading it off the disk, right? That's how it works. It's caching stuff off the disk and doing stuff with it. But here's the example. Engine one decides, I need atom number 56. It's loading it from someone else's memory, right? It does the same for atom number 91. What we're talking about here is that these engines are capable of loading atoms from any other engine, in memory. This is all in memory, right? And so engines can load atoms from any other engines at any time. And that, of course, could be 20 or 30 or 100 of these engines with these various objects. So, what you're starting to see here is something that feels like an in-memory distributed cache: the ability to load data between them.
I think I mentioned, and it's sort of implicit here, that all atoms have a namespace, they're numbered in that namespace, and they're unique. But it's not the instance that's unique, it's the content that's unique. The atom itself could have multiple replicas. So, you can do this kind of loading. Well, I want to just make a couple of points. First is, when I say loading, it's not moving that atom from one engine to another. What it's doing is creating an instance of a replica of that atom, okay? So, it's basically saying, okay, you've got that atom over there, atom number 23, I want atom number 23, I'm gonna construct it here, and it's going to be a peer of the one that you've got. Okay? It's not being moved. There's no sort of transactional moving of atoms around. What happens is I just create another one and it's got the same state. I recognize that instantly, for people like yourselves, you're starting to think, well, how do I keep them up to date with each other? I'm not gonna get into all sorts of transactional issues and two-phase commits and whatever, but we'll deal with that. Right, for the moment, it's read-only. That's why we're talking about it like that. So, probably enough on that. One point, again, is that when you do replicate an atom, you're not secondary to that atom, right? It's not like there's a superior relationship between the two replicas. The other thing that's kind of interesting is that you can drop atoms. Remember, I said that these engines are autonomous. They can at any time drop anything they want. The engines don't have to be the same executable, by the way. Anything that behaves according to these network protocols can take part, and it has to behave according to these rules. One of the rules is: we don't care what you do with these atoms. You can load them, you can drop them, you can do whatever you want, as long as you tell us which ones you've got so that other people can load them whenever they want.
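The replicate-not-move behavior just described can be sketched in a few lines. This is a hypothetical Python illustration, not NuoDB's actual API — the `Engine`, `load_from`, and `drop` names are made up — but it shows the key invariant: fetching an atom from a peer constructs a new local replica with the same content, and the source engine's copy is untouched; any engine may later drop its replica.

```python
# Hypothetical sketch: loading an atom from a peer engine creates a
# replica; nothing is ever moved or removed from the source engine.

class Engine:
    def __init__(self, name):
        self.name = name
        self.atoms = {}  # atom_id -> state (this engine's in-memory replicas)

    def load_from(self, peer, atom_id):
        """Construct a local replica of a peer's atom (same content, new instance)."""
        state = peer.atoms[atom_id]        # read the peer's copy
        self.atoms[atom_id] = dict(state)  # new instance, identical content
        return self.atoms[atom_id]

    def drop(self, atom_id):
        """Engines are autonomous: they may drop their replicas at any time."""
        self.atoms.pop(atom_id, None)

e1, e2 = Engine("engine1"), Engine("engine2")
e2.atoms[23] = {"rows": [1, 2, 3]}

replica = e1.load_from(e2, 23)  # engine one now has its own replica of atom 23
e1.drop(23)                     # dropping engine one's copy never touches engine two's
```

The point of the sketch is that `load_from` is a copy, never a transfer, which is why no distributed coordination is needed for a read-only load.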
Immediately, you'll ask some questions about how we ensure that we're not dropping valuable data on the floor, and we'll come to that in a minute. But the main point here is to understand that these are autonomous peers that are joining together, taking part, and acting as autonomously as possible in the system. There are exceptions to being able to drop things on the ground, and I'll get to that in a minute; it's quite important. In effect, some of these engines contract with the rest of the system. They commit to the rest of the system that they're not gonna drop atoms. Those are special ones, and we'll talk about those ones in a minute. But the general rule is you can load stuff and you can drop stuff whenever you want. It is basically a cache. We talk about it as this durable distributed cache. And so what you're seeing here: we've talked about how these atoms can be replicated between these in-memory nodes, how they can be dropped on these in-memory nodes. You've got cache algorithms. The cache algorithms don't have to be the same on all the nodes. They're local, and that's probably implicit. What's the answer to the question of how atoms map to pages? It's not a one-to-one mapping. One way to think about it is that atoms are like pages, but that can lead you down an unfortunate track, because atoms are much more live than that. As we'll see in a minute, an atom is an identifier, a state, and a queue of deltas. A page doesn't have that sort of liveness about it. So we'll get on to that. But yes, at a very basic level, you can think of atoms as being a bit like pages, if you think of pages as being a lot more intelligent. So, obviously most people think of databases as being about storage. Jim Starkey, my co-founder, regards this as the single biggest mistake that people make when they design database systems.
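The talk describes an atom as "an identifier, a state, and a queue of deltas," which is what makes it livelier than a page. Here is a toy Python model of just that shape — the class and method names are invented for illustration, not the real implementation:

```python
from collections import deque

class Atom:
    """Toy model of an atom: an identifier, a state, and a queue of deltas."""

    def __init__(self, atom_id, state=None):
        self.atom_id = atom_id
        self.state = state or {}
        self.deltas = deque()  # changes received but not yet folded into state

    def enqueue(self, delta):
        """Record an incoming change as a (key, value) delta."""
        self.deltas.append(delta)

    def apply_pending(self):
        """Fold queued deltas into the state, oldest first."""
        while self.deltas:
            key, value = self.deltas.popleft()
            self.state[key] = value

a = Atom(42, {"balance": 100})
a.enqueue(("balance", 90))
a.enqueue(("owner", "alice"))
a.apply_pending()
```

A disk page is just the `state` part; the delta queue is what lets replicas on different engines receive and apply the same stream of changes independently.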
And so what they do is they start with storage; they say it's a storage-centric transactional system, and then once you do that, you can't scale out, because now you're locked into some kind of single storage mechanism. And so far we've talked about something that behaves entirely in memory, and the question is: but then you've still got to do storage, don't you? And the answer is yes, obviously. And so imagine for a moment that there's one of these engines that we've been looking at on the screen that says the following three things. One: guys, I have all the atoms. If you can't find the atom anywhere else, you can always come to me, because I've got all the atoms. Okay, so one of these engines says, I've got all the atoms, guys. It also says, I'll always have all the atoms. I'm gonna keep them, right? So also, therefore, you're safe to drop them if you want to, because worst case, I've got them, right? It also says, I'm gonna undertake to keep them even under circumstances of power failure or whatever else that we normally think of transactional systems being able to survive. So those three things. It does say, by the way, I'm probably a bit slower, right? And as we're sitting here thinking, well, how does he do that? He's probably sticking it on disk or something, right? And that's why he's probably a bit slower. But other than that, I'm just like the rest of you. The only difference between me and you guys is I've got all the atoms, I'll always have all the atoms, and I'm guaranteeing that I'm gonna have them under certain special circumstances as well. And so you end up with something that looks a bit more like this, where this engine five has said those three things. I'm the guy that's got all the atoms. I'm faking the fact that they're all in memory. I'm slower than the rest of you. Notice something: none of the other guys know anything about whether it's stored on disk, let alone how it's stored on disk.
The disk storage strategies are nothing to do with the core database, right? In practice, that could be stored on any number of key-value stores. It could be stored directly on your file system. It can be stored in NVRAM; it can be stored anywhere you like. And the rest of the system doesn't know or care. All it sees is performance and guarantees, right? And of course, as I said earlier, what that does is it turns the other guys into a place where they go, well, in that case, I can throw stuff away whenever I want, because worst case, people are gonna just be able to go to this other guy that's got all the atoms, right? So I've come into it through the in-memory side and everything else, because it's very easy for people to misunderstand. When they see a disk on a diagram like this, they go, oh, I get it. Yeah, okay, let's start talking about that. I want to understand what your disk indexing structures are and everything else. Stop, right? That disk is really just a key-value store that's taking these atoms that are being serialized. They're being plonked on the disk in some format. Doesn't really matter, right? What really matters is the protocols that are going on between these guys in memory. And think about it: if engine one needs to get some data, as you can see, it's able to get it from any of these guys. Eventually someone might — like we've got engine two here — not be able to get object 62 from somewhere in memory, so it goes to the guy that's got everything on disk, and the guy that's got everything on disk says, here's object number 62. Once again, when he does that, it's not being moved, right? It's just being replicated. Remember I said that? Nothing ever gets moved. Engine two is just going to engine five and saying, I want to replicate that object and have my own private instance of it. Now it's sitting in two places, one on this guy, engine five, and one on engine two.
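To make the "disk is just a key-value store" point concrete, here is a minimal hypothetical sketch: an engine that promises to keep every atom, serializing them opaquely to a pluggable backend. All names (`KVBackend`, `StorageManager`, `store`, `load`) are invented for illustration; the real backend could be a file system, a key-value store, NVRAM, and the rest of the system would never know.

```python
import json

class KVBackend:
    """Stand-in for any backing store: file system, KV store, NVRAM..."""
    def __init__(self):
        self._data = {}

    def put(self, key, blob):
        self._data[key] = blob

    def get(self, key):
        return self._data[key]

class StorageManager:
    """An engine that keeps every atom; the backend never interprets them."""
    def __init__(self, backend):
        self.backend = backend

    def store(self, atom_id, state):
        # Atoms are serialized opaquely; indexing structure is irrelevant here.
        self.backend.put(atom_id, json.dumps(state))

    def load(self, atom_id):
        # Serving an atom replicates it; the durable copy stays put.
        return json.loads(self.backend.get(atom_id))

sm = StorageManager(KVBackend())
sm.store(62, {"rows": ["a", "b"]})
copy = sm.load(62)  # a replica for the requesting engine; sm still has atom 62
```

Swapping `KVBackend` for a different store changes nothing above it — which is the architectural claim being made.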
So the guys that are doing this kind of magic stuff about persisting things, we call them storage managers. Might not be a good name, because it tends to make people think about storage engines and things like that; they really are just engines that happen to do some backing store. And the other ones are called transaction engines. So if you go to our website or something, you'll see a lot of language about transaction engines and storage managers. They're the same thing, okay; it's just that one of them is the guy that's kind of volatile and DRAM-based, and the other is the guy that stores some of the atoms, okay. Not much else there that I need to cover, so let me just talk about this. It's not architectural how the data is structured in deeper storage. Sometimes people come along and say, yes, but is it a column store or is it a row store? That's back to the jet engine analogy. Let's not talk about spark plugs and let's not talk about valves, okay, wrong question. On the back end, it could be both. We could have an atom storage strategy which is column storage. We could have one which is row storage. We could change it tomorrow, right, on your running database. Wrong question. And the same is true of exactly what the underlying storage architectures are. We've run this thing on a wide variety of different things. Typically people run it directly on the file system, but you can do a wide variety of different things. So furthermore, there's nothing to say — given that that storage manager is really just a specialized transaction engine, what's to stop other transaction engines from just doing the same thing? Is there anything in the architecture I've just described to you that says there can only be one? No, nothing in it says there can only be one. I do understand, we're gonna have to talk about transaction commits and things like that in a minute, but just for the moment, we're thinking read-only and whatever.
No reason why engine four can't also say, you know what, I'll also have all the objects. Stick them on some storage mechanism. Maybe a different storage mechanism from the first guy. And now, if one of these other engines decides it wants to load a particular object, it gets a choice. It can go to one of the guys in memory. If that object exists in memory, doesn't exist in memory, I can go to one of the storage managers. By the way, I know upfront at very low cost, the cost, the response time of going to any of these guys. So when I've got two storage managers, and I know that the object is in both storage managers, I can look at the response time and go, I'm gonna pick the one that's network closest to me, okay? Which by the way, might not be that network closest to me if it gets really busy, because the cost function is gonna start favoring the other one and you get natural load balancing. Yeah? Is the cost function something that's built in when you build the system, or is it something that changes when you run the system, like something remembers how fast it's going to load? No, it's a heartbeat. It's a heartbeat between all the nodes. So it's constantly adjusting, which is why you get a constant load balancing, okay? Because if I've got a choice of five places to go and get an atom, and everybody goes to the first place, he gets busy, he's no longer top of the list, I'm gonna go to the next one. So the system just naturally, and that's typical of the kinds of algorithms that go on in the system, okay? So, but there's more to it. What I said, if you listen carefully, was I said, well, this guy's really a redundant copy of the other guy. Both of them have got all the atoms. Doesn't have to be that way. Actually, we recommend that you do it that way, because then you've got K-safety, right? You've got redundancy, and one of them can go away and it'll keep going. 
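The heartbeat-driven cost function just described can be sketched very simply. This is a hypothetical illustration (the function name and cost units are made up): each engine keeps a per-peer cost estimate that heartbeats keep current, and an atom is fetched from the cheapest peer that holds it, which is what produces the natural load balancing.

```python
def pick_source(costs, holders):
    """Choose where to fetch an atom from.

    costs:   peer -> latest heartbeat-derived cost (e.g. round-trip estimate)
    holders: set of peers known to hold a replica of the atom
    """
    candidates = [peer for peer in holders if peer in costs]
    return min(candidates, key=lambda peer: costs[peer])

# Two storage managers hold the atom; a transaction engine does not.
costs = {"sm1": 0.4, "sm2": 1.5, "te3": 0.2}
holders = {"sm1", "sm2"}

first = pick_source(costs, holders)   # the cheaper storage manager wins

costs["sm1"] = 3.0                    # sm1 gets busy; heartbeat raises its cost
second = pick_source(costs, holders)  # traffic naturally shifts to sm2
```

Because the costs are refreshed continuously, a popular source rises in cost and stops being chosen, spreading load without any central coordinator.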
But you could say, well, you store all the even-numbered ones, and I store all the odd-numbered ones, okay? Now we've partitioned the system. From the perspective of the transaction engines, they don't care. They just go, who's got it? Oh, that guy's got it. I don't care why he's got it. In fact, we don't do it based on atom numbers, but if you're gonna partition it, you would typically do it using a SQL predicate, okay? But you can partition it any way you want. You could take the population of the US and partition it by men and women or something, and you know, that's great. But you could also partition it by state or something. In fact, you could do that in parallel. You could have what we call one storage group, which is these two guys that have partitioned it based on sex, and another storage group somewhere else that's partitioning it based on first name or something, right? In the same system at the same time. So when we hear people talk about, can you partition the system? We're like, no, no, no, no, stop, right? You can partition the system lots of different ways at the same time, redundantly, okay? There's lots of power there, yeah. So all right, how do you ensure the consistency of the cache? I'm gonna come back to that. That's the hard part; that's really where the patent is, okay? Because a lot of this is fun and interesting. It's just good design. That's the hard question. So let's move quickly. We've already said this. Basically, it can store overlapping, it can be redundant, it can be partitioned, it can be whatever you want. We would normally say to people, don't deploy with a single storage manager. Why would you do that? Much better to have two or three or five. If you're running it in multiple data centers, put two in each data center, and so on. Oops, I just told you that it can run in multiple data centers. We'll get onto that. So what does the whole system look like? Pretty much what you expect.
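The idea of multiple simultaneous, predicate-based partitionings ("storage groups") can be shown in miniature. This is a hypothetical sketch — the names and the lambda-as-predicate encoding are illustrative stand-ins for the SQL predicates the talk mentions: two independent storage groups partition the same data by different predicates at the same time.

```python
# Two storage groups, each partitioning the same rows by its own predicate.
storage_groups = {
    # group 1: two storage managers split the data by sex
    "by_sex": {
        "sm_a": lambda row: row["sex"] == "M",
        "sm_b": lambda row: row["sex"] == "F",
    },
    # group 2: a different, simultaneous partitioning of the same data
    "by_state": {
        "sm_c": lambda row: row["state"] < "M",   # states A..L
        "sm_d": lambda row: row["state"] >= "M",  # states M..Z
    },
}

def managers_for(row):
    """Every storage group independently routes the row to one of its managers."""
    return {
        group: next(sm for sm, pred in preds.items() if pred(row))
        for group, preds in storage_groups.items()
    }

routing = managers_for({"sex": "F", "state": "PA"})
```

Transaction engines never see any of this; they only ask "who's got the atom?", which is why partitioning schemes can overlap and coexist.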
So, applications or app servers up there. You've got a thing on the right which I'm not gonna talk about much, which is a whole distributed system of its own. That's a Raft-based distributed system: a bunch of brokers and load balancers and agents and stuff like that that have got shared state. It's not really so much a directory service; it's really a domain manager, which handles the security aspects of the system, the load balancing, membership of the database, things like that. So there's that. And then we've talked about this layer, which is what we call the platform layer. It's the atom layer. And there are two pieces to it, the transaction engines and storage managers, which are really the same thing. That's what it looks like. And we're gonna have a few more diagrams that look like that. Before we do that, what about SQL? I've said all of this — sorry, question. So, good question. We're gonna go through that in greater depth, but here's the answer. When, say, application one wants to connect, there's five transaction engines, right? Who do I talk to? I go to the brokers, and they're redundant brokers. So I go to the broker and I say, which transaction engine should I talk to? There's a load balancing algorithm there, which could be round robin. It could be a hash-based thing. It could be a usage-based thing, whatever. It tells you which one to go to. And if that broker goes away, as I say, it's redundant. You go to the next broker, okay? In fact, that's also how it works when things fail. If that transaction engine goes away, the application goes, oops, what do I do? I go back to the broker. The broker gives me another transaction engine. Get going again, all right? So I've said all of this and I've hardly mentioned the word SQL. And yet if you go to our website, it's a SQL database, right?
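The broker behavior described above — hand each client a transaction engine via some pluggable policy, and hand out another one on failure — can be sketched as follows. This is a hypothetical illustration with made-up names (round robin shown; the talk also mentions hash-based and usage-based policies).

```python
import itertools

class Broker:
    """Hands clients a transaction engine; clients come back on failure."""

    def __init__(self, engines):
        self.engines = list(engines)
        self._rr = itertools.cycle(self.engines)  # round-robin policy

    def assign(self, exclude=()):
        """Return the next engine, skipping any known to be down."""
        for _ in range(len(self.engines)):
            te = next(self._rr)
            if te not in exclude:
                return te
        raise RuntimeError("no transaction engines available")

broker = Broker(["te1", "te2", "te3"])
first = broker.assign()                     # round robin hands out te1
second = broker.assign()                    # then te2
failover = broker.assign(exclude={"te3"})   # te3 is down, so skip to te1
```

On a transaction engine failure, the application (or a smart driver) simply calls `assign` again with the dead engine excluded, which is the whole failover story at this layer.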
SQL database, why is that? Because that's where the market is, okay? It's my little not-funny joke when I say that there's a 30 or 40 billion dollar SQL market and a zero billion dollar NoSQL market. And so we're after the SQL market. It's nothing to do with what the architecture can do. And so let me take you through it. This is basically what a transaction engine looks like. That's what a storage manager looks like. It's the same thing. I already told you that. It's the same executable, different command line flag. And what's inside them? Yes, there's a SQL engine sitting on top of the atom layer, and there's a KV API, a storage API, on the bottom of the engine. In the case of the transaction engine, it doesn't use that storage API. In the case of the storage manager, it doesn't use the SQL engine. The SQL engine doesn't really know it's talking to a distributed database. It's a conventional SQL engine. It's our own. It's a modern, next-generation engine: very good ANSI support, very rich, great optimizer, great execution engine, all built from scratch by us. It sits on top of every one of these engines that you see. Okay, so when a connection comes in, you're running that whole SQL parsing, SQL execution, SQL optimization, all of that stuff. And so you might say, well, gee, Barry, then couldn't you replace that engine with something else? The answer is yes. There's nothing special about it being SQL. Okay, those atoms don't know they're part of a SQL database. The storage on the disk is stored by value; it's a self-describing store. It doesn't know that it's a SQL database. The SQL part is really only the top part of each of these transaction engines, plus the client drivers and everything else. And so is it possible for this to be a JSON store? Yeah, we have one in the labs. Is it possible for it to be a graph store? Yeah, we have one in the labs. It's completely possible, yeah. So, do the operators in these engines consume atoms or tables?
So the layer of the engine which is the atom layer deals only in atoms, and there's nothing else. But it gets asked to hand atoms up to the SQL layer. And the SQL layer has a thin shim that understands it's not a standard ISAM API, it's an atom API. It basically says, I want atom number 791, and it gets it. And then at the SQL level, it just treats it like a conventional SQL database. Yeah. So it could in fact be a non-SQL database. That's not where we're going, okay; we may or may not take that to market at some point. What we're really interested in is being the world champion at being a distributed SQL database, and we think we're basically there. Now I'm gonna walk you through a select, and the question earlier about how things are connected. Here we go. So application one connects to the broker. The broker says, yeah, you need to go to transaction engine one. The select statement, which is a read of course, comes in; it connects into that transaction engine. Nothing very interesting there. Sends the select query. At that point, this transaction engine says, oh, I need the following. Basically it's gone through the SQL parser, it's built up an execution plan, and at the bottom of the execution plan it says, oh, I need the following set of atoms. It goes and fetches those atoms. If it can, it'll fetch them from its own memory. Failing that, it'll fetch them from nearby memory, not the memory of a machine that's in London. And then if it has to, it'll go to one of the storage managers. It gets all those atoms together, and from there on, as we just said, it's kind of like a conventional SQL system, right? At that point, it's got all the data, it does what a conventional SQL system does, and it hands back the result. So the JDBC drivers talk to the broker, you run your application anywhere, and it'll figure out where to direct that? Correct, yeah, correct, correct, yeah, correct.
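The demand-loading order in that select walkthrough (own memory, then a nearby peer's memory, then a storage manager) can be sketched as a tiered fetch. Everything here is a toy stand-in; real atoms are binary objects, not strings, and the caches are real network peers.

```python
# Toy sketch of demand loading: cheapest source first.
def fetch_atom(atom_id, local_cache, peer_caches, storage_manager):
    if atom_id in local_cache:                 # already in this TE's memory
        return local_cache[atom_id]
    for peer in peer_caches:                   # next: nearby memory
        if atom_id in peer:
            local_cache[atom_id] = peer[atom_id]
            return peer[atom_id]
    atom = storage_manager[atom_id]            # last resort: a storage manager
    local_cache[atom_id] = atom
    return atom

local = {}
peers = [{791: "atom-791 (peer copy)"}]
sm = {791: "atom-791 (durable copy)", 12: "atom-12"}
a = fetch_atom(791, local, peers, sm)   # served from peer memory
b = fetch_atom(12, local, peers, sm)    # falls through to the SM
```

Note that every fetch also populates the local cache, which is what makes the "in-memory from then on" behavior of the TE work.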
And in some cases, we've been able to even abstract failure like that, so that when a transaction engine goes away, the JDBC driver itself goes to the broker, finds out where to go next, and reconnects for you. With some kinds of client-side drivers, that's not really possible. That's not a big deal anyway, right? So that's it. That's your select, yeah. Presumably for optimizations, could you use that time to preload atoms, or know where they already are? So atom affinity is kind of a very interesting topic, and I'm keen to talk to some of you smart guys about this, because it's really kind of cache management, right? It's a kind of cache algorithm, is what you're really talking about: how does a connection that's coming in figure out the affinity of the atoms that are already preloaded onto this particular node? The simple way that our customers do that today is by having a hash-based load balancer, right? So the client effectively is saying, this is the part of the database that I'm talking to, here's my hash key, right? And we just use that to redirect you back to the same transaction engine. But that's a fairly crude thing to do. That could be automated, yeah. I'm sure everybody understands MVCC. Jim, my co-founder Jim Starkey, was the guy that built it into the first commercial product back in, I don't know, 30 years ago, in Rdb/ELN from Digital, and also into InterBase, which is another of his products. It's an alternative to lock-based concurrency control. You know this: readers don't block writers, writers don't block readers. This is what Jim's whole life has been about as a database guy, and so as you would expect, NuoDB is kind of a steroids version of MVCC, right? This is distributed MVCC done by the guy that kind of invented it.
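The "crude" hash-based affinity described above (client supplies a hash key; the broker maps it to a stable TE so the relevant atoms stay warm there) might look like this. The function name and key format are invented for illustration.

```python
# Toy sketch of hash-based affinity routing: the same client-supplied key
# always lands on the same transaction engine, keeping its atoms cached there.
import hashlib

def affinity_route(hash_key, engines):
    digest = hashlib.sha256(hash_key.encode()).digest()
    return engines[int.from_bytes(digest[:8], "big") % len(engines)]

engines = ["TE1", "TE2", "TE3"]
te = affinity_route("customer-42", engines)
# Deterministic: re-routing the same key gives the same engine.
assert te == affinity_route("customer-42", engines)
```

The crudeness Barry mentions is visible here: the mapping ignores actual cache contents and load, which is why he suggests it could be automated with something smarter.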
So I don't wanna spend too much time on it because we don't have a lot of time, but what I wanted to just make sure everyone understands is that, if you wanna think of it at a record level, what's inside these atoms is MVCC, right? And that's the way to understand it as we move on to updates. So updates are quite hard. I'm warning you, if you're thinking hard, you're gonna come up with 70 questions that I haven't answered up front, but let's talk about them. Same deal, we're gonna do an update. I've cut it down to one transaction engine and one storage manager, so we can't get too confused, and off we go. So we go to the load balancer, of course; it sends us to the transaction engine. We send our update query. The update query says, hey, I need all those objects. The only place to get the objects from is a storage manager. It calls the storage manager and says, I want to create my own instances of those objects. Off we go. It does the mutation locally, right? So this is the first thing where we're doing an update. That update query said, you need to change these atoms. So you did, right? What does changing the atoms mean? It means creating new MVCC versions, right? New versions of the records. That's what this is. That's why they're red. So now you have atoms up here that have new versions of the records, as yet uncommitted. They're dirty, right? They're dirty versions at this point. And the green ones down here don't know anything about it, right? So what happens next is that we get asynchronous replication going on. That's important. Remember I said, the transaction engine never just takes this updated thing and says, here, here's the whole atom, go and store it. That's not the architecture. The architecture is to say to the guys locally, oh, what are the differences between your last version of this record and the new version of this record? It's these things. Okay, great. Send that replication message asynchronously.
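The delta idea at the end of that passage (ship only what changed between record versions, not the whole atom) can be sketched as a field-level diff. The real messages are small binary blobs on the order of 20 bytes, batched and encrypted; the dict diff here is just to show the shape of the idea.

```python
# Toy sketch: compute the difference between the last record version and
# the new one, and queue only that delta for asynchronous replication.
def record_delta(old_version, new_version):
    """Field-level diff; real NuoDB deltas are tiny binary messages."""
    return {k: v for k, v in new_version.items() if old_version.get(k) != v}

old = {"id": 7, "name": "Alice", "balance": 100}
new = {"id": 7, "name": "Alice", "balance": 85}
delta = record_delta(old, new)        # only the changed field travels
replication_queue = []                # drained asynchronously, in batches
replication_queue.append((7, delta))
```

A transaction doing millions of mutations thus generates millions of tiny queued messages rather than re-shipping whole atoms, which is what keeps the replication traffic small.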
Queue it to the other guy, to your kind of replica, right? And that gets sent off whenever it gets sent off, right? And that happens. Now, within a transaction, you could do 10 million mutations of those objects. And what's happening is we're pumping out these replication messages. They're about 20 bytes, typically, okay? They're very, very small. They're batched, they're asynchronous. They're also encrypted, but anyway. And so those are being sent off, sent off, sent off to these other guys. And what's the other guy doing on the other end? As they're coming in, he's saying, okay, I'm kind of creating these dirty records. They're not canonical. It's not part of the database. It's not been committed. But, you know, keep going. And eventually, of course, what happens is the client sends a commit or a rollback. The client has said commit or rollback, or the system said rollback. The rollback is easy. It's a no-op. It's just, by the way, you know that stuff I sent you? Forget it. Doesn't matter, right? The commit is, oh, you know that stuff I sent you? It's now canonical, right? Exactly how the commit works is quite a long discussion, and it's quite an important discussion. I hope we can get to it. But I just wanted you to understand that a simple update, that's how it works, okay? But when you do a commit, the commit has to be synchronous, right? You have to know that your deltas got there. Correct. Correct. Correct. You need to be able to say flush, but, oh, by the way, I told you about these other deltas. Make sure those get in. Yes, is the answer. Okay. There's more to it, but yes. Okay. Okay, so that was a trivial one, Barry. That's not how we run our systems, okay? So what about a multi-replica update, okay?
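The asymmetry just described (rollback is a no-op that discards the dirty versions; commit promotes them to canonical) can be sketched from the replica's point of view. This is a toy model with invented names, assuming one dirty-version set per transaction.

```python
# Toy sketch of a replica holding canonical record versions plus dirty,
# uncommitted versions streamed in by a transaction's deltas.
class ReplicaAtom:
    def __init__(self, committed):
        self.committed = dict(committed)  # canonical record versions
        self.dirty = {}                   # uncommitted deltas, keyed by txn

    def apply_delta(self, txn_id, delta):
        self.dirty.setdefault(txn_id, {}).update(delta)

    def rollback(self, txn_id):
        self.dirty.pop(txn_id, None)      # "that stuff I sent you? forget it"

    def commit(self, txn_id):
        # "that stuff I sent you? it's now canonical"
        self.committed.update(self.dirty.pop(txn_id, {}))
        return "ack"                      # the commit is the synchronous part

replica = ReplicaAtom({"balance": 100})
replica.apply_delta("txn-1", {"balance": 85})
assert replica.committed["balance"] == 100   # still dirty, not canonical
replica.commit("txn-1")
```

The flush the student asks about corresponds to making sure every queued `apply_delta` has arrived before `commit` is acknowledged.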
So what's changed here is that by having more of these transaction engines, and by having another storage manager, now we've got multiple replicas, right? So let's see what happens. Same deal, we ask to connect. The broker says, that guy over there doesn't seem to have been doing any work, he's got nothing in cache, go for it, right? So we connect to that guy. Same deal, we send in our update. In this case, we go, oh, we don't have to fetch everything from down here. We can fetch it from these other places. So we'll go do that. One of them we have to fetch from down in the storage manager. We'll do that. And we go ahead and we do our mutation, which is the same thing. So now we've changed these atoms in some way. And once again, now we've got to do our replication stream. And what happens is, as you would expect, we go back to the guy that we replicated it from, and we say, here's the replication stream, off you go. That's great, isn't it? No, not great. The reason is because there are other replicas in the system. And so you'll notice that in fact this guy, atom number 62, didn't just have to update the 62 on TE4. He also has to update the 62 on SM2, right? And so this is kind of a PubSub mechanism. Think of it as PubSub and you're sort of getting there. If you've got atom number 62, you're interested in everything that happens on 62, whether or not somebody replicated from you, okay? At this level, this replication stream that we're talking about is basically being published out to everyone that's interested in it, okay? And so in the more general case, and this is why I simplified it, the replication update goes to every interested replica; that's important. Now, this is the area Andy and I were talking about. Because we didn't just replicate to the storage manager. If we just replicated to the storage manager, that would be a shared disk architecture, okay?
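The PubSub framing above can be sketched directly: every node holding a copy of an atom is a subscriber to changes on that atom, regardless of which node produced the change. Class and method names are invented for illustration.

```python
# Toy sketch of interest-based replication fan-out: a change to atom 62
# is published to every node holding 62, except the node that made it.
from collections import defaultdict

class AtomBus:
    def __init__(self):
        self.subscribers = defaultdict(set)   # atom_id -> nodes holding it

    def subscribe(self, node, atom_id):
        self.subscribers[atom_id].add(node)

    def publish(self, sender, atom_id, delta):
        # Fan out to every interested node except the sender.
        return sorted(n for n in self.subscribers[atom_id] if n != sender)

bus = AtomBus()
for node in ("TE3", "TE4", "SM1", "SM2"):
    bus.subscribe(node, 62)
targets = bus.publish("TE3", 62, {"balance": 85})
```

This is exactly why it is not a shared-disk design: TE4's copy of atom 62 is updated directly, not by re-reading the storage manager.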
Because then everyone else would have a dirty atom and they'd have to go and pick it up off the storage manager. What's happening here is that the guy up there, 62 up there on TE4, is fully up to date, right? He doesn't have to go to the storage manager. The storage manager is almost just a kind of log, as far as he's concerned. So of course we do have to do the same thing with the commit and rollback. How are we doing on time? Yeah, okay, so good question. Yeah, that's a good question. So remember earlier on, I showed you the different types of atoms, and there's one kind I showed called a catalog, okay? So when your transaction engine is bootstrapping, you load one object, it's actually called object number one, which is the root of the catalog. It's a tree; the catalog's a tree, okay? And the catalog is essentially a name service. The catalog says, you know, here are all the atoms in the system and here's where you'll find them right now, okay? It also tells you the latest of what we were talking about, the cost function, the heartbeat; that's also in the catalog, and there's a bunch of other stuff that's in the catalog. Now you're gonna say, well, how does the catalog keep up to date? Using exactly the same mechanisms. That's why you use atoms. And remember I said, it's great to have everything being a container. The catalog itself is a distributed system. There's no central directory. It's not like Hadoop, where there's a central directory service; this directory service is completely distributed and partially replicated. I don't load the whole catalog tree. I just load the pieces of it that I care about, right? So, let's see if we can move to the hardest part. I'm trying to make sure I've got time for questions here. Okay, so what about update conflicts? This is not some sort of single-user analytics database. This is millions of concurrent users. It's a transactional, OLTP-style, kind of operational database.
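The catalog described above (a tree rooted at object number one, acting as a name service, loaded only partially) can be sketched as a lazy path resolution. The tree layout and field names here are invented; the point is that a lookup touches only the catalog pages on its path, never the whole tree.

```python
# Toy sketch of the catalog: a tree rooted at atom 1 mapping atoms to the
# nodes currently holding them. A TE demand-loads only the pages it walks.
catalog = {
    1: {"children": {"tables": 2}},                 # root, loaded at bootstrap
    2: {"children": {"accounts": 3}},
    3: {"locations": {791: ["TE1", "SM1"], 12: ["SM1"]}},
}

def resolve(path, atom_id):
    loaded = {1}                                    # root is always loaded
    node_id = 1
    for step in path:
        node_id = catalog[node_id]["children"][step]
        loaded.add(node_id)                         # demand-load this page only
    return catalog[node_id]["locations"][atom_id], loaded

locations, loaded_pages = resolve(["tables", "accounts"], 791)
```

Because catalog pages are themselves atoms, they are kept up to date by the same delta-replication machinery as everything else, which is the "same mechanisms" point in the talk.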
And you're gonna have lots of people updating in parallel, and Barry, are you really sure that you can have a hundred TEs and everybody updating in parallel and this thing's gonna behave in a transactionally ACID fashion? The answer is yes, and this is a big part of it. So, these two guys now, two applications, are saying, you know what, I want a TE. I want to connect to a TE, and they go to the brokers and the brokers say, yeah, sure, here are the guys to connect to. They connect to them, usual stuff, they send their updates. I changed the colors a bit so they're not too confusing. You can see that this guy, app one, goes and loads a whole bunch of things from various places. This guy, TE3, loads a bunch of things from different places. But if you're looking carefully, you'll notice that actually they've got the same data in some cases. So Atom 6 has been replicated onto both of them. So has, yeah, whatever; at least Atom 6 has. And so what's gonna happen if they mutate them? You've got a problem, right? Now you've got a situation, and this is the hard problem; this is where a lot of the kind of two-phase commit stuff comes in, right? How do you deal with the fact that these two separate transaction engines, that are by definition highly asynchronous, extremely decoupled, kind of carrying on in their own way, suddenly both got the same atom and they both wanted to update that same atom? How do you deal with that? What we don't do is try and fix it at commit time, okay? So in so many ways this system's an optimistic system, but as relates to update conflict, it's a pessimistic system. What it does is that you have basically a distributed serialization service, which has got a bad name as well; it's called the chairman, and that makes it sound more powerful than it is. It's really just a serialization service, and it's distributed, and it's got failover mechanisms and so forth.
And what happens is both of these guys, these two nodes, these two engines, TE1 and TE3, send messages to the serialization service asynchronously to say, you know what, I'm changing this thing. Let me know if that's a problem. Sometime in the future they might get a message back to say you got it, or they might get a message back to say you didn't get it, right? And by the way, again, they're autonomous; theoretically, they can do anything they want with that answer. I mean, they can't update it, but they can at that point hand back a transaction failure to the application, they can do all sorts of things. In practice what they tend to do is wait to see if the guy that did get it actually rolls back, in which case, I'm just gonna continue, because you don't want to get into a livelock, and you guys are probably quite familiar with those kinds of issues. So what happens is, in this case, TE3 lost, okay? It went to the serialization service, and the serialization service said no. So TE1 goes ahead and does its commit, or its rollback. Couldn't you describe the chairman as strictly a kind of lock? I'm not gonna argue with that. Not gonna argue with that. No, it doesn't make me uncomfortable, but I think what is interesting, I mean, there's a lot of interesting pieces to it. The reason I have no problem agreeing is because, by definition, anything that does what I just described, that does a kind of conflict avoidance, is some kind of a lock. It has to be, right? But I think that what's interesting here is it's very asynchronous, right? What's really happening there, well, this is a very simplified kind of toy-town thing. You've actually got millions, typically, of atoms for a particular query. An atom, by the way, just to say, is probably 50 to 100 KB in size, okay? So they're not that big. And so you've got millions of atoms, and this kind of conflict resolution doesn't happen at atom granularity.
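The chairman's behavior, as described, amounts to first-asker-wins arbitration per record: later askers are told no and can wait, retry, or fail the transaction back to the application. A minimal sketch, with invented names and ignoring the distribution and failover machinery:

```python
# Toy sketch of the "chairman" serialization service: the first TE to ask
# about a record wins the right to update it; later askers are told no.
class Chairman:
    def __init__(self):
        self.holders = {}                 # record key -> winning TE

    def request(self, te, record_key):
        winner = self.holders.setdefault(record_key, te)
        return winner == te               # True: you got it; False: you didn't

    def release(self, te, record_key):    # on the winner's commit or rollback
        if self.holders.get(record_key) == te:
            del self.holders[record_key]

chairman = Chairman()
got_te1 = chairman.request("TE1", ("atom-6", "row-3"))
got_te3 = chairman.request("TE3", ("atom-6", "row-3"))  # conflict: loses
chairman.release("TE1", ("atom-6", "row-3"))            # TE1 finished
retry_te3 = chairman.request("TE3", ("atom-6", "row-3"))
```

As Barry concedes, this is "some kind of a lock"; the interesting part is that the requests and responses are asynchronous and pipelined, so the TE keeps working while answers trickle back.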
It happens at much finer granularity than that. But what's happening is that the query is most likely trawling through very large numbers of atoms and sending out these messages, these asynchronous messages, to say, have I got it? Have I got it? Have I got it? And then it's sitting there waiting for the responses, right? And so there's a lot of pipelining and interleaving of it and not very much synchronous waiting, right? So it's asynchronous, but strictly a lock, yeah. Okay, yes. So that's pretty much all I wanted to say about some of the deep-down stuff, because I think there's gonna be a lot of questions, or maybe some questions. But what I did want to do is just give you a kind of a sense of, okay, so why is this so exciting to users? And remember I started with this idea, this chart, of just being able to add nodes and it just goes faster, okay? And so imagine for a moment, here we've got a system, a single database. By the way, I should have said earlier: at all times, because of the structure of this thing, it is a single logical database. The application, the JDBC application, only sees a set of tables and a set of rows, and that's it, right? There's no kind of notion of partitions or anything else at the application level. And in fact, those kinds of partitions and other topological changes could be happening while the application's running. It's completely independent and orthogonal. So the application here is running. It's got a million users or something cranking away, and it turns out that we've hit our limits. We've hit our performance limits, let's say, and so we want to get more throughput. And so what happens? Well, when you walk through it, it's actually amazingly trivial. Once you've got a system where you can add nodes and take nodes away at will, these answers are all very simple, right? What happens?
You bring along your new machine, you've just gone and bought it at Best Buy, you take it, you fire up your Linux, you install our stuff, you connect it in, you have to give it some credentials so that it can connect into the database, and it's there. It sits there, okay? I already told you it actually does something else: it loads atom one, right? And it sits there, it's waiting, right? It's not doing anything, it's not being told to do anything, it waits. It waits until this guy decides, oh, you know, I now want to make a connection. The broker takes a look and says, gee, there's a new transaction engine in town, the guy is not very busy, I think he needs some work. And so, boom, you've bootstrapped it, right? Immediately that connection happens, some kind of query gets issued, demand loading of all the atoms that you need, off you go, right? This can happen in, you know, 50 milliseconds, or whatever it takes to fire up a Linux process. Sometimes people are so excited about what some of the cloud architectures can do, but literally when we do this on Amazon, we're sitting waiting for them to give us a machine, because the database comes up like that. I'll give you an example, which we did as a kind of almost provocative thing. Hewlett-Packard came to us a couple of years ago and said, we've got this thing, it was called the Moonshot box, which, I don't know if you're familiar with it, but Moonshot was this kind of high-density server. Basically it's a 4U rack with like 45 little microservers in it, each with eight gigs of memory and an Intel Atom processor and all this kind of stuff. And they said, gee, we want to take this to market, we're gonna do this big launch. You guys are a distributed database; why can't you guys run some great benchmark on it for us?
And we said, well, because those are not very good processors, and, you know, it's fine, but for the same money you could actually go faster on your older machines. Tell you what we can do: let's break the world record for the number of databases running on a $60,000 server. And they said, well, okay, whatever that means. So we built an emulator for WordPress, where basically each WordPress account had, you know, 1,500 entries and this amount of access and update and everything else. And we ran a NuoDB system across that Moonshot box. We ended up running, on that box of 45 servers, 60,000 databases at the same time. It was actually more, but let's call it 60,000. And you ask, what were we really doing? Well, actually, we were cheating. We were cheating because it wasn't 60,000 running at once, it was 6,000 at any given moment, which is still a gigantic number. In practice what we were doing was hibernating databases, because it's so quick to start these things up. What we would do between queries is say, oh, this guy doesn't seem to have been doing anything for 10 seconds, shut down the database; we can get it up in much less than a second, okay? And we have a patent on database hibernation as a consequence. And so when you've got a database system that's so lightweight, so easy to fire up nodes quickly, right, it changes the dynamics completely. We also did something, just parenthetically, with it, which is that we said, guys, we'll do something else for you. Put one of your big servers, whatever they call them, DL380s or one of their big servers, next to it, adjacent to the Moonshot box. And what we'll do is, when one of these databases gets hot, we'll move it without taking it down, right? What did we do? Well, we notice, obviously, it gets busy. We go to the other machine, which is on the same network. We fire up an engine, a TE, right? We redirect the work to there.
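The hibernation trick in the Moonshot story can be sketched as an idle timer plus a wake-on-demand restart. The 10-second threshold comes from the talk; everything else (names, the clock being passed in explicitly) is invented for illustration.

```python
# Toy sketch of database hibernation: shut down after idling past a
# threshold, transparently restart on the next query.
class HibernatingDatabase:
    IDLE_LIMIT = 10.0                    # seconds of inactivity, per the talk

    def __init__(self, now=0.0):
        self.running = True
        self.last_used = now
        self.restarts = 0

    def maybe_hibernate(self, now):
        if self.running and now - self.last_used > self.IDLE_LIMIT:
            self.running = False         # release the node's resources

    def query(self, now):
        if not self.running:             # wake on demand; sub-second in practice
            self.running = True
            self.restarts += 1
        self.last_used = now
        return "result"

db = HibernatingDatabase()
db.query(now=1.0)
db.maybe_hibernate(now=15.0)             # idle for 14s > 10s: hibernate
db.query(now=16.0)                       # next query wakes it back up
```

The economics of the 60,000-database demo fall out of this: only the ~6,000 currently-active databases consume memory at any moment.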
We shut down the old TE on the little machine, right? From the user's perspective, they don't see anything. They've just moved onto a bigger machine without missing a heartbeat, without losing any data or anything, okay? So that's called database bursting. We have a patent on that as well. So, adding a TE. We're not talking about what people talk about: well, yeah, I've got this great sharded system, and then I can actually put it onto more shards. Yeah, but to do that, you have to go back and repartition and do all sorts of stuff. We can just add as quickly as we like. One way of thinking about what's going on at this top level is that this demand loading that goes on is a kind of self-organizing in-memory partitioning. It's like a self-organizing in-memory partition. And they're overlapping partitions, obviously. So that's pretty cool. All the usual stuff. What about losing a TE? Well, we already said these are DRAM systems, right? They don't touch the disk. All of the data that's sitting in that transaction engine right now is one of two things. It's either committed, in which case we can lose the TE, or it's not committed, in which case all that's gonna happen is that the client is gonna get a transaction failure, because there's gonna be a connection failure. It's gonna turn into a transaction failure, and your application is gonna have to retry the transaction, which it should be designed to do in the first place. So you can always lose a TE and there is no consequence. Now, there's quite a lot of under-the-covers stuff that we have to do to reconstruct some of the system state that's distributed and so on. But to a first approximation, you can go along and you can shoot any TE anytime. You can shoot all the TEs if you want to and restart another one. And you're not gonna lose a transaction. You're not gonna lose any data. You're gonna keep going.
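The retry behavior the application "should be designed to do in the first place" can be sketched as a small client loop: a dead TE surfaces as a connection failure, the client goes back to the broker for another TE, and retries the whole transaction. All names are invented; real drivers would reconnect through the broker rather than scan a list.

```python
# Toy sketch of client-side retry: on connection failure, ask the broker
# for a different TE and retry the transaction there.
def run_transaction(broker_engines, live_engines, txn, max_attempts=3):
    tried = set()
    for _ in range(max_attempts):
        te = next(e for e in broker_engines if e not in tried)
        tried.add(te)
        if te in live_engines:           # connection succeeds: run the txn
            return txn(te)
        # connection fail -> transaction fail -> go back to the broker
    raise RuntimeError("transaction failed on all attempts")

result = run_transaction(
    broker_engines=["TE1", "TE2", "TE3"],
    live_engines={"TE3"},                # TE1 and TE2 have been shot
    txn=lambda te: f"committed on {te}",
)
```

This is also why "shoot any TE anytime" is safe: committed data is already replicated elsewhere, and uncommitted work simply reruns.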
So how do you cope with spikes in transaction latency in these reconfigurations? How do you keep those latency spikes down? I mean, it's always gonna be a challenge to do things like that. We tend not to see significant latency spikes of the sort that you're describing. When a restart is as quick as it is, then it's not like you're sitting there when a node goes away, okay? There's gonna be a delay for a reconnect. There's also gonna be a reduction in capacity until we've fired up another one, right? And you can have different strategies. Some people will deliberately over-provision: we'll run an extra TE or two, whatever. But I think the major latency hit is the reconnect. So it's still sub-second. So TEs can go away. Now let's talk about adding a storage manager. So we've got the system running, and I walk up to it with my shiny new machine and I say, oh, you know what? I want another storage manager. Why? Because I think it's more reliable to have redundancy, or I want to partition it differently, or something. So I stick the storage manager in and connect it up, again supplying the credentials, and what does it do? It takes a look and it says, oh geez, guys, you guys have got 10 million objects and I've got one, or zero, whatever it is, okay? There's a problem. It starts loading them, right? How does it load them? Same deal. It loads them from memory if it can. It loads them from wherever it's fastest to load from, okay? So it just sits there until it's caught up, right? So a new storage manager comes online. It figures out what objects it needs to load, loads those objects, and then it puts up its hand and says, guys, I'm ready. I can take part. Suddenly, everybody else that wants to load objects has another option of where to get objects from, okay? So, very straightforward. Again, there's detail there about the configuration: what do you want it to store?
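The new-SM catch-up just walked through can be sketched as: inventory what's missing, load each atom from the cheapest source, then announce readiness. A toy model with invented names, using in-memory dicts for both the peers' memory and the existing SM's store.

```python
# Toy sketch of storage manager catch-up: fill in missing atoms from the
# fastest available source, then join as a full peer.
def catch_up(new_sm, peer_memory, existing_sm):
    missing = set(existing_sm) - set(new_sm)
    for atom_id in sorted(missing):
        # Prefer another node's memory over another SM's disk.
        source = peer_memory if atom_id in peer_memory else existing_sm
        new_sm[atom_id] = source[atom_id]
    return "ready"                       # hand goes up: "I can take part"

new_sm = {}
peer_memory = {1: "catalog-root", 62: "atom-62"}
existing_sm = {1: "catalog-root", 62: "atom-62", 99: "atom-99"}
status = catch_up(new_sm, peer_memory, existing_sm)
```

The same routine covers the "poor man's backup" case later in the talk: an SM that was unplugged for two weeks just has a larger `missing` set to work through.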
Do you want it to store the whole database, or only the data for this continent, or whatever it is? But that's the basic idea. So the nice thing here is, and by the way, this also works for partial catch-up. So you could, for example, say, I want to have a poor man's backup. I'll just switch off one of these storage managers, put it in my desk drawer for a couple of weeks. If I bring it back and put it back into the system, it's gonna take a look and say, oh, I need to catch up. It'll catch up and get going again. Once it's connected, by the way, and up and running, it is fully a peer of the others. It's not secondary, right? This isn't a master and a slave or whatever. Once it's up and going, you can just switch off the others, right? Let's suppose it's one that you configure to have all the objects. You literally can add this one in, have it catch up, and shut all the others down. If you think about that, that means you can move a database while it's running. I could start one up in the Amazon cloud, for example, joined into my database that's running in my data center, and wait. At some stage it's gonna say, I'm ready and I'm running and everything else. I can fire up some transaction engines in the cloud, shut everything down in the data center, and my whole database is still running. I've kept all my data. I was running at a million transactions per second; I'm still running at a million transactions per second. People talk about data gravity as being one of the big problems. We're like, yeah, kinda, right? We don't really think it's a problem when you've got this kind of architecture. Sorry, what's data gravity? So, well, it's a marketing term. Data gravity is this idea that now we've got more and more dynamic systems. You can run Docker elements any way you want, and microservices can be redeployed, and all those kinds of stuff. But data you can't move. Data gravity is this idea that data is welded to the floor somewhere.
And so, in these wonderful cloud stacks that we're building, everything is dynamic and everything is movable and everything can move around, but data can't. Data is stuck where it is. And so, data has gravity. It's being pulled down to Earth. And what we're saying is, well, yeah, if you've got an old-fashioned database, that's true. And this is kind of an obvious one: losing an SM. It depends again on how it's configured, but you presumably configured redundancy into your system. Your disk drive crashes, the SM goes away, the operator gets told, fire up another SM, end of story. So, just to summarize. What you've got here is a system that's a rich ANSI SQL database system. If you look at who our customers are, they are literally moving Oracle and DB2 and SQL Server and all that stuff onto us. Because why? Because it is not a lightweight SQL implementation that doesn't work well when you do joins or something like that. It's a rich ANSI SQL implementation, full ACID semantics, MVCC style. It's cloud native. All of this stuff that we're talking about: cloud native. By the way, it's all driven by REST APIs. This stuff we were talking about earlier, about what happens in the admin layer: the admin layer exports these REST APIs, which are fire up an engine for me here, fire up an engine for me there, shut down that engine over there, reconfigure. It's all REST APIs. It's all manageable using Kubernetes or your favorite scripting language. Very cloud native, very scale-out and elastic. In-memory performance: again, I was talking to Andy about it earlier. People talk about, well, there are these other elastic SQL database products, and the answer is, yeah, but let's talk latency for a moment. We've got in-memory latency, really, really high speed. That thing that we showed earlier, when I showed you that chart and I said that our database-level latencies are much less than a millisecond.
At the time that we did that test, we actually compared it to a well-known document database, and we found that we were 10 times faster, and that was supposed to be a really fast database. Okay, so. So, on-demand capacity, we talked about that. Continuous availability: more and more important, all applications nowadays are 24/7. Admin automation, we talked about what that's all about. And what we haven't really talked about is the fact that, other than the commit protocols, which as Andy says quite rightly can be very important, almost everything else in the system is asynchronous. And what that means is that we're much, much less sensitive to high-latency networks. If you wanna put it a different way, that means we can run on multiple Amazon regions, we can run in multiple data centers. And it really is a case of, if I've got a bunch of nodes running in Manhattan, I can walk over to a Jersey City data center, add some nodes into a database that's running in Manhattan, and instantly I've got a distributed database running across that network. Are there trade-offs there? Of course there are, but it will do it a lot better than anyone else will, because it's happening at this atom replication level. So that's really all I have in terms of presentation. I started off saying, bear with me, I'll try and build this up and build this up so that by the end of it you can feel like at least you know roughly what the language is to use if you wanna ask me any questions. And I have left some time for people to ask. Yeah. Yeah, so that's, I mean, yeah, sorry. So the question that was asked was: I mentioned that you could add a storage manager in the cloud or somewhere else remote, and that it would just gradually copy across all the atoms, and then after that you're running hybrid, or you could run in the cloud. And that's true. The question was, well, what if that's a very large database?
At that point, that's really a matter of network bandwidth. There's not much we can do about fixing the network bandwidth for you, but what we would say is, you're better off just picking up the disk drive, you know, walking across to wherever you want it to be, plugging it in over there, and letting it catch up on that end. Okay. Or you can leave it to run and catch up over a long period of time, but there's not much that we can do other than that. Yeah. The durability of the transaction? The durability of the transaction. So this is one of the areas that I didn't touch on very much. So I mentioned that after a transaction engine has completed the transaction, let's say the application is issuing a commit. At that point, what happens is that you've got commit protocol messages sent out, okay? And they're queued behind all the other messages that are being sent out. The exact semantics of commit in the system are tunable, right? So as part of the configuration of the system, you can say, I want commit to mean K-safety in memory, right, at the low end. Or you can say, I want it to mean K-safety on disk. Or you can say, I want it to mean safety in multiple data centers. There's various things that you can do. And essentially what's happening under the covers is that it's really about understanding the acks: what's coming back from who, when, right? And at what point is it determined that the transaction is committed, right? So you can dial it in if you want to, and we've had kind of military-type customers that have said, we need it to be on oxide on three disks. That's what a transaction means for us, right? And the old-fashioned databases can't do that. Can you commit a transaction on three separate disks? Yes. We've had other people say, you know what? Yes, my stuff is transactional, but it's not long-lived. I don't care; I think that my DRAM is perfectly safe. All I need is three copies in memory, and I'm done, right? That also works. I'm sorry?
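The tunable commit semantics just described (K copies in memory at the low end, K copies on disk, multi-data-center safety) reduce to a predicate over the acks received so far. A sketch, with illustrative policy names; the real configuration vocabulary is the product's, not shown here.

```python
# Toy sketch of tunable commit durability: the commit completes once the
# acks received satisfy the configured K-safety policy.
def is_committed(acks, policy):
    """acks: list of ('memory'|'disk', node) pairs already received."""
    if policy["kind"] == "k_memory":
        return len(acks) >= policy["k"]                 # any copy counts
    if policy["kind"] == "k_disk":
        return sum(1 for kind, _ in acks if kind == "disk") >= policy["k"]
    raise ValueError(policy["kind"])

acks = [("memory", "TE2"), ("disk", "SM1"), ("memory", "TE4")]
fast = is_committed(acks, {"kind": "k_memory", "k": 3})   # 3 copies anywhere
strict = is_committed(acks, {"kind": "k_disk", "k": 3})   # needs 3 on oxide
```

The "three copies in memory" customer and the "on oxide on three disks" customer are just two different policies run through the same ack-counting machinery.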
That means it's synchronous replication on commit? Yes, with the commit protocol there is a requirement to wait for a response, absolutely. So, like, latency: for example, you said that latency is very low, but how do you deal with stragglers and all that when you're committing? Deal with stragglers. What do you mean by stragglers? Right. Well, so let's be clear. You've asked two different questions. The one is, on commit, if you're needing to wait for a response, there's a possibility that through network latency issues, or the machine's busy, or something, it's got delayed. Yes, that's what a commit is. We will have to wait. And typically, when people are running, for example, multi-data center, you try to make sure that your commit is not happening across a WAN, right? And you can do things to configure that. But yes, you and I could easily construct a scenario in which there are commits trying to run across a WAN many times a second; that's not gonna run very well. Your second question, about stragglers, I think I need to reframe slightly, because it depends exactly on what you mean by a commit. Now, if you said a commit means that all of them need to come back and say they've got it, right? Which nobody does. Nobody does that. If you do, yeah, then you're gonna go as slow as the slowest straggler, basically. That's not what you do, though. A default commit might be: I'm going to get a commit back from the first storage manager that responds to me, and I'm gonna consider that committed. I'm sorry? What if that node then sort of takes off? Well, no, because I'm waiting for the transaction commit. If I don't get the transaction commit, the transaction isn't committed and you get a failure back up to the application, okay? So there's different ways in which you can configure it.
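The "first storage manager to respond" default described above amounts to blocking on the earliest ack, with a timeout that surfaces a failure to the application if nothing comes back. A toy sketch, using simulated storage managers on timers rather than real network peers:

```python
# Toy sketch (not NuoDB code): commit blocks until the first storage-manager
# ack arrives, or fails back to the application after a timeout.
import queue
import threading
import time

def commit(ack_queue: queue.Queue, timeout: float) -> bool:
    try:
        ack_queue.get(timeout=timeout)  # first responder wins
        return True                     # transaction committed
    except queue.Empty:
        return False                    # failure surfaces to the application

acks = queue.Queue()
# A fast storage manager acks after 10 ms; a straggler would ack after 2 s.
fast = threading.Timer(0.01, acks.put, args=("sm-1",))
straggler = threading.Timer(2.0, acks.put, args=("sm-2",))
fast.start()
straggler.start()

start = time.monotonic()
ok = commit(acks, timeout=1.0)
elapsed = time.monotonic() - start
straggler.cancel()  # the slow ack no longer matters once we've committed
print(ok, elapsed < 0.5)  # True True: we never waited on the straggler
```

Because only the first ack gates the commit, a straggling node slows down nothing; it simply catches up later, which is the "housekeeping" mentioned in the next answer.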
You can configure it, as I say, to say: I want it back from two storage managers, or I want it from two storage managers and three in-memory guys, whatever. Part of what's hard to understand here, and maybe is what you're touching on, is that we really don't need for everybody to have a consensus, right, in order to commit. What we need is sort of a baseline that's agreed on what that commit is, and we figure out the rest in housekeeping, okay? Yeah? I think there are a number of subsystems you're running, like the pub/sub system and the broker system. Are these actually separate systems, or are they all part of the same process? A huge amount of the stuff that's core is extremely time critical and is not off-loaded to other processes. The core engine is a single executable and does a huge amount of the stuff itself, including all the network protocol stuff and whatever, right? There are other things that are separate that are not time critical, you know, and so this is why we've got the separate kind of Raft-based admin system that takes care of some other things. But no, if you're talking about things like how we do the networking: it has the effect of pub/sub, but it's not really a pub/sub system, and it is basically hand-coded in order to get the kinds of performance that we're looking for. Yeah, one more quick question. Yeah? To what extent can you tune the machines in the same database to be different, so that one would have lots of storage and another would have lots of CPUs and stuff like that? Does the autonomy of the admin system enable that kind of heterogeneity? Yes. And I'll give you a concrete and useful example. So first of all, yes, there's no assumption here that these are the same class of machines, the same number of cores, the same DRAM, the same anything; that's part of the fun of it. But there are times when that can be really useful.
For example, let's suppose I want to run what's called an HTAP workload, you know, sort of hybrid transactional/analytical. So you've got an OLTP workload that's, you know, chundering away at like a million transactions a second. Someone comes along and says, well, I actually want to do some analytics on this. If you're on the cloud, you can say, oh, great, I'm going to grab, for an hour, a machine with a terabyte of DRAM to set up a TE. I might actually also start a new storage manager if I want to, just in order to handle that analytics workload. Again, because it's all MVCC, any long-running analytics query is really going to be running against old versions anyway, and so there's not going to be a lot of chatter. So there are some very useful kinds of things like that, where you can sort of say, okay, once a day or whatever, I'm going to fire up a big machine, I'm going to do my analytics or my incremental stuff, it's going to have marginal effect on the rest of the system, shut it down again and go home. No, the decision is made by you calling a REST API. You can configure that into Kubernetes or something else if you want. We as a company moved away from what we were doing, which was kind of policy-based management, in which we were doing a lot of that stuff. And we said this intelligence is moving actually into kind of cloud management systems, and so we'd rather work with those guys and let them kind of drive that stuff as part of overall cloud optimization. All right guys, let's thank Barry for the talk. And I'll see you guys on Wednesday for the final review and the systems potpourri. We will be talking about CockroachDB and we'll be talking about Spanner as well. I predict we'll be the third, guys. So all right guys, see you on Wednesday. Thank you.
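The elastic HTAP pattern described above, where "the decision is made by you calling a REST API," could be driven by a client roughly like this. The endpoint path, base URL, and payload here are entirely hypothetical, invented for the sketch rather than taken from NuoDB's actual management API:

```python
# Hypothetical sketch: asking a management REST API to start a transaction
# engine (TE) on a big-memory machine for an analytics burst. Endpoint and
# payload shapes are invented for illustration.
import json
import urllib.request

BASE = "http://admin.example.com/api/1"  # hypothetical admin endpoint

def start_process(kind: str, host: str) -> urllib.request.Request:
    """Build a POST request asking the admin tier to start a process."""
    body = json.dumps({"type": kind, "host": host}).encode()
    return urllib.request.Request(
        f"{BASE}/processes", data=body,
        headers={"Content-Type": "application/json"}, method="POST")

# Once a day: fire up a TE on a terabyte-of-DRAM machine for analytics;
# a matching DELETE on the same resource would shut it down afterwards.
req = start_process("TE", "bigmem-host")
print(req.get_method(), req.full_url)  # POST http://admin.example.com/api/1/processes
```

In practice, as the answer notes, an orchestrator such as Kubernetes would issue these calls as part of overall cloud optimization rather than a human doing it by hand.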