Cockroach Labs. In his time, he's worked as an architect and a developer at companies like Groupon, HashiCorp, Samsung, FreeBSD, and more, including DoorDash. He'll be demonstrating some of his wealth of experience in this area on zero downtime. So expect things like best practices, things to avoid, and much more. Sean, the floor is yours.

Excellent. Appreciate the intro. I'm not going to dwell on that too much. Let's get right to it. So, zero downtime. Having spent a fair amount of time on the commercial, consumer side of things, downtime is not particularly well tolerated. Whenever you're not able to take an order, you're not making money. It was true 10 years ago, it was true 20 years ago, and it's obviously true today. The nature of some of these challenges hasn't changed, the things that people are interested in in terms of availability. But one thing that has evolved over the years is that there's less tolerance for what it takes to get to zero downtime. It used to be that you could go and build all kinds of artisanal, bespoke availability strategies for your organization, but that came at a huge organizational distraction, with organizations spending their cycles on zero-downtime infrastructure. When you're worried about how to plumb availability or data replication into your infrastructure, or whatever else it is, you're not focusing on your core business, which is how you acquire and retain customers. So there's a big shift that's been happening toward how you iterate, how you develop your product faster so that you're able to be successful. Especially now that there's a bit of a cash crunch in the startup world, and certainly inside enterprises, people are interested in optimizing the efficiency and productivity of their workforce. Because if you're focused on delighting customers, you're not focused on the humdrum of making sure your database doesn't go down, or making sure you're able to scale without worrying about how to partition and shard your information. And the last one that's also becoming a thing: who wants to go and learn yet another bespoke database? There are obviously people in the world who really like padding their resume with lots of fringe technologies, and there may be a diamond in the rough there, something you're interested in, but do you really want to spend three or six months training up a new engineer to become productive in your environment? Or would you rather have something you can drop into your organization that has a familiar look and feel and allows developers to hit the ground running quickly? So these are the backdrop of conversations that I've certainly had for a long time, and common themes that a number of businesses bring to us, things they're interested in hearing more about and seeing success stories around. So it's definitely interesting talking to them today. And it's obvious that outages are costly. They're actually somewhat common, unfortunately. And every business has a different threshold: for some businesses, downtime is very, very expensive, both in terms of money lost and reputational damage.
But that threshold of pain changes, and the willingness to put in effort to prevent downtime changes as well. So here we go. In order to work around most ordinary downtime, and in order to scale businesses, you have this traditional stack or layering. Back in the day, you'd have your three-tier architecture, right? Well, the cloud equivalent of this is: you've got users hitting a CDN, you balance and distribute your workload to a bunch of application servers, and then you fan out and talk to a bunch of databases, assuming you've got a sufficiently large business that requires this. It basically breaks down to this: you've got clients, you've got routing, you've got compute, you've got storage. Those are the four pillars at this point in time, and there's maybe a fifth one you could argue about in the back there, analytics or something to that effect. But while we've spent 20, 30 years working on distributing applications, we haven't done that as an industry in terms of distributing our data and our databases. And that's really where what we do at Cockroach comes into play.

There's been an interesting organizational response to this: organizations have developed microservices as a strategy, both an organizational strategy and a technology strategy, that allows them to scale their workloads and their databases. Because in theory, you've got a database per microservice, or whatever name you want to call it, that allows your application to scale within some defined data set or boundary. Inside of and behind an internal service, you're going to have an API layer that gives you awareness of the touchpoints, and these microservices have a tendency to be bounded by teams. That's actually Conway's law in effect: you're carving up your business along your internal team boundaries, and your APIs inside an organization will tend to resemble your organization. And you know what, that's great. But now that means as you scale your business, you're going to run into coordination costs across different teams. So all of a sudden, the old Amdahl's law comes back, and it's not just something relegated to computers and hardware; it's a problem for a different type of distributed system, called your organization and its engineering efforts. So if you can simplify your tech stack, you can potentially reduce the number of microservices from a scaling perspective, and instead focus on your business value. Famously, there are large examples of this huge proliferation of microservices, and really, is this the most productive way to build software? Can we potentially do some pruning of this complexity?

What ends up happening inside an organization is that companies will figure out a design pattern. Typically, the team that has the most acute scaling challenges, or that generates the most business value, will develop a pattern. And there's a very, very common pattern, I would say probably 70%, 80% of the time. So I mean, it's really common.
Organizations will have some type of gateway; they'll have a microservice; that microservice talks to a caching layer; that caching layer is fronting a database; the service is also doing writes off to an event service like Kafka; and then there's some consumer that reads events out of that Kafka and pipes them off to whatever comes next. And every service is going to take this pattern, because it worked for the marquee service, your tier-zero services, and they're going to copy and paste this thing, and it's going to become the way, and everybody's going to export it. You've got this mass cargo-culting of architecture happening across the entire organization. And if you look at this, it may have been appropriate for your big service, depending on your business and your application, but you've got a fair amount of redundancy.

If you step back and look at what your microservice application owners are actually doing, this is some pseudo code, but you've got this predictable pattern: you've got some kind of object, and you're going to save a thing. So you save it to the database; then you go and invalidate or update your cache, because you have to do that since your database has some scaling limitations, it's typically a single server; and then you go and emit an event. So you've got these three separate touchpoints. Well, that's great, but what happens when there are errors along the way? This is a form of tech debt that you're pushing onto every single developer. Nobody can escape those three calls.

So how do we simplify this? How do we reduce the friction, the burden, the cognitive overhead of getting anything done? If we torch a few of these components, we're able to simplify things pretty significantly. You have a microservice; it talks to a database; the database emits events to some kind of bus; and then you have clients that consume things out of that bus. This becomes a very realistic and tractable architecture, and you can do whatever it is that you need, up to some interesting speed limits. If you need caching, and I'm not saying that caching is bad or unnecessary, caching is really important. Especially if you're serving cat photos and you have a high fan-in rate to a particular record, and that rate is in the tens or hundreds of thousands of operations per second, use caching. There's absolutely a place for that. However, a lot of businesses and a lot of organizations do not need a caching layer, because the rate at which they're updating an individual record is below our internal speed limit of 2,500 operations per second per row. If you're operating an application that's running at less than 2,500 operations per second per row, you don't need caching, and we'll get into why. As a result, you don't have to worry about cache coherency issues, and you don't have to worry about dual writes with this kind of simplification.
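To make that simplified shape concrete, here's a minimal sketch in SQL, assuming a CockroachDB cluster with changefeeds available (CREATE CHANGEFEED is an enterprise CockroachDB feature) and a Kafka sink; the table, columns, and broker address here are illustrative, not from the talk:

  -- One table, one write path; no cache to invalidate, no hand-rolled
  -- event emission in the application.
  CREATE TABLE IF NOT EXISTS orders (
      id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      customer_id UUID NOT NULL,
      total       DECIMAL NOT NULL
  );

  -- The application performs exactly one write per save:
  INSERT INTO orders (customer_id, total) VALUES (gen_random_uuid(), 42.00);

  -- The database itself streams row changes to the bus, so consumers
  -- read events from Kafka without the service dual-writing them:
  CREATE CHANGEFEED FOR TABLE orders
      INTO 'kafka://broker:9092'
      WITH updated, resolved;

The save path collapses to a single touchpoint with a single error condition, and event emission rides on the database's own machinery instead of living in application code.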
So if your code earlier had three branch points, three error-handling conditions, because you were talking to a cache, you were talking to the database, and you were talking to your Kafka or Pub/Sub or whatever you're using for events, then you can simplify that down to just one thing: you save to the database.

Sure. While you're on the topic, would it be fair to say that in the previous example, the last slide, although we were writing to three different places, at a high level this is the dual-write problem, correct?

Yeah, that's exactly it. You've got lines of code, you've got error conditions. And actually, as you were saying this, I realized I said it out of order without even noticing: I said cache, then database, then event, or whatever it was. If you've got hundreds or thousands of call sites where you're saving information, the probability that somebody is going to switch those two, and then your cache is going to be out of sync because there's some ordering, is real. And how do you handle and unwind an error? Let's say there's a transient failure: the cache invalidation happened, but then the database save failed for some reason. You've got some phantom record in your cache that's going to eventually expire, or how do you catch that? Potentially, eventually. It might not, unless you've got TTLs.

I was in this exact problem. When I was first adopting Cockroach, one of my questions to Tim Veil, who was a sales engineer at Cockroach a while back, was about this. We had a cargo cult of exactly this pattern. We would write to a database and we'd write to a cache. That's just the way we did it; it was completely cargo-culted. His advice was: don't cache. Start off by not caching, and only add a cache when you need to. That was hugely interesting for me, because I was so ingrained in that cargo cult that I just assumed that was the way you did it. You had to protect your database with a cache. When really you don't; it should be your database that protects you, because it can scale. Now, another question from me that might be beneficial to others: you've mentioned the 2,500 requests per second rule before. How did you arrive at that as a sweet spot?

We're going to get into the architecture, and I'll explain where some of that comes from; you're leading to where we're going in just a second. It has to do, in the case of Cockroach, with the way Cockroach is implemented and its unit of scaling, this thing called a range. On an individual range there are mutexes and locking, and updating a single row tends to light up a vCPU, a core. At that rate, you're not going to be able to get more updates per row, because everything is going to be bottlenecked on a single vCPU core. That's where that number kind of comes from. It's actually higher than that, but I just use that as a rule of thumb for most people. I had a point there that I totally spaced on for half a second; maybe it'll come back to me later.

What you're describing here looks more like the transactional outbox pattern.

Yeah, that's basically it. The fact that you just have the simplification... oh, that's what it was. It's like caching: you have to. Why wouldn't you? Your database is going to fall over if you don't have caching. That's premature optimization, actually. You don't know that your database is going to fail without caching.
I can tell you right now that it lays bare the fact that the instinct to cache everything is a form of premature optimization. It's really obvious with Cockroach. There are savings you get from pulling that component out, there are bugs you avoid, and it becomes really obvious that there's just very little value: you have to be doing more than 2,500 ops per second per row to all of a sudden need caching. It might not be premature optimization for a lot of databases, but it is certainly premature optimization for Cockroach.

How is this possible? These seem like big statements, potentially reckless statements. How or why is this possible? This is probably the thing I want everybody to internalize and absorb by the end of this talk, because we have some neat properties that we're able to push on the world as a distributed database. The thing under the hood that we're going to get to is the how and the why here, and it's super important, because all of a sudden it unlocks the what.

As a distributed database, we've obviously got a bunch of layers. I'm going to talk about a couple of these layers in really high-level terms so that everybody here can begin to internalize and think in terms of what Cockroach does and why this is so powerful. In this example application, we've got a dogs table. We're going to use the dog's name as the primary key; normally it'd be an integer or a UUID or a composite primary key, whatever it is, totally fine. It just makes it easy for slideware purposes to explain this. You've got a dogs table, we're going to have a collection of dogs, and then we're going to have some weight or some other attribute that we're attaching to these dogs.

Under the hood, what this actually means is we take this table, this Postgres table, because Cockroach is a Postgres wire-compatible database. It supports the Postgres dialect with very small exceptions; it is like 99% compatible with the Postgres dialect. So CREATE TABLE acts like that; INSERT, UPDATE, DELETE are the exact same as Postgres. It walks like a duck and talks like a duck, or a dog in this case. So you've got a dogs table, and that's the logical representation, kind of the equivalent of an Excel sheet, if that's what you're used to. But what happens in Cockroach is we take this logical representation and break it out into these things called ranges. Right now, today, a range is 512 megabytes, and we take this table and chunk it out, ordered by the primary key; that's the way we cluster, or store, things. So Carl the Jack Russell goes to range one, and Lady goes to the next one, and we're starting to break apart and chunk this data out. What happens then is we use Raft under the hood to replicate this information and to distribute these different chunks of data across different servers. So for each server, now that we've decomposed from a logical or virtual representation of the data to a physical one, we take these ranges, scatter them across the servers, and replicate the information so that there's no single point of failure. This is important for zero downtime: if you have some kind of a failure, you're not going to be impacted.
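As a rough sketch of that slideware example (the column names here are assumptions; the talk only mentions a name and a weight), the DDL is plain Postgres dialect, and CockroachDB will show you the ranges it has chunked the table into:

  CREATE TABLE dogs (
      name   STRING PRIMARY KEY,  -- clustered by primary key, so ranges are name-ordered chunks
      weight INT
  );

  -- Inspect the physical layout: which ranges back the table, and where
  -- their replicas and leaseholders live.
  SHOW RANGES FROM TABLE dogs;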
And the reason we do that is because we have replication built in under the hood, totally transparent to the application. We replicate the information so that all writes end up on node one, in this particular case, for range 100, and all the followers have a copy. Once there's a quorum write of this information, the write returns back to your customer, or whoever your client is. And similarly, if you've got thousands, tens of thousands, millions of these ranges kicking around, the database will scale, but you don't want to make your application aware of that. The reason your application doesn't have to be aware of it is because every single node also acts as a gateway. So if your query comes in and hits node one, and the data is not on node one but on node two, the node will proxy the request internally using an internal RPC protocol, perform the query, and return the result. And it's not just limited to key-value operations; full SQL, CTEs, big analytics queries, that's the kind of thing we can do. It's pretty neat.

And now, because we've got all of our information replicated out, we can also take these copies of the information, and we're smart enough to distribute them across different AZs. What this means is you are now immune from an AZ going down and taking your application offline. All of your information is replicated, and a replication factor of three is the default, so that if one of three AZs is unavailable for whatever reason, or something tragic happens, your application will stay online. It's handled and resolved. The self-healing happens at machine timescale, in seconds. Sure, you may page out to meatspace to get a human operator involved, but it's not required; it self-heals automatically. So your developer sees a database, they see a table, and it's straightforward. Under the hood, they're able to treat it like Postgres, even though it's not really Postgres.

And then the other thing that's kind of neat is you've got these extra copies, a built-in unit of scaling. I've just duplicated my information, which is great, because now I can read from it and scale out really quickly to thousands of reads per second for this individual row, or collection of rows. And you can do that with some extra syntactic sugar that we have, so you don't get what we call follower reads by default, but you can make use of them whenever you want. It's really useful for analytics queries.

And then the last thing I'll say about... actually, do we go into detail about follower reads? I don't think we do. Would it make sense to talk about the topologies available at this point, just to give people an idea? Because I think a question a lot of people might have is: I have my Postgres node, or my Oracle RAC, whatever it happens to be, and my apps are quite close to it proximity-wise, and I'm getting good read and write times. The concept of a distributed SQL database was quite a paradigm shift for me when I first adopted distributed SQL, and I think there's definitely a "with great power comes great responsibility" aspect to it.
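That follower-read syntactic sugar, and the default replication factor, look roughly like this; a minimal sketch, assuming the dogs table from earlier, and note that exact statement forms vary a little across CockroachDB versions:

  -- Replication factor 3 is the default; it's inspectable and tunable
  -- through zone configurations:
  SHOW ZONE CONFIGURATION FROM TABLE dogs;
  ALTER TABLE dogs CONFIGURE ZONE USING num_replicas = 5;

  -- A follower read: opt into slightly historical data so that a nearby
  -- replica, not just the leaseholder, can serve the query (handy for
  -- analytics):
  SELECT * FROM dogs AS OF SYSTEM TIME follower_read_timestamp();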
And I think it'd be interesting to show, because one of our biggest competitors is the speed of light. There's no getting around it: if I want to make a query to a database in Singapore and I'm based in London, there's going to be a latency penalty there. And we employ different topology patterns under the hood that you can harness. So for example, you can surface a table as a global table, which means the data is available globally with very low read latency, at the cost, because of course there must be a trade-off somewhere, of higher write latencies. And you can also partition data; there are lots of different ways you can manipulate your data with just a single line of SQL that you tack on to a CREATE TABLE statement to unlock this, and follower reads is one of them. Sorry, that wasn't something you were going to cover?

No, totally spot on. Really happy you're touching on that, because if writes happen on one node and you've got followers elsewhere, you can also do other things with multi-region to propagate these writes synchronously or asynchronously to different regions, based on what your application actually needs. Are you doing a global identity? Are you doing something that's region-local, like orders or deliveries, something with natural geographic partitioning that says this is West Coast data because all of my users are actually West Coast, or East Coast data, or whatever it is? Yeah.

And just to dive a little deeper on follower reads: if you've got data spread across multiple locations, by default we orchestrate the reads and writes through what are called leaseholders of the ranges; they own the data for individual ranges. And if you do a strongly consistent read against something that's not a global table, of course, that would necessitate an internal hop if the machine the request lands on isn't the machine where the leaseholder resides. By toggling on a follower read, you're allowing CockroachDB to perform a slightly stale read, so you're opting into slightly stale data in exchange for much better read performance, because you're not having to make that additional hop. By default, reads happen on the leaseholder; when you turn on follower reads, the query can be handled by any of the followers participating in the Raft quorum for a given range. So, yeah.

So then the last one that typically gets asked is: how does the database evolve, how does it reallocate data? And it's all done automatically, right? So in this particular case, we've got an insert coming in for a dog named Rudy, and we realize and we know that that information is going to end up in the pink range in this particular case. But that's going to cause what we call a split, a range split. The data is going to come in, and let's say that we're at or near 512 megabytes for that range; if the insert causes it to exceed the limit, it causes a split. And what happens is, all of a sudden, that 512-megabyte range turns into two, and that second range gets kicked off to some other server. And that allows the data to continue to subdivide as you add more nodes: subdivide and distribute.
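The 512-megabyte threshold driving those splits is itself a zone-configuration knob, and you can even force a split by hand to watch the mechanics; a hedged sketch against the dogs table, where 512 MiB is 536870912 bytes:

  -- The maximum range size before a split; 512 MiB is the default.
  ALTER TABLE dogs CONFIGURE ZONE USING range_max_bytes = 536870912;

  -- Splits normally happen automatically (on size or on load), but a
  -- manual split shows the effect immediately:
  ALTER TABLE dogs SPLIT AT VALUES ('Rudy');
  SHOW RANGES FROM TABLE dogs;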
But there are two criteria that Cockroach will use for splits. One of them is data size: a range exceeding 512 megabytes. The other is the number of queries per second: if there's just a lot of load on a particular range, that will also cause it to split. And it'll continue to split, down to a single row. That's also part of where that 2,500 ops per second figure came from: if you wanted to bang on a single row and perform just a bunch of updates, you can do that up to a point. But at some point in time, Cockroach is going to split down to that single row, it's going to end up on a single vCPU handling just that one row, and you're going to bump into this speed limit. That's where the 2,500 ops per second comes from. But that is 2,500 ops per second, 24/7/365, and that's a really busy row. I know it's higher than 2,500; I'm just telling everybody to leave a little bit of headroom. It's a thumb-suck number: 2,500 ops per second per row.

Worth noting that a range split won't impact write performance. We allow the range size to grow above the 512 internally, and asynchronously split it after. Right, yep, it just triggers a split action. And the other one is, as the workload splits and decomposes, at some point Cockroach is going to say, hey, I actually need to go and merge two adjacent ranges, and it'll re-merge and recombine. The thing that's important about this is that the way the database structures, handles, and moves your data around is adaptive. There's no tuning; this is all done automatically. The system is doing this 24/7/365 for you, with no administrator involvement, and there are no hotspots as a result. So what you'll typically see is the average temperature of the entire cluster, kind of the heat of operating the distributed database, rises and falls together at a uniform rate, which is ideal. It's really nice as an operator.

So we've got our distributed database now, Sean. Just thinking about a distributed SQL database in the context of zero downtime: historically, when we talked about resilience, let's go back to single-data-center legacy RDBMS systems, resilience meant having a DR plan, a business continuity plan, whereby you would have presumably idle infrastructure running a secondary instance of your database. Do you have experience of running those kinds of workloads, where you've got a DR site, let's say, and what the experience of managing something like that is? Because now we're operating on multiple nodes in a distributed SQL database, we're no longer operating in a DR mindset. Our mindset has changed to essentially a disaster-mitigation, disaster-prevention mindset, which is quite different. Obviously, you'd still want to have some kind of backup process in place, but you're maintaining nodes across localities, which is essentially doing the job for you. In your experience, what are the pains of managing a DR situation, and what does migrating to a distributed SQL database give you in terms of zero downtime?
Your thinking changes, right? Unplanned maintenance typically results in a small blip to your SLO, because you did have an unplanned failure and it takes a second to detect it. But no humans are involved. And so what you're doing is spreading your risk over a larger number of nodes. There's maybe some background variance over time, but for all intents and purposes, you're managing risk in aggregate, as opposed to the risk of an individual asset. You don't have to worry about when a planned maintenance is going to happen. Instead, what you're worrying about is how to triage the overall health of the overall cluster, as opposed to the individual server. There's the famous, at least it was really popular for a while, pets-versus-cattle analogy. You don't have to worry about that named node over there or that named instance over there. Instead, what you're worrying about is this collection of servers that are all operating together and stitched together so that, in theory, you don't have any meaningful downtime. You are going to see little blips to your SLO on individual queries, but it's spread out across many, many nodes, many leaseholders, and so your risk profile goes down. So it is very much a change in thinking.

Off the back of our last topic, we've had two near-identical questions come in around schema migration, schema changes, and changes to DDL, which I'd like to address before we move on. The questions are essentially the same: how are availability and write latency impacted by schema changes? Essentially, Cockroach under the hood implements MVCC, multi-version concurrency control. That's not new to CockroachDB, but what it allows us to do is this: let's say you've got a table with a billion rows in it. If you make a schema change by adding or removing a column, we take a cut of that table, we copy it over to a second version, and we make the schema change on that version while traffic is still happening on the original, like a kind of blue-green setup. When that schema change has completed, we redirect traffic to the new version of the table. And MVCC allows us, at any time up to the point of garbage collection, which I believe is by default four hours, to query both versions. So if we select everything from the table, we get the latest version once the cutover has taken place and the schema migration has completed. But if we ask for something from before that cutover took place, we'd get data from the previous version of the table, so without the new column, or with the column that no longer exists. Just wanted to address those. If I haven't answered the question correctly, please feel free to rephrase it and I'll keep going. Back to you, Sean.

Yeah, no worries. Great questions. So one of the things we haven't gotten to or didn't talk about is secondary indexes. Those are a thing; you can create secondary indexes after the fact. You don't have to do dual writes and some kind of migration. The other one is you can do ALTER TABLE ADD COLUMN, or ALTER TABLE DROP COLUMN, or whatever else.
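Concretely, those online operations are just ordinary DDL; a minimal sketch against the dogs table (the breed column is illustrative):

  -- Online schema changes: these run in the background against the live
  -- table, with no dual writes or application-level migration choreography.
  ALTER TABLE dogs ADD COLUMN breed STRING;
  ALTER TABLE dogs DROP COLUMN breed;

  -- Secondary indexes can likewise be added after the fact, backfilled
  -- while the table keeps serving traffic:
  CREATE INDEX dogs_weight_idx ON dogs (weight);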
And as Rob said, that all kind of happens behind the scenes, and we present that Postgres-compatible interface to application developers. So in that regard, no change. It's a lot more complicated in terms of what the software has to do, but as far as a user is concerned, it walks and quacks like the thing you were expecting. So, moving on here, but do feel free to follow up if you've got questions or more questions.

What we tend to see is organizations going through this kind of life cycle. There's an organizational evolution where organizations come from the single point of failure, the single database we were just talking about. And what happens is, you say, hey, I want to move away from this. So the phase-one evolution of all this is: I want to move from a single server, which is a traditional relational database, to something that allows me to scale out, get higher QPS, more data, rolling upgrades, zero-downtime maintenance operations, things like that. Move to a distributed SQL database. And then, now that you've taken that leap from a single server to a distributed SQL database, you can begin to take that data and stretch it across different regions and get to a multi-region Cockroach, so that your data can move closer to where your customers are, if you're trying to optimize for the speed of light and the distance between an eyeball and where that user's or customer's data is. And after that, there's a future state where the ability to span clouds is also potentially possible; that's typically the most sophisticated customers dealing with that right now. But it is a journey, and it's crawl, walk, run; it's incremental. The first and biggest value for most organizations to start with is going from a relational database that's a single server, or maybe a writer and followers in a single AZ where there is a single point of failure, to a distributed SQL database where there are no single points of failure in that world.

We talk about manually sharding databases. The traditional way of scaling a database was: I start with one database, and at some point I learn that I have to scale it, and then I manually shard it in order to gain some semblance of horizontal scalability, because you can only vertically scale your nodes up to a point. We talk about the pain of manually sharding databases, Postgres being one example. I gave it a go the other day, just to be able to say: yes, it is painful. And it made me appreciate how easy it becomes with distributed SQL. You don't have to start off with a huge cluster; it doesn't have to be prohibitively expensive. You can start off very small, and as you add nodes and the data rebalances across the nodes, it allows the database to grow with your business.

Yeah, there are two dimensions of scaling with Cockroach, as opposed to just vertical, where you throw more cores and more RAM at a problem.
Actually, slightly more nuanced: you can have a bunch of, say, m6 2xlarge instances and move from those to some 4xlarges, or you can add additional nodes. And I don't want to say it's not science, there are certainly reasons why, but there's a lot more nuance there, because you have these two additional dimensions: you can scale up the individual nodes in a cluster, or you can scale out as necessary.

So then, going back to the availability bit: loosely coupled systems versus tightly coupled systems, which one would you rather be operating? A loosely coupled system is what's required to move to that three, four, five, six nines kind of availability. And that's really where you have to start thinking in terms of the risk profile and the amount of entropy in the environment. When you look at what Cockroach does, we're actually abstracting all kinds of entropy. For better or worse, when we work with organizations, there are availability issues, there are network hiccups; there's going to be network congestion, or you're going to have a saturated disk periodically. All of that entropy is something Cockroach abstracts; we hide all of that from customers. That also means that when there is a disk stall, it shows up as a Cockroach issue, because there's back pressure; when there's a network stall, that shows up as a Cockroach issue too. Fine, but somebody was having to deal with that beforehand, and it was potentially not being observed or paid attention to. Now we're able to abstract and hide that, because our system wraps multiple AZs, multiple nodes, et cetera, in order to provide this Postgres-like service experience that's distributed, and to provide the higher availability that businesses and organizations just want. Being able to do rolling upgrades or maintenance in the middle of the day is pretty nice. Not only is it nice, it's important to be able to handle the unplanned failures that happen at peak, as opposed to the hypothetical unplanned failures that happen in the middle of the night. A bump in the middle of the night? Nobody cares. A bump in the middle of the day, when customers are awake, that's the money shot; that's what matters to everyone.

So, let's see, you have a quick question there, or something you want to inject?

It was about the slides, just whether the slides will be available. I don't see any reason they couldn't be; that's something we can chat about. The video will be on YouTube afterwards, once it's been uploaded, so it will be available there, but let me have a chat with the team about the slides, and once I've got an answer, I'll leave it as a comment on the YouTube video. But yeah, thank you. And another person has just said that the links I'm sharing in the chat don't seem to be copyable, so I'll send those into the Q&A, and hopefully we'll have more luck there. Thanks for letting me know. Okay.
If you look at the history here, you can see that I started up this cluster immediately before we started our demo. So let's see. Okay. What I'm going to do here is this: we've got a workload that's running constantly, and this is a really tiny cluster right now, only three nodes. So we're going to do something that's reasonably abusive of this system: an unplanned shutdown, pkill -9 style. Let me pkill -9 in the background; I'm showing this just so you can see roughly what I'm going to run here in a second. And I'm going to run it, and we'll see the impact, and wait for the impact.

So the cluster's carrying on; you've pkilled something in the background? I actually pkilled something on one of the workers, so you weren't going to see it. Now you're going to see it. So what would you expect to see? Explain it to us.

Yeah. Because this was an unplanned, forceful, hard termination, you're going to see a drop, and then you're going to see everything recover here. These graphs are a rolling window of 10 seconds. The system will detect that there was a failed heartbeat, it will renegotiate all of its leases, and then everything will come back up, and we'll be back to where we were, slightly degraded just because we have lower capacity. But there you go: that is potentially an acceptable blip to the outside world. Obviously, this was an unplanned failure; in any other situation, that could be catastrophic.

And I think it's interesting, the kind of ambivalence with which we can kill our software in demo scenarios, in live demo scenarios, no less, and have the confidence that it's just going to carry on working.

So you can see the ranges here, where it says, hey, I have under-replicated ranges. It noticed and detected this and said, hey, I've got ranges without leases, and it was able to recover at machine timescale, effectively with no human involvement. So then we'll kick this node and let everything come back up, and you'll see some more lease-transfer activity, and the node will go back to being available here in a second.

And there are different behaviors in terms of failures, aren't there? I think it's called the time until dead, and for a node, it's by default five minutes. So if that node comes back up within those five minutes, underneath we will use Raft to send any data that was missed during that period to the node that comes back up. But if it's dead for more than five minutes, we remove the node from the cluster entirely, and essentially, when your Kubernetes pod, say, comes back online after five minutes, the data would be resynchronized to that node.

That's correct. So this is a little contrived in the sense that it's only a three-node cluster, but imagine 100 nodes, or a couple hundred terabytes. Instead of 240 ranges, doing this with 300,000 ranges is an enormous amount of data to have up and running online. And having that amount of data online, managed by a system that handles all of these self-recovery, self-healing type scenarios, is really important for organizations that just don't want to have to deal with downtime.
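For reference, that time-until-dead window is an ordinary cluster setting; a minimal sketch, shown here with its default value:

  -- A node that misses heartbeats for longer than this is considered
  -- dead, and its replicas are rebuilt from the surviving copies
  -- elsewhere in the cluster.
  SHOW CLUSTER SETTING server.time_until_store_dead;
  SET CLUSTER SETTING server.time_until_store_dead = '5m0s';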
And that goes back to the premise of what we were talking about: you can kill a node, and the node will recover. End of story, right? And there are obviously interesting, complicated situations where this becomes difficult, but those are complicated, difficult situations on a good day. At least you've got software that's going to cover you and get you 95, 98% of the way there, so that you can have petabytes worth of data online with a minimal number of operators or administrators for that amount of data and that number of database clusters. I think that efficiency metric is one of the things I set up front at the beginning of the talk: a low number of administrators required to operate this, especially as the number of clusters grows and the amount of data grows. The number of administrators doesn't change as the amount of data grows for a given cluster, typically.

And I've been on the opposite side of that as well, and I think that's an important point. I've witnessed 17 different instances of the same application deployed across disparate environments, I think it's called multi-instance architecture, and that was a nightmare to maintain. We had 17 instances, and if we wanted to perform updates on any of them, that was downtime for every single one of those instances. Being able to take out a node and bring it back in with no impact allows you to perform rolling updates. So if you're running a Kubernetes cluster, or multiple Kubernetes clusters, and you want to perform a rolling update, you send it the new image; it will take out a pod with a drain and then bring it up with the new version, without any downtime at all.

And so having these kinds of tools at your disposal makes this pleasant, or more pleasant, for operators, because you don't have to worry about the Saturday-morning or midnight update, whatever it is you'd otherwise need to do; you're basically taking that off the table. It's not a problem anymore.

It becomes business as usual, doesn't it? It becomes a Monday-to-Friday, nine-to-five type activity.

Especially in larger clusters. Three nodes is the bare minimum you'd ever consider; single-node Cockroach doesn't make a whole lot of sense. But if you've got 30 nodes and you lose a node, once we hit the dead-node timer, the data will up-replicate across all the remaining nodes, and you can just leave that node offline. You'll probably want to decommission it and perform a little housekeeping, but it's not required. The workload moves on, and the data up-replicates, so that having gone from three copies of the data down to two during that unplanned outage, it comes back up to three. Normally, you'll do things more gracefully: you'll do a drain or a decommission, and then you perform kind of a nuke-and-pave approach to rolling upgrades. Or you can just drain, replace the binary in place, and start the process again; or drain, perform your OS patching, restart, and start the process back up, and everything joins and reassembles.
And the effect of being able to do this intraday, Monday to Friday, is that you'll have your whole team on hand as normal, as opposed to having to do things out of hours, where you might be the only engineer on call, performing an out-of-hours update when none of the rest of the team is available to support you. It's better that it becomes business as usual; you can do everything intraday when everyone's available. Yeah. And it's convenient, because developers are able to think of it just like Postgres.

So, that was the demo, and we did it live. Let's see, I think we're going to move through the last little bit of this real quickly. With databases, we built an entire set of practices around sharding information, and that was accidental complexity forced upon us because databases couldn't scale in the past. Now you don't need to engage in those types of practices, because you have a database that actually does this. So instead of having to have plumbing, instead of having to worry about how you do cross-shard joins, where you take your database and dumb it down from being a database to a simple data store, and your application, your API, starts to act like a database because it's effectively doing joins across shards for you: would you rather be doing that, or would you rather push this work down to a database that was actually tested and built as a database, and just interact with it as SQL? That allows me, or you, or whoever, to change focus from performing these kinds of mechanical plumbing operations to focusing on business value. And it really helps elevate thinking, so that you're not thinking in terms of how do I talk to shard one to get information off of shard two, to assemble some join, to add some complicated business value. I don't have to do that. Instead: I've got a query, I'm going to run my query, and all the scaling, all the sharding, that's all handled by the database. So now I can focus on releasing my new feature faster, with better availability, right?

So, to wrap up and come back to some of the initial points: you've got something that looks like Postgres, so you don't have to train people on a new technology. You can focus on productivity and increased efficiency, because you have a good general-purpose database that allows a workload to scale without having to deploy an army of people to manage some potentially homegrown, bespoke solution. Instead, you add nodes, the cluster grows, your workload goes up, and you can scale down if you need to as well. I didn't really get into this, but if your compute utilization changes, adding capacity and reducing capacity operationally are all things that are on the table, and very flexible. And then on availability, everything is handled by self-healing; it self-heals at a machine timescale, as opposed to a meat timescale, when you have to get a human on the line, because frankly, we're just slower at responding to things than computers.

So, we've had some more great questions before we wrap up. One of the questions: is there any limitation on node data density?
And I think that's a really interesting question. We've done some internal testing of this, and we run large-scale tests every day as we roll out our versions and our patches. And I think what we've arrived at, and I'll just send a link to our production checklist, is around four and a half terabytes per node, correct?

It can do 10 now. That four and a half was where we were for a long time; 10, you can do it. The question is: if a node goes offline, how long do you want 10 terabytes of data to potentially be in an under-replicated state? That's where the trade-off comes in. It becomes a business judgment question, a risk-tolerance question, as opposed to a question of what the software supports. Because it's not just about having 10 terabytes. And if you think about it, let's say you've got 30 nodes, and I'll quickly do the math here: 30 nodes at 10 terabytes each is 300 terabytes of raw storage, right? And 300 terabytes raw is a lot of data.

So that would be replicated, tripled. Right, exactly. So if you lose a node and go to 29, it can take a while for that data to be copied back. And it's really up to your risk profile. I definitely know of customers and clusters, many clusters, running at that scale. It's an entirely doable thing. They're not using three nodes, but it's very much a doable thing. You just have to be aware of some of that.

Let's see, and then multi-tenancy; we've got a question there with regards to multi-tenancy. That's a really complicated question, depending on how far you want to go into it. Virtual clusters is really the way we like to think about this: you can have a single physical cluster stood up, and then you can carve out multiple virtual clusters, and inside of a virtual cluster, you can have different database instances. So we handle multi-tenancy in that way. I'm pretty sure that's in 23.2 in a preview state, and we're working on pushing that to GA. But then there are other forms of multi-tenancy, like application multi-tenancy, and we don't do anything really to touch that with what we're doing with virtual clusters.

Just to add on to that answer: we do have a multi-tenancy product offering. Our CockroachDB Serverless, soon to be named the Basic tier, does support multi-tenancy. Ultimately, it's orchestrated via a distributed Kubernetes cluster, and each tenant will have zero or more SQL pods, hence it can scale down to zero. They're private to you, with an underlying shared storage layer where the data is specific to your tenant and can only be accessed by your tenant. So depending on how you're asking the question, there could be different answers, but happy to take that offline if we want to go deeper. There's a lot to pull on that one.

So then, planned maintenance? Yeah, absolutely. You hit drain. We've got a bunch of hooks and timeouts, so you can drain and terminate a node. And actually, in this little example, it wasn't set up that way, but if you send a termination signal, it'll actually wait for the number of SQL connections to drop to zero, transfer all the leases, and then shut down and exit the process.
And that will actually happen if you just send it a TERM or an INT signal; it'll begin that graceful draining and shutdown, so you as the operator don't have to worry about all of the nuances there. I did the kill -9 just to be abusive and to show an unplanned failure. But yeah, absolutely, planned maintenance like that is definitely a thing. Integrating that into your load balancer, there are bits there. Yeah.

If we don't have any questions, if you scan this QR code... I think, Rob, you might know a little bit more about this?

Yeah. So the definitive guide is an amazing resource, completely free for you to download. It's an O'Reilly book. There are a bunch of texts, both available now and in the pipeline, so keep an eye on our website. We can share a link to the website, and there are a bunch of books and reports either already out there or in progress. But yeah, the definitive guide is a great source. We've also released the survival guide, which is quite an amusing take on surviving and thriving as a software engineer in the modern world.

Well, I appreciate it. Thank you.

Yes, thank you. Thank you very much, everyone. I'll pass over to Candice at the Linux Foundation.

Thank you so much, Sean and Rob, for your time today. And thank you, everyone, for joining us. As a reminder, this recording will be on the Linux Foundation's YouTube page later today. We hope you join us for future webinars. Have a wonderful day. Thank you. Cheers.