Thank you so much for handing it over to the Jims. We like to call ourselves "Jim and I"; we're actually a band. No, we're not. We're just two distributed people working on a distributed database and Kubernetes. Welcome, everybody, to the event. My name is Jim Walker. Jim, do you want to come on video so we make this a little more personal? Thanks, buddy. Good to see you as always. Hello, everybody. I'm Jim Walker, principal product evangelist at Cockroach Labs. My job at Cockroach Labs is to help people understand some of the technical underpinnings of what we're doing here and apply them to what they're working on. And I love doing this; I've been in the Kubernetes space for quite some time. In fact, the first time I saw CockroachDB was on stage with the CEO of CoreOS, demonstrating something very similar to what we're going to demonstrate here. So Jim, do you want to give a quick introduction of yourself as well? Sure. My name is Jim Hatcher, and I'm a solutions engineer, which means I get to work with customers and prospective customers and help them work through the technical aspects of CockroachDB and recognize whether it's a fit for the use case they're choosing. It's a really, really fun job, and I get to come on webinars occasionally with Jim Walker and do demos of CockroachDB, which is fun. So, like I said, it's just Jim and I today; that's our band name. We're here to answer any and all questions. As noted, please do drop questions in the Q&A. I'm going to go through a fair amount up front to explain what we're doing in CockroachDB, but I'm going to try to tie it to distributed systems and what I like to call the distributed mindset.
In my conversations over the past couple of years, the question has been: how do we help people become distributed? Because I really do believe this is the future of all software engineering and all of our systems. I saw the light a couple of years ago, and I like to expose some of the principles that we're using here. I do like to think that CockroachDB is a PhD in distributed systems; some of the stuff we're doing is pretty phenomenal. We have lots of documentation if you really want to get into the deeper levels. We have a SIGMOD paper that we published, I think mid last year; if you look up SIGMOD and Cockroach Labs, there are some really great engineering details there. Before I even start, I would be remiss without acknowledging years and years of work on distributed systems. A lot of the stuff we do here at Cockroach Labs is a descendant of what has come out of Google over the past 15 years, and anybody who talks about these concepts without thanking Jeff Dean, Sanjay Ghemawat, Eric Brewer, and many of the other distinguished engineers at Google is, I think, being short-sighted. This is the tech that's driving the future of what I feel is happening in the data center, the data center being the cloud. That's really the underpinning of this conversation, so I hope this is valuable. Again, please do ask questions along the way; Jim and I love questions, and we would love to try to answer them inline as we go along.
There's this concept of distributed SQL, and this talk is about Kubernetes, data, and a distributed mindset. To me, the connection is what's happening in Kubernetes. A few years back, I always saw companies struggling to figure out stateful workloads; at the beginning of the Kubernetes world, it was all stateless workloads, and I thought, what does that even mean? The database is such a key component of all of our applications, well, most of them, that there needed to be a re-architecture, a rethink of it as a layer of the software development stack. That's really why there is this emerging space around distributed SQL. One of our partners in the CNCF, the Vitess project, asks: how do you shard MySQL within a single environment, and how do you automate that sort of thing? To me, asking whether something is truly distributed SQL comes down to five questions. Number one: if you're going to be distributed SQL, is it SQL? Number two: a core concept in distributed systems is easing the complexity of scale, because that's a big piece of what we're doing in the cloud. Three: these things need to be geo-replicated, always on, and resilient. Again, a core principle of distributed systems is architecting resilience into the system itself; rather than bolting technology around the thing to keep it alive, let the thing keep itself alive. Four: if you're talking SQL and you're talking relational, you need some level of ACID compliance, and in a distributed system that's not single-node transactions; we're talking truly distributed, MapReduce-style distributed transactions.
And then the fifth, and I often have this conversation: I think the biggest concept to understand in distributed systems is that we used to think logically, and we need to start thinking physically. The very word "distributed" implies that these things happen in different locations. So from a data point of view, how do you tie data to a location so that you can survive failures, or provide low-latency access to data anywhere on the planet? We think the database should deal with that. But in your own applications and services, where data lives and where your compute happens is a key concept for understanding distributed systems as well. So CockroachDB was architected with a lot of the same principles as something distributed like, well, Kubernetes. Scale and resilience are inherent. It is truly multi-master. It's going to guarantee transactions. If you're familiar with something like etcd: I remember when etcd multi-master came out. etcd implements Raft; we also implement Raft, and I'm going to get into Raft a little bit here. The core concepts of Kubernetes apply here with CockroachDB: if you kill a pod, the control plane makes sure the pod is up and running again; you basically set desired state for Kubernetes and it maintains it. We'll actually show how you can kill a pod and the database will survive that naturally, without loss of data and without loss of transactions. We build on top of a lot of the core concepts and constructs in Kubernetes: we can mount volumes locally with storage classes, and we can mount whatever PVs you want.
And we build on top of stateful sets, so we naturally inherit a lot of the power of Kubernetes. Many people ask, where's your operator? We have an operator, but the funny thing about CockroachDB: when I first got here, I argued with our engineering team about building an operator, because we're already a natural fit for Kubernetes. So how do you do day-two operations, rolling upgrades, certificate controls, and so on? That's the stuff we're actually building into our operator; those are the more advanced topics. My friend Kelsey Hightower, a friend to all of us who has been fantastic for this community, once said that CockroachDB is to Spanner as Kubernetes is to Borg, and there's a lot in that statement; I thank him tremendously for saying it. Once you see CockroachDB and what it can do, there are a lot of similarities there. We are a descendant of the Spanner white paper. So, quickly: CockroachDB is a cloud-native, relational database, and it implements familiar SQL; we're wire-compatible with Postgres. So, number one, we implement SQL. But scaling a database like this is accomplished simply by spinning up a new node and pointing it at the cluster. The database will redistribute data and deal with the incorporation of the new node. What's interesting is that this is not just reads; this is writes as well. Every node is a single, consistent gateway to the entirety of the database. Often we think about scale in terms of volume, or in terms of transactions, but we also think about scale from a geographic point of view: how do we scale across regions, and even across clouds? CockroachDB does this very naturally.
In fact, I'd say a large majority of our implementations are multi-region. On our CockroachDB cloud product, I think 80% of our customers are deployed in multiple regions. There are lots of reasons to be in multiple regions: if an entire cluster goes out, do I survive? If an entire region goes out, do I survive? For us, a region can be a cloud provider, a region within a cloud, or even a Kubernetes cluster. My friend Keith McClellan and I did a demo on a previous Linux Foundation webinar where we talked about deploying a single logical database across multiple clusters. In the Kubernetes world, we often talk about federating clusters so that we can have one cluster spanning multiple regions; that's really difficult to do. Why not abstract that up to the data layer? That's often how we think about it, because we can actually be multi-cloud. And we've done this: we have run a single logical database with data that persists in three different cloud providers, which is a unique capability of CockroachDB, hybrid and multi-cloud and multi-cluster, if you will. Like I said, many people do this because they want to survive whatever failure domain they care about: is it a node, a rack, an AZ, a region, an entire data center, US East for that matter, or an entire part of the world? CockroachDB is architected to survive those sorts of things. It really comes down to implementation details, which my friend Jim deals with all the time, helping people optimize their data at the row level for how they want to survive a failure and how quickly they want data to get to a user.
And there are trade-offs. We're never going to beat the speed of light, y'all; that is a physics problem. I would love to be here the day we do, but I don't think it's going to happen in my lifetime. We're dealing with software now that is pushing up against these physical limits, and there's a lot you can do to optimize around them. That becomes really important when you think about a single database across three different clusters. Say some random user asks for, I guess, my CEO's row in the customer table: here's Spencer Kimball's data. The request goes through a load balancer to any one of the nodes in the US West cluster, and that node is smart enough to go find that data at the Raft leader (we'll talk about Raft in a bit), wherever it resides, and return it. Now, say Spencer lives on the East Coast and wants fast access to his data; the Raft leader, the authoritative source for his records, is going to live on the East Coast. When we write data to CockroachDB, we're actually writing it in triplicate. That is configurable: you can do five or seven or nine, some odd number, because ultimately transactions are completed via quorum writes; with three replicas, two of the three have to actually commit. In this case, I can survive the failure of an entire region because I still have access to that data; typically you'll put one copy in each region, so you can still commit and handle writes. But it's an implementation detail, something configured by you at the row level on each table.
So if you want fast writes in every region, you'll have copies all over the place, and you'll use features like follower reads; we have lots of features in this area, and it really comes down to what you want to accomplish with each table, not the database in its entirety, which I think is one of the key points. Being able to access data from any node is just core to CockroachDB. We also understand you need security and optimization; we have a distributed, cost-based optimizer, and when you do backup and restore in a distributed system, that's distributed too. And I'd be remiss without mentioning our documentation again. It's probably some of the best documentation I've seen for a piece of software in a long time, open source or not, so I would just start there; there are some really wonderful details. Before Jim gets into the demo, I want to explain a couple of things about how we actually accomplish this, because it's these concepts that help people understand the challenges of distributed systems. Ultimately, CockroachDB is a database on top of a database. At the bottom is the storage layer, because we have to persist data; that's the whole point. At the top you have a language, and we believe that's SQL, certainly for relational data. The elegance of SQL data modeling is how cleanly it mirrors business logic. Document models are interesting, but what happens when you get to 15 or 20 documents? What is the complexity, and how do you change things? But in the middle is distributed execution, which is actually really, really important.
Being truly distributed at that layer is one of the key things, but at the lowest layer, the storage layer, what we're doing is storing data in a KV store. We originally used something called RocksDB, which is an open source project. We rewrote it in Go and re-architected it a bit so that it's more optimized for some of the things we're doing from a global point of view, and so that it can be a multi-tenant KV store, which is something we're working on right now. It's called Pebble. Rocks, pebble, get it? In a traditional database, say you have an inventory table: you just keep appending records to the end of it, and you have an index that points to those records, a bunch of pointers in memory. That's how you store data; you just keep appending to the bottom of the table. We had to rethink that so we could gain the value of KV and still be distributed, and this gives us a lot of power. As I said, every table is a monolithic, sorted KV map. All tables have a primary key. The key in the KV is the name of the table, the index, the primary key, and a column name, and the value is the column value. The best way to show this is to just show you a table. Here's a simple table, our dogs table; these are some office dogs that we have. There are some entries on the right-hand side: we have ID, we have name, and we have weight. When we write these records to CockroachDB, at the storage layer we take the first record and break it down: it's the dogs table, ID 34, column name "name", and we store the value "Carl"; then dogs, 34, "weight", with value 10.1. The same for the second record, the third record, the fourth record.
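The decomposition Jim describes can be sketched in a few lines. This is a simplification, not CockroachDB's real key encoding; the `/table/id/column` string format and the dogs data are illustrative assumptions, but they show why a plain sorted map is enough to store relational rows.

```python
# Sketch: decompose relational rows into KV pairs whose keys embed
# table name, primary key, and column name. Zero-padding the id keeps
# lexicographic string order consistent with numeric order.
rows = [
    {"id": 34, "name": "Carl", "weight": 10.1},
    {"id": 35, "name": "Sunny", "weight": 25.0},
]

kv = {}
for row in rows:
    for col in ("name", "weight"):
        key = f"/dogs/{row['id']:04d}/{col}"
        kv[key] = row[col]

# Sorting the keys keeps each row's columns contiguous, so the engine
# always knows exactly where a record lives in the key space.
for key in sorted(kv):
    print(key, "->", kv[key])
```

Run it and the pairs come out grouped by row, exactly the "monolithic sorted KV map" idea from the talk.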
If you were to take those keys and sort them, in Excel or Sheets, whatever, you'd have one monolithic, lexicographically sorted key space; it would be completely sorted. So we know exactly where to insert records, and this allows us to do some really cool things. Let me show you how we use that. In CockroachDB we have the concept of a range. A range is, well, not actually arbitrary: we break each table down into contiguous chunks. Think of a range as a shard. We use 512 megabytes, which allows us to amortize indexing, and it's small enough for us to move things around. So look at this table: we've ordered everything, and now we've created ranges, and these ranges are ordered. Now, if you're going to do this, you have to be able to find these ranges in a distributed system, so we implement an index; if you're familiar with a B-tree, it works very similarly. If we want to insert a record, say we're going to insert Sunny into this table, we go through that index to the right range and say, hey range, do you have enough space? Yes, the red range has space; great, insert the record and move on. Okay, cool. Now what happens when I want to insert another record, say Rudy? Come on, slides, what are you doing to me today? Jim, you know your demo is the thing that's supposed to have problems, not my slides; Google Slides, right? Oh man, I went too fast. There we go. So let's say I want to insert Rudy into this range. The range is smart enough to say, hey, I don't have enough space, I'm hitting my 512-megabyte limit. What it does is split the range and insert the record.
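That split step can be modeled with a toy. Real ranges cover roughly 512 MB of contiguous key space; here a "range" is declared full at four keys so the split is visible, and the dog names are made up. This is a sketch of the mechanism, not CockroachDB's actual split logic.

```python
# Toy model of automatic range splits: ranges are sorted lists of keys,
# kept in key order; inserting into a full range splits it in two.
MAX_KEYS = 4

def insert(ranges, key):
    """Insert a key into the range that owns it, splitting if too full."""
    for r in ranges:
        if key <= r[-1]:        # first range whose span covers the key
            target = r
            break
    else:
        target = ranges[-1]     # past the end: last range owns it
    target.append(key)
    target.sort()
    if len(target) > MAX_KEYS:  # hit the size limit: split at the midpoint
        mid = len(target) // 2
        i = ranges.index(target)
        ranges[i:i + 1] = [target[:mid], target[mid:]]

ranges = [["ace", "bee", "carl", "daisy"]]
insert(ranges, "rudy")          # overflows the only range, so it splits
print(ranges)
```

No user ever calls a "split" API; it happens as a side effect of the insert, which is the point of the automated sharding Jim describes next.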
What we've done is completely automate sharding: there's no user interaction, no manual sharding in place, no rethinking the data. We're doing this all under the covers, in real time. Now, we aren't going to split a range in the hot path of a query; we'll let things go a little past the limit and clean up in the background when a range isn't being used, so we don't affect the performance of access. But we're doing it all automatically. In a production database there are hundreds or thousands of ranges, which a database like CockroachDB will manage and maintain; this is happening all the time throughout the entire system. If ranges get too small, CockroachDB will merge them as well, to optimize efficiency for searching and so on. But ultimately, the key part is that this is automation of the sharding of the database itself; this is that natural scale we talked about as one of the core requirements. Now, CockroachDB uses something called Raft, and if you aren't familiar with Raft and you want to learn about distributed systems, I would highly suggest reading about it. In fact, there's a really great website called The Secret Lives of Data with a really, really great description of Raft and how it works. For our purposes, Raft is how we get atomic writes and consistent reads across all of our data. Raft is implemented across an odd number of replicas. On the right-hand side you see blue; this is that blue range. I forget which dogs are in it, but I'm pretty sure Buddy is in that range; that's my dog.
But we have three copies of this data, and that represents a Raft group. The protocol is chatty; there's gossip going on, and coalesced heartbeats make sure everybody stays in sync at all times. There's a concept in Raft called the Raft leader: a leader is elected across the three replicas, and it coordinates all writes, it proposes commands to the others, and ultimately it can serve an authoritative, up-to-date read. It's the right replica to ask for the record you actually want. It's a key concept, and it's really what allows us to get atomic replication of commands: I just ask the Raft leader to do something, it goes off and works with the other replicas, and once two of three have committed, the Raft leader knows it and can say, I've done this. This distributed consensus is such a key piece of distributed systems. So that's a little bit of Raft; that's what we chose to implement. There's another protocol called Paxos, which is popular with some others. To me, these are key concepts for understanding distributed systems; we use Raft a lot within CockroachDB, and it's core to what we're doing. So where are we putting this data in a cluster, and how do we survive failures? Here's basically how it works when we write data to a four-node cluster: I write the first range, then the blue range, then the red range, and we've distributed this data evenly, from a volume point of view, across the cluster. And remember, this is hundreds of thousands or millions of ranges that we're doing this for. We can also take heuristics on workload: say there's a range with really heavy access; how do we segment that off onto a particular node?
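The quorum rule behind those Raft commits is worth making concrete. This is just the majority arithmetic, not a Raft implementation: a write commits once a strict majority of replicas accept it, which is why replication factors are odd and why "two of three" keeps working through one failure.

```python
# Majority-quorum arithmetic used by Raft-style replication.

def committed(acks: int, replicas: int) -> bool:
    """A command commits once a strict majority has accepted it."""
    return acks > replicas // 2

def max_failures(replicas: int) -> int:
    """How many replicas can be lost while a quorum still exists."""
    return (replicas - 1) // 2

assert committed(2, 3)        # two of three: the write commits
assert not committed(1, 3)    # the leader alone cannot commit
assert committed(3, 5)

# 3 replicas survive 1 failure, 5 survive 2, 7 survive 3.
print(max_failures(3), max_failures(5), max_failures(7))
```

This is also why the talk's replication factors are 3, 5, or 7: adding one node to an odd group raises cost without raising the number of survivable failures.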
Given that this is what's underneath the covers, your primary key is a key piece of making this work. You don't want to use sequential IDs in CockroachDB, because you'll end up with something called a hot range: you don't want all your access going to the very last range in the sort order. You want to use some sort of random value, like a UUID. There are a lot of best practices that come into play when you set the database up. But think with a distributed mindset when you're dealing with a distributed database: do you want to insert 10,000 records in one command, or do 10 inserts of 1,000 records, so you don't have overloaded nodes for a particular transaction? And something else that's really interesting, which I talked about at the very beginning, is this concept of geo-partitioning. Can we overload the key to sort the data in this KV store, so that we can insert, say, a location into it? When we deploy CockroachDB, each node is assigned a region, a location if you will. We can set that location and tie it in at the row level. So there's a country code for the EU: everything that has country code EU now has EU as part of its key. The key for the first dog was 34; now it's EU/34. When my sort happens on the KV store, everything is in order by where things should live, and now I can say: everything with keys under EU is stored in Europe. This is the database tying data to a location, and there are really three reasons to do it. Number one, resilience: I want to make sure I have copies in three different regions, depending on the data.
Number two, speed of access: I want records for EU users to be located in the EU. And number three, data compliance and privacy; often we see people use this to tie data for German users to German servers only. All of this is configurable via some very, very simple DDL within the database itself, so a developer can configure it on the fly. We also rebalance: when I add a new node, the database is smart enough to redistribute the data, so I've not only scaled for volume, I've scaled for transactions as well, because any one of these nodes can serve requests. I can survive the loss of a node. If I lose the ranges that are on node three, the Raft leaders (the ones with the dark ring around them on the slide) know that one of their replicas is gone, because each group has to have a complete replica set of three, and they'll just make a new copy on another node; the database survives this naturally. We can also survive temporary failure: if you lose a node for, say, a minute, we can replay logs and catch that node back up. All of this happens completely automatically, unbeknownst to you: spin the cluster up and it's all just there, happening in the background, without any real configuration. The geolocation stuff takes configuration, for sure, but the rest is all very natural. All right, so I just wanted to give a quick, high-level understanding of some of the core concepts happening under the covers. Now Jim is going to come on and show how this aligns with Kubernetes, because we can survive the loss of a node and Kubernetes can ingest the loss of a pod.
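Before the demo, the geo-partitioning idea from a minute ago can be sketched quickly. The key format and the region codes here are invented for illustration; the point is only that prefixing keys with a location makes a plain lexicographic sort group rows by region, so whole contiguous key spans can be pinned to nodes in that region.

```python
# Sketch: overload the key with a region prefix so sorting groups
# rows by location. ("eu"/"us" and the /dogs/... format are made up.)
rows = [("us", 34), ("eu", 35), ("us", 36), ("eu", 37)]
keys = sorted(f"/dogs/{region}/{dog_id}" for region, dog_id in rows)
print(keys)

# All EU rows are now contiguous in the key space, so the ranges that
# cover them can be placed on European nodes for residency, latency,
# or survivability reasons.
eu_span = [k for k in keys if k.startswith("/dogs/eu/")]
print(eu_span)
```

That contiguity is what lets a placement rule like "everything under EU lives in Europe" be expressed over ranges rather than over individual rows.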
What if we're deployed on Kubernetes? How do we survive that sort of thing, how do we scale, how do we just put up a new pod? So Jim, I'm going to hand it off to you. I hope that was valuable baseline information for everybody; let's go into the demo itself, Jim. Yeah, thanks, Jim. Other Jim. Hold on a second, how do I stop sharing? I think Zoom... oh, there we go. Zoom hasn't changed; I've changed. Yeah. You're still the same wonderful person you always were. I'm Jim One, you're Jim Two. That's right, Jim. Okay. So, I stood up a CockroachDB cluster in GCP, and I did it by following some documents we have on the cockroachlabs.com website: if you navigate to our docs page and look under deploying an orchestrated Kubernetes single-cluster deployment, I went through those steps. Once I had stood that up, this is the DB Console, the administrative tool you can use to monitor what's going on in CockroachDB. You can see we have three nodes here, called cockroachdb-0, cockroachdb-1, and cockroachdb-2, and this is actually running in Kubernetes. I mentioned that CockroachDB runs very well in Kubernetes; I would say we both embrace that distributed mindset Jim was talking about, and a cloud-native approach to doing things. So I'll show you real quick: I run a get pods command, and I have the three CockroachDB nodes running, zero, one, and two. I also have another pod running, which is my client, that I use to run some commands. So let me demonstrate a couple of quick things. One thing I'm going to do is run a command against my client pod: the cockroach command to run a workload called MovR. MovR is like a ride-sharing app that we use to demo capabilities.
So I kick that off. And Jim, really quick: in the CockroachDB binary there are several different workloads you can just run, part and parcel of the binary. You can run TPC-C, there's YCSB, there's the MovR workload, which is a fake app we created. A lot of people do this just to demonstrate or test things out, but you didn't have to create that, Jim; it's just part of the binary itself, which is an important point. Right. And that's one of the reasons we work well in Kubernetes: when you launch CockroachDB, you literally download one single binary, and it contains the code to run the engine, the code to run this monitoring interface, all the workload code, and the client code. It's a nice way to bundle things that works well in Kubernetes. So, back over here, I clicked the Metrics link on the left, and on this main metrics screen there are two things we tend to look at when we're running loads. One is how many statements are running right now; you can see that ramped up a little when I kicked off the workload, so this tells us our throughput. The other thing we see here is our latency: this is showing the 99th percentile of our SQL latency, and you can see it's about seven or eight milliseconds. So we have a workload running. Now, over here in Kubernetes (I'll just remind us what pods we have), I can simulate a pod or a node going down by just deleting a pod. I'll delete cockroachdb-2, and we'll see that it gets deleted. If I come back over to the DB Console, a couple of things happen. One is that we see we still have three nodes, but two of those are live and one has gone into a not-good state.
And then we have this other metric called under-replicated ranges. Jim Walker mentioned earlier that by default, data in CockroachDB is replicated three times. So when we get into an under-replicated state, we're saying we're supposed to have three replicas and right now we have two. It's just a warning, not a down condition, because with two of the three we're still able to read and write. In fact, we can see that this workload is still running, still cranking away, and in our metrics we're still seeing throughput, and latency is about the same. Now, in the couple of seconds it took me to explain what was happening with the under-replicated ranges, we can see that this node has come back, and that's because Kubernetes brought it back. I have this set up in Kubernetes as a stateful set: we declare, in that declarative syntax, that we want a stateful set running three pods. So even though I deleted the cockroachdb-2 pod, Kubernetes saw that it wasn't there anymore and brought it back. And because it's a stateful set, when it brings back a new pod, it connects it to the same persistent volume that was there before; the compute it brings up in its place is connected to the same data. So Kubernetes brings things back to a consistent state at the infrastructure level. I think this is a cool illustration of how CockroachDB takes care of replication and availability at the database level, while Kubernetes watches things at the infrastructure level and makes sure we have high availability there.
And if that node had stayed in a dead state for five minutes, cockroach would go into an auto-healing mechanism where it says, hey, I've got these under-replicated ranges — I should have three replicas but I've only got two. It waits five minutes for that node to come back up, and if it doesn't come back, cockroach says, okay, I'm going to recreate that third replica and put those ranges on other nodes in the cluster. So that's a cool feature. Anyway, we can see that our workload kept chugging during that time, and our latencies are about the same. So, another thing I'll do here is demonstrate how we can scale out the cluster. Jim mentioned on some of the slides that we're a distributed database and we can span multiple regions, multiple clouds. The fact that we distribute horizontally is a big deal — it means we don't have that scale ceiling where we scale up, up, up, hit a certain size, and can't get any bigger. We can scale horizontally. To do that in Kubernetes, we run a command to scale our StatefulSet — kubectl scale, the StatefulSet named cockroachdb — and say the number of replicas we want is four. Currently we have three replicas, which are represented here. I run this command, and we can see the StatefulSet says it's scaled. And if I run this get pods command, we now have a cockroachdb-3 that's in a ContainerCreating state. If we keep watching, we'll see that move into a Running state and then a Ready state. But let me watch the dashboard — what we should see here is that cockroach automatically sees the presence of that new node, and a fourth live node shows up here, and a fourth node shows up down here. Let's see how we're doing.
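The scale-out step, plus the knob behind the five-minute window just mentioned — `server.time_until_store_dead` is a real cluster setting that defaults to five minutes; the StatefulSet name and `--insecure` flag are assumptions from a demo-style deployment:

```shell
# Scale the StatefulSet from three pods to four; Kubernetes creates
# cockroachdb-3 and cockroach folds the new node into the cluster.
kubectl scale statefulset cockroachdb --replicas=4
kubectl get pods

# The wait before cockroach re-replicates ranges off a dead node is a
# cluster setting (default 5 minutes):
cockroach sql --insecure -e \
  "SHOW CLUSTER SETTING server.time_until_store_dead;"
```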
So we're in a Running state, waiting to go into a Ready state any second now. One of the things we can look at to monitor this is a graph here that shows us the number of replicas per node. We can see that right now we have three nodes, and they each have 66 replicas, which makes sense. And when we see this other node added, we should see node four show up. Let's see what's going on with this pod. Okay, it looks like when I started this, it failed our readiness probe. Let me just delete that pod to start again. So you're basically restarting the pod — you killed it, and Kubernetes is restarting it, right? Yeah. When I started this pod — there are a couple of probes that happen in Kubernetes: one is the readiness probe, and another is the liveness probe. You have to pass the readiness probe and you have to pass the liveness probe, so let's see if it passes this time. Well, Jim, you know that's the way it's got to work — my slides went awry too, you know. Yeah. I mean, the point is basically that you just scaled, and the database is going to automatically redistribute data — it's going to take care of these things, right? And Jim, you knew this had to happen today for us. Yeah, I know. But what I expect to see is another node show up here, and then these replicas would rebalance — we'd see this fourth node, and cockroach would say, hey, I can move some of these ranges that are on existing nodes over to this fourth node and balance everything. And what you've done is not just scale the storage volume of the database — you've scaled the transactional volume as well, which is such a key thing, and I think one of the things that Jim noted as well.
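The two probes being discussed conventionally point at cockroach's HTTP health endpoints — `/health` for liveness and `/health?ready=1` for readiness on port 8080 — though the delays and periods here are illustrative, not the demo's exact values:

```shell
# Add liveness and readiness probes to the cockroachdb container.
# Liveness checks the process is alive; readiness checks the node can
# actually serve SQL before traffic is routed to it.
kubectl patch statefulset cockroachdb --type='json' -p='[
  {"op": "add", "path": "/spec/template/spec/containers/0/livenessProbe",
   "value": {"httpGet": {"path": "/health", "port": 8080},
             "initialDelaySeconds": 30, "periodSeconds": 5}},
  {"op": "add", "path": "/spec/template/spec/containers/0/readinessProbe",
   "value": {"httpGet": {"path": "/health?ready=1", "port": 8080},
             "initialDelaySeconds": 10, "periodSeconds": 5}}
]'
```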
This is a single binary — any node in cockroach can serve as a reader, any node can take a transaction; the database is smart enough to understand what to do with it. I think another core concept of distributed systems is that basic ability: if you can avoid having two different types of nodes, or two different types of binary, do it. I've seen other databases where they have transaction pods and storage pods, and it gets really complex to configure those things; cockroach takes care of all that through a single binary. One thing that's really interesting is this UI that Jim's talking about, which we're connecting to at — what is it, 34.140-something, whatever that IP address is. That's just any one of the nodes; any one of the nodes will serve up this UI. And that's one of these other innate concepts within distributed systems that I think is really important: have a self-contained binary that is the atomic unit of scale. It's kind of a complex topic, but it's really important to understanding distributed systems — if you can get your unit of scale down to a single binary, do it. Avoid the complexity of having to configure things on the back end and let the system itself deal with that. That's definitely one of those big concepts in distributed systems I adhere to, and I think it's extremely important, because it's going to reduce complexity, right?
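Since every node serves the same DB Console, you can reach it through any pod — a quick sketch, with the pod name assumed from the StatefulSet naming:

```shell
# The DB Console is served by every node; port-forward to any pod
# and open http://localhost:8080 in a browser.
kubectl port-forward cockroachdb-0 8080:8080
```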
So, there's a question about multiple environments — if I have prod, dev, stage, test, and so on. When we deal with these, I think a lot of people are looking at moving data around between these different environments. How do people typically accomplish that using cockroach, while also avoiding storage costs in multiple places? How are you seeing people deal with that? I see some folks that want to take an image of prod and move it back into, say, a UAT environment once a month or so, just to have production-like data in other environments. You can use backup and restore to do that — it's a good way to do it. And I think it's one of those things, Jim — if you're doing backup and restore in a single system, that's one thing. For us, backup and restore had to be distributed. If I'm just doing backup and restore of a single database, sure, I can create a file on a local box or some attached storage. But what if we're doing geo-replication of data? Maybe we don't want to incur egress costs, because we don't want all the data to flow from one region into another, right? So distributed backup and restore is something we actually had to implement. All the stuff that's around your system has to be distributed as well. These things are not easy to do, but they allow you to do this sort of thing. When I think about different environments, it's really configurable on what you want to do. Typically — Michael, I'm having a hard time reading your question, I took my glasses off — it is backup and restore. That's how people are actually dealing with this: you shave it down, you have smaller production-like instances, and you handle that manually.
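The prod-to-UAT refresh described above might look like this — the bucket URL, the `movr` database, and the `--insecure` flag are placeholders, and the `BACKUP ... INTO` / `RESTORE ... FROM LATEST IN` syntax follows recent CockroachDB versions:

```shell
# Distributed backup of a database to cloud storage...
cockroach sql --insecure -e \
  "BACKUP DATABASE movr INTO 's3://my-backups/movr?AUTH=implicit';"

# ...then restore it into the UAT environment under a new name.
cockroach sql --insecure -e \
  "RESTORE DATABASE movr FROM LATEST IN 's3://my-backups/movr?AUTH=implicit'
     WITH new_db_name = 'movr_uat';"
```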
So, I also love this community, Jim, because of course somebody's trying to help us troubleshoot — which is so perfect of CNCF and the Linux Foundation, right? Hey, let's all help each other. Somebody was saying you can restart Docker and the kubelet, and that should fix the readiness probe. I don't know if you fixed it or not, but I just love that somebody had the confidence in our demo that, yep, there was something wrong underneath — it wasn't us, it wasn't our software, it was something else, right? Yeah, so I haven't fixed it — I'm not quite brave enough to try to debug it all and get it working on the fly — but I will say that we have customers on AKS and GCP and EKS and other Kubernetes-based environments, and it works well... unless you try to demo it. Oh, come on. Jim, have you set up cockroach on top of multiple Kubernetes clusters? I've done that with Keith before — what are the complications there? I think that's an interesting concept, for sure. Yeah, if you do run cockroach across multiple regions, like in your slides, Jim — if you ran one in Google, one in Amazon, one in Azure, or something — the way we do that is there's actually a separate Kubernetes cluster in each one of those environments. Then you make sure there's network connectivity between them — you'd set up something like VPC peering between those environments — and then you use a CNI (container network interface) that knows how to talk between them. So the cockroach cluster isn't really aware of the underlying infrastructure; it just knows, hey, there are these other nodes running in other places.
So, I think the Kubernetes community as a whole hasn't really solved this issue — there isn't an agreed-upon way to solve it with some kind of federation of Kubernetes clusters — so the way we do it is we just have three separate clusters. You end up having three separate control planes, so one complication, if you want to call it that, is that if you wanted to scale from three nodes to four nodes in each one of those clusters, you need to do it in each one of those control planes. But yeah, it works well. I think that's an issue that over time we'll see an answer to, where we can do it in one control plane across one federated Kubernetes cluster. Yeah, and I think there's a lot of movement there — if you look at what's going on in the Crossplane community, if anybody's not familiar with that, it's about having a control plane across multiple different clusters; the work that team has done is phenomenal. And if you look at what Cilium has done to help with the networking problem — also very interesting. A customer of ours called Form3 is actually deploying us across multiple providers in production. Form3 handles, I think, millions of transactions per day at this point — they're like a clearing house for financial transactions in England — and they need to mitigate the risk of a cloud provider going away at any moment. So they're using us, and they're using Cilium to solve that networking problem. Because I think that's the big problem, right, Jim? The networking is the piece — when you start to go across clusters, across clouds, that's the big complexity. Yeah.
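The "one operation per control plane" point above can be sketched as a loop over kubeconfig contexts — the context names here are placeholders for your three regional clusters:

```shell
# With one Kubernetes cluster (and control plane) per region, scaling
# the database means repeating the operation against each context.
for ctx in gke-us-east eks-us-west aks-eu-west; do
  kubectl --context "$ctx" scale statefulset cockroachdb --replicas=4
done
```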
The nice thing about Kubernetes is that it's a consistent interface — it's not like everybody's running their own wildly different version of Kubernetes; it makes sure everything's talking to each other in a consistent way. That's right. And the other part about the Kubernetes community is that projects pop up to solve problems. We were using OpenTracing to root-cause issues and performance of cross-cluster queries; we're now implementing OpenTelemetry. If you're not familiar with the OpenTelemetry project, it's another super interesting one — I think it's the second most popular project in the CNCF now. So we're using all those things, and I think that's just part and parcel of the overall community. So, Jim, I'm going to round this off with one last part. If there are any other questions, please do dump them into the Q&A — Jim, just interrupt me if something interesting comes through; I'm only going to have one screen up today. So, just real quickly, we talked a little bit about this. Visually, this is the way things used to work: when I used to code, there was an app server and a database — often this was just FoxPro running on my laptop. I'm joking. There was a database somewhere, there was a server, there was my app. The problem with these things is the round-trip times. Think about somebody in New York accessing a database that's in a cloud region in South Carolina, or wherever US East is. Well, the speed of light is no joke — we're talking about a 70 millisecond round trip between the coasts. From a human point of view, what's discernible as real time is somewhere around 100 milliseconds — the concept of the 100-millisecond rule, which the Google team actually worked out as they were building Gmail.
But what happens if I have a transaction that takes multiple trips back and forth? Consumers don't have the patience, and further, you have timeouts — there's a bunch of different things that can go wrong. Also, think about the way we backed up our systems: we had to use asynchronous replication, so you have these active-passive setups. What we're talking about with a distributed database is active-active — everything's active all the time, everything can serve a query. I could lose a node and it's completely fine. If you lose your primary and you have to fail over to a secondary, what's your RPO and RTO in those situations? How do you remediate when the primary is ready to come back online and you have a bunch of transactions that happened on the secondary? What if it's hundreds of thousands of transactions in those five minutes, or in that hour, or whatever it is — how do you reconcile the two? What is the right state of your data? There are lots of issues here. This approach has worked for a long time for all of us, but now we're proposing the concept of active-active across multiple nodes. This is just a better way of thinking about not only latency, but also survivability. Say I have users on the East Coast and the West Coast, and someone in Phoenix is accessing US West. It's a seven millisecond round trip to LA, and then another 24 milliseconds up the coast. Well, what happens if US West goes down? I have copies of that data elsewhere, and I basically have the same quality in my latency — it's the same amount of time, and I don't even have to hop through LA anymore. Before, I was hopping seven milliseconds to LA and then another 24 to get up the coast. What does my math say that is — 31, right?
What if that region just went away and it was a simple 31 milliseconds from, say, Phoenix to the other region? I now have the same latency — and I completely survived the failure. There are lots of different ways to configure the database to deal with these things. And finally, as Jim was talking about, and as I talked about: don't federate clusters — distribute the data. I find this to be a very interesting approach to the multi-cluster problem. Maybe you still have to manage two clusters, but you don't have to manage them from a data point of view, and the application layer can treat it as a single cluster, because this is a single logical database spanning multiple Kubernetes clusters. I think that's one of the unique capabilities of CockroachDB. We have a Kubernetes operator, as I said, that handles a lot of the day-two operations, and we're continuously working on it — continuously releasing things to help you deploy, manage, and do rolling upgrades. The same way you do rolling upgrades across pods, you can imagine cockroach spinning down a node and bringing it back up with a new version. I think we're backwards compatible to — Jim, is it two major versions or three? I think it's two major versions, right? Yeah. So that'll be a whole year. And it survives pod failure, so — we've been at this for a while, and we feel we have a cloud-native database that is purpose-built for Kubernetes. If anybody's interested in learning more, our documentation is amazing. There's also Cockroach University — we're building lots of courses and adding new ones regularly. If you want to learn about cockroach, we have some general-purpose SQL material and distributed systems material — a great resource — but our docs are really great, and if you want to get started today, you can try this out.
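The rolling-upgrade pattern just described can be sketched like this — the image tag is illustrative, and in practice you'd check the version-compatibility notes (and use the operator if you have it) rather than patching by hand:

```shell
# Point the StatefulSet at a new cockroach image and let Kubernetes
# replace pods one at a time while the cluster stays available.
kubectl set image statefulset/cockroachdb \
  cockroachdb=cockroachdb/cockroach:v23.1.11
kubectl rollout status statefulset/cockroachdb
```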
CockroachDB in the cloud is a service you can get up and running in minutes. We have a serverless version of this which completely automates scale — we believe it will change the face of the database, because you don't have to deal with any of this back-end configuration. But this was us running CockroachDB on Kubernetes — our Cockroach Cloud is actually all Kubernetes underneath, so we've learned a lot about this stuff over the past couple of years, and we're ready to share it with anybody. Okay, well, that's all I had. Jim, anything else before we take off? I don't think there was anything else in the Q&A, was there? There was one other question, but I answered it — about how Kubernetes keeps up with IP addresses. Okay, cool. Everybody, thank you for joining. I hope this was valuable. Jim and I love talking about this stuff, and I hope that comes out in the way we speak about it. Jim and I are happy to answer any questions anybody has. I'm jim at Cockroach Labs — Jim, what did you end up being? With an email, it's jim h — Hatcher, my last name. We're the Jims, though, so engage us in our Slack channel, reach out, download it, use it, go check out our docs — I think that's the best way — there's lots of information on how to do this on Kubernetes. We really do feel this is the right database for that environment. So thank you, everybody, for joining, and back to the Linux Foundation to wrap things up. Thank you so much, Jim and Jim, for your time today, and thank you to everyone for joining us. Just a quick reminder: this recording will be available on the Linux Foundation's YouTube page later today. We hope you will join us for future webinars. Thank you so much again, and have a wonderful day.