Well, anyway, for anybody trying to listen online, I am sorry about the microphone trouble — entirely my fault. But we haven't gotten to the good stuff yet. So, back to Kubernetes, which has become ubiquitous. Another interesting thing about Kubernetes in relation to databases is that Kubernetes was designed first for stateless applications. In the early versions of Kubernetes, running a database on Kubernetes would have been an unusual thing to do, because the database is where the state is. The idea was: we'll have databases somewhere else, and that's where state will be kept, so our applications in Kubernetes can stay stateless. But as with many things, it evolved, and people started wanting to run databases on Kubernetes too, so they wouldn't have to manage two different kinds of environments. Kubernetes has become much better in recent years at handling stateful applications such as databases. You can see that in Kubernetes operators, which are a way to run databases and are now available for most popular open source databases, and not just those. You can also see recognition of that effort in the community. Many of you have probably heard of the Data on Kubernetes Community, which has been going strong: a few years ago, when Percona joined this effort, there were fewer than 100 official members; now there are more than 5,000. So there is a lot of interest, not just in running databases on Kubernetes, but in actually participating and finding ways to make it better.
What is interesting here — and these are stats from 2022; unfortunately I don't have a more recent report — is that a lot of companies participating in this community are already running a number of workloads on Kubernetes, and a lot of them expect to increase that. What I think is also interesting is whether it ends up working for them, and what we see is that people moving data-intensive workloads to Kubernetes, or starting to run them there, are pretty satisfied — typically answering "very satisfied" or "somewhat satisfied." And here are some more stats for you, in case you need to convince your boss to let you try running a database on Kubernetes. These come from the CNCF — so not from the Data on Kubernetes people, also known as the Data on Kubernetes zealots, but from the overall Kubernetes space. What we can see is that open source monitoring, databases, and messaging are some of the fastest growing workloads. Interestingly, all of them are pretty much data intensive: even open source monitoring is going to store and process a lot of observability data, so it includes some sort of data store. There are some further interesting stats about data-related technologies on Kubernetes: databases — by which people here mean operational databases — and analytics are very common, and the chart shows what percentage of respondents run different workloads on Kubernetes. Okay, now let's get into some specifics.
I hope that gives you some stats on why it is a good idea to run a database on Kubernetes. Now, I mentioned operators. What exactly is the operator concept, and why is it important? If somebody is responsible for database operations — making sure the database runs healthy and so on — what do we call that person? In many cases, an operator. You could also call them a DBA. That concept has now been translated into software: the operator is a piece of code responsible for doing everything that needs to be done with the database. Why do we need that? Well, think about a simple application deployment: say you want to deploy 10 copies of a stateless app server. You can deploy them in any order and kill them at any time — it's all very loose and easy. That is not how things work with databases. You have to exercise a certain care: if you are deploying a cluster, it has to be deployed in a particular order; if you want to perform a rolling upgrade, you need to do it carefully. So with databases, there is a lot of logic involved in managing them appropriately — upgrades, backups, scaling up and down, and so on.
What used to be done manually by a human operator can now be coded into that operator concept. When we talk about databases — and many applications in general — we often talk about day one and day two. Day one is installation and initial configuration. Day two is the rest: the rest of the database's life, where we have to take care of backups, scaling, self-healing, upgrades, and so on. What is very interesting with databases is that they spend the vast majority of their life in day two. With a database, you generally can't say, "well, if it's sufficiently messed up, we just wipe it and start from scratch," the way we often can with applications — these days, redeploying clean from scratch instead of modifying things in place is a very common approach for application code. And that day-two work is what operators are very good at. The alternative approach to deploying a database on Kubernetes is to just take, say, a MySQL Docker container and throw it into Kubernetes. You can do that, and it simplifies day one, but it is not really useful for day two, which is where most production databases spend the vast majority of their life. At Percona, we have built Kubernetes operators, and we've been doing that for a while — our MySQL operator, for example, was released as GA years before Oracle released theirs. I have also made a little tutorial for those who don't have a lot of Kubernetes experience, to show you exactly what you can do with it.
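To make the operator idea concrete, here is a rough sketch of what a declarative database deployment looks like. It is modeled loosely on Percona's XtraDB Cluster operator, but the exact field names and schema here are assumptions for illustration — check the operator's documentation for the real spec. The point is that you declare the cluster you want, and the operator handles the ordered bring-up, upgrades, and day-two tasks:

```yaml
# Illustrative only: a custom resource in the style of a database
# operator. Field names and API version are assumptions, not the
# exact Percona schema.
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  name: my-cluster
spec:
  pxc:
    size: 3                  # operator deploys 3 nodes in the correct order
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 20Gi
  backup:
    schedule:
      - name: nightly        # day-two tasks are declared, not scripted
        schedule: "0 3 * * *"
```

Compare this with what the same setup would take as hand-written StatefulSets, init scripts, and cron jobs — the operator encodes all of that logic for you.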
You may be pleasantly surprised at how easy it is to do many things with Kubernetes operators compared to what it would take to deploy, manage, scale, and execute backups on a conventionally built cluster. What is also interesting — and why you can see this database-on-Kubernetes approach winning a lot of hearts and minds — is that a lot of database-as-a-service solutions built recently actually use Kubernetes as a backend. Here are just some examples of vendors with a Kubernetes backend; they typically use either open source or proprietary operators to do it. That tells me two things. First, it works: if it were regularly blowing up customer databases, we would probably hear about it. Second, it is reasonably efficient: if it were, say, 10 times slower than running databases on VMs, that would obviously be a problem. And I think that is an important point about the recent state of Kubernetes: the performance overhead of Kubernetes compared to, say, VMs is actually quite minimal. Okay, let's now talk about specific practices. I would encourage you to be active: if you violently disagree with something, just tell me "this is wrong" — that will be helpful, and I can probably do a better presentation next time. The first practice is something I already mentioned: use operators. When you are looking at different options to deploy the database you plan to run on Kubernetes, look for good, full-featured, stable operators out there and use them rather than rolling your own deployment. The second one, which is very important, is to set up for high availability.
Of course, that applies if you care about availability in production. An important reason is that relying on a single instance in Kubernetes is simply not a good idea. I've seen a number of people say, "we deployed a single instance on a VM or a bare-metal machine, it works, and we haven't had a problem in years." Well, remember that Kubernetes was designed around treating your nodes as cattle, not pets: if one of them dies, it's not a personal tragedy to shed a tear over. But if that node is the only one where the database runs, it can be a problem. The second thing, which is also very important, is that you want your data to be persistent. The concept here is different from a VM, and especially from a physical box: there, you have a hard drive, and until you wipe it, the data stays. Kubernetes is designed for stateless workloads, so unless you specifically state that you want to keep that state on persistent volumes, it will disappear when the container is killed. For persistent volumes, you can use either local disks or fast remote storage, and they obviously have different properties. If you use something like EBS, it has a certain level of redundancy built in, but it's also rather expensive — and also rather slow. Here is an interesting chart I grabbed from one of our folks' presentations, comparing local NVMe storage with provisioned-IOPS storage, which is another way to get good performance for I/O-bound workloads.
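The persistence point above comes down to mounting a PersistentVolumeClaim, so the data outlives any one container. A minimal sketch, assuming a hypothetical StorageClass name — substitute whatever your cluster actually provides (a local-NVMe or EBS-backed class, for example):

```yaml
# Without a PVC, anything the container writes is lost when the pod
# is rescheduled. "fast-local" is a hypothetical StorageClass name.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes: ["ReadWriteOnce"]   # mounted read-write by one node
  storageClassName: fast-local
  resources:
    requests:
      storage: 100Gi
---
# Pod fragment: the claim is mounted where the database keeps its files.
apiVersion: v1
kind: Pod
metadata:
  name: mysql-0
spec:
  containers:
    - name: mysql
      image: mysql:8.0
      volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mysql-data
```

In practice an operator generates claims like this for you, but the mechanism underneath is the same.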
What we can see in that comparison is that storage delivering something like 10 times fewer IOPS still costs almost 10 times as much as local NVMe. So if you use local NVMe, yes, it is non-redundant, which means you want to make sure you have replication at the database level — which you often do anyway — but it is much more performant and cost-effective. The next practice is to keep the data per pod relatively small. This is interesting because a conventional database setup is often big iron: people say, "I have this monster server with a lot of storage, and I run my single Oracle, Postgres, or MySQL instance on it with 50 terabytes of storage," or whatever the number is. That is probably not going to work very well in a Kubernetes environment. Which means, frankly, that not every database is a good fit to move to Kubernetes: in certain cases, such as very large databases, people may not have much success just yet — unless they move to a distributed database, which is designed for smaller nodes. The other thing to consider is appropriate node sizes. What is interesting here is that application deployments — your normal app servers, proxies, and so on — often don't need particularly big boxes, so a lot of Kubernetes clusters out there are engineered with relatively small node sizes. Then you come along and say, "I want to run the database on Kubernetes as well," and guess what: you probably need much larger nodes.
So the shape of your Kubernetes cluster configuration may need to be different: if you are looking to move something like a 64-core database node into Kubernetes, you need to make sure the nodes underneath your Kubernetes cluster have at least that capacity. Practice number six is to configure resource requests and limits. What are those? If you just deploy a Kubernetes pod with no hints about how many resources it needs, it will be scheduled somewhere and will be able to utilize all the CPU resources available, but it will have no guarantees about how many resources it will actually get. In certain cases that is fine — in a development environment you may say, "I just want to use as many resources as possible, and if a lot of people are using the system right now and I get less performance, that's okay." In a production environment, we typically want something different: we want database performance to be predictable, otherwise operations become rather hard. In that case you want to configure requests and limits, and for CPU we often set them to the same number. That makes it work similarly to a VM: if you provision a VM with four CPU cores and you're not overcommitting, you are guaranteed those four cores are dedicated to that VM — you're not going to get more, but you're not going to get less. So there are, let's say, a few configurations you can use: one is shared resources, which is quite usable for development environments; then there are requests and limits set in various shapes.
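A sketch of the "requests equal limits" pattern for a production database pod — the numbers are placeholders to be sized to your workload. With CPU and memory requests equal to limits, the pod gets the Guaranteed QoS class and behaves much like a fixed-size VM:

```yaml
# requests == limits gives predictable, VM-like performance
# (Guaranteed QoS class). Values below are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: postgres-0
spec:
  containers:
    - name: postgres
      image: postgres:16
      resources:
        requests:
          cpu: "4"         # scheduler reserves 4 cores...
          memory: 16Gi
        limits:
          cpu: "4"         # ...and the pod never gets more, like a 4-core VM
          memory: 16Gi
```

For a development environment you might set a low request and no limit instead, accepting variable performance in exchange for better utilization.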
The danger I want to highlight here: say you have eight-core boxes, and you allocate six CPU cores to each database pod. Only one such pod can be scheduled per node, physical or virtual, and the remaining cores are wasted. So with databases we often want to emulate a VM-like environment, where a single database pod runs on a single node, and in that case we want to keep the limits pretty close to the node's allocatable capacity. Or, if you want two pods per node, make sure they fit nicely without a lot of wasted capacity. Okay, next: use proper anti-affinity. What is anti-affinity? Let's say I have a three-node database cluster that I want to be reliable. I want those three nodes to be on different physical servers, even if those are chopped up into a number of different VMs underneath — and maybe I even want them in different physical racks. Anti-affinity rules are how you specify that, to make sure you don't have all your eggs in one basket. How far you take it depends on your requirements; in the cloud, if you are looking for a very reliable cluster, you would often use the concept of availability zones and put different nodes in different availability zones. Next: wherever your database runs, Kubernetes doesn't make it magical. Some operators may apply their own tuning on top of what a stock Linux package provides, but in general you still need to tune your database for optimal results — indexes, and especially queries, still need to be taken care of as usual.
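The zone-spreading idea above can be sketched with a `podAntiAffinity` rule in the pod template; the `app: my-db` label is hypothetical — match whatever labels your operator or deployment applies. With the `required…` form, the scheduler refuses to place two replicas in the same availability zone:

```yaml
# Pod-template fragment: spread database replicas across zones.
# Swap topologyKey for kubernetes.io/hostname to spread per host,
# or a rack label if your nodes carry one.
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: my-db        # hypothetical label; match your pods
          topologyKey: topology.kubernetes.io/zone
```

The softer `preferredDuringSchedulingIgnoredDuringExecution` variant exists too, for when you would rather schedule suboptimally than not at all.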
For most of you that goes without saying, but I've seen a lot of junior developers who, whenever we talk about some next generation of technology, expect magic: "finally, I don't have to learn about all those pesky database internals — it will just do everything automatically for me." Perhaps some things, but not everything. One of the tools you can consider for that is PMM, an open source tool from Percona which has a lot of deep insights into MySQL, MongoDB, and Postgres to help you with tuning settings, queries, and so on. The next practice: it's a very good idea to understand what options the database you're using has to scale. Some databases — distributed databases — can help you scale both reads and writes. A lot of the databases designed for the cloud in the last few years — or, by now, probably the last decade — tend to be distributed databases. Things like MySQL and Postgres were designed many decades ago for a single node, with clustering and sharding added as an afterthought. So if you are thinking about databases that can scale out: there is TiDB, for example — do we have anybody from TiDB here? Oh, look at that, we have a bunch of folks from TiDB, so if you want to talk about TiDB, come see them. And also PlanetScale — yes, folks from PlanetScale are here too, great. Any other distributed databases? Yugabyte? Yes, YugabyteDB — that's another cool distributed operational database. So if you are running one of those distributed databases, guess what:
you can scale the number of nodes and scale both reads and writes. Otherwise, if you are running something like MySQL or Postgres, you will often be limited: to scale writes, you need to scale up — go to larger nodes, provision more IOPS — and to scale reads, you can probably use read replicas. For non-relational databases like MongoDB and Cassandra, there is typically also some sort of sharding setup. Again, the point is to understand how your database scales, so you don't run on assumptions like "I can start with one node, and if I grow my database to 50 terabytes it will just magically become a distributed database." Maybe not. Also: control eviction priority. Kubernetes evicts, or can evict, pods from nodes for various reasons, and database pods are expensive to evict. When you restart a database node, it often has to do crash recovery if it was just killed; and even if it was shut down properly, it will often need a warm-up period to fill all its buffers and caches. So with a database, you don't want Kubernetes doing what it likes to do — shuffling things around all the time. Yes, sometimes you want to evict a pod and relocate it, but you want to tune the priorities. Next, exposing the database: this is an important one. From a security standpoint, you only want to expose the database outside of the Kubernetes cluster if you actually need to.
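Two standard knobs cover the eviction point above (all names here are placeholders): a high-value PriorityClass so database pods are among the last candidates for preemption, and a PodDisruptionBudget so voluntary disruptions such as node drains never take down more than one replica at a time:

```yaml
# Database pods referencing this class resist preemption; the name
# and value are illustrative.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: database-critical
value: 1000000
preemptionPolicy: Never      # this class won't preempt others itself
---
# With a 3-node cluster, voluntary evictions may remove only one pod
# at a time, preserving quorum while nodes are drained.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-db-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-db             # hypothetical label; match your pods
```

Note that a PDB governs voluntary disruptions only — a crashed node still takes its pod with it, which is another reason to replicate.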
A lot of bad security incidents happen when people expose a database outside of their secure environment just because they want to access it more conveniently from somewhere — and then bad things happen. Kubernetes is no different. Encryption: data at rest and data in transit — you want both encrypted. The good thing about encryption, or rather the common misconception about it, is that encryption these days is pretty cheap. Modern CPUs have special instructions to encrypt data very quickly and efficiently, and that also applies to TLS for data in transit. So you typically just want to turn it on for extra security. That is usually a much better and safer option than trying to pick and choose — "I need to encrypt this data but not that data, this connection can go unencrypted" — because mistakes can be very costly, and the savings from not encrypting are relatively small. On other security matters, Kubernetes has a pretty good built-in concept called Kubernetes Secrets. It's a very good way to pass database access credentials to your application instead of committing them to source code and so on — again, a very good practice — and you can look for operators which also support integration with Kubernetes Secrets for the credentials they use. Backups: we spoke about clustering and the high availability you need in Kubernetes clusters, but that does not eliminate the need for backups, which you want to make sure you set up. As I said, keep them off the node and off the cluster: the majority of folks keep backups in S3-compatible storage or some other object store, which works pretty well. Now, I already flagged some of the folks here — Vitess, PlanetScale, Neon, Yugabyte, TiDB.
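The Secrets practice above looks roughly like this in YAML — credentials live in a Secret and reach the application via environment variables (or a mounted file) rather than source code; every name below is a placeholder:

```yaml
# A Secret holding database credentials; stringData is base64-encoded
# for you by the API server. Values here are obviously placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  username: app_user
  password: change-me        # in real life: generated, rotated, never committed
---
# Container fragment: inject the credential without hardcoding it.
spec:
  containers:
    - name: app
      image: my-app:latest   # hypothetical application image
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
```

Operators typically generate Secrets like this themselves when they provision a cluster, which is what the integration mentioned above refers to.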
I think these are all pretty cool databases to consider, because they have all been designed with this new cloud native world in mind — unlike MySQL and Postgres, which are fantastic databases but, let's face it, were designed for a different world and have simply been adapting to what the new solutions provide. Number 16 is to consider your CPU choices carefully. In particular, if you are on Amazon, look at the Graviton ARM CPUs: they are pretty cost-efficient and relatively power-efficient, and there is a lot of development going on there. Number 17: pick the Kubernetes deployment that is right for you. I have seen teams say, "oh my gosh, managing Kubernetes — no, we don't want to do that, it's yet another burden." In that case you can use a managed Kubernetes solution, which is now available from a variety of clouds. Again, it won't eliminate the need to know Kubernetes in order to use it, but it can reduce some of the toil that comes with Kubernetes management, and maybe provide some cool automatic additions: for example, with managed Kubernetes on AWS you can say, "manage the scale of my Kubernetes environment automatically — if I need more resources, scale the cluster automatically," which can be very cool. Other folks prefer not managed Kubernetes but deploying Kubernetes on VMs, and that has its own benefits: if you have a particular blueprint for how you deploy Kubernetes and which Kubernetes version, and you want that to be the same on Amazon, Google, and on-prem, then using one of the Kubernetes distributions and deploying it in all those places may be the better choice for you.
Number 18: make sure you monitor utilization — the utilization of your nodes. It is very easy in a Kubernetes environment to leave a lot of resources not just unused, but not even allocatable — like in that picture I showed you, where nodes end up with slack that cannot possibly be allocated given how things are configured. So that is a very important thing for saving costs. Also, a Kubernetes environment with databases deployed in it tends to be much more complicated. If you deploy a Linux package, you typically have some idea of what it installs: "I install MySQL, so I'll have a MySQL process running." If you install an operator — even an operator from Percona — you probably won't know everything it installs, because it will install some proxy for load management and then various other bits and pieces. So I think it is very important to have tools that give you the overall picture of what runs and what talks to what. There are a number of tools for that; one of them — an open source project I'm helping with — is called Coroot, which is quite helpful for automatically identifying what runs on a Kubernetes cluster using eBPF, so it doesn't rely on you installing any additional agent. That is how you can discover what you actually get when you deploy some operator, how those components talk to each other, and, if there are problems, where they come from. Okay, with that, let me ask you a question.
So I went through my tips, and probably some of you Kubernetes experts are sitting there thinking, "oh, Peter, you forgot this very important thing." Any ideas — anything I forgot, any best practice you find very important? [Audience comment.] Yes, absolutely. I think that is an interesting point: because data is so easy to destroy in a Kubernetes environment, you want to be very careful with it. And maybe I wasn't clear about the NVMe point: of course, you cannot rely on a single NVMe device. But what often happens is that when you deploy, say, Postgres, you wouldn't deploy a single node — you would deploy a replicated cluster with three nodes or so. Each of them has its own local NVMe storage; yes, that storage may fail and kill the node, but there is redundancy because of the two other nodes. What do you get with something like EBS instead? Essentially three nodes, each of whose data is additionally copied internally however many times EBS replicates it. That is the trade-off we are talking about. And never say never: in certain cases your workload may not even be I/O-bound to begin with, and then this is less of a problem. [Audience question about upgrades.] If you look at upgrades, that is where the cloud is actually very nice, because you can expand and then shrink the environment. What we like in this case: for a minor upgrade, operators can typically just run it as a rolling upgrade in the cluster — low risk, all good. For a major upgrade, it's often better to provision fresh.
The new cluster gets provisioned, you set up replication, make sure everything works, and then you do a cutover — that kind of approach. [Audience question.] Yes, I think that is a good idea. We see a mix: in a number of cases people run a special-purpose database cluster, because the node shape databases need can be different from what is optimal for applications. [Audience question about storage.] Look, I know there is a lot of innovation going on in storage; in practice, we typically work with whatever the customer has. [Audience question about caching.] Oh yes, absolutely — I didn't mention caching, partly because it's not really very Kubernetes-specific, but there are a number of caching solutions out there, which can be quite database-specific. Okay, folks. Now, contrary to popular belief, that is not the end of the presentation — I just wanted to take a brief pause, and in the next few minutes I'll get to the next stage of the talk: what is coming. One thing is what you can do beyond Kubernetes operators. What we see is people who are used to something like RDS saying, "yes, it's all good and fantastic, but having these YAML files that you `kubectl apply` is not the UX I'm used to. I want something where I can use my click-ops skills and deploy a database in a few clicks." I think that is quite important for certain users. We at Percona are building an open source product for that called Percona Everest, which is currently in beta, and I would encourage you to try it out and maybe give us some feedback — or maybe some code — so we can make it better.
Finally, I wanted to highlight a couple of things which I think are coming in the future and which are important. We are working on this stuff, and I know other people are working in this space as well. One interesting challenge is deploying a database across Kubernetes clusters. We see a lot of people saying, "I don't believe a single Kubernetes cluster is universally available; I want to make sure I survive a full Kubernetes cluster failure." There are some solutions coming up — this one, for example, has been getting a lot of traction recently, and there are others as well. With the Percona operators, we do support replicating between clusters: many of our customers keep a node or a cluster in a different Kubernetes cluster for disaster recovery. But there is no single, easy multi-cluster orchestration just yet. Another thing which is simple but, I think, very nice is the work going on around automatic volume expansion. Databases tend to grow, and often what you want is to tell the Kubernetes operator, "start with a 10-gigabyte volume" — on EBS, say — "and then grow it as needed," similar to what EBS and comparable technologies can do natively. What is fantastic here is that we are getting universal support for this in Kubernetes through standard interfaces, so cloud vendors and other storage vendors can provide that capability. And I think that is very important for storage innovation: with the Container Storage Interface (CSI), we get access to those advanced features in a universal way.
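As a sketch, volume expansion hinges on the StorageClass opting in; after that, growing a database volume is a one-line change to the PVC. The class name is a placeholder, and the provisioner shown is AWS's EBS CSI driver — swap in whichever CSI driver your cluster runs:

```yaml
# A StorageClass that permits growing its volumes in place.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-ssd
provisioner: ebs.csi.aws.com    # example CSI driver; use your own
allowVolumeExpansion: true      # the key line: expansion is opt-in
```

With that in place, `kubectl patch pvc mysql-data -p '{"spec":{"resources":{"requests":{"storage":"50Gi"}}}}'` asks the CSI driver to grow the underlying volume, and the filesystem is resized without recreating the pod — the piece an operator can then automate based on disk-usage thresholds.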
So we don't have to code specifically for some storage. Autoscaling, obviously, is another thing very much wanted by some people: how can we transparently scale our databases up and down, like we see in the cloud? There are a number of options. And I think a related friend of autoscaling would be auto-suspend for development: I deploy a database that I only use every so often for development, so why can't it just automatically shut down if it's inactive for a certain amount of time, and when I try to connect to it, it's brought back up in the background? Then I don't have to think about spinning that node up and down myself. That's another thing many people have been thinking about. Finally, ease of migration. Whenever you introduce new technology, you want to make sure it's easy to migrate to it, and that's something we see operators building various tools for. For example: you can seed the database from a backup taken from a database that was not running on Kubernetes — you put it on S3 and initialize your cluster from that backup. Or you do a rolling transition: you have a cluster, and you start some of its nodes in Kubernetes, until you finally migrate there completely. So there are various approaches to make migration both easy and non-impacting to production, without downtime, and there is a lot of work going on there. Well, with that, that's all I had, and I think we have a couple more minutes for questions if you have any. Well, so if you think about suspend, in many cases the idea is this:
If you look at a lot of classical databases, we don't have a clean separation of storage and compute. So what you would have is some sort of proxy, and the proxy knows: if a connection comes in and there's no backend, because the backend is down, I can bring up that backend and then hand the connection over to it. That's one approach. If you look at some of the new-generation databases like Neon, they do have a separation of storage and compute, so it happens more automatically — you can essentially spin up the compute very quickly. Okay? Oh yeah, are you the next speaker? You are, oh okay. And you start at 11? No, 11:15. Okay, good. Well then, thank you, folks. [Microphone and sound check while speakers swap over.] All right, everyone. We are ready for our next talk, which is using Kubernetes with distributed SQL databases — please get settled for Denis Magda. [More sound check and projector troubleshooting; the screen has dropped the signal, so the laptop is reconnected.]
[Projector troubleshooting continues; the HDMI signal comes back.] But right now the etcd community is not growing, it's not thriving. There are a few maintainers who have their primary jobs and also need to look after etcd. And sometimes it takes a lot of time not just to introduce new features and capabilities to etcd, but just to do the basic maintenance work that Kubernetes and other projects need. So those are at least two reasons that I'm aware of; there might be others. Eventually the Kubernetes community created a project called Kine. And what does Kine do? It's a translation layer between Kubernetes and a relational database. Why is it necessary? Let's say you belong to one of those companies that found a reason to replace etcd with some other database, like a relational database. Kubernetes itself does not support other databases by definition. So if you want to replace etcd, you need to use some translation layer. If you use Kine — and Kine is one of the well-known community projects — you still have Kubernetes, and Kubernetes continues using the etcd API. Whenever you deploy a pod, a service, or a deployment, Kubernetes makes calls through the etcd API, trying to talk to its etcd database. But if etcd is not there, Kine intercepts those etcd calls, and it can route them to, let's say, Postgres or MySQL instead of etcd. So that's what Kine does: it intercepts all those API calls from Kubernetes and can use Postgres, MySQL, or another relational database as the metastore. Throughout the presentation you'll see how we get to a distributed database — because that's the agenda next.
That was the quick introduction to why we need this conversation and how we are going to get to a Kubernetes deployment on a distributed SQL database. What's coming next? We're switching to another tab in my terminal window, and we'll first deploy Kubernetes on PostgreSQL, so Kubernetes will be using Postgres as its own metastore. After that we'll do one extra step: we will deploy Kubernetes on a multi-region, distributed version of Postgres — YugabyteDB, by the way. So Postgres, as Peter was saying, is a database that was designed for single-server deployment. However, these days you usually want something that is scalable and highly available, and the Postgres ecosystem has many solutions for that. One of those solutions is YugabyteDB. I'm talking about YugabyteDB because the company pays me, or because I'm bullish about Postgres and I believe in YugabyteDB — take whichever explanation you like most. But generally speaking, for me as a developer it's much easier to use YugabyteDB because, even though it's distributed, as an application developer I have just one connection string to my database, and I don't care how the data is sharded or distributed, or how the fault tolerance works. For me it's just one database connection, the same as with Postgres or MySQL. All right, let's do this. So what am I using today? My laptop, but I also deployed three VMs on the United States West Coast. I have a virtual machine running here in us-west2, that's Los Angeles; another virtual machine in Oregon, us-west1; and us-west3 is Salt Lake City. I'm already connected to us-west2. Yeah, that's my machine in us-west2, somewhere in Los Angeles, and I will be using it to deploy my Kubernetes instance. I also already have Postgres running on this machine.
Remember, the first step is: let's deploy Kubernetes on Postgres and see how it works. Let me connect to this instance from this window. Postgres actually comes pre-installed with Ubuntu, and I'll connect to it this way: I'm using psql, connecting as the postgres user. I'm connected, and as you can see, my database is empty — no tables, no data. So now I want to start Kine. If you Google for Kine, you'll find the GitHub project; it's part of K3s, the lightweight version of Kubernetes. And take a look at this extensive documentation — well, at least it's straight to the point: Kine is the translation layer that allows you to use Kubernetes with SQLite, Postgres, MySQL, and NATS. How do we deploy it? Let's actually start with Kine alone first, before we deploy Kubernetes with Kine over Postgres. I need to go to my local folder because I have already cloned Kine — I'm here. It's written in Go. I'm starting Kine and asking it to connect to the following endpoint. The beginning of the string is the backend, because Kine supports several relational databases: I'm using Postgres, and this is a hint to Kine — hey, please start with the Postgres backend implementation. Then: where is my Postgres? That's my Postgres username and password, and that's my Postgres instance; it runs locally on the same machine. And I'm connecting to the postgres database. It started, it's working, and it says Kine is available. Remember, over here I'm connected to my Postgres instance, and if I check the relations right now, you can see that Kine successfully connected to my local Postgres database and created these two relations.
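The standalone Kine invocation described above can be sketched like this. Connection details are illustrative; Kine's `--endpoint` flag takes a `backend://user:password@host:port/db` string, where the scheme selects the backend implementation.

```shell
# Start Kine against a local Postgres: "postgres://" picks the Postgres
# backend, the rest is an ordinary connection URI.
kine --endpoint "postgres://postgres:secret@127.0.0.1:5432/postgres?sslmode=disable"
```

Once it reports it is listening, anything that speaks the etcd API (such as Kubernetes) can be pointed at Kine, and the writes land in Postgres.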
Actually, you have just one table that stores all the metadata generated by Kubernetes, plus this sequence. The database sequence generates IDs for the events, because every event created by Kubernetes — whether you deploy a service, a deployment, or whatever — has a unique identifier, and Kine generates those identifiers with the database. If you look at the structure of this kine table, there's no rocket science: you have the event ID, the name of the event, when it was created or deleted, different revisions, and also a number of indexes created to expedite some of the queries. Okay, so for instance: I can select id and name from the kine table and get the last three. Yeah, we have only two events so far — when Kine started, it wrote a few events into this table. So that's how it works, but it's not Kubernetes yet. So how do we start Kine with Kubernetes? Let me drop this table for now — drop table kine cascade — because I will be restarting Kine. It's empty again. If you check the documentation, it says you can certainly run Kine as a standalone process next to a standard Kubernetes deployment. But today, for the sake of simplicity, I want to use a lightweight version of Kubernetes called K3s. Does anybody actually use K3s? Yeah? Then I'll be speaking your language today. A quick intro for those who don't know: K3s is a minimalistic version of Kubernetes, and what's valuable for today's conversation — look at this component — is that Kine comes prepackaged with K3s. That means I can just start K3s, pass information about my datastore endpoint, and everything is going to work fine, I hope. It used to work for me in the past. Let's deploy it. How do we deploy it?
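The inspection steps described here might look like this in psql (table and column names as created by Kine; credentials are illustrative):

```shell
# Structure of the kine table: event id, key name, revisions, indexes.
psql -U postgres -c '\d kine'

# The three most recent events Kubernetes wrote through Kine.
psql -U postgres -c "SELECT id, name FROM kine ORDER BY id DESC LIMIT 3;"

# Reset before restarting Kine under K3s.
psql -U postgres -c "DROP TABLE kine CASCADE;"
```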
So what I will do: I will use curl to connect to this web address and download the installation script — this is one of the standard installation methods for K3s. Then I will start K3s in the server configuration, passing some essential settings. That's not a real secret token — you'd want something stronger — but what's interesting for us today is the datastore endpoint. The endpoint I'm passing here looks exactly the same as the one we used for the standalone Kine deployment. Basically, when I deploy this Kubernetes instance on my local machine, this endpoint gets passed through, Kubernetes sees that the user wants to use Postgres, goes through Kine, and Kine eventually connects to my Postgres instance. Let's start it. It downloads something, creates things, and now it's starting K3s. We already have those two relations. It started, meaning it's running as a daemon on my operating system. Now let's check id and name from the kine table, ordered by id descending — we want to see the latest events generated by Kubernetes first — and print the last three. Take a look at this number: Kubernetes was started barely a minute ago and has already generated almost 100 events. It's a chatty system — it runs silently, but it's always doing something, and I don't have anything deployed yet. And this number will keep growing. For instance, I'll use the \watch operator of psql to keep executing this select every two seconds, and you'll see that something is always happening, something is being registered — probably some heartbeats or whatever; I'm not the Kubernetes expert here. And if we do kubectl get nodes:
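The installation step described above is, roughly (token and credentials are placeholders):

```shell
# Download and run the K3s installer in server mode, pointing the
# datastore at Postgres. K3s routes all state through its bundled Kine.
curl -sfL https://get.k3s.io | sh -s - server \
  --token=demo-secret \
  --datastore-endpoint="postgres://postgres:secret@127.0.0.1:5432/postgres?sslmode=disable"

# Verify the node came up.
k3s kubectl get nodes
```

The `--datastore-endpoint` string is the same format as Kine's `--endpoint`, which is exactly why running Kine standalone first makes the K3s step feel familiar.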
We have it: it's running, it's ready, on my machine. So that's how easy it is to deploy Kubernetes on Postgres, with Postgres used as the metastore. But there is one problem with Postgres when you compare it to etcd: etcd is scalable and highly available, and Postgres, by default, is not — it's a single-server instance. With etcd you can deploy Kubernetes across multiple servers, across multiple availability zones, and that's probably what you want to do, because if a machine goes down you don't want the control plane to go down. But with Postgres you need to find a high availability solution. And as I said, there are many — well, several — options in the Postgres ecosystem that will help you make Postgres scalable and highly available; they're just not built into core Postgres. So let's use YugabyteDB. First, I'll destroy the Kubernetes instance we just created. I no longer have K3s installed: if I execute this command right now, you'll see — no, Kubernetes is not found. And here you can see that Postgres no longer gets any events from Kubernetes, because I stopped it. Let me just drop the tables; I no longer need Postgres. Now, about YugabyteDB. We will start the transition to YugabyteDB with the following command, and then I'll explain a bit more. Again, I'll be using curl to download the same K3s installation script, our lightweight version of Kubernetes. For the datastore endpoint I'm saying: when you start, use the Postgres backend, the Postgres implementation in Kine. But this time you will be connecting to the following instance — and it's not a Postgres instance, it's a YugabyteDB instance. I'm using not the strongest password in the world, but that's my database endpoint, and this IP address is here — it's the same virtual machine.
And I am deploying a multi-node YugabyteDB cluster. Let me actually show you that cluster, because one of its instances is running locally. YugabyteDB exists in several forms: it's an open source project, like Postgres, so you can download it, install it, use it, and not pay anything; or you can consume it as a cloud-native, fully managed database — it's up to you. I have the yugabyted tool and I want to check the status of my database. I know I store the configuration of the cluster in the following local directory, so I can go to that directory and get the information from yugabyted. No, it needs to be this way. So what do I have? A three-node cluster running; it's available and I can connect to it. What we'll do right now: let me use psql to connect to my YugabyteDB instance — yeah, to this one, with this port number, and the user is yugabyte. I'm connected to YugabyteDB, and since YugabyteDB is a distributed version of Postgres, it supports the majority of tools, frameworks, and libraries created for Postgres — psql is no exception, so I continue using psql. I don't have any data in YugabyteDB, in my distributed cluster. So now let me start Kubernetes on YugabyteDB. You'll see that Kine also connected to YugabyteDB and has already created the tables. What I want to do now: let me actually show you that YugabyteDB cluster and give you a slightly more advanced intro. YugabyteDB comes with a UI. So what do I have? I'm running a three-node cluster in a multi-region configuration: one database node is deployed here in Los Angeles, another in Oregon, and another in Salt Lake City. Why do I do this? Because I decided I want to tolerate region-level outages.
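The status check and psql connection shown here can be sketched as follows (the base directory is illustrative; 5433 is YugabyteDB's default YSQL port and yugabyte its default user):

```shell
# Cluster status as seen by the yugabyted management tool; the base
# directory is wherever this node's configuration was created.
yugabyted status --base_dir="$HOME/yb-data"

# Plain psql works because YugabyteDB speaks the Postgres wire protocol.
psql -h 127.0.0.1 -p 5433 -U yugabyte
```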
Region-level outages happen, and if, let's say, us-west2 in Los Angeles goes down for any reason, then — assuming I also deploy Kubernetes itself across regions, which I don't do during this demo — my database will stay available with no data loss, and I'll be able to execute my application workloads. Also, when it comes to multi-region configurations: this is a distributed database, and by definition the nodes of a distributed database are interconnected through the network, so you already take that network hit, the latency hit. When you deploy across multiple regions, the network impact on latency can be even higher, which means you need to find ways to minimize it. For today's presentation I decided to define a preferred region. By default this is a distributed database, and when I have this kine table, or any other application table you create, it will be sharded: your data is split into shards, those shards are distributed across all of your nodes, and then you can load-balance your read and write requests. That's how you utilize the capacity of the entire cluster. But for my use case today, for the sake of high availability, I certainly want a copy of the data stored in all of the regions — yet I want to define a preferred region: the region whose nodes handle all the reads and writes by default, where the primary copy of the data lives. So I selected us-west2 as the preferred region, which means that when Kine and Kubernetes execute their quite advanced selects with multiple joins, et cetera, all of those requests will by default be handled by the database nodes deployed in my preferred region.
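Setting a preferred region as described can be done with the yb-admin tool. A sketch: the master addresses and the cloud.region.zone value are illustrative and depend on how the nodes were started.

```shell
# Pin tablet leaders to us-west2, so Kine's multi-join reads and writes
# are served locally instead of hopping between regions.
yb-admin --master_addresses="$MASTER_ADDRESSES" \
  set_preferred_zones gcp.us-west2.us-west2-b
```

The other regions still hold replicas for failover; only leadership, and therefore default read/write traffic, is concentrated in the preferred region.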
I don't want those joins to span multiple regions — so that's what I did. This is the new UI that exists in YugabyteDB, but we also have a slightly older one — I call it the veteran UI — which shows a bit more advanced information. That's my node in Los Angeles, and you see this leader preference priority: it means this node belongs to the preferred region, so by default reads and writes will go to this node. If, let's say, the data center or region in us-west2, Los Angeles, goes down, then priority number two kicks in and that node becomes the next preferred one. In this configuration I'm running just a three-node cluster, but you can deploy as many nodes as you like in every region — it's up to you. Okay, enough. Now, when we go back here, you can see that Kubernetes started successfully on YugabyteDB. Let's check it. How about we run the same request: get the latest events generated by Kubernetes, ordered by id descending, limit three, and watch — please keep repeating this request for me. Yeah, Kubernetes is up and running, and I can see that something is happening in the cluster. And if we do k3s kubectl get nodes: we have the node, but we don't have any pods, we don't have anything else yet. Generally speaking, it's easy. How long did it take us? We had 30 minutes, and probably five minutes went to troubleshooting — so 25 minutes to deploy Kubernetes, first on Postgres and then on a distributed multi-region database. It's quite straightforward, and we used Kine, something created and maintained by the Kubernetes community. But here's a question: how many of you are DevOps people responsible for running Kubernetes? And how many of you are application developers who deploy onto Kubernetes? Yeah.
So for you it's interesting to know those internals, but let's say that eventually we want to deploy our stuff on Kubernetes. Let's deploy something to make sure this configuration actually works. What I did — you can probably come up with a better sample, but I just Googled and found a repository with Kubernetes sample applications, and I want to use one today. I will use this command to deploy the Emojivoto example. Has anybody heard of this example? Me neither. But let's try to deploy it right now. And here — pay attention — you can see that something is happening, events are being generated: we are deploying Emojivoto. So you can see what's happening inside: we application developers deploy something on Kubernetes, that information is stored in the metastore, Kine intercepts the etcd API calls, and eventually all this data is stored in YugabyteDB and replicated across multiple regions. So you don't lose this data even if one of your database nodes goes down and you have a meltdown in a region. Let's restart this. Now, if I do this, let's get information about what was deployed: we have several pods — several microservices — and then services, deployments, and replica sets; a pretty standard configuration. And this one is the web component. We can actually do an extra check: the application was deployed on this Kubernetes configuration, but now let's make sure the application itself responds. I know I can connect to this endpoint and check that we're good to go. Let's use an HTTP GET on port 80. Yeah, it's all good — we got a response from the application, HTML text. So that's how it works. That's it. It just works, and it's quite simple and straightforward.
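The deployment step might look like this, assuming the Emojivoto sample is the one published by the Linkerd project (the manifest URL and namespace are assumptions; other copies of the sample may differ):

```shell
# Deploy the Emojivoto sample app into the K3s cluster backed by YugabyteDB.
k3s kubectl apply -f https://run.linkerd.io/emojivoto.yml

# Inspect what landed: pods, services, deployments, replica sets.
k3s kubectl get pods,svc,deploy -n emojivoto
```

Every object created here becomes rows in the kine table, which in this setup means rows replicated across all three regions of the database.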
However, there was one final item on the agenda: do we need any Kine-specific optimizations for distributed SQL? Because, as you saw, today I didn't do anything special. I just started Kubernetes using Kine, the version you download from the project's upstream, and Kine kept using its Postgres backend — and that Postgres backend works out of the box with the distributed version of Postgres. However, to be brutally honest with you: a friend and colleague of mine — he's sitting here — half a year ago, we had this gig where we would find some interesting, well-known application written for Postgres and check: all right, does it work? If not, let's try to troubleshoot it. And we came across Kubernetes, we came across Kine. When we were deploying Kubernetes with Kine on YugabyteDB half a year ago, sometimes the deployment would fail, and sometimes it would succeed. Why would it fail on some occasions? We decided to check, and it was related to the SQL requests generated by Kine. Kine internally has just one table, but if you look at the logs, or at the implementation of the project, you'll find that it generates massive SQL requests that need to join the data. And sometimes during the startup of Kubernetes on Kine, something would blow up and Kubernetes would not start. Eventually Franck and I forked Kine: we created the indexes a little differently and enabled some optimizations for distributed SQL databases, and that version of Kine and Kubernetes started working on YugabyteDB. Today, when I was rehearsing for this presentation, that was no longer necessary. So does it mean we can just go ahead and use the current version of Kine? Probably yes, it's good to go.
But there is one important lesson for us developers. When you're transitioning from a relational database — you already have some application written for that database — even to a database like YugabyteDB that is Postgres-compatible, feature- and runtime-compatible, it's highly likely that you can take your application, connect, and it will just work. Still, run some load testing, run some performance testing, because you might discover that you need to optimize some of the requests, some of the workloads you created for Postgres or MySQL before. So that's the point of distributed SQL: it's the same SQL that works like a charm, it allows you to scale beyond the capacity of a single server, and it allows you to tolerate various outages because the data is distributed across multiple nodes. And as an application developer, your learning curve will be much, much shorter: if you already know SQL — if you're a Java developer using Hibernate or Spring Data, or some other ORM for your programming language — it's going to be an easy transition. But still, run load and performance testing, because you might want to optimize some of the requests once your database runs in a distributed environment. And speaking about Kubernetes: in its original configuration Kubernetes is great, and it uses etcd by default. But there are use cases where etcd is not enough — it can be scalability issues, some companies have concerns about the state of the etcd community, or you might have some other reason. Keep in mind that the Kubernetes community has a special project, Kine, that allows you to replace etcd with another metastore, another database: it can be Postgres, it can be MySQL, it can be YugabyteDB — whatever you prefer.
Or — and you heard the presentation from Peter, who hinted and suggested it — you can probably run that database on Kubernetes too. Going once, going twice. All right, thank you, Denis. Thank you for the amazing presentation. [Sound check.] Everyone, we'll be starting our next talk, titled MySQL Connections: Handling and Pooling, by Mathias Cronville. All right, welcome everyone. So, my name is Mathias Cronville. I am an enterprise customer engineer at PlanetScale — still at PlanetScale, which is good. I'll be talking today about MySQL: high availability, connection handling, connection pooling, things like that. There will be time for questions at the end, so if you have questions, please save them for then. And with that, let's get started. So, people always ask me: is it "my sequel" or "my-S-Q-L" or whatever? The official way to pronounce it is "My Ess Queue Ell", but we're open to all other suggestions. This is from the manual — the link is there, so you can check. Today we're going to have a brief introduction to MySQL and how it came to be that I'm talking here about connections and connection pooling — because life is short, and why would you talk about connection pooling? Then some challenges around connection management, then a little bit on how PlanetScale supports connections, and at the end we'll have time for questions. So, MySQL is still the top open source database. We're losing ground to Postgres, but looking at the DB-Engines ranking, we're still the biggest one. Founded in 1995 — a long time ago, 30 years next year. Currently Oracle Corporation owns MySQL, and that's been the reason for many of the flame wars out there on the internet; we'll see how that develops. And currently it is still 8.0 — that's the one you should be using in production. If you're using anything older, please upgrade; anything newer, please be careful.
In 2019 it won DBMS of the Year on DB-Engines, and it's widely used by the top websites in the world — think Facebook, Twitter, Booking.com; all of those very large web applications use MySQL. And this is also why connection handling and connection pooling are kind of important: web traffic is very unpredictable. If you have an in-house application, you can pretty much predict how many users you'll have, because the number of employees is probably the maximum number of connections. But if you have a web application like Twitter, and someone does something stupid or says something nice, you might experience quite a spike in connections and traffic. Why did MySQL get so popular? Mostly, initially, because it's free and open source software, and it's very easy to install. At my first job we were using MySQL, and all the system engineers had to do was apt-get install mysql-server — done, works. If you grow a little bit, it becomes a bit harder to install and manage, but it can still be done. And it's able to handle a lot of connections — if you ran an Oracle database for the same number of connections, you'd pay an arm and a leg on licensing alone, and MySQL is free. It was also very easy in the early PHP-and-MySQL days to set things up: PHP came by default with the MySQL connectors enabled, whereas if you wanted to use Postgres back in the day, you'd have to recompile PHP with all the flags to support it and be able to connect, which was a big pain. So that's why MySQL really gained a lot of popularity. And then, finally, replication — and we're going to go a little deeper there. Replication is a way to have more copies of your data that you can actually use: you always have a single primary you can write to, but you can have a lot of replicas. There are three dots here, but you probably can't see them with this lighting.
You can have a lot of replicas — there are three here, but you can have more — and you can use all those replicas for read scaling. You write to that one primary, but most web traffic is reads: if you think about a website like Booking.com, how many hotels do you browse before you book one? Reads are probably an order of magnitude more than writes. So only the writes need to go to the primary, and all the reads can go to the replicas. The replicas can also be used for high availability and failover: if something happens to your primary, you can promote a replica to become the new primary and continue operations like nothing happened. And you can use replicas for backups, so you don't have to burden the primary with taking backups, which is a nice thing to have. A few replication modes. Traditionally, replication was asynchronous: the primary executes your transaction and doesn't care whether any of the replicas catch up. At commit time it writes the transaction to the binary log, and then the replica, at its own pace, reads that binary log and applies it to its local copy of the data. In an ideal world you keep the replication delay — how far the replica is behind the primary — under one second, but it can grow if the replica can't keep up. That's why semi-sync came to be: we wanted to make sure that at least one replica had received the transaction before the primary would continue, okay, finish committing the transaction. That gives you more crash safety: if your primary crashed and no replica had received the transaction, it would not have been committed and so would not have been acknowledged to the client. And at the end you have synchronous replication. It's virtually synchronous, because replicas can still be behind, but in this case the cluster solution you use makes sure transactions are synchronized across the cluster — though not necessarily all applied everywhere yet.
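The semi-sync behavior described above is enabled in MySQL with server plugins. A sketch using the plugin names introduced in MySQL 8.0.26; older versions use the rpl_semi_sync_master/slave names instead.

```shell
# On the primary: load and enable the semi-sync source plugin, so commits
# wait for at least one replica to acknowledge receipt of the transaction.
mysql -e "INSTALL PLUGIN rpl_semi_sync_source SONAME 'semisync_source.so';"
mysql -e "SET GLOBAL rpl_semi_sync_source_enabled = ON;"

# On each replica: the counterpart plugin, then restart the IO thread
# so it reconnects with semi-sync acknowledgements enabled.
mysql -e "INSTALL PLUGIN rpl_semi_sync_replica SONAME 'semisync_replica.so';"
mysql -e "SET GLOBAL rpl_semi_sync_replica_enabled = ON;"
mysql -e "STOP REPLICA IO_THREAD; START REPLICA IO_THREAD;"
```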
So now we come to the point where we talk about connection handling. MySQL has a thread-based architecture, as opposed to Postgres, which has a process-based connection model: if you look at the process list of a Linux server running Postgres, you see all the connections as separate processes, while on MySQL you just see a single mysqld process and nothing else. But if you then look into the threads, you see one thread per connection inside MySQL. That's a good and a bad thing at the same time. The good thing is each connection is fully isolated, so the threads can work at their own pace. The bad thing is that you cannot do multi-threaded transactions; MySQL doesn't support that because of this architecture. So whenever a client tries to connect to MySQL, its connection request is queued up at the front, and there's a receiver thread working inside the server that handles the incoming connection requests. It checks whether a thread is available in the thread cache; if there is one, it takes it, assigns it to the connection, and that is the thread that will handle the connection. If there's no thread available in the cache, it creates a new one, which is a bit more expensive, though on modern systems not that expensive anymore. On older systems the thread cache was very important; nowadays it's not that big a thing anymore. And then, and this is going to be important, there is what's called a thread object, assigned to each of the user threads. That thread object maps one-to-one to the user's thread, and it basically holds all the memory that this thread will be using. We'll come back to why that is important later. By default, and the most common example here is PHP, connections are short-lived.
So you connect to your database, you do your work, and at the end you disconnect: that's how a short-lived connection works. Your client sends a query, or multiple queries, gets results, and at the end of the page it just disconnects. At the disconnect phase, the thread object gets deallocated, so any memory it was assigned is freed back to the OS by the server. Long-lived connections, by contrast, are kept around indefinitely, and so the thread object is also kept indefinitely: until the connection is actually closed or recycled, that thread object will be there and will be holding memory. So if you have large thread objects across many connections, that's a lot of memory, and you can run into OOM issues. In an ideal world the threads just keep going forever. Unfortunately CPUs are still limited — there's no unlimited number of threads you can run — so it's basically the OS that decides when a thread gets time on the CPU; CPU time is shared, and threads need to wait until it's their turn. Then there are a few things a thread can wait for. Mutexes are one of the important ones in a database: if something has to make a change to internal structures in the server, a mutex is requested and, hopefully at some point, granted. That means that while waiting for the mutex to be granted, the thread is blocked, and once the mutex is granted, other threads waiting on that same mutex will be waiting for this thread to finish its work. And then another typical thing in the database world is locking. It's important to be able to lock records in the data while changes are being made so you don't overwrite each other's changes: there are data locks when you do inserts, updates, and deletes, and there are metadata locks when you need to do DDL changes like CREATE, ALTER, DROP.
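That mutex behavior — one thread holding an internal lock while the others queue up behind it — can be sketched in a few lines. This is a toy illustration of the general mechanism, not MySQL internals:

```python
import threading

counter = 0
counter_mutex = threading.Lock()  # stands in for an internal server mutex

def worker(increments: int) -> None:
    global counter
    for _ in range(increments):
        # Every thread must acquire the mutex before touching the shared
        # structure; while one thread holds it, the others block and wait.
        with counter_mutex:
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All 40,000 increments survive because access was serialized by the mutex.
print(counter)
```

Without the lock, the four threads would race on the shared value; with it, they serialize — which is exactly why a hot mutex inside the server turns into a queue of blocked threads.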
And then finally, there's also I/O that you might have to wait for. A disk is the typical example, but a network connection is also something you may have to wait on. Nowadays with SSDs, disks are better and performance is better, but there's still a limited throughput, so it's always going to be something to consider. And then memory allocation is another challenge we've already touched on briefly. The thread object holds the per-connection buffers. Short-lived connections carry a lot of overhead from repeatedly allocating and freeing that memory, but long-lived connections can also create big memory pressure: if you have a lot of long-lived connections, the memory they hold on to can grow and you might run into memory issues there. And then finally, another challenge with connections is service discovery. We've talked about replication, but you have to know where the primary is: if your application needs to connect there, you have to know where it is. If you want to use the replicas for read scale-out, you have to know which replicas are available, where they are, how to connect to them, all of those things. There are a few common techniques: I've seen people use DNS records to track where the primary is, or a floating IP address that you can move across different servers. For replicas, you can load-balance across a pool of replicas, but then you also need to monitor which replicas are lagging behind, which replica is currently being used for taking a backup, things like that. So that's also a big topic. This is where connection pooling can definitely help. A connection pool is typically an application sitting in the middle between your application and your database, and it basically holds a number of connections.
So if you're using Java, say a Tomcat server, or Rails, the framework typically creates a number of connections when it starts up, and when your application then needs a database connection, it asks the pool: hey, I need a connection. The pool either says, here's a connection I have available, or it makes a new one for you. And basically, by doing connection pooling, you make a short-lived connection become a long-lived connection: once your work is done, your application returns the connection to the pool instead of disconnecting, and the pool then handles things like, do I need to recycle memory? Do I want to close this connection or not? That's all up to the pool. But in some ecosystems — if you come from a PHP background like myself — connection pooling wasn't really a thing back in the day, and so you had to use an external application. And this is where ProxySQL as an application is very useful. So what is ProxySQL? It's a high-performance proxy that sits between your application and your MySQL database, and it's designed from the ground up to speak the MySQL protocol. It can load-balance connections between different servers, and it understands the topology: if you send a query and indicate it should go to a reader, ProxySQL can know that from looking at the query, or from whatever rules you give it. It knows whether a backend instance is up or down. It can be configured as a connection pooling application — by default, ProxySQL will keep 10% of the backend connections alive, so if you make no other configuration and just install ProxySQL, it will act as a connection pool on its own. And it has full end-to-end SSL support, which is very useful.
For example, if you have an old legacy application that still needs to connect to a database, but the SSL handshake overhead of doing that in, say, a PHP application is too big, you can set up ProxySQL in the middle to accept the unencrypted connections locally and then connect safely and securely to the database. And this is the use case we currently use it for most at PlanetScale. If we have customers that don't have SSL support available yet, we put ProxySQL in the middle to handle that for them. We keep ProxySQL on their side of the connection, so they can connect to it over their local network, or even a local socket, and ProxySQL handles the encryption that we require on the PlanetScale side. Now, architecture diagrams for deploying ProxySQL — this is the simplest one. You have your clients on one side, the proxy in the middle, and your clients just connect to the proxy, and the proxy connects to MySQL. It's always recommended to have more than one proxy, because otherwise it becomes a single point of failure: if your proxy dies, the game is over. In this case there are three, but you can scale that out as far as you want. The proxies can talk to each other to share configuration, so if you configure one, the others will automatically follow. And then on the backend you have the MySQL servers, grouped into what ProxySQL calls hostgroups: this one has one server, this one has two, this one has three. Important to know is that ProxySQL will send your query to only one of the servers in a hostgroup. So if you decide a query goes to a hostgroup with a single server, there's only one choice to send it to; if you decide it goes to a hostgroup with two hosts, it will still be sent to only one of them.
So it's not a good solution for write load balancing, because you would be writing to different servers and that will lead to split-brain — and that's not where you want to go. If you have a hostgroup for the primary, there should always be exactly one server in that group. Another very common architecture, mostly seen in Kubernetes deployments, is to put ProxySQL where the application is: you ship it as a sidecar to your application, and the application can talk to it locally, over the in-cluster networking or even a local socket, and ProxySQL handles the connection towards MySQL. This is what we typically use when customers have issues with SSL offloading and things like that: the proxy sits very close to the application and then connects onward to MySQL. It's a good architecture, but if you have a lot of pods, it becomes very difficult to keep all the proxies in sync. And so there's a hybrid architecture you can use. The backend of a ProxySQL instance can be another ProxySQL server, because they just speak the MySQL protocol. In this case, all the client-side proxies are configured with the same configuration — just the addresses of the three proxies in the middle — and if something changes on the database backend, the only ProxySQL instances that need to be updated are the ones in the middle. The client-side ones don't need to change; you don't have to sync them with anything. Only if one of the middle proxies changed would they need to know about it, and that doesn't happen often. And because a client-side ProxySQL knows when a middle proxy goes down — say you need to do updates or whatever — you can take one of the middle proxies down, and the client-side ones will know: oh, this one is unavailable right now, so I won't send traffic to it.
And when it comes back, they see: oh, it's coming back, so I can start using it again for sending connections. That's a good thing to have. Then we come to PlanetScale, where I work. PlanetScale is built on Vitess, and Vitess was originally developed at YouTube — so this is an application that was cloud native before cloud native existed. The first release of Vitess came just weeks before the first Kubernetes release. And the people at YouTube, which was also a Google company, talked to the Kubernetes team, who were their colleagues, and said: whatever we built here, is this actually ready for whatever you're building? And it was. It was a thing before Kubernetes came to be. We currently run a managed service in the cloud, publicly offered on GCP and AWS. There's no reason we couldn't do Azure, there's just not been demand for it — so if anyone wants it... We are fully scalable, and we run everything on Kubernetes: all our databases run as pods in Kubernetes. Vitess and VTOrc, one of the components of Vitess, provide us with high availability: if something happens to the primary, we can either let Kubernetes reschedule it to another node and continue working, or the orchestration tool, VTOrc, will detect the failure and promote a new primary from one of the replicas in the cluster. And on top of that — this is why it was developed at YouTube in the first place — there's sharding. As you may know, YouTube has grown a little bit over the last decades, just a little. So to keep scaling, they quickly realized they would reach the capacity limit of the servers they could buy, so they needed a distributed system where they could spread the data over multiple servers. Vitess natively supports sharding, and thus PlanetScale also supports sharding. This is the Vitess architecture diagram.
So all the way on this side you have your application servers, and they connect to a load balancer. This is something you can roll your own; on the PlanetScale side, we have our own custom load balancer that we offer. If you want to run Vitess on premise, you can just put an ELB, or a TCP load balancer in Google Cloud, in front of these components, which are called VTGates. VTGate is basically our proxy: your application connects to VTGate, and VTGate accepts MySQL connections and tells you, hey, you're talking to a MySQL server — and everything behind it just looks to your application like one giant MySQL. In fact there might be hundreds of shards with hundreds of MySQL instances, but to your application, Vitess says: here's a database, talk to it. In our Kubernetes clusters, next to our MySQL processes, we run a VTTablet, basically a sidecar process. Actually, if you think about it, MySQL is more like the sidecar process to VTTablet, but if you're coming from MySQL, it's easier to think of VTTablet as the sidecar. And there are multiple shards: the tables can be sharded across different MySQL servers, and VTGate knows about that because it's all registered in a topology server. We use etcd for that. There are people who use ZooKeeper with Vitess, but that's a minority, and there are still a few people using Consul as the topology server, though we're looking to deprecate that in the future. And then you have vtctld, which is the control plane for all of this: if you want to add a new tablet, you tell vtctld to add the tablet, and it updates the topology server and makes sure VTGate knows about it, and all those things. VTOrc is what I told you about earlier; it's another box that should be in this diagram, but I couldn't find the image for it.
So I left it there. And then on the backend you have MySQL, which is basically still the main storage. Our connection pool is not situated at the VTGate level but at the VTTablet level: VTTablet keeps connections to MySQL open, and the connections between VTGate and VTTablet are basically gRPC requests. That's much easier to handle than keeping MySQL connections open; HTTP-style connections scale a bit better. Then at PlanetScale we added our own edge infrastructure on the load balancer side. Our edge infrastructure is a little similar to what you get with AWS edge locations: if you have an S3 bucket and want to serve it from a local area, an edge location basically caches it close to you. What we do is terminate your MySQL connection at the closest edge we have: if you connect from LA, you'll be terminated in the us-west-2 region to make sure you have the lowest possible latency, and then the edge figures out where your cluster is and connects you over the AWS or GCP backbone to your actual database. So you get the lowest possible latency, which is a good thing. The edge also supports HTTPS endpoints and gRPC connections: if you don't want to use the MySQL protocol to talk to your database, but you want to use HTTP connections, you can — that's supported by the edge. And that's actually also the original use case for VTGate: initially at YouTube, the application talked gRPC to VTGate, and the MySQL protocol in VTGate is something that came later, in a second phase — since we offer a MySQL service, we should probably also be offering the MySQL protocol. And so that's what it looks like in the global routing infrastructure: you have the edge at the closest place, and then you have the databases. The VTGates are the proxies, and your databases are on the backend.
So wherever you come in, you get routed to the right database by the VTGates as needed. And we have a blog post where we took this whole setup to one million connections — and we got there — so if you want to read about how we did that, the post is out there. Any questions? Yeah, that depends on how you configure them. If you use the ProxySQL clustering mechanism, I wouldn't go beyond hundreds. But if you have a better way to distribute the configuration to those proxies than letting them figure it out amongst themselves, you could go to thousands — as long as they don't have to rely on talking to each other. I know ProxySQL was originally developed at Dropbox, and they also have quite a lot of connections, so it's able to scale quite significantly. Let me go back to the slide. There we go. So basically, the reason you put these here is that the MySQL side is more volatile: if there's a failure on your MySQL end, you need to promote a new primary or recreate a replica, and ProxySQL needs to know about that. And if you have thousands of ProxySQL instances to update, that's going to take a while. Yeah. The easy part here is that a client-side proxy is just monitoring the health of the middle proxies: if one of those proxies goes down for an upgrade or for whatever reason, it simply stops sending traffic there. It's only doing a ping — are you alive or not? — whereas updating configuration has to be synced and stored. Other questions? Oh yeah. Yes, ProxySQL performance is really good for that, it really helps. It's basically speaking the MySQL protocol; it does whatever you're doing now, and it also implements the C client library for MySQL — I think it still uses the MariaDB library internally, but same thing, potato potahto. Anyone else?
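For reference, the hostgroups and read/write routing discussed in this talk are configured through ProxySQL's SQL-like admin interface. A rough sketch — hostnames and hostgroup numbers here are placeholders for illustration:

```sql
-- Hostgroup 0: the single primary; hostgroup 1: the replicas
INSERT INTO mysql_servers (hostgroup_id, hostname, port)
VALUES (0, 'primary.example', 3306),
       (1, 'replica1.example', 3306),
       (1, 'replica2.example', 3306);

-- Route SELECTs to the replica hostgroup; everything else
-- falls through to the default (primary) hostgroup
INSERT INTO mysql_query_rules (rule_id, active, match_pattern,
                               destination_hostgroup, apply)
VALUES (1, 1, '^SELECT', 1, 1);

LOAD MYSQL SERVERS TO RUNTIME;     SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;
```

The LOAD/SAVE pair is the ProxySQL pattern: configuration changes sit in memory until you load them to runtime, and persist across restarts only once saved to disk.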
Then we can all go to lunch. Okay. Say hello. Hello, check. Sound check. There you go. All right, I guess we can slowly start, and probably some people will show up later, because it's lunchtime and the expo hall is open, so maybe they'll join a little later. Today we're going to talk about design and modeling with MySQL — really, design and modeling for databases, MySQL and Postgres. I put it this way because this is supposed to be the MySQL track, but it applies to Postgres and other databases too; we'll go through that. But before that, if we haven't connected, we can: my name is on the schedule, or you can send me a request on LinkedIn. My name is Alkin. I work at ChistaDATA — some people call it Chista — as EVP of Global Services, where I handle customer interactions and run a team of DBAs focusing on ClickHouse and analytics. Previously I worked at services companies like Percona, Pythian, and PlanetScale, which we talked about in the previous talk. And in my earlier life I was an enterprise DBA working on Informix, Oracle, DB2, and SQL Server installations worldwide. A couple of things I've done in the past — we'll skip that. And this is where I work. Why do I put this up here? We're still hiring, so if you're interested in slimming down your MySQL or Postgres workloads and running some analytics, you can give ClickHouse a try. We'll talk about ClickHouse a little later, but it's a columnar database, which is outside the scope of this talk, so I won't get into it — check out the ChistaDATA website for the blog posts and more information. There's also a DBaaS you can go sign up for and try. All right — as a captain and a sailor, I have to put one maritime slide into every talk I make.
So today's trivia question for the audience: what is the name of this device or instrument? Does anyone know? Okay, good — all right, you get a beer from me. It's called a sextant. And it has nothing to do with — no, don't think that way. It's an instrument for measuring the angle between two objects. It was used in the 1800s for navigating at sea: you sight the horizon and an object, which could be a star or something else, and with a little calculation behind it you get latitude and longitude, and you can basically find where you are. It's difficult. I have one, never used it. I don't plan to use it — maybe when I retire — but it's something that's out there, so I bought it just because it's nice. With modern technology we use GPS and everything else possible to navigate, and for forecasting the weather and everything, we have very complex and advanced systems. But it's nice — thank you for the answer. Also, we've had a lot of discussions about the community — the MySQL community; as you can see, the room is not that full, maybe there's a little bit of a split. I've taken some time to write a blog post about community and how one can contribute to it. I wrote it for MySQL, because I come from the MySQL community, but you can contribute to any other open-source database or other open-source communities the same way. Please take a look at it; we'll share the slides, so you can get the link. All right, now that we've hopefully broken the ice, let's look at the agenda. Today's subject is database design, and we'll cover some of the scaling and scalability options we have. Some of the previous talks already covered YugabyteDB and PlanetScale with Vitess; I'll touch on those as well, and there will be another talk after this one covering TiDB.
And then we'll talk about integrating databases, and then emerging trends, which is going to be the distributed databases — and if we have time, we can open that up for discussion here. I'll try to keep it light and not boring, because database design and modeling is something you often don't get to do much of: when you end up in a company as a DBA or a developer, there's already some sort of a database there, built from tribal knowledge. Sometimes the business requirements come in so fast that what started as a scratch pad becomes a huge database, which is problematic in the end, because you're the one stuck with it. As a DBA for the last three decades, I've ended up in many places with the same problem. Sometimes I did contribute to the design of a new schema or new database, but often we had to deal with an existing design that was the problem. So take a look at the history of the data model. I come in somewhere in between here. The models that emerged in the 70s were things like the network model, which was basically flat files: reading large files into a data store one by one, sequentially. Some of you may know this if you worked in the COBOL era, the RM/COBOL era — there were just files, read sequentially, maybe with a simple index or not even that. The libraries came a little later: ISAM libraries, which originally became part of MySQL. That's again about 30 years ago. Then the relational model was born, right when databases were becoming a real requirement for businesses. Relational databases became dominant in the 80s, and in the 90s we faced object-oriented programming, which came with object databases.
And in the 2000s we started having graph databases, and now we're in the era of wide-column stores and NoSQL among the emerging technologies. ClickHouse is one of those: a column store, which is the opposite of the row-based systems we'll be talking about. If you look at this history, it takes an emerging technology about 10, sometimes 15 years to go from the inception of the database or data model to commonly used technology. Yes, relational databases arrived in the 80s, but most businesses only started using Oracle, Informix, or DB2 UDB — even SQL Server — in the mid to late 90s. And because of the popularity of the relational model, it is currently still the most commonly used, at around 60%, alongside the other models. The hierarchical, network, and object-oriented models don't actually take up as much space as the relational model, but I think this list could turn upside down in coming years: with the introduction of microservices and the splitting up of data, the data models are slowly changing. And within those, the relational model still exists. So if you look at the relational model, as we all know from MySQL and Postgres, everything is organized in tables, it's row-based, each row has its attributes, and then there are keys around it. We won't get into keys too much in this talk — keying, indexing, and query optimization are a separate subject — but this is the foundation of database design: if you don't have a design that supports your queries and your business rules, you'll have a problem, and not only with MySQL but with any database. Okay. So one of the concepts I'd like to highlight is normalization. It's directly linked to the relational model, because if you have a relational-model database, it needs to be normalized.
This used to be a big thing in the early days, as we've seen in the history of where it was born: data in the earlier versions of RDBMSs was often not normalized, so normalizing it was a real requirement, to make it more efficient and basically consistent in itself. So what is normalization? Normalization is basically the reduction of redundancy: avoiding having the same data repeated, the same attributes duplicated across tables. This improves integrity — you don't have to go and update multiple tables for one change — and it also helps the efficiency of the database you're querying. The other thing linked to the relational model is the ACID properties: we need atomicity, consistency, isolation, and the data should be durable. In order to do all of this in the relational model, you need all these properties to hold across your database operations. So going back to database design: when you're designing a database on the relational model, you have to know the guarantees the database offers you — and MySQL is an ACID-compliant database. As we mentioned, in the new era of databases, some of the emerging technologies like NoSQL or columnar databases don't necessarily comply with the ACID properties, hence they can't really be used for transactional purposes. And then there's the infamous CAP theorem. Along with the ACID properties, a problem you'll have to deal with in database design is the subject of consistency, availability, and partition tolerance. We have to make sure these are also aligned with the database design — the relational model in this case — so that you're complying with these three.
So I will get to where MySQL stands here, but the CAP theorem says you cannot have all three of availability, partition tolerance, and consistency at once. I like to give this example: think about going to a restaurant where the food is amazing, the view is beautiful — maybe seaside or lakeside — and the bill is very cheap. That is not possible. That's what the theorem says. In the same way, trying to have all three properties within the same database will give you a problem with your database design and model. So how does this play out in the NoSQL world? Again, without getting deep into it: knowing these are the properties of the theorem, NoSQL systems offer workarounds such as eventual consistency. You get a database that is consistent, but not immediately — you can't do read-after-write, certain things you can't do. So be aware of that, because what we've seen in the last decade, specifically in the MongoDB world, and with some other databases like Cassandra, is that they were used to map a relational model onto, and it didn't work the way it was supposed to. The CAP theorem was proposed by Eric Brewer, and there are trade-offs between the properties; you'll have to make sure the trade-off you choose is satisfied by your design. Okay, for the data models we have a couple of key terms: entities, attributes, and the relationships between entities. These are things that maybe the new generation of developers and DBAs don't pay too much attention to, but if you have a proper design, these are the things you'll be working with. As for the other models: the hierarchical model, for example, is a top-down model, and there's no relationship from the bottom entities back to the top — it goes only one way.
The network model was also popular in the earlier days and is no longer a thing. The object-oriented model is still worth considering, because object-relational mapping lives on within object-oriented programming — ORMs apply exactly these ideas. Then there's the relational model, which is what relational database management systems and their properties implement, and there's also the entity-relationship model. So you'd have a couple of examples — customers and their shipments, say — where you can have one-to-one, one-to-many, and many-to-many relationships. Visualize those, make sure your database is designed the way it's supposed to be, and that nothing is redundant. You apply normalization, and then you actually have a proper database design at the end. Unfortunately, we won't do a workshop here — that would take a long time. Maybe later on, some time, designing a database from scratch would make a good session, but I do have a quick hint for it at the end of this talk. Okay, so let's go back to the entity-relationship model. You can draw an entity-relationship diagram to understand the database. So, who here has ever made an ERD, an entity-relationship diagram? Okay — we have a few people. Again, as I mentioned, it's not as popular as it used to be. I used to work in large enterprise shops where nothing went to production without an ER diagram, with normalization applied, keys created, verified, checked, and only then deployed.
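Those relationship types map directly onto keys in SQL. Here's a minimal sketch using Python's built-in SQLite purely for illustration — the customer/shipment/product tables are invented for the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- One-to-many: a customer has many shipments
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE shipments (
        shipment_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    );
    -- Many-to-many: shipments contain products, via a join table
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE shipment_items (
        shipment_id INTEGER REFERENCES shipments(shipment_id),
        product_id  INTEGER REFERENCES products(product_id),
        PRIMARY KEY (shipment_id, product_id)
    );
""")

tables = [r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

The one-to-many side is just a foreign key on the "many" table; the many-to-many side needs the extra join table, whose composite primary key also prevents duplicate pairings.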
Today we are maybe not as diligent as large corporations used to be, but if you don't do any of this you are going to face problems as your database grows and evolves over time, which we will talk about a little when we get to scalability: how one single database becomes multiples, maybe thousands of database clusters globally. So how do we do this? We gather the requirements, which the business is supposed to give you: okay, we're going to have a new service, say a new payment system or a new integration with some mailing system. We identify the entities, and we make sure the attributes are identified. Attribute identification means asking what that attribute actually represents; these are your data points. And I think another very important point here is the data type. If you don't want problems in the future, make sure all the data types are applied properly from the beginning. We have seen a lot of cases of "oh, this database is slow" because the wrong data type was used for an attribute, which caused difficulties with indexing, with building relations, and other problems. After all of that we build the ER diagram, we construct it, and then we review and refine. As we build it we apply the normalization techniques, and the ultimate goal is reducing data redundancy. In the MySQL world, interestingly enough, there is a free tool for this, and almost no one uses it. Has anyone here ever used MySQL Workbench? Okay, a couple of people, but you all build databases. This tool has evolved and improved a lot over the last decade. It used to be very simple and didn't have all the bells and whistles, but the latest edition I downloaded and tested is actually pretty good, and it can do a couple of things.
It can do modeling, which is what it's supposed to do, but it also does reverse engineering: you connect it to an existing database and it pulls the schema out and gives you the visual relationship model, which I think is pretty cool, because whether your database is small or big, seeing the relations and how it's built visually is very useful. I did search for a Postgres equivalent; there is none out of the box for Postgres, but there are a couple of third-party tools that can pull this data, and I'm not even talking about the paid versions. Workbench also has a query editor. In the MySQL world we don't necessarily use it for queries or administration, we stay on the CLI as much as possible, but if you're a beginner exploring MySQL, it's a way to connect to a database. It can also do validation: it can check the design across the models and entities. Another thing it does, maybe not mentioned here, is generate the SQL statements: you can design a database in it and tell it to generate the SQL syntax, then copy and paste that somewhere else and execute it. So it's actually pretty good; you don't have to type everything by hand. It's a useful tool. So, we mentioned the ER diagram, we built the database, and we think it satisfies the needs of the business, but we still need to check normalization. In the first normal form we have a primary key for each table identifying each row, and only atomic values in each column; then there is the second normal form.
It goes on like this up to the fifth normal form, but normally you end up around the second or third normal form with an entity-relationship design. Something not mentioned in these slides is denormalization. When the data is broken down into too many pieces you get a different problem. I have been involved in a couple of denormalization efforts where a table didn't have the attributes it needed, so the data had to be gathered from somewhere else, which is an overhead for the database. You also want to minimize joins, and avoid bad queries like Cartesian products caused by carelessly joining tables, so going beyond the third normal form may be too much. Also, this technique is not aimed at a single table: it is about the relations between the data points. Basically you settle on the primary keys, make sure you separate the data into different tables without redundancy, build the relationships, apply these techniques, and remove the partial dependencies that don't belong: no non-key attribute should depend on just part of a key. Okay, so we mentioned the ACID properties earlier, and as part of ACID you also have isolation levels, which again link to the data integrity of our design. We designed a database, we made sure the ACID properties are provided, we paid attention to the CAP theorem, we did some normalization, but we still have to make sure that access to the database is properly isolated. Why is this needed?
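A tiny sketch of what removing redundancy looks like in practice: a flat "orders" table that repeats customer attributes on every row (the kind of partial dependency second/third normal form eliminates) is split into a customer table and an order table that keeps only the foreign key. All the data and names here are made up for the example.

```python
# Denormalized: customer_name is repeated on every order row.
flat = [
    {"order_id": 1, "customer_id": 7, "customer_name": "Acme",   "total": 120},
    {"order_id": 2, "customer_id": 7, "customer_name": "Acme",   "total": 80},
    {"order_id": 3, "customer_id": 9, "customer_name": "Globex", "total": 40},
]

customers = {}   # customer_id -> name: one row per entity, stored once
orders = []      # order rows keep only the foreign key, no repeated attributes
for row in flat:
    customers[row["customer_id"]] = row["customer_name"]
    orders.append({"order_id": row["order_id"],
                   "customer_id": row["customer_id"],
                   "total": row["total"]})

print(customers)       # {7: 'Acme', 9: 'Globex'} — 'Acme' no longer duplicated
print(len(orders))     # 3 — same facts, less redundancy
```

Denormalization is simply the reverse move: copying `customer_name` back onto the order rows to save a join, at the cost of redundancy you then have to keep consistent.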
Because we have concurrent access to the same data set, and if it's being changed by different users and different connections, each transaction has to be isolated from the others. There are different levels of isolation, and in MySQL the default is REPEATABLE READ, which works pretty well. I don't want to get too deep into its internals because that's not part of this talk, but we balance consistency and concurrency with the isolation levels, and MySQL does a pretty good job at that until things get distributed, which was part of the earlier discussions and maybe the next talk too. So choose your isolation level carefully, and at least be aware that isolation levels can be set by a client, which may not be under your control. All right, scaling databases. So we created a model and designed a database considering all the factors we've gone through over the last twenty-something minutes. When it comes to scaling, some of those properties become a challenge. Why do we need to scale a database? Maybe there is exponential growth, the business is booming, so you need to set strategies for scaling: whether you isolate a database per requirement, per region, per country, per city, or use other techniques. This is what you have to consider for performance, reliability and availability. A database that doesn't scale means things falling apart and latency problems; we have seen that a lot, and that's part of the deal. So there are some techniques I will just highlight and give an overview of, for example sharding and replication.
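A conceptual sketch of the isolation-level difference just described, not how InnoDB's MVCC is actually implemented: under REPEATABLE READ a transaction reads from a snapshot taken at its start, so a concurrent commit doesn't change what it sees, while READ COMMITTED always sees the latest committed value.

```python
db = {"balance": 100}          # the "committed" state, shared by all sessions

class Txn:
    """Toy transaction: REPEATABLE READ reads a start-time snapshot."""

    def __init__(self, db, level):
        self.db, self.level = db, level
        self.snapshot = dict(db)              # snapshot taken at transaction start

    def read(self, key):
        if self.level == "REPEATABLE READ":
            return self.snapshot[key]         # stable view for the whole txn
        return self.db[key]                   # READ COMMITTED: latest committed

t = Txn(db, "REPEATABLE READ")
u = Txn(db, "READ COMMITTED")
db["balance"] = 50                            # another session commits a change

print(t.read("balance"))   # 100 — repeatable: same value as at txn start
print(u.read("balance"))   # 50  — sees the concurrent commit
```

The real trade-off is exactly what the talk names: stronger isolation means more consistency within a transaction, paid for with reduced concurrency.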
Replication in MySQL is native, out of the box. The previous talk gave an overview of this, but for those who weren't in this room before: MySQL does asynchronous replication out of the box, and you don't need to buy or set up anything outside of MySQL itself. There is also caching, and there are other options in the cloud for scaling databases within the given context of the model. We identify two ways of scaling: horizontal scaling and vertical scaling. Vertical scaling is the better-known method: you just move to a larger, bigger node. Horizontal scaling can be a little more complicated; you may need other tools, a coordinator, a router, a proxy or something like that. So be aware of that when you get there: you can't just keep buying the largest instance from your cloud provider or the biggest machine you can host. Basically what we are doing here is expanding the capacity of the database we designed. In vertical scaling we have a couple of options: adding a bigger node, increasing the CPU, as we mentioned. The way we see it, vertical scaling is a short-term solution until we have a longer-term roadmap for actually scaling the database. So don't take vertical scaling as a solution in itself; it's usually a short-lived measure that doesn't scale by itself. In the previous talk Matthias also mentioned that a proxy can be a way of scaling the database: out of the box we have replication, and by splitting traffic into reads and writes you can scale the reads using replicas, with your MySQL application going through that proxy.
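A minimal sketch of the proxy-style read/write splitting just described: writes go to the primary, SELECTs are spread round-robin over the replicas. The hostnames are invented, and real proxies such as ProxySQL or MySQL Router classify statements far more carefully than this prefix check.

```python
import itertools

PRIMARY = "db-primary:3306"                              # hypothetical hosts
REPLICAS = itertools.cycle(["db-replica-1:3306",
                            "db-replica-2:3306"])

def route(sql: str) -> str:
    """Send reads to the replica pool, everything else to the primary."""
    if sql.lstrip().upper().startswith("SELECT"):
        return next(REPLICAS)        # round-robin across the read pool
    return PRIMARY                   # INSERT/UPDATE/DELETE/DDL hit the primary

print(route("SELECT * FROM orders"))   # a replica
print(route("UPDATE orders SET x=1"))  # the primary
print(route("SELECT 1"))               # the other replica
```

Note the caveat from the replication discussion: because replication is asynchronous, reads routed to replicas can be slightly stale, which is exactly the trade-off you accept for read scaling.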
There are more complex diagrams on his slides, and I won't go into that much detail, because these are operational concerns on top of scaling, which I link back to our design and model: if your design and model don't fit the solution, it's not going to work. I have seen databases that are just one big single node with a terabyte of memory, where you can't put a proxy or anything in front; there's nothing you can do about it. Proxy solutions are not limited to ProxySQL: there is MySQL Router, which is open source, and people have been using things that existed even before these, like HAProxy or other TCP proxies. MySQL also has its own solution. There are many clustering solutions around MySQL, but one of MySQL's native ones is InnoDB Cluster, which consists of Group Replication, MySQL Router, and supporting tooling around MySQL Shell. So you have a cluster with a primary and secondaries underneath it, and you can also build a ClusterSet on top: this could be US East, US West and central, and you can connect them all, and now you have an InnoDB ClusterSet. This is another way of scaling. It looks like a pretty complex drawing, but all of it is natively supported in the MySQL ecosystem, which is amazing. Group Replication uses consensus algorithms; again, that's not part of this talk, but there are well-known consensus algorithms that make sure these nodes actually work and talk to each other, while still complying with the ACID properties of the database within the relational model.
Then there is Vitess, which we also touched on in the previous talk: a sharding framework made for MySQL. It has similar components: you've got VTGate acting as the proxy, and a load balancer before you connect to the databases, and you can shard. It is intended for horizontal sharding, one to N, meaning each shard is itself a cluster: three nodes, one primary and two secondaries, and if something happens to the primary it just fails over automatically and keeps going. If you are running this with an operator on Kubernetes and all that good emerging technology, a new host is initialized immediately and added to the cluster. So the sharding framework lets you go pretty much unlimited, the only limits being your hardware, your cloud resources, or your credit card. For more you can check vitess.io; I have links at the end. Before going over to serverless: there is TiDB, which the next talk covers with much more comprehensive information. TiDB speaks the MySQL protocol and also solves many of the scalability challenges of the relational model. I have a link at the end but no slide for it. There are also serverless options. Apart from manual implementations, frameworks, open-source tooling, or orchestration built around your database model, you can have serverless MySQL solutions. The most popular is AWS Aurora; Google Cloud has solutions around this too. Pretty much every cloud provider offers MySQL, but not necessarily with scalability options; Aurora stands out there. There is also PlanetScale, which uses Vitess behind the scenes, where you can have a database that scales essentially without limit if that is what the business requires.
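A rough sketch of how Vitess-style sharding routes a row: a keyspace ID (here, a hash of the sharding key) is compared against shard range boundaries such as `-40`, `40-80`, `80-c0`, `c0-`. The boundaries, the four-shard layout, and the use of the hash's first byte are all simplifications for illustration, not Vitess's exact algorithm.

```python
import bisect
import hashlib
import struct

BOUNDS = [0x40, 0x80, 0xC0]                 # upper bounds of the first 3 ranges
SHARDS = ["-40", "40-80", "80-c0", "c0-"]   # Vitess-style shard names

def keyspace_id(user_id: int) -> int:
    """Hash the sharding key; one byte is enough to pick among 4 shards."""
    digest = hashlib.md5(struct.pack(">Q", user_id)).digest()
    return digest[0]

def shard_for(user_id: int) -> str:
    """Map the keyspace id onto the shard whose range contains it."""
    return SHARDS[bisect.bisect_right(BOUNDS, keyspace_id(user_id))]

for uid in (1, 2, 3):
    print(uid, "->", shard_for(uid))
```

Because the mapping is deterministic, resharding means recomputing range boundaries and moving data, which is why the talk calls it painful at scale.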
And you can still use the relational model with all the other properties, including ACID and isolation. What these serverless offerings provide, in the end, is MySQL or Postgres or other database implementations with the burden of operational tasks taken off: everything is managed and automated by the systems hosting them. They are managed services, managed operational databases in the cloud. Okay, so the future of databases. Earlier today, maybe you were here, there was a talk on Kubernetes. There is a sub-community in the Kubernetes ecosystem called Data on Kubernetes. Running stateful workloads on Kubernetes is an emerging technology: you integrate with these technologies and can run different workloads on top of them, and one way is running an operator. An operator runs the database operations for you, inside Kubernetes. Then you have a very scalable and, I think, more modern technology that can be integrated and adapted to other technologies. For example, say I am running workloads designed for and running on Kubernetes, stateless workloads like NGINX or anything else. Where do we put the database? The database needs to sit somewhere, and you don't really want a standalone database outside of your operational Kubernetes cluster. So you have to adapt to these new technologies. Maybe it wasn't a good idea five or ten years ago, but it may be a good idea now. You need to leverage the new technologies and new ways of doing things, which again enables AI, machine learning and all that new buzz. Okay. Why did we talk about all this today? The idea came to me from this book.
I've been working on this book for the last eight months with my co-author, and it was actually good to go back, look at how database fundamentals used to be taught, and focus on that. It's on pre-order right now and will be released this summer, so hopefully you'll get a chance to read it, with much more detail than today's talk. And this was my previous book; we spent about two years on it together with my co-author, Sveta. It also has pretty good information: if you are new to MySQL, or you have challenges with MySQL, it sits among a set of very good O'Reilly books you can take a look at. I put it here so people can access it from the slide deck. And that's it, thank you very much. Are there any questions? I think we have a couple of minutes. Oh, we have a question: yes, partition tolerance. Sunny, do you want to answer that? He's very good at partition tolerance. So: your connected nodes get partitioned, but the system stays operational even though the connection between them is broken. Algorithms like Raft and Paxos, and others similar to them, are basically computer science algorithms that create consensus among the nodes. If you have the majority of the nodes, you can continue; and if you lose the majority, some systems say, I'm not going to do anything, because now we would have a double-headed dragon: we might cause more problems than we solve if the application writes into two nodes at the same time. That's split brain; we've had that in our lifetime too. All right, any other questions? Thanks, Sunny, by the way. Well, thank you very much again, and thanks for listening. [Speaker change] Maybe something needs to be switched on. Okay, we can sort of start, I suppose; eight minutes. Yeah, so we're gonna talk, oh, is this too loud?
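The majority rule behind that answer can be stated in one line: a partitioned side may keep operating only if it still sees a strict majority of the cluster, which is what makes split brain impossible. A minimal sketch:

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """True if this side of a partition holds a strict majority of nodes."""
    return reachable >= cluster_size // 2 + 1

# A 5-node cluster split 3/2: only the 3-node side stays writable.
print(has_quorum(3, 5))   # True  — this side continues
print(has_quorum(2, 5))   # False — this side refuses writes
```

Since at most one side of any partition can hold a majority, two sides can never both accept writes, which is exactly the double-headed-dragon scenario the quorum prevents.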
Or is it only me? It's okay? You're fine. So yeah, we'll talk about going beyond MySQL and what's been happening in this space for the last eight or nine years, because TiDB, and PingCAP, has been around for eight years now and is maturing into a fine product. It's distributed SQL, unlike MySQL, which is a single-node system with add-ons like replication and Group Replication that I'll talk about, and then I'll talk about what TiDB is and how it works. If you have any questions, feel free to ask; you can ask me later, but even during the talk you can interrupt. There aren't that many people, so we can figure it out. A little bit about myself: I work for PingCAP now and have been there for two years. I've been working on databases for a long time; I used to be the lead of the InnoDB team at MySQL, and that's my Twitter handle. We have this raffle going on, so to join it you can just scan the code and it should work. Okay, everyone's done with it? Okay, a little bit of history, and it's pretty long; this is mostly MySQL replication history. MySQL started as a single-node system. In 2000 it got replication, which moved SQL statements from one MySQL server to another using the binlog. It's asynchronous, and it's been asynchronous ever since. Group Replication was released in 5.7. It supports multiple writers, but that has its own issues and they don't recommend it; still, they ship it, so I should mention it. The default is still asynchronous, and the consistency is eventual, not synchronous. Replication lag is an issue. DDL is a big issue if you have large tables, because DDL itself is a single-node operation. You can get stale reads, because it's asynchronous. And the maximum number of nodes in a single group is nine, which I don't think anybody actually runs; most people run four or five, and at seven the performance is pretty bad.
The next evolution in the MySQL space was Amazon Aurora, announced in 2014. The big innovation there was that compute and storage were disaggregated, so you can scale them independently. They built a storage layer that scales out, with one single writer and multiple readers, and they replicate the InnoDB redo log. It's still limited to 128 terabytes, because they use a single tablespace to store everything; InnoDB is actually limited to 64, but with a bit of trickery around bit fiddling they've pushed it to 128 terabytes. It is still a single tablespace on top of their shared storage. I won't go into the details of Aurora; this is just a quick list. There is also Vitess. It's more of a sophisticated proxy over a cluster of MySQL nodes. It requires manual sharding, and the application needs to be aware of that sharding, so your application changes: the sharding ID has to be passed around. Resharding, as I've said many times, is painful at scale when you have to do it. Latency can be lower with this architecture, though: if you're hitting one single node for your request, the latency will be much better with the Vitess architecture than with the others, but you pay the price in how it all works. It doesn't have consistency across all the nodes; it doesn't even support repeatable reads beyond a single node. So when your commit spans multiple nodes from the application side, say three, and two of them commit but the third fails, there's nothing Vitess can do: that burden moves to the application developer to figure out what to do. That can cause bigger problems if you don't understand how it works. So it has its own problems. The next big breakthrough was Spanner.
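The cross-shard failure mode just described can be sketched in a few lines: committing the same logical transaction shard by shard, with no distributed-transaction protocol, leaves earlier shards committed when a later one fails. The `Shard` objects are stand-ins invented for this illustration.

```python
class Shard:
    """Stand-in for one shard's local commit; no 2PC, no coordination."""

    def __init__(self, name: str, fail: bool = False):
        self.name, self.fail, self.committed = name, fail, False

    def commit(self):
        if self.fail:
            raise RuntimeError(f"{self.name} commit failed")
        self.committed = True                 # locally durable, can't be undone here


shards = [Shard("s1"), Shard("s2"), Shard("s3", fail=True)]
done = []
try:
    for s in shards:                          # sequential per-shard commits
        s.commit()
        done.append(s.name)
except RuntimeError as err:
    # s1 and s2 are already committed; reconciling is now the app's problem.
    print(f"{err}; already committed on {done}")
```

Distributed transaction protocols (two-phase commit, Percolator, Spanner's approach) exist precisely to make this all-or-nothing instead of best-effort.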
Spanner used the same idea as Aurora, disaggregated compute and storage, but with proper distributed transactions: you can have multiple readers and writers, and the storage itself can scale almost infinitely; obviously there are limits, but it really was a breakthrough. TiDB was inspired by Google Spanner. The TiDB founders were working for an internet company in China, and these were the good times, around 2010 to 2012, when everybody was scaling fast, and they were constantly sharding and resharding MySQL to handle that scale, and they thought there must be a better way to do this. When they came across the Google Spanner paper, it inspired them to try something in the MySQL space that could scale like Spanner. The current architecture is version two: version one combined the SQL layer with each storage node, and that didn't scale well, so they changed the architecture to what it is today. So what's TiDB's unique value? It's open source, it's reliable, it works at hundreds of companies if not thousands now, especially in China, but also here: Databricks, Pinterest, a few others we can't name, it's working here in the US too. It's multi-tenant ready; for example at Airbnb, I think, they consolidated something like 440 Aurora nodes into 50 or 60 TiDB nodes and consolidated their applications to run on that cluster. It's versatile: gaming, logistics companies, web-scale companies, any kind of load you throw at it, it does pretty well. It's MySQL 8 compatible, and it's really trivial to set up and start. If you want to try it today, just go to tiup.io, download it, do a copy and paste, and it'll be up and running in five minutes. So it's very easy to set up.
The same tool is used to scale it out to hundreds if not thousands of nodes, and you can start with the sandbox immediately if you want; it's really easy to set up. These are the companies using it in production, and many of them are running it at 800, 900 terabytes, and at that scale there are many things MySQL is not so good at; DDL, for example, is a big problem, and I'll get into the details of how TiDB handles all that. So first the design fundamentals, then a bit of resource control and how you can use it for multi-tenancy and other things, how online DDL works, and some of the tools that are part of the TiDB ecosystem. This is our reference architecture, and there is one part I want to highlight: TiFlash. TiFlash is what gives TiDB the ability to use the same data for analytic workloads. In the same query, the TiDB optimizer can send part of the SQL request to the OLTP nodes, which is TiKV, and anything that can leverage the column store, with its own MPP engine, goes to TiFlash. This is called HTAP. So it has three main components if you take out TiFlash, which is optional, you don't have to use it: the Placement Driver, which is like a metadata server; TiKV, the scale-out storage system, which also handles transactions; and the SQL layer. You can do OLTP; it uses Raft for consensus, and data consistency is guaranteed by TiKV. TiKV is a CNCF graduated project. Data is organized into regions: for people familiar with Google Spanner, a region is the equivalent of a tablet in Spanner, or if you don't know what a tablet is, think of it as a page, the closest analogy from traditional databases. It's fault tolerant, and it works across availability zones.
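The key architectural point above, that the replication unit is the region rather than the node, can be sketched as follows. The round-robin placement is a naive stand-in (the real Placement Driver balances by load, space and heat), and the node names are invented.

```python
NODES = ["tikv-1", "tikv-2", "tikv-3", "tikv-4"]

def place(region_id: int, replicas: int = 3):
    """Place one region's Raft group (leader first) on distinct stores.

    Naive round-robin placement; the real PD decides dynamically.
    """
    start = region_id % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(replicas)]

for r in range(3):
    group = place(r)
    print(f"region {r}: leader={group[0]} followers={group[1:]}")
```

Because each region is its own three-replica Raft group, a million regions simply spread across however many nodes exist, which is what makes the storage layer scale out, unlike MySQL Group Replication, where the whole node is the replication unit.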
One thing you can notice here is that there are four nodes; look at the little arrow. That region, or page, or tablet, is spread across three nodes: one, three and four, and it's also replicated to the column store and converted into the column-store format. What that means is that each of these pages exists in three copies forming a Raft group. So each region is its own Raft group: what is replicated is the region, not the entire node like you would have in, say, MySQL Group Replication. This is how it achieves scale-out; this is the core difference. If you have a million regions, all of them are distributed across the cluster. If you have, say, a thousand nodes, by default it uses three replicas for redundancy, and all your regions are spread out; how they are spread out is what the PD does, and I'll talk about that in a little more detail. So the region is really the logical scaling unit: think of it as a page if you can't relate it to anything else. Everything to do with moving data around is built on the concept of a region. As I mentioned, a replicated region is a Raft group, and a single node can contain many regions, but each region has only one leader in its Raft group; the other two are followers. All the data is stored in RocksDB, and there is one RocksDB instance per TiKV node. And all the rows in a region are ordered. There's a whole story behind this: instead of hashing, we chose ordering, because we wanted to be an OLTP database where range reads are important, and ordering gives you that advantage. So it doesn't do hash partitioning, it does range partitioning: the rows are ordered, which is important for OLTP queries. So what's the Placement Driver? It's a metadata server that coordinates the entire cluster. It's a stateless piece of machinery, and it uses etcd for its own HA.
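The range-versus-hash point can be made concrete: when rows are kept in key order, a region owns a contiguous key range, and finding the region for a key is a boundary lookup. The split keys below are invented for the example.

```python
import bisect

# Region 0 owns keys < 'g', region 1 owns 'g'..'p', region 2 owns >= 'p'.
SPLITS = ["g", "p"]

def region_for(key: str) -> int:
    """Range partitioning: find which contiguous key range holds the key."""
    return bisect.bisect_right(SPLITS, key)

print(region_for("apple"), region_for("kiwi"), region_for("zebra"))  # 0 1 2
# A range scan over 'a'..'f' stays entirely inside region 0 — the OLTP win.
# Under hash partitioning the same scan would touch every region.
```

That locality for range reads is exactly the trade-off the talk cites for choosing range over hash partitioning.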
Think of it this way: you don't want a single point of failure, so each component of TiDB runs as multiple instances. Because this is the metadata server, you want its data to be highly available and resilient, so it runs its own etcd cluster to maintain its state; the process itself is stateless, so I can just start one up and point it at its cluster. Some of its data, not all of it, is also stored inside TiKV; remember that. The PD is the brains behind the whole TiDB cluster: it controls everything. It decides which region goes where, what to do when there's a failure, which nodes need to be copied. It monitors CPU, disk and everything else across the whole cluster, and load-balances dynamically. These are very powerful things that you simply cannot do with existing MySQL solutions. This is really the next level of what a distributed database should be; the idea all comes from Spanner, it's not as if TiDB invented any of this. So it handles dynamic distribution of data, it handles node failures, and it handles cluster configuration: adding nodes, removing nodes; it knows what to do when a node joins or leaves, or when capacity changes because a node went away, you started something else up, and the old node comes back. It manages all those kinds of use cases, and you can do multi-zone deployments, disaster recovery and so on. These are the things I was mentioning: at the node level it collects the total and free disk capacity, the number of regions on that node, and the data-writing speed of that node, because all of this is taken into account when it decides to move data around dynamically. It looks at snapshots too: because TiKV uses an LSM tree, it can just take a snapshot and transfer it when it needs to rebuild something.
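Heartbeat-based failure detection, mentioned next, boils down to flagging a store that has been silent longer than a timeout. The threshold and timestamps here are made up; the real PD's logic is considerably richer.

```python
TIMEOUT = 30  # seconds of silence before a store is suspected down (invented)

# Last heartbeat timestamp (seconds) reported by each store.
last_heartbeat = {"tikv-1": 100, "tikv-2": 100, "tikv-3": 62}

def down_nodes(now: int):
    """Stores whose last heartbeat is older than the timeout."""
    return [n for n, t in last_heartbeat.items() if now - t > TIMEOUT]

print(down_nodes(now=105))   # ['tikv-3'] — 43s silent, past the 30s timeout
```

Once a store is suspected down, the PD can schedule new replicas for the Raft groups that lost a member, which is the "which nodes need to be copied" decision from the paragraph above.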
It monitors the CPU; tagging I won't get into. The next one is actually very interesting: for failure detection it monitors heartbeat messages from across the cluster, so it knows where the leaders are, who's disconnected, and which nodes are being read from and which are not. So what role does it play beyond this? Imagine I send an SQL query to the SQL node. The SQL node is stateless; it doesn't know where the data is. So the first thing it does, after the optimizer has gone through the query, is ask the PD for the location of the data inside the cluster. Once it has the addresses of where all the data lives, it goes there: say the data is spread across a three-node cluster with a leader on each of the three nodes, it does a parallel fetch from all three. So it does truly distributed access to the data, and it gets the addresses from the Placement Driver; as the name tells you, it knows the placement of the data. What scheduling strategies does it use? Basically, you specify how many replicas you want in your storage cluster, and then how you want placement to work: per node, per rack, per zone. You can specify all these rules, and in the next few slides I'll show you the syntax. You don't need to do any of this, but there are specialized cases where people want very fine-grained control over where to place their data, and that can be for all kinds of reasons. It also balances space utilization across the cluster, and in case you want to move data off a node because you're taking it down for maintenance, you can do all that too.
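The query flow just described, stateless SQL node, location lookup, then parallel fetch, can be sketched as follows. The routing table, store contents and names are all invented stand-ins for what the PD and TiKV stores would hold.

```python
from concurrent.futures import ThreadPoolExecutor

# What the PD would answer: region -> current leader store (illustrative).
PD_ROUTING = {"r1": "tikv-1", "r2": "tikv-2", "r3": "tikv-3"}

# What each store holds for its regions (illustrative row values).
STORES = {
    "tikv-1": {"r1": [1, 2]},
    "tikv-2": {"r2": [3]},
    "tikv-3": {"r3": [4, 5]},
}

def fetch(region: str):
    leader = PD_ROUTING[region]     # step 1: ask PD where the region lives
    return STORES[leader][region]   # step 2: read from that region's leader

# Step 3: the stateless SQL node fetches from all leaders in parallel.
with ThreadPoolExecutor() as pool:
    rows = [row for part in pool.map(fetch, PD_ROUTING) for row in part]

print(sorted(rows))  # [1, 2, 3, 4, 5]
```

The point is that no single node serializes the reads: the SQL layer fans out to every leader at once and merges the results.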
It does hotspot detection and mitigation: it will start splitting regions up accordingly, dynamically. These are, in my eyes, really powerful features. It also has a governor that decides how much it should interfere with scheduling, and you can control it; by default it's fairly conservative, it doesn't move things around too much, but you can control the rate at which it works. You create a placement policy using standard SQL; there are no flags, you don't need to restart anything. I won't go over the SQL in detail, it's fairly straightforward: you create a placement policy and attach it to a table, and you can use DDL to attach the policy to an existing table too. So it gives you the flexibility to do a lot of things, and you can drop the policy and then it goes away, obviously. Now let's move on to the other component of the TiDB cluster: TiKV, the distributed storage engine. This is what gives TiDB its scale-out property. The other thing TiKV does is implement a distributed transaction protocol called Percolator, which also came from Google; it's not what Spanner uses, but it's what TiDB uses, a modified version of Percolator. All the TiKV nodes implement this protocol, and because the SQL nodes are stateless, when they talk to TiKV the transaction handling is done by TiKV, which simplifies the overall architecture. You can visualize TiKV as a distributed, ordered map designed for high performance and reliability. Just assume all your data is one large file that's divided into regions and being shuffled around constantly by PD; conceptually it's a fairly simple idea. So let's take an example. Can you see the table? Say you have a table; it gets split into three regions.
If you have four nodes, it places them roughly evenly across the four nodes. And say TiKV 1 is slightly less loaded: if a new Region is created, it'll probably put the leader there and move the followers onto, maybe, TiKV 3 or TiKV 4, depending on the characteristics of each node. Maybe TiKV 1 is a beefier node, so it'll move more Regions there. It monitors all this, because it knows the statistics and the performance characteristics of each node. So it's quite clever. The nodes can be anywhere you want; it doesn't matter. They can be across continents if you want, but then you have to pay the latency price. Say TiKV 4 is in India, for example, and the leader is on TiKV 1, which is in the US: the consensus protocol will have to send messages across geographical regions to guarantee consistency for any Region whose leader is on TiKV 1. So they can be placed anywhere, but, like anything, you can't defy the laws of physics; you have to pay the latency price. The way people deploy these things in practice is slightly different. They will never do the kind of deployment I just described; nobody does that. They partition the data first, based on their applications, for local access, and then have as little cross-region, cross-continent traffic as possible. Just like any other database: you pay the price for anything cross-continent. But there's nothing stopping it conceptually, is my point. Now, one problem you have with distributed storage is that your SQL node would otherwise have to work like a traditional SQL optimizer. If it needs data from, say, three or four nodes, say you're doing an aggregation, the last thing you want to do is fetch all the rows into your SQL node over the network and then do the aggregation inside the SQL node. The massive amount of traffic would make it extremely slow.
So what you do is the standard query pushdown. Each TiKV node has, not four, five types of pushdown handlers. Let's take the last one, the aggregator, as an example. Say my query is a SELECT COUNT(*) over some table. It will find the Region leaders across the cluster, get each of them to do the calculation over the data they hold, and send the results back to the SQL node, which does a very simple aggregation of the partial results. It doesn't have to fetch all the rows. This is called the TiKV coprocessor. Each TiKV node works on the data close to where it lives; the whole idea is to minimize moving data around to compute results. In reality, if you want to do a COUNT(*), you're better off using something like TiFlash, which is even faster: it's a column store, so if you're doing, say, a sum over a column, it'll be faster than any row store. But anyway, that's what the coprocessor does, and it's inside every TiKV node. It's there to reduce traffic and make things faster. Say you have a LIMIT 1 on your query: that will also be applied inside every node, the results come back, and the optimizer gives you one row. Can it do this even for a single query? It can, yes, it can spread it out. So if you don't have TiFlash, everything goes to TiKV. But if you have TiFlash, because you want super-fast analytic queries, the optimizer can figure out how to fetch the data from both the column store and the row store in one query. And there's no MySQL code in any of this; it's all written from scratch. So it doesn't have any of those legacy single-threaded, single-node monolith problems. This is true distributed SQL. I won't go into too much detail about how Raft works, but here's a rough idea.
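The COUNT(*) pushdown described above fits in a few lines: each storage node counts its own rows, and the SQL node only sums integers. The cluster layout and function names here are invented for illustration.

```python
# Sketch of coprocessor-style pushdown for SELECT COUNT(*): each storage
# node computes a partial count next to its data; the SQL node only
# receives one integer per node instead of all the rows.

def coprocessor_count(rows_on_node):
    # Runs on each storage node, close to the data.
    return len(rows_on_node)

def sql_node_count(cluster):
    # The SQL node aggregates the partial results.
    partials = [coprocessor_count(rows) for rows in cluster.values()]
    return sum(partials)

cluster = {
    "tikv-1": [("a", 1), ("b", 2)],
    "tikv-2": [("c", 3)],
    "tikv-3": [("d", 4), ("e", 5), ("f", 6)],
}
total = sql_node_count(cluster)
```

The network carries three small integers rather than six rows; on real tables that difference is what makes distributed aggregation feasible.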
The core idea of Raft is to elect a leader, and the writes go through the leader. Whenever the SQL nodes need to do any writes, if you have, say, a three-node Raft group, they will only write to the leader, because the leader is always known to have the latest data. A follower may not have the latest data; it may still be in the process of receiving it. The data is not considered durable until a majority of the nodes in the group acknowledge the write. So the leader is always the source of truth; that's where the SQL nodes read from and write to. They write to Raft, and Raft then spreads it around. Reads come from the leader node. So here's a diagram of how it handles failure. Say you have four nodes, and you can see heartbeats going into the cluster of placement drivers, and suddenly a heartbeat gets interrupted for some reason. There's a timeout, all configurable, and once it's timed out, PD says, okay, that node is down. Let's assume it's genuinely down. Then it'll initiate an election between TiKV 2 and TiKV 3 for the blue Region, and then move data to TiKV 1 to resolve the problem automatically. That's what happens if it detects a failure of TiKV 4. You don't have to do anything; it's all happening automatically behind the scenes. "Self-healing" is the more marketing-y term for this. So, TiKV supports read committed and snapshot isolation. The snapshot isolation is mapped to MySQL's, that is InnoDB's, REPEATABLE READ. There are some minor caveats, which are mentioned on the website, but they're too minor to talk about here; for 99.999% of applications it'll just be repeatable read. And, as I mentioned earlier, it uses an optimized version of Percolator for distributed transactions. I won't go into too much detail. That requires a global timestamp.
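The durability rule above, that a write commits only once a majority acknowledges it, reduces to a one-line check. This is just the quorum arithmetic, not a Raft implementation.

```python
# Raft's commit rule: an entry is durable once a strict majority of the
# replica group (leader included) has persisted it.

def is_committed(acks, group_size):
    """acks: replicas that have persisted the entry, including the leader."""
    return acks > group_size // 2

# With 3 replicas, 2 acks are enough; with 5 replicas you need 3.
```

This is why a three-node group survives one node failure and a five-node group survives two: the remaining nodes still form a majority.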
The timestamps come from PD, and they're also used for MVCC, which is fairly standard MVCC. One thing I haven't mentioned is how the data is represented. It's not essential, but I'll mention it: the data is mapped into a key-value store. Roughly speaking, if I have a table T1 with an index I1, the index entries are written under keys that look something like T1_I1, so related data sorts together and you can look it up that way. It can also optimize for short queries. If you're doing, say, an auto-commit single-row insert that touches just one Region or one page, it can optimize that down to a one-phase commit. But anything that touches two Regions can only be solved by two-phase commit: you have to send it out, and both sides have to say, yes, we've both agreed. And then there's an optimization that removes one part of the Percolator protocol, which I won't go into; that's the "optimized Percolator". For single-Region updates it can do a one-phase commit, which is trivial, so it's very fast for short transactions too. If a transaction spans Regions, then two-phase commit kicks in. So this is a classic optimizer; this is how TiDB does it. The only difference between MySQL's and TiDB's is that TiDB also has a distributed coprocessor, and this is the one the earlier question was about: it can talk to TiKV and TiFlash at the same time, because it's aware of the topology. It knows what to do. But otherwise this is a fairly straightforward optimizer architecture; I don't think there's anything special except for the distributed coprocessor being aware of the layout of the data. So here's an example of the question you asked, how it would work in practice. The executor figures out: will this fragment of the SQL query benefit from fetching data from the MPP engine in TiFlash?
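The "T1_I1"-style key mapping can be sketched as follows. The real TiDB encoding is binary and more involved; these string formats are invented purely to show the property that matters, namely that keys for the same table or index sort together in the ordered key space.

```python
# Hypothetical sketch of mapping rows and index entries onto an ordered
# KV store. The exact TiDB key encoding differs; only the sort-together
# property is the point here.

def row_key(table_id, row_id):
    return f"t{table_id}_r{row_id:08d}"

def index_key(table_id, index_id, value):
    return f"t{table_id}_i{index_id}_{value}"

keys = sorted([
    row_key(1, 2),
    index_key(1, 1, "bob"),
    row_key(1, 1),
    index_key(1, 1, "alice"),
])
# All entries of index 1 are adjacent, and all rows of table 1 are
# adjacent, so a table scan or index scan is one contiguous range scan.
```

That contiguity is what lets a scan over one table or one index map onto a single Region range rather than scattered point lookups.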
Or is it easier to get from TiKV? Whichever parts it needs as part of the query plan, it'll fetch those parts and then join, combine, or do whatever it has to inside the executor. That's how it actually works. And this is a real example of how, in a single statement, the table scan goes to the column store while the other parts, the lookups, go to TiKV. For the scan, it goes to TiFlash. So if you look at that query, that's roughly how it solves it, by going to TiFlash and TiKV at the same time. It's optimized to give you the best result when the data is on both the column store and the row store. Okay, resource control. I'm probably running out of time anyway, so I'll skip to the part with the actual action. The thing you want resource control for is one classical problem every database has: you have short queries and you have long queries, and you don't want the short queries to suffer because of the long ones, because users care about the latency of short queries. They don't care so much about the latency of queries they already know will take some time. That's one problem, and it's a classic, universal database problem. The second is that you want to manage multi-tenancy. You want to say: this application touches this part of the database, and I don't care so much about it, it can run a bit slowly; but these queries that run on this other schema or database should not be impacted, and they should be much faster. So TiDB has a concept called resource control. It allows you to have multi-tenancy, consolidate your instances, and reduce your cost. That's how people are using it, and that's where the request came from.
You want to track CPU and you want to track I/O. It currently doesn't track network, but it does CPU and I/O, and it works a little differently in TiFlash, which I won't go into; I'll just talk about the TiKV part. Also backups: you don't want to run them at top speed. You want backups, compaction, anything that runs in the background, to not chew up all your CPU and I/O. So all of these are scheduled through the resource group. It's a logical container, and there are three important parts to it. The first, call it the bucket size, is how much quota you give it per second; that's the refill rate, and you specify it when you create the resource group. The second is a priority: low, medium, or high. When a query is broken down inside TiKV, and the information is sent down from the TiDB node, the scheduler looks at the priority and decides whether to give it more CPU time or disk. That's basically it. Burstable is interesting. If you set your resource group to burstable, it means that if there is excess capacity but a query has run out of its RUs, don't worry about it: give it any unused RUs, because it's that important. And the other thing it means is that the query is so important that even if the system is loaded, it must execute. Don't suspend it just to balance the system out. Make sure that if there are unused resources, they go to any query running in this resource group, and that it runs all the time, because it's that important. That's the burstable flag. The whole theory behind this is actually from a VMware paper from 2017 or '18, I can't remember. It uses a weight- and constraint-based scheduler. I won't read out the whole text, but one part is proportional sharing and the other is a minimum guarantee of time to run a query.
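The quota-per-second plus burstable behavior is essentially a token bucket. Here is a toy version; the class name, the starting balance, and the way spare capacity is passed in are all simplifications invented for this sketch.

```python
# Toy token-bucket model of a resource group: each group refills
# ru_per_sec tokens per second; a burstable group may borrow unused
# cluster capacity instead of being throttled.

class ResourceGroup:
    def __init__(self, name, ru_per_sec, burstable=False):
        self.name = name
        self.ru_per_sec = ru_per_sec
        self.burstable = burstable
        self.tokens = ru_per_sec  # start with one second's quota

    def try_consume(self, cost, spare_capacity=0):
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        if self.burstable and cost <= self.tokens + spare_capacity:
            self.tokens = 0  # burst past the quota using spare capacity
            return True
        return False  # throttled until the next refill

batch = ResourceGroup("batch", ru_per_sec=100)
oltp = ResourceGroup("oltp", ru_per_sec=100, burstable=True)
```

The non-burstable group gets hard-capped at its refill rate; the burstable one keeps running as long as the cluster has slack, which is exactly the behavior described above.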
And there are different costs: a write is the equivalent of eight storage reads, or two read batches; writes are considered more CPU- and disk-intensive, let's say. And three milliseconds of query time counts as one resource unit. That's how the calculation is done to balance and schedule everything; it's fairly straightforward. How do you set it? Through SQL again, no special syntax. You run CREATE RESOURCE GROUP and it creates one. You can even say calibrate using the load from this time to that time, or from this time for a duration of 20 minutes, and it'll look at the historical data, calibrate, and figure out what the resource-unit quota should be for whatever you're looking at. So you can derive these numbers from historical data too, which makes it really easy to set up and use. Then you can fine-tune it by hand if you want. This stuff is required for any multi-tenant system; this is why, in my talk, I wanted this to be the next generation of where MySQL should be going. So how do you do these things? You create a resource group. In the first example, it has BURSTABLE set and RU_PER_SEC is a thousand, so queries in that group can only consume a thousand RUs per second. You can use a regular ALTER RESOURCE GROUP to set the priority high or change RU_PER_SEC, and you can drop it. You can do SHOW, or SELECT from information_schema. So you can do all this using just regular SQL; it's very powerful stuff. Here's the other interesting part: you can do it at the user level, the session level, and the statement level. The granularity goes right down to the statement level, where you give it regular SQL hints through comments. Otherwise, it works by specifying whichever user you want it to apply to.
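Using the conversion rates as quoted in the talk, a back-of-the-envelope RU calculation looks like this. The rates and the "one storage-read equivalent is one RU" simplification are assumptions taken from the numbers above; check the current TiDB request-unit documentation for the real model.

```python
# Rough RU accounting with the rates quoted in the talk:
# one write ~ eight storage reads, and 3 ms of query time ~ 1 RU.
# These rates are illustrative, not the authoritative TiDB cost model.

MS_PER_RU = 3.0
READS_PER_WRITE = 8  # a write "costs" eight storage reads

def request_units(storage_reads=0, storage_writes=0, cpu_ms=0.0):
    read_equivalents = storage_reads + storage_writes * READS_PER_WRITE
    # Treat one storage-read equivalent as one RU of I/O for this sketch.
    return read_equivalents + cpu_ms / MS_PER_RU

cost = request_units(storage_reads=4, storage_writes=1, cpu_ms=6.0)
# 4 reads + 1 write (8 read-equivalents) + 6 ms (2 RUs) = 14 RUs.
```

This kind of normalization is what lets one scheduler compare a read-heavy query against a write-heavy one on the same scale.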
So you can tag queries and say: this one I want to run under the configuration of that resource group, or some other resource group, because that's just what you want to do. This part I'll hand-wave my way through. This is how it works: there's the TiDB SQL server, and the storage nodes, which do the scheduling, and the metadata of the resource groups is maintained by the metadata server, which is PD. So there's the admission layer, then the calculation part, and then the scheduling is done inside TiKV, because that's where you're accessing data and spending your time doing the real work. The TiDB node's parsing and optimization is not counted in CPU usage; only the actual processing of the SQL statement is. If there are questions you can ask me, but it's fairly straightforward-ish. Okay, this is another very interesting thing TiDB has. In a distributed database, you have lots of storage nodes and lots of compute nodes, and you want to leverage them for your work. You want to spread the work around; you don't want one node working while all the other nodes sit idle. You want some jobs to run across your entire infrastructure. The distributed execution framework does exactly that. Currently we use it for background tasks, but we plan to make it available even to foreground tasks. So you can spread the work across your nodes. Say I have 10 compute nodes and 20 storage nodes, and I want five of the storage nodes to do DDL and two other nodes to do something else: the distributed execution framework will take those tasks and spread them around. Currently it works for DDL, import, TTL, analyze, and backup and restore. So you can control what each component in your cluster actually does. The detail is a talk on its own, so I'll just skip that part. Sorry?
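The core move of a distributed execution framework, splitting one big background job into subtasks and handing them only to the nodes you designated, can be sketched like this. Task and node names are invented; the real framework also handles failover and progress tracking.

```python
# Sketch of distributed background execution: chop a job into subtasks
# and round-robin them across only the designated worker nodes, leaving
# the rest of the cluster free for foreground traffic.

def assign_subtasks(subtasks, worker_nodes):
    plan = {node: [] for node in worker_nodes}
    for i, task in enumerate(subtasks):
        plan[worker_nodes[i % len(worker_nodes)]].append(task)
    return plan

subtasks = [f"index-chunk-{i}" for i in range(7)]
plan = assign_subtasks(subtasks, ["tikv-1", "tikv-2", "tikv-3"])
# Only the three chosen nodes do DDL work; other nodes see no extra load.
```

The value is isolation: a 100-node cluster can dedicate three nodes to index building while the other 97 serve queries undisturbed.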
Yeah, the data will be spread out either by your placement rules or by the default placement rules. But about the way you import data: there's actually a much faster way to do it. Because I'm running out of time, I'll just show you this now. You can generate the SST files independently and import them directly, rather than inserting rows one by one. If you're running large installations, you don't want to be inserting rows through SQL. You build your SST files offline with a tool we have, attach them to TiDB, and it spreads them across the cluster. If you have, I don't know, a 500-gigabyte file, you can probably shove it in the normal way. But if you have a one- or two-terabyte file, this is what you want to do. It's built for very large volumes of data, but it works even with small data. It's not that you can't: you can have a three-node cluster running on your laptop and it'll work. But if you want scale, if you want to load a terabyte, you build the SST files and shove them in separately. So, online DDL. This is a very big thing for anybody who's dealt with DDL in production with large data. In MySQL, one of the problems is that you have to acquire the metadata lock when you externalize the change to your schema. That means you have to wait for all open transactions to finish, it's a single-node operation, and all new transactions are blocked until that change is done. On a very busy system this is a problem; if it's not busy, it's not a problem, because then it's effectively instant. But on a very busy system, where transactions run slightly longer, you have to wait for all of that to finish.
In the meantime, updates are coming in, the InnoDB online DDL log is filling up, that has to be applied, and all sorts of things are happening. This can cause a lot of problems in production; I have heard many stories about it. So this problem was solved, again by Google, in a paper they published on online, asynchronous schema change in F1. The core idea, which I may not have time to go through fully, is that rather than have a global lock across the entire system, you narrow it down to, let's say, a table-level operation, but even that has sub-states. In one state you can, for example, only do inserts and deletes but not reads, and then it moves to the next state. And at any given time, your entire cluster is only allowed to work on two versions of the schema. By managing these states and moving the schema along across these four states, you can do DDL across your cluster in parallel with no interruption and no locking. Some transactions may fail because they're not allowed to do certain operations on that table during a given sub-state, but that's about it. There is no global lock. I'll go through it a little bit. The reason this is required for a distributed database is that you cannot lock the entire cluster; it's close to impossible. You want everything to keep working, because now you have multiple schema caches across your entire cluster, which you don't have in a single-node MySQL. What MySQL does is a single-node operation: it blocks, waits for the change to finish, externalizes it, puts it in the binlog, and then it goes to the replicas, and then the replicas have their own version of the problem. This can cause a lot of replication lag. On a busy system, this is very problematic.
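The sub-state progression from the F1 paper can be written down as a tiny state table. State names follow the paper's delete-only / write-only terminology; the real implementation adds leases and a backfill (reorg) step, so treat this as a skeleton only.

```python
# Sketch of the F1 online schema-change states and which operations each
# state admits for the object being added. Any two adjacent states are
# mutually compatible, which is why the cluster only ever needs to
# straddle two schema versions at once.

STATES = ["absent", "delete_only", "write_only", "public"]

ALLOWED = {
    "absent":      set(),
    "delete_only": {"delete"},
    "write_only":  {"insert", "update", "delete"},
    "public":      {"select", "insert", "update", "delete"},
}

def can_run(op, state):
    return op in ALLOWED[state]

# Reads are admitted last: a node must not SELECT through an index that
# other nodes might not be maintaining yet.
```

Walking the cluster forward one state at a time, with every node within one state of its neighbors, is what replaces the global metadata lock.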
So people use external tools, which read the data and send it across the replicas, and you have to use all kinds of other machinery, and in the end none of them can actually avoid the MDL lock. That is required no matter what you do. So yes, it's different in a distributed database, because you have to make sure the entire cluster keeps working: you cannot do a synchronous update of metadata across the cluster. You just can't, because everyone has cached that information, and you'd have to update all those caches and keep them in sync. So you version all the schemas, and TiDB follows the original paper. Essentially, when you connect and you're doing some work, every transaction, I believe, has a lease on the schema. What that means is that if you can finish your work within that amount of time, the schema you're using is guaranteed to work. It could be an old schema, but it is guaranteed to work. The worst case is that the sub-state has moved on, and because of the restrictions of that sub-state, you may be forced to roll back or retry. But that's about it. The rest of your cluster, anything that doesn't touch that table or isn't affected by what you're doing on that table in that sub-state, proceeds perfectly fine. The granularity is very narrow; that is the key point. If you want the full detail, the paper is actually worth reading, just on its own. So here's an example, which I hope I can explain in the time I have. This works backwards. Let's say we've finished the schema change: it's public now, externalized, so you can do select, insert, update, and delete. Those are the four operations that you may or may not be able to do in the other sub-states, depending on which sub-state you're in. That's the key point. And a transaction can only ever work across two versions of the schema at any given time. That's it.
And you have a lease, which means that if you haven't done your job within the lease on the schema, that's it: the schema has expired and the transaction will fail when you try to commit it. Those are the core concepts it works on. So we're working backwards from the public state. Once the change is externalized, you can do everything. In the state before that, you can't do selects, but insert, update, and delete will work; so if your DML is insert, update, or delete, it'll work. That state is called write-only. What about the state before that? Again, select won't work, and there's actually a very interesting reason if you're interested in consistency and isolation levels: reads are what cause problems, because you've done a read and then gone off and done something with that data. That's why reads are treated most conservatively in this scheme. I won't go over all the details. One of the things happening during this operation is that it can take a snapshot and move it to all the nodes to backfill; that's what backfill is. Since writes go to a normal LSM tree, it can take a snapshot of that, move it to the next stage, and backfill from there. So inserts are always handled. The problematic case is deletes, and I'll get to it. When you're in the write-only state, you can do all these things. Delete and update will also work, but the problem is that an update is a delete plus an insert. The insert part will be handled by the backfill, but the delete will not be. So the delete operation has to be performed at the time the DML is executed. That's the point here. If it's a little confusing at the moment, feel free to ask me afterwards; there are some subtle points. So in the example on the slide, the value is changed from one value to another: that's a delete and an insert, basically, right? The insert will be fine, because it will be carried along by the backfill, but the delete will not be.
The delete has to be applied immediately; otherwise the backfill will only do the insert, not the delete. That's basically what it's saying. Deletes must happen, and updates must happen because they have a delete in them; that's why they're treated the same. But inserts, at that stage, you can't do: when the sub-state is delete-only, you can't insert. So what does the "no, no, no" row mean in this case? Assume we've worked all the way backwards: it means the old version is now obsolete. Any transaction that had that version when it started, once the transition to the new schema has been made, will fail. But only once; you retry and it should all work. That's the benefit of this scheme. So you have these sub-states, and for each sub-state you have to decide, in each transition, which of these operations will work or not. That's the key point. Deletes have to be handled immediately, whereas inserts are pushed forward when the data moves into the next state. Anyway, unless you read the paper this stays a bit abstract. You can ask me; I'm happy to share it. So, back to the original topic: there are other optimizations. We can generate and ingest SST files, so there's no need to write row by row to the new index in TiKV, and it has almost zero impact. Also, with the distributed execution framework you can easily say, do this on those three nodes, and let all the other traffic go to the rest of the nodes; people run clusters of 50, 60, 100 nodes. It's much easier on CPU and I/O. The last point there is that it can use the coprocessor inside TiKV to do all this data reading and moving around; it doesn't have to go through the SQL nodes. It does it at the lower level too, which reduces the network traffic compared to doing it through the SQL nodes, because you're operating on a large amount of data.
You're not really doing much except moving data from one format to another, or whatever your DDL is. So all of that is pushed down as much as possible into the storage nodes to reduce data movement. So here are some timings. On a 10-terabyte table, it can build a one-column index in 47 minutes; a 10-column index takes an hour and six minutes. That's on 10 terabytes, and you can see the config: how many PDs, eight of them, and so on. And it scales linearly. These are serious numbers, not toy numbers. If you want serious scale and you have these problems, you should just go and install this right away. It really is that good. Okay, now the last slide. These are all the tools we have, and they're all open source. You can easily go contribute and send us patches for any of them; it's all open source, no problem. And I'm sure people who work with these sorts of tools understand them better than I do. I work on the database storage side of things, not the operations side. Sam, my colleague, is here; if you have questions about any of these, I'm sure he can answer. I don't know much about them; I just know they exist and that this is how our customers are using them. You can integrate through TiCDC, the change data capture tool; then there's Lightning, which is very useful for importing data from existing systems. We have customers who've imported 400 terabytes from Aurora. So these tools have been used to do all those sorts of things; they're battle-tested. They're not just toy tools for 10 megabytes or 100 gigabytes; this is serious stuff. So you can play with it. There you go. This is what MySQL people should be working towards.
And they're just obsessed with HeatWave for some reason; I don't know what they're on about. But this is how a distributed database should work. And I used to work for MySQL, and I love MySQL, honestly, I still do. Anyway, when we publish the slides, you can hopefully click on the links and get all of this. And in case somebody wants to scan this, there's also a QR code there. So thank you. If you have any questions, please ask me. Anyone? Yeah, sure, please. Sorry, do we have a mic? I'll just walk up, it's no big deal. Which paper, the F1 schema-change paper or the Spanner paper? Yeah, that's the F1 online schema change one; I can just give it to you after this. Sure, it's a Google paper, and a really good paper. Anybody else? Any questions? That's it? Oh, excellent. Thank you.

Well, thank you very much. Do I have a mic? Oh, I do. See, I'm from Jersey, so when I talk, it's always like I have a microphone on. So anyway: if you don't have a schema, development is sometimes faster, because you don't have to worry about writing the SQL scripts to add a table or a column, and in some cases you might need a DBA to verify that the table and column scripts actually work as they should. JSON is less verbose than XML, which is saying something, because we all know what a nice, terse markup language XML is. Pretty much every programming language can read JSON. And I say pretty much because I haven't used all the programming languages, but every one I've used has been able to read it, and I want to make sure I'm covered so somebody can't say, well, this language doesn't.
And then probably the most important reason is the fact that some data is unstructured, which is hard for me to wrap my brain around, because I come from a world where data needs to be structured: everything needs to be in a column, there need to be relations, and you need data types. If it's not that way, it just drives me crazy. But there are cases where unstructured data exists. You can have configuration data; the example we're going to look at is configuration data for an application I wrote for a golf league. Yes, the data technically is structured, but I don't need it in columns, because I'm never going to query, say, "give me all the golf leagues where the golfers had to pay $10 to join". That's never going to happen, so I don't really need it in a nice relational table. When I was giving this talk in Montreal last year, somebody said they store exceptions as JSON in the database, which then lets them go back and use the JSON functionality to query the exceptions better than they could by digging through the logs. And probably the most important case is when your application ingests data from a third-party source, a third-party API: you have no control over the schema that comes back, and if that schema changes, it's going to be a pain in the butt for you to take those changes and shoehorn them into whatever relational structure you've set up. In those cases, I think it makes really good sense to store the data as JSON. The way it used to be done, years and years ago, is developers would store JSON in a TEXT, CHAR, or VARCHAR data type. It worked; it wasn't great, but it worked. And when you searched the values, you had to use a LIKE search or regular expressions.
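A small sketch of why that kind of substring search over a JSON blob is fragile. The document and key names here are invented; the point is just the difference between matching text and addressing a parsed value.

```python
import json

# A parent and a nested child share the key name "status".
doc = '{"status": "active", "owner": {"status": "banned"}}'

# Naive text search: "is the status banned?" matches, even though it is
# only the *nested* status that is banned.
naive_hit = '"status": "banned"' in doc

# Parsing the document lets you address exactly the key you mean.
top_level_status = json.loads(doc)["status"]
```

The substring search says yes while the top-level value is actually "active"; that mismatch is exactly the inconsistent-result problem with LIKE and regex over JSON stored as text.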
And the problem with that is you could get inconsistent results: if you have nested objects, and the parent object and a child object share a key, meaning the key has the same name, a LIKE search can match something different from what you're actually looking for, and the same goes for regex. The other problem is that if you updated any value in your JSON document, it rewrote the entire column; it was just a regular update on a blob of text. So in MySQL 5.7, the JSON data type was introduced. It's designed to hold valid JSON documents, and that's important: if you try to store something that isn't valid, you'll actually get an error. The data itself is stored in a binary format that has been optimized for replication and quick searches. And you can also have a defined schema. Now, some people say: if we're using JSON because we don't want a schema, why would we want to enforce one? Well, there could be a time when you're logging information, like the person who stored exception data, and you want to make sure that every JSON object you store in this database has a particular key. We can actually set that up when you create the table by adding a constraint. You might also want to check that if a particular key exists, it's a number; or that if it's a date, it's before or after a certain date; or that if it's a number, it's greater than zero. All right, we're going to play a game. The game is called "valid or not". Here are the rules: I'm going to show you an example of some text, and I'm going to point at somebody, and that person says "valid" or "not". If you say anything besides those two things, you're wrong, and I'll call on somebody else. So it's either valid or not. And if you get it right, you get a lollipop.
So you got one already, in the back. Valid — you're gonna test me here — almost. That is valid, and it's actually just an empty object. Yay, you got it right, good job. Next one. You got a lollipop already. Valid — what is that? It's an empty array, awesome. Heads up. I really gotta work on that; you guys are further away than I think. So we got that right, that is valid. What about this? Wrong, it is valid. That is actually valid JSON. Don't ask me why, it just is. What about this? Not? You're wrong, it's valid — null is valid JSON. Okay, well, I'm gonna have a lot of lollipops when I'm done. Awesome, what about this? Now what'd you say? Single quotes — strings and keys need to be in double quotes, so that's why this one is invalid. Well, that was fun; I only gave away a couple of lollipops. So, for the examples we're talking about: years ago I took over managing a golf league, and I wrote a web application to help me manage it. One of the concepts in that league is that we have seasons, and each season has a unique set of configuration options, and those options have grown over the years. As I've added more functionality and as we've changed rules in the league, I've had to create new columns in the season table to accommodate them, which became kind of a pain in the ass, because then I had to update the database, update the getters and setters, and do a whole bunch of stuff just to add one property. And then I realized I don't need to do that. I can take all the properties that could be unique to a particular season, create them as a JSON object, store that in the database, and then all I gotta do is update the getters and setters I already have; if I need to add a new property, I just add getters and setters and be done with it. So this is the schema I have.
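The valid-or-not answers above can be checked directly in MySQL with the JSON_VALID() function, which returns 1 for valid JSON and 0 otherwise. This is a minimal sketch of the cases from the game:

```sql
SELECT JSON_VALID('{}');            -- 1: an empty object is valid
SELECT JSON_VALID('[]');            -- 1: an empty array is valid
SELECT JSON_VALID('"hello"');       -- 1: a bare quoted string is valid JSON
SELECT JSON_VALID('null');          -- 1: the literal null is valid JSON
SELECT JSON_VALID('{''key'': 1}');  -- 0: single-quoted keys are not valid
```

The same rule applies at insert time: storing an invalid document in a JSON column raises an error rather than silently storing garbage.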
You'll see that the one object has a course property, which is an object with information about the golf course; then there's a scoring property with information about the type of scoring we use; and we have the subPool property, which is an array of the different types of user roles that can play as a substitute in the league. Then we also have useSubs, which means whether or not you can use a sub, the league fee, the greens fee, whether or not we use contests, points per hole, golfers per team. There's a lot more; this is just all that could fit on the slide, and I'm not gonna go over each one of them, don't worry. So, creating a table with a JSON data type is very easy: in your column definitions, you specify the data type JSON, right there. We have a column here called season settings, and it's a JSON data type. But what we wanna do here is make sure the league fees property is no less than zero. So we add a check constraint, and in the check constraint we call the JSON_SCHEMA_VALID function and pass it a configuration where we basically say the root of the JSON needs to be an object, and it has to have a property named league fees that is a number whose lowest value can be zero. So if we tried to save a JSON blob where the league fees property was set to negative one, it would throw an error. And you see here the last argument to JSON_SCHEMA_VALID is the column we're actually going to be validating, season settings. So, inserting JSON — you might think, wow, that's a lot, how do we handle that? Remember, what is JSON? At its basis, it's a what? A string — who said that? So you get one now. Yes, we make a good team. It's a string.
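A sketch of the table definition described above. The table and column names here are my reconstruction of the talk's golf-league schema, not the exact slide; JSON_SCHEMA_VALID requires MySQL 8.0.17 or later:

```sql
CREATE TABLE season (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  start_date DATE,
  season_settings JSON,
  -- Reject any document whose leagueFees property is a negative number.
  CONSTRAINT check_league_fees CHECK (
    JSON_SCHEMA_VALID(
      '{"type": "object",
        "properties": {
          "leagueFees": {"type": "number", "minimum": 0}
        }}',
      season_settings
    )
  )
);
```

With this in place, an INSERT or UPDATE whose document sets `"leagueFees": -1` fails the check constraint and throws an error.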
So that means we can just insert it as if it's a string, just like we would any other string. Whether it's a JSON object or somebody's first name, we can just do the insert surrounded by single quotes. And you'll see here the result: if we do SELECT * FROM season ORDER BY id DESC LIMIT 1, we get the latest row that came in, and you'll see on the end here it's an empty object, like we passed in that insert statement. Before I go any further, I wanna talk about some helper functions we have available to us. These are native MySQL functions. The first one is JSON_KEYS. This function is awesome if you're not familiar with the schema: you call JSON_KEYS, pass it the column you're looking at, and it returns an array of all the keys that exist. Now, you don't have to pass a second argument to JSON_KEYS; if you don't, it gives you all the keys at the root of that JSON object. But in this particular case we want the keys of the scoring object, so we do dollar sign dot property name to get the list of keys off the scoring property, which is off the root of the JSON object. Whenever you're dealing with JSON in MySQL and you see a dollar sign, that means you're dealing with the root of the JSON document. There's one exception to that, and we'll get to it shortly. But always remember: the dollar sign is the root. So again, here we're looking at the scoring property off the root, and it's just got three keys: title, description, and handicap type. Now, the next function is one you should only ever use when you're debugging or in development.
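The insert and the JSON_KEYS calls just described might look like this, assuming the `season`/`season_settings` names from the talk's example:

```sql
-- JSON goes in as an ordinary single-quoted string.
INSERT INTO season (name, season_settings) VALUES ('New Season', '{}');

-- No second argument: returns the keys at the root of the document.
SELECT JSON_KEYS(season_settings) FROM season;

-- With a path: returns the keys of the nested scoring object
-- ($ is the document root), e.g. ["title", "description", "handicapType"].
SELECT JSON_KEYS(season_settings, '$.scoring') FROM season;
```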
Don't ever use JSON_PRETTY in production, because it could really screw up your results: it puts line breaks in the return, so that instead of getting one long line, we get the output broken down like this, which is a little easier to read in the console. Another thing I use a lot are the path operators, and you'll see why in a bit: I think they're a little easier to read and a little easier to use. We have two path operators, and I honestly don't think they have names. The first one is hyphen greater-than, ->. Now, -> is actually a shortcut for calling JSON_EXTRACT; if you call JSON_EXTRACT on a JSON column, as we do here, you get the same result. You'll see here we do the column name, then ->, then '$.course.name'. This says: get the name property off the course property off the root of the JSON document. The difference between -> and ->> is that -> will return any quotes that may be around the value. So if it's a number, it won't have quotes; if it's a string, it will. If we use ->>, that's the same thing as calling JSON_EXTRACT inside JSON_UNQUOTE, and that returns whatever's inside the quotes. I'll show you what that looks like right here: you'll see under course name, because we used ->, it returns the quotes as part of the result. Why they did that, I don't know, because I can't imagine anybody wanting to use the information that way, which is why I always use ->>, which gives me just the text with no quotes, okay? So now we're gonna start talking about querying data based on JSON values.
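The two operator shortcuts described above, side by side with the functions they expand to (again assuming the talk's `season_settings` column):

```sql
-- These two are equivalent; both keep the surrounding quotes on strings:
SELECT season_settings->'$.course.name'               FROM season;
SELECT JSON_EXTRACT(season_settings, '$.course.name') FROM season;

-- These two are equivalent; both strip the quotes from string values:
SELECT season_settings->>'$.course.name'                            FROM season;
SELECT JSON_UNQUOTE(JSON_EXTRACT(season_settings, '$.course.name')) FROM season;
```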
So what we're gonna be doing is returning data from the database based on the values of certain properties in the JSON object. One method we can use is JSON_CONTAINS, and you'll see the syntax — whoops, sorry, go back — the syntax is here. JSON_CONTAINS takes three arguments: the first is the column we're talking about, the second is the value we're looking for, and the third is the path to that property. So what we're saying here is: get the id, the name, and the value of the league fee off the root, from season, where the season settings column has a property named league fee that equals 70, okay? And as you already saw, this is gonna be the result: everything comes back where the number is 70. That actually works pretty well — it works great with numbers and it works great with Booleans. It doesn't work great with strings, and the reason is that in JSON_CONTAINS, if we want a string, we have to wrap it in quotes. So here, because we're looking for rows where the course state equals WV, we have to say single tick, double tick, WV, double tick, single tick. Yes, I had the same look on my face that you have right now. It's weird. Well, let me show you: these are the results, and you see they're what we would expect. And that's why I prefer the path operators to JSON_CONTAINS or anything like that. You'll see here the same exact query I just showed you, using the path operators rather than JSON_CONTAINS and the weird double quoting. Let me back up real quick and show you something else. JSON_VALUE is a way to get the value from a particular property, and you can actually use the RETURNING keyword to say what data type you want it returned as.
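The double-quoting quirk and the cleaner path-operator version can be sketched like this. The column and property names follow the talk's assumed schema:

```sql
-- JSON_CONTAINS with a string value: the candidate must itself be valid
-- JSON, hence the doubled quoting '"WV"'.
SELECT id, name, season_settings->>'$.leagueFees' AS league_fee
FROM season
WHERE JSON_CONTAINS(season_settings, '"WV"', '$.course.state');

-- The same filter with the ->> path operator, which reads more naturally:
SELECT id, name, season_settings->>'$.leagueFees' AS league_fee
FROM season
WHERE season_settings->>'$.course.state' = 'WV';
```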
So by doing that inside JSON_VALUE, you're basically casting the result to a particular type — though I would rather use CAST, because that's what we're doing, and it's less confusing. You see here we're casting the value of greens fees as DECIMAL(4,2), four digits with two after the decimal point, and then we have WHERE the course state property of season settings equals WV. To me, that's a lot easier to read and to parse quickly than what we saw with JSON_CONTAINS and the double quotes. Because I don't know about you, but when I see double quotes like that, I start to go, whoa, something's wrong — and that's exactly what happened the first time I saw a JSON_CONTAINS example; I thought it had to be a typo, but it's not. So my suggestion: if you ever want to get the value of a particular property in your JSON object, use the path operators, and use ->>. Don't use the other one, because the other one leaves stupid stuff in there when you have strings. And here's what the values look like; you can see it's actually cast with two decimal places. So, as I said before, when you have a JSON document, one of the properties, or the document itself, could be an array. And there's a function available that allows us to query that array as if it were a table: JSON_TABLE. What we're doing here gets a little complex. You see at the top we're doing SELECT id and name FROM season, but we're also calling GROUP_CONCAT on a property called sp.role. You'll see sp doesn't exist in the FROM clause initially; then we go down here to JSON_TABLE. As I said, JSON_TABLE allows us to create a join between the table we're dealing with and the values of an array inside the JSON object in every row of the database. It takes a couple of arguments, and the first one, excuse me, is the column we're using.
So here you'll see we're doing s.season_settings, because I aliased the table as s — it was a lot of typing and stuff went off the side. The second argument is the path to the array we want to use, and in this particular case we want every element — that's what the star inside the brackets means — of the property named subPool, off the root of the document. And the last part of this is a columns definition: we use the keyword COLUMNS, and then we pass in the name we wanna use, the data type, and the path. You'll see here the path is dollar sign dot type. Remember I said earlier that the dollar sign always represents the root of the document, except in one case? This is that case. When you're using JSON_TABLE and you're specifying your columns, the dollar sign is the root of the particular item you're looking at in the array, not the JSON document itself. So we have a property called type inside every object in our subPool array, and that's what this refers to. Then we have role; again we set it as VARCHAR(20) and set the path to dot name. And then we have to alias it: whenever you use JSON_TABLE you have to use an alias, and if you don't, it'll throw an error. Okay, so that's where the sp comes from, and that's where this GROUP_CONCAT(sp.role) comes from. Then in the WHERE clause we say where the sub type, this column, equals 'role', and the course city, which is part of the season settings, is equal to Charles Town. And then we group those together so the GROUP_CONCAT will work: we say we wanna group by the id. And when we run this, you'll see that for the first group there's just one role, but for these last two there are multiple roles, and each one of those came from a particular element inside the array in the subPool property.
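The JSON_TABLE query walked through above might look like the following. The property names (subPool, type, name) and the filter values are my reconstruction of the talk's example, not the exact slide:

```sql
SELECT s.id, s.name, GROUP_CONCAT(sp.role) AS roles
FROM season AS s,
     JSON_TABLE(
       s.season_settings,
       '$.subPool[*]'                          -- every element of the array
       COLUMNS (
         sub_type VARCHAR(20) PATH '$.type',   -- $ = current array element here
         role     VARCHAR(20) PATH '$.name'
       )
     ) AS sp                                   -- the alias is required
WHERE sp.sub_type = 'role'
  AND s.season_settings->>'$.course.city' = 'Charles Town'
GROUP BY s.id;
```

Each row of the derived table `sp` corresponds to one element of one row's subPool array, which is what makes the WHERE clause on `sp.sub_type` possible.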
JSON_TABLE is very, very, very helpful if you need to search through data that's in a JSON array, because you can do what I just did and say: where a value in the array equals this. So, updating JSON values. There are three functions we can use to update the values of properties in a JSON document. The first is JSON_INSERT. JSON_INSERT will insert a new key into a JSON document, but it will not update the value if the key already exists. And you can add multiple keys in a single call: if you have a bunch of different values to add, you can call JSON_INSERT once and it'll handle all of them. Then we have JSON_REPLACE. JSON_REPLACE will update the value of existing keys in a JSON document, but it will not add them; if you call JSON_REPLACE and specify a property that doesn't exist in a particular document, it won't be added. And again, you can do multiple keys in a single statement. And lastly, the one I tend to use most: JSON_SET. It inserts and updates — if the key already exists, it updates it with the new value you pass in; if it doesn't exist, it creates it and sets the value. It's not on the slide, but as you can probably imagine, yes, you can do multiple at one time in one call. So let's show some examples. Here's JSON_INSERT. What we do here is UPDATE season SET season settings equal to JSON_INSERT, where the first argument is the column we're using, season settings, and then — whoops, I did it again — we insert a league fee property set to 25.5. In this middle example, this is the season we created at the beginning, where we put in an empty JSON object, so now it has a league fee of 25.5. But you'll see over here, in an existing row, the league fee stayed the same; it didn't get updated. JSON_REPLACE.
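The three update functions side by side, as a sketch against the assumed `season` table (in practice you would normally add a WHERE clause to target one row):

```sql
-- JSON_INSERT adds keys that are missing; existing keys are left alone.
UPDATE season
SET season_settings = JSON_INSERT(season_settings, '$.leagueFees', 25.5);

-- JSON_REPLACE updates keys that exist; missing keys are not added.
UPDATE season
SET season_settings = JSON_REPLACE(season_settings, '$.golfersPerTeam', 4);

-- JSON_SET does both: update if present, insert if not. Multiple
-- path/value pairs can go in one call.
UPDATE season
SET season_settings = JSON_SET(season_settings,
                               '$.golfersPerTeam', 2,
                               '$.useSubs', true);
```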
JSON_REPLACE takes pretty much the same arguments: the first is the column we're gonna use, the second is the property we want to replace, and the last is the value. Again, in our new league season you'll see the golfers per team value doesn't exist, but the golfers per team value over here was set to four. Then with JSON_SET, again we pass the column name and the property we want — golfers per team — and we set it to two. You'll see that in the new row the golfers per team property now exists, and over here it was set back to two. JSON_SET, to me, is a little bit better, because you don't have to worry about whether the key exists or not; you can just set it and be done. And then we can remove properties from a JSON document using JSON_REMOVE, where the first argument is the column name and the second is the path to the property we want to remove. You'll see now that golfers per team is removed from both rows. When you do update statements like this, you don't have to do them globally like I did here; you can add a WHERE clause. I just didn't, for the sake of these demonstrations. Now, this I think is really cool: if you have relational data in a table that also has JSON data, you can return all of it as JSON. The first way to do that is with a function called JSON_OBJECT. JSON_OBJECT takes pairs of arguments, so the number of arguments has to be evenly divisible by two; if it isn't, you'll get an error. In each pair, the first argument is the name of the key you want returned, and the second is the value.
So you see here, we call JSON_OBJECT: we're gonna have a key called id that equals the value of the id column, a key called name, which is the name column, then the start date. And then we have a key called settings, but we're returning season settings, which, if you remember, is a JSON document to begin with. This is what that would look like for the league we just created, where we only have that one setting in there. So it's smart enough to realize that when we're returning JSON inside JSON_OBJECT, it should come back as JSON, not as an escaped string. Next, JSON_ARRAYAGG. I tried coming up with good examples for a plain JSON array, and I really couldn't, because everything I wanted to do was usually the result of aggregating values from other stuff. What JSON_ARRAYAGG does is create an array of values based off your results, and it groups them — and you don't actually have to use a GROUP BY in this case. So here we say JSON_ARRAYAGG, and inside that we have a JSON_OBJECT; that tells me right away we're gonna return an array of JSON objects, okay? Inside that object we have the id, the name, and the start date, and then we have FROM season WHERE id is greater than 22, ordered by the start date descending. And this is our result: it's just an array — you can see from the top and bottom — of the objects we specified. JSON_ARRAYAGG also comes in handy if you have a table with a parent-child relationship and you wanna return multiple child rows inside one row, or inside one JSON object, from the parent: you can use JSON_ARRAYAGG on the child data with a join. So this is all good — being able to search properties on JSON like this is awesome, I think — but it's expensive.
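The two result-shaping functions just described, sketched against the same assumed schema:

```sql
-- Build one JSON object per row from relational columns; the existing
-- JSON column nests cleanly instead of being double-encoded as a string.
SELECT JSON_OBJECT(
         'id', id,
         'name', name,
         'startDate', start_date,
         'settings', season_settings
       ) AS season_json
FROM season;

-- Aggregate those objects into a single JSON array for the result set.
SELECT JSON_ARRAYAGG(
         JSON_OBJECT('id', id, 'name', name, 'startDate', start_date)
       ) AS seasons
FROM season
WHERE id > 22;
```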
Even though the format has been optimized for searching, it's expensive to go through each of the documents in every row, especially when you consider that each document can be up to a gigabyte in size. Each JSON column can support up to a gigabyte of data per row. Now think about how much data a gigabyte of JSON is. That's pretty insane. Now imagine you have a system where documents are even half that size, but there are tens of thousands of rows and you need to search by a particular property. There's gonna be a performance hit. Fortunately, we can index JSON data: we can create an index on a JSON column for a particular property. Now, has anybody here played with MySQL Document Store? Awesome. Inside the Document Store API there's a method called createIndex. That actually follows a different path than what I'm showing you here, because it has to be backwards compatible with earlier versions. The example I'm gonna show you uses a functional index, which, if I remember correctly, was first introduced in, I wanna say, 8.0.13. So the first thing we're gonna do is run an EXPLAIN plan on a very simple query: select everything from season where the city name of the course in the season settings property is equal to Charles Town. And this is what we get. The important things to look for here are the possible keys and the key values: if we had proper indexes on this table, those would be populated with the indexes it would use. So, the way we create an index — and this is a functional index. If you use createIndex in the Document Store API, it actually creates a generated column and then creates an index off of that. We're doing it differently: we're creating a functional index on the data directly. The first thing we need to do is call CAST.
And the reason is, remember I told you the path operator is just a shortcut for JSON_EXTRACT inside JSON_UNQUOTE? JSON_EXTRACT returns a text data type, and what can't you do with text? You can't index it. So for our functional index, we actually have to cast whatever is returned from the season settings — where we're getting the course's city — as CHAR. In this particular case I picked CHAR(100). Was it overkill? Yeah, probably. I don't know too many city names, outside of some place in Wales, that would run to 100 characters, but that's the way I did it. And the last thing we need to do here is set the collation to utf8mb4_bin. The reason is that values returned from JSON_UNQUOTE are collated as utf8mb4_bin. If you don't do this, the index will be created — it just won't be used, because the collations don't match. So after we create the index, we do SHOW INDEXES FROM season, and you'll see we now have a second row (the first one is the id; I didn't wanna put that up there). And down here you can actually see the expression, and you can see where it expands that path operator into JSON_UNQUOTE(JSON_EXTRACT(...)). Then if we go back and recheck the EXPLAIN plan, you'll see it's now showing that it's using the index we created. That's actually pretty cool; I like that a lot. And could I have created that index the way the Document Store API does, by first creating a generated column and then indexing it? Yeah, but I couldn't do it in one statement, and I wanted to show you how you can do it in one statement. So, for anybody who's interested, this URL will take you to the JSON function reference for MySQL. Everybody good there? I still see a couple phones up — want me to keep it up? There's another one coming next. Good? Okay.
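The functional index described above might look like the following, assuming the talk's schema; the index name is my invention. This requires MySQL 8.0.13+ for functional key parts:

```sql
-- The CAST gives the expression an indexable type, and the collation must
-- match what JSON_UNQUOTE returns (utf8mb4_bin), or the index will be
-- created but never chosen by the optimizer.
CREATE INDEX idx_course_city ON season (
  (CAST(season_settings->>'$.course.city' AS CHAR(100))
     COLLATE utf8mb4_bin)
);

-- Recheck the plan: possible_keys/key should now show idx_course_city.
EXPLAIN SELECT * FROM season
WHERE season_settings->>'$.course.city' = 'Charles Town';
```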
So, one of the things that we as the community team have been able to do is work with Oracle University. If you've never heard of Oracle University, that's the group that manages the Oracle certifications, as well as the training behind those certifications. We actually have an agreement with Oracle University that whenever we speak, we can offer free training or free certifications to the people who attend our sessions. This fits right into my giving away free stuff, because I don't have to pay anything for it. If you scan this QR code, you'll be brought to a page where, if you already have an Oracle account, you can use that to sign in; if not, you can create one. Now, this is a relatively new process, so please bear with the people who handle this information — once you take the screenshot and go to the URL, I'm done, I've done my job. I forget how much the credit is; I want to say it's $200, it could be less than that. But basically you have two months to take any training that matches it or to register for a certification exam. And it's free; it won't cost you anything, except for the fact that Oracle has your information. Sorry, can't help you there. So, wow, I did pretty good considering we started pretty late. Does anybody have any questions? That means either I did a really good job or I did a really bad job. I'm gonna go with the former, because I have a very fragile ego. So, now it's time to give away the free stuff — and again, I didn't pay for this, so I don't really care. I told you guys you needed to pay attention to everything I say and everything you see. How much does my dog weigh? What was that? 106 pounds — you want the hat or you want the dolphin? This is not gonna go very far. See? Wow, that was like Donovan McNabb. All right, the next question: how many babies have I delivered? You had your hand up? Three.
And that is why I don't play baseball. If nobody has anything else, our time here is done. Thank you for spending the last part of the day with us. Well, with me, I appreciate it. And you guys have a great rest of the conference.