All right, welcome everybody. I'm so glad you're here. I'm going to be talking about adventures in data and leaning on Kubernetes storage to run hundreds of real-time analytic databases. As you'll see in a moment, I am a database person. Let's just jump into the intro parts. My name's Robert Hodges. I've been working on databases for over 40 years. And as database people, we love storage. Kubernetes has made it very interesting because its storage is quite different from the storage we used to love: things like battery-backed cache and firmware versions and RAID are gone, and there's a whole new set of resources that we have to think about. I've been working on Kubernetes since 2018. My day job is CEO of a database startup called Altinity, and what I'm going to be talking about here is based on experience in that startup. We are a service provider for ClickHouse. Five years ago we made a decision, basically a bet, that Kubernetes was going to be the wave of the future for management of data, and we moved our processing onto it. That has turned out to be true in spades. Among other things, as part of that bet, we built an operator, first of all for ClickHouse, a popular data warehouse that I'll be talking about a little bit here, and then built a cloud on top of it. There are now thousands of clusters running on this. We ourselves, in our cloud, are running about 240 production data warehouse clusters at this time. Some of them are pretty big, on the order of 24 nodes and 40 terabytes of attached storage. These are not small. So that's the engineering background. As I go through here, I will be talking specifically about ClickHouse so that I know what I'm talking about, but just about everything we do here generalizes to other database systems.

So let's talk a little bit about analytic databases and introduce ClickHouse. ClickHouse is what's called a real-time analytic database. There are some marketing words that explain it, but it's simpler to explain what it does. Imagine a security event and incident management system. It could be a multi-tenant system, say run by some big company like Cisco. It could be reading tens of millions of events per second, pulling data from various types of sources and looking for trouble: people breaking into systems, calls to servers, DNS requests to servers that are known malware sources, things like that. So, very high rates of ingest. You also want to be able to react quickly to things that happen; if somebody does start making those DNS requests, you want to know about it. Moreover, you want to be able to go through the whole history that led up to that event, for example being able to scan over billions or even trillions of rows to find out what's going on and zero in on the cause. This is the kind of problem that databases like ClickHouse solve very well.

So what's the architecture to do this? This is a database that is optimized for reading large amounts of data very fast. There are a number of features here; I'll just highlight some of the key things. It uses columnar storage. That's not really relevant when we think about the underlying disk or SSD beneath it, but it's a fundamental part of the design. In modern analytic systems we tend to use block storage, for example cloud block storage, which we'll talk about quite a bit, or SSD.
And increasingly you also see S3 or S3-compatible object storage used as a backing store for the tables, and sometimes for the source data itself. We won't talk about that so much in this presentation. Databases like ClickHouse are complex distributed systems. This just shows two nodes with replication between them, so that would be one shard of data with two replicas, but you could have 10 shards of data with three replicas each. These are quite large systems that communicate over networks. So that's the architecture. We're going to be putting this onto Kubernetes and then figuring out how to make it access storage efficiently.

The first thing I'll talk about is mapping that database to Kubernetes. And since this is Kubernetes and it is a database, of course we're going to start by writing an operator. When we started this back in 2019, operators were not that common. I don't know exactly when the operator model officially became part of Kubernetes, but it was not long before that. Basically, an operator, if you haven't used one before, allows you to define new types of resources. We call this a custom resource definition, and then you supply the operator, which is a special kind of controller that can carry out operations to make that resource definition become reality. The thing that's really cool about them is that the resource definition allows you to take a complex system like ClickHouse, like Druid, like Cassandra, and reduce it to maybe two feet of YAML, which you can then reason about and change easily. What the operator does is, when it sees one of these YAML files come up, it goes out and looks at the corresponding resources that should be there in Kubernetes to make it reality. If they're not there, it creates them. If they are there but different, it changes them. That process is called reconciliation: you look at the desired state of the world, the resources that should exist, and you make it happen.

So the first design question we had that was relevant to storage was how to even represent a database like ClickHouse in Kubernetes resources. How many people here are familiar with stateful sets? Everybody, yes. Stateful sets are the standard low-level abstraction in Kubernetes for dealing with data. They have a notion of a fixed name for each replica, and then logic to make sure that you can connect that replica to storage and, if the replica restarts, reconnect it, stuff like that. So the question is, starting with a simple example like this, where we have a database running in a container and some storage we want it to talk to, what's the right representation so that down on some worker node we actually end up with a process talking to some block storage? What we ended up choosing was to use stateful sets. As I'll describe, there are some limitations to them, but what you see here is a stateful set. It defines a pod that represents, in our case, a couple of containers that manage the database, and it has a persistent volume claim. In fact, it could have a number of them, because we often lay things out with multiple volumes, but each persistent volume claim then corresponds to a patch of storage represented by a PV. So that's the basic abstraction. But then the question is, should you just lean on that? For example, I talked about replicas. Should we use stateful sets to do that?
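Before getting to that, here is a minimal sketch of the single-server shape just described: a StatefulSet whose pod runs the database container and whose volume claim template requests the block storage. This is not the operator's actual output; the names, image tag, and sizes are illustrative.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: clickhouse-server-0          # hypothetical name for one database server
spec:
  serviceName: clickhouse-server-0   # assumes a matching headless Service exists
  replicas: 1                        # one pod per stateful set
  selector:
    matchLabels:
      app: clickhouse-demo
  template:
    metadata:
      labels:
        app: clickhouse-demo
    spec:
      containers:
        - name: clickhouse
          image: clickhouse/clickhouse-server:latest   # illustrative image tag
          volumeMounts:
            - name: data
              mountPath: /var/lib/clickhouse
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi           # illustrative size
```

The claim, and therefore the persistent volume behind it, survives pod restarts, which is the property the rest of the talk leans on.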
And it turns out the answer is no, and this has to do not just with storage but with the way that databases work in general. The thing is, in clustered databases, even though, yes, these things should all be cattle, not pets, in the database industry we've kind of resisted that. They in fact are more pet-like than you would like. Database replicas often have different resources. They could have differing amounts of storage, different types of storage, different distributions of volumes. They could have different software versions. They could be in different AZs. A very common thing in database clusters is to have some nodes dedicated to ingesting data and others dedicated to reads. For this reason, you have the potential for very asymmetric architectures. As a result, we made a design decision very early on to go ahead and use stateful sets, because they have a lot of good properties and they do map the pod to the storage in a fairly convenient way, but to use just one stateful set per database server. This doesn't seem very exciting now, but you can believe we spent literally weeks thinking about whether it was the right thing to do. And once we did it, we stuck with it. So that's the basic map.

What we did beyond that is that when we defined our CRDs, we imitated the good parts of stateful sets, starting with the pod template that allows you to define the pods. It's very flexible; you can do whatever you want with the pod definition. So when you look at our CRDs, you'll see we refer to pod templates. Stateful sets also have volume claim templates; we have a volume claim template too, and it uses very similar syntax, almost identical to a stateful set. What that means is you can configure the pods and the volumes virtually any way you want, because we allow all of the relevant syntax. This is the top level. I don't show the pod templates or the volume claim template yet, I'm showing those in a couple of slides, but the key point is that in our CRD we imitated the stuff you know and love from stateful sets.

So I'm going to take a quick detour here, because now we're going to start to talk about storage in a little bit more detail. Let me ask the crowd a question. If you were doing a read performance test against EBS versus SSD, which would be faster? How many people would vote SSD? And let's say NVMe SSD, locally attached. How many votes for SSD? How about EBS block storage on Amazon? One brave soul. You win. Okay, so let's look at this, and this is relevant because it then leads us to things like separation of storage and compute. Here are the basic architectures: a database mapped to a process running in an i3.4xlarge VM, which is an older Amazon VM type with attached NVMe SSD, and then another VM with Elastic Block Storage, or EBS. What's kind of interesting is what happens when you test these. There's a popular test called ClickBench. It's used to benchmark ClickHouse and also other databases.
When I run this test and allow the storage to be cached, in other words, run it multiple times so that the pages are pulled into the page cache, one really interesting result pops out: the VM with EBS storage is universally faster than the one with NVMe. This is a test where smaller is better, the red line is the EBS VM, and it's faster in every single case. So that's just one simple example of how sometimes storage behavior in a database is not quite what you think. In this particular case we ask, why is it faster? Well, it turns out that since we're mostly reading out of memory, it's clock speed. The i3 is a kind of older Xeon. It's got a clock speed that's about two thirds the speed of the m6i, and that accounts for probably most of the difference we're seeing in these performance test results.

Now you're thinking, I came all this way to hear something kind of obvious. But what's interesting is that if we test it again and force direct I/O, we actually get similar results, which is kind of surprising. This is another run of the same test, with NVMe and direct I/O; that's the blue line, and the red line is EBS. You can see they're pretty much the same, not a lot of difference. We even threw object storage in there, S3 storage; that's also really fast. Part of this is because databases like ClickHouse are designed to dampen out the effects of storage. That's important because it means we can use EBS block storage for these database clusters. So that's an interesting finding, something that again was not obvious to us when we got into this, but it has major consequences going forward.

Let me just dig into one of them: separation of storage and compute. In modern database systems, as much as possible, we want to break the bond between the storage and the compute that's applied to do things like calculate page bounce rates or time on page or histograms, things like that. These are relatively compute intensive, and depending on how often you do them, you might want powerful VMs or you might want weak VMs, because there's a huge cost difference. What's cool about Kubernetes is that if you use block storage in the cloud, you basically get separation of compute and storage for free, and we lean pretty heavily on this. Some people will say, oh, that's just vertical scaling of database nodes, but the fact is it works. And it means that by choosing the node size, the most expensive part of many of these database systems becomes variable. Here's a simple example. What we do, and this is actually really important, is give each database a VM to itself, because databases tend to make assumptions that they own the entire page cache and have access to all the RAM they can see on the machine. When you're running in a container, Kubernetes does not hide the fact that the machine may have 16 gigs of RAM; everybody can see that. So if each server assumes it's going to use 90% of that and you have a couple of them running, they will contend with each other. So we run each one on its own VM, and then of course we have the attached storage, which we're going to grow as they need more data. So how do we do that? Well, if you know how compute and storage are allocated, here's an example.
We set up the Kubernetes clusters, and first of all, for the VMs, we use a provisioner. In our production systems we typically use node groups; our cloud guys like them better. Personally, I use Karpenter, because it's very flexible and you install it once and it works for all VM types. The other thing that's important for storage is that in modern Kubernetes, particularly managed Kubernetes, you use a CSI driver, which has a provisioner that can allocate EBS. The properties of what actually gets allocated are controlled by the storage class. So the storage class will say things like, oh, it's GP3 storage on Amazon EBS, and you can set the number of IOPS, you can set the throughput up to a gigabyte per second, and so on and so forth. So this is how storage gets allocated, and if you set this up properly, it then becomes very easy within an operator like ours to use the well-known instance type label and just say, hey, give us an m5.large.

Again, it's not very profound, but it has incredible effects for databases. This could be a very large cluster; it could have storage that holds a trillion rows of data. When I'm developing, I'll run on this instance type, and then I'll flip it to something like an m6i.8xlarge when I want to do performance testing. I just make that change, and the operator takes care of reallocating the pod with the new definition, and boom, in the background the old VM drops, the new VM comes up, and it now has extra compute horsepower applied to the storage. So this is really, really nice; in Kubernetes, it just falls out of the box. I'm keeping this simple, because normally I would do some other things. I have the node selector, and you can see I've also got a selector for zones to put it in particular availability zones. I would also add anti-affinity so that I'm not going to have two ClickHouse servers trying to be on the same VM at the same time. In a heterogeneous system I would probably also add taints and tolerations, so that I can make sure I'm not contending with other services that aren't ClickHouse. But this is all very, very easy to do, and it's just a few inches of YAML.

And then the volume claim templates, which are the other part, are equally easy. We can just pick the storage class by name. If you're just playing around, you don't have to include a storage class, but in this case I want to be sure I'm getting GP3, because this is good storage, very performant. I want it to be encrypted, and then there's the size. That's all I have to do to allocate storage. Moreover, I can extend this storage while the system is running, and I'll talk about that in just a minute. Again, this is kind of cool, because these are things that are actually essential to making these large analytic systems work efficiently, particularly cost efficiently, and they just fall out of running on Kubernetes. We do have users that are still doing this through Ansible on VMs, and believe me, every time I work with one of those folks I get a sinking feeling, because we're going to have to help them fix their Ansible scripts so that they can deallocate VMs and then remount the volumes on the new VM when it comes up, make sure everything gets properly installed, make sure all the mounts come up. It's a total pain, and Kubernetes just makes it happen easily.
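Put together, the pod template and volume claim template pieces of the CRD come out roughly like this. It's a sketch modeled on the operator's published examples rather than the exact slide; the instance type, zone, storage class name, and size are just the illustrative values from the discussion above.

```yaml
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo"
spec:
  defaults:
    templates:
      podTemplate: clickhouse-pod
      dataVolumeClaimTemplate: data
  configuration:
    clusters:
      - name: "demo"
        layout:
          shardsCount: 1
          replicasCount: 1
  templates:
    podTemplates:
      - name: clickhouse-pod
        spec:
          nodeSelector:
            node.kubernetes.io/instance-type: m5.large   # flip to m6i.8xlarge for performance tests
            topology.kubernetes.io/zone: us-west-2a      # pin to a particular availability zone
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:latest
    volumeClaimTemplates:
      - name: data
        spec:
          storageClassName: gp3-encrypted   # the storage class shown a bit later
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 500Gi                # illustrative size
```

Change the instance type, resubmit, and the operator reconciles the underlying stateful set; anti-affinity and taints and tolerations would be added to the same pod template.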
So, with that, we can now map the servers, as I've said, precisely to the VMs and storage, and as I said before, if I want to change it, I just change that value, resubmit the YAML, and it happens automatically.

There's another cool thing; it's just all these little tricks. Another really cool thing that drops out of this, because we're using the stateful set: do you guys know this trick about dialing the replicas to zero? Anybody do this? Okay, if you have a stateful set, it has this setting, the number of replicas, and that's the number of pods that gets started, each of which is attached to storage. In our case it's normally one, and it was that way for a while, and then we discovered this trick: if we want to just shut the pod off, like, hey, I've got storage, but I'm developing, I'm working nine to five and I'm going home now, let's just turn this thing off, you can dial it down to zero. At that point what Kubernetes will do is just shut the pod down but keep the storage alive. So this is ideal. We use this to implement what we call uptime schedules. These are similar to what Snowflake does, if you've ever used the Snowflake data warehouse: you can have a schedule to run your virtual data warehouse. You can do this in Kubernetes very, very easily. That's important because part of our goal with this open source database and Kubernetes and the operator is to help people build proprietary, or sort of custom, equivalents of Snowflake that solve particular problems, and Kubernetes makes this work very, very well. You can just go patch the stateful set to do that, but to make it convenient we actually added something to the YAML that we consume in the CRD. We just say stop equals yes. That turns off the compute, and then you can go home and you don't run up a big bill. And that's the effect: your stateful set is still there, your persistent volume claim is there, but the pods just vaporize until you turn them back on again.

But we're not done; there are still more tricks, and we've got 12 minutes. We were feeling pretty good about this, but then GP3 came along. How many people use EBS, elastic block storage? Okay, about half of you. What are the rest of you on, Google or Azure? Okay, I see somebody nodding when I said Azure. So GP3 is one of these things where you wouldn't think it would be a big deal, but people were probably lifting beers when it came out, because it's really good block storage. It's way better than the previous generation of elastic block storage, which is GP2. It includes things like the ability to set the actual throughput, which you can dial up and down. It used to be that EBS had these sort of weird rules where, for example, in order to get adequate throughput we would end up stacking EBS volumes, because for every volume you added to the VM it would give you more bandwidth. We don't have to do that anymore. You can just dial the throughput up and down, up to 1,000 megabytes per second. You can set the IOPS. And there's also volume expansion; I believe that one existed in GP2 already, in fact I'm pretty confident it did. But the interesting thing is that these can all be set in the storage class now.
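For reference, a GP3 storage class along those lines might look like the following. This is a generic sketch using the AWS EBS CSI driver's documented parameters rather than the exact class from the slide; the IOPS and throughput numbers are placeholders.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
  iops: "6000"          # placeholder; GP3 lets you set IOPS independently of volume size
  throughput: "500"     # MB/s, placeholder; GP3 goes up to 1,000 MB/s
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```

The catch, as the next part explains, is that these parameters only take effect when a volume is provisioned.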
So this is a storage class, and it's the definition of the class of service of the storage that you're going to get when you refer to this gp3-encrypted class. There's just one problem: those parameters, particularly that list of parameters right there, cannot be changed once they're set, because the CSI interface doesn't know about these things, and there's no way to get them from inside Kubernetes out to Amazon where they could be changed and actually do you some good. And people do change this stuff. We have people who allocate sort of the lowest level of storage performance, and then they discover things are slow and they want to change it. So we need to give them a way to do that. What we want, effectively, is a simple way of changing these parameters on demand, and so what we did was apply an idiomatic Kubernetes pattern, which is to build a controller for it. What we do when we make our persistent volume claims now is add special labels, or special annotations I should say. This is an annotation; you can see it has our name, altinity.com, on it, and it says throughput 1000. This is a request on the persistent volume claim to go make this volume have higher throughput. Kubernetes doesn't really know what this does; with Kubernetes annotations you can do practically anything you want, so Kubernetes ignores this. But the controller that we built does not.

We have a controller called the EBS params controller, and what it does, and this is kind of an expansion of the model I showed a couple of slides back, is this: we have the persistent volume claim, which points to a storage class. The EBS CSI driver sees that and actually creates the volume. That gets you the initial volume. But we also have these annotations, which are on the claim. The EBS params controller is watching, and every time one of these claims changes, it will go out, take the annotations it finds, and apply them to the EBS service directly. So we can now make these changes to Elastic Block Storage very simply from within Kubernetes.

It gets better. Not only can we change the values, and this is totally simple, I think the controller is about this much Go code, it's really pretty small. One other interesting thing it does is read the current properties back from the volume. This turns out to be exceedingly useful, because if you want to make changes to EBS volumes, like expanding one, for example, you can only do it once every six hours. So one of the things we do is collect the state of the current modification: is it completed or not, when did it start, when did it end? And this allows us, for management purposes, again from within Kubernetes, to keep track of what's going on out there in cloud-land and manage it correctly. I've used Amazon examples throughout here, but we're now certifying this controller to do the same thing on Azure. We have users actually already on it. It turns out that Azure has a similar need to control storage. So this is kind of a cool trick. One of the things that follows from it is that every now and then we have users who say, hey, I've got 24 volumes, I need them changed fast.
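As a sketch of that pattern, an annotated claim might look something like this. The annotation keys and the label are illustrative stand-ins, not the controller's exact names; the point is simply that the request rides along on the claim where a controller can see it.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-clickhouse-server-0          # hypothetical claim name
  labels:
    app: my-clickhouse                    # illustrative label used to select many claims at once
  annotations:
    ebs.altinity.com/throughput: "1000"   # illustrative key: request 1,000 MB/s
    ebs.altinity.com/iops: "10000"        # illustrative key: request more IOPS
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: gp3-encrypted
  resources:
    requests:
      storage: 500Gi
```

Kubernetes stores the annotations and otherwise ignores them; the EBS params controller described above is what watches the claims and pushes the values out to EBS.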
Well, now you can, just by applying these annotations. This is a kubectl annotate command. We can apply that annotation, lock, stock, and barrel, to, for example, all of the persistent volume claims that have the label my-clickhouse, and so we can apply it to all of them at once. This is very useful if we need to do something quickly; otherwise we would have to go and do this in a much more cumbersome way, so we can just get it applied instantly.

Okay, we're making good time here. So there's a final trick, and this is kind of interesting. One of the great things, in addition to the fact that you can reattach block storage, is that in Amazon, in Google, in Azure, you can also extend its size on a live system. That's totally cool, because the most common resource problem we have in data warehouses is that people are ingesting more data and usage keeps going up: they hit 70% of the disk they allocated, then 75, 80, 85, and around the time it hits 90 we're starting to get nervous, because they could run out of storage. And then things will stop, they will call us, they will be mad, we won't be happy. So what you can do, of course, if you're just using a stateful set, is fix this by changing the storage allocation in the volume claim template. That is fine, because it actually gets the job done, but there's a problem. With stateful sets, when you do it that way, it's going to alter your pod definition, and it's going to cause the pod to restart. Why is that bad? Well, it could be really slow. Data warehouses, when they open up, in fact databases in general, tend to have to do things like go look at all the files they're processing, and they may need to rebuild caches, things like that. That's slow; you really don't want to restart a database. So one of the things that we recently added, back in May, is that we give users the option of taking the responsibility for managing storage away from the stateful set, and we just do it for them. What happens is this option right here says, hey, the operator is going to manage the storage for you. We remove the template for it from the stateful set that we generate and handle it in the background, and as a result, these operations can be done without forcing the pods to restart.

So those are the tricks we've learned so far. I think there are others, but these were the main ones. Let me just give a few final words; I think we're right on time here. Learnings: Kubernetes is great for running databases. I want to emphasize that, and operators are probably the biggest single reason for the greatness. I think another not-so-secret power of Kubernetes is that it works portably across many environments. We operate in multiple clouds. Our users want the ability to run these clusters everywhere, from minikube on a laptop all the way up to clusters with dozens or even hundreds of nodes in the cloud. Kubernetes does that. So the things that we got out of this experience are, first of all, build on the existing Kubernetes resources wherever possible. We went with stateful sets.
I think in the end, we're sort of taking things out of stateful sets, and we're at a point where you might say, hey, maybe we should have just implemented our own controller, our own specific resource for this. Other operators have taken that path, but we went with what was there, and that had the advantage of making it pretty flexible and easy to understand. Performance is important. There is no cost that we know of to running things in Kubernetes as far as performance is concerned. I should have mentioned this earlier: we had concerns that somehow Kubernetes, maybe because it's using overlay networks or something like that, would affect performance. It does not. And it certainly has no impact on a process talking to storage, because in the end that's all Kubernetes is managing; it doesn't insert anything in the way. But you definitely want to test performance carefully, because how performance behaves with a particular application like a database may be very dependent on the use case and the way you're accessing it. Then Kubernetes plus cloud block storage gives you separation of compute and storage. That is such an important property for large systems. The ability to scale the compute, which is often the most costly part of these systems, up and down efficiently, and to use an operator to do so without taking apps down, using rolling upgrades, is a really wonderful feature. It's actually been one of the things that's allowed us to build a business on this. And then storage, of course, is complex. There are these extra parameters, and there are things you want to avoid, like restarts. Fortunately, there are idiomatic ways to get around these within Kubernetes. It's a very flexible model. I think the controller pattern, that idiom of using controllers that look for special labels, is a very powerful one, and it can enable you to hack around, excuse me, engineer around, limitations in APIs.

So where are we going next? The current stuff we've got works pretty well. There are two things we're looking at. First is object storage. I haven't talked about it much in this talk, and it actually isn't really something that Kubernetes has that much to say about, because object storage like S3 is just another application service; it runs on the same network your applications are using to communicate. But it does have some interesting storage implications, because object storage does require caches to use it efficiently. Typically, you do not want to read everything from object storage. If you're repeating queries, you'd rather suck the data over, put it on fast storage locally, and then process the blocks there. So we will be coming back to NVMe SSD; it will be more prominent in our architectures, and this is why. The other thing is backup, which is a big issue in all databases, including data warehouses. It's kind of an interesting problem, because data warehouses often have such huge volumes of data that maybe you just don't bother to back them up; it adds so much cost to your system. Maybe what you do instead is recreate things from source data, if you have it. But we are definitely very interested in disk snapshots, or volume snapshots, which are available in Kubernetes. Other operators have applied them very successfully.
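For context, a volume snapshot in Kubernetes is itself just a small resource. This is a generic sketch, assuming a CSI driver with snapshot support and a snapshot class already installed; all of the names here are illustrative.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: clickhouse-data-snapshot               # illustrative snapshot name
spec:
  volumeSnapshotClassName: ebs-csi-snapclass   # illustrative snapshot class
  source:
    persistentVolumeClaimName: data-clickhouse-server-0   # the claim to snapshot
```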
So that may be another place, sort of a new place, that we're going in terms of volume management. Some references, by the way; these slides are up on Sched. I want to give special thanks. You know, I mostly work with spreadsheets and SQL and stuff like that. Alexander Zaitsev is our CTO, and Vlad Klimenko is the engineer who wrote the operator. These guys are great engineers, and they and other people on the cloud team are the ones who really figured out these ideas. And that's it. Thank you very much. We don't have another talk after this, so if you want to ask questions, you can just step up to the mic, and we can go outside after a while, but I'd be happy to stay around and take questions. Please go ahead.

Yeah, my question is, with the use of EBS or block storage, how do you deal with challenges around multi-zone support and dealing with failover? At least I heard EBS will be multi-zone in the future, but at the moment it's not. Well, actually on Amazon, EBS is, it's interesting. Google's block storage actually does have the ability to replicate to other locations. We don't use it. It turns out that ClickHouse data warehouses are pretty good at replication themselves. So we just use the pod definitions to force them to particular availability zones and make sure they're spread out evenly. We can operate perfectly well over AZs.

Thanks, if I could ask one more quickly. The benchmarks you showed, S3 was almost close to EBS, which was surprising. How do you explain that? Yeah, how do we explain that? Well, you know what, it's a really interesting result, and there are two things to notice about it. I'm going to flip back to it. One thing is that S3 is pretty darn fast. This is something people don't realize: S3 and EBS actually compare pretty favorably, and the network performance on Amazon is very good. So that's one of the reasons. Now, one thing that's interesting is the yellow graph; this is somewhere you actually can see a difference. This is a log scale, and on the smaller queries you tend to see a bigger gap to the S3 line; on the larger queries you don't. What you're seeing there is the latency to go fetch data off S3. There is a real cost, because it's a longer hop to get to the S3 data and back, and you actually do see that in these test results. The other thing is that where S3 gets expensive is that it doesn't have a buffer cache. You have to write application logic to pull the data down and cache it in order to use it efficiently. So there's no free lunch here, but these are queries on a relatively small amount of data, I think it's about 100 megs or something like that, and for that, S3 is very, very fast. Yeah, so a comment here: with S3 you get P90 outliers. Yes you do, or P95 or P99, like anything that runs on a network. Moreover, S3 contends with other stuff your application is doing on the network. That's a key thing. EBS, and this is certainly something I didn't understand well, EBS is a SAN, so there's a separate network, and Amazon in particular is very good at allocating bandwidth, so you don't get contention with other things the application is doing. Do you want to step up to the mic and just ask that, so we get it? Are these S3 numbers through a VPC endpoint, coming from your VPC over Amazon's private backplane, or are you going?
We didn't do anything special. I just set up a VPC and I'm going straight to Amazon; there's no endpoint in there. Yeah, S3 has a bunch of these considerations, which is why I'm not doing that talk today. To this point, there are special costs around S3. The one I think people really underestimate, and this is another reason why caches matter, is API calls; they can just blow you out of the water. Normally people look at S3, and what's attractive about it is that it's internally replicated. You only keep one copy, compared to block storage where we replicate maybe two or three ways. So already that's a huge difference in cost. But then of course you add in the API calls, and maybe it doesn't look so good anymore.

Another question? Yes, I have a question. Does Altinity.Cloud run on top of Kubernetes by using your operator? Yes. Can you repeat that question, please? Does Altinity.Cloud run on top of Kubernetes? Your cloud, Altinity.Cloud, runs on top of Kubernetes? Oh, does it run on top of Kubernetes? Yes, I'm so sorry. Yes, completely. 100% of our processing is on Kubernetes. We don't do anything on raw VMs. And so if we want to use Altinity.Cloud in AWS... Yes. ...do we need to build our own Kubernetes cluster or not? You can, but only if you want to. So the question is, do we force our users to build their own clusters? No. What we do by default, if you use our cloud, and other clouds do it the same way, is build the Kubernetes cluster in our own account, and you just get an endpoint. Under the covers is Kubernetes, but you don't know that it's there. What you can also do, and this is I think another place where Kubernetes is very powerful because it's portable, is we have a model where you can pop a container into your own Kubernetes, and it will form a secure management connection with our cloud management plane. At that point we can manage data warehouses in your Kubernetes. This is how we bring up people on Azure. We don't formally certify Azure yet, or rather, we don't run it in our own account yet, is what I should say, but we can allow people to run it in their accounts. Okay, thanks. You're welcome. Yeah, that's a great illustration of the portability of Kubernetes.

So, now that we're here at KubeCon: you mentioned everything you have to do to work around the parameters that you specify in the storage class, because you cannot change the storage class afterward and changes are ignored. How do you think this could be changed upstream, so you could do what you have done but without having, like, a... You know, what I would expect, and I don't know the CSI interface very well, in fact I don't know it at all, I normally work in the database, is some sort of custom parameters argument that you could pass in, so that you could in fact have a way of taking those parameters and just pushing them through. That would, yeah. All right, I'll take one more question and then we'll go outside.

Yes, so this is more on ClickHouse, I hope that's okay to ask. One of the things we were evaluating, and I wanted your input, was ClickHouse being used standalone versus ClickHouse being used as a query engine with the data in Parquet files in S3, you know, fragmented, partitioned data, or chunking, in S3. Your thoughts on one versus the other? Oh, you mean using it with Parquet versus using it in ClickHouse?
One is to use ClickHouse standalone for everything. The other is to use ClickHouse as a query engine with the data fragmented and chunked into Parquet files in S3. Yeah, you know, that's a great topic. The difference, and this sort of extends into object storage, is the difference between storing things inside a table versus external data. We're trying to merge those models. In fact, one of the things we were even talking about yesterday was to have, basically, a Parquet part type, so that ClickHouse tables would just store the data in Parquet. We can read Parquet very well. And what we're seeing in analytic systems, and this is sort of a secular development across many systems, is that Parquet is the favored long-term, sort of read-only, storage type. People put it out on S3, and then you want your data warehouse to be able to read it, because it's fast and can do real-time queries. But at the same time, you'd also like it to be accessible to, for example, your machine learning and AI. So there is ongoing work in ClickHouse; I think the most important thing in ClickHouse right now is just to make Parquet reads fast. And we have seen that things like predicate pushdown and row groups are done better in Parquet than in ClickHouse. No, actually, well, okay. See, Parquet is just a format, and predicate pushdown, in other words taking a condition and applying it within the Parquet library, is just a matter of how well the database uses those features. And right now there are some databases, like DuckDB, that are really quick on Parquet. Our first priority is catching up.

Great, I'm going to call it good because I think our AV guys have got to go home. But thank you so much, everybody. Feel free to come up. Yeah, and thank you all for coming.