Hello, everyone. We're here to give you an introduction to Vitess and to present a real-world usage of Vitess. I'm here with Arthur, who's a software engineer at GitHub and also a maintainer of the Vitess project. And my name is Florent. I'm a software engineer at PlanetScale, the company behind Vitess, and I'm also a maintainer of Vitess. So today, we're going to give you a brief overview of what Vitess is, try to explain how it works and why we have Vitess in the first place. Then Arthur will move on to Vitess at GitHub: what they had before Vitess, why they decided to move to Vitess, and how it looks now that they have Vitess instead of the old solution. Then we'll move on to new and upcoming features, and we'll finish with some questions and answers, plus resources for you to learn more about Vitess.

Before I get started, I just want to do a quick survey. Who has heard about Vitess before, in the sense that you've played with Vitess or touched Vitess in the past? Just raise your hand. Okay, cool. That's a lot more than I expected. I'm happy.

All right. What is Vitess? Vitess is a scalable, distributed, cloud-native database system built around MySQL. The main goal of Vitess is to be a seamless replacement for your MySQL. And it is part of the CNCF, where it is a graduated project; I think it reached graduation around 2019. It originally started as a scaling solution for MySQL at YouTube around 2010. So it was created at YouTube, it grew at YouTube, and in 2018 YouTube donated the whole project to the CNCF.

Vitess is massively scalable, thanks to sharding. And it is also highly available: we run MySQL in replicated mode, with primaries and replicas, which means that whenever a primary node fails, we can fail over to a replica. Vitess is compatible with MySQL 5.7 and MySQL 8.0. It used to also be compatible with MariaDB, but since the MariaDB and MySQL code bases diverged, we prefer to focus only on MySQL.

Vitess is used by many deployments, small and large, all over the world. And these aren't all the names; I know that Activision is also using Vitess. Just a quick survey: who is using Vitess in production here? Okay, cool, so that's GitHub and Activision. All right, among all of these, we have some key adopters. Slack, which is 100% on Vitess; they wrote a very good blog post, linked at the bottom of the slide, explaining how they migrated, why they migrated, and so on. GitHub, which we'll talk about in a moment. JD.com, a very large Chinese e-commerce website, with more than 10,000 databases running in production. And PlanetScale, which also has more than 10,000 Vitess clusters running in production.

Obviously Vitess is an open source project; that's why it's in the maintainer track. We have 15 maintainers contributing to Vitess very regularly, and in 2022 we had more than 200 contributors, of which more than 100 were code contributors. All of those contributors came from 57 different companies, and the code contributors came from 22 companies.

All right, before we go into the more technical part of the talk, I just want to introduce two keywords about Vitess. The first one is keyspace. A keyspace is the equivalent of a MySQL logical database.
So, for example, you can have a keyspace user, where you have a bunch of tables around users, user data, and so on. A keyspace can be composed of one or more shards. A shard is basically a subset of all the data inside the keyspace, and a shard is composed of one primary and one or more replicas.

This is a simple diagram of the architecture of a Vitess cluster. On the right, you can see shards 1, 2, 3, through N. These shards are composed, as I said before, of a primary and replicas. A primary, for example, is composed of the mysqld instance, where all the data is stored and which acts just like any mysqld, and, attached to it as a sidecar, the VTTablet component. VTTablet is responsible for managing the mysqld instance. VTTablet also exposes a gRPC API, which is used by VTGate, the small component in the middle, to send it queries and instructions.

All right, so VTGate. This is the most user-facing component of Vitess. This is where you send queries, this is where you send instructions. It exposes a gRPC API, and it speaks the MySQL protocol to talk with applications. Whenever you send a query to your Vitess cluster, it goes to a VTGate, and VTGate will parse the query, evaluate it, and then send it to the correct shard and the correct tablet.

In yellow, we have the control plane of the cluster. We have vtctld, which is the administration tool of Vitess. We also have VTOrc, which repairs any failure in your cluster. And VTAdmin, which is a front-end UI that allows you to visualize and manage your cluster. Finally, in red, this is the topology server. That can be etcd; it used to also be Consul, but we're deprecating that; and it can also be ZooKeeper. This is where we store the metadata and the configuration of the cluster.

All right. So why would you choose Vitess? What are the main features of Vitess compared to vanilla MySQL? We try to be as compatible as possible with MySQL, because the goal is to be a seamless replacement for MySQL. We have the resharding feature, which allows you to partition your keyspace into different shards. We also have materialization, which is like SQL materialized views, except that where you normally have to refresh a materialized view manually, Vitess will do that for you automatically. We also have cluster management, so we have tooling that allows you to manage your cluster. We have online schema changes, which are non-blocking: you can run a big ALTER TABLE and you won't have any downtime, or maybe a few seconds, or maybe a minute. We have seamless backup and recovery operations, and query consolidation, which lives in VTTablet: whenever the same query arrives multiple times concurrently, we execute it only once, get the result, and return it to all the requests. And finally, we have automatic failure detection and repair, thanks to VTOrc. I don't know if you attended the Activision talk yesterday, but they talked about trying to destroy the cluster and the cluster always being repaired. That's thanks to this.
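To make the "seamless replacement" point concrete, here is a minimal sketch of an application talking to a Vitess cluster through VTGate, in Go. The specifics are assumptions, not something from this talk: port 15306 is the MySQL listener VTGate uses in the Vitess local-example tutorials, and commerce/customer are the keyspace and table names from those same tutorials.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql" // stock MySQL driver; VTGate speaks the MySQL protocol
)

func main() {
	// Connect to VTGate exactly as you would to a plain MySQL server.
	// "commerce" is a hypothetical keyspace name; 15306 is the port used
	// in the Vitess getting-started examples.
	db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:15306)/commerce")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// VTGate parses this query, decides which shard(s) own the row,
	// and routes it to the right VTTablet(s).
	var email string
	err = db.QueryRow("SELECT email FROM customer WHERE customer_id = ?", 1).Scan(&email)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(email)
}
```

The point is that nothing here is Vitess-specific: it's plain SQL over the standard driver, and VTGate does the parsing and routing behind the scenes.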
Now I'm going to hand it over to Arthur to talk about Vitess at GitHub.

Okay. Hi. Before I start talking about Vitess at GitHub, I think we should first look at how MySQL is run at GitHub. At GitHub, we have a fairly standard MySQL setup. We have around 80 clusters running across about 1,200 MySQL instances. Those are all bare metal hosts, plus I think around 300 additional hosts running on Azure VMs. Our MySQL clusters are usually grouped into feature-based clusters, where, say, actions and checks are stored in one cluster and issues and pull requests in another, or shared clusters, where data from different features is stored together, and sometimes we do joins between tables of different features and things like that.

At peak, we have around five million queries per second going to the replicas and around 500,000 queries per second going to the primaries of those clusters. So we basically have a very read-heavy load on our database clusters. It's not true for all of them; some are more write-heavy, some are more read-heavy, but overall it's quite read-heavy. In total we store 330 terabytes of data across the primaries, and that data is then obviously replicated to all the replicas that we have.

Our scaling strategy for MySQL so far consisted of a few different things we could do. When we created new features, we would set up a completely separate cluster, to make sure the new feature doesn't run into issues with existing clusters where other features live, and there are no noisy-neighbor problems or anything like that. We also spent a lot of time breaking up existing clusters. I just mentioned that issues and pull requests live in their own cluster, but about four years ago I worked on a project where, before that, user data, repository data, issue data, pull request data, and a whole bunch of other data were all stored in one cluster, and that cluster just couldn't sustain the load anymore. So we basically went in and did brain surgery: we took some tables out of that cluster into a separate cluster, to spread the load across different database clusters.

Another strategy we employed quite often was adding more replicas. We try to send as much read load as we can to replica instances, and if you have a read-heavy cluster, this is perfect: you just add more replicas, and the existing replicas become healthier because they need to serve less load. But if you have problems with the primaries, often the only way to scale is to switch out the machines, and that's often really hard, or wasteful: you have one machine size, you reach the limits of that machine, and then you need to upgrade every machine in the cluster to a bigger machine size. Suddenly you have twice the amount of RAM you had before, but you only needed 10% more. So there are problems there, but that was one strategy we employed.

But then eventually we ran into problems with those scaling approaches. As I mentioned before, better hardware is more expensive, and if you have to go to really, really big hardware, it gets really, really expensive. We also ran into really unbearable schema migration times. We have a tool called gh-ost, which is an online schema migration tool for MySQL.
With it, we can add indexes and columns without any downtime. It basically works like this: we create a ghost table, we write to both tables, and we copy the existing data over to the ghost table. So it's all seamless and zero-downtime, but it takes a long time. We have one cluster where just adding a new column to the biggest table takes something like two months. And for a development team, that's not great, right? They build their feature, and then they come to you and say, yeah, we need to run this tiny migration, we need this tiny int column here, and you're like, yeah, cool, wait two months for that to finish.

We also ran into a problem I'm calling buffer pool thrashing. MySQL has a cache that sits in front of the disk. Every time a query comes in and requests some data, MySQL first looks into the cache: is the data there? If yes, it takes it from the cache, which is fast because it's in memory. If it's not there, it has to go to disk and load it, so the query becomes slower. If the frequently used working set of the cluster becomes bigger than the memory you have, you start running into really weird effects: a query comes in, you don't have the data, you read it from disk into the cache. The next query comes in, you don't have that data either, so you throw out whatever you just read to make room for something new. You can never actually reuse the data you're caching, the cache becomes useless, and you start reading enormous amounts of data from disk over and over again.

And then we also ran into issues with replication lag. All the changes that happen on the primary need to be streamed out to the replicas, and they need to reapply those changes. If you have a huge amount of changes happening on the primary, those same changes happen on the replicas, and the replicas become busy just applying those changes; then they can't actually serve read load anymore. In some cases, if the replication traffic becomes so big that your replicas cannot even keep up applying those changes, they fall behind and you start serving stale data. And you cannot scale this by adding more replicas, because those replicas will also be busy with the same data coming in through the replication stream.

So we looked at different solutions to solve these issues, or at least help us work around them, and eventually we picked Vitess. The biggest reason for us is that Vitess is essentially still MySQL. We have a huge MySQL setup, a lot of automation around MySQL, a lot of tooling that we built for MySQL. All our applications, or most of them, are built with MySQL, and engineers know how to write code that uses MySQL. For us, it doesn't really make a lot of sense to switch to something else where we don't know how to operate it, we don't know what our problems will be, and engineers need to learn from scratch how to build things on a different system. So having a solution for these problems that is still MySQL is great for us.

Another point is that the sharding model Vitess has really fits our data model very well. If you think about the data on GitHub, most of it is scoped by repository. So for us, sharding by repository ID is the easiest way to scale the data out and split it into good chunks.
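As a sketch of why that sharding key works (the issues table, its columns, and the keyspace name below are hypothetical, not GitHub's actual schema): when the sharding key appears in the WHERE clause, VTGate can route the query to exactly one shard; without it, the query has to be scattered to every shard.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

// queryIssues sketches how a repository_id sharding key affects routing.
func queryIssues(db *sql.DB, repoID, issueID int64) error {
	// Single-shard query: because repository_id is the sharding key,
	// VTGate can route this directly to one shard.
	rows, err := db.Query(
		"SELECT id, title FROM issues WHERE repository_id = ? AND state = 'open'",
		repoID)
	if err != nil {
		return err
	}
	rows.Close()

	// Scatter query: without the sharding key, VTGate has to fan this out
	// to every shard and merge the results at the VTGate layer (unless a
	// lookup vindex exists; that comes up in the Q&A later).
	rows, err = db.Query("SELECT id, title FROM issues WHERE id = ?", issueID)
	if err != nil {
		return err
	}
	return rows.Close()
}

func main() {
	db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:15306)/issues_keyspace")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	if err := queryIssues(db, 42, 7); err != nil {
		log.Fatal(err)
	}
}
```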
And the third point is that the query compatibility between Vitess and MySQL is acceptable. I'm not going to say it's perfect, because we ran into a lot of issues, but over time we worked through all of them, with the help of maintainers like Florent and also on our own, eventually providing fixes for the Vitess query planner ourselves. That's how I eventually became a maintainer of Vitess as well.

So let's take a look at the timeline. In 2019, and I think actually in 2018, we ran the first experiments using Vitess. We started moving things over in 2020 with all the notification data that we have: all the emails and notifications you get are actually stored in, and queried from, a Vitess cluster. That was a very basic cluster, with only two tables in it, but a lot of data. In early 2022, we shipped a change that moved all actions, checks, and statuses data over to Vitess. And in early 2023, just a few months ago, we finished the migration of issues and pull request data to a Vitess cluster.

So today we run 20 keyspaces on Vitess. And you might remember those 330 terabytes of data I mentioned before running in MySQL; that included everything that is now running in Vitess as well. We have around 150 terabytes of data on the primaries in Vitess, so almost 50% of our data is behind Vitess. And we see around 750,000 queries per second across primaries and replicas in Vitess at peak. So basically, a big chunk of our largest features runs on Vitess nowadays.

Let's take a look at the issues and pull requests cluster, which is the latest one we moved to Vitess. It runs on 16 shards. We don't have a crazy, massively sharded setup like Activision had; we usually have a fairly low number of shards. So we have 16 primaries and 48 replicas across those 16 shards, 64 machines in total. We store around 26 terabytes of data in this cluster, and at peak we have 30,000 queries per second on the primaries and 220,000 queries per second on the replicas. So this is an example of a very read-heavy cluster.

And now I have this huge table here, which shows the effects of moving from MySQL to Vitess. The old cluster that served this data had one primary and 100 replicas. And you can see these were pretty beefy machines: each machine had 768 gigabytes of memory, and they all run on NVMe, or at least they all use SSDs, to provide very fast data access. We were able to move this to only 64 hosts. So we reduced the number of hosts we need to serve this traffic while also reducing the memory per host; we are now using cheaper machines to serve the same data with fewer machines than before.

You can see this in the disk read rate, and this goes back to the buffer pool thrashing problem: instead of reading 11 gigabytes per second from disk across the whole cluster, we're now only reading 800 megabytes per second across all these machines. We also see an improvement in the read rate on the primaries, not just on the replicas. I forgot to include the write rate, but it didn't drop either. Now, instead of writing all the data on only one primary,
we were able to split the writes across all 16 primaries that we run now, so each primary is less busy writing data. And that also leads to less replication lag, because each replica only needs to read and apply one sixteenth of the changes that happen across the whole cluster.

And I have this online schema change duration improvement down there. The largest table we have on this cluster is called issue events. Basically, if you go to an issue page, you see when someone adds or removes a label; you see those kinds of events there. I checked the numbers: the last migration we ran before we moved to Vitess took three weeks to execute, and I think we even had cases, depending on how much traffic we saw, where it took longer than that. And now it takes two days. It's a huge improvement for developers when they want to make changes on this cluster.
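For a rough idea of what running such a migration through Vitess looks like (a sketch, not GitHub's actual tooling; example_flag is a made-up column for illustration): you select an online DDL strategy for the session and then issue a normal ALTER TABLE, which Vitess runs asynchronously as a tracked migration.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:15306)/issues_keyspace")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Ask Vitess to run DDL through its online schema change machinery
	// (the built-in "vitess" strategy) instead of executing the ALTER
	// directly against MySQL.
	if _, err := db.Exec("SET @@ddl_strategy = 'vitess'"); err != nil {
		log.Fatal(err)
	}

	// The ALTER returns immediately with a migration UUID; the table copy
	// runs in the background while the table stays available.
	rows, err := db.Query(
		"ALTER TABLE issue_events ADD COLUMN example_flag TINYINT NOT NULL DEFAULT 0")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	var uuid string
	for rows.Next() {
		if err := rows.Scan(&uuid); err != nil {
			log.Fatal(err)
		}
	}
	fmt.Println("migration started:", uuid)
	// Progress can then be watched with: SHOW VITESS_MIGRATIONS;
}
```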
Okay. So our current architecture for Vitess is basically built on top of our MySQL setup, as I mentioned before. We don't do what I would call a best-practice Vitess setup, where everything runs in Kubernetes and is automatically scaled and you have tiny shards and all that. We run VTTablets alongside MySQL on bare metal hosts; the only things we run in Kubernetes are the VTGates, vtctld, and the VTAdmin part of Vitess. So basically, you can have a hybrid setup. We're also currently running an older version of Vitess. It's not super old, but I think the latest one is v16, and v17 is getting released soon, so we're a bit behind. In the future, I'd actually like to move us to a model where we upgrade to newer Vitess versions more often, to get the latest fixes and performance improvements. And we're also looking into potentially moving more clusters to Vitess, but it really depends on the load on the cluster. At least right now, we're shooting for a hybrid solution where we move things to Vitess where it makes sense and leave things on MySQL where it doesn't. Because for us, without a full Kubernetes setup, there is an overhead to managing Vitess, and we don't want to pay that overhead where we don't really need to.

Okay. So in conclusion, Vitess has enabled GitHub to scale MySQL much further than we were able to scale it before. And if the problems you heard about sound familiar, I think you should definitely check out Vitess as a potential solution. If you are curious how we did this setup without downtime and how we ran these migrations, feel free to come to me after the talk and we can talk about that as well. Okay. So I'm giving it back to Florent, who's going to talk about new and upcoming features in Vitess.

All right. So let's start with the new features. These are the most important features that have landed since KubeCon Detroit, so that includes the ones in v15 and v16. Like Arthur said, the latest version of Vitess is v16, released a couple of months ago, I think. We had two big new components marked as GA in v15: VTOrc and VTAdmin. VTOrc is the component that repairs your cluster when it's failing, and VTAdmin, like I mentioned before, is the new front-end UI to visualize and manage your cluster.

We also reworked all of our CLI flags. We changed the infrastructure behind all of the flags that we had, so they are more consistent, and now we can iterate on our CLI flags and build new tools, new documentation, et cetera. We also added incremental backup and point-in-time recovery in v15. We also reworked the entire documentation. We have a big documentation project, and we started in v16 by going through all the pages we had and making sure they were accurate, up to date, and saying the right thing. The next step for the documentation is to restructure the entire website; we wanted to make sure the content was right before we changed the entire structure. And we also added sharded views support: SQL views weren't supported in sharded keyspaces, but now they are. So that's pretty cool.

Those were the new features, and now I'm going to talk about the upcoming ones. This is for v17, which is going to be released in June, and beyond. Foreign key support: we don't have support for foreign keys today, and we want to support them. We have a big project for this; it's not going to land in v17, but it will beyond v17. We have schema tracking improvements in the queue as well. Schema tracking is a feature that is very important to us, because it tracks the schema across all of the shards and makes VTGate aware of your SQL schema, which lets us support a lot more queries than if schema tracking were off. So anyway, we're going to do some improvements to this. We're also going to improve MySQL compatibility: we have a list of all the queries that MySQL supports and we do not, and the goal is to keep working down that list and support more queries. And we're also going to enhance the arewefastyet UI. arewefastyet is our benchmarking tool for Vitess, which runs nightly benchmarks so we can see if someone degraded the performance of our code base, and we want to improve both the UI and the tool itself.

All right. Here are some resources for you if you want to learn more about Vitess or get started. There are tutorials that you can run on Minikube locally, or just outside of Minikube on your local machine. You have our Slack if you want to ask any question or just talk to us. We're right here. And thank you so much. If you have any questions, don't hesitate.

You mentioned earlier that the compatibility with MySQL is not perfect. Do you have an example of a case where you had to modify your query so that it's compatible with Vitess?

Right. I think... I have a few. I think GET_LOCK, named locks, is not supported, but go ahead. Sure. Yeah, so one thing that I worked on was subquery compatibility. If you have a query with a subquery and the two are correlated such that they actually only load data from one shard, that's a good case, a case that Vitess in theory should support very well, but it wasn't supported. So, for example, I worked on fixing that. We actually had a lot of subquery things we needed to fix to make it work. The way we migrated to Vitess is we basically took our CI setup, switched out MySQL for Vitess, and ran it.
And, as Activision also ran into, everything failed: it would time out and not finish. So we sat down, got a list of things that were broken, tried to group them together, and then went through them one by one and figured out whether it was something we could fix on our side with tiny changes to the queries, like including sharding keys, or whether it was something Vitess should support out of the box but didn't, and we had to go and fix it.

I think mostly everything... The hard queries are the cross-shard ones: you have to send the query to one shard and another shard and then aggregate the results at the VTGate level. That's the most complex kind of query. That's why we've added schema tracking, to be able to know which columns we have in which table, and so on. No, we're working on it. We have a file that lists all the queries that we don't support; if you want, I can send you the file.

Yeah, so I want to ask about the online schema changes. I know that there are different strategies to run them, like direct, the Percona tool, gh-ost, Vitess. So the first question is, how does the Vitess way compare to gh-ost? And then the second question would be whether there are any performance differences.

Okay. I've never run online DDL myself, so I can't answer, but maybe you can. I think the Vitess way nowadays is actually pretty similar to gh-ost. There are some differences, but it's built by the same author: Shlomi Noach built gh-ost, and now he's working on the Vitess online schema changes, so they're fairly similar. I think the main difference is that the time the tables get locked during cut-over is shorter in gh-ost than in the Vitess solution, but overall it's a very similar approach. Yes, in Vitess, I think the way it works is that queries get paused at the VTGate level until the cut-over is done, and then queries are let through again to the new table, whereas in gh-ost it happens at the MySQL level. Thank you.

Hi. I have two questions. First, let's assume you start a new project: do you recommend starting with Vitess, or migrating at a later stage? And the second question: you mentioned you had overhead managing Vitess. Can you elaborate a bit on that?

Yes. So for the first question: if you think you'll get to the size that I've shown, then yes, you probably should start out with Vitess, because it's easier. In our case, we had a lot of queries written against plain MySQL. They worked fine on MySQL, but then you need to think about: how do I shard my data? How will cross-shard queries work? What will the performance be like? If you start out with Vitess, you have to think about that first and then design your system, and maybe some features you won't even build in the first place, because they don't make sense in your architecture. For us, it was the other way around: we had to figure out how to even support existing features in the new world.

And then the second question was the overhead. Let's say there is management overhead from a database operations perspective. If you're using Kubernetes, that overhead almost goes away, but if you do everything manually, then you need additional tooling to manage the Vitess part of things. To start with, you need to make sure you have monitoring for all these components. And then there's also overhead from a design perspective. One example I can give: if you go and use our REST API, we give you IDs back, for example repository IDs,
and you can later look up that repository by its ID. The same goes for issues: you can look up an issue by its ID. But we don't shard by issue ID. So if a request comes in that turns into a SQL query trying to find an issue just by its ID, by default that would be a scatter-gather type of query, where Vitess sends it to all shards. If you have hundreds of shards, you send that query to all hundreds of shards. So we use something called a lookup vindex: basically, Vitess keeps a mapping from the issue ID back to the shard the issue lives on. You can think of it as a mapping between issue ID and repository ID; that's essentially the logical way to think about it. And that mapping needs to be stored in Vitess as well, so you store it in MySQL too, and it takes disk space. And if you then try to load an issue just by its ID, you first do this vindex lookup and then a second query to actually fetch the data. So there's some performance overhead as well. It's not huge, it's fine; I don't think any one of you noticed that we changed everything out underneath. But it's something that you need to be aware of. Thank you.
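As a sketch of the two read paths just described (hypothetical table and column names again; Vitess maintains the lookup table itself, this only shows the query side):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:15306)/issues_keyspace")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var title string

	// Fast path: the sharding key (repository_id) is in the WHERE clause,
	// so VTGate routes this directly to a single shard.
	if err := db.QueryRow(
		"SELECT title FROM issues WHERE repository_id = ? AND id = ?", 42, 7,
	).Scan(&title); err != nil && err != sql.ErrNoRows {
		log.Fatal(err)
	}

	// Lookup-vindex path: without the sharding key, but with a lookup
	// vindex defined on issues.id, VTGate transparently queries the
	// lookup table first to find the owning shard, then sends one
	// targeted query -- instead of scattering to every shard.
	if err := db.QueryRow(
		"SELECT title FROM issues WHERE id = ?", 7,
	).Scan(&title); err != nil && err != sql.ErrNoRows {
		log.Fatal(err)
	}
	fmt.Println(title)
}
```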
I have a question about how to use Vitess: is it purely an OLTP, transactional database, or are there analytical use cases as well? Right now it seems GitHub is using it more as a transactional database.

Yeah. So we have OLTP. We used to have support for OLAP, and I think we have plans to remove it. I'm not entirely sure what the plan is for OLAP, but we had support for it in the past, that's for sure. I'm honestly not sure what the future of OLAP and the streaming mode of Vitess is. I can double-check right after. We're in the process of merging the sequential execution path and the streaming path for executing queries together, but what OLAP will become, I'm not exactly sure. For now you can use it, but in the future I'm not entirely sure. Okay, so I guess we should probably use it more for transactional purposes. Yeah.

In your architecture diagram, you showed a load balancer. How does it handle adding replicas and failing replicas? Are they automatically added to the load balancer? Or do you have a unified access point that load-balances across all shards and all replicas?

So the load balancer in that diagram is not really part of Vitess; that's something you have to add on your own, basically. You have your layer of VTGates, you might have 100 or 1,000, and you need some kind of load balancer to distribute all your queries across all your VTGates. I can also talk about how it works in our setup. We have a TCP load balancer at the front, where MySQL connections come in via a standardized DNS name, and the load balancer just distributes them across VTGates. Then inside the VTGate layer, every query that comes in is automatically distributed: say you have four replicas and you run four queries, each query might land on a different replica. So queries are automatically distributed at the VTGate layer, but you also need to distribute the connections coming in from your application to the VTGates somehow.

Another question: how many VTGates do you need? Do you need one VTGate per primary, or how does it work?

No. You have at least one VTGate per cluster, and behind your VTGate you can have as many keyspaces as you want. And honestly, you can have as many VTGates as you want; I think it depends on the load on your VTGates whether you want to scale up or down. You can have multiple applications on the same VTGate, even though it might not be best practice. Yeah, and connection pooling happens at the VTTablet layer, along with, what did we call it, consolidation. And there's also connection pooling happening at the VTGate layer. So if you have only one VTGate and, say, 100,000 connections come in, it will probably fall over and die. But you can scale the different parts of Vitess independently: if you have many connections but don't actually run many queries, you just scale the VTGate layer and you're fine. If you have many queries but not such a high number of connections, then you scale the VTTablet layer. Okay, thank you.

Yeah, so I noticed that GitHub runs the control plane, the VTGates, on Kubernetes, but the data plane on bare metal servers. So my question is: can Vitess run both the control plane and the data plane in containers on Kubernetes? And does GitHub want to move everything to Kubernetes? Is that the question? Yes.

Yeah, so we were actually talking about this over the last two lunches: you should move all your VTTablets and MySQL instances to Kubernetes. That's the preferred way to run Vitess, with everything in Kubernetes, both the data and the VTGate layer. I don't know; we don't have any plans, but it would be nice and interesting to do, because it allows you to be more flexible in scaling. It would also allow us not to have to provision huge boxes for some of the clusters that are not as big. But we don't have any plans for that. Yeah, okay. Thank you. Yeah, cool. Thank you.