Welcome everyone to this breakout session. We are live in Detroit at KubeCon NA, and welcome also to our virtual audience. Today we will be talking about Vitess. My name is Deepthi Sigireddy; I'm the tech lead and a maintainer of Vitess. I'm Rohit Nayak; I'm also a maintainer of Vitess. Matt was supposed to join us, but he wasn't able to travel at the last minute. We'll start with an overview of Vitess. Vitess is a CNCF graduated project. It became a CNCF project in January of 2018 at the incubating level, and in November of 2019 Vitess became the eighth project to graduate from the CNCF. Vitess is a cloud native database. What does it mean to be cloud native? It means that for a very long time now, Vitess has been able to run in Kubernetes. Vitess is massively scalable; it was originally created at YouTube to solve YouTube's scaling problems with MySQL. Vitess is also highly available. The way Vitess achieves scalability is partly through sharding, and that will be a focus of our talk today. I already mentioned MySQL, but I want to reiterate that Vitess is built around MySQL and is compatible with both 5.7 and 8.0. Vitess works with database frameworks, ORMs, and any MySQL clients you already have, including legacy code that uses MySQL. Pretty much anything that works with MySQL will work with Vitess. What Vitess does is give a logical database view to clients. Behind the scenes there may be many, many physical databases. There are people running thousands of MySQL instances behind Vitess, and Vitess can provide this logical view across all of them. It takes care of routing queries to the appropriate backend databases. Vitess supports both gRPC and MySQL clients, and it gives a single connection view to clients. This is the Vitess architecture. Queries come into Vitess and first hit the VTGate component, which is a proxy layer. Based on the query, VTGate figures out where to route it. Behind the scenes, especially if you're running Vitess in a sharded environment, there are multiple shards, and each shard consists of a primary and one or more replicas. People run anywhere from two to ten, and in the most extreme case, 85 replicas. The primary and replica instances of MySQL form a MySQL replication group, and each of those MySQL instances is managed by a Vitess component called VTTablet. VTTablet is the one that receives queries from VTGate and sends them down to MySQL, and by doing this, one of the things we're able to do is make connections to Vitess much lighter weight than MySQL connections. You can have hundreds of thousands of connections to VTGates, which in the backend translate into a much smaller number of MySQL connections through connection pooling. We do the connection pooling at the VTTablet level so that the same MySQL connections can be shared across clients. So that's the query serving path: what happens when a query comes to Vitess. There is also the control plane, which includes a topology server, which can be etcd or Consul or ZooKeeper, and a control daemon, the Vitess control daemon (vtctld), which accepts administrative commands through a command-line interface and sends them to the appropriate components. There's VTOrc, which we spoke about yesterday; we did a talk on VTOrc yesterday. VTOrc is the automatic failure detection and recovery mechanism in Vitess: if the primary MySQL for any given shard or keyspace goes down, VTOrc is the one that detects that and fails over to a replica with minimal downtime.
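Before moving on, here is a minimal sketch of what that MySQL compatibility looks like from a client's perspective: a stock MySQL CLI pointed at VTGate. The host and port are assumptions based on a common local example setup, not anything specific to this talk.

```sh
# Connect a standard MySQL client to VTGate (host/port are assumptions;
# VTGate's MySQL listener is configured with its --mysql_server_port flag).
mysql -h 127.0.0.1 -P 15306 -u user

# The sharded keyspace then looks like one logical database:
#   USE commerce;
#   SELECT * FROM corder WHERE customer_id = 42;
# VTGate routes each query to the right shard(s) behind the scenes.
```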
VTAdmin is our new administrative API component, and there is a web UI as well. So that's an overview of the architecture of Vitess. Vitess is in production at many places; this is just a sample subset of all the companies running Vitess in production. Many of them run on Kubernetes; some still run on VMs and other non-Kubernetes environments. Some of the key adopters of Vitess are Slack, which is 100% on Vitess; GitHub, which has started its migration to Vitess but is still in the process of completing it, so right now they have a mix of Vitess-managed and non-Vitess-managed databases; JD.com, a Chinese retailer, which is running 10,000-plus databases behind Vitess; and PlanetScale, which has a database service that runs Vitess behind the scenes and now has tens of thousands of Vitess clusters running. Here are some testimonials from current Vitess users, Etsy and Nozzle; I'll give you a minute to read those. We have a vibrant community. For the calendar years 2020 to 2021, I gathered some statistics from DevStats, which is hosted by CNCF. We had 21 maintainers and over 400 contributors, of whom 214 were PR authors; we count non-code contributions as well: website, docs, issues, comments on issues, reviews, all of those things. Our contributors came from 63 distinct companies, and PR authors came from 30 distinct companies. Now I'll hand off to Rohit to talk about VReplication. Vitess has always been shard aware, so all the components have a built-in understanding of the sharding architecture of Vitess: for example, query serving, cluster management, backups, et cetera. We have had tools to support sharding right from the beginning. But around four years ago, the Vitess co-creator and PlanetScale co-founder, Sugu Sougoumarane, had a sort of a-ha moment where he came up with the VReplication algorithm, which works not just for sharding but for several other day-zero and day-two tasks that are required in at-scale systems. VReplication is essentially a framework for creating and managing data pipelines. These are shard-aware pipelines that perform particular actions. At an abstract level, it matches data against criteria that you define and then executes a defined workflow. When we look at some of these workflows, you'll get a better understanding of this definition. There are short-lived workflows, which perform a particular task and then are done, such as importing data into Vitess, sharding, and doing online schema changes, for when you're running DDLs on large databases and don't want to bring your database down. And there are long-lived workflows, which are used for continuous materialization and for change data capture. We'll take a slightly deeper look into some of these. When you migrate into Vitess, let's say you want to just try it out, or you've decided you want to move your MySQL cluster into production on Vitess. You would first, of course, create the Vitess cluster. In this case we have created a sharded cluster, but you could start with unsharded as well. That's the target. The source is the existing database, which is in your data center; it could be RDS, Aurora, MariaDB, or vanilla MySQL. You would start an unmanaged tablet: in a typical Vitess cluster, the VTTablet also manages the MySQL instance, but in this case the database is external, so it's called an unmanaged tablet.
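As a rough sketch of that setup, an unmanaged VTTablet is pointed at the external MySQL rather than at a locally managed one. The flag names follow the v14/v15-era docs and can drift between releases; the host and credentials are placeholders.

```sh
# Start a VTTablet that fronts an external MySQL instead of managing
# its own mysqld. Keyspace, host, and credentials are illustrative.
vttablet \
  --init_keyspace legacy \
  --init_shard 0 \
  --init_tablet_type replica \
  --db_host external-mysql.example.com \
  --db_port 3306 \
  --db_app_user app_user \
  --db_app_password app_password \
  --disable_active_reparents   # Vitess must not reparent what it doesn't manage
```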
Then we start the MoveTables workflow. In this case, all tables are moved from the source to the target. The way the algorithm works is that initially we do a bulk copy, bulk inserts into the target from the source, and once we are close, we start streaming the binlogs. It's a little more complicated than that, but at a high level this is the algorithm. While it runs, this is an eventually consistent system, and we're able to move shards of hundreds of terabytes in days, or up to a week. Now, once the workflow is running, realize that you're still online. In all these workflows you're still online, of course, serving your data from the source; the target is not yet serving. When the replication lag between the source and the target is low, let's say under a second or so, you can start switching traffic over into Vitess. You have the choice of switching all traffic at once, or switching just reads into Vitess while still serving the writes from your source and switching writes later. When the writes are switched, by default we also start a reverse replication from the target back into the source, so that at any time you can roll back if you like, for whatever reason. The same MoveTables workflow can also be used to move some tables into a different database. Here, for example, we are moving the product-related tables into products and the order-related tables into orders. This works well if you have services that own subsets of tables that are fairly independent. You will, of course, have common tables; there's another workflow to support that. Note that Vitess will treat all of this as one logical database: even though internally there is a separate MySQL database hosting some of the tables, using VTGate's MySQL adapter you will still address the tables in the same way, so the application doesn't need to change. Now, say you just have a very large table, a transactions table or a messages table, and your current hardware cannot manage it, either due to storage restrictions or because the read or write QPS cannot be increased even though you have thrown hardware at the problem. In that case, you can do horizontal sharding, where the table gets distributed across the shards. In our example we already had a sharded cluster, but here we are showing that we can increase the number of shards. Here we're just showing two to four, but it could go from two to fifty if you want. The Reshard workflow supports this. In all cases of sharding, you can go up or go down; the same workflow works for both. For example, in the holiday season you might want to shard higher so that you can serve the traffic better, and at the end of it go back down for cost reasons.
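A sketch of what driving a two-to-four split looks like. The vtctlclient syntax follows the v14/v15-era tutorials and changes between releases; the keyspace and workflow names are examples.

```sh
# Split two shards into four. The shard names use the conventional
# keyspace-ID range notation (-80 is the bottom half, 80- the top).
vtctlclient Reshard -- --source_shards '-80,80-' \
  --target_shards '-40,40-80,80-c0,c0-' \
  Create commerce.split_2_to_4

# After the copy catches up, cut traffic over to the new shards:
vtctlclient Reshard -- SwitchTraffic commerce.split_2_to_4

# Scaling back down after the holidays is the same workflow with the
# source and target shard lists swapped.
```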
Now, the previous workflows we discussed were short-lived workflows: once that particular action is done, the sharding action or the import action, the workflow is complete. The next two we'll talk about are long-running workflows. You have probably heard of materialized views. Vitess supports materialized views at scale, with sharded tables backed by sharded tables. You can create a Materialize workflow with your own query, for example to denormalize tables or to create aggregations for your analytics, data warehouse, reporting, et cetera. Anonymization of data is a very important application that several users rely on: when you want to move your data into your warehouse, or into staging or dev, you might want to redact certain information. And for sharded systems, there are always common tables that are required across shards, like lookup tables: country codes, product catalogs, language tables, for example. So it's possible to have a single table that holds the master information, and all of that can be materialized into each shard so that all the queries are local and not cross-shard. Vitess does support cross-shard queries; that's not a problem, it's just a matter of performance. VTGate knows where the data is located and will make the appropriate cross-shard queries if required, but using local copies like this reduces the need to scatter queries. The last VReplication application I'll discuss is not a workflow per se, but an API that is available at VTGate: a gRPC API (VStream) that gives you an event stream of changes. It can give you snapshots, or changes, or both: you can ask it to stream the entire database and then keep streaming the changes. This is extremely useful, and many users rely on it for moving data into a data warehouse, for real-time notifications, for updating your Elasticsearch, things like that. You can stream the complete database or a subset of tables, or you can filter certain tables for any special use. There is a Debezium adapter, which is open source and available; there are also users who have built adapters for Stitch and Airbyte, not yet open sourced. I just wanted to give an example of what's possible with this. I'll now move to a demo of a few of these workflows. First, I'm going to show how vertical sharding is done. Here we are starting with a single keyspace which has these three tables. We're going to move the order-related tables, customer and order, into a different keyspace, and we're going to use MoveTables. I created the cluster earlier because it takes a few seconds. At this point, I'll show you: this MySQL CLI client is connected to VTGate. As you can see, it is currently reporting 5.7, and essentially it acts like a MySQL server. This is the initial keyspace; the target is currently empty. Now I'm going to run this command. It's a fairly simple command which essentially just says which keyspace you want and which tables you want to copy over to another keyspace. You give the workflow a name so that you can refer to it later. In this case, we have not only started the workflow, but we have also checked that it's complete and that the lag is low, and we switched both reads and writes at the same time. Now, if I go back to the client, I see that the order tables no longer exist here; they have moved to the customer keyspace. So what I did here is essentially move those tables to a different database. If you had a service that was serving the customer data separately, suddenly now you have your data spread across these different databases.
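The demo follows the same shape as the public Vitess MoveTables tutorial; a sketch of that flow, with v14/v15-era vtctlclient syntax and keyspace and workflow names taken from the tutorial rather than the demo itself:

```sh
# Move the customer/order tables out of commerce into a new customer keyspace.
vtctlclient MoveTables -- --source commerce --tables 'customer,corder' \
  Create customer.commerce2customer

# Once the copy is complete and lag is low, switch reads and writes
# together, as in the demo (they can also be switched separately):
vtctlclient MoveTables -- SwitchTraffic customer.commerce2customer

# Reverse replication back into commerce now runs automatically, so
# ReverseTraffic can roll the cutover back if needed.
```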
Next, the Reshard demo. From the current keyspace, which we already moved the previous tables into, these are the number of rows that are present. This will start setting up the shards: we're going to set up two shards. Shard /0 is basically unsharded, so there's a single database. We're going to create these two shards; essentially we're splitting the keyspace, the space of all keyspace IDs, into two, the lower half and the upper half. I'm showing them in gray because they're not yet serving traffic; the traffic is only being served by /0. Then we're going to run the Reshard command. We're still setting up. This is a quick look at the VTAdmin UI, which is GA in the latest version. You can see that only one shard, the zero shard, is serving right now; the others are coming up. As you can see, only one replica has come up; it will take a little bit of time. By default, we recommend one primary and two replicas, and the VTOrc component, which we talked about yesterday, will automatically select one of them as primary. This is why I started the previous clusters a bit earlier: each VTTablet starts both itself and the MySQL instance connected to it. Okay, now let's look at VTAdmin and check. We have both of the other shards created and primaries have been elected, so we're now going to run the Reshard command. Again, it's a simple command. Of course, we have very few rows here, but this reshard can run for hours or days. It is resumable in case there are network errors, et cetera; it will automatically resume. And even if a different primary gets elected because of failures, all of that is handled by the workflow. If we look at VTAdmin now, we see that the unsharded shard /0 is no longer serving and the other two are now serving. So VTGate will be connecting to these shards. It knows where each row is located by the sharding key. When you define a sharded table, you define a sharding key, which could be just the identity of your auto-increment column, or a hash of it if you want better distribution across the space, or a tenant ID, or a geo ID if you want to localize your data for legal reasons. Whenever queries are made, this key, which we call a Vindex within Vitess, is looked up by VTGate, and it figures out whether it can go to a single shard or has to do a cross-shard query. To recap what I've shown you: we did the reshard, but we had not yet done the cutover. Once the cutover is done, that is, once you run the SwitchTraffic command, the shards on the right start serving. The last thing I'll show is materialization. I'm showing two things here. One is redaction of data for PII, in this case the credit card. The other is denormalization: I'm going to move this into a fact table for your data warehouse, and I want, let's say, the month and the year. Just examples. What is most important here is the query that is being run. You can look at the query: you can use MySQL functions and do computations, et cetera, and you can have a WHERE clause; in this case, we have removed all rows for a particular SKU and a particular price. Just to give an idea of what is possible: we have, for example, one user doing PII redaction who is running about 600 workflows on different tables, moving the data into their data warehouse, and they're all running on a single VTTablet on fairly commodity hardware. That gives you an idea of the scale. Here you see that the credit card has been masked, and we have the total price computation and the month and the year. The last thing I will show here is aggregation, which is very useful for reporting purposes. The thing is, this is happening in real time: as changes happen, the views are updated in real time, and the usual lags are extremely small, so you can use it for reporting or read purposes without having to have a separate system like a data warehouse.
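A sketch of what a Materialize workflow like the redaction/denormalization demo could look like. The JSON spec shape follows the v14/v15-era Materialize docs; the keyspace, table, and column names are guesses for illustration, not the demo's actual schema.

```sh
# Continuously materialize a redacted, denormalized fact table. The
# source_expression is ordinary SQL: the credit card is hashed away,
# month/year are computed, and a WHERE clause filters rows.
vtctlclient Materialize '{
  "workflow": "corder_fact",
  "source_keyspace": "customer",
  "target_keyspace": "analytics",
  "table_settings": [{
    "target_table": "corder_fact",
    "create_ddl": "create table corder_fact(order_id bigint, sku varbinary(128), price bigint, credit_card varbinary(64), order_month int, order_year int, primary key(order_id))",
    "source_expression": "select order_id, sku, price, md5(credit_card) as credit_card, month(ordered_at) as order_month, year(ordered_at) as order_year from corder where price > 0"
  }]
}'
```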
This is a simple aggregate query that is going to run here. Again, what drives it is the SQL query; it's a very simple GROUP BY with COUNT(*), et cetera. So this is what I wanted to demo, a quick demo to give you a flavor of how it works and how easy it is to run. We're now going to look at the new features in Vitess 15, and Deepthi is going to talk about that. We did the 15.0 GA release this week, on Wednesday, and there are a couple of things which are GA in this release. The first is VTOrc, which gives us automatic failure detection and handling, and it does so in such a way that the desired durability policies are respected. The other thing that went GA in 15 is VTAdmin. This is the web UI that Rohit included in his demo, and it uses a structured gRPC API to the Vitess control daemon. It is a more intuitive, easier-to-use web UI than what we used to have. In addition to VTOrc and VTAdmin, we also added support for a whole bunch of compression and decompression algorithms during backup and restore. Previously only gzip was supported, and anyone who wanted to use something different, like zstd or LZ4, had to write custom hooks. That's no longer required. There are a couple of features that went GA in 14, which happened in June, but since that was after our previous update, I wanted to include them. There is a new query planner and optimizer which we are calling Gen4; the previous one was called V3. This query planner supports more MySQL features and can plan more complex queries. Schema tracking is also GA. What schema tracking does is this: when new columns are added to existing sharded tables, previously that metadata had to be provided separately to VTGate. That's no longer necessary; VTGate can discover those new columns by talking to the VTTablets, which are connected to MySQL. We have native view support in progress, so that will be available in a future release. Next up, VReplication. One thing that is new in 15 is shard-level migrations: it's possible to migrate a keyspace from one Vitess cluster to another one shard at a time. Previously, the migration had to be done for all of the shards at the same time. We also have better throttling for VReplication workflows, so that the databases serving production traffic are not negatively impacted by these sharding and materialization types of workflows. Still on VReplication, there is a tool called VDiff which can be used to verify that VReplication actually did its job correctly. For example, if you're doing a resharding, you want to make sure that the sum of the targets is equal to the source, or the sum of the sources is equal to the target, depending on how the resharding is structured. VDiff v2 is resumable: it will restart itself if the error is recoverable, and it provides progress and an ETA on the process. Native online DDL went GA in 14, so Vitess has built-in support for zero-downtime schema changes, and that is also built on VReplication. The other nice thing about the native online DDL support, the online schema change support in Vitess, is that these schema changes are revertible. We are also working on incremental backups and point-in-time recovery for a future release; we already have a point-in-time recovery workflow, but what is coming will support more use cases.
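A sketch of the native online DDL flow mentioned a moment ago, issued through any MySQL client connected to VTGate; the table name is an example.

```sh
mysql -h 127.0.0.1 -P 15306 -e "
  SET @@ddl_strategy = 'vitess';  -- run DDL as a managed, online migration
  ALTER TABLE corder ADD COLUMN note VARCHAR(255);
  SHOW VITESS_MIGRATIONS;         -- each migration gets a UUID and a status
"
# A completed migration can be rolled back by UUID:
#   REVERT VITESS_MIGRATION '<uuid>';
```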
Next up is Q&A, but before we do that, here are some resources for anyone who wants to learn more. We have our website, which includes docs and tutorials. There's also a link to the source code, which is at vitessio/vitess on GitHub. We have a Slack workspace; there's a link to it from our website as well. And we'd love to get feedback on what you thought of this talk, so that QR code will let you leave feedback for us. Now we are ready to take questions. Question: it's on top of MySQL, but you mentioned you have your own query planner and optimizer, so how does that work? Correct. We try to push down as much of the SQL computation to MySQL as possible, but for complex queries, for example, you may have a complex join for which you have to fetch data from one shard and then combine it with data from another shard, or you fetch one row from one shard and then filter across all of the shards based on the value you get back. Those types of complex queries require in-memory computation, and that is what VTGate is doing. So when a query comes in, VTGate figures out: do I need to just send this to one shard, do I need to scatter it, or do I need to send it to a subset of the shards? Do I need to break it up into multiple round trips, depending on the complexity of the query? That's the sort of planning and optimization happening at the VTGate level. Yes; I would say that test suite may not be comprehensive, but we do have a test suite where we compare the results between sending a query directly to MySQL versus going through VTGate. We in fact added support for a lot of functions in, I think, 13, but there are still a few MySQL functions we have not completed support for; we have an LFX intern who's working on that right now. Question: to carry on from that, doesn't the query processing at the Vitess layer also give you some insight into the quality of your sharding decisions, based on the history of queries? Okay, so we do produce a query log that tells you what plan was produced by VTGate, whether it was a select unique, meaning it went to one shard, versus a scatter. So we do produce that query log, and it's possible to process it outside of Vitess. Vitess doesn't do that as part of the suite of functionality we ship, but it's possible to do. Anyone have any questions about the different types of sharding keys we support? Question: sorry, just expanding on the optimization, I guess Vitess's own explain plan would show the VTGate query plan? Yes, yes. We extend the EXPLAIN syntax: we have EXPLAIN FORMAT=vitess and FORMAT=vtexplain, so there are a couple of them. Harshit can answer that in more detail. Yeah, I'm from the query serving team in Vitess. Yes, we do have our own explain plan for Vitess. What you do is, initially, you tell Vitess the sharding key for each table. Now you're running your queries, and you want to know what kind of plan is going to get executed by VTGate. So you can do EXPLAIN FORMAT=vitess with your query, and it will tell you what plan is going to be generated by VTGate. For a SELECT query, it will tell you whether it will go to a single shard or to multiple shards, or whether it's a select-IN kind of query. So you'll understand whether it's using the right sharding key or not.
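For example, with v14/v15-era syntax and an illustrative table and column:

```sh
# Ask VTGate for its plan without executing the query.
mysql -h 127.0.0.1 -P 15306 \
  -e "EXPLAIN FORMAT=vitess SELECT * FROM corder WHERE customer_id = 42;"
# With the sharding key in the WHERE clause this shows a single-shard
# route (e.g. EqualUnique); without it, a Scatter across all shards.
# EXPLAIN FORMAT=vtexplain additionally shows the per-shard queries sent down.
```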
Then you can add more sharding keys. Basically, there is the primary sharding key, and then you can have secondary indexes on the table, which are also Vindexes, and you can add more of them according to your queries. Suppose you have chosen your sharding key as, say, an ID, but in your WHERE clause you are not using that ID, you're using some other column. You can create a secondary index on that column so that the query doesn't go to all the shards; your data is distributed across multiple shards, and you don't want a query to become expensive. So you create a secondary index on it so that VTGate can plan your query better, and it goes only to the subset of shards where your data actually resides. On lower-level performance: as of now, that is only at the MySQL level; you can evaluate the queries that go down. But at the VTGate level, what we give you is a streamable API that will tell you, for each query, how much time it took, which shards it went to, which tables it used, and how much time it took at the MySQL level versus at the VTGate level. So that information is available at the VTGate level; it's not the full performance data you would see at the MySQL level, but you can query the MySQL-level performance directly. Let me add to that: with Vitess, you will use both sources to tune the performance of your system. You have the metrics and statistics coming out of VTGate, but you will also then see how each of those individual queries is performing at the MySQL level, so you will need to do both. It's possible that the queries VTGate is coming up with are inefficient, in which case the VTGate planning has to be improved, and it's possible that VTGate is coming up with the right queries but you don't have the right indexes on the underlying tables, which is also fixable. So it becomes a combination. Question: can you override the query plan? No, not really. But you can override it in the sense I described: if you have missing indexes at the VTGate level, you can add them, and then you'll start benefiting from them. If the planner itself is producing bad plans, then you just have to create an issue, and someone from the query serving team will get a better plan in place for that query. So while you can't override the query plan directly, like Harshit said, you can add secondary indexes, which can produce a more efficient query plan; not exactly the same thing. At the VTGate level, it currently doesn't support index hints, but yes. The other thing we support is that if you know a query has to go only to one particular shard, you can do shard targeting and things like that, so it doesn't go through the planner phase; VTGate just pushes it directly to MySQL. What we do at the VTGate level is try to push as much of the plan as possible down to MySQL, so that we do less in-memory work and let MySQL handle more. Even for aggregations: if we have to do a COUNT(*), we try to push the COUNT(*) down to MySQL on each shard and then just combine the counts again at the VTGate level. So we don't have to fetch the full data; we just need the counts from all the shards. Those are the kinds of things we try to push down to MySQL.
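Sketches of the secondary Vindex and shard-targeting answers above. The syntax is from the v14/v15-era docs and may differ between releases; keyspace, table, and column names are illustrative, and a lookup Vindex added this way assumes its backing table exists (the CreateLookupVindex workflow can also create and backfill it for you).

```sh
# 1. Add a secondary (lookup) Vindex on a non-sharding-key column so
#    that queries filtering on sku need not scatter to every shard.
mysql -h 127.0.0.1 -P 15306 -e 'ALTER VSCHEMA ON customer.corder ADD VINDEX corder_sku_vdx(sku) USING consistent_lookup WITH owner=corder, table=customer.corder_sku_idx, from=sku, to=keyspace_id;'

# 2. Shard targeting: pin the session to one shard, bypassing planning.
mysql -h 127.0.0.1 -P 15306 -e 'USE `customer:-80`; SELECT COUNT(*) FROM corder;'
```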
Question: do the MySQL monitoring tools also work on top of Vitess? Yes, yes they do. Vitess also comes with a sample Grafana dashboard, contributed by one of our contributors, that you can use to monitor the cluster. Okay, I'll repeat that question: does the sample Grafana dashboard include metrics at the query level? Yes, it does, because in VTGate we track the amount of time taken in VTGate as well as the amount of time taken at the MySQL level. But we do not export all of the performance schema metrics, so those are probably not available in the sample Grafana dashboard. Any other questions? All right, thank you everyone. Thanks, everyone, for coming. Thank you.