Hi, nice to meet you, and I'm very excited to present all the lessons, all the interesting stuff that we learned over the past few years as we developed different drivers, especially the Rust driver, and the different ways that we actually use Rust at ScyllaDB. Before I start, there's a very short poll; I think it just popped up for you. The question we are asking is: where are you in your NoSQL adoption? We would like to know you a little better and to gauge how the different databases and storage solutions fare in our audience. So it would be great if you filled out the poll, and now let's jump back into the presentation.

A few words about me. I'm Piotr Grabowski, a software team leader at ScyllaDB, and my team is responsible for the different drivers that you can use to connect to the ScyllaDB database, as well as different connectors, such as the Kafka connectors that you can use to write data into Kafka or read data from Kafka back into ScyllaDB. I've been at ScyllaDB for almost three years now.

A few words about ScyllaDB. Our database is really great for massive data-intensive applications that require high throughput as well as low latencies. This isn't talked about that much, but having low latencies makes sure that your clients have a better experience using your products, and you can use the database in more advanced ways, knowing that the queries will finish in a short time. We designed ScyllaDB with performance and a close-to-the-metal approach right from the beginning. For example, ScyllaDB is written in C++ using our own Seastar framework for writing asynchronous applications, so we don't have to wait for, for example, Java's garbage collection. When we compare the performance to Cassandra, which we are compatible with, we have five times higher throughput, and even better on some workloads, and we have much lower latencies. This is a great difference when you compare a 5-millisecond latency to a 100-millisecond latency, and it really opens the door to a new class of applications. Combining those aspects, the cost of using ScyllaDB is much lower compared to other solutions, so you can either save money or store more data for the same cost.

As I said, we are compatible with Apache Cassandra; this was the first premise of our database. But in the past years we have also started being compatible with DynamoDB, and we develop our own new features on top of both of those APIs. And there are very diverse ways you can use ScyllaDB: you can use ScyllaDB as a cloud offering on ScyllaDB Cloud, you can use our enterprise version of ScyllaDB, which is great for large databases, and of course you can use the open source version for free, and it also works great.

A few words on why I'm presenting in front of the Linux Foundation. ScyllaDB runs only on Linux, and we take advantage of various Linux-only APIs such as io_uring; previously we used the Linux AIO interfaces. This allows us to really perform high-throughput, low-latency I/O. A fun fact is that the co-founder and current CTO of ScyllaDB actually began the development of KVM, the Kernel-based Virtual Machine, in the Linux kernel. So we have a long history with Linux, and while developing ScyllaDB we also keep in touch with kernel developers; for example, when we were developing the io_uring backend for Seastar, we got in contact with the developers of that io_uring backend
and of the libraries you can use to talk with io_uring, and through this collaboration we made that library faster.

Many companies use ScyllaDB; these are just a few of them, some of the biggest that we have. As you can see, it's a very diverse group, from Disney+ Hotstar, providing streaming services, and Discord, providing a really great chat experience, to Epic Games, a game developer.

So let's jump into the main portion of the presentation and go to our office in Poland. Let's zoom in on Warsaw and go into the office I'm sitting in right now, which hosts the drivers team that is responsible for developing the Rust driver. Today, first of all, I'll talk about a few facts about drivers: what they are, what their purpose is, and how you can optimize their performance. Next, we'll focus on the ScyllaDB Rust driver and the different stories and optimization paths that we took. And we'll finish the presentation with a very interesting way that we use Rust in conjunction with our other languages: building bindings from other languages, like C++, to the core of the ScyllaDB Rust driver.

In this presentation, by drivers I mean the libraries that allow you to send queries to ScyllaDB, and to do it efficiently and easily. The primary protocol that we use in the drivers is CQL. This protocol is inherited from Cassandra, so you can use it to talk to both Cassandra and ScyllaDB. The protocol is based on TCP, and at ScyllaDB we support the fourth version of the protocol. The protocol is based on frames: there are different messages, for example a message that initiates the connection when you connect to the database, or the message that actually executes a query. And on a single connection you can have multiple simultaneous streams of requests, so you can send multiple requests and wait for many concurrent responses. The protocol also supports compression. In addition, ScyllaDB drivers, compared to the drivers for Apache Cassandra, support shard awareness. This is a way to connect directly to a specific shard, and a shard corresponds to a core in ScyllaDB; this greatly improves performance.

A few words about the role of the drivers, because you might think that this extra layer of abstraction might not really be needed. The drivers really do a lot of work behind the scenes. First of all, they serialize and deserialize the protocol and its different parts: for example, the CQL frames that I talked about, the query frame, the initialization of the connection, as well as the serialization and deserialization of the different types that you actually send over the network. ScyllaDB supports many types: you can of course send very basic types such as ints, strings, and so on, but when you look at more advanced types, like dates or user-defined types, the serialization is much harder to do on your own. The driver also maintains metadata about the cluster: what tables are available, what nodes are available. This information is crucial for sending the query to the correct node, and in the case of ScyllaDB, also to the correct shard, the actual physical core that holds the data. The driver actually sends the data over the network, and the different drivers, like the Rust driver, have many helpers that allow you to very conveniently construct the query.
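To make the frame format a bit more concrete, here is a minimal sketch in Rust of a CQL v4 request frame header. The 9-byte layout (version, flags, a 16-bit stream id, opcode, 32-bit body length) follows the CQL v4 protocol specification; the struct and helper are purely illustrative and are not the actual driver code.

```rust
/// Illustrative sketch of a CQL v4 request frame header (9 bytes),
/// following the protocol specification; not the actual driver code.
struct FrameHeader {
    version: u8, // 0x04 = CQL protocol v4, request direction
    flags: u8,   // e.g. 0x01 = body is compressed
    stream: i16, // stream id: lets many requests share one TCP connection
    opcode: u8,  // e.g. 0x07 = QUERY, 0x0A = EXECUTE, 0x01 = STARTUP
    length: u32, // length of the frame body that follows
}

impl FrameHeader {
    fn encode(&self) -> [u8; 9] {
        let mut buf = [0u8; 9];
        buf[0] = self.version;
        buf[1] = self.flags;
        buf[2..4].copy_from_slice(&self.stream.to_be_bytes());
        buf[4] = self.opcode;
        buf[5..9].copy_from_slice(&self.length.to_be_bytes());
        buf
    }
}

fn main() {
    // A QUERY request on stream 1 with a 42-byte body, uncompressed.
    let header = FrameHeader { version: 0x04, flags: 0x00, stream: 1, opcode: 0x07, length: 42 };
    println!("{:02x?}", header.encode());
}
```

The stream id is what enables the multiplexing I mentioned: each in-flight request gets its own id, and the response carries the same id back, so the driver can match responses to requests on a single connection.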
When you think of a driver, it might not be obvious at first sight how the driver could actually improve performance, but at the top of the list there are a few optimizations that we employ in our drivers. First of all there is shard awareness: sending the query to the correct node, and to the correct physical core of that node. That means ScyllaDB doesn't have to internally route the query to a different core, which could be very expensive when you have, for example, a CPU with different NUMA nodes, and copying the data between those NUMA nodes can be costly. Some features, like ScyllaDB's Change Data Capture, change the partitioning scheme that is used to determine which part of the data is stored on which node, and this has to be implemented in the driver so it does the calculation correctly. ScyllaDB also supports lightweight transactions (LWT). When you send such a transaction, you need to route it to the correct node to avoid the Paxos conflicts that could happen if you just sent it to random nodes. And the final technique that we employ in our drivers to make them performant is to really optimize the hot paths of the driver, the paths that are actually taken when making a query. Serialization is a really expensive part of making a query: getting the data into the correct format without copying it multiple times. The routing code is another aspect of the driver that takes a lot of time: calculating the hash of the data and using that hash to route the query to the correct node. And when optimizing the driver, we need to avoid making any copies of the data that are not necessary, allocations, which can actually take a lot of time, and locks.

So let's start with the ScyllaDB Rust driver. The idea for the driver was actually born during a hackathon three years ago, and of course after the hackathon we continued the development to make it into a real product that people can use in production. Here's a blog post that we published just after the hackathon to describe the driver. The driver uses the Tokio framework for asynchronous operations, a really common framework in Rust. And the driver is now feature complete, so we support many advanced features compared to other drivers: the shard awareness functionality, an asynchronous interface, so you can really send a large number of queries concurrently, different compression schemes, all the CQL types like dates, user-defined types, collections, and so on, as well as other features.

Actually, this might be a really interesting story: the initial prototype of the driver was developed during the hackathon by only four people, in just a couple of days, but this initial prototype was enough to perform some basic queries and send them to Scylla. I think this is a great story about how you can do such a complicated project in a really short time and, by doing that, gain some knowledge and insight about the performance; not theorizing about the performance, but just writing it in a couple of days and trying the performance on your own. After this hackathon, these were the initial benchmarks that we did: we compared the ScyllaDB Rust driver with the gocql driver, the driver for Go, and with the existing driver that could be used to talk to Cassandra in the Rust language, the CDRS driver.
So what this table shows is the time it takes to process 10 million requests with some fixed concurrency. As you can see, the Rust driver is much better compared to those two drivers, and this really encouraged us to develop this driver further and make it production quality. After some time we did those benchmarks once again, with the driver at a much more mature state, and the performance was still good compared to other drivers. The blue bar on the left is the ScyllaDB Rust driver, and we compared it with the C++ driver, different variants of the C++ driver, as well as the gocql driver; this graph shows how much time it takes to execute one million inserts to ScyllaDB, and as you can see, the Rust driver was clearly beating all the other drivers. This next graph is for selects, which might be a bit more straining on the driver, because the driver now has to deserialize the output that it gets from ScyllaDB: a select can return many rows, and those rows have to be allocated somewhere in the driver and deserialized. When it comes to an insert, nothing is returned by ScyllaDB when the insert succeeds, so the need for deserialization is much lower. But even in that benchmark, the Rust driver was much faster than the other drivers.

A few words about the design of the Rust driver. We decided on the Tokio runtime. Asynchronous Rust is based on a quite unique future model; by unique I mean that the future itself holds the state that will be filled in when the computation ends, and running a function that returns a future does not automatically spawn it as an asynchronous task, as is the case in other languages. So in Rust you need to use some runtime, and we chose the Tokio runtime as the base for the Rust driver. We also looked at other runtimes, but we chose Tokio as it was the most popular and the most standard, let's say, runtime among the different libraries available for asynchronous Rust projects.

Having decided on the Tokio framework, we then moved on to designing the API, and starting a new driver from scratch was actually a great opportunity to really rethink the API design. Other drivers were developed over the years and have a lot of cruft that accumulated over the years of development, but with the ScyllaDB Rust driver we could start fresh and have a new, clean API. So we designed the API to be very clean and to have sensible defaults, and as you can see, it's very easy to perform a query: this particular query selects three columns from a table, iterates over the rows, and prints them, and as you can see, we used the facilities that the Rust language offers to deserialize the result back into native types like i32 and String, so it can be consumed very easily.
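For those who can't see the slide, here is roughly what that example looks like; a minimal sketch assuming the scylla crate's API of that era (method names have evolved across driver versions), with a made-up keyspace, table, and columns (ks.t with a, b, c):

```rust
use scylla::{IntoTypedRows, Session, SessionBuilder};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to one contact point; the driver discovers the rest
    // of the cluster (and its shards) by itself.
    let session: Session = SessionBuilder::new()
        .known_node("127.0.0.1:9042")
        .build()
        .await?;

    // Run a query and deserialize each row into native Rust types.
    if let Some(rows) = session.query("SELECT a, b, c FROM ks.t", &[]).await?.rows {
        for row in rows.into_typed::<(i32, i32, String)>() {
            let (a, b, c) = row?;
            println!("a = {}, b = {}, c = {}", a, b, c);
        }
    }
    Ok(())
}
```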
So now let's move on to some interesting things that we discovered during the development of the driver that impacted its performance. One day, an issue was raised by the author of latte, a benchmarking tool for ScyllaDB and Cassandra. The author reported that the driver had problems scaling with a high concurrency of requests: when the concurrency, the number of requests you do simultaneously, was very large, the driver didn't scale as it was supposed to. Actually, debugging the issue was really fun but challenging, and in the end we managed to identify the root cause in the implementation of Tokio's FuturesUnordered, a utility for gathering many futures and waiting for them. It was related to cooperative scheduling in Tokio: it was possible for this structure to iterate over all of its futures each time it was polled, and if there were many futures in the structure, the runtime had to iterate over all of them multiple times, which is clearly an n-squared performance issue. The fix was actually really nice: just limit the number of futures that are iterated over in each poll. We worked with the Tokio maintainers, the fix was merged into Tokio, and it solved the scaling issue of the driver.

Now let's move on to something more high level. When the Rust driver starts, it opens connections to the different nodes of ScyllaDB, and making multiple connections is really crucial for performance. By default, the driver makes a connection directly to each core of ScyllaDB, so that's one connection per shard, but you can customize it to make more connections, and in some benchmarks that we did, this actually improved performance. Cassandra drivers make multiple connections to different nodes, but with ScyllaDB we can connect to a particular core of a node, and this is what the ScyllaDB Rust driver does; we implemented this behavior right from the beginning of the driver, and it makes even better latencies possible. Before this feature, for example in other drivers that didn't have it, when a query was sent to the wrong core of the database, that core had to route the request to a different core, and this can be very expensive even at the lowest level of CPU design: you have caches that are local to a particular core, and on server CPUs with multiple NUMA nodes, copying this data between cores can be very expensive. The Rust driver avoids this cost by connecting directly to the correct core, which of course required some changes in both the driver and the database; the database, for example, has facilities that allow you to connect to a particular core.

Apart from connecting to a particular core, one big thing to keep in mind is that ScyllaDB is a multi-node database, and the data is partitioned across multiple nodes. The structure that we use to do this is a vnode configuration: the data is placed on a virtual ring, which is shown on the slide, and we partition this ring into smaller chunks called vnodes, and each node owns some subset of the vnodes. The role of the driver is to figure out, when you make a select query or an insert query, which vnode it will land in and which node that corresponds to. So this is how it looks: when there's a select or insert query that touches some partition key, we hash this partition key (this is actually done by the driver), and by calculating this hash and keeping in mind the ring structure that we fetched from Scylla, we can determine which node the query should land on.
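As a rough illustration of that lookup, here is a minimal sketch, assuming the partition key has already been hashed into a 64-bit token (ScyllaDB, like Cassandra, uses a Murmur3-based partitioner for this). The node names and the flat representation of the ring are made up for the example; real drivers also have to account for replication, where more than one node owns each token range.

```rust
/// One entry on the token ring: a token and the node that owns
/// the vnode ending at that token. The slice is kept sorted by token.
struct RingEntry {
    token: i64,
    node: &'static str, // stands in for a real node handle
}

/// Find the node owning `token`: the first ring entry with a token
/// >= ours, wrapping around to the start of the ring if none exists.
fn find_owner(ring: &[RingEntry], token: i64) -> &str {
    let idx = ring.partition_point(|e| e.token < token);
    let idx = if idx == ring.len() { 0 } else { idx };
    ring[idx].node
}

fn main() {
    let ring = [
        RingEntry { token: -4611686018427387904, node: "node-a" },
        RingEntry { token: 0, node: "node-b" },
        RingEntry { token: 4611686018427387904, node: "node-c" },
    ];
    // A token between node-b's and node-c's positions belongs to node-c;
    // one past the last entry wraps around to node-a.
    println!("{}", find_owner(&ring, 42)); // node-c
    println!("{}", find_owner(&ring, 4611686018427387905)); // node-a
}
```

With a replication factor greater than one, you would keep walking the ring from that position to collect the remaining replicas, and that is exactly the kind of replica set the driver can precompute, as I'll describe next.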
Recently, this year, we did a refactor of this subsystem, in which we really focused on improving the performance of the load balancing phase of the driver. Before making this change we benchmarked the driver and profiled it very carefully, and the load balancing phase was a very considerable chunk of the driver's execution time. By doing the refactor, we reduced the number of allocations and atomic operations (for example, operations related to mutexes) while building the query plan. We went with a very interesting design for this load balancing refactor: we split the algorithm into two phases. There's a pick phase, which is the common case; this phase assumes that the query will succeed on the first node you send it to. And there's the fallback phase, for when the query didn't succeed on the first node, for example because the network was flaky and the connection timed out; the fallback determines which node to send the query to next. By splitting these two cases into the hot path, the pick path, and the cold path, the fallback path, we were able to better optimize the common case, the pick phase, where only a single node from the load balancing plan is actually needed (a minimal sketch of this split follows below). We also did some precomputation of the replica sets: as you saw on the previous slides, we have this token ring, and when you calculate the hash, you normally need to scan the ring to determine which nodes actually hold the data. By precomputing these replica sets, we were able to optimize the performance and get constant-time access to precomputed replica slices that are used for load balancing.

So this is a quick comparison of how the performance improved and how we did it; a short summary of the number of allocations and reallocations that we do during an insert. As you can see, before the refactor we had to do on average 15 allocations per request, allocating around two kilobytes of data to make the request, but after the optimization, after we implemented the new refactored load balancing, the number of allocations is really lower: nine fewer allocations, which is 60% fewer. And an even better number is the allocated space, which we made much smaller: now it's only around 300 bytes allocated per request, and remember, with ScyllaDB we potentially do millions of requests per second, so all those bytes add up very quickly. This is similar data for selects; for selects the gain is a bit lower, because for selects we need to allocate some buffers to keep the data that we receive from ScyllaDB, so we still need to do a number of allocations to get this data into the right buffers and deserialize it, but it was still a great improvement.
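Here is the minimal sketch of that pick/fallback split, as promised above. The trait and types are simplified stand-ins I made up for illustration; the real driver's load balancing interface carries much more context, but the shape of the idea is the same: pick cheaply returns a single node with no allocation, and the allocating fallback plan is built only when the first attempt fails.

```rust
/// Simplified stand-in for the driver's node handle.
struct Node {
    name: &'static str,
    up: bool,
}

trait LoadBalancing {
    /// Hot path: cheaply pick one node for the first attempt.
    /// No allocation here; most queries never need anything else.
    fn pick<'a>(&'a self, replicas: &'a [Node]) -> Option<&'a Node>;

    /// Cold path: build the full plan of remaining candidates,
    /// used only if the first attempt fails (timeout, flaky network).
    fn fallback<'a>(&'a self, replicas: &'a [Node]) -> Box<dyn Iterator<Item = &'a Node> + 'a>;
}

struct PreferUp;

impl LoadBalancing for PreferUp {
    fn pick<'a>(&'a self, replicas: &'a [Node]) -> Option<&'a Node> {
        // Cheap scan, no allocation: the first node that is up.
        replicas.iter().find(|n| n.up)
    }

    fn fallback<'a>(&'a self, replicas: &'a [Node]) -> Box<dyn Iterator<Item = &'a Node> + 'a> {
        // Allocates (the Box), but only on the rare failure path;
        // skip the node that `pick` already tried.
        Box::new(replicas.iter().filter(|n| n.up).skip(1))
    }
}

fn main() {
    let replicas = [
        Node { name: "node-a", up: false },
        Node { name: "node-b", up: true },
        Node { name: "node-c", up: true },
    ];
    let policy = PreferUp;

    // Hot path: a single cheap pick for the first attempt.
    if let Some(first) = policy.pick(&replicas) {
        println!("first attempt: {}", first.name);
    }
    // Cold path, taken only if the first attempt fails.
    for node in policy.fallback(&replicas) {
        println!("then try: {}", node.name);
    }
}
```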
So this is a short overview of other efforts that we have made, or are still making, in the ScyllaDB Rust driver to improve performance. One is actually a bit different from the previous optimizations: it's an optimization not of raw throughput, but of the cost of using the driver. Many of our clients' deployments run ScyllaDB across different availability zones, for example in the AWS cloud, but doing so can bring costs up, because sending data between availability zones can be costly. By introducing rack-aware load balancing (by rack I mean, in this case, an AWS availability zone), we are able to send queries within the availability zone, and this greatly reduces the cost of making a query. And because the time to send a query within an availability zone is lower, this even reduces latency, by querying the nearest nodes in the cluster.

The second one is the rewrite of deserialization that we did. As you remember, when the ScyllaDB database sends the results of a query, for example when you send a select query, the database sends you a list of rows. The previous implementation had to deserialize each row into a vector of columns, and then into a vector of vectors of columns, the vector of rows. As you can see, this means that for each row you have to do an allocation of this Vec, and this can be quite costly. A less obvious and harder to implement approach, which is what we did, is to let you deserialize on the fly: you don't have to materialize this vector before reading the data; you can just consume the data as it's deserialized, which means we don't have to allocate this big vector each time you read the results of a query. When implementing this approach, Rust's lifetime functionality made it really pleasant to implement this on-the-fly deserialization in a memory-safe way. A few words about how we rolled this out: deserialization is a very common operation that users do with the driver, so we didn't want to break their workloads. We marked the old API as legacy, for backwards compatibility, and introduced a new API that does on-demand deserialization. That means we have reduced allocations, and the migration path is really easy.

And this particular effort wasn't started by us; it was a community project started by Joseph Perez, and this is his GitHub handle. He was really interested in making a driver that would have no allocations at all. As hard as that task may seem, he made a really great prototype that allows you to do queries against Scylla with zero-copy deserialization, and actually zero allocations per request, or maybe one allocation in some corner cases. He did query plan caching, a very similar approach to what we did in the load balancing refactor, and his driver can do zero or only one allocation per request. We are looking into incorporating the ideas shown in that project into the ScyllaDB Rust driver, and this, frankly speaking, was a really nice surprise for us: we didn't know about the project before, but it was great motivation to improve the driver even more. I think this is a great example of how the open source community works: different people start their own projects, and the more mature projects can take those ideas and become even better.
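To show what makes this memory-safe, here is a minimal sketch of on-the-fly, borrowing deserialization. The wire format here, a 4-byte big-endian length prefix before each value, is a toy I made up for the example rather than the real CQL encoding, but the principle matches what the driver does: every yielded value borrows from the response buffer, and the 'frame lifetime stops you from keeping a value alive after the buffer is gone.

```rust
/// Iterator over length-prefixed values in a response frame.
/// Each yielded slice borrows from the frame: nothing is copied.
struct Values<'frame> {
    buf: &'frame [u8],
}

impl<'frame> Iterator for Values<'frame> {
    type Item = &'frame [u8];

    fn next(&mut self) -> Option<Self::Item> {
        // Copy the reference out so the slices we produce keep
        // the 'frame lifetime, not the lifetime of `&mut self`.
        let buf = self.buf;
        if buf.len() < 4 {
            return None;
        }
        let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
        let rest = &buf[4..];
        if rest.len() < len {
            return None;
        }
        let (value, tail) = rest.split_at(len);
        self.buf = tail;
        Some(value)
    }
}

fn main() {
    // Two values, "hi" and "scylla", each with a 4-byte length prefix,
    // standing in for part of a query response frame.
    let frame = b"\x00\x00\x00\x02hi\x00\x00\x00\x06scylla";
    for v in (Values { buf: &frame[..] }) {
        // `v` borrows from `frame`; no bytes were copied or allocated.
        println!("{}", std::str::from_utf8(v).unwrap());
    }
}
```

Nothing in that loop copies or allocates, and if you tried to stash one of those slices somewhere that outlives the frame, the borrow checker would reject the program at compile time.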
While developing the Rust driver, we use different profiling tools, and the Rust ecosystem makes it very easy to run different profilers. One of the profilers that integrates with cargo is cargo flamegraph, a utility that can create flame graphs of the execution of the driver, or of any other tool you are developing, and it produces a flame graph like this one; I'm sure you are familiar with those. And the other tool, mostly for projects based on Tokio, is tokio-console, which lets you look into the different tasks (by tasks I mean the asynchronous futures running on the Tokio runtime) that are currently executing. You can look at it in real time and see how much time those tasks take; this is another great way to profile Rust applications written with Tokio.

So now let's move on to the final part of the presentation, which is the bindings to the ScyllaDB Rust driver. Let's start with what I mean by bindings. When we were developing the ScyllaDB Rust driver and benchmarking it, it was so good, so much better than the C++ driver, that we had this idea: how could we get the performance of the Rust driver, but still in C++, for example for someone who has C++ applications that use the C++ driver? So an obvious idea popped into our minds: develop bindings, a compatibility layer that allows you to call the ScyllaDB Rust driver from C++ code and get the great performance. For us, this greatly reduces the maintenance burden, because the bindings project is much smaller and we don't have to maintain two huge drivers; we only have to maintain the Rust driver and the smaller bindings. And by having the core of the driver written in one language, Rust, we have fewer bugs, as we don't have to test many drivers; we can focus really carefully on the single driver, the Rust driver. So the idea of bindings was a really great one.

We started the development of the bindings with the C++ language, and we based the bindings API on the original C++ driver, so it has the same API. That means that when you build the C++ bindings, the result is just an .so file, a shared library, that you can drop in as-is to replace the C++ driver, and as the API is the same, it will just work. We actually ran the original test suite of the C++ driver with the .so file replaced, and the vast majority of the tests work out of the box. The resulting project is really much smaller: when you look at the different drivers, they can have many tens of thousands of lines of code, but the bindings project is much smaller, and that means it's very easy to maintain. We also saw that the bindings project has better stability: while developing and testing it, we had fewer problems than with the original C++ driver, and this is partly what we expected, as the Rust core of the driver was really well tested and we are very confident in it.

As a final thing, here is a very short peek at the driver and the bindings. On the right is the original header declaration of a function in the C++ driver, and on the left is the implementation that we did in the bindings. As you can see, this is an extern "C" function with #[no_mangle] on it, so it will be exported as the same symbol as the original C function, but under the hood we call the components of the Rust driver. The user of the bindings isn't really aware, from their perspective, that some other driver is working under the hood: for them it's just the same API, and they don't have to worry about configuring the Rust driver; it's done in the same way as with the C++ driver, through this thin compatibility layer.
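As a rough illustration of the pattern (not the actual bindings code), here is a minimal sketch. The cass_session_new and cass_session_free names mirror the C/C++ driver's API style, but this toy session is made up; the real bindings wrap the actual Rust driver's session and a Tokio runtime behind the pointer in the same way. Built as a cdylib, this produces exactly the kind of drop-in .so discussed above.

```rust
/// Toy stand-in for the Rust driver's session type; the C side only
/// ever sees an opaque pointer to it.
pub struct CassSession {
    // Where the real bindings would keep the Rust driver's session
    // and a handle to the Tokio runtime.
    state: String,
}

/// Exported under the exact symbol name the C/C++ caller expects,
/// thanks to #[no_mangle] and the C calling convention.
#[no_mangle]
pub extern "C" fn cass_session_new() -> *mut CassSession {
    // Hand ownership of the boxed session to the C side as a raw pointer.
    Box::into_raw(Box::new(CassSession {
        state: String::from("disconnected"),
    }))
}

/// The C side must call this exactly once per session to free it.
#[no_mangle]
pub extern "C" fn cass_session_free(session: *mut CassSession) {
    if !session.is_null() {
        // Take ownership back so Rust runs the destructor.
        unsafe { drop(Box::from_raw(session)) };
    }
}
```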
So that concludes the presentation, and now we can move on to the Q&A. But before we do that, there are a couple of events that ScyllaDB is organizing. Next month we have a very nice webinar about building low-latency Rust applications on ScyllaDB. That webinar will be more hands-on; we will learn more about ScyllaDB, not only the Rust portion but also how to use ScyllaDB itself, and I will actually be taking part in it. Apart from that, you can also check out ScyllaDB University. And later this year, at the end of the year, we are organizing P99 CONF, a virtual conference with many speakers from across the industry; this is not a Scylla-specific conference, it's a really broad conference about different aspects of performance. This is not our first time organizing it, and when you visit the website you can see the presentations from last year, which I really encourage you to check out.

Okay, so let's move on to the Q&A, and I have a view of the questions that you have asked, so maybe let's start with this one; it's really fun, I think. The question is whether we plan, or have thought about, using the Rust language in ScyllaDB itself, the core database. I think there are some parallels to the Linux kernel here: the Linux kernel is a huge C project, and in Scylla's case it's a C++ project, but similar to the Linux kernel, we are interested in changing some small parts to use Rust. Actually, right now one small component of Scylla is written in Rust: we recently introduced support for user-defined functions that use WebAssembly, and the portion of the code that handles user-defined functions is written in Rust; we found that, for the particular WebAssembly runtime we are using, it was much easier and better to interface with it from Rust. So this is one small portion of the database where we use Rust, and internally we are experimenting with integrating the Rust language with our own Seastar framework for C++, making it possible for the two to coexist, so that we can write more components of Scylla in Rust. I guess this will be an exciting thing to watch as it progresses in the future.
So maybe this question; I guess it should be visible right now. The question is about load balancing: is it single or multiple failover? In ScyllaDB you can configure the replication factor of the data, and the default value that we suggest to our users is a replication factor of three, which means that if you do an insert, the data will be replicated onto three different nodes. In this case it will be multiple failover: if you do a select query and two nodes don't answer, the third node will be queried. Apart from that, we have other replication policies, like multi-DC policies: you can specify that some of the nodes are in a particular data center, like an AWS region, and specify a replication factor in each one, so ScyllaDB can really handle cases where an entire data center goes down. I think there was a pretty nice blog post about one case of a customer that had an entire data center go down, and the latencies of ScyllaDB didn't really get worse; the database handled it really well.

Maybe this next one is a really great question: where should I start if I also want to develop a ScyllaDB driver? As I alluded to in the presentation, this is actually not a daunting task; it can be done by a small team in a hackathon, so you should not be scared of it. My personal recommendation on how to start would be, first of all, looking into the other drivers and how they do the lower-level operations. The Rust driver is the newest driver, and I think it's the cleanest in terms of the codebase, so I would start by looking at the Rust driver for inspiration. The second thing is reading the documentation describing the protocol; I think you can get really far just by reading the protocol description and implementing the protocol. And when you get past that phase, I think ScyllaDB University is a great way to learn about the load balancing concepts, for example, which are a really necessary component of writing a driver.

This one is a short question: is the Rust driver code open source? Yes, it's open source; I guess someone will post the link in the chat really soon. It was started as an open source project, so if you dig through the history of the Git repository, you can see how it was developed: how it was started over three years ago and what the first components we developed were.
Another question is about the efficiency of ScyllaDB compared to other databases. My recommendation is to go to the ScyllaDB website; we have really many benchmarks written up there. My personal favorite, because I took part in that effort, was comparing Cassandra 4 to ScyllaDB, and we published a blog post about those benchmarks. We actually started with a first blog post comparing Cassandra 3 to Cassandra 4, so even for people not interested in Scylla, it was a great resource for learning about the performance of those databases. On our blog you can read many articles about the performance of different databases, like ScyllaDB and Cassandra, and on our main website, I think right there on the front page, you can see a performance and cost comparison between ScyllaDB and DynamoDB, as well as other databases.

Okay, I think we are nearing the end of the presentation and the end of our time, so I'll finish here and hand it back to the Linux Foundation. Thank you so much, Piotr, for your time today, and thank you everyone for joining us. As a reminder, this recording will be on the Linux Foundation's YouTube page later today. We hope you join us for future webinars; have a wonderful day.