I am going to get started so that we have more time for the other talks. I'm Shindong from Apple, from the FoundationDB team. Neelam and I are going to talk about this awesome collaborative project between Apple and Snowflake on the topic of a native consistent caching layer in FoundationDB.

Let's get started with some motivation behind it. Just like other systems, FoundationDB can suffer from read-hot traffic. To understand the problem, imagine we have this simple cluster with a bunch of storage servers happily serving requests, and then along comes a bunch of clients who are all trying to access the same small key range hosted by one storage server team. At first, when the traffic is still moderate, the requests can be load balanced across all the servers in that team quite nicely. But as we increase the traffic, that set of servers will eventually be overloaded and finally saturated by the read-hot shard. A read-hot shard can be particularly problematic in FoundationDB, because a read-hot storage server not only slows down reads, it also slows down writes, and eventually it affects the overall performance of the whole cluster. At the point of saturation, there is virtually no way to keep up with the traffic without somehow increasing the replication of that piece of data. But because read-hot traffic tends to be temporary, we do not necessarily need durable replicas for it; temporary in-memory replicas can help. And that's when the cache comes into play. Now let me hand over to Neelam to talk more about that.

Thank you, Shindong. Good afternoon, everyone. I'm Neelam from Snowflake, and we're excited to talk about caching here today. Now that Shindong has convinced you that we do need a caching solution for FoundationDB, I'm here to walk you through some of the solutions. Before we talk about native consistent caching within FoundationDB, you might ask: why not use an application-managed cache like Memcached or Redis between the application and FoundationDB? Why not just use that instead of implementing caching within FoundationDB? While that is a possible solution, it's not really a good one, because the entire burden of managing that cache falls on the application. That is a very complex problem: the application has to worry about the consistency, coherency, and availability of that cache, and the overall complexity becomes really questionable. So even though it's a viable solution, it's not a good one.

With that, we move on to what we are proposing, which is a much better solution than just adding a side cache: implementing the caching functionality within FoundationDB itself. Since reads are served from the storage servers, this cache will sit in front of those storage servers. Consistent caching attacks the problem from two angles. One, it allows certain hot key ranges to be held in memory, as Shindong pointed out, and two, it allows us to increase the replication factor for those read-hot ranges. Holding the data in memory takes load off the disks and also reduces read latency. And increasing the replication factor allows us to throw more CPUs at the problem, so the load can be spread across more CPUs. Both of these factors combined also lead to lower latency, which is a good thing. And since this cache is completely implemented within FoundationDB, it has the same consistency guarantees as the rest of FoundationDB.

Now let's look at how all this might look within FoundationDB via block diagrams. We have a FoundationDB client and the TLogs. The FoundationDB client will continue writing into the TLogs; even with a caching layer somewhere within FoundationDB, the writes continue to go to the TLogs, just like today. We have the storage servers, but now, in addition to the storage servers, we will also have storage cache servers, which implement the caching functionality. Both of them pull mutations from the TLogs. And the FDB client can issue reads to either the storage cache servers or the storage servers, depending on whether a key range is being cached or not. Just to make it clear: it's all happening within FoundationDB, with basically nothing visible to the outside application other than the benefits it's going to see.

From this, I would like to zoom in a little bit, over the next couple of slides, on two of the main components: the storage cache servers, which are new, and the changes we need to make in the FoundationDB client. First, the storage cache role. We are bringing up a new storage cache role, which is going to be stateless and ephemeral. What that really means is that it does not remember any state persistently and it does not make any data durable. If your storage cache server dies, it comes back as a brand new process, and in that new life it might even be responsible for a completely different key range. So it's completely ephemeral and stateless. At a very high level, what is the storage cache role going to do? It will establish the key range it is responsible for, pull mutations from the TLogs, filter out the irrelevant mutations, apply the relevant ones, and serve read requests. You might ask: isn't that very similar to the storage server role? Actually, it is. That's exactly the case. But it is far simpler, because we are not making any data durable, and that makes life a whole lot simpler.
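To make the role's job concrete, here is a minimal sketch of what such a mutation-applying loop could look like. This is purely illustrative Python, not the actual FoundationDB implementation (which lives in the C++ server code); the mutation shapes and the TLog batch interface shown here are invented for the example.

```python
from collections import namedtuple

# Invented mutation shapes, just for this sketch.
SetMutation = namedtuple("SetMutation", "key value")
ClearRange = namedtuple("ClearRange", "begin end")

class StorageCacheRole:
    """Drastically simplified storage cache role: in-memory only, nothing durable."""

    def __init__(self, begin, end):
        self.begin, self.end = begin, end   # the key range this role is responsible for
        self.data = {}                      # in-memory copy of that range
        self.version = 0                    # last commit version applied from the TLogs

    def _in_range(self, key):
        return self.begin <= key < self.end

    def apply(self, mutation):
        # Filter out mutations outside our range, apply the rest to the in-memory map.
        if isinstance(mutation, SetMutation) and self._in_range(mutation.key):
            self.data[mutation.key] = mutation.value
        elif isinstance(mutation, ClearRange):
            doomed = [k for k in self.data if mutation.begin <= k < mutation.end]
            for k in doomed:
                del self.data[k]

    def pull_once(self, tlog_batches):
        # tlog_batches: iterable of (version, [mutations]) pulled from the TLogs,
        # exactly as a storage server would pull them -- but we never touch disk.
        for version, mutations in tlog_batches:
            for m in mutations:
                self.apply(m)
            self.version = version

    def get(self, key, read_version):
        # Serve the read once we have caught up to the requested read version.
        if self.version < read_version:
            raise RuntimeError("not yet caught up to read version")
        return self.data.get(key)
```

The important property is in the pulling step: the cache applies the same committed mutations that the storage servers apply, which is what keeps it consistent with the rest of the database.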
Now, without delving into too much detail, I would like to zoom in on the second piece: the FoundationDB client-side changes. We'll have a client API to configure the storage cache, with which you can add or remove certain key ranges from the cache, and we'll have the ability to specify a replication factor. In addition, the client today caches certain metadata about the storage servers so that it knows which storage server to go to whenever a read query comes in. With the addition of storage cache servers, it will also maintain similar metadata about those cache servers, so that based on that metadata it can direct queries either to the storage servers or to the cache servers.

Now, as you can imagine, this metadata cache could be stale. For instance, let's assume a new cache has come up and the proxy or the client doesn't even know about it, so it just continues sending its requests to the storage servers. In this case, the storage server will recognize that this key range is being cached, so it will serve the query but also let the client know that there is a cache for this particular key range. The client will then update its metadata cache and start sending queries for those key ranges to the cache server instead of the storage server. That, at a very high level, is the set of changes being made on the client side.
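Here is a small sketch of how that client-side routing could work. Again, this is illustrative Python rather than the real client library code: the location-map shapes, the send_get callable, and the cached_range_hint field are all invented names for the mechanism described above; the talk does not give the actual API.

```python
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple

KeyRange = Tuple[str, str]          # [begin, end)

class CacheHint(NamedTuple):
    range: KeyRange
    servers: List[str]

class GetReply(NamedTuple):
    value: Optional[bytes]
    cached_range_hint: Optional[CacheHint]   # set by a storage server when the range is cached

def lookup_range(ranges: Dict[KeyRange, List[str]], key: str) -> Optional[List[str]]:
    for (begin, end), servers in ranges.items():
        if begin <= key < end:
            return servers
    return None

class ClientLocationCache:
    """Sketch of the client's server-location metadata, extended with cache servers."""

    def __init__(self, send_get: Callable[[List[str], str, int], GetReply]):
        self.storage_map: Dict[KeyRange, List[str]] = {}
        self.cache_map: Dict[KeyRange, List[str]] = {}   # may be stale or incomplete
        self.send_get = send_get

    def read(self, key: str, read_version: int) -> Optional[bytes]:
        # Prefer a cache server if we currently believe the key is cached.
        servers = lookup_range(self.cache_map, key) or lookup_range(self.storage_map, key)
        reply = self.send_get(servers, key, read_version)
        # If the storage server tells us the range is cached, remember that so
        # future reads for this range go straight to the cache servers.
        if reply.cached_range_hint is not None:
            hint = reply.cached_range_hint
            self.cache_map[hint.range] = hint.servers
        return reply.value
```

The nice property here is that a stale map is harmless: the read still lands on a storage server, which answers correctly and, as a side effect, points the client at the cache for subsequent reads.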
So what are the implications of all this for applications? Basically, none. And that is the whole point of this project: we do not want to burden the application. It should just be able to benefit from this without taking on any of that complexity. The cache is completely managed within FoundationDB, and it is completely transparent, except that the application has a way to configure the cache and will see the added benefit for its reads.

Now, moving on to the last piece of my part: the cache configuration modes. First, we'll have a manual mode, where the application has the option to define the key ranges that must be cached and a corresponding replication factor. In addition to the manual mode, we'll have an automatic mode, where FoundationDB detects hot key ranges automatically and starts caching them as well. You might ask: if we have automatic detection and caching, why do we even care about the manual mode? The manual mode actually gives us a lot of flexibility. For instance, there might be some key ranges that you just want low latency for, and they might not be hot key ranges to begin with. If they are not hot, automatic detection is not going to flag them, and they will not be cached automatically. So we want to give applications the option to cache any key range they want manually as well. That is what the manual mode gives us. From here, I would like to hand over to Shindong to talk about the automatic detection and cache management and conclude our talk.

Thanks, Neelam. Now that we've learned how the cache works and how we can manually spin it up, let's talk about automatic detection and cache management. For this part, we're going to reuse some of the existing metric reporting framework in the cluster, and the ultimate goal is to find an efficient way to marry the candidate key ranges and the available caches together. We can divide that into two parts: the first is automatic detection, and the second is cache management. For the first part, as I said, each storage server already keeps some simple statistics about all the requests it serves during a period of time, and whenever a storage server finds that a shard has become read-hot, it notifies a singleton process in the cluster. That singleton process will then contact the storage server to figure out which key range within that read-hot shard has the highest read density, so it can cache at a finer granularity rather than caching the whole shard. Then, for cache management, this same process tracks not only all the storage servers but also all the cache roles in the cluster and their resource consumption, so it has the knowledge to decide when and where to put a key range into the cache, and it also manages the lifetime of the data in the cache.
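A minimal sketch of that detection-and-placement loop might look like the following. This is illustrative only: the threshold, the TTL, the replica count, and the read_stats / hottest_subrange interfaces assumed on the storage server handles are all invented here, since the talk describes the mechanism but not the concrete code.

```python
from dataclasses import dataclass, field
from typing import List

READ_HOT_BYTES_PER_SEC = 50_000_000     # made-up "read-hot" threshold per shard
CACHE_REPLICAS = 3                      # made-up replication factor for cached ranges
CACHE_TTL_SECONDS = 600                 # made-up lifetime before a range is re-evaluated

@dataclass
class ShardStat:
    begin: str
    end: str
    read_bytes_per_sec: float

@dataclass
class CachedRange:
    begin: str
    end: str
    expires_at: float

@dataclass
class CacheRoleState:
    load: float = 0.0                            # resource consumption reported by the role
    ranges: List[CachedRange] = field(default_factory=list)

def manage_cache(storage_servers, cache_roles: List[CacheRoleState], now: float) -> None:
    """One pass of the singleton's detection and placement loop (sketch)."""
    for ss in storage_servers:
        for shard in ss.read_stats():            # storage servers already keep these stats
            if shard.read_bytes_per_sec < READ_HOT_BYTES_PER_SEC:
                continue
            # Ask the owning storage server for the sub-range with the highest read
            # density, so we cache at a finer granularity than the whole shard.
            begin, end = ss.hottest_subrange(shard)
            # Place the range on the least-loaded cache roles.
            for role in sorted(cache_roles, key=lambda r: r.load)[:CACHE_REPLICAS]:
                role.ranges.append(CachedRange(begin, end, now + CACHE_TTL_SECONDS))

    # Lifetime management: drop cached ranges once they expire or stop being hot.
    for role in cache_roles:
        role.ranges = [r for r in role.ranges if r.expires_at > now]
```

The essential points match the description above: detection starts from statistics the storage servers already keep, caching happens at sub-shard granularity, and placement is driven by the resource consumption the singleton tracks for each cache role.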
Now, with all that, we really believe that this feature will be one of the most exciting features in the upcoming releases of FoundationDB, because it adds a whole new dimension to the product and will allow for new use cases. And thank you. That's it.