Hello, everyone. Thanks for coming in. I'm Rashmi Krishnayasati, a principal engineer working on Amazon Keyspaces. Joining me today is Rohan Gupta, another engineer working on Amazon Keyspaces for Apache Cassandra. We'll start with an overview of Amazon Keyspaces, what it is and how we built it. We'll talk about how we architected it to be highly scalable and highly available. We'll also go through the features of Amazon Keyspaces, and Rohan will talk in detail about how they were built to provide a seamless experience for customers. We'll end the session with some key takeaways and any Q&A that you might have.

So Amazon Keyspaces for Apache Cassandra is a scalable, highly available, and fully managed database service. It's Apache Cassandra compatible, which means you can use your existing CQL code to create tables, manage tables, and read and write data, anything you can do with regular Cassandra. You can keep using the drivers and developer tools you already use. Using Amazon Keyspaces is as simple as changing your Cassandra endpoint to the Amazon Keyspaces service endpoint, and everything just works. You can also use your Cassandra migration tools to move whatever data you need into Amazon Keyspaces.

It is fully managed, which means there are no servers for you to manage, and no tombstones or compaction strategies to tune either. All of that is handled by Amazon Keyspaces, by us, behind the scenes for you, and we'll talk in a bit more detail about how we end up doing that when we walk through the architecture. Serverless also means you don't have to provision, configure, or operate large Cassandra clusters. Depending on whether you use provisioned or on-demand capacity mode, your tables automatically scale up and down with the load that is coming in. And it's virtually unlimited scale, which means there is no limit on the size of a table or the number of rows in it, and at that virtually unlimited size you still get single-digit millisecond performance.

It's highly available, meaning we provide a 99.999% availability SLA with multi-Region replication and a 99.99% SLA for a single-Region Keyspaces table. And it integrates with a lot of other AWS services you already use today: AWS Identity and Access Management for authentication, AWS KMS for encryption keys, and Amazon CloudWatch for metrics, so if you want to understand how your Keyspaces table is doing in terms of throttling or performance, that data is readily available for you to build dashboards on. Thousands of customers across financial services, banking, media, and entertainment are already using Amazon Keyspaces for their Cassandra workloads, and we even have customers like Intuit who have migrated 100-plus-terabyte workloads to Amazon Keyspaces.
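To make the "just change the endpoint" point concrete, here is a minimal sketch of connecting to Keyspaces with the open-source Python Cassandra driver and service-specific credentials. The Region endpoint, certificate file name, credentials, keyspace, and table below are illustrative placeholders, not values from this talk.

```python
# Minimal sketch: pointing an existing Cassandra application at the Amazon Keyspaces
# service endpoint. Credentials, region, and table names are placeholders.
from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations("sf-class2-root.crt")  # Amazon-provided root certificate
ssl_context.verify_mode = CERT_REQUIRED

# Service-specific credentials generated for an IAM user (placeholder values)
auth_provider = PlainTextAuthProvider(username="SERVICE_USER", password="SERVICE_PASSWORD")

cluster = Cluster(
    ["cassandra.us-east-1.amazonaws.com"],  # Keyspaces service endpoint for the Region
    port=9142,                              # TLS port used by Keyspaces
    ssl_context=ssl_context,
    auth_provider=auth_provider,
)
session = cluster.connect()

# Same CQL you would run against self-managed Cassandra
row = session.execute(
    "SELECT * FROM my_keyspace.my_table WHERE id = %s", ["foo"]
).one()
print(row)
```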
So looking at the architecture, let's walk through a select query. We have a client application talking to Keyspaces. Keyspaces itself uses Apache Cassandra's modular architecture, but reimagined to provide a fully managed service. What does that mean? Behind the scenes, when you use the Keyspaces endpoint, we resolve the DNS and that gives you an endpoint to connect to, which lands on one of the Amazon Keyspaces nodes. The first thing we do is authenticate and authorize your request. For this we integrate with AWS Identity and Access Management, commonly known as IAM. With IAM you get fine-grained access control policies to decide who can and can't access your tables, and you can control it per operation, for example select or insert. Once we know the request is authorized, we route it to the storage partition that holds the data. The storage layer is built on learnings from building other massive, large-scale databases that can handle millions of requests. We maintain multiple copies of the data so that it is highly durable, highly available, and secure. The storage layer also takes care of garbage collection, TTL, compaction, and the other things you normally deal with in Cassandra workloads. We'll talk in a bit more detail about how the storage is partitioned in the next few slides.

So that's a select. What happens with an insert? The request still goes through authentication and authorization, then it hits one of the storage partitions, and that partition coordinates with the other replicas in the system. By default, Amazon Keyspaces gives you local quorum consistency, which means at least two of the three copies have durably written the data before we acknowledge it back.

Zooming out, we have a client application hitting the service endpoint, but there are many service endpoints across different Availability Zones, many Keyspaces nodes, and multiple storage partitions. We show at least three Availability Zones here because we replicate the data across at least three AZs. We do this so that even if one Availability Zone is impaired, say AZ2 in this example, we re-resolve the DNS to an endpoint that is healthy, and your requests keep flowing without interruption. They are simply routed through a different endpoint that is actively working, and you should not see any availability or performance impact.

Now that we understand the request flow, let's look at how the table data itself is organized. At a high level, a table is made up of multiple storage partitions, and the data is divided so that specific parts of the table are stored on specific storage partitions. We are cognizant of how much a single partition can handle: we limit it to 3,000 RCUs or 1,000 WCUs. One RCU, or read capacity unit, covers a read returning up to 4 KB of data, so if you run a select that returns 8 KB of data, we consume two RCUs, or tokens, behind the scenes. That's with the default local quorum consistency; if you read with local one consistency, we consume half of that, because we can distribute your requests across the replicas more evenly and use more of your resources. One WCU, or write capacity unit, covers a write of up to 1 KB, and writes use local quorum by default. We cap each storage partition like this because with a known per-partition limit we can scale out without noisy-neighbor impact or the other issues you might otherwise see.
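As a quick worked example of that accounting, here is a toy calculator for how many capacity units a read or write of a given size would consume under the rules just described. This is illustrative only, not the service's internal implementation.

```python
# Toy capacity-unit calculator based on the rules described above:
# one RCU covers a LOCAL_QUORUM read of up to 4 KB, LOCAL_ONE reads cost half,
# and one WCU covers a write of up to 1 KB.
import math

def read_capacity_units(bytes_returned: int, consistency: str = "LOCAL_QUORUM") -> float:
    rcus = math.ceil(bytes_returned / 4096)        # 4 KB units, rounded up
    return rcus / 2 if consistency == "LOCAL_ONE" else float(rcus)

def write_capacity_units(bytes_written: int) -> int:
    return math.ceil(bytes_written / 1024)         # 1 KB units, rounded up

print(read_capacity_units(8 * 1024))                 # 2.0  -> the 8 KB select example
print(read_capacity_units(8 * 1024, "LOCAL_ONE"))    # 1.0  -> half with LOCAL_ONE
print(write_capacity_units(3 * 1024))                # 3    -> a 3 KB write
```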
So what happens if a customer now wants additional throughput? We dynamically increase the number of partitions, which effectively doubles the throughput, and the same mechanism gives you more storage as well. We do this without the customer really noticing; it happens behind the scenes, and customers can give us a hint by increasing the RCUs or WCUs on the table, and we scale the partitions accordingly.

This works great when traffic is uniform: the table just keeps dividing and you get, say, double the throughput. But what if there is skew in the request pattern? Take partition A in this example, and assume there is sustained traffic on two rows in it, row foo and row bar. Now your throughput is limited by that one partition. What we do in that case is this: the system is constantly monitoring for exactly this situation, and if a partition goes above a certain utilization, we automatically identify where the hotness, or heat, is coming from. We then split the partition in such a way that row foo and row bar end up in two different partitions, so you can double your throughput without any issues. This splitting of partitions helps with storage growth as well.
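Here is a small toy sketch of that heat-based split, purely to illustrate the idea. The thresholds, the heat map, and the choice of split point are illustrative assumptions, not how the service actually implements it.

```python
# Toy sketch (not the actual Keyspaces implementation): detect a hot storage
# partition and pick a split point so that the two hottest rows land in
# different child partitions.
from dataclasses import dataclass, field

PARTITION_MAX_RCU = 3000  # per-partition read limit mentioned earlier

@dataclass
class StoragePartition:
    key_range: tuple                          # (low_hash, high_hash] token range owned
    heat: dict = field(default_factory=dict)  # key_hash -> consumed RCU per second

    def utilization(self) -> float:
        return sum(self.heat.values()) / PARTITION_MAX_RCU

def maybe_split(p: StoragePartition, threshold: float = 0.8) -> list:
    """If the partition is running hot, split its range between the two hottest keys."""
    if p.utilization() < threshold or len(p.heat) < 2:
        return [p]                            # nothing to do
    hottest = sorted(p.heat, key=p.heat.get, reverse=True)[:2]
    split_point = (min(hottest) + max(hottest)) // 2   # any point separating the hot keys
    low, high = p.key_range
    left = StoragePartition((low, split_point),
                            {k: v for k, v in p.heat.items() if k <= split_point})
    right = StoragePartition((split_point, high),
                             {k: v for k, v in p.heat.items() if k > split_point})
    return [left, right]

# Example: the hashes of row foo and row bar sit close together and together
# exceed the per-partition limit, so the partition splits between them.
hot = StoragePartition((0, 1000), {400: 2000, 410: 1800})
print([child.key_range for child in maybe_split(hot)])   # -> [(0, 405), (405, 1000)]
```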
So that's the high-level architecture. Now Rohan will walk us through some of the features we have today and how we built them, like point-in-time recovery and time to live, features that you also see with Cassandra.

Hey everyone. My name is Rohan Gupta, and I'm a software engineer on the Amazon Keyspaces team. Today I'll be walking you through some of the key features offered by Amazon Keyspaces and how we actually build them.

Starting with point-in-time recovery, or PITR. PITR helps you recover your data in case of accidental deletes or updates, and it has absolutely no impact on the performance, scalability, or availability of the table. With PITR you can restore your table to any second in time within the past 35 days. You choose which tables to enable PITR on, and pricing is based on the size of the table. We highly recommend enabling PITR on all your production tables.

So how did we actually go and build PITR behind the scenes? Let's dive a bit deeper into the architecture. As Rashmi mentioned before, a table is made up of multiple storage partitions. Take a table with three partitions: a blue partition, a purple partition, and an orange partition. We build logs for each partition independently and store them in an S3 bucket; we periodically and continuously back these logs up to S3. In addition to the logs, we also take snapshots of the table data from time to time, and we encrypt those snapshots with an AWS KMS key or a customer-provided CMK. Why do we take snapshots at all? It's an optimization over storing only logs: if we tried to build the restored copy of the table from logs alone, it would take much longer, which increases the total time a restore takes. So snapshots reduce the time needed for PITR. For each partition of a table we store the logs and snapshots together, so at any point in time we have multiple snapshots and logs for the table in S3.

Now say you, as a customer, delete some data you didn't intend to, or make an update you weren't supposed to. Keyspaces makes it really simple and easy to recover your data to a previous good state. We take a restore time from you and first validate that it's a valid restore time, within the past 35 days. Once that's validated, we go into S3 and look for the snapshot with the most recent version of your data before that time. Then, given the snapshot time and the restore time, we fetch the logs between them and apply forward any changes that happened after the snapshot. We do this for all the partitions, and that results in the whole restored table.

Moving on to another feature, time to live, or TTL. With TTL, you can insert a row or a cell with a TTL value in seconds, and the row disappears after that many seconds. Queries don't return expired rows or cells, and the records are removed from storage by a background compaction process, typically within 10 days. This has no impact on the performance or availability of the table, because the process is completely asynchronous and runs in the background.

So how do we actually do the compaction in the background? Let's dive a bit deeper. We have a fleet of hosts for this particular job. I already mentioned that we continuously take snapshots of the table and store them in S3 for PITR, and we use those same snapshots to do the TTL compaction as well. Why use the snapshots? If we went and read the table directly, that would be a very costly operation for both us and the customer. So instead, the TTL hosts go to S3, get the latest snapshot for a partition, compact it by removing the expired cells and rows, and then write the result back to the storage partitions. All of this happens in the background, so it has no performance impact.
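Circling back to the customer-facing side of TTL, here is a small usage sketch. It assumes the session object from the connection example earlier, a hypothetical table demo_ks.user_sessions, and that TTL has been enabled on that table.

```python
# Hypothetical table: demo_ks.user_sessions(user_id text PRIMARY KEY, token text),
# with TTL enabled on the table. The row below stops being returned by queries
# one hour after the write; the background compaction removes it from storage later.
session.execute(
    "INSERT INTO demo_ks.user_sessions (user_id, token) "
    "VALUES (%s, %s) USING TTL 3600",
    ["user-123", "token-abc"],
)
```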
Now let's look at another feature we recently launched, multi-Region replication. Multi-Region replication gives you fully managed, active-active replication across AWS Regions, which helps customers build globally distributed applications: you can read and write data in any of the configured Regions, and the replication lag between Regions is typically under a second.

So how does multi-Region replication actually work? First, the application writes data to the table in a given Region. The data is durably written in that local Region across three AZs, and we send the success back to the application. Then a replication service ships the log records to the other Regions configured in the multi-Region keyspace, and those Regions apply the writes. That replication step happens asynchronously in Keyspaces.

What about handling conflicts? In the Keyspaces architecture we don't have a leader, so all Regions can accept writes at the same time. How do we handle conflicts for the same keys? In Keyspaces, the last writer wins; that's the method of data reconciliation we use, and all Regions agree on the latest update based on the cell-level timestamp. There can be cases where the timestamps are equal. In that case we go with the larger value of the cell, using a strict ordering defined per data type. For strings, we take the lexicographically larger string, so "jello" is greater than "hello" and "jello" would be accepted. For integers, it's the larger integer. For collections, we apply a canonical ordering to the elements and pick the larger one. All nodes in Keyspaces, across all the Regions, follow the same reconciliation process, which makes the whole thing deterministic. That results in the same image of the item across all Regions, and any conflict or divergence in Keyspaces is handled automatically.
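Here is a toy illustration of that last-writer-wins rule with the timestamp tie-break, just to show why applying the same deterministic ordering everywhere makes the Regions converge. This is a sketch of the idea, not service code.

```python
# Toy last-writer-wins reconciliation: higher cell timestamp wins; an equal
# timestamp is broken by taking the "larger" value under a type-specific ordering.
from dataclasses import dataclass
from typing import Any

@dataclass
class Cell:
    value: Any
    timestamp: int  # cell-level write timestamp

def reconcile(a: Cell, b: Cell) -> Cell:
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Tie-break deterministically on the value: lexicographic for strings,
    # numeric for integers (collections would use a canonical ordering).
    return a if a.value > b.value else b

# Two Regions write different strings at the same timestamp; both apply the
# same rule, so both end up keeping "jello" ('j' sorts after 'h').
print(reconcile(Cell("jello", 100), Cell("hello", 100)).value)   # -> "jello"
print(reconcile(Cell(7, 205), Cell(9, 200)).value)               # -> 7 (newer timestamp wins)
```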
Moving on to some of the key takeaways. Amazon Keyspaces is serverless, which means customers don't have to provision, patch, or manage any servers, and they don't have to install, maintain, or operate any kind of software. Customers don't have to worry about configuring things like compaction strategies, managing tombstones, JVMs, garbage collection, et cetera. A table is made of multiple storage partitions, which are spread across multiple servers and Availability Zones in a Region. We offer heat-based traffic isolation and partition splits for better throughput and more storage, and virtually unlimited storage and throughput for a table. And finally, it's fully managed with per-table granularity: many of our features, like PITR and TTL, work at the table level, and enabling them does not impact the performance or the availability of the table.

We have some other sessions going on at the summit that are quite informative, and a workshop tomorrow at 11:50 that you can attend, and we have a dinner reception. If you have any detailed questions, we also have a booth where you can drop by and we can answer them. And yeah, we're open to any questions. Yep, please go ahead.

So behind the scenes, at the end of the day, Cassandra is basically a key-value lookup. For a given key, we calculate what we call a hash of that key, and based on that hash we figure out which is the right partition to store the data on or route the request to. One thing we didn't talk about in these slides is that we also have metadata systems that provide that mapping. So if a storage partition becomes bigger and has to split, that is pushed back, either directly from the storage partitions to the Keyspaces nodes or through the metadata systems; we get a hint back saying this data has now split, these are the new partitions, and this is where you need to route your next request. The split itself is an atomic operation: a partition with its three replicas effectively becomes two child partitions, each with their own replicas, and at any point in time there's only one partition taking a given request.

On performance, we do have data on how Keyspaces performs, in terms of single-digit millisecond performance and so on. If you're looking for exact benchmarks, come by the booth and we can talk in more detail about exactly what kind of performance you're looking at, whether it's availability, latencies, or cost-performance, because there are usually various factors that come into the picture for a specific workload. Go ahead.

I can take that one. Currently we don't support adding replication Regions to an existing multi-Region keyspace, but we do support creating new multi-Region keyspaces and tables, net new. So we can't support that as of now, but we do have certain timelines; come to our booth and the product team can help you with that.

On Cassandra feature parity: Keyspaces features don't work in exactly the same fashion in terms of the version deprecations you see with Cassandra 3.11.x. That said, we do have a feature roadmap to support more and more of the features you also see with Cassandra 5 and beyond. If there is a specific feature you're interested in, we can definitely work with our product team, who are here, to understand what you need and where it lands on the roadmap.

And when you say performance, are you talking about price-performance? Got it. Currently we don't provide any SLA for latencies. What we aim for is single-digit millisecond latencies at any scale, but we don't provide a specific latency SLA. You had a question?

Yes, we have that. We integrate with Amazon CloudWatch, so you can see your metrics for inserts, selects, and so on, and monitor your latencies using CloudWatch. There are usually different classes of metrics. If you're looking at performance, you can look at your latencies per operation, per table. We also publish throttles, and capacity utilization metrics, for example provisioned capacity versus consumed capacity. So there is an array of different types of metrics. Kind of, yes: the service-level metrics you see in CloudWatch are aggregated at the operation and table level. If you want more detailed metrics, say for a specific key while you're debugging something, one thing you can do is integrate with other systems like AWS X-Ray to get detailed metrics of what your queries are and how long they took at the client level as well. To add to that, we also have CloudTrail integration for DDL operations, so that gives you insight too, especially if you want to audit the schema changes going on in your system.
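For reference, here is a hedged sketch of pulling one of those table-level metrics with boto3. The "AWS/Cassandra" namespace is where Keyspaces publishes, but the metric and dimension names below are illustrative and worth confirming against what the CloudWatch console shows for your table.

```python
# Hedged sketch: fetching a Keyspaces latency metric from Amazon CloudWatch.
# Metric and dimension names are assumptions; check the console for your table.
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
now = datetime.now(timezone.utc)

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/Cassandra",
    MetricName="SuccessfulRequestLatency",
    Dimensions=[
        {"Name": "Keyspace", "Value": "my_keyspace"},
        {"Name": "TableName", "Value": "my_table"},
        {"Name": "Operation", "Value": "SELECT"},
    ],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,                      # 5-minute buckets
    Statistics=["Average", "Maximum"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], point["Maximum"])
```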
No, not right now. Are you asking about text search, or vector search? Just curious about your use case, got it. No, right now we don't provide that; it's basically still a key-value lookup. I think that's it, yeah. Thank you. Thank you for coming. Thank you.