The Carnegie Mellon Vaccination Database Talks are made possible by OtterTune. Learn how to automatically optimize your MySQL and PostgreSQL configurations at ottertune.com. And by the Steven Moy Foundation for Keeping It Real; find out how best to keep it real at stevenmoyfoundation.org.

We're excited today to have Dr. Ippokratis Pandis from Amazon Redshift, talking about the interesting things that they've been working on for the past seven years now. Ippokratis is a Senior Principal Engineer at Amazon, and he works on Redshift. He's worked at Cloudera, he worked at IBM Research, so he's been involved in various database projects in industry. But prior to that, he got his PhD from Carnegie Mellon in 2012, also in databases. So he's been around databases for a long time. So with that, Ippokratis, the floor is yours. Real quickly, if anybody has questions for Ippokratis as he's speaking, please unmute yourself, say who you are and where you're coming from, and feel free to ask questions at any time. We want this to be interactive and not have Ippokratis talking to himself the whole hour. So with that, Ippokratis, thank you for being here.

Thank you, Andy. Thank you for the invitation. Hello, everybody. So my name is Ippokratis, and I work for Amazon Redshift. Today I'm going to talk to you about the world's new oil: data. In particular, I'm going to talk about data management and analytics. Analytics is a very broad term, and there is a very large industry built around it, but in general it consists of three big steps: we generate and collect a lot of data, we store it, and we analyze it to drive our day-to-day business decisions. It has never been easier to collect data. We have these devices that emit a lot of data; we have the click streams, the web logs; we have all sorts of interactions that result in us collecting more data. And it has never been easier or cheaper to store this data. We have disks in the cloud, where it costs approximately $20 per terabyte per month to store your data, or we even have these tapes in the cloud, where it costs as low as $1 per terabyte per month. So we are very well covered in terms of data generation, collection, and storage. But when it comes to analyzing the data, a lot of challenges emerge. We have an ever-increasing appetite for higher scalability, higher concurrency, and lower latency in our workloads. We have new types of data that we are being asked to analyze. There is always a requirement for ease of use in terms of what tools you can use. And obviously, you need to do all these things in a very secure way. Because of all this, one trend that has been observed in the industry is what people used to call the dark data problem: the amount of data we collect and store grows almost exponentially, whereas the amount of data we actually analyze for our day-to-day business decisions does not grow as fast; in fact, it grows linearly. And if you don't believe the analysts who produce these nice graphs, we actually did an internal audit of our own systems, and we did see this ever-increasing gap between the amount of data generated and the amount of data we use for analytics. Amazon observed this trend early on, and what we built was Amazon Redshift. Amazon Redshift was the first cloud data warehouse out there. It has been around since 2013, and it is one of the most popular and fastest cloud data warehouses.
In particular, what we have said publicly is that every day we have tens of thousands of customers that process exabytes of data, on AWS's global infrastructure of 25 geographic regions and 81 availability zones. And when we say these things, we're very careful about when we use plural and not singular. When we say exabytes a day, it means that today, where we're already past midday here on the West Coast, our systems have already processed over an exabyte of data for the day. This is quite impressive. This is what we call big data management.

So what is this talk about, and why am I here today? Today I'm going to talk about the evolution of Redshift over the years. In particular, I'm going to talk about how we evolved the system since 2013, when the architecture of the system, if you will, looked like the simple picture you can see on the screen, to what the system looks like today, eight years later. You can see that the system has evolved a lot over the years, driven primarily by the requirements posed by our customers and the new use cases we had to address. So we're going to open the hood a bit, look underneath, and see what Redshift looks like and how it has evolved over the years. And as Andy said, let's make this session as interactive as possible. If you have any questions, please feel free to raise your hand and ask.

All right, so let's get going. Redshift is a big team, and we primarily focus our energy and our resources on five big thematic categories. The first one, and that's why I put number zero there, and probably the most important one, is security and availability. Our customers trust us with their data. They ask us to keep it safe, keep it secure, and make it available for processing at every point in time. Because of that, this is the highest priority in our systems: we need to make sure that we are secure and available. I'm not going to talk a lot about the security and availability work; I'm going to jump to the next one, which is performance. Redshift, in its DNA, if you will, has been about providing the highest performance one can get. So let's see how we achieve high analytics performance.

All right, in order to understand how Redshift works, let's look at the system. A Redshift endpoint, a Redshift data warehouse, looks like the following: there is a Redshift compute layer, and then there is Redshift managed storage. Every endpoint, every Redshift compute cluster, consists of two different types of nodes. The entrance point to Redshift is the leader node, the one depicted with a green box there. This is the entrance to the system. This is where customers connect, using their tool of choice: JDBC, ODBC, or the Data API. The leader node is responsible for connectivity, but also for admission control, query planning, query coordination, and scheduling in general for this endpoint. The customer connects to this endpoint, the leader node gets the request, and the request gets parsed and logically rewritten. We check with the catalog, with the metadata, for statistics and other useful information. We go to the cost-based optimizer to do cost-based optimization, and the system generates a distributed query execution plan that may look like the one I have on the right-hand side.
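To make that flow concrete, here is a minimal, hypothetical sketch of the leader-node stages just described. All of the types and function names below are invented for exposition; they are not Redshift's actual internals:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-ins for the leader-node pipeline stages.
struct Ast {};             // parsed SQL
struct LogicalPlan {};     // after logical rewrites and catalog lookups
struct PhysicalPlan {};    // after cost-based optimization
struct QueryFragment {     // one unit of a distributed execution plan
    std::string generated_cpp;  // C++ source generated for this fragment
};

Ast parse(const std::string& sql) { return {}; }
LogicalPlan rewrite(const Ast&) { return {}; }            // consults catalog/statistics
PhysicalPlan optimize(const LogicalPlan&) { return {}; }  // cost-based optimizer
std::vector<QueryFragment> codegen(const PhysicalPlan&) {
    return {{"/* generated per-fragment C++ */"}};
}

// The leader node turns a SQL request into compiled fragments that are
// shipped to the compute nodes, each responsible for its data slices.
std::vector<QueryFragment> plan_on_leader(const std::string& sql) {
    return codegen(optimize(rewrite(parse(sql))));
}

int main() {
    auto fragments = plan_on_leader("SELECT 1");
    std::printf("%zu fragment(s) ready to ship\n", fragments.size());
}
```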
Once we decide on the distributed query execution plan, what Redshift does, which is quite interesting, is that it opens a C++ file and generates code. It generates code at runtime for every distributed query fragment that we have. We generate this C++ code and we compile it, unless we find it already in our compiled-code cache; I'm going to talk more about that. We take the executable of this query fragment and send it down to the compute nodes. The compute nodes of Redshift are assigned to different partitions of the data, or data slices, as we call them internally. Every compute node takes the executable and tries to execute it as fast as it can. In order to do that, it starts scanning the data (this is a pure columnar system), and we do all sorts of good optimizations there to improve performance. For example, we use min-max pruning. We have AVX-vectorized code that performs the scans. We use vectorized-execution-friendly encodings so we can do a lot of operations directly on compressed data, and so on and so forth. By doing so, we are able to offer pretty good performance.

In order to understand how strong the core execution of Redshift is, let's look at a more specific example. Say you have a simple query like the one on this slide, which scans and joins two tables, applies a predicate, and calculates an aggregate on top of that. When we get a query like that, Redshift ends up generating code that looks like what I have on the left-hand side, where you can see a tight loop that takes values, applies the predicate, probes the hash table to do the join, and performs the aggregation. Very efficient, very simple code that fits on a single PowerPoint slide. Because this scalar code as-is would not be that efficient, we do a lot of optimizations on top of it. For example, we do cache-line prefetching in order to hide the data access latencies. In particular, we have a small L1-cache-resident buffer where we pull in values for N tuples ahead of us, and then we actually process the tuple that is at the top of this buffer. So we do cache-line prefetching to minimize the data access latencies. On top of that...

How much is that cache-line prefetching tuned to different versions of Xeon? Actually, you guys run on Graviton too, right? So you have to deal with ARM stuff as well?

Redshift currently does not run on Graviton; we support x86 instances. We support a limited set of instances, so we do spend time tuning for the limited set of instances we run on. Our code does not have to be extremely generic; we specialize for the instance types we offer the service on. And the code I just presented is actually the scalar version of the code. It's simple, but the code we actually execute is AVX2-enabled, so many of these instructions do not look like that. It's AVX2. It's not that difficult; I could show you the corresponding code. It's easy to parse, but it's not as simple as the one I just showed. And by doing all these things, and obviously we do min-max pruning, we do late materialization, we don't touch columns and payloads that are not being used, and so on, we manage to get very good performance. The core performance of Redshift is pretty good.
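To make the prefetching idea concrete, here is a hand-written, scalar sketch in the spirit of the generated loop described above. The real generated code is specialized per query and AVX2-vectorized, and Redshift's buffer scheme differs from this simplified look-ahead variant; all structures here are invented for illustration:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Invented structures: a minimal open-addressing hash table (capacity must
// be a power of two; collision handling omitted for brevity) so that a
// probe is a single array access we can prefetch.
struct Slot { int64_t key; int64_t payload; bool used; };

struct HashTable {
    std::vector<Slot> slots;
    explicit HashTable(size_t capacity) : slots(capacity, Slot{0, 0, false}) {}
    size_t index(int64_t key) const {
        return static_cast<size_t>(key) & (slots.size() - 1);
    }
};

constexpr size_t kPrefetchDistance = 8;  // "N tuples ahead"; an assumed knob

// Tight scan -> predicate -> probe -> aggregate loop, one tuple at a time.
int64_t scan_join_aggregate(const std::vector<int64_t>& keys,
                            const std::vector<int64_t>& measures,
                            const HashTable& ht,
                            int64_t threshold) {
    int64_t sum = 0;
    const size_t n = keys.size();
    for (size_t i = 0; i < n; ++i) {
        // Prefetch the hash-table slot that a tuple kPrefetchDistance ahead
        // will probe, so the probe below hits cache instead of stalling on DRAM.
        if (i + kPrefetchDistance < n)
            __builtin_prefetch(&ht.slots[ht.index(keys[i + kPrefetchDistance])]);

        if (measures[i] < threshold) continue;        // apply the predicate
        const Slot& s = ht.slots[ht.index(keys[i])];  // probe the join table
        if (s.used && s.key == keys[i])
            sum += measures[i];                       // aggregate the matches
    }
    return sum;
}

int main() {
    HashTable ht(1024);
    std::vector<int64_t> keys, measures;
    for (int64_t k = 0; k < 512; ++k) {
        ht.slots[ht.index(k)] = Slot{k, 0, true};  // build side of the join
        keys.push_back(k);
        measures.push_back(k % 100);               // probe side
    }
    std::printf("sum = %lld\n",
                static_cast<long long>(scan_join_aggregate(keys, measures, ht, 50)));
}
```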
A big chunk of our team is focused on continuously improving performance, and the way we do it is through telemetry. In this graph, I'm showing how much the performance of Redshift improved while running the Cloud Data Warehouse benchmark, modeled after TPC-DS, over the course of six months. We were able to improve the performance of the system by 3.5x. The way we did it is by looking at telemetry, collecting aggregated, anonymized telemetry from the fleet, seeing where the system spends most of its time, understanding where there are opportunities to improve performance, implementing those improvements, and moving on to the next bottleneck. And by playing this whack-a-mole game of finding bottlenecks, fixing them, and going to the next one, over six months we were able to improve the performance of the system by over three times.

There are also many things you can do in the cloud that were simply not possible in the data management world before the cloud, the one we read about in the cow book, if you will. For example, one change we did recently, which helped the performance of Redshift significantly, has been the compilation-as-a-service feature. What happens is the following. Redshift generates this C++ code that it then compiles and sends down to the compute nodes. But if you have to do that on every query, you are essentially adding a GCC compilation to the query execution hot path, which can be very expensive relative to the execution time of the particular query. So, to minimize this latency, what Redshift had been doing since the beginning of the service was to keep a small cache on every leader node: have I executed this query fragment in the past? It looks at the signature of the generated code for the fragment, pulls the compiled executable from the cache, and sends it immediately down to the compute nodes without having to invoke GCC. That was working pretty well. In particular, we were getting a compiled-code cache hit rate of over 99.5%, so only five out of a thousand queries actually had to compile. But that was not enough, especially as more applications were onboarded to Redshift, some of them with very low latency SLAs. Because of that, we built a global cache: every time we see a query fragment that we have not executed in the past, we push the generated code of this fragment into a global cache. Then we have a farm of machines that compile the fragment and put it in a global place where all the clusters in Redshift can consume it. By doing so, we were able to reduce the cache misses by another order of magnitude; in particular, we went from a 99.5% hit rate to over 99.96%. That had a very significant impact on the performance of Redshift workloads. The insight into why this works so well is that there are only so many types of different queries that customers run on a massive fleet like the one Redshift has. There are only so many tables with four integers or five integers, and only so many queries with one predicate or two predicates, and so on.
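A minimal sketch of what such a two-level lookup could look like: the local leader-node cache is consulted first, then the fleet-wide global cache, and GCC is invoked only on a double miss. Everything here (the names, the signature scheme, the stubbed-in global cache) is hypothetical, not Redshift's implementation:

```cpp
#include <cstdio>
#include <functional>
#include <string>
#include <unordered_map>

using Signature  = size_t;
using Executable = std::string;  // stand-in for a compiled object file

Signature fragment_signature(const std::string& generated_cpp) {
    return std::hash<std::string>{}(generated_cpp);  // keyed by the generated code
}

// Stub for the fleet-wide cache service; in reality this would be a remote call.
std::unordered_map<Signature, Executable>& global_cache() {
    static std::unordered_map<Signature, Executable> cache;
    return cache;
}

Executable gcc_compile(const std::string& generated_cpp) {
    return "compiled:" + generated_cpp;  // stands in for an expensive GCC run
}

Executable get_executable(const std::string& generated_cpp,
                          std::unordered_map<Signature, Executable>& local_cache) {
    const Signature sig = fragment_signature(generated_cpp);

    if (auto it = local_cache.find(sig); it != local_cache.end())
        return it->second;                     // local hit: the ~99.5% case

    if (auto it = global_cache().find(sig); it != global_cache().end()) {
        local_cache.emplace(sig, it->second);  // global hit: pushes hits past 99.9%
        return it->second;
    }

    // Double miss: a truly novel fragment. Compile, then publish so every
    // other cluster in the fleet can reuse the result.
    Executable exe = gcc_compile(generated_cpp);
    local_cache.emplace(sig, exe);
    global_cache().emplace(sig, exe);
    return exe;
}

int main() {
    std::unordered_map<Signature, Executable> local_cache;
    get_executable("/* fragment A */", local_cache);                 // miss: compiles
    Executable e = get_executable("/* fragment A */", local_cache);  // local hit
    std::printf("%s\n", e.c_str());
}
```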
And by leveraging this commonality across queries, we were able to reduce the compiled-code cache misses by over an order of magnitude. So, we have been focusing a lot on performance, and right now we are leading the pack when it comes to price-performance against other popular cloud data warehouses. In this slide, I'm plotting the price-performance of Redshift against other popular cloud data warehouses running the Cloud Data Warehouse benchmark, modeled after TPC-DS, in two different settings: one out of the box, where customers just create tables and load data without doing any tuning, and one where customers actually spend some time tuning the workload and picking the proper physical design. As you can see, Redshift provides better price-performance, from 20% better up to 3.8x better.

What is the performance difference between Redshift out of the box and Redshift tuned?

I don't remember the exact number, but this is something we obviously monitor, and we try to minimize that gap. I'm going to talk a little more later about what we're doing on this front; Andy, if you can wait a few slides, I will touch on this.

And what makes the drop for Snowflake so significant?

I won't name any names here.

All right. The other thing that is actually even more impressive about Redshift is the scalability. In this plot, I am showing how much time it took Redshift to run the Cloud Data Warehouse benchmark, based on TPC-DS, when we increase the data set size from 10 terabytes all the way up to one petabyte, so two orders of magnitude more data, and we proportionally increase the size of the Redshift compute cluster running the benchmark. You can see that the workload execution time remains pretty much the same, even though we increased the amount of data we process by two orders of magnitude (there is a simple model behind this; see the sketch at the end of this section). So customers get predictable performance and scalability: they get pretty much the same performance by adding proportionally more hardware as the data set size grows. That makes provisioning, planning, and all these difficult tasks much easier.

And by the way, since we have been doing well on performance, we also have the ability to take bigger bets and look a little further ahead at where the hardware, for example, is going. One observation we made there was an ever-increasing gap: the SSD bandwidth we are getting has been increasing at a much higher rate than the CPU or memory bandwidth, say over the last nine years. Given this observation, one thing that makes a lot of sense is to push computation down to where the data lives. One component that we recently built is called AQUA, which is essentially a custom, AWS-designed processor that performs a lot of the processing on the wire, between the disks and the processors. We have been working on AQUA, and right now it accelerates a bunch of operations at the scan and aggregation layer. So we are essentially also designing and building our own processors in order to give even bigger boosts to the performance of Redshift. And that kind of concludes the first area where we are putting a lot of our energy.
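The simple model behind that scalability result, as promised above: if workload time is roughly proportional to data size divided by the number of compute nodes, then scaling both by the same factor leaves the runtime unchanged. In the experiment just described the factor is 100:

$$T \approx c\,\frac{D}{N} \quad\Longrightarrow\quad T' \approx c\,\frac{kD}{kN} = T, \qquad k = 100.$$

This is only a first-order model, of course; it assumes the workload partitions evenly across nodes and that coordination overheads do not grow with cluster size.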
That first area, where we focus a lot of our energy, is the core performance of the system. Once we did that, customers were happy with the performance of Redshift (we have Steven here, hopefully happy with the performance of the system). But then they asked: we like the core performance, but we would like to put more data into the system, and we would like even more concurrent users. And that's what we did. In this next section, I'm going to talk about storage and compute elasticity.

Probably the biggest architectural change in Redshift since the 2013 launch of the service has been Redshift managed storage. We disaggregated storage from compute and built a disaggregated storage layer, which we call Redshift managed storage, that lives outside of any Redshift compute cluster. By doing so, we can do many things and provide services that we were not able to offer before. For example, since the committed data, the source of truth, is now disaggregated from any Redshift compute cluster, we can offer guarantees such as zero data loss. For example, one feature that we released last year has been the ability to relocate clusters across availability zones: in the very rare case where something happens to a cluster in one availability zone, you can press a button, move the cluster to a different availability zone, and continue processing from where the cluster left off.

The other area where we put a lot of energy has been compute elasticity. We offer a couple of options to our customers to improve the throughput of the system. The first thing we did was to offer something we call elastic resize, so that customers connected to a Redshift compute cluster can, with the click of a button, increase or decrease the size of the cluster to meet the SLAs of the workload they are running. What elastic resize does is add compute nodes to the system, or remove them, then change the assignment of data slices across the new compute cluster and continue processing from there. Customers get proportional improvements or decreases in performance. So with this feature, we are able to tailor the latency of individual queries to the needs of the customer. But if the number of concurrent jobs being thrown at the system fully utilizes the resources of a single Redshift compute cluster, we built a feature we call Redshift Concurrency Scaling, which allows the system to auto-scale and adapt to bursts of activity submitted to the compute cluster. The way it works is that the customer remains connected to the single Redshift endpoint and keeps submitting jobs as before. But if the system observes queuing, because the resources of the cluster (CPU, memory, I/O) are at full capacity, it acquires additional, equally sized Redshift clusters and starts spilling jobs over to the newly acquired clusters. Such a cluster resumes operation by consuming data from Redshift managed storage. When the burst of activity dies down, we release the additionally acquired clusters, and the system contracts back to its original state.
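Here is a tiny, self-contained simulation of that acquire, spill-over, and release cycle. The tick loop, thresholds, and state below are all invented for illustration; it shows only the shape of the control loop, not Redshift's actual scaling policy:

```cpp
#include <cstdio>
#include <vector>

// Simulated state only; thresholds and the tick loop are invented.
struct Cluster { int id; };

int main() {
    int queued = 12;            // a burst of queries just arrived
    double utilization = 0.97;  // main cluster is saturated (CPU/memory/IO)
    std::vector<Cluster> burst; // additionally acquired, equally sized clusters
    int next_id = 1;

    for (int tick = 0; tick < 10; ++tick) {
        // Saturation plus queuing triggers acquisition of a burst cluster,
        // which resumes by reading the same committed data from managed storage.
        if (utilization > 0.90 && queued > 0)
            burst.push_back({next_id++});

        for (auto& c : burst)   // spill queued queries over to burst clusters
            if (queued > 0) {
                --queued;
                std::printf("tick %d: query -> burst cluster %d\n", tick, c.id);
            }

        if (queued == 0 && !burst.empty()) {  // burst is over: contract back
            std::printf("tick %d: releasing burst cluster %d\n", tick, burst.back().id);
            burst.pop_back();
            utilization = 0.50;  // pretend load returned to normal
        }
    }
}
```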
By doing so, we are able to improve the throughput of the system almost linearly. In this graph, I am plotting the throughput of a Redshift endpoint while it runs the Cloud Data Warehouse benchmark, as an increasing number of concurrent users connect and submit queries from this benchmark with zero think time. This experiment emulates a typical scenario where a much bigger number of concurrent users runs an intensive workload. Back in January 2019, the maximum throughput one could get out of a four-node DC2.8XL cluster was approximately 200 queries per hour, with five concurrent users. By April of that year, we were able to reach approximately 1,000 queries per hour with up to 50 concurrent users. By August of that year, we were able to reach 180 concurrent users getting a bit less than 8,000 queries per hour. And by November of that year, we were able to achieve over 12,000 queries per hour with 220 concurrent users submitting queries with zero think time. So within a few months, within one calendar year, and without the customer having to make any change to the application, the system achieved over a 60x improvement in concurrency and throughput, which is kind of impressive. And by the way, since this is an experiment with zero think time, where we achieve this linear scalability with up to 220 concurrent users, it emulates a real scenario where you can have thousands of concurrent users on this endpoint. So with elastic resize, we are able to increase and decrease the size of a Redshift compute cluster to meet the requirements for individual query latency, and with concurrency scaling, we are able to auto-scale and increase the throughput of the system when there are bursts of activity.

The next thing our customers said is: we really like all these improvements you have done on compute elasticity, but ideally we would like to pop up individual, isolated endpoints that we can use for the needs of each line of business or department. So that's what we did. This year we launched a feature which we call data sharing, where a user can be using a Redshift compute cluster, committing data through this endpoint, and then pop up independent, isolated endpoints that consume this data. These consumer compute clusters see a state of the data with the same strong transactional guarantees that we offer on the main Redshift producer cluster. In particular, what Redshift guarantees is transactional, serializable snapshot isolation, and we guarantee that even when independent consumer clusters consume from a main producer cluster. So now customers have these nice compute elasticity capabilities within every independent Redshift compute cluster, and they can also pop up independent clusters that keep doing the same thing. The options they get are pretty wide. By doing so, we were able to address the requirements for elasticity both in the storage and in the compute layer of Redshift.

The next thing our customers came back and told us is: we really like the performance, we really like the elasticity of Redshift, but we would like the system to be much easier to use. And that obviously speaks close to this audience's recent research interests. So they told us, you know, we would like the system to be easier to use.
And also, we would like to minimize the amount of work somebody has to do in order to get good performance out of Redshift. So a big chunk of our energy is being spent on building autonomics into the system, and there are several areas where we use autonomics. The first one is operations. As I told you at the beginning of the talk, Redshift is a very, very large operation: we have tens of thousands of customers in 25 geographic regions and 81 availability zones. When you have such a big operation, there is always something that may not be working as it's supposed to. There are two types of errors. There are the easy errors, where something dies: something in a machine goes down, you don't get a response, you know it is dead after a while, you kick off an automated remediation process to fix the problem, and life is good. Where things are more difficult is the case of gray errors. Say you have a NIC in a machine that is not very well attached to the motherboard, or you have a top-of-rack router that is dropping packets. Being able to detect these types of errors and kick off the right automated remediation actions is quite challenging. Sometimes customers have to monitor and ping us and say: hey, the performance of my system is not as good as it used to be, can you please look at it? So that's what we did: we built automated monitoring systems that detect these outliers, and when we get confidence that something is not going well, we kick off the automated remediation. So that is one area where we are building a lot of smartness, a lot of automation, to improve the operational health of our systems. That was at the fleet level.

Then, when you look at an individual data warehouse, we spend a lot of energy automating all the mundane maintenance operations that one has to do on a traditional data warehouse. For example, in times of low or no activity, we have a list of maintenance actions we can take. We may decide to kick off an ANALYZE on a particular table that is very popular and has outdated statistics. Or we may decide to vacuum a table, or maintain some materialized views, and so on and so forth. The way it works is that we monitor the usage of a cluster and rank the individual maintenance jobs we have to do by priority. Then, in times of low activity, we kick off those operations, when we believe that performing a maintenance operation will help the performance of the system (a small illustrative sketch of this ranking idea follows below). In order to be opinionated and perform these types of operations, you need algorithms that give you confidence that the maintenance operation you are doing is beneficial. For example, there is a paper at VLDB 2020 where we give some details about the recommendation algorithm we use to recommend a distribution key for a warehouse, and we have built similar algorithms to decide about, for example, the sort key of a table, and so on. This was the first generation of the recommendations we built. The next thing we did, a feature we launched last year, was what we call automatic table optimization.
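A minimal sketch of the ranking idea just described: pending maintenance actions are ordered by estimated benefit and drained only while the cluster is quiet. The scores, thresholds, and action names are invented for illustration:

```cpp
#include <cstdio>
#include <queue>
#include <string>

// Rank pending maintenance (ANALYZE, VACUUM, materialized-view refresh) by
// estimated benefit; run actions only in windows of low cluster activity.
struct MaintenanceAction {
    std::string description;
    double expected_benefit;  // e.g. predicted query-latency improvement
    bool operator<(const MaintenanceAction& o) const {
        return expected_benefit < o.expected_benefit;  // max-heap by benefit
    }
};

int main() {
    std::priority_queue<MaintenanceAction> pending;
    pending.push({"ANALYZE popular_table (stale statistics)", 0.40});
    pending.push({"VACUUM orders (many deleted rows)",        0.25});
    pending.push({"REFRESH mv_daily_sales",                   0.10});

    double cluster_utilization = 0.15;  // pretend we are in a quiet window
    while (!pending.empty() && cluster_utilization < 0.30) {
        MaintenanceAction a = pending.top();
        pending.pop();
        // Only act when we are confident the action pays for itself.
        if (a.expected_benefit > 0.05)
            std::printf("running: %s\n", a.description.c_str());
    }
}
```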
With automatic table optimization, we not only make recommendations to our customers about changes to their physical design; the system goes and applies these changes automatically itself. The customer declares the table: this is the schema of the table, and it is an AUTO table. You, the system, manage it; you decide about the physical design; I don't want to have to care about it. What we do is monitor the workload, make decisions, and apply them. For example, in this graph I'm plotting the performance of Redshift running the Cloud Data Warehouse benchmark. What you see is that, within 48 hours, within two days, where it initially took 100 minutes to run this workload, once the automated table optimizations kicked in, the latency to execute the workload improved by 40%, from 100 minutes down to 60 minutes, without the customer having to do anything.

Machine learning is pretty good in many respects, and one of the most interesting is that it helps you do nice categorization of jobs. We have been using this widely in the system. The first thing we did there was a very simple classifier: when the system gets a job, it quickly determines whether this is a big job or a small, fast job. By having this separation of small, fast jobs and big jobs, we are able to schedule them accordingly. We take the small jobs and try to get them into and out of the system as fast as we can, essentially without admission control. By doing so, we increase the throughput of the system, because we get all the small jobs in and out as fast as we can. The next thing we did is we extended the classifier and built a query predictor that predicts, with some confidence, the resource needs of an individual query. Once you have high confidence in the resources a particular query needs, how much memory, how much I/O, you are able to schedule the jobs to fully utilize the resources of an individual compute cluster, maximize utilization, and thereby improve the throughput of the system. There is a lot of work there, and it's very interesting to be able to predict, with some accuracy, the resources needed for a particular query. We use machine learning classifiers to do so (a small sketch of this routing idea appears just below). So that covers the things we do in the area of autonomics.

The last area where we spend a lot of resources is integrating with the very broad AWS ecosystem. Redshift launched back in 2013 as a cloud data warehouse: customers were loading data into Redshift and running their BI workloads on it. But as you know, AWS has a bunch of services; it is a very popular cloud vendor, and all sorts of data and all sorts of data management systems run on AWS. So we have been extending the capabilities of Redshift so customers can use the best tool for the job. For example, the first thing we did there was when customers came and told us: look, we like to use Redshift for our BI, our data warehouse data, but we also have very big data in open file formats in Amazon S3, and we would like to perform analysis across our BI and S3 data. So that's what we did. We built a feature of Redshift called Spectrum that gives Redshift the ability to process and join data in open file formats in Amazon S3.
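Before continuing with the ecosystem story, here is the small sketch of the query classification and routing idea promised above: predict a query's cost from plan features and route it accordingly. The features, the stand-in model, and the thresholds are all invented; they only illustrate the shape of the trained classifiers described in the talk:

```cpp
#include <cstdio>

// Predict a query's resource needs from plan features and, with high
// confidence, route short queries past the admission-control queue.
struct PlanFeatures {
    double estimated_rows_scanned;
    int    num_joins;
    bool   uses_only_sorted_scans;
};

enum class Route { ShortQueryLane, MainQueue, BurstCluster };

// Stand-in for the trained predictor: predicted execution seconds
// plus a confidence score.
struct Prediction { double exec_seconds; double confidence; };

Prediction predict(const PlanFeatures& f) {
    double secs = f.estimated_rows_scanned / 1e8 + f.num_joins * 0.5;
    return {secs, f.uses_only_sorted_scans ? 0.9 : 0.6};
}

Route route(const PlanFeatures& f) {
    Prediction p = predict(f);
    if (p.confidence > 0.8 && p.exec_seconds < 1.0)
        return Route::ShortQueryLane;  // in and out fast, no admission control
    if (p.exec_seconds > 300.0)
        return Route::BurstCluster;    // don't even run it on the main cluster
    return Route::MainQueue;           // schedule against predicted memory/IO
}

int main() {
    PlanFeatures f{5e6, 1, true};
    std::printf("route = %d\n", static_cast<int>(route(f)));
}
```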
So, with Spectrum, customers can use not only their BI data but also their big data in S3. The next thing that came was: we also have these high-performing operational databases, we have Aurora and RDS, Postgres as well as MySQL, and we would like to use this data as well for our analysis. So that's what we did: we built the ability to federate queries and access data in those operational databases. One feature that we announced last year was the ability not only to federate queries over those databases, but to essentially incrementally maintain materialized views over those operational databases inside Redshift; this is the AWS Glue Elastic Views product that was announced. Using these capabilities, federated query or Elastic Views, customers can join their big data in S3 with the data in the warehouse as well as with the operational databases.

Next came machine learning. Customers have a lot of data stored in Redshift, it being the system of record, and they asked us: we would like to either train our models or use the models we have for inference within the database; can we please have that? And that's what we did. We integrated with Amazon SageMaker, Amazon's suite of machine learning products, and customers can train models and apply inference within SQL, within the Redshift interface. They can do all sorts of things, performing joins as well as inference and training, within a single SQL statement. Then, in order to be generic enough, we not only integrated with all these other systems, but we also integrated with Lambda: customers can execute arbitrary Lambda code as part of their SQL queries. So we expanded the use cases of Redshift from a traditional cloud data warehouse to something that integrates with a bunch of services in the very broad AWS ecosystem.

To be clear, the Lambda thing is: you can write a UDF and then invoke that from the query?

Yes.

Right. I guess they're paying for the compute, so if they do something stupid...

Right, you're stuck.

I mean, at that point too, because it's arbitrary code, your query optimizer treats it as a black box. You have no idea what it's possibly going to do. Or are there restrictions on where it can appear in the query? Can it only be in the SELECT list, not in WHERE clauses?

No, you're right, it can be anywhere. It is a little challenging on the costing side; there is not a lot you can do there, right?

Yeah.

But in terms of pricing, customers pay for the Lambda invocations they make, and the additional charges that come with that. And as we started expanding the use cases, we went from being able to process only scalar data to having all sorts of semi-structured or unstructured data to deal with. In order to do that, we extended the SQL surface of Redshift: we enhanced the Redshift query language to support the PartiQL query language, so that we are able to perform semi-structured and unstructured data processing. With PartiQL we introduced a new data type called SUPER, an encoding that essentially allows us to perform schema-less query processing within Redshift (a toy illustration follows below). There is a lot to be said there, but since we are running a little short on time, I will move fast.
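The toy illustration promised above: a schema-less document value holding scalars, arrays, or field lists, navigated by name at query time instead of through a fixed schema. This is purely illustrative and is not Redshift's actual SUPER encoding:

```cpp
#include <cstdio>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// A recursive, schema-less value type (the classic recursive-variant pattern).
struct Value;
using Array  = std::vector<Value>;
using Object = std::vector<std::pair<std::string, Value>>;

struct Value {
    std::variant<long, double, std::string, Array, Object> v;
};

// Navigate one attribute step; nullptr mirrors SQL NULL for missing fields.
const Value* get(const Value& doc, const std::string& field) {
    if (const Object* obj = std::get_if<Object>(&doc.v))
        for (const auto& [name, child] : *obj)
            if (name == field) return &child;
    return nullptr;
}

int main() {
    // {"customer": {"name": "ada", "orders": 3}}
    Value doc{Object{{"customer",
                      Value{Object{{"name", Value{std::string("ada")}},
                                   {"orders", Value{3L}}}}}}};
    if (const Value* c = get(doc, "customer"))
        if (const Value* n = get(*c, "name"))
            std::printf("customer.name = %s\n",
                        std::get<std::string>(n->v).c_str());
}
```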
One very interesting area, a feature which we launched in the past few years, has been Amazon Redshift Spectrum, which essentially allows Redshift to join the data managed by the warehouse with the data in the lake, in the architecture customers like to refer to as the lakehouse. The way it works is that we built a very large scan-and-aggregation layer that is able to process data from S3 at very high bandwidth, very high throughput. To do that, we had to modify the planner to be aware of external tables that are partitioned, and we built a lot of interesting optimizations there to do dynamic partition pruning. The other interesting thing we built was a caching layer, which takes advantage of the fact that there is a lot of repetition in queries, especially when they process big data in S3: typically, customers execute their queries, then the next day they add more data and execute the same queries. By taking advantage of this repeatability, we are able to improve the performance of queries over lake data with a fairly simple caching layer. And I already talked about the ability to run training as well as inference within SQL in Redshift, without the customer having to export and import data in order to use Amazon SageMaker.

And that concludes my talk. I walked you through the past eight years of evolution in Amazon Redshift. We are in a position now where we have tens of thousands of customers processing exabytes of data daily, the ability to store petabytes of data within a single database, and thousands of concurrent users consuming these databases. We are spending a lot of our energy on autonomics, on building smartness, so that the customer does not have to worry about the health or the performance of the system, and the system keeps improving along the way. And obviously we take advantage of the very broad and advanced ecosystem of services at AWS and offer tight integration with them. As we like to say at Amazon, it is still day one. We are excited about where the whole cloud data management business is going, and I think there is a lot of innovation still to be done in this area. That will conclude my talk. Thank you. My email is ippo at amazon; please reach out if you have any questions, and I will be more than happy to answer.

Okay, so I will applaud for Ippokratis on behalf of everyone else. Thank you for doing this. So let's open it up to the audience for questions. Again, please unmute yourself and say who you are and where you're coming from. Steven Moy, you had a question in the chat; do you just want to say that out loud?

Yeah, thank you, Ippo, for the talk. In a lot of other competing cloud data warehouses, one of the struggles they have is how to manage the metastore, the catalog. With all this push into the data lakehouse, I imagine this matters; in the Redshift architecture, your catalog is stored on the leader node, and with data sharing, last time I tried it, most of the catalog still remains on the respective individual leader nodes. So how are you thinking about the next stage of evolution? As you push the catalog workload higher and higher, I think you will start overloading your leader node with the catalog.
Yeah, so, Steven, thank you for the question. First of all, in Redshift, even though the live version of the catalog lives on the leader node, the data is committed to Redshift managed storage, and the metadata is part of the commit. So there is a separation of concerns, right? Durability, where the source of truth is, is separated from where the live version is. At least what we have seen is that the TPS, the amount of information that needs to be pulled from the metastore, from the catalog, is very small when it comes to Redshift: it's the schema of the tables, plus statistics in a coarse-grained fashion. So there is not a lot of processing to be done, and our catalog can sustain a very high TPS that significantly exceeds the needs we see for an individual database. So far, we have been able to go with a very simple architecture there: the live version on the leader node of the individual compute cluster, which still serves very high TPS needs. You never know, as use cases evolve, whether that will remain the case, but so far we have not seen that need. I would be happy to talk with you if you have seen any use cases where this is not the case.

Thank you.

Awesome. Thomas, do you want to ask a question?

Sure. Thanks for the talk, and thanks, Andy, for hosting. I'm from the Hasso Plattner Institute in Germany. My question is about Spectrum. In the paper, the workers are described as small and stateless. Can you explain how they compare to Lambda-hosted workers, and would you do anything apart from select-project-aggregate on them down the road?

Thank you, Thomas, and thank you for staying up until midnight in Germany to attend the talk; I appreciate it. So it's very similar, right? If you think about it, it's almost like a Lambda system that performs select-project-aggregate. We do other things as well, for example Bloom filters and semi-joins, so a bunch of the join processing does take place there. But yes, the answer is: very similar to Lambda, conceptually. Was there a second part to your question?

No, thank you.

Okay, awesome. Anybody else in the audience?

Hi, this is Lin. I'm obviously from CMU. Thanks for the interesting talk, Ippo. I have a question about the query resource prediction you mentioned, since our group has been doing some work along the same lines. First of all, when you are predicting the query resources, what is the granularity of the prediction: is it on the entire-query-plan basis, or the segment basis, operator basis, et cetera? That's the first part. The second part is: I'm wondering how accurate your predictions are. Does it just work like a charm, problem solved, or do you have some challenges where it may not be that accurate? I'm just curious.

Yeah, yeah. To answer the second part first: the more we train with the actual workload and the data on the individual compute cluster, the more accurate it becomes, right? The second point is that you need to have some guardrails. If the system falls over because your prediction was not good, then you have problems. So you need to put the necessary guardrails in place, which also means that there is some granularity: you are not predicting at, say, the megabyte level or whatever you want. We use a coarse-grained categorization that gives us enough granularity, right?
So, yeah, the more accurate you are and the more confident you are, the better you can schedule stuff. So there is that part. Now, the first question was about the granularity: query segments or entire queries. It depends. We do it at a bunch of levels, but we want it at the top level, at the query level, so that we can decide about the scheduling of the entire job. For example, our predictor can tell us: you know what, don't even run this on the main Redshift compute cluster; this is a query that should be executed on a concurrency-scaling cluster, we need to auto-scale. So we do it at the query level, but then we can also get smarter at the subquery or fragment level, based on the signals we get there.

So it sounds like you have models at different levels, all making predictions. Is that correct?

Yes, the main one is at the query level.

Okay, got it. Thank you.

Anybody else? So, the table optimization thing, what is that exactly doing? Is it changing the encoding scheme of the tables themselves, and is it doing this incrementally, block by block? Or does the algorithm say, hey, this is what you should be doing, and it goes through all the data and rewrites it? How should I understand what it actually does?

The basic thing is that we pick a distribution key and a sort key for the tables, so the customer does not have to specify anything. I'm sure Steven spent a lot of time thinking about distribution keys and sort keys; he's laughing, I can see him. Now we take care of that. A very advanced user, and Steven was amazing at that, can do this by himself, but oftentimes that is not the case. So putting the necessary automation there makes the lives of our users much better.

Okay. So then, at the top of the talk, you mentioned the hardware features you guys have developed for Redshift; there was AQUA, there was the Nitro stuff. Does a Redshift customer have to enable these, are they add-ons, or do they just get them for free?

Mostly they get them for free. And this is where we are going conceptually: we want to eliminate the number of decisions the customer has to make, because when you give people decisions, sometimes they make the wrong ones, right? So the mental model is to just make it work.