All right, so let's talk about query optimizers, part two. First, a reminder for everyone: this Saturday — today's Wednesday, so Thursday, Friday, Saturday — submit your first draft of the DBDB.io encyclopedia entry using the Google Form. I'll give you feedback a week later on what to fix or whatever you need help with. And again, I changed the schedule, so the presentations for your status update on project three will be next Wednesday, not next Monday. Monday will be a lecture on cost models, then everyone presents on Wednesday, and we'll go in the opposite order that we went in last time. If you want to talk about project two, project three, or anything else, please contact me. Yes? For project two or three? For three — yes, I think so, there should be a markdown file. I'll post that on Piazza. All right, so what are we talking about? We're in this world of query optimizers, and we said last time that we need to find a correct plan that has the lowest cost. This is super hard because we're trying to consider all these different join orders and all the different transformations. At the end of the day, the database system is going to use some way to cut down the search space so we're not doing an exhaustive search, and that gets us close to what is potentially the real optimal plan. In some cases we can short-circuit it: we know we can generate the optimal plan right away without any search. But in most cases, especially in the OLAP world, that's not the case, so we're trying to cut things down to get to a good plan more quickly. Last class we spent the time going through the history of different implementations, starting from the very beginning in the 1970s, where people wrote rules and heuristics describing the transformations they wanted to apply, based on whatever domain knowledge the developers were able to encode. Also in the 1970s, IBM implemented the first cost-based join search algorithm, using dynamic programming to find an optimal or near-optimal plan for System R. And, as I said, this is what most systems implement today: a bunch of hard-coded rules written in a procedural language. You do some initial transformations, like always doing predicate pushdown and so forth, then you do the cost-based search, and there might be additional steps afterwards to clean things up, like Postgres does. We talked about randomized search algorithms, like simulated annealing or Postgres's genetic algorithm. But the ones we want to focus on in this class are stratified search and unified search, where there's an optimizer generator: instead of writing the rules and transformations we want in procedural code, we write them in a potentially higher-level language. In SQL Server, that won't be the case — they write everything in C++.
But the original idea was that you write the rules in this high-level language and then code-generate the optimizer's rules and transformations from it. In stratified search, you have heuristic-based rules followed by a cost-based search. In unified search, there's a single framework doing all the transformations. Now, I had you watch the Microsoft video instead of reading one of the Cascades papers, and as I was saying before class, when I watched it again last night I realized it's a bit advanced for people who have never seen Cascades before. They start talking about things like the memo table, groups, and expressions — we'll cover those in a second. The CockroachDB one is probably a better introduction to Cascades. But one of the things they said in the Microsoft video is that even though the original Cascades paper and the Volcano paper talk about writing these things in a DSL — and CockroachDB does that for its transformation rules — they write everything in C++. They also talked about how they have different stages in their optimization pipeline, where some stages don't even use the cost model. Some are transformations you always want to do, so you don't even consider firing up the search engine to figure them out. So the lines get blurry in a real implementation, because you're doing this in multiple stages — it's not one or the other. In a modern implementation, like CockroachDB's or the way SQL Server does it, I wouldn't say the two approaches are really distinguishable anymore. But if you go read the original papers, it seems like there's a clear dichotomy between the two. So this slide we can skip; it's just a reiteration of stratified search: you have a bunch of heuristic rules, then you do the cost-based search. Unified search is where you do all the transformations together, either logical-to-logical or logical-to-physical. Stratified search does the logical-to-logical rewrites first, and then the cost-based search does logical-to-physical, whereas in unified search you can do everything all at once. And we'll see when we talk about Cascades that, because you're trying to do everything together, you may have to do backtracking in the search algorithm as you fan out the tree. So to avoid redundant computation, they make heavy use of memoization to record the expressions and the costs of things they've looked at in the past, so that when they revisit something they don't have to compute it from scratch. The big distinction in all of this is going to be top-down versus bottom-up. Top-down is Cascades: you start with the end result of the query that you want, and then you expand down during a branch-and-bound search — you expand out the nodes, the operators, logical-to-logical and logical-to-physical, to convert it into the physical plan that gets you to that end result. In bottom-up optimization, you start with nothing and build up. So this is essentially describing how we're going to do enumeration and exploration of the different possibilities for our query plan. The research literature — according to the Germans — says bottom-up is better, but the paper you'll read next class is going to talk about how Microsoft thinks they have the best optimizer, and they're doing top-down.
So I'm biased toward the Cascades approach, the top-down approach. The Germans love bottom-up. I think they're better programmers than me, so they might be right. All right, so in today's class we're going to talk quickly about how to do logical query optimization. You can think of this as the part of the pipeline that, in stratified search, would be the transformations you do without having to consider a cost model — things you always want to do, where you're essentially putting the query plan into a canonical form. Then we'll talk about Cascades and go into more detail about how the different components work together and how you actually do the traversal. And then we'll finish with a quick drive-by of some real-world implementations, both bottom-up and top-down. So in this first step, logical query optimization, the idea is that we just do pattern matching on our query plan to convert the logical plan into another logical plan that gets us closer to what we think the optimal plan will be. This is us, the database system developers, applying domain knowledge — what we know about the relational model, relational algebra, SQL, and databases — to put us closer to the optimal plan. Rather than starting blindly from nothing, we can nudge the plan in the right direction and increase the likelihood that we'll find the optimal plan. At this step, these transformation rules can't compare whether one plan is better than another. They really are rules you always want to apply. For example, WHERE 1 = 2: we always know that one is never going to equal two, so that turns into false, and therefore we shouldn't do a scan at all — you convert that part of the plan into a no-op or something like that. You may laugh — who writes 1 = 2? People write stupid queries, right? So these things show up and we have to account for them. Again, these transformations can be written in the unified language, or as any other transformation rule we'd use during cost-based search, but we know we're just always going to apply them. So I'm going to go through an example that actually comes from the Germans. We'll take a query that does a bunch of joins and show how the exact translation of the SQL query into relational algebra is inefficient, and then how we can apply these transformation rules to put the logical query plan into a canonical form. I'm going to show four different transformations that we want to do. The query is the same one as before: we have three tables — artist, appears, and album — and we want to do a join to find everybody that appears on my mixtape. In the first step, we want to decompose all the predicates into their simplest form, to make it easier for the optimizer to move them around and do predicate pushdown. So we have this WHERE clause with three predicates, and if you convert it directly from the SQL statement or the parse tree into relational algebra, a logical plan, you would have a single filter operator with all the conjunctive clauses in the predicate.
So in this case, you simply break them up on the ANDs, and you get three separate filter operators, each with the portion of the WHERE clause it corresponds to. Now you can apply a rule that pattern-matches a filter that sits above a scan on a table, with something in between. In this case that something is a join, denoted with a Cartesian product for now — we'll get rid of that later. The rule says: I see a filter on album.name, but it's above a join of some type, which is above the scan on album, so I obviously want to push that predicate down. You can do the same thing for artist; it's right above the join here. Again, we don't need a cost model to know this. We know, as humans working with data systems, that the sooner you throw away data in your query, the better the query is going to run. The next thing is that we obviously want to get rid of the Cartesian products. What is this doing? Artist cross appears takes all combinations of artist and appears and makes a massive set of tuples, and then we apply the filter that says artist.id = appears.artist_id. Instead, since we have this filter, we know that as we compute the join we shouldn't materialize tuples that are never going to pass the filter anyway, so we convert this into an inner join. Same with the other one. Again, no cost model — we know we want to do this. The last one is a bit more nuanced, and it isn't always the case; it depends on the system. We can do projection pushdown. I'm not showing the schema, but assume the artist, appears, and album tables are really wide — maybe dozens or hundreds of columns. At the end of the day, the only thing I care about in my query is getting back artist.name. So I want to push down the projections so that I throw away the unused columns at lower parts of the query plan. In this case I'm putting the projection above the filter, because I obviously still need album.name for the filter. So now we have a query plan that has more operators than we had before, but it's closer to what we'd actually want to execute anyway. In this example the join ordering is going to matter a lot, but since I don't have a cost model at this step, I just joined the tables in the order they appear in the SQL query. That was that semantic optimization thing Oracle claimed was amazing that we talked about last class. But at this point it's just a logical plan, so that's okay.
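Just to make the rewrite itself concrete, here's a minimal sketch in C++ of the split-and-push idea from this example. The plan representation, the predicate strings, and the single-table-predicate assumption are all made up for illustration; it only handles the album.name filter, and it ignores the two join predicates, which the Cartesian-product-to-inner-join rewrite would take care of.

```cpp
// Minimal sketch: decompose a conjunctive predicate and push single-table
// filters down until they sit right above the scan of the table they touch.
#include <iostream>
#include <memory>
#include <string>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

struct Pred { std::string table, text; };   // which table the predicate touches

struct Node {
    std::string op;                    // "SCAN", "JOIN", or "FILTER"
    std::string table;                 // set for SCAN nodes
    Pred pred;                         // set for FILTER nodes
    std::vector<NodePtr> children;
};

NodePtr scan(const std::string& t) { return std::make_shared<Node>(Node{"SCAN", t, {}, {}}); }
NodePtr join(NodePtr l, NodePtr r) { return std::make_shared<Node>(Node{"JOIN", "", {}, {l, r}}); }
NodePtr filter(Pred p, NodePtr c)  { return std::make_shared<Node>(Node{"FILTER", "", p, {c}}); }

// Does this subtree contain a scan of the given table?
bool scans(const NodePtr& n, const std::string& t) {
    if (n->op == "SCAN") return n->table == t;
    for (auto& c : n->children) if (scans(c, t)) return true;
    return false;
}

// Push one single-table predicate as far down as it can go.
NodePtr push_down(Pred p, NodePtr n) {
    if (n->op == "JOIN") {
        for (auto& c : n->children)
            if (scans(c, p.table)) { c = push_down(p, c); return n; }
    }
    return filter(p, n);               // can't go lower: wrap this subtree
}

void print(const NodePtr& n, int depth = 0) {
    std::cout << std::string(depth * 2, ' ') << n->op << " " << n->table << n->pred.text << "\n";
    for (auto& c : n->children) print(c, depth + 1);
}

int main() {
    // "Exact translation" shape: joins only, one big filter to be decomposed.
    NodePtr plan = join(join(scan("artist"), scan("appears")), scan("album"));
    std::vector<Pred> single_table_conjuncts = {
        {"album", "album.name = '...'"}    // step 1: split on the ANDs
    };
    for (auto& p : single_table_conjuncts) plan = push_down(p, plan);
    print(plan);                           // the filter now sits right above SCAN album
}
```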
So now we have a logical plan, but we can't execute it. We need a physical plan that tells us the exact algorithms we're going to run for these different operators. Back here I have artist, appears, and album, and I'm not saying whether it's a sequential scan or an index scan — I'm not saying how I'm actually getting the data. Same for the joins: I didn't say which join algorithm I'm actually using. So now we need to convert the logical operators into physical operators, so we can either execute them in our execution engine or codegen them into some plan and run that. And to do this, we need a cost model, because of the different choices we could have for physical operators for a given logical operator, we need a way to say this one is going to be better than that one. This is where the cost model comes into play. We'll spend more time on this on Monday next week, when you read the paper from the Germans, but we can't avoid it at this point: there's some cost model that can tell us one physical operator is better than another, based on cardinality estimates, hardware information, or other things it knows about the system. For now, we'll assume it's perfect; we'll see next class that it never is. Before we get into the different enumeration techniques, I want to point out that everything we've talked about so far has assumed the queries are somewhat simple: they're always equi-joins or inner joins, the predicates only look at two tables at a time, and there are obviously no Cartesian products. We talked about lateral joins, and we talked about UDFs — those were the first sort of complicated queries that appear in the real world. The other issues are mostly with joins: outer joins, semi-joins, and anti-joins. A semi-join would be an explicit semi-join — some data systems support that operator — and an anti-join is like a negation for joins. Real queries are going to have a lot of this stuff; it isn't always going to be as clean as TPC-H or even TPC-DS. The reason this matters is that when we start enumerating different physical operators or plan alternatives for a SQL query, we have to take into consideration whether reordering things is actually allowed, or whether it would produce incorrect results. So in this example, I'm joining tables A, B, and C: I do a left outer join of A and B, then a full outer join with C. Even though this is a logical plan, we know from the semantics of the full outer join that we can't reorder this to join A with C and then B. It has to be A and B first, then C. The reason is that the full outer join is not commutative with the left outer join: I don't know how a tuple coming out of the first join will join with C until I know what B.val is, because that's the attribute I'm joining with C on. I'm not going to go into much more detail on why you can't do this. Just be aware that when we talk about enforcer rules, enforcer properties, in addition to making sure the data is in the right physical form — it needs to be sorted, or compressed a certain way — there may also be higher-level semantics about the query that prevent us from doing one join ordering versus another. And we have to implement that in our optimizer to make sure we don't violate it. I will say, in practice this is easier to handle in the bottom-up approach than in the top-down approach with Cascades. In the bottom-up approach — it's on the next slide — we're doing generative enumeration, meaning we start with nothing and ask, what are the next logical nodes I could generate to represent different joins and join orderings? At that step you would know, okay, I can't flip around A, B, and C in the join order here, so I just don't generate it — I don't even consider it.
In the top-down approach, you have to have additional property enforcers so that if I generate a group or an expression with an ordering that violates this, I identify that I can't do it before I descend into it. This will make more sense when we talk about Cascades in a second, but it's just easier going bottom-up. Yes? The question is: couldn't I flip C and A here and have it still work? So here the left join is on A.id = B.id, and B comes out of that join, but you might not have a match — B's side might actually be null coming out of this. So you can't join B and C first, because you don't know ahead of time whether B.val is going to be null or not. OK, so this is somewhat redundant with what I just said, but basically the next step is: how are we going to generate the different permutations of the plan that we want to consider? I'm using the word permutations because transformations sounds similar — but how do we generate the different choices we have? In the top-down approach it's transformation, meaning we take an existing plan and permute it to produce more of the plan; we expand out nodes in our query plan based on what we already have. In the bottom-up approach, the System R-style dynamic programming, you start with nothing and you have rules that say, here's what I could do right now, and you materialize that. I'm seeing a lot of blank faces — I realize it's a nuanced idea, but again, I think it makes more sense when we talk about Cascades. It's basically: how do we generate the join orderings that we then feed into our search to say, here's what I could be doing? I'm going to have one slide on the hypergraph approach. As I said, that paper is not an easy read, but it's basically an extension of the System R dynamic programming we talked about before, the bottom-up approach. Instead of representing all these different logical nodes as "I'm joining this with this," you do a sort of self-contained optimization within a hypergraph over the whole query plan, figure out how to connect those groups of plan nodes together, and then run your search on that. According to the literature, this is the state-of-the-art approach. This is what HyPer does, this is what Umbra does, and I think this is what DuckDB does as well. But it's basically dynamic programming with a cost model, just as we talked about before.
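Before switching to Cascades, here's a minimal sketch of the bottom-up idea: Selinger/System R-style dynamic programming over join orders. The scan and join costs are made-up constants standing in for a real cost model, and it ignores physical operators, interesting orders, and any reorderability restrictions like the outer-join case above.

```cpp
// Minimal sketch: bottom-up DP over subsets of tables (bitmask = subset).
// Build the best plan for every subset from the best plans of its two halves.
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

struct Plan {
    double cost;
    std::vector<int> order;            // table ids in join order (for printing only)
};

int main() {
    std::vector<double> scan_cost = {10, 20, 15};   // toy costs for tables A, B, C
    int n = static_cast<int>(scan_cost.size());

    std::unordered_map<uint32_t, Plan> dp;          // subset -> best plan seen so far
    for (int i = 0; i < n; i++)
        dp[1u << i] = {scan_cost[i], {i}};

    for (uint32_t S = 1; S < (1u << n); S++) {
        if ((S & (S - 1)) == 0) continue;           // singletons already seeded
        // Enumerate every split of S into two non-empty halves L and R.
        for (uint32_t L = (S - 1) & S; L > 0; L = (L - 1) & S) {
            uint32_t R = S ^ L;
            if (!dp.count(L) || !dp.count(R)) continue;
            double cost = dp[L].cost + dp[R].cost + 25;   // toy per-join cost
            if (!dp.count(S) || cost < dp[S].cost) {
                Plan p = dp[L];
                p.order.insert(p.order.end(), dp[R].order.begin(), dp[R].order.end());
                p.cost = cost;
                dp[S] = p;
            }
        }
    }
    std::cout << "best cost for joining all tables: " << dp[(1u << n) - 1].cost << "\n";
}
```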
What I want to spend more time on is Cascades. As I said before, this is a top-down approach. It's an extension, or an improvement, of the original Volcano approach that I talked about last class. One of the big differences touted in the original paper was that it's object-oriented, instead of being written as C structs like in Volcano, but there are other optimizations along the way that matter a lot. There's the original Cascades paper from 1995; I didn't have you read it because it doesn't describe things as well as I think it should. Then there's a master's thesis from a student at Portland State, on an optimizer they built called Columbia, that I think does a better job describing how Cascades actually works. And the Microsoft video I had you watch is "here's how to take the idea and actually implement it." What I'll describe today is based on the master's thesis, because there isn't much to go on in the original paper. So this is the third-generation top-down optimizer framework from Goetz Graefe: there was Exodus, then Volcano, and Cascades was the last one he did, and it's the one Microsoft picked up. The big idea is that they materialize the transformations of different expressions — portions of the query plan — on the fly, as needed, as you traverse down. This is different from Volcano, where when you land in a group as you traverse down, you materialize everything, and your search space explodes even though you may never consider all the paths down the tree. In Cascades, they have the ability to use placeholders to represent some lower portion of the tree that you haven't gone to yet. You don't need to traverse down to it at that moment; you can go consider another part of the search tree and come back to it when you're ready. There are other cool ideas in it as well, but that's the big one: you don't have to exhaustively materialize everything at each step along the way. All right, so there are four key ideas. Some of these are extensions of Volcano, and some seem like obvious things you would do now, but consider where they were coming from and what they had before — these are improvements over what Volcano was doing. The first is that they represent all the optimization tasks as well-defined data structures: the patterns you want to match in your query plan, and the rules that fire to do the transformations. This is different from the heuristic-based approach with a bunch of if-then-else statements, where you're literally hard-coding: if my query plan looks like this and has these properties, then do something. Now you define the rule you want to match, the pattern you're looking for, and the transformation you want to apply; you declare all of these together and hand them to a search engine — a rules engine — that does the recursive search and applies them as needed. The rules will also declare property enforcers: we can specify that if the data needs to be in a certain form, the rules engine will make sure that anything coming from a lower portion of the subtree has those properties. And we can declare these things without writing a bunch of extra checks on the side. Yes? The question is how defining a pattern is different than an if-then-else statement. Because I can declare at a higher level the pattern I want to match, rather than writing explicit code over and over again to traverse the tree and find the thing I'm looking for. As we'll see, it's more than just "does this one node have this property" — it's "does a node and its children have certain properties." If you had to write that explicitly, you'd have to say: if my node looks like this, then look at all the children, check whether they match, then check this, and if that matches, do that.
And you have to write all of that procedural code. Whereas here I just say, here's the pattern I want to look for, and the rules engine does the traversal for me. So it's also declared at a higher level? It's meant to be declarative, yes. All right, I'm not trying to knock Postgres, but if you look at the Postgres code before you get to the cost-based optimizer, you'll see a bunch of if-else's in the source code, because they're doing basically the same kind of pattern matching, just written as procedural code. The next big thing they have is this notion of priorities. It allows us to define the order in which transformations are applied, and then adjust them on the fly as we go along, based on how the plan is turning out and based on the cost model. The idea is that we want to find the optimal plan more quickly and not consider things we know are going to be stupid or wasted time. Now, in the Microsoft case, they do this with explicit stages, where they've already hand-curated the things they want to consider at each stage. You can think of those as priorities as well, but as far as I can tell they're not dynamically changing priorities within one stage; when they finish a stage, they figure out what the priorities should be for the next stage. The last big idea is that you treat the predicates — the expressions in the WHERE clauses, the join clauses, and so forth — as logical and physical operators too, along with everything else in your query plan. Now, Microsoft doesn't do this — they explicitly said so in the video — but in CockroachDB they talk about these scalar expressions, and they can handle them with the same rules and the same patterns. So basically the same API you use to write a rule like "flip A join B into B join A" also lets you write expression rules like "convert 1 = 2 into false," instead of having separate stages and separate steps for them. Yes? The question is: we previously talked about how, when you do a sequential scan, you could have branchless versus branching filter evaluation — could that also be a physical property of the predicate? No, because that would be a cost. The physical property describes the data coming out of an operator, and in both cases, the branching and the branchless sequential scan produce the same data in the same form. The costs would be different, though. So when you do a transformation from "scan A" into "branchless scan A" or "branching scan A," those would be separate physical operators with different costs, and that's how you account for it. There's another example in the Microsoft video that I'm not going to cover, because it's focused on the Halloween problem — which, again, my mind was blown when they talked about this. Basically, you can set properties in SQL Server to say: if I do some operator up above that changes data, could something lower in the plan cause me to update a tuple multiple times incorrectly?
So they have a property enforcer to ask: is this scan below susceptible to producing the same tuple multiple times if I update it up above? The Halloween problem — I think I covered this in the intro class — comes from IBM System R. They had a problem where you're doing an index scan and updating everyone's salary: find everyone with a salary less than $1,000 a year and give them a $100 raise. So if someone's salary is $100 a year, you scan along, find them once, update them, the tuple goes back into the index with a salary of $200, you keep scanning, you find it again, and you update it again. That's the scenario they're talking about. The reason it's called the Halloween problem has nothing to do with Halloween, other than they discovered it in System R on a Friday that turned out to be Halloween, and they said, let's not worry about this now, let's go party, and we'll deal with it on Monday — and it became the Halloween problem. It's on Wikipedia. Okay, so: we have rules, we have patterns, we can change the priority of the rules as we apply them, and we consider predicates as part of our transformation process as well. All right, the first thing we have to discuss within Cascades is this notion of expressions. This is going to be slightly confusing, because I like to use the term expression for predicates, portions of the WHERE clause, but in Cascades expressions mean something different. They represent a higher-level operator over some set of tasks in a query plan, with zero or more inputs. For example, if I'm joining A, B, and C, I could have a logical expression that says join A and B, followed by a join with C, or any permutation of that. A physical expression then replaces the logical operators I'm specifying: scan A becomes a sequential scan on A, the join becomes a hash join, and so on for B and C. The big thing our optimizer needs to be able to do is quickly identify whether two expressions are equivalent, because if I have the expression A join B join C, then when I transform it into a physical form I can execute in my system, I need to know that the physical expression satisfies what the logical expression requires. Yes? Is there any particular reason Microsoft only considered hash join in the video and not sort-merge join? In the video he basically said either you do a hash join or you don't — he said it matter-of-factly; I don't know if it's true. They have merge join, I'm pretty sure. Actually, that was the Halloween problem example: if you're doing a hash join in an update query — say A join B, and you're updating table A — if A is on the build side, you build the hash table once, then B probes it and goes up and does the update. You're not going to scan A again, because it's a pipeline breaker; you generate the hash table once and you're done.
A nested-loop join, or an index scan on A, might go back and read the same thing over and over again. But yeah, I'm pretty sure they have merge joins. All right, so now we combine expressions together into what are called groups. A group is identified by the output it produces, and within it you have unordered sets of the logical and physical expressions that produce that output. In this case, my output is the combination of A, B, and C. The logical expressions equivalent to it are A join B followed by a join with C, B join C followed by a join with A, and so forth. The physical expressions, again, are all the different permutations of physical operators that implement those logical expressions. So this whole thing is what we call a group, and these are all the equivalent expressions within that group that produce the same output. These lists are unordered sets, and the priorities of the transformations get applied at runtime. Again, in Volcano, when you do a traversal and land at a group like this, it materializes everything at once and then you start doing the search. In Cascades, we can iteratively generate the logical and physical expressions as needed, on the fly. Because those are produced by transformation rules, we can attach priorities to the rules that say, for example, consider index scans before sequential scans, or evaluate all the logical expressions before any of the physical expressions. There are rules like that we can apply at runtime to decide how to materialize these. The reason this matters is obvious: I'm only joining three tables here and I already have a bunch of permutations, and they mentioned in the Microsoft video that customers have queries joining 100 tables. If you had to materialize all the different combinations for all 100 tables, this thing would explode and be massive. So another way to cut down on the expansion of these expressions is to use what are called multi-expressions. The idea is that instead of expanding out every single expression or sub-component of an expression, you represent multiple expressions with a single multi-expression, and that acts as a placeholder that says: somewhere below me in the query plan, there's going to be a group that produces this output — in this case AB. I don't know what it is yet, but here's a placeholder for it at this point. The goal is to reduce the number of expressions we have to consider in our groups as we go down. Do the groups consider physical properties? Yes, the question is whether these groups have physical properties. Yes — I'm just not showing them here. You would know that, okay, ABC has to be sorted on A.id or something like that, right?
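As a rough picture of what a group might look like as a data structure, here's a minimal sketch. The field names and layout are illustrative only, not taken from any particular Cascades implementation; the key point is that children are referenced by group id (the multi-expression placeholders), not by concrete subplans.

```cpp
// Minimal sketch of a group: all logically equivalent expressions that produce
// the same output, with lazily generated physical expressions and a "winner".
#include <optional>
#include <string>
#include <vector>

struct MultiExpr {
    std::string op;                    // "LOGICAL_JOIN", "HASH_JOIN", "SEQ_SCAN", ...
    std::vector<int> child_groups;     // placeholders: "some group that produces AB"
};

struct Winner {                        // best physical expression costed so far
    MultiExpr expr;
    double cost;
};

struct Group {
    std::string output;                // e.g. "{A,B,C}"
    std::vector<MultiExpr> logical;    // AB joined with C, BC joined with A, ...
    std::vector<MultiExpr> physical;   // generated on demand by implementation rules
    std::optional<Winner> winner;      // filled in lazily as the search costs things
    // required physical properties (sort order, etc.) would also live here
};

int main() {
    Group abc{"{A,B,C}",
              {{"LOGICAL_JOIN", {/*group AB*/ 1, /*group C*/ 2}},
               {"LOGICAL_JOIN", {/*group BC*/ 3, /*group A*/ 4}}},
              {},                      // no physical expressions generated yet
              std::nullopt};
    (void)abc;
}
```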
So what happens now as we do our search is that we have to expand down — in some cases all the way to the leaf nodes of the query plan tree — to figure out how we're actually going to scan the tables, what access method we're going to use, and then we build up from there and decide the physical operators for the joins and everything else. All right, so now that we have this notion of expressions, we can talk about how to define the patterns for our rules that do the transformations, the expansions of these expressions in groups. There are two kinds: a transformation rule is logical expression to logical expression, and an implementation rule is logical to physical. A rule is defined by two parts: the pattern we look for in the query plan — if you see this construction or ordering of operators or expressions with these characteristics — and the substitution that defines how we rewrite it. We won't talk about this much, but in the CockroachDB optimizer they discuss how there are some rules where you don't care about retaining the plan you had before. Again, stupid things like WHERE 1 = 2 being converted into WHERE false — you're never going to backtrack on 1 = 2, so you don't need to keep it around. In the Microsoft implementation, there's that pre-stage where they apply a bunch of rules without a cost model; they know they always want to apply them, so there's no need to backtrack or populate the memo table. So sometimes these rules update the memo table and keep track of things, and other times we just throw away where we came from because we know we don't need to go back. So let's see an example of a rule. This is our pattern — what we're looking for in this particular rule: two equijoins forming a left-deep tree. If we see this pattern in our query plan, we match on it. Now, at the bottom of the leaves you see group 1, group 2, group 3 — these are placeholders. The rule doesn't care what's below them; all we care about is that we have an equijoin feeding into another equijoin like this. So if our plan looks like this, it would match, because we have an expression here that joins A and B, and another expression that takes the output of A and B and joins it with C. So this rule matches this pattern. Then we can have a transformation rule that just converts it from a left-deep tree to a right-deep tree — now we join B and C first and then join the output of that with A. Or we can have an implementation rule that converts the logical plan nodes into physical plan nodes that say, I'm going to use a sort-merge join for this one and a sort-merge join for that one.
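Here's a minimal sketch of that rule written as data plus two small functions: a pattern that matches a left-deep pair of equijoins over group placeholders, and a substitute that produces the right-deep shape. The types and names are invented for illustration — real systems (CockroachDB's Optgen DSL, SQL Server's C++ rules) have much richer pattern languages and would also rewrite the join predicates.

```cpp
// Minimal sketch: a transformation rule = a pattern over the plan shape plus a
// substitute. Leaves are group placeholders; the rule ignores what is inside them.
#include <iostream>
#include <memory>
#include <string>
#include <vector>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
    std::string op;                  // "EQUI_JOIN" or "GROUP" (placeholder leaf)
    int group = -1;                  // group id when op == "GROUP"
    std::vector<ExprPtr> children;
};

ExprPtr group(int id)                { return std::make_shared<Expr>(Expr{"GROUP", id, {}}); }
ExprPtr eqjoin(ExprPtr l, ExprPtr r) { return std::make_shared<Expr>(Expr{"EQUI_JOIN", -1, {l, r}}); }

// Pattern: EquiJoin(EquiJoin(G1, G2), G3) -- a left-deep pair of equijoins.
bool matches_left_deep(const ExprPtr& e) {
    return e->op == "EQUI_JOIN" &&
           e->children[0]->op == "EQUI_JOIN" &&
           e->children[1]->op == "GROUP" &&
           e->children[0]->children[0]->op == "GROUP" &&
           e->children[0]->children[1]->op == "GROUP";
}

// Substitute: rotate to the right-deep shape EquiJoin(G1, EquiJoin(G2, G3)).
ExprPtr to_right_deep(const ExprPtr& e) {
    ExprPtr g1 = e->children[0]->children[0];
    ExprPtr g2 = e->children[0]->children[1];
    ExprPtr g3 = e->children[1];
    return eqjoin(g1, eqjoin(g2, g3));
}

int main() {
    ExprPtr left_deep = eqjoin(eqjoin(group(1), group(2)), group(3));
    if (matches_left_deep(left_deep)) {
        ExprPtr right_deep = to_right_deep(left_deep);
        std::cout << "rule fired: root is now " << right_deep->op
                  << " over group " << right_deep->children[0]->group << "\n";
    }
}
```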
So what's one obvious problem with this sort of pattern-matching approach? We have a lot of patterns — that's one; we're using this to try to cut that down. It's not dynamic to workloads? What do you mean — so you're saying you'd want to see what the leaf nodes are first? But you don't care at this point, right? The obvious problem is that I could get stuck in an infinite loop. I have a transformation rule that goes left-deep to right-deep, and I could have another transformation rule that goes right-deep to left-deep, and what happens? I fire the same pattern again, over and over. And you don't want to have to maintain state for each individual rule — did I apply it at this portion of the tree? — because it may not be as simple as left-to-right followed by right-to-left; it may be a longer series of transformations after which something gets triggered again and I'm back in my loop. So I don't want to maintain state about where I've been as I traverse the tree over and over again. The way we handle this problem is to rely on the memo table. I realize in the Microsoft video they were discussing this without describing it, so let's go through what it is. I think the original Cascades paper refers to it as a graph data structure. In our implementation, when we built a Cascades optimizer here, we used a hash table, and I think all the other open-source implementations use hash tables as well. The idea is that it's a table that keeps track of the best plan or expression we've seen for a given output of a group, along with its cost. As we traverse down, if we need the cost of something below us in the tree — and we know what its output is, because we know what the expression is before we go down — we can look in the memo table and ask: have we already done this? If yes, we just reuse that cost and we're done. So going back to my left-to-right, right-to-left example: if the rule matched, we would see that we've already done this transformation, because the equivalent cost is already in the memo table, and we'd know we don't need to traverse down into it and consider it again. So the memo table helps us do duplicate detection, keeps track of costs, lets us construct the final cost of the query plan, and helps make sure properties aren't being violated as we go along. It's sort of an overview table of everything that's going on during the traversal. Not to get too theoretical on you, but the basic premise of how this works is the principle of optimality. What it means here is that we're doing a branch-and-bound search in Cascades as we expand out looking for query plans. If we have the best plan we've seen so far, and the next thing we could consider going down the tree already has a higher cost than that best plan, we know there's no magic way for that branch to get better, because costs are cumulative. If something at some level of the query plan is already worse than the best we've seen so far, we can stop; we don't need to consider it any further. That's the basic notion of branch and bound. So here's the search: let's see how we construct a query plan for A join B join C using the memo table and our groups of multi-expressions.
So this is the key thing: in top-down search, you start with the outcome you want. My output is A join B join C, and we traverse down to figure out what our physical plan should be. The memo table on the side here, for simplicity, only shows for each output the best expression I've seen so far and its cost — so that if I traverse down at some point and need the cost of an expression below me in the query plan, and I already have it in the memo table, I can stop there and don't have to go down and compute it. Say the first transformation rule converts the output we want into a logical expression: AB joined with C. Now I could have a transformation rule that says, I want to learn more about the cost of joining A and B, so I get another group down here whose output is just AB. And then, same thing, within it there's a logical multi-expression A join B — the brackets again mean this is the output that I want. At this point I don't know how I'm going to scan A, I don't know how I'm going to scan B, and I don't know how I'm going to do the join, but I know A will be the outer and B will be the inner table. So now I expand down to the group that is just the output of A. There's only one logical expression for this example, which for our purposes is just "get A." Then we apply transformation rules that convert this into physical multi-expressions: a sequential scan on A or an index scan on A. Let's say, for whatever reason, the sequential scan is cheaper for this table, with a cost of 10. So we update the memo table: for the group that produces the output of A, the best physical expression we've seen so far is the sequential scan on A, with this cost. Then I come back up to this group, and now I need the cost of B. Same thing: traverse down, one logical expression, apply the transformation rule that generates an index scan or a sequential scan. The sequential scan is still cheaper here, cost 20, and we update the memo table like this. Now I come back up, and I can keep applying different logical transformations, trying different join orders — in this case B join A. When I go ask for the cost of scanning A and the cost of scanning B (these lines should be flipped on the slide), I look at my memo table and I already have those results, so I don't even need to do that traversal. I just say, okay, got it, I know what this is going to cost, and I'm done. So now I apply transformation rules to try different join algorithms for A and B. From A join B we get A join B or B join A; there are no other logical transformations we can do here, so now we consider physical transformations. Same thing: I have placeholders for the groups that represent "get A" and "get B," and I can use my memo table to say, okay, I know the cost of those, and I'm done. So what I'm really costing now is the join algorithm itself. Let's say the sort-merge join is the best one, the lowest cost — and again, these numbers are synthetic, not real. Yes? So at this point, instead of going down and repeating the cost here...
Yeah, so the statement is, at this point, as I was saying, you would know B join A is equivalent to A join B at the logical level. So you don't need to traverse down to those groups and get their costs; you just look at the memo table and say, I know what the cost is. And your point is you could just say, the cost is 10 plus 20 for the two scans — yeah, these are equivalent, so they have the same cost. Yes? Wait, you don't think so — what if this is a hash join? You're not there yet; this is logical, I'm not even saying I'm doing a hash join. So again, this is that principle of optimality. At this point I have a cost that is scan A plus scan B — say my cost here is 30. If I have some other best plan somewhere with a cost of 10, then I know that no matter what join algorithm I use, I'm not going to get my cost below 30, so I can stop here and go back up. All it's saying at this point is that A join B and B join A have the same cost, ignoring the join algorithm, so I can just look at my memo table. In your case, you could have a memo table entry for the cost of AB, and then you'd need a way to say B join A is equivalent to A join B when you do your lookup. It really all depends on how efficient you want your memo table to be. For our purposes I'm being really simplistic and just looking at the individual scan operators and the outputs here, but you could have additional entries — one for every multi-expression within the group — in the memo table. The problem is that the slide would get too big in PowerPoint. All right, so now I apply my transformation rules that convert the logical multi-expressions A join B and B join A into the different physical join algorithms. And again, I don't have to go down and get the cost of accessing B and A — I have that from my memo table. So the cost of this group AB, producing the output of A join B, is the cost of accessing A and B separately plus the cost of the join. Again a synthetic number — say 50, the sum of them — and that's what gets populated in my memo table. Now I pop back up to my root node and go down the C side. Same thing: I need the cost of accessing C; the index scan is the cheapest; that goes in the memo table. Then I apply a bunch of different transformation rules for the different join orders at the root, and the different physical orderings, and say I find the optimal plan from this; that's what gets put into my memo table, and at that point I'm done. I'm obviously skipping a lot of steps, because there are a bunch of different join orders I could consider, but to save time in PowerPoint I'm not expanding it all out — you basically do the same steps for all the join orders you want to consider. Yes? What about interesting orders — if the index scan produces something in sorted order on the join key, can you exploit that order?
So his question is, how would you handle the case where you're doing an index scan and you have an index on the join key for both tables — how would you take that into account when you come back up, say to do a sort-merge join? It's like the dynamic programming thing: you can't guarantee you're going to find the optimal plan. You'd have to sort of seed it to consider, okay, let me try the index scan for the access paths on these two tables, because that feeds into my sort-merge join. You'd have to set up the transformation rules that allow you to do that. I don't know whether you're allowed to go back and say, even though I've already cached this, let me go back and cost it again. I'm sure you could, but I don't think they actually do that. And he's right about the properties, yeah. If you have a sort-merge join — I think SQL Server has this, and Postgres does too, it's called a merge join — it assumes the data coming into it is sorted, so the property is that the data has to be sorted. If you then consider a sort-merge join here in your physical operators, you'd ask: is the thing feeding into me — say, on the C side — sorted on the key I want? If not, then I can go back and get a new cost, and I'd have a different entry in my memo table, because the required properties could now be different. That's how you handle it: you could revisit C because now you care about a property that the previous entry didn't have, which was fine when you were considering hash joins first or something. And going back to the priority stuff, it's almost always the case, as we talked about before, that hash joins are faster. So you could set your priorities to consider hash joins first, see how well you do, and then if you have more time, go try a sort-merge join. According to Thomas, anyway — that's what he said. So the next question is: if your system doesn't have a sort-merge join, does that make query optimization faster? Yeah, because it's one less thing you have to consider. Absolutely. In a sense, the more features your data system has — the more choices of physical operators, like your branching-versus-branchless example — you can use them, but it explodes the search, which may or may not be a good idea. Yes? So in my example here I'm showing the groups as a separate thing, but the memo table and the groups are basically the same thing. You can represent the groups within this hash table, so within a group in the memo table you'd have: here are the things I've already tried, and here are their costs. In this case the index scan was slower than the sequential scan, but if I computed its cost I could still keep it, and that would be in the memo table. Sorry, I didn't catch that — yeah, I mean, yes, you could. I don't know if they actually do that, though. Yeah, okay.
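To tie the walkthrough together, here's a minimal sketch of a memo table keyed by group, with the branch-and-bound check from the principle of optimality. The group ids, costs, and join algorithms are invented numbers; a real memo, as the discussion above points out, would also be keyed on required physical properties like sort order, and would keep multiple entries per group rather than just a single winner.

```cpp
// Minimal sketch: memoized top-down costing with a cost limit (branch and bound).
#include <iostream>
#include <limits>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Best { std::string plan; double cost; };

std::unordered_map<int, Best> memo;   // group id -> best (plan, cost) seen so far

// Cost the group `g` that joins child groups `left` and `right`, giving up as
// soon as we exceed `limit`: a partial plan already worse than the best
// complete plan can never become better, since costs are cumulative.
double optimize_join(int g, int left, int right, double limit) {
    if (memo.count(g)) return memo[g].cost;              // memo hit: no re-traversal
    double inputs = memo[left].cost + memo[right].cost;  // children already costed
    if (inputs >= limit) return std::numeric_limits<double>::infinity();   // prune
    std::vector<std::pair<std::string, double>> algos = {{"HashJoin", 20},
                                                         {"MergeJoin", 35}};
    double best = std::numeric_limits<double>::infinity();
    for (auto& [name, algo_cost] : algos) {
        double total = inputs + algo_cost;
        if (total < best && total < limit) {
            best = total;
            memo[g] = {name + "(" + memo[left].plan + "," + memo[right].plan + ")", total};
        }
    }
    return best;
}

int main() {
    // Pretend we already costed the access paths for the leaf groups A, B, C.
    memo[0] = {"SeqScan(A)", 10};
    memo[1] = {"SeqScan(B)", 20};
    memo[2] = {"IndexScan(C)", 5};

    double limit = std::numeric_limits<double>::infinity();  // no bound yet
    optimize_join(3, 0, 1, limit);        // group {A,B}
    optimize_join(4, 3, 2, limit);        // group {A,B,C}: reuses memo[3], no re-descent
    std::cout << memo[4].plan << " cost=" << memo[4].cost << "\n";
}
```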
So we covered this last class: how do we decide when to stop? Because we could just keep going forever. In the dynamic programming approach, you have to get to the end at some point, and you could use heuristics to cut things off instead of doing a fully exhaustive search — but if you're doing it in stages, you have to finish all the stages to get to the end. In this approach, once we reach the leaf nodes for all the tables we want to scan and we have a complete physical plan, we're done. We could say, okay, we can run this plan right now. It may not be the best one, but we've run out of time, so we run it. The most obvious way to decide when to stop is to exhaust all possible transformations. Another one we talked about is wall-clock time: run for a certain amount of time, and when you're done, you're done — I think MySQL works this way. The thing from the Microsoft talk that I found interesting is that they don't base termination on the amount of time you've spent; they base it on the number of transformations you've applied. The idea is that if your system is overloaded, applying transformations is slower — you have to go to the cost model, you have to go to the catalog, and that gets expensive. If you base the cutoff on wall-clock time, then when the system is overloaded, the transformations are just slower and you end up with a crappier plan because everything took longer. But if you base it on a logical count of the number of transformations applied, then whether the system is overloaded or not, you generate roughly the same quality of query plan. I thought that was a really good idea, and not something I had thought about before Microsoft mentioned it — they care about plan stability no matter whether the system is overloaded or not. And the last one is a cost threshold. It basically says: if I have a query plan that beats some threshold I set beforehand — for example, here's the best plan I found initially, and if I find one that's 10% better, I'll stop — then I can terminate. It can be based on the cost estimates coming back from the system itself. So this is an incomplete list of different Cascades implementations. Actually, Calcite might be wrong on here — I need to double-check, because they claim they're based on Volcano. So again, the Cascades paper came out in '95. Microsoft hired Goetz Graefe, the guy who invented Cascades, almost immediately to go build Cascades into SQL Server when they did the major rewrite in the '90s. There wasn't an academic version of Cascades until OPT++ and Columbia came along later in the '90s. The code for those systems and frameworks is still out there, but it's old C++ that probably doesn't even compile anymore — as opposed to what, C? Yeah. Then there's Greenplum Orca and Calcite, which we'll talk about in a second — I don't think Calcite is truly Cascades — and SQL Server, which we've talked about before and will go into in a bit more detail. Tandem, which built NonStop SQL, was a hardware company from the late '70s and '80s that built super-redundant hardware — think space-shuttle-style computers, where you have three CPUs computing the same thing and you take whatever produces the right result.
And they built a database system with Jim Gray: when he left IBM, he was contractually obligated not to build another database system for a certain amount of time, so he was at Tandem working on fault-tolerance stuff, and then they got him to start working on NonStop SQL to build a new database system. Tandem got bought by — I think it was Compaq, or was it DEC — and then ended up at HP. It's still around; a lot of banks still use it, but it's more or less in maintenance mode at this point. Clustrix was a distributed version of MySQL that got spun out of AOL — America Online was a big internet company in the US in the late '90s — and they got bought by the MariaDB folks a few years ago. CockroachDB we've talked about as well; they wrote theirs from scratch. For Clustrix, there isn't much documentation about what it actually does — I don't even know whether their documentation is still online; it might have been taken down when they got sold — but they had an optimizer called Sierra or something, and there's one line that mentions Cascades. They're doing Cascades, but there are no details of how they actually do it. All right, so I want to go through these five real implementations and describe, roughly, what they're doing. As far as I can tell, these are the ones using Cascades — Calcite is questionable, and SingleStore is bottom-up — and I'm just going to go through each at a high level. The SQL Server one is probably the most interesting, which is why I had you watch the video. It's the first Cascades implementation that ever existed; they started it in the mid-1990s, because, again, Microsoft hired the author of Cascades. And either the original Cascades implementation that went into SQL Server, or some derivative of it, is now used in a bunch of different Microsoft database products. They talk about how Cosmos DB uses it; the cloud version of SQL Server, Azure SQL, uses it; and there's another internal system called Scope — it sort of looks like BigQuery — that uses it. So it's been used in a bunch of different things at Microsoft. All the transformations are written in C++ instead of a DSL, unlike the original Cascades paper or the CockroachDB approach. And all the predicate transformations are written as separate procedural code with its own rules, not integrated into the transformation engine, the rule engine, the way the paper specified. In the talk, they said the reason they don't use a DSL is that at some point you have to examine the state of the query plan and start doing more complex analysis and transformations, and writing that in a DSL ends up looking almost like C++ anyway — so they just wrote it in C++. We'll see in the CockroachDB example that their DSL lets you write Go functions, or make calls into Go directly, from the higher-level rule language. The big thing I want you to get out of the video is that the way they apply these rules isn't like I showed in my example before, where everything is done within one pass, one traversal of the query plan. Instead, they apply things in stages, where they increase the scope of what they can optimize and the complexity of the transformations they apply as you get further along in the optimization pipeline.
And the idea here is similar to the stratified approach: there are rules at the beginning that you know you're always going to want to apply, so you don't even need to fire up the rule engine for them. Just go ahead and apply them, things like predicate pushdown. Then, as you get to the cost-based search, the idea is that at each stage, or sub-stage within the pipeline, you look at the best plan that comes out of it and its cost estimates, and you decide whether to even continue with more robust or more complex optimizations. Maybe the plan is already good enough and it's not worth doing additional search because the query is so simple, or maybe you decide to do the full search.

So in the first step, they do simplification and normalization. This is doing tree-to-tree transformations, essentially applying the rules I talked about before that you know you're always going to want to do. The key thing is that, unlike in stratified search, where you would have separate code for the heuristics and separate code for the cost-based search, it's the same rules, the same patterns, and the same code base that applies them; you run basically the same rules engine, just without the cost model and without doing the cost-based search. So rules that used to live only in the cost-based search can now also be applied in this first stage. That means predicate pushdown, converting outer joins to inner joins, subquery removal where possible (in Microsoft's case they can't always do it), and empty-result propagation: if one of the tables I'm joining against is empty, I know the join isn't going to produce any results, so I can propagate that empty result up to everything else. So this is where we do tree-to-tree transformations, again just matching our patterns and applying our rules.

Then you do pre-exploration. This is not actually applying transformation rules; this is setting up the metadata and the information we need to then do the cost-based search. They have code that looks for really simple queries, like SELECT * FROM a table on a primary key where you have the primary key index. You don't need to do any additional planning there; you short-circuit the search right there and stop. The one thing that is very interesting is that they look at the query plan, figure out what statistics they're going to need based on which columns are accessed in the join clauses, the WHERE clauses, and so forth, and then they go check the catalog to see whether those statistics are available. If they're not, they block the query optimizer, go run something like ANALYZE or do some sampling on the tables the query is going to access, build those statistics, and then come back and actually start doing the query planning. This is super fascinating because most systems don't work this way. If you don't have good statistics in Postgres, Postgres doesn't care; it runs with whatever it has, right? It doesn't tell you, hey, you're missing this, you should really go get it. SQL Server will stop and go get it. So what's interesting is that the first query you run without statistics is going to suck, but everything after that should be okay.
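Here is that pre-exploration statistics check as a minimal sketch. Catalog, BuildStats, and EnsureStatistics are hypothetical names invented for this illustration, not SQL Server's actual catalog API; the only point is the control flow of looking up what the query needs and blocking to build whatever is missing before the cost-based search starts.

```cpp
// Minimal sketch of "block and build missing statistics before planning".
// Hypothetical types and names; not SQL Server's real API.
#include <set>
#include <string>

struct Catalog {
    std::set<std::string> columns_with_stats;   // e.g. "orders.o_custkey"

    bool HasStats(const std::string& column) const {
        return columns_with_stats.count(column) > 0;
    }
    void BuildStats(const std::string& column) {
        // In a real system: sample the table, build a histogram, store it.
        columns_with_stats.insert(column);
    }
};

// Called after normalization, before cost-based search: every column referenced
// in a join or WHERE clause needs statistics, so build whatever is missing now.
void EnsureStatistics(Catalog& catalog, const std::set<std::string>& referenced_columns) {
    for (const std::string& column : referenced_columns) {
        if (!catalog.HasStats(column)) {
            catalog.BuildStats(column);   // the optimizer waits here
        }
    }
}
```

The trade-off is exactly the one described above: the first query that touches a cold table pays the sampling cost, and every query after it plans against real statistics instead of defaults.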
Obviously there are other triggers you can have too. If you bulk-insert into a table, the database system should identify that: I just bulk-inserted, let me go run ANALYZE immediately before I run any queries. SQL Server can do that, Oracle can do that; Postgres doesn't do that, I don't think. All right, so once they have all the stats, they can do some initial cardinality estimates and some potential join collapsing. I said these would be transformation rules; these obviously aren't transformation rules, they're just things that get triggered to set everything up.

Then you get into the exploration phase with the cost-based search. This is the Cascades portion we talked about before, the tree traversal with the memo table. They run it in three stages, applied one after another, and when they take the output of one stage they check whether the plan they already have is good enough. If yes, they stop and don't go to the next stage; otherwise they expand and go to it. They also talked about how they can dynamically decide which transformation rules to consider at the next stage based on what they think the query looks like. For example, if you're not doing anything that looks like a CTE, don't even bother setting up the things that optimize CTE queries. That just makes everything run faster. The trivial plan stage is a more aggressive version of the short-circuiting we saw in pre-exploration. The quick plan stage would be for OLTP-style queries or simple two-table joins. But if you have a lot of joins, something more complex, you get to the third stage and do a more exhaustive search. The second and third stages can run in parallel, whereas the first stage is just a single thread.

The last stage is what they call post-optimization, or engine-specific transformations. Again, this is what I was saying before: they reuse the Cascades optimizer from SQL Server in a bunch of different products at Microsoft, and all of this part is the same across those systems. But in this last step you can tack on whatever the particular engine wants to do. They talked about how, for the parallel data warehouse version of SQL Server, they generate the query plan and then this step does the final work of chopping it up to make it a distributed query.

The two optimizations they talk about that I think are really interesting: one is the timeouts based on logical counts, which we've already covered. The other is that the memo table is a way to initialize the search with things that we, as the humans who built the database system, know we're going to want to start with, to get as close as we can to what we think the optimal plan is. Right? In my example before, when I showed the top-down search, the memo table started out empty, all the group expressions were empty, and then I said some transformation rules get applied based on some priority and fill in what we actually want to do. So it's that population of the group expressions: if we choose things at the very beginning that we think are going to be a good idea, that can let us find a near-optimal plan in a lot less time. (There's a small sketch of this seeding idea just below.)
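As a minimal sketch of what seeding the search could look like, suppose you seed with a left-deep join order built from a simple heuristic, say joining the smallest estimated tables first. The names here (TableInfo, SeedJoinOrder) and the smallest-first heuristic are assumptions made for illustration; this is not how SQL Server actually populates its memo.

```cpp
// Hedged sketch: derive an initial join order from estimated cardinalities and
// hand it to the search as a starting expression, so branch-and-bound begins
// with a decent bound instead of nothing. Invented names; illustration only.
#include <algorithm>
#include <string>
#include <vector>

struct TableInfo {
    std::string name;
    double estimated_rows;   // from the statistics gathered earlier
};

// Returns table names in the order we want the seed join tree to use:
// smallest estimated cardinality first (a common, if crude, heuristic).
std::vector<std::string> SeedJoinOrder(std::vector<TableInfo> tables) {
    std::sort(tables.begin(), tables.end(),
              [](const TableInfo& a, const TableInfo& b) {
                  return a.estimated_rows < b.estimated_rows;
              });
    std::vector<std::string> order;
    for (const TableInfo& t : tables) {
        order.push_back(t.name);
    }
    return order;   // e.g. {"nation", "customer", "orders", "lineitem"}
}
```

Seeding only changes where the search starts, not what it can reach; if the seed order turns out to be bad, the cost-based search can still move away from it.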
So they talk about how, since they have the cardinality estimates that came out of the previous stage, they can use those to figure out the join order they think they're going to want, and examine that one first. Or they do it the Oracle way: assume that the human put the tables in the query in the order they wanted them joined when they wrote it, and consider that as one of the join orderings you look at first, rather than some random jumbled-up thing. And in practice, they say, this is all heuristics; you can't guarantee you're going to find the optimal plan for every possible query. But in the real world, in databases, we care about keeping planning time reasonable, and this might be good enough for what we need, okay?

All right, so the next one is Calcite. This is an open-source project; it's a standalone query optimizer as a service, written in Java. The idea is that instead of every database system having to re-implement its own SQL parser and transformation rules and all that, you have a single service that can speak a bunch of different SQL dialects, knows how to lower them into a canonical form or internal representation, does the transformations on that, and produces a physical plan for you that you can run. This was originally part of an open-source database system, also a company, called LucidDB. Who here has heard of LucidDB? Nobody, of course. Yes, it's dead. I think it was a European academic project and maybe a startup. But they took the optimizer service, or sort of framework, that they built for LucidDB, rewrote it, spun it out, and made it into this standalone thing called Calcite. Is Calcite from Hive? Well, Hive uses it; we'll double-check. Anyway, there are a bunch of systems that use it. Flink uses it, HeavyDB (which used to be OmniSci) uses it, Splice Machine uses it, Hive uses it. A lot of people use it just for the SQL parser; you don't have to use the full thing. You can use it to do the query optimization stuff too, but since it sits outside your database system, you have to implement its API to say, here are my cost model estimates for these operators. All the things that SQL Server is calling internally, you have to expose from your system if you want to take advantage of this. So using it as a full query optimizer is, I think, non-trivial.

Another standalone service is this thing called Orca, from Greenplum. This was originally built by Pivotal; as we discussed, Greenplum got bought by EMC, Pivotal got spun out of that, and now VMware owns it. It's a standalone Cascades implementation in C++. It was originally written for Greenplum, but they also added support for HAWQ, which is their version of Hive, like SQL on top of MapReduce or HDFS. The idea is that instead of having to re-implement your own query optimizer for these different systems, you have one standalone thing that can support both of them. But Greenplum is the main user at this point. We looked at this when we were trying to build our own system; when we looked at it at first there was no documentation, just a bunch of example XML files you had to upload to use it. Since then they have better documentation.
I don't know of anybody else using Orca outside of Greenplum, but they're actually maintaining it and it looks pretty good, and it supports multi-threaded search. So in the original Orca paper, they talk about some interesting things that I think are worth considering when you build an optimizer. The first one is less of an issue if you're a cloud-based system; the second one still matters whether you're in the cloud or not. The first issue they had to deal with was: if people are running this on-prem and there's a bug and something crashes, how do you figure out why the optimizer failed, or why the optimizer produced a bad query plan? What they talk about is that if a query crashed, they could basically dump out the state of the memo table, which would be big, and ship that as a binary back to the developers at the company, and they can use that to recreate what the search state of the optimizer was when it was trying to find a plan. Again, the cloud makes this unnecessary, because if you're hosting the system for people you have access to everything, but there are some useful ideas in it. The next one is how they do testing, how to actually check whether your optimizer is producing the right plan based on the estimated cost. What they talk about is that when they do testing with fuzzing, they run the SQL query through the optimizer, take the best plan and the second-best plan (or maybe the top ten or so), run them, and then see whether they were actually ranked in that order correctly at runtime. And that allowed them to figure out how to improve the cost model estimates and improve the optimizer, so that the estimates matched up with the actual runtime cost.

CockroachDB has their own Cascades optimizer, started in 2018, and everything is in Go. They're using their own DSL called Optgen, where you write, in this higher-level language, the transformations you want to do, and then they codegen that into Go, because the whole system is written in Go. Unlike SQL Server, they're going to treat the scalar expressions, the predicates within WHERE clauses, as transformation rules as well, along with the operator transformations. This is from a talk that Becca Taft gave during the pandemic, and it's just an example of what their Optgen language looks like. It's a rule that says if someone has a WHERE clause with NOT NOT something in it, you can get rid of the double NOTs, right? It's a bit inscrutable, but this is what a rule and transformation would actually look like (there's a sketch of the same rewrite just below).

All right, the last one is going to be SingleStore, previously called MemSQL. This is a distributed query optimizer based on a stratified approach doing bottom-up search. They have sort of the standard stages: the rewriting stage doing your logical-to-logical transformations, then the enumerator doing the join ordering. But the one thing they do that is very interesting, and that we haven't seen in other systems, is that after they get the physical plans out of the enumerator, they convert those physical plans back into SQL. And then they send those SQL queries to the executor nodes, which do all the same steps all over again: parse it, plan it, and optimize it, right?
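The CockroachDB rule just mentioned, dropping a double negation from a predicate, is easier to see as a plain rewrite than as the DSL on the slide. Below is a generic C++ sketch with invented types (ScalarOp, Scalar, EliminateDoubleNot); this is not Optgen syntax and not CockroachDB's generated code, just the same normalization spelled out by hand.

```cpp
// NOT (NOT x)  =>  x, written as an ordinary bottom-up rewrite over a tiny
// scalar-expression tree. Invented types; illustration only.
#include <memory>
#include <string>

enum class ScalarOp { Not, Variable };

struct Scalar {
    ScalarOp op;
    std::string name;                 // used when op == Variable
    std::shared_ptr<Scalar> child;    // used when op == Not
};

// Normalize children first, then collapse a double negation at this node.
std::shared_ptr<Scalar> EliminateDoubleNot(std::shared_ptr<Scalar> e) {
    if (!e) return e;
    e->child = EliminateDoubleNot(e->child);
    if (e->op == ScalarOp::Not && e->child && e->child->op == ScalarOp::Not) {
        return e->child->child;       // drop both NOTs, keep the inner expression
    }
    return e;
}
```

In Optgen the same idea is written as a pattern plus a replacement and the tree-walking Go code is generated for you, which is the appeal of the DSL: the rule reads declaratively, and it can still call into Go helpers when the pattern language isn't enough.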
So the flow looks like this. The SQL query shows up: parser, binder, rewriter, physical plan, and then you send SQL queries to the different execution nodes, which are going to do all of these same steps again. Someone took a guess why: is it engineering debt? Yeah, so he basically said yes. He said that each executor, since it's getting a SQL query, the SQL query is also annotated with information like where to send the data between the different nodes. It's not the pure SQL a client would send; they put some extra stuff in it, but that's for internal bookkeeping and data movement. The reason they do it is that all the executor nodes have their own statistics about what the data looks like locally. So if you give one a SQL query, it can decide the best way to actually run it. The coordinator is saying: I know what my global cluster looks like for all my data, I know how to chop the query up and send the different pieces to different locations, but how each node actually does the physical execution of whatever scan I'm asking it to do, it can decide locally, because the idea is that it sees the data, it knows what it looks like, and therefore it's in the best position to make that decision. Yes? The question is whether the data is distributed across different nodes: yes. So this would be in their shared-nothing, on-prem system. I think they still do this, but you can imagine the object-store setups we've talked about, where the data sits in something like S3 and there are all these compute nodes. This wouldn't be as good a fit there, because everybody sees all the same data anyway, right? Another question: if you generate something like a group-by with a top-N, there may be no way to convert that physical plan back to SQL. They claim they handle it; it's in the paper. I don't know. This was, I think, 2017, the VLDB paper.

So, all right. Hopefully this was a good crash course on what the Cascades query optimizer looks like; we talked about the dynamic programming stuff last class. Hopefully the main impression you're getting from all of this is that it's hard. Very, very hard. Was there a question? Sorry. Okay. And again, as I said before, this is why people build the heuristic-based stuff first, or why the NoSQL guys said we're not going to do SQL at all, because then you have to build a query optimizer, right? Nobody wanted to; this is very, very hard to do. Microsoft and all the major companies have spent millions and millions of dollars to try to make these things as efficient as possible, and even then, as we'll see in the paper for next class, they get things wrong. And I don't think machine learning is a magic wand that's going to just solve all these problems. At the end of the day, though, all of these algorithms rely on having a cost model. That's what we'll cover next class: how you actually figure out the estimated cost, what the selectivity of these predicates on these scans is. Okay? All right guys, join me again, see ya.