We're starting on networking protocols. So we're at the point in the semester where I now want to do a high-level overview of what we've talked about and what we plan on talking about. It's kind of difficult to show this at the very beginning because the topics we talked about were a bit scattered: concurrency control, indexes, some storage stuff. This is a conceptual illustration of the system that we're envisioning in our minds as we discuss the various topics of how to build it. So this is our application. It sends a SQL query, which lands in our networking layer, and that's what we're going to talk about today. Then we hit the query optimizer, or the planner. So we parse the SQL, bind to the catalog objects, do any rewriting we want to do, and then run it through our query optimizer to generate a physical plan. Then we pass this through our compiler, which won't make sense now, but we'll talk about it in a few weeks. Basically, we're going to take that physical plan and convert it into machine code instead of actually interpreting it. Then we have our execution engine, where we'll execute the query in threads, and then a storage manager down below. These are the parts that we've already covered: we've covered how to do concurrency control, indexes, storage models, and logging and checkpoints. So now, today, we're going back to the top, and then going forward we'll sort of jump around, start here, and then work our way back up. Operator execution, though, we'll cover basically throughout the rest of the semester. We'll talk about how to do parallel or vectorized hash joins, vectorized sorting. And the MVCC stuff sort of covers all of this, at least up to the planner.
Keep in the back of your mind: I'm working on a multi-version database, so what are the implications of that as I design my system? All right, so this is our route for the rest of the semester. And then at the very end, the last two lectures will be the potpourri things, like new hardware or applying machine learning methods to the stack, okay? So as I said, today we're going to talk about networking protocols. This is either the data coming into the database from the client, or how we send data out to our replicas for high availability. But the paper I had you guys read was all about ingesting data, taking queries in, and getting data out. So we'll talk about basically how you would write your application, the different methods for accessing the database and running queries. Then we'll talk about what the actual network wire protocol looks like that you're sending back responses over. Then we'll briefly talk about the different replication protocols that are out there and the sorts of setups you can have. And then we'll finish up with two different techniques to bypass the operating system as much as possible, called kernel bypass methods. For all of these, again, we assume we're going over the TCP/IP stack, and for simplicity, we're just assuming we're going through the OS; kernel bypass is a way to work around that. And then we'll finish up briefly talking about what's expected for project two, okay? All right, so up until now, all the demos that I've given in class about using Postgres, or in the intro class where we showed MySQL and Oracle, all of those were opening up a terminal over SSH, typing in some queries, hitting enter, and then getting back a text response. In a real application, that's not how it actually works, because what I'm getting back and showing in the terminal is text; it's characters.
And so if I was writing an application that went through the terminal like that, I'd basically have to parse the text, extract the values I'm looking for, and convert them to the right types for my application. That would be super slow, and nobody would actually ever do that, right? So what we're really going to do in a real program is have an API, a programming interface, that we can write queries against on our database. And what we're exposed to in our application is binary values for the most part. Not always (if you're getting back JSON, for example), but in real applications, if you really want that performance, you want binary data back in the application, because then you don't have to do any of that conversion yourself: like, I know I'm looking at an integer, but I have a string representation of that integer, so let me now convert it to an int. So for all these APIs, the way to get the best performance is to go through these standard programming interfaces. So we'll first talk about direct access. These are proprietary APIs or libraries that each database vendor can provide. And then we'll talk about ways to have more generic or generalized access methods or APIs that potentially would not require us to rewrite our application if we switch from one database vendor to another. We may have to change the SQL dialect, but if we program against these things, then this part would be standardized, okay? Right, so the idea of these is that if I write my application using either ODBC or JDBC, then if I'm using MySQL today but decide to give IBM money and switch my application over to DB2, I don't have to rewrite any of my application code, because the API that you would use for MySQL and DB2 is standardized on these things.
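To make that concrete, here's what programming against a standardized database API looks like. This uses Python's DB-API, which plays the same role as ODBC/JDBC, with the built-in sqlite3 driver; the table and values are made up for illustration:

```python
import sqlite3

# Python's DB-API (PEP 249) is the analogue of ODBC/JDBC: a standard
# interface, so application code stays the same if the driver changes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", (1, "andy"))

# The driver hands back native binary values; no string parsing needed.
row = conn.execute("SELECT id, name FROM users WHERE id = ?", (1,)).fetchone()
assert isinstance(row[0], int)      # already an int, not the string "1"
print(row)                          # (1, 'andy')
```

Swapping sqlite3 for another DB-API driver would leave this code unchanged, which is exactly the portability argument being made here.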
Again, the SQL might have to change, but the actual code itself won't change. Actually, just to go back real quickly: this one we're not going to talk about too much. There's nothing really to say about it other than everybody has their own thing, and ideally you should not program against these APIs, right? You should always have one of these standardized layers in front of it so that your application is more portable. In some applications, if you know you're going to be running on a phone and only executing against SQLite, you could program it that way. But even then, Apple or Android will provide you with a database wrapper or API that hides the fact that you're actually using SQLite. All right, so the first standardized API for database access was called ODBC. It wasn't actually the first one attempted: Sybase tried this back in the late 1980s, where they got together with some other database vendors and some application programming frameworks, took the Sybase API, stripped out all the Sybase-specific logic, and tried to propose it as a standard API called DB-Library that everyone could use. For whatever reason that never took off. Microsoft then hooked up with another company called Simba Technologies, and in the early 1990s they proposed this thing called ODBC. And pretty much every major database vendor today, whether relational or non-relational, chances are they're going to have an ODBC library. MongoDB has one, even though MongoDB doesn't support SQL; you just write MongoDB queries in their syntax, but it still supports the API. So the way it basically works is they follow what's called a device driver model. And the idea is that on the client side, the ODBC driver is something that the vendor provides.
And underneath the covers, the driver has all the logic you need to communicate with the database server over its proprietary protocol. Like, if I send a query request here, this request is going to be specific to my database system. And then when I get back the result, it's up to the driver to convert it into the format that the ODBC API specifies. So the parlance we're going to use for this class is that this part here, the part we care about, is the wire protocol. And again, every database system is going to do something different, unless you're copying what MySQL and Postgres do, which is very common. But the server doesn't know anything about ODBC; it just knows here's some SQL query and here's a result. Now, the ODBC API does have additional features that not every database system is going to be able to support, but you can essentially emulate those on the client side in the driver. So for example, say your database doesn't support cursors. Cursors: if I run a SQL query, instead of getting back all the results at once, I can get a handle to that result, a cursor handle, like an iterator. And I can call get-next, get-next, get-next, and each call sends a message over the network to get back the next chunk of the result. You would do this if your result is huge and you don't want to ship everything all at once. So not every database system supports cursors, but you can fake that in the ODBC library. You can still ask ODBC in your application, hey, give me a cursor. Underneath the covers, the driver sends the SQL query and gets back all the results, but then exposes the iterator API to you to walk through the cursor. That's sort of nice: you can hide all that complexity in the driver if the server doesn't support everything you need for ODBC.
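As a sketch of that emulation idea (the class and method names here are hypothetical, not a real ODBC interface): the driver runs the whole query in one round trip, buffers the result, and exposes an iterator on top of the buffer.

```python
class EmulatedCursor:
    """Fake a server-side cursor in the client driver: fetch the whole
    result up front, then expose an iterator-style API over the buffer."""

    def __init__(self, run_query, sql):
        self._rows = run_query(sql)   # one round trip, full result buffered
        self._pos = 0

    def fetch_next(self):
        if self._pos >= len(self._rows):
            return None               # cursor exhausted
        row = self._rows[self._pos]
        self._pos += 1
        return row

# Pretend "server" with no cursor support: it returns everything at once.
server = lambda sql: [(i,) for i in range(3)]
cur = EmulatedCursor(server, "SELECT * FROM t")
assert cur.fetch_next() == (0,)
assert cur.fetch_next() == (1,)
```

The application sees the same get-next interface either way; it can't tell whether the server actually supports cursors.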
So the other important thing to understand is that the ODBC standard specifies what the data types are that you're going to get when you say, give me an integer from this result set. It's up to the driver to make sure that whatever the ODBC API specifies as a data type, that's what it gives you in your application. That way you don't get an unexpected result, and it makes things more portable. So for example, say we store every integer as 64 bits for whatever reason, and the ODBC API says, give me a 32-bit integer. When the driver gets back the 64-bit integer from the server, it has to convert that to 32 bits, because that's what the API specifies. So again, that's all about hiding the specifics or the nuances of every single database system through this driver model. The other, more common one is JDBC. And the way to think about this is that ODBC is not specific to Windows anymore, but it sort of came out of that environment, and it's sort of designed for C or C++ applications. JDBC is designed for Java-based applications, applications that run on the JVM. This was developed by Sun in 1997, a few years after they released the Java runtime and the Java language. And again, the idea was that they recognized that they wanted Java applications to run in the enterprise, and enterprise applications need to talk to relational databases, SQL databases. So JDBC was their attempt to standardize an API for this. And again, like I said, you can think of ODBC as for C; JDBC is obviously for Java. What's slightly different in JDBC, though, is that because applications run in the Java runtime, they're not always going to have a driver written in Java. It's more common now, but back in the day, not so much, right? For ODBC, if you release your driver, it's going to be in C, and that would be universal enough.
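Back to the 64-bit versus 32-bit example for a moment: here's a sketch of what that driver-side conversion could look like, assuming big-endian integers on the wire (the actual encoding varies per system, and the function name is made up):

```python
import struct

def get_int32(wire_bytes: bytes) -> int:
    """The server shipped a 64-bit big-endian integer, but the API
    promised the application a 32-bit int, so the driver converts."""
    (value,) = struct.unpack(">q", wire_bytes)        # server's 64-bit form
    if not -2**31 <= value < 2**31:
        raise OverflowError("value does not fit the API's 32-bit type")
    return value

wire = struct.pack(">q", 1234)                        # what the server sent
assert get_int32(wire) == 1234
```

The range check matters: if the stored value doesn't actually fit the type the API promised, the driver has to surface an error rather than silently truncate.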
So as a way to bridge people to be able to use JDBC before there were enough drivers written in Java, JDBC has a bunch of different levels, or connectivity methods, you can use to communicate with the database system. The first one is exactly what I was just describing. Instead of having the JDBC driver communicate directly with the database system, you have a little bridge, some middleware sitting in front of it, that converts JDBC commands into ODBC commands. And then ODBC communicates with the database system over its proprietary wire protocol. This is actually not supported anymore as of JDK 1.8, and it was a pain in the ass to set up. I had to do this a few times when we were doing some testing, and it was a nightmare, really finicky. But again, it was a stopgap solution in case JDBC didn't have exactly what you wanted for your particular database system. The next approach is to have the JDBC calls just invoke the proprietary native API of the database system. SQLite would be the best way to think about this, because SQLite is an embedded database system: it runs inside the same process as your application. So you could have the JDBC driver invoke, through JNI, the C functions to execute queries or open up cursors or tables on the database system. But this doesn't work if you're going over the network; it's only for embedded databases running in the same process. The next approach is to have the JDBC driver communicate with some other middleware system that knows how to speak the wire protocol of the database system. It's sort of like the first one, but the idea here is that the commands coming out of JDBC are generic JDBC commands, and when they land on this middleware, which the database vendor would have to provide, it knows how to convert them into the wire protocol you expect. Whereas in the first approach, the JDBC driver is invoking ODBC commands.
The last one, which in my opinion is the best one, is where the JDBC implementation itself sends the packets you need for the wire protocol of the database system. This wasn't so common in the early days, but it's more common now: you can download a PostgreSQL JDBC driver written purely in Java, and you don't need to worry about any of the C stuff. And this will run the fastest, because you go natively from Java over the network to send the packets and get everything back, right? You sort of see this same issue in other languages too. Rust and Go aren't obscure, but in the early days there weren't a lot of drivers written for different databases in Go and Rust and whatever language you want to use. So a stopgap solution would be to wrap the ODBC driver: you'd have a Go wrapper around the ODBC driver, which was written in C. But that means you have to leave the Go runtime and go into the C code to send these commands. Whereas if you have a native Go implementation, that's going to be way faster, because there's less copying of packets, less copying of buffers. So a native implementation was always going to be the better way to do this. So as I said, all of the database systems implement their own proprietary wire protocol. And as far as I know, there's no system that doesn't do this: everything goes over TCP/IP. I don't know of any database system that goes over UDP. It would be tough to do if you want to support transactions, because you don't know whether your packets are actually going to show up, and you would need to send back acknowledgments. So for that reason, everyone goes over TCP/IP. Now, above that, they could have additional confirmation messages back and forth, like, did you get this packet? Yes, I got this packet.
That would be at the application level, but of course TCP is going to do that for you underneath the covers as well. So the typical way the client interacts with our database server is: they connect to the database system, go through some kind of authentication process, like usernames, passwords, Kerberos keys, things like that. Most systems also now support SSL by default, so you go through that process to encrypt the channel. Then you send over a query. The database system executes that query, serializes the results, sends them back over the wire protocol to the client, and the client hands them off to the application code. So what we're going to focus on today, and what the paper I had you guys read focuses on, is this problem here: how do you actually serialize the data and get it out? Because this is where the opportunity for optimization actually is. There are things that we can improve, right? There's not much we can do for the query coming in; there's not really much optimization available there, because most SQL queries are going to be short, right? They're strings we send over. You can maybe run them as prepared statements or stored procedures, but that's just reducing the size of the message, and most queries aren't that big. There are some applications I've heard of from people in the wild where the SQL query could be like 10 megabytes. You see this at big corporations where they have internal dashboards for reporting: you click a bunch of options to get all the people from some zip code or some geographical region, you click go, and whatever you clicked on the dashboard gets converted into a SQL query. And sometimes you can have these really long IN clauses, like WHERE zip code IN, and then a list of every single zip code. In that case, the SQL query can get really big.
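For a query like that, about the only remedy is compressing the SQL text before sending it. A quick sketch, with zlib standing in for gzip or Snappy, and the table and column names made up:

```python
import zlib

# A dashboard-style query with a huge IN clause of zip codes.
zips = ",".join(str(10000 + i) for i in range(20000))
sql = f"SELECT * FROM people WHERE zip_code IN ({zips})"

# Repetitive text like this shrinks dramatically under any codec.
compressed = zlib.compress(sql.encode())
assert zlib.decompress(compressed).decode() == sql
print(len(sql), "->", len(compressed))
```

Note this only shrinks the message; the server still has to decompress and parse the same giant query.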
But the only optimization you really can do for that is just compress it, with Snappy or gzip or whatever you want to use. So there aren't really any techniques inside the database system to make that part go faster. So we're going to be focused on the results side, right? So I said all major database systems implement their own proprietary wire protocol. That's not entirely true. Question? Are there any systems that do some kind of client-side processing, like basic SQL rewrites, that kind of thing? So, good question. The question is, are there any database systems where the client can actually be somewhat intelligent and maybe do compression, or maybe do some query rewriting, before it sends the query? I was also thinking you could maybe do client-side query caching as well. Not really, for a couple of reasons. Take the query rewriting example: I don't know of anybody that does that in the client driver. Like, if the predicate is 2 = 1 + 1, can I convert that to just true? You would basically need to implement the logic to parse the SQL statement and understand the semantics of what the WHERE clause is actually doing on the client side. And if you're supporting a bunch of different programming languages and have native drivers for each of them, now you've got to rewrite all that logic in every single language. For that reason, I don't think anybody does that. There was a system called FoundationDB, a distributed key-value store that Apple bought. It's now open source, and Snowflake uses it internally for its metadata, but only as a key-value store. Before Apple bought FoundationDB, they added a SQL layer, and all that layer did was, in the client, convert your SQL statement into the individual get commands you would send to the key-value store.
So that's the only example I can think of offhand where they do this kind of stuff. Other things, like caching query results: well, if you don't know on the client side what's in the database server, how do you know you have the latest version? So for this reason, everyone always goes over the network. I have thought about this; we haven't pursued it yet. There was some research done up at Waterloo where you could provide hints, like, hey, I'm sending you this query, but by the way, I'm going to execute these other queries pretty soon as well. But I don't think that ever made it into any commercial system. The client drivers are usually pretty brain-dead: just, give me this query, here's the result. Yes? Why can't we use an existing protocol to send the data across the network? What would be an existing protocol? When you say, why can't we just use an existing protocol, what do you mean, like TCP/IP? I mean exactly how the communication goes to the servers, sending data across. Why do we need to serialize the results? So when you say existing protocols, do you mean at the lower level in the stack, at the TCP/IP level, or at the application level? At the application level. Right, so what's an existing protocol at that level? TCP/IP sends packets across the network, but that's a layer below; we're above that, in the database server. So my question back is, what's an existing one? It's a leading question, because it's the next slide. OK, so as I was saying, all the major database vendors, Postgres, MySQL, and the big three, Oracle, DB2, SQL Server, pretty much all implement their own wire protocol from scratch. But where I thought you were going with that is that most of the newer systems don't do this; they implement an existing one, in particular the Postgres and MySQL protocols.
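At the byte level, speaking a protocol like Postgres's mostly means framing and unframing length-prefixed messages. The sketch below follows the shape of Postgres's simple-query message (a one-byte type tag 'Q', a four-byte big-endian length that counts itself, then a NUL-terminated SQL string); treat the details as illustrative rather than a complete implementation of the real protocol:

```python
import struct

def frame_query(sql: str) -> bytes:
    """Frame a SQL string roughly the way Postgres's simple-query
    message does: tag byte, length (counting itself and the trailing
    NUL), then the NUL-terminated query text."""
    body = sql.encode("utf-8") + b"\x00"
    return b"Q" + struct.pack(">I", 4 + len(body)) + body

def unframe_query(msg: bytes) -> str:
    """Reverse the framing: check the tag and length, strip the NUL."""
    tag = msg[0:1]
    (length,) = struct.unpack(">I", msg[1:5])
    assert tag == b"Q" and length == len(msg) - 1
    return msg[5:-1].decode("utf-8")

msg = frame_query("SELECT 1")
assert unframe_query(msg) == "SELECT 1"
```

If your new system can produce and consume messages like these, any off-the-shelf Postgres driver can talk to it, which is the whole appeal.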
And the benefit of doing this, and we do the same thing in the system we're building here, is that now I don't have to worry about supporting a bunch of different programming languages for my drivers, right, for JDBC and ODBC. If I speak the Postgres wire protocol, then anybody that comes along with a Postgres driver in Rust can just use that Postgres driver and communicate with my database system. Now, these aren't going to be the best protocols, and we can talk offline about the troubles we've been having with the Postgres one. But think about it: if you're a database startup, or a research project like us, do you want to be spending your time writing client drivers for every single programming language in order for people to use your database, or can you just piggyback on what's already been done in a large ecosystem and have people use what's already there? So that's why this is actually very, very common. But I will say, though, sometimes you'll see in the documentation of these different database systems that they're Postgres compatible or MySQL compatible. But just because they speak the wire protocol doesn't mean they're actually truly compatible. They're not always going to be an exact drop-in replacement, because there's a bunch of other stuff specific to Postgres and MySQL that you may not actually be emulating in your new system, and therefore it's not going to work. Your application code using JDBC or ODBC may start sending SQL queries in a dialect that your server doesn't support, and it's not going to work. So by following their protocol you'll get the transport from the client to the server correct, but then the query shows up, and now what do you do? It's not going to be exactly the same.
Other times, too, there'll be queries that break if you don't support the same catalog structure as Postgres and MySQL. A lot of these visualization tools, the first thing they do when you turn them on is go look in the catalogs and figure out what tables you have, what columns you have. But if your catalogs aren't formatted the same way that Postgres's and MySQL's are, then that's going to break. Then there are other things, too, where the tools go at the database server at sort of a physical level, like looking at the files on disk. There are a lot of extensions and tools for Postgres that can manipulate the actual files of the database without going through the database system itself. None of that's going to work, because under the covers your system is going to be completely different. Yes? For the Postgres and MySQL protocols, do the drivers essentially just write a string of SQL into some function that magically writes it over the network into a socket, and then get some result back in some form? So his question is, what do these drivers actually look like? At the bare bones, are they just: here's a SQL string, go run this for me, that gets converted into the packets sent to the database system, and you get back data in some binary form? In the simplest form, yes. But if you start doing prepared statements or cursors, you can specify, here's the query template I want to use, and the first parameter is this type and the second is a double. The API is actually kind of big, and not all of it is just passing everything through a SQL string. There are ways to construct queries programmatically through the API that aren't just, hey, here's the SQL string.
And then the API specifies that when you get back a result and you call next-row, and you say, for the current row, give me the third field as an integer, you need to actually get back an integer, not some other thing that your database has been storing internally. SQLite, for example, stores everything as varchars under the covers; it's only when a value gets exposed outside the database system that it gets converted to the correct binary form. So the driver has to handle that. Yes, front row. So SQL Server does support caching right now, right? Like, you can turn on caching. Sure, that's an implementation choice, yes. So my question is, why is that so different from caching on the client side? What's the difference between client-side caching and server-side caching? You talked about caching on the client end and said that you don't know the state of the server, so the cache could be stale. So why is it so different from caching on the server? So the question is about my comment on client-side caching versus server-side caching. Server-side query result caching would be: I send a SQL query, the server does some complex calculation and sends back the result to the client, but it also remembers that it executed that SQL query and what the result was. So now if someone comes along and executes the same SQL query, it doesn't have to recompute it; it just sends back the same result. An alternative could be that the client does this, so that the client says, I executed this same SQL query a few seconds ago, here's the result, just reuse that. The problem is, if you care about having the data be fresh and up to date, on the client side I don't know if somebody else came in and modified the tables and therefore invalidated my cached result. So you'd have to think about how you actually implement this.
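One way you could sketch an implementation, purely hypothetical since, as discussed, nobody really ships this: the client caches results keyed by the query and validates freshness against a server-side version counter, at the cost of an extra round trip. Everything here (the FakeServer, the version counter) is invented for illustration:

```python
class FakeServer:
    """Stand-in for a database server that bumps a version on every write."""
    def __init__(self):
        self.version = 0
        self.rows = [("andy",), ("matt",)]

    def run(self, sql):
        return list(self.rows)

    def current_version(self, sql):
        return self.version          # the extra validation round trip

class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}              # sql -> (version, result)

    def query(self, sql):
        hit = self.cache.get(sql)
        if hit and hit[0] == self.server.current_version(sql):
            return hit[1]            # still fresh: reuse, skip re-execution
        version = self.server.current_version(sql)
        result = self.server.run(sql)
        self.cache[sql] = (version, result)
        return result

server = FakeServer()
client = CachingClient(server)
r1 = client.query("SELECT name FROM users")
server.rows.append(("you",))         # someone else modifies the table...
server.version += 1                  # ...invalidating any cached results
r2 = client.query("SELECT name FROM users")
assert len(r2) == 3                  # cache noticed and re-ran the query
```

Whether the validation round trip is worth it depends entirely on how expensive the query is, which is exactly the trade-off raised next.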
You could then have the client go to the database server and ask, hey, has this cached result been invalidated, yes or no? If yes, you re-execute the query; if no, you just reuse what you have. I don't think anybody actually does that, though, because it's one extra round trip. What if it's a huge query? Couldn't it make a difference? Oh, that's the question: if it's a huge query, can it make a difference? Absolutely. If my query takes an hour to run and I can cache it, fantastic, right? We're not going to talk about this much this semester, and it's also not an area that I fully understand myself, but you can kind of get this on the database server side. So query result caching isn't always good, because it's coarse-grained: you have to cache the result for the entire SQL query. So if I have a giant select query with a predicate, say WHERE name = Andy, and I cache that, then when your query shows up with your name, I can't reuse anything that I computed, even though maybe only one piece has changed. So a very common thing to do instead is what's called materialized views, where you can actually materialize pieces of that query. Maybe you pre-materialize some expensive join, and then the predicate that comes in knows how to do a filter on that materialized view. And the high-end systems, SQL Server in particular, know how to reuse materialized views across different queries. But again, for your example, you would need an explicit version-validation message, and I don't think any system actually does that. All right, you had a question too, yes. So for cursors: you said that if my database does not support cursors, it can send back the whole result, and then the client side can go through it one by one, right? Yes. But even if my database supports cursors, why would it return only one tuple at a time? Wouldn't it make sense to return a good chunk of the result? Otherwise you're going back and forth all the time.
Yeah, so his question is, why would I ever want a cursor that returns one tuple at a time versus just sending everything all at once? I don't know the exact details of how cursors work; I think they vary per system. It's not one message per tuple; they send a batch. But the question is, if I have a million results, do I send all one million, or 10,000 at a time? So basically, if your database does not support cursors, it has to send all one million. Correct: if your database system doesn't support cursors, you have to send all one million results. And again, think about what a database server looks like versus what an application server looks like. A database server is usually going to be running on high-end hardware. (Now, if you have a distributed database, the world's different.) Think of a big, heavy, expensive machine that's got a lot of memory and a fast disk. The application servers can be these off-brand EC2 instances; they're not going to have as much memory. So maybe I don't want to blast them with all one million tuples. That's sort of the thinking behind it. I actually don't know how different systems do different things, because you could imagine implementing this in different ways too. It could be: I execute the query, I buffer the result, you call get-next on the cursor, and I'm just feeding through that buffer. Or I could set it up with a pipeline breaker, where I say, here's the final result of the query, and it calls get-next, get-next inside the execution engine, gets a bunch of tuples, but once I reach my cursor limit, I hand that batch back to the client. Because maybe they're not going to come back and ask for more. If they do come back and ask for more, then I'll just execute the rest of the query. You could stage it that way.
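The batch-per-round-trip idea can be sketched like this (a toy: the batch size and the buffer-everything strategy are assumptions, and real systems differ):

```python
def cursor_batches(rows, batch_size):
    """Server-side sketch: buffer the full result, then hand the client
    one batch per round trip instead of one tuple or all of them."""
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]

result = list(range(25))                 # pretend query result
batches = list(cursor_batches(result, 10))
assert [len(b) for b in batches] == [10, 10, 5]
```

A smaller batch size means more round trips; a bigger one means more memory pressure on the (often weaker) application server, which is the trade-off just described.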
I think everyone buffers the whole result and then gives you a handle to a cursor over the buffer, but you could implement it the other way. I think that's how it was originally envisioned, in fact; this is like the 1980s, in the volcano model. The volcano iterator pipeline model, same thing. OK. So this is just a table showing what other systems are actually using the MySQL or PostgreSQL wire protocol. I thought there were actually even more systems using Hive, but the only one I can find that speaks the Hive wire protocol is Spark. So MySQL's is very common. A bunch of these, though, are hacked-up versions of MySQL. Like MemSQL: in the very beginning it actually was not a full-fledged database system, it was a storage engine, like InnoDB or RocksDB, that sat underneath MySQL. So they got the MySQL compatibility and wire protocol for free. Eventually they ripped all of that out and wrote everything themselves, because otherwise you'd violate the GPL, since they're proprietary. So now they just speak the MySQL protocol; they are MySQL compatible. Cluster is based on MySQL, but it has a layer above that that's distributed. There's a framework called MySQL Proxy that gives you a middleware front-end implementation of the MySQL wire protocol. And a bunch of these other ones are, again, just speaking the wire protocol. Amazon Aurora is slightly different; we'll talk about that in a second. Then here's a bunch of systems that are based on Postgres. A couple of these are written from scratch, like Hyper, CockroachDB, our old system, and Umbra: systems where we looked at the specification of the Postgres wire protocol and re-implemented it in our system. Then there are things like Redshift, Greenplum, Vertica, and I think maybe Yugabyte.
The original code they started with was Postgres, and they ripped out the parts they didn't want and then rewrote it to do whatever they wanted to do. So they kept that front-end piece of code, the networking layer, from Postgres. So the reason why Aurora is different is because Amazon controls the whole stack: if you're running in their cloud center, they control the networking layer, the execution layer, and the storage layer. So what they do is actually super interesting, at least for MySQL. I don't know if they do this for Postgres. They actually take out some of the networking logic for the MySQL wire protocol and shove it up into the Amazon load-balancing networking layer. So now when your application speaks to your database instance running on Aurora, you may not actually be hitting the actual Aurora system itself. You're hitting this networking layer up above that does load balancing for you, which is really fascinating. And what they can do is switch over sort of transparently from one instance to another. Like, say I'm running on a small machine and I ask Amazon to upgrade my Aurora instance to a bigger machine. In the networking layer, because it's now decoupled from the actual database server, they can copy over your session variables from all your clients from one server to the other and then just have the networking layer switch over to the new instance. And it looks like you magically just got more RAM, more CPUs, without closing any of your connections. So that's crazy. And I think only Amazon can do that because it's Amazon, right? So we can take this offline. Implementing this has been a huge pain in the ass. It's been awful. Talk to Matt about it. This is our third attempt trying to re-implement the Postgres wire protocol. And it's really ugly because they have two different modes, which is a mess. Okay. All right.
So let's talk about the paper you guys read for this class. So as I said, where we have an opportunity to get better performance and improve the wire protocol is when we serialize the data back to the client. We can ignore the cursor stuff; just assume they ran a query and they want a bunch of results. The paper I had you guys read came from the same authors that are building DuckDB, the original people behind MonetDB at CWI. And what they were focused on was doing these large data exports. So not so much running complex queries and getting back aggregated results, but: I have my Python or data science application, and I have all my data in my SQL database; how quickly can I get that data out and put it into my machine learning pipeline? So a lot of the things we're talking about today look like all the storage stuff we talked about before: how to do compression, how to do the row versus column stuff. Because that's essentially what it is, right? We have some giant chunk of data. What's the optimal way to organize it to send it over the wire? And it's basically the same question as when you want to store a bunch of data on disk or in memory: what's the optimal way to do that? The other thing I'll say is that whatever we do here, however we organize the data on the server side and send it over to the client, the client has to be able to interpret that and reverse it. So that's going to somewhat limit what we can actually do, because we don't want to burden the client with doing a ton of work. Now again, they're focusing on these bulk exports of data, but a lot of times it's the same client driver: the client driver you would use for a small application is the same one you would use for an application running on a giant Xeon box.
Now, no one's downloading a terabyte of data to their cell phone, that's stupid, right? But it just means that whatever optimizations we apply here on the data, the client needs to be able to reverse and interpret them. So that means we can't be too heavyweight, because we don't want to have to duplicate the optimizations everywhere, right? Again, if I want to support all the different programming languages and I have native drivers for each of them, whatever I need to do to decompress data in one language I've got to do in another one too, right? So it's sort of the lowest common denominator, which is somewhat limiting, but it is what it is. All right, so the first thing to talk about is row versus column. ODBC and JDBC are inherently row-oriented APIs. The servers are always going to package up the tuples in the result one tuple at a time and package that as a single message that gets sent over the network. Now, we can stream a bunch of messages all at once, but the way these servers are mostly written, it's assumed that I'm going to iterate on the server side over one tuple at a time, construct a message that says here's what's actually in this tuple, pack as many of those messages as I can into my packet, ship that over, and then the client has to reverse that. So back in the 1990s, when they first started building ODBC and JDBC, this made sense, because you were either running OLTP applications where the result sets are small, like go get Andy's account record, go get all the orders that Andy bought from Amazon. That's not a lot of data and there are not a lot of tuples. If you were doing analytics, oftentimes you would just be computing the sales totals on a quarterly basis and then doing some aggregations on that. So the query is a bit more complex on the analytics side, but the amount of data you're getting back is not a lot.
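The row-oriented message framing described above can be sketched like this. The format is invented for illustration (a length header per message, a 4-byte length prefix per field); real ODBC/JDBC wire formats differ per system.

```python
import struct

# Sketch of a row-oriented wire format in the ODBC/JDBC spirit: one message
# per tuple, each field length-prefixed. The layout is hypothetical.
def encode_row(fields):
    msg = b""
    for f in fields:
        data = f if isinstance(f, bytes) else str(f).encode()
        msg += struct.pack("<I", len(data)) + data   # 4-byte length prefix
    return struct.pack("<I", len(msg)) + msg         # message length header

row = encode_row([123, "andy"])
# 4-byte header + (4+3 bytes for "123") + (4+4 bytes for "andy") = 19 bytes
```

Note how much of the payload is per-field framing overhead; that per-tuple bookkeeping is exactly what the columnar batching below tries to amortize.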
But again, as I said, the DuckDB guys are focused on exposing the database contents to modern machine learning software. And in this world, we want things as columns or matrices, so this row-oriented approach is limiting. So what they propose is that instead of sending data over in a row-oriented format, we can send it over as vectors. In particular, the way they do this is they organize a bunch of tuples together within a block, and then within that block, they're organized as columns. So it's not like I'm going to stream over one column at a time in a bunch of messages and then, when I'm done with that column, switch over to the next column. They're going to say, here's a bunch of tuples, convert it to a column store within that block, and then ship that block over. And then for the next batch of tuples on the server side, you do the same thing. We didn't talk about this exactly when we talked about storage layouts, but this is called the PAX model. It was actually invented not here at CMU, but by a professor who used to be here at CMU, and now she's at EPFL. And this is actually what we do in our database system now. We organize our data as one-megabyte blocks, and then within that block, we have all the attributes or fields for the tuples, but each of those fields is organized as a column store, because we follow what Arrow does, and this is also used in Parquet and ORC. Yes? Is this approach still effective for ODBC? Is it better to just stick with ODBC? The question is, is this approach still effective for ODBC? So again, ODBC doesn't know about any of this. ODBC doesn't know that I got a batch of tuples back and they're organized as columns. Now, the programming interface is definitely row-oriented. I get a cursor, I call getNext, and then I do whatever I need to do on that single tuple. I can't tell ODBC, give me the vector of values for a single column across all tuples.
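The PAX-style batching can be sketched in a few lines: take a batch of row tuples, but lay the batch out column by column before shipping it. This only shows the transposition idea, not a real block format (no headers, no fixed block size).

```python
# Sketch of a PAX-style block for the wire: within a batch of tuples,
# values are grouped per column rather than per row.
def to_pax_block(tuples):
    """Transpose a batch of row tuples into per-column vectors."""
    return [list(col) for col in zip(*tuples)]

rows = [(1, "a"), (2, "b"), (3, "c")]
block = to_pax_block(rows)
# block == [[1, 2, 3], ['a', 'b', 'c']]
```

Each per-column vector now holds values from the same domain, which is what makes the columnar compression discussed next pay off.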
So I think what they're proposing here is that you'd have to rewrite or have your own API at the application side to support getting these vectors. I actually don't know whether ODBC has a way to say, give me my result set and just dump it out all at once. All the code I've written using ODBC or JDBC has always been a while loop calling getNext. So the benefit we're going to get from this is that if we have everything as a column store, then we can apply all the techniques we talked about last week for compressing our data, because we know all the values are going to be within the same domain and we're going to get a better compression ratio. So it's the same thing we talked about before. We can do the naive compression approach, where we just take the bytes in the block we're sending over and run gzip or snappy on it, send that over the wire, and the client uses the same compression algorithm to decompress it. The other approach is to do columnar-specific encoding like RLE, dictionary encoding, or delta encoding. As far as I know, no system actually does that. The naive approach does show up in some systems: MySQL will actually use gzip, of all things, to compress data. So as they talk about in the paper, obviously you're going to get a better compression ratio the more data you have. And what's also nice is that for larger blocks, this is actually going to work even better. And they say that this is the better way to do it because the approach is agnostic to what the actual data is. So there's no logic on the client side to say, oh, I'm looking at this column and it's delta encoded, so I need to replay the delta to get back the original values. If I'm just using snappy, I just take my byte array, run it through that, and then I get everything out that I want.
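The naive block-compression approach can be sketched with the standard library. snappy isn't in Python's standard library, so zlib at level 1 stands in here for a cheap, fast codec and level 9 for a heavyweight one; the trade-off is the same shape as the paper's gzip-versus-snappy discussion.

```python
import zlib

# Naive block compression: the server just runs a general-purpose codec
# over the serialized block; the client reverses it with the same codec.
# No columnar-specific logic on either side.
block = b"1234,1235,1236,1237," * 1000   # repetitive column-like data

cheap = zlib.compress(block, level=1)    # low CPU cost, larger payload
heavy = zlib.compress(block, level=9)    # high CPU cost, smaller payload

assert zlib.decompress(heavy) == block   # client recovers it losslessly
assert len(heavy) <= len(cheap) < len(block)
```

On a fast network you'd pick the cheap codec (or none at all); on a slow one, the extra CPU work of the heavy codec buys you fewer bytes over the straw.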
The other thing they also talk about in the paper is that if the network is slow, then a more heavyweight algorithm like gzip will be preferable over something like snappy, because you're paying a higher CPU overhead cost to get a better compression ratio, and that's usually going to be a good trade-off. Again, what does it sound like? It sounds like a disk, which it is. It's essentially a slow piece of hardware that we need to get data through like a straw from one side to the other. So again, if we know it's going to be slow or we're going to have low bandwidth, then we want to do as much work as we can on the ingestion and reading of the data to minimize that overhead. So for this reason, they claim that this is better, and like I said, I don't know of any system that actually does this. Now we get to the question of how we're actually going to represent the data, or how we're going to serialize it. So the first approach is binary encoding where, just like when we talked about alignment and layout, we represent the data in its binary form, which is typically native to whatever C or C++ or the hardware gives us, with the IEEE 754 standard. So what will happen is the client side will be responsible for handling endianness, because the server doesn't know what you actually want. The server will send it out in one endianness, and the client says, oh, well, the server is running x86, and therefore it gave me little-endian, but mine running on my cell phone needs big-endian, so the client is responsible for reversing that. But that's not too expensive, other than having to copy data.
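The endianness handling can be shown concretely with Python's `struct` module. Little-endian stands in here for an x86 server; the big-endian client is hypothetical.

```python
import struct

# Endianness sketch: the server serializes a 32-bit int in its native
# byte order; a client with the opposite order must byte-swap it.
value = 123456
wire = struct.pack("<i", value)            # server ships little-endian bytes

as_little = struct.unpack("<i", wire)[0]   # client with matching order
as_big = struct.unpack(">i", wire)[0]      # misread: no swap performed

assert as_little == 123456 and as_big != 123456
# a big-endian client reverses the bytes before interpreting them
swapped = struct.unpack(">i", wire[::-1])[0]
assert swapped == 123456
```

The swap itself is just a byte copy, which is why the lecture calls this cheap compared to text parsing.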
Another thing that's going to be important, too, is that if our wire protocol can match whatever format the database system uses to store data in the actual columns or rows themselves, then it's going to be super cheap, or low overhead, to take data out from the result of a query, put it into a packet, and send it over the wire. What I mean is, using my old system as an example: we built our execution engine and storage manager separately and then sort of grafted them onto Postgres, and when we actually wanted to put the results of queries into a packet and send them over the Postgres wire protocol, how it represented strings was different from how we represented strings. So we would then have to convert, which basically means copying the data out of our original format and converting it to Postgres's format. That extra copying starts to add up, and if you're doing this for every single query and every single tuple and every single result, then you're spending all your time just doing copying and serializing and deserializing, which sucks, right? I mean, with Peloton, we made another mistake: we had like three copies of the data just to get it over the wire protocol. Two of them you needed; one of them was superfluous. That's probably part of the reason why we killed that system. The other thing you can do, instead of implementing your own binary format for the messages, is rely on the open-source serialization formats that are out there. The two most famous ones are Google's Protocol Buffers and Facebook's Thrift.
Think of it as: you define in a DSL what a packet looks like, what the types are and so forth, and then they have a way to compile and generate code that you can use to build structs and fill out these buffers, and they'll serialize it for you so that you can send it over the network. I don't know of any system that actually uses Protocol Buffers, or FlatBuffers, which is the newer version of Protocol Buffers. I think it's zero-copy, so it's faster. Hive uses Thrift. Thrift brings you a bunch of other stuff like RPC and other networking communication protocol machinery, which is more than you may actually want, but Hive is the only one that uses it. I would not recommend these things, because they're actually going to be a bit verbose. They're going to record all this extra metadata to tell you the type of the data you're actually storing. In our own networking protocol, if we do it ourselves, maybe we just say in the header of the packet, hey, we're about to send you a bunch of tuples that have these types, here they are. These frameworks make multiple copies of that metadata for every single message. For this reason, I don't recommend them. You want to roll your own. In addition to the metadata about what my types are, we've got to keep track of how we represent nulls and the sizes of our data. All of that gets expensive, and some systems do it better than others. The alternative is that the server just sends over all your data as plain text. The advantage of this is that you don't have to worry about any endianness. It's now the client's responsibility to figure out how to actually interpret what it is that you're looking at. Think of it this way. Say on my server side my query result has one tuple with one attribute, the number 123456. That's a 32-bit integer; I can represent it in four bytes. In binary form, I just say, hey, I have a four-byte integer.
The alternative would be that I convert 123456 into the string of characters "123456" and send that over. Now in that case, assuming I'm ASCII encoded, I need one byte for every single character. I have six digits, so I need six bytes for those. But then, depending on how I'm representing the length, or how long the thing actually is, I would need an extra byte, potentially: here's the null terminator, or here's the length of the string. And then on the client side, I just call atoi, if you're using C, to convert this string into the correct form. Then we can hand it off to ODBC or JDBC. So this seems kind of crazy, but you see this in systems that support essentially JSON types or JSON result sets as the output of a query. MonetDB, as we'll see in a second, actually supports this as well. I forget why. I think it's just for historical reasons. But as we'll see in the results, the binary form is always the better way to do this. Okay? All right. The last thing we've got to talk about is how we actually handle strings, and I sort of mentioned this just now. So there are three approaches, the same ones we'd consider inside our database system. Do we just do the standard C way and have a null byte at the end of the string to denote the end of the string? So if I'm parsing the packet that I'm getting over the network, as soon as I see the null terminator character, I know that this is the end of the string. The advantage of this is that on my client side, I can use all the standard C string functions, because they know how to operate on strings with null terminators. The other approach is, again, how we do this natively in our database system, where we just prepend the length of the string to the beginning of the bytes, and the client knows how far it needs to jump ahead to find everything it needs.
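The four-bytes-versus-seven-bytes example above can be checked directly. The binary form is fixed size; the text form costs one byte per digit plus a terminator, and the client pays a parse (atoi in C; `int()` here) on every value.

```python
import struct

# Binary vs. text encoding of the same value, 123456:
binary = struct.pack("<i", 123456)   # 4 bytes, native 32-bit int
text = b"123456\x00"                 # 6 ASCII digits + NUL terminator

assert len(binary) == 4 and len(text) == 7
# the text client must parse the characters back into an integer
assert int(text.rstrip(b"\x00")) == 123456
```

The size gap grows with wider types (a double as text can take 20+ characters versus 8 bytes binary), which is part of why binary wins in the paper's results.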
The last one would be like a char field instead of a varchar, where you have a fixed number of bytes to represent every string. For every attribute of a tuple, you have a fixed size, and you just pad out whatever the value is for each individual tuple to fill out the rest of that allocated space. This seems kind of wasteful as well, but if my char field is specified to be 16 characters and most of my strings are eight or nine characters, I'm going to have a bunch of zeros, and if I'm using gzip or snappy, that's a good opportunity to get good compression, because it's going to see repeated byte sequences over and over again. I think, though, in the paper they discuss that this only really makes sense and gets the best performance if all your strings are like one character. The paper also talks about how sometimes one approach is faster, sometimes another is faster. It just depends on what your strings look like, and as far as I know, no system actually implements more than one. You just pick one and stick with it, because it'd be too much engineering overhead on both the server side and the client side to support several of them. The other thing to point out is that the performance you get from how you represent strings is not independent of the other choices. If I'm doing compression, then sometimes one of these is really good and the others are bad. So, again, there's a multiplicative effect based on how we design our protocol, what compression scheme we use, and whether we do row store versus column store. All of these things combined affect what the performance is going to be. All right, so let's just show two quick graphs. They did two experiments in the paper.
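The three string layouts can be sketched side by side. The 16-byte width for the char-style layout is an arbitrary example value, not anything from the paper.

```python
import struct

# The three string layouts discussed: NUL-terminated, length-prefixed,
# and fixed-width padded (char(16)-style).
def null_terminated(s):
    return s.encode() + b"\x00"          # scan until the terminator

def length_prefixed(s):
    data = s.encode()
    return struct.pack("<I", len(data)) + data   # jump ahead by the length

def fixed_width(s, width=16):
    return s.encode().ljust(width, b"\x00")      # pad out to the char width

assert null_terminated("andy") == b"andy\x00"
assert length_prefixed("andy") == b"\x04\x00\x00\x00andy"
assert len(fixed_width("andy")) == 16
# the fixed-width padding is runs of zero bytes: repeated sequences that
# a general-purpose codec like gzip or snappy compresses well
```

Note the interaction the lecture points out: which layout wins depends on your string lengths and on whether a compressor runs over the block afterward.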
The first one is, we're going to measure the time it takes to send one tuple and the time it takes to send a million tuples. They compare against MySQL, both with and without gzip enabled, MonetDB, Oracle, MongoDB, DB2, and Hive. So, again, although MongoDB doesn't support SQL, they still support an ODBC driver, so this works. What you see going across is that Hive is actually going to do the worst here, followed by DB2, but then these other ones are roughly all about the same. What's interesting to point out, though, is that MonetDB is the only one of these doing text encoding, where everything else is doing binary encoding, and it still outperforms some of the ones doing binary encoding. And in the case of DB2, I don't know why this is slow. We'll see in the next slide it's more pronounced, but at least with Oracle and DB2, they have their own confirmation message that the client sends to the server to say, hey, I got what you just sent me, send me more. I think that's sort of redundant on top of TCP, but maybe it's from a day when you wanted that kind of extra safety. And then Hive, as I said, is just sending way more data. So, for this one, again, this is only one tuple at a time, and they avoid all the overhead of parsing the SQL statement and running the optimizer and running the query. They run the query multiple times and have the database system cache the query plan, not the result, just the query plan. So this is just measuring the overhead of constructing the packet that we send over the wire protocol. In the next experiment, they're going to send back a million tuples. And the first two results to point out are with MySQL. Along the x-axis, we vary the latency of the network messages. So when your network is really fast, up until this point here, the regular MySQL protocol without any compression is going to be better.
But once, obviously, the network gets slower, then doing that extra CPU work to compress the data actually pays off. In this case here, MySQL with compression always gets about the same performance, because the dominant cost is calling gzip. For all the other ones, the curves look about the same. Obviously, when the network gets faster, they do better, and as the network gets slower, they do worse. What's interesting to point out, though, over here is that Oracle is the second fastest one after MySQL when the network is fast. But then up here, it actually ends up being one of the slowest ones. I don't remember what the paper said about why this is the case. DB2 here is an order of magnitude off of Postgres. DB2 is doing like 500... Is that 500 up there? And Postgres right below it is around 50 seconds. So that's just an example of how that confirmation message they send is super inefficient when the network gets slow. They can't send the next batch of tuples to the client until they get that second round trip saying, hey, send me more. Any questions about this? Why is MySQL faster than MySQL with compression? Sorry, this one here? Yeah. Why is this faster than this? Yeah. Because I'm sending back a million tuples. I don't remember the exact size of that. But when the network's really fast, I don't want to pay the penalty to compress it. I just want to shove it over the wire as fast as possible. And in this one here, the dominant cost is the CPU overhead of compressing and decompressing. That's why, again, it's basically a flat line. No matter how slow the network gets, the benefit you're getting from sending less data over a slower network is negated by the computational cost. Now, they could have used snappy. That might change the curves a bit. But it's basically saying, and this is why I highlighted this one first, that when the network is fast, you don't want to do any compression. You just want to shove data out as fast as possible.
But as the network gets slower, you do want to do this. So now you say, all right, well, in what case would the network get slower? If the database server is running in another data center, then I can get up to 100-millisecond latencies. Yes? Your statement is, shouldn't they be varying the bandwidth and not the latency? I mean, they're related to each other, but I think in the case of DB2, sending those confirmation messages, that's not a bandwidth issue. That's a latency issue. The statement is that what matters for sending data is the network bandwidth, unless the latency affects the... Let's take this offline, because I don't remember exactly what the setup was. I think the client has to get the result and immediately throw it away. But in this case here, the cost of decompressing on the client side is being measured as well. You can't hide that. So I think that's why the latency matters along with the bandwidth. Let's do that one offline. Okay. We have 20 minutes. I'm actually going to skip replication protocols. We basically covered this in the intro class already. The only thing I have to say, relating to what we talked about last class with logging and recovery, is that sometimes you do logical logging, sometimes you do physical logging, and that's an internal protocol that the database system uses, separate from the client protocol you would use to communicate with it, like ODBC and JDBC. And some of the optimizations you can do, again, for compression and other things, or how you batch up the log records, depend on what consistency guarantees you want for your system. There's not much more to say other than that. But we can talk about that offline if you want, as well. All right, so just to finish up real quickly, because this is something I do want to expose you guys to.
In the experiments from the DuckDB guys' paper, as they were showing, they vary the network latency, but that's not the only bottleneck you're going to have. The network itself is not the only bottleneck in your database server's communication with the client. Oftentimes, the OS is going to cause problems, especially if you're doing OLTP applications where you're sending a lot of small packets instead of giant buffers of things. And the reason why it's going to be expensive is because communicating with the OS is always going to be a nightmare for us as the database system. It's our frenemy. We need it to survive, but it always gets in our way. So if we rely on the OS's TCP/IP stack, then we have context switches, handled through interrupts, in order to be notified that we now have a packet we want to get, and copying data from our buffers in the database server down into the TCP/IP buffers in the kernel. That can be expensive. And of course, the OS is going to maintain its own latches to protect its internal data structures, and those are going to get in our way if we have a lot of threads trying to write data at the same time. So one way to avoid all this is through what are called kernel bypass methods. The idea here is that we implement the database system so that it can write data directly to the NIC, to the actual hardware network device, by going around the OS, right? And so the idea is that we can have a buffer of data that we fill in with the result of a query, or the packets we want to send over using our wire protocol, but that buffer actually lives down in the NIC, so we don't have to do any copying to get it off to the hardware. We fill up a buffer, say we're done with it, and then we can immediately go over the wire and send our messages.
So the OS doesn't get in the way at all. There are two ways to do this. There's the Data Plane Development Kit, or DPDK, and then there's RDMA, or Remote Direct Memory Access. Now, it's not a true apples-to-apples comparison, because DPDK is a library you can download, originally from Intel, that provides this kernel bypass method, while RDMA is more a category of hardware and software libraries that you can get for your system. So the point is, there's a specific thing you can download called the DPDK, and that is a library. There's no single library called RDMA you can download; it's a broader concept. So with the DPDK, the idea is that it's this library that Intel originally developed to help sell Intel's hardware, but they eventually donated the software to the Linux Foundation. It's a bunch of libraries, or API calls, that allow us to access the NIC directly. We can say, give us a buffer that's on the NIC, fill it up, and then tell the NIC to go ahead and write our data to this location. So we're almost treating the NIC like a bare-metal device. Yes? Does that buffer live on the NIC itself, or in memory? Where does the buffer actually live? I actually don't know. I mean, the NIC has its own memory buffer. I think when you get a buffer through DPDK and you first do your write, it lives in the CPU cache, because it has to, that's where you write everything. But then when it gets flushed, it doesn't get flushed to DRAM, it gets flushed to the NIC. I think that's how it works. Is it also a context switch if my process is writing to the cache lines that are backed by the NIC, and I get a context switch, does that then get flushed? I think so, yes. The OS doesn't control any of this data movement. The hardware is actually providing this functionality for us.
And Intel, in their world, they're trying to sell hardware, so they're making it easier for you to write to their hardware. The DPDK came out of Intel, but now it's supposed to be this broader thing that other vendors can implement. I don't know whether you can get this with anything that's not an Intel NIC. So again, there's no data copying, because we can write directly into the buffers that are on the NIC, and there are no system calls to send any messages. We're basically going directly from our database server to the hardware and saying, send our messages to this location. The OS doesn't get involved at all. It's awesome. As amazing as this sounds, there's only one database system that I'm aware of that actually uses this. It's called ScyllaDB. They implemented this framework called Seastar, which is a networking framework that uses the DPDK, and then ScyllaDB is built on top of it. ScyllaDB is a C++ re-implementation of Cassandra. Cassandra is written in Java; this is written in C++ and uses the DPDK to get faster messaging. Right? So, as amazing as this sounds, as I said, this is not that common. You can get instances on EC2 that do support the DPDK, but the cheaper ones we're running on for this class don't support it. The other tricky thing is that this sounds amazing, and it's kind of seductive, but someone tweeted at me once about using the DPDK, and I think the comment is fantastic: it seems like it's going to be a really good idea, but then when you really start pushing it, there are all these nuances that start to trip you up. It's not something you can just plop in, link in a library, and automatically get. You have to rewrite your database server to use this. That could be a major change. In the case of ScyllaDB, they're built on top of Seastar, so all the complexities of the DPDK are hidden under the covers.
For this reason, as I said, because you have to rewrite your server, not many database systems use this. The other one is RDMA, and the idea here is that if I have multiple machines, I can have the client or the server read and write memory locations on a remote machine. The way to think about this is: if the client driver on my application server knew the memory layout of the data on the database server, then instead of sending a SQL query to be parsed and executed and getting back a result set, I could just read directly from that memory and get the result I want. That's actually super hard, because if the server starts moving memory around, everything is going to get messed up. The only two systems that I'm aware of that actually use RDMA: the most famous one is Oracle RAC. It's basically a multi-million-dollar cabinet of high-end servers that use RDMA to talk to each other within the rack. In that case, Oracle has designed the system specifically to do shared memory buffers using RDMA. FaRM is a distributed transactional system out of Microsoft Research. I don't think it's actually running in production, but the tricky thing about RDMA is you don't know whether someone's reading your memory or writing your memory, because it's all handled through the hardware; it's all hidden from the OS. It's not like I get an interrupt to say someone wrote to memory at this location. I don't know. In order to do transactions correctly in FaRM, they basically have to do four-phase commit over RDMA, which is faster, but it's still four-phase commit to be able to say, all right, I've made these changes, are you okay with that? So, again, for this reason, you would only use RDMA internally to communicate between servers of the same database instance.
There's been no work, as far as I know, on doing RDMA from the client to the server, because if memory gets moved around, then you're screwed. So, I want to show one quick graph on this. This is an experiment that a former student of mine ran on our system where we just wanted to see how fast we could get seven gigabytes of TPC-C tuples out of the database. With RDMA, the client is actually reading the server's memory directly: it knows the layout, the starting and stopping addresses for all the blocks of data it wants to read, and the question is how fast we can get everything out. At the low end is Postgres, where you're actually iterating and going through the Postgres wire protocol. If you then do what the DuckDB guys proposed in the paper you read, vectorizing the blocks you send back, you do a little bit better. Arrow Flight, from the Arrow folks, is a gRPC-based way of getting raw blocks of data out. But RDMA is the bare-bones approach: if I know exactly the memory addresses I want, I can jump to those locations on the remote machine and get all the data back I need. So this just shows you can get about an order-of-magnitude performance improvement if you don't have to go through any of the database server software. Okay? So, this is a bit rushed at the end, and I've covered most of this already. The network wire protocol is something we sort of take for granted, but I think there's a lot of opportunity for performance improvements, and this is what the DuckDB guys showed. The problem, though, is that you have all of these drivers out in the wild, and in some ways the protocol is now set in stone. 
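To make the vectorization idea concrete, here is a minimal sketch, not DuckDB's or Postgres's actual code, of the difference between sending one network message per row and batching rows into fixed-size blocks before flushing. The names `BlockWriter`, `kBlockSize`, and `AppendRow` are made up for illustration, and a vector of strings stands in for the socket:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sketch: accumulate serialized rows into a fixed-size block and flush the
// whole block at once, amortizing per-message overhead over many rows
// (versus a Postgres-style one-DataRow-message-per-row protocol).
struct BlockWriter {
    static constexpr std::size_t kBlockSize = 4096;  // assumed block size
    std::vector<std::string> sent_blocks;  // stands in for the socket
    std::string current;                   // block being filled

    void AppendRow(const void* row, std::size_t len) {
        // Flush first if this row would overflow the current block.
        if (current.size() + len > kBlockSize) Flush();
        current.append(static_cast<const char*>(row), len);
    }

    void Flush() {
        if (!current.empty()) {
            sent_blocks.push_back(std::move(current));  // one "send" call
            current.clear();
        }
    }
};
```

With 32-byte rows, 128 rows fit per block, so a million-row result set costs thousands of sends instead of a million; that amortization is the core of the batched approach.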
It would be a major change to go back and have everyone link in new libraries; that's never going to happen. So if you want to update your protocol and support some of these optimizations, you essentially need to support both, because you never know when a client is going to show up and connect with the old protocol. Mike Stonebraker told me that when they were building Vertica, one of the big things they spent a lot of time on, because they wanted to be Postgres-compatible, was that they basically found every single jar file of the Postgres wire protocol that was out there and ran every single test over and over again, because there are so many one-off variances among all of them. They spent a lot of time just making it Postgres JDBC-compatible. And there are a lot of options to avoid the OS and avoid having to go through the front-end layer, but it's usually more work than it's actually worth. Okay? All right, so quickly, project two. The plan for project two is that everyone, in teams, is going to implement your own B-plus tree: an in-memory B-plus tree that has to be thread-safe. That means you have to handle splits and merges while multiple threads are accessing the index at the same time. The basic API you have to implement is insert, get, delete, and range scans. You also have to support conditional inserts. The way that works is I call insert with a key, but I also pass you a lambda function that you need to evaluate on the key, and if that lambda function evaluates to true, then you're allowed to insert; otherwise, you can't. That's basically how you can do not only a compare-and-swap but also replace a key-value pair without having to do multiple traversals or lock the whole tree. You need to support forward and reverse range scans. That means you need sibling pointers, and you've got to support unique and non-unique keys. So that's all we're going to tell you you have to do. 
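The conditional-insert API described above might look something like the following sketch. This is not the course's actual stub code; `Index`, `ConditionalInsert`, and the predicate signature are assumed names, and a `std::map` stands in for the B-plus tree just to show the semantics (a real submission would evaluate the predicate during a single thread-safe traversal, not use an STL container):

```cpp
#include <cassert>
#include <functional>
#include <map>

// Hypothetical API shape: insert (key, value) only if the predicate,
// evaluated against the existing entry for that key, returns true.
// Checking during the same traversal avoids a second lookup or a
// whole-tree lock.
template <typename K, typename V>
class Index {
    std::map<K, V> data_;  // placeholder for the actual B-plus tree
 public:
    bool ConditionalInsert(const K& key, const V& value,
                           std::function<bool(const V&)> predicate) {
        auto it = data_.find(key);
        // An absent key always inserts; an existing entry must pass the
        // caller-supplied predicate before it can be replaced.
        if (it != data_.end() && !predicate(it->second)) return false;
        data_[key] = value;
        return true;
    }
};
```

This is how a caller can express "replace this key-value pair only if it still holds the value I expect" in one shot, which is the compare-and-swap-style behavior mentioned above.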
There will be some stub APIs and class files we'll give you. How you actually want to implement this is left entirely up to you. You can't do something stupid like take an STL map, put a latch in front of it, and have that be your tree. But whatever optimizations you want to do for compression, whether you make it latch-free or not, and how you actually implement the traversal itself and the search within each node, that's entirely up to you. Yes? The key type? We already provide that for you. It's templatized. There are two types: CompactIntsKey and GenericKey. So you don't have to write the code to do key comparisons; we handle all of that for you. And you'll know the size of the key type, so you just use that to decide how you want to store keys in each node. You're basically just building the data structure. Okay. This is what I already said: we'll provide you with the header file and the API you have to build, and all this other stuff we'll do for you, so you don't have to worry about how to serialize the keys; that's handled for you. As I said, there are a bunch of design decisions you're going to have to make, and there's no right answer. Sometimes there will be wrong answers: should I write everything to /dev/null? No, that's wrong. But you don't need to come and ask us, can I do this, can I do this, at every single step. It's up to you to decide how you actually want to implement it. And I can point you to a couple of things; there's a book on B-plus trees. It's not specific to in-memory B-plus trees, but it has some other ideas you can use to potentially make this go faster. We'll provide you with a basic C++ unit test for your implementation. It's basically the same unit test we have for the BW-tree; we just did a search-and-replace with B-plus tree. But it's not going to be that exhaustive. It's certainly not going to test the internal correctness or integrity of the data structure that you want to have. 
It tests at a logical level: do I get the key back? But how do I know that I don't have any empty nodes hanging around? We'll check for memory leaks and things like that, but you're going to want to write your own tests that actually exercise the data structure at a low level. We don't know how your thing is implemented, so we can't provide those for you. We'll run a leaderboard to see who has the fastest one. It's going to run on Gradescope, which is not ideal because it's single-threaded, and we'll give extra bonus points for the top three implementations. The other thing you have to do is write documentation about what your code is actually doing and explain what you did in the different parts of the system. For this, there is a basic check using Doxygen; we use Doxygen to make sure that you actually have comments for every single function. But obviously Doxygen can't read the comments and tell whether what you're saying is actually correct, so Matt and I will go through and inspect all of these manually. We'll also run some additional stress tests on your implementation beyond what we give you. And make sure all your code follows the formatter and linter rules. One thing we did: Ian and another student made the linter go faster, so now it doesn't take all the cores on your machine to look at every single file; it only looks at the files you've changed, so it runs faster as well. I posted this on Piazza last night. It's a group project, so with 35 students there should be 11 groups of three people and then one two-person group. Don't assume you'll be that two-person group; we only have one. So please start forming a group. On the sign-up sheet there's a list for free agents: if you don't have a group, teams can reach out to you and try to get you to join them. 
The website is not up yet; we'll take care of that today. I think Matt pushed the stub files to GitHub, so you can pull down the latest version of the new project branch. This will be due on March 15. The goal is obviously to have Gradescope up as soon as possible, not a week or two before the deadline. Any questions about this? All right, this should be fun, right? It's a chance for you guys to be creative, do whatever you want to do, and see who has the fastest one. Okay? All right, so next week we'll start talking about how to actually execute queries, and this is where we'll spend a lot of our time for the rest of the semester. We'll talk a little bit about how we're going to organize and schedule threads and our processing models for queries. Then we'll end up talking about query compilation. At this point, we know how to index things, store things, and send things over the network; now, how do we actually execute queries? That's the whole point of a database system: you want to take SQL queries, run them, and produce answers. Okay? All right guys, enjoy your weekend. See ya.