Again, round of applause, last class with DJTPL. How are you doing? I'm doing well, how are you? So we have a lot of things to do today. And by the way, DJTPL's agent sent over his glossies, so he's signing autographs afterwards if you want. We have enough for everyone. It's a nice photo. Yeah, it's pretty cool. Your agent is very aggressive. He's got to sign autographs. I'm like, it's a class, it's kind of weird, but whatever. All right, guys, a lot to cover, so let's finish up. The major thing on the docket for you guys, other than the final exam, is project four. It's going to be due this Monday at midnight, and then we're having the extra office hours on Saturday on the fourth floor in Gates. Who here has not started project four? Really? Bold. OK, that's not a good idea. All right, any high-level questions about project four? Yes. Yes, you can use a late day on project four. Yes. We set the due date because the university policy is that we're not supposed to have anything due during finals week. I did it one year, I got in trouble. So yeah. Any other questions? Yes. His question is, is project four going to be used in future semesters? Because it's the most time-consuming of all the projects we've done so far. Yeah, the plan is yes. We probably will use it in subsequent semesters. We'll try to dial down the complexity quite a bit. This is sort of why we give you the tests ahead of time, why we said, OK, here's all this extra bonus stuff you could do if you wanted to. But we'll see how it goes and we will adjust accordingly. Yes. That'll be later at the end. I didn't forget. There's slides on that. There are some things you brought up on Piazza that you want discussed, and we'll discuss those at the end. All right, any questions about project four? All right, cool. So again, a reminder, and we'll post it on Piazza again.
If you like databases a lot, you like this class, or you think BusTub is terrible and you want to rewrite a lot of it: Jignesh is teaching this next semester in the spring. So if you're going to be around and you want to TA, by all means please sign up. And again, you don't have to be like Chi. Chi is violating labor laws sometimes and we don't want to do that. I've had two people come to me like, yeah, I want to TA, but do I have to be like Chi? And the answer is no, okay? The kid's a freak. He might be on cocaine. But he's graduating. All right, any questions about this? All right. So the other thing, too, which we'll also post on Piazza: your feedback is strongly needed. So the point he just brought up — hey, project four is way too hard relative to the other projects — it used to be that project two was too hard, and then we tried to rebalance things. So we do listen to the feedback and we want your opinion about everything. Now for undergrads, I don't need to explain to you what it's like to write course feedback. It's the master's students. You guys are terrible. I'm not saying you in particular, but every year the master's students are like, this class is great, this class is great, this class is great. Whereas the undergrads go into a lot of detail, and a lot of it is about what things are wrong and what we can do better. So if you're a master's student, please be brutal. It's anonymous. I can't see it. We don't see names. So by all means, if you want things done better, let us know, okay? And again, if you're not graduating, come make suggestions in the course feedback and then come in the spring and be a TA and fix things for future people. For office hours leading up to the final exam, I'm moving mine to Monday at 9:30 in the morning over Zoom. Jignesh is actually with his wife right now, so he's not coming to campus for the rest of the semester. So his office hours will also be on Zoom.
If you want to meet beyond these hours, please send an email and I will try to accommodate you. Okay? And then the TAs will hold their regular office hours up to and including this Friday. They will have the project four office hours on Saturday, and then there won't be any office hours from the TAs starting next week. Okay? All right. So the final exam. Who has to take it? Anyone enrolled in the class. It's going to be in Posner 153. I'll explain how this is going to work in a second. We actually have three locations; we'll explain how we're going to split that up. That'll be next Tuesday at 8:30 a.m. — and go watch that video to understand why. If you need a special accommodation, some of you have already done this; if you haven't yet, please do this as soon as possible so that we have time to prepare and work with the disability office. And then I'll post the final guide for the exam on Piazza after this class, and the practice exam is available on the website now. All right. So if you looked at the CMU schedule for the final exams, we're actually assigned three classrooms. We're in Posner and then the Hall of Arts, in two different rooms. So I want everyone to show up at Posner 153, which is the big building — sorry, the big room. Show up there. We will then assign you to a random location so that we can spread out. Posner is the bigger one; most of you guys will be in there, but we'll just randomly spread out across the different rooms that are close to each other. And there'll be two TAs in every single room, and then myself, I'll be bouncing around answering questions and helping as we go along. Okay? So again, everyone show up to Posner 153 and then we'll figure out where to go after that. The rooms are, as far as I know, close to each other. Okay? All right, so what to bring. Obviously, we need your CMU ID. Bring a pencil and an eraser.
You can take the exam with a pen if you want to, but then you're going to scratch things out and it's going to be a huge pain for us to deal with. So please bring a pencil. Please bring an eraser. You may need a calculator; using your cell phone is sufficient, because most people don't have standalone calculators. And then you can bring one sheet of notes, double-sided, and it has to be handwritten, just like you had in the midterm exam. One year a kid was hawking NFT-themed clothing — this is two years ago when that was a thing. Actually, I take that back. I don't care if you want to. If you waste your money on NFT clothing, that's your problem. Okay? But you can bring food. We'll have coffee and donuts for everyone as well early in the morning. All right? All right, so what will be on the exam in terms of everything that was before the midterm? Obviously, you need to know SQL. If you've forgotten SQL at this point, after homework one and other things, then you have other problems. But everyone should have a basic understanding of SQL, so if we show you a SQL query, you understand what it's doing. You obviously need to know what the buffer manager is doing, because in the context of logging and recovery, how do you know when things are allowed to be written out to disk? You need to understand basic data structures — hash tables, B+trees — and when to use one versus the other. The storage models: column stores versus row stores, and how much data is going to get moved around, potentially, if it's in a distributed environment. Query processing models: what are the actual operators themselves, when are they pushing or pulling data up, are they sending batches, vectors, or the entire result set? And then basic intraquery parallelism — like there's an exchange operator you would use to combine things. And then interquery parallelism means multiple queries running at the same time. What are the implications of that?
All right, so that's the basic high-level information you need to know from before the midterm. For the things that we covered from the midterm going forward: you do need to understand the basics of query optimization. We obviously only had one lecture on this and it's a very complicated topic, so we can't go into too much detail, but it's basic information about what are the implications of doing predicate pushdown in the query plan, or projection pushdown, or how can you rewrite a subquery to pull things out and execute it as a join? We're obviously not going to ask you to derive an algorithm that does that, because it's hard, but at a high level, what does it actually look like? We talked about basic information about collecting statistics, cardinality estimation, histograms. Again, we obviously can't go into too much detail, but again, understanding this at a high level, and then what are the implications of something you'd use like a cost-based search? Like, how would that plug in with one of these cost models to determine the correct join order, for example? And then you can see how you can fuse or synthesize this knowledge with the distributed databases later on to understand, oh, the cost model would have to include moving data across the network, right? So again, high-level things like this, not specific formulas. We spent a lot of time talking about transactions. So the first set of lectures — the first lecture — was on what we'll call the theory of transactions, concurrency control. Like, what does it mean for a database system to say it provides ACID guarantees: atomicity, consistency, isolation, and durability? And again, when it was a single node, consistency didn't quite make sense — it makes more sense in the distributed environment — but certainly atomicity, isolation, and durability mattered a lot on a single node. Then we spent time talking about conflict serializability and view serializability.
Again, high-level concepts and ideas, not necessarily the protocols we'd use to support them — that comes later. So how would you determine whether a schedule is correct? Is it conflict serializable? And how can you determine whether two different schedules are conflict equivalent? And then for view serializability, the main thing is understanding the difference of view serializability — in terms of what the application sees in terms of correctness and the ordering of operations in a schedule — relative to conflict serializability. In the case of view serializability, it was about how you could reorder things and the client still ended up seeing the same result. And again, no system implements view serializability, because you'd need to understand what "correct" means to the application, and that's not something any data system can reason about now. We talked about recoverable schedules. And then we talked about isolation levels and anomalies. What's the lowest isolation level in the ANSI standard? Read uncommitted. What's the highest? Serializable, yes. Is there anything above serializable? Strict serializable, yes, good. Excellent. Again, there's no formulas for these. Just understand what are the anomalies you could have — the phantoms, the dirty reads, the unrepeatable reads — understanding all those things in the context of concurrency control. And then for specific protocols, we spent a lot of time talking about two-phase locking. The difference between rigorous and non-rigorous, or strict versus non-strict. We talked about the cascading aborts problem: if a transaction reads data that was uncommitted from another transaction, and that other transaction aborts, you've got to roll everyone back and it cascades. We talked about the different methods to handle deadlocks, which is the big problem you have in two-phase locking. How do you detect it? With the waits-for graph, and determine who to kill.
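A rough way to think about checking conflict serializability, sketched in Python (my own toy example, not something from the lecture): build a precedence graph with an edge Ti→Tj whenever an operation of Ti conflicts with a later operation of Tj, then check for cycles.

```python
# Toy conflict-serializability check: build a precedence graph from a
# schedule of (txn, op, object) triples and test it for cycles.
def conflicts(op1, op2):
    # Two operations conflict if they touch the same object,
    # come from different transactions, and at least one is a write.
    return op1[2] == op2[2] and op1[0] != op2[0] and ("W" in (op1[1], op2[1]))

def is_conflict_serializable(schedule):
    edges = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                edges.add((a[0], b[0]))  # edge Ti -> Tj
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
    def cyclic(node, stack):
        # DFS cycle detection over the precedence graph.
        if node in stack:
            return True
        return any(cyclic(n, stack | {node}) for n in graph.get(node, ()))
    return not any(cyclic(n, frozenset()) for n in graph)

# T1 reads then writes A with T2's write of A in between -> cycle.
bad = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T2", "R", "B")]
ok  = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
```

A schedule is conflict serializable exactly when this graph is acyclic, which is the criterion from the lecture; the triple encoding is just my convenience here.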
And then deadlock prevention was wound-wait and wait-die — trying to figure out some kind of ordering mechanism based on transaction IDs or timestamps of transactions to determine who to kill to avoid a deadlock: either kill yourself early or kill somebody else. And then we talked about hierarchical locking, or multiple-granularity locking. And this is where we introduced intention locks. So as you're traversing down this hierarchy of the database system — there's a database, it could have a table, a table with pages, and so forth — you can take intention locks at those higher levels to give hints to other transactions that come along about what you may be doing at those lower levels. And the key thing here is understanding the performance tradeoffs of allowing for more parallelism — by taking the minimum locks you need at the lowest point in this tree — versus having to go to the lock manager over and over again to say, I want this lock. Like, that has to be a concurrent data structure in itself. And then lock escalation would be: if I hold a lock in this mode, can I take a lock in another mode? When is that allowed? Yes. The question is, do you need to memorize the intention lock grid? I mean, it helps you understand it, but you don't need to memorize the grid — you do need to understand what modes like shared, intention-shared, and shared-intention-exclusive are. Yeah. Yes. The question is about the performance tradeoffs. So there's this tradeoff between how much parallelism I want to have — allowing other transactions to run at the same time and take compatible locks on different parts of the database system — versus having to go to the lock manager and make a request: I'm going to acquire these locks, right?
So an example would be: if I have a table and the table has a billion tuples and I want to update all one billion of them, I could take an exclusive lock on the entire table, and that blocks anybody else from doing anything below in the tree. But I only talked to the lock manager once. Or I could go down, take an intention-exclusive lock on the table, and then take exclusive locks on the individual tuples I want to update. But now I've got to go to the lock manager every single time I need to acquire those locks, right? Again, big picture — there's no math involved in determining that. Just understand the pros and cons of both of them. Yes. The question is, what do I mean by lock escalation? It'd be: if I hold a lock in one mode, can I escalate it to a higher mode, and when is that allowed? So say, ignore intention locks. I hold a shared lock. Can I escalate it automatically to an exclusive lock? No, because you don't know who else may be holding a shared lock. Now, if nobody else holds the shared lock — which you would know, because you have a lock manager — then you could do that. High-level details; there's no formulas here. Yes. The question is, when would you try to upgrade locks? Again, don't think of a single query coming in, acquiring locks, and walking away. Think of a multi-query transaction. I do a select on something, the application gets the result, then it comes back and does an update, right? And if I'm doing rigorous two-phase locking, I hold the lock until the very end. So can I go back on the second query and escalate that shared lock into an exclusive lock? Right? Yeah. And then we talked about hints. You can do SELECT FOR UPDATE. So you can do a select query and say, oh, by the way, I'm going to update it immediately afterwards.
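To make the intention-lock point concrete, here's a minimal sketch in Python of the standard multiple-granularity compatibility matrix (IS, IX, S, SIX, X). The matrix itself is the textbook one; the code around it is my own toy, not anything from the lecture or a real lock manager.

```python
# Standard multiple-granularity lock compatibility matrix.
# True means the two modes can be held on the same node at the same time.
COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

def can_grant(requested, held):
    # A lock can be granted only if it is compatible with every lock
    # already held on that node by other transactions.
    return all(COMPAT[requested][h] for h in held)

# Another txn holds IX on the table (it's updating a few tuples below);
# we can still take IS (we intend to read some tuples) but not S or X.
table_reader_ok = can_grant("IS", ["IX"])
table_scan_ok = can_grant("S", ["IX"])
```

This is exactly the grid the question was about: you don't need to memorize it if you remember what each mode promises about the subtree below.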
So don't take shared locks — take exclusive locks on the objects I read. So then when the update query comes, it already holds exclusive mode. All right. Again, that's a low-level detail we don't have to worry about. All right. So after two-phase locking, we talked about timestamp ordering protocols, or optimistic protocols. We talked about basic timestamp ordering concurrency control. We talked about this one optimization, the Thomas Write rule. It allows you to skip writes to objects that have already been updated by a transaction in the future — you just ignore your own write. We talked about doing OCC in the three phases: the read phase, the validation phase, and the write phase. And that's very similar to what you did in the homework, and I think we have a question on this on the practice exam as well. And then we spent time talking about multi-version concurrency control. And I'm not so much worried about how you fit OCC or 2PL into MVCC; it was really about the different design decisions you would have in a database system if you want to support multi-versioning. How would you actually organize those versions in storage? Do you order them newest-to-oldest or oldest-to-newest? How do you do garbage collection to clean up old versions? And then how do you maintain indexes, which may have pointers to different versions? Yes. The question is, is OCC an alternative to 2PL? Yes. Pretty much every non-academic system picks one or the other. There are some academic prototypes that kind of blend them — like, if you know you're low-contention, OCC will be better, and then maybe if you add more contention, it'll automatically switch to 2PL. But it's so hard — I mean, you guys have seen this in project four, it's hard enough just to implement one of them — that nobody implements two and tries to be clever like that.
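The Thomas Write rule from basic timestamp ordering can be sketched in a few lines of Python. This is a toy of mine, assuming each object tracks the timestamps of the last transactions to read and write it, which is how the protocol is usually described.

```python
# Toy basic timestamp ordering with the Thomas Write rule.
# Each object carries the timestamps of the last txn to read/write it.
class Obj:
    def __init__(self):
        self.value, self.r_ts, self.w_ts = None, 0, 0

def write(obj, txn_ts, value):
    if txn_ts < obj.r_ts:
        return "abort"          # a later txn already read it
    if txn_ts < obj.w_ts:
        return "skip"           # Thomas Write rule: a later txn already
                                # overwrote it, so just ignore our write
    obj.value, obj.w_ts = value, txn_ts
    return "ok"

a = Obj()
write(a, 10, "x")               # installs "x" with w_ts = 10
result = write(a, 5, "y")       # older txn's write is skipped, not aborted
```

Without the rule, the second write would have to abort; with it, the write is simply dropped because its effect would have been overwritten anyway.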
Everyone just picks one. OK, we spent time talking about crash recovery. So this is about: if I have a bunch of changes sitting in memory, when can the buffer pool flush them out, right? And there are sort of two binary decisions: whether you steal or no-steal, and force or no-force. Again, steal is when the data system is allowed to flush out pages that have been modified by transactions that have not committed yet, if it wants to make room for other things it needs. And no-steal says you can't do that — you can only flush out data when it's been committed. And then force is the requirement that I have to flush all the dirty changes from a transaction immediately when it commits, before I can tell the outside world that I committed. No-force says I can defer that to some later point. So the write-ahead logging scheme that we talked about was what? Steal or no-steal? Most data systems are going to implement steal, yes, right? Because otherwise, if I update more data than can fit in memory, I can never do it, right? And are they force or no-force? No-force, right? Because again, I can defer the change. I put in my write-ahead log: here are the changes I've made, here's the modification I made to the data system. That has to get persisted. So then if there's a crash, you can replay that log and recreate what was in the buffer pool. We talked a little bit about logging schemes. And again, the decision between physical and physiological is not what we're really worried about; it's really physical versus logical, right? The idea there is: in my log, are my log records going to be the deltas of the changes I made to individual records or tuples in my database? Or is it the query that I executed, right? And then what are the tradeoffs of the two of them? If it's physical, yeah, I may end up storing a lot more data. Like, my query may update a billion tuples.
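A toy way to see the steal + no-force combination (my own sketch, not code from any real system): under write-ahead logging, the only rule gating a page flush is that every log record describing the page's changes must already be on disk.

```python
# Toy WAL rule for a steal/no-force system: a dirty page may be flushed
# at any time (even for uncommitted txns) as long as every log record
# describing its changes has already been persisted.
class Page:
    def __init__(self):
        self.page_lsn = 0    # LSN of the last log record that touched this page

flushed_lsn = 0              # highest LSN durably written to the log on disk

def can_flush(page):
    # Write-ahead rule: log first, page second.
    return page.page_lsn <= flushed_lsn

p = Page()
p.page_lsn = 42              # an uncommitted txn modified the page at LSN 42
before = can_flush(p)        # log not persisted yet -> must wait
flushed_lsn = 50             # the log tail (through LSN 50) reaches disk
after = can_flush(p)         # now stealing the frame is allowed, commit or not
```

The LSN names here are my own; the point is just that steal is safe once the log is ahead of the page, and no-force is safe because commit only has to persist the log, not the pages.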
And therefore I have a billion log records for all those one billion updates, and then when I recover I just have to replay all of those. And logical would be: here's the single update query. Question, yes. His question is, in the homework there was something about no-steal — write-ahead logging using no-steal and force — is that just shadow paging? More or less, yes. With shadow paging, because you're writing to the shadow pages, it doesn't actually matter whether things get flushed while you're still running, because no other transaction can see them. But again, very few systems do that. All right, we talked about checkpoints — fuzzy checkpoints versus stop-the-world checkpoints. And then we spent time talking about ARIES: again, understanding the three phases, analysis, redo, and undo. What happens in the database system when it makes the changes? How does it update the internal state, or the metadata it's using to keep track of the changes being made? How do the log sequence numbers, or LSNs, fit into the protocol? And how are compensation log records being used to make sure that if I crash — if the system crashes during recovery — I can recover from my recovery? And then lastly, we talked about distributed databases. And again, because there wasn't a project based on this and it was sort of a high-level overview of the different topics in the space, we can't really ask you deep questions about, like, how would you actually implement this thing or that thing. It's more about: what are the different system architectures, primarily the difference between shared disk and shared memory? What are the pros and cons? What are the trade-offs of these different designs? How can a database system replicate its data across different nodes to make sure that it's always available, or that it can recover if a machine dies? The different implications of the different partitioning schemes?
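Here's a very small sketch (mine, and heavily simplified relative to real ARIES) of the compensation-log-record idea: rolling back a transaction walks its log records backwards, and each undo step appends a CLR recording how far the rollback has gotten, so a crash during recovery doesn't redo the undo.

```python
# Toy ARIES-style undo: rolling back a txn emits compensation log
# records (CLRs), each pointing at the next record still to undo,
# so recovery can resume the rollback after a crash during recovery.
log = [
    {"lsn": 1, "txn": "T1", "type": "update", "undo": "A=old", "prev": None},
    {"lsn": 2, "txn": "T1", "type": "update", "undo": "B=old", "prev": 1},
]

def undo(txn):
    actions = []
    next_lsn = max(r["lsn"] for r in log if r["txn"] == txn)
    while next_lsn is not None:
        rec = next(r for r in log if r["lsn"] == next_lsn)
        actions.append(rec["undo"])                  # apply the undo action
        log.append({"lsn": len(log) + 1, "txn": txn, "type": "CLR",
                    "undo_next": rec["prev"]})       # CLR marks our progress
        next_lsn = rec["prev"]
    return actions

rolled_back = undo("T1")   # undoes in reverse order: B first, then A
```

The field names (`prev`, `undo_next`) are my own shorthand for prevLSN and undoNextLSN; real ARIES also redoes CLRs on restart, which is the part that makes recovery of the recovery work.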
Again, like, if I have range partitioning, I can do range queries. If I have hash partitioning, I can maybe only do point queries. And then basic information, like we asked you in the homework, about two-phase commit. Any questions about any of this? Yes. His question is, when will homework grades be released? We will do that this week. Yes. Probably today. Yes. Is Paxos on there? The question is, is Paxos on there? I mean, not really. Again, high-level questions — understanding two-phase commit is enough. If you understand two-phase commit at a high level, you would understand Paxos. The difference is that Paxos is a quorum; two-phase commit is everyone has to agree. Other questions? All right. So what's not on the exam? Obviously, SingleStore. And then, this has come up in previous years: because myself and Jignesh, when we lecture, we like to say, oh, MySQL does this, or Postgres does this, or whatever. All of that, obviously, we're not going to ask about. That's color commentary for your own edification, to understand how the concepts we talk about during the semester map to real-world systems. OK? All right. Final exam, Tuesday, 8:30 AM, come to Posner 153. And the practice exam is posted on the website. We'll post the link on Piazza — the files on Piazza — after this class. And then office hours with Jignesh and myself on Monday. All right? His question is, would today's lecture be on the exam? No. This is just Andy ranting about databases. Actually, you were saying the PL class is basically them ranting about why they hate other programming languages. I like to think this is more than that, right? Yeah. So no, any Andy rants are not on the exam. Other questions? All right, so let's get to the fun stuff. So I didn't make a graph of what people voted for. There was one system that got the most votes. We'll come to that in a second.
Also, I realize in retrospect I probably should have said, here are the systems you could choose from, because the votes were all over the place. Like, most systems got one vote. The first system we're going to cover got the most votes — that was the non-joke system. If you know what the joke system was: everyone voted for that. But this one got the most real votes. And then there were other ones tied in sort of second and third place, but I'm covering the ones that I could put together slides quickly for. So the first one is going to be Redis. This one got the most votes. So who here has actually used Redis? Oh, a lot. OK, decent. Who here has actually written a clone of Redis? Nobody. You laugh — there's a lot of clones of this. All right, so Redis is considered the canonical NoSQL data system along with Mongo; it sort of came out in the same era. So it came out in 2009 by this Italian guy. He wrote a lot of it himself. Redis stands for Remote Dictionary Server. So at its core, Redis is a single-node key-value database system written in C. And what's interesting about it is that it's designed to be as fast as possible. So it has a single-threaded execution engine, and all the data resides entirely in memory. So let's take the memory stuff first. This entire semester has been about disk-oriented database systems. So you have to have a buffer pool, and you bring pages in from disk into memory, and then there's a page table where you keep track of where things are actually located. All of that requires additional internal latching to make sure that the data structure of the page table, and all the other things inside the buffer pool, are thread-safe. But there's overhead to that.
And so traditionally, if your disk is slow — which historically it always has been — then you're OK paying the penalty of taking those latches in memory, because going to disk is so expensive anyway. It's not the tall pole in the tent. But now if you say my entire database is going to be in memory — there is no buffer pool; everything I could ever need, any page or any record I want, is always in memory — then the cost of acquiring those latches starts to add up, and that becomes the bottleneck. So in Redis, they say, well, we're not going to do any of that. The engine itself is going to be single-threaded. There's only one thread that can actually be doing anything in the database at a time. So therefore I don't need those latches, I don't need any internal data structures to keep track of things — I can basically run at almost bare-metal speed. So this is why Redis has historically been known as one of the faster database systems. What's also interesting about Redis — beyond it being fast; there's a bunch of other key-value stores out there — is that it actually has specialized value types. A generic key-value store would be: here's my key, and here's my value, and the value is just a blob, some byte stream. The data system doesn't know anything about what's inside of it. And so anytime you want to interpret what's in that byte stream, you've got to do it in application code. In Redis, the values can be strings, or hashes, or integers, or lists, or sets. And then they have specific commands that can do manipulation on those data structures within a single round trip, instead of having to bring stuff into the application, modify it, and then push it back out. So that part is interesting. Other than in the bunch of clones of Redis, I think that part of Redis was novel when it came out.
So because it's in memory, we'll talk in a second about how they make sure that their data is persistent. In the real world, I come across Redis being primarily used as a cache. So you would have your regular data system — MySQL, Postgres, Mongo, whatever you want — and then you would have this cache on the side, the sidecar cache. And then your application would say, oh, I need to get this key. Instead of going to the data system and running the query, I'll go check Redis. If it's there, great, I'll just use that. If not, then I'll go to the data system. It's a way to avoid having to do a bunch of additional operations in the data system, which would be expensive. So I was joking before when I was asking whether anybody has written a clone, because there's a ton of these hobby projects out there. They have Lettuce — whatever programming language you can think of, take the first letter of that programming language, and there'll be a Redis clone. Like there's Jedis with the J for Java, there are Go ones, right? There's a bunch of these clones. KeyDB and Dragonfly — oh, and Tendis as well — are probably the most well-developed extensions or rewrites of Redis. Actually, KeyDB was Canadian. They got bought by — what's the other one, not Instagram, what's the other social media that did the photos, with the ghost? Snapchat, yeah, they got bought by Snapchat. All they did is they fucking put a latch in front of it, right? So instead of being single-threaded, it's multi-threaded, but they just put a latch in front, right? And they got acquired for that. Dragonfly is a complete rewrite. Tendis is Tencent's; I think it's open source, I don't know whether they forked it or not. But again, if you go to dbdb.io and you click on compatibility with Redis, there's a ton of these systems, and there's a bunch of them I didn't even put on here.
And there's some company that will sell you basically a course like 445 where you learn how to write your own Redis clone, yes. The question is, how does it handle variable-length strings? It's a hash table. It literally is just an in-memory chained hash table. We'll get to that in a second. Right, yes? So yeah, they have a global latch in front of the data structure. My student looked at this — it's a global latch in front of the data structure. Multiple updates come in, but there's still a latch in front of the data structure. I mean, there are concurrent network connections. Huh? They might allow multiple reads, yeah. Yeah, yeah. OK, so here's what keys look like, right? They're just strings. And then you have different nomenclatures that just correspond to whether it's a sorted set, a hash table, a list, a set — and then there are regular strings being stored in there, right? And then on the other side here, these are just values. We can give a quick demo of what this looks like. So Redis, by default, doesn't support SQL. There was a Redis SQL — again, there are people that take Redis and put SQL stuff on top of it, they put graph APIs on top of it. The Italian guy that originally started writing Redis no longer works on the project. I don't know what he's doing now. And then there was a company doing hosted Redis called Redis Labs. I think they bought the Redis name and now they're just called Redis, the company, and they're trying to commercialize Redis.
And what they're basically trying to do is break the perception that Redis is only to be used as a cache, and they're trying to build a bunch of APIs in front of it so you could use it as your primary storage database. It remains to be seen how successful they will be. So again, it doesn't support SQL. Instead you get a bunch of commands that do various physical operations on the different data types, right? And so in your application code, you have to know: I'm operating on a key that has this type, and therefore I can only run these commands — otherwise it will throw an error. Again, I only learned Redis last night, so let's see how it goes. So you can do something like SET — this is a command-line interface, so it has nice suggestions. So we'll do a key called "class" and we'll put "ABC" in, right? So now if I call GET — no, sorry, GET class — right, I get that back. Then you can do other things — this is me trying to figure it out, right? They have INCR, so you can basically define a counter, right? So I can say INCR and then we'll give it a counter; we'll call it "counterx", right? And every time I call that command, it'll automatically add one to it, right? So think of a game where you're keeping track of scores or something — you do things like that. What's kind of funky is you can give it a number and say I want to increment it by five, right? Then if I call GET, I should be able to get the current value of counterx, right? Well, it's interesting here: I'm assuming they're returning back, hey, the type is an integer when I call INCR, but when I call GET, I get the response back as a string, right?
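To pin down the semantics from the demo without needing a server, here's a toy Python mimic of the commands used (SET, GET, INCR/INCRBY). This is my own sketch of the behavior observed in the demo — values stored and returned as strings, counters handled numerically — not Redis's actual implementation.

```python
# Toy in-memory mimic of the Redis commands from the demo.
# Like the demo showed, GET always hands back a string, while the
# INCR-family commands reply with an integer.
store = {}

def set_(key, value):
    store[key] = str(value)
    return "OK"

def get(key):
    return store.get(key)           # always comes back as a string

def incrby(key, n=1):
    # INCR/INCRBY parse the stored string as an integer, add, store back.
    value = int(store.get(key, "0")) + n
    store[key] = str(value)
    return value                    # integer reply, as in the demo

set_("class", "ABC")
incrby("counterx")                  # counter starts at 0, becomes 1
incrby("counterx", 5)               # INCRBY adds 5, so now 6
```

The function names with trailing underscores are just to avoid shadowing Python builtins; the command behavior is what the demo walked through.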
So I think they're storing everything as strings. No, sorry, think about it: the wire protocol sends everything in and out as a string. They must be doing something to cast it or store it natively as an integer on the server side, but I haven't looked at the source code to figure that out. But let's make a list now. We'll call it list_x, and then I can push something onto it. So when I was doing this, OK, great, I got a list. Now give me back the list. So I thought, OK, I'll do GET list_x. It doesn't like that, right? Because it knows it's a list, and I'm calling a command that's for basic key-value pairs and not for lists. This actually goes back to something we had at the very, very beginning about why SQL and the relational model are a good idea: there were earlier systems with APIs where, based on the type of the value you're storing, you had one set of commands or another set of commands. To me, it seems perfectly reasonable to say, hey, get me that list. But no, you can't do that. It doesn't let you, because it's potentially not storing any information on the server side about what the type actually is. So it relies on you, the application, to know which commands you can call on which data type. Let me do this: I'll push one, two, three onto list_x, and now I can get it back with the list commands. And that means if someone makes a mistake in your application code and calls list commands on a regular type or vice versa, you might break your database, ruin your data. So to me, this is a great example of why schemas and the relational model are a good idea, because PostgreSQL won't let you do that.
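To make the demo concrete, here is a minimal sketch (plain Python, not real Redis code) of the behavior just described: a dict stands in for the chained hash table, everything round-trips as strings like the wire protocol, and calling a string command on a list key fails with a type error the way Redis returns WRONGTYPE.

```python
# Minimal sketch of Redis-style per-type commands. Not the real server:
# the dict stands in for the chained hash table.

class WrongTypeError(Exception):
    """Mimics Redis's WRONGTYPE error for a command/value mismatch."""

class MiniRedis:
    def __init__(self):
        self.store = {}  # key -> str or list of str

    def set(self, key, value):
        self.store[key] = str(value)
        return "OK"

    def get(self, key):
        val = self.store.get(key)
        if isinstance(val, list):
            raise WrongTypeError("GET called on a list key")
        return val  # always comes back as a string, like the wire protocol

    def incrby(self, key, amount=1):
        val = int(self.store.get(key, "0")) + amount
        self.store[key] = str(val)  # stored back as a string
        return val

    def rpush(self, key, *values):
        lst = self.store.setdefault(key, [])
        if not isinstance(lst, list):
            raise WrongTypeError("RPUSH called on a string key")
        lst.extend(str(v) for v in values)
        return len(lst)

    def lrange(self, key, start, stop):
        lst = self.store.get(key, [])
        if not isinstance(lst, list):
            raise WrongTypeError("LRANGE called on a string key")
        # LRANGE is inclusive on both ends; -1 means the last element
        return lst[start:] if stop == -1 else lst[start:stop + 1]
```

The point of the sketch is the failure mode: nothing stops you from calling `get` on a list key except a runtime error, which is exactly the schema-enforcement gap discussed above.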
In Postgres, you just write SELECT, and you don't have to care in your query what the data type actually is; you just get your results back. Now, yeah, there are certain functions where you have to know what the type is, and maybe the system tries to do type inference and cast things. But to me it seems bizarre that I can't call GET on a list. To me, that seems wrong. So the way you get it back, and it took me a while to figure this out, is you have to call LRANGE. It's a list range: I go from zero, and then it's like the Python syntax where negative one is the last item in the list. Well, see, I broke it. So I'm gonna make a new list, list_y, and now I get the list back, all right? This again is why I use my laptop instead of this. So that's, yes? His question is: if I added SQL and the relational model on top of this system, like we have in BusTub or other systems, wouldn't that slow things down, because it has to parse the query and all that stuff? Absolutely, yes. So for simple things, a key-value store used as a cache, sure, that's fine. But to use this as your primary database, without something enforcing your schema to protect you from shooting yourself in the foot like I did just now, that's a recipe for disaster. It'll help you get stuff up and running very quickly, but think of five years from now, ten years from now: you're dead or whatever, or you've been fired and you're not working on the application anymore. Somebody else picks up your code, and now they've gotta deal with your clusterfuck, right? So having things that prevent humans from doing stupid things is a good idea for a primary storage database.
For a cache, if the key is just a session ID and you get back JSON or whatever, that's fine. I don't care. Yes? So his question is: I'm saying Redis is a cache, but there's another famous distributed cache called memcached that came out of LiveJournal. Think of it like, before Tumblr, and there's Medium now, or Substack, Tumblr was the hot thing, and before that it was LiveJournal. It still exists; I think it's for the furries or whatever. They built this distributed cache called memcached, and it's a consistent-hashed table that basically does gets and sets. As far as I know, memcached only supports get, set, and delete. Redis, as I showed with the commands, has all these other detailed commands you can run on different data types. I think that's the difference. Now your question is, can you make Redis distributed? Yes, there is a distributed version of Redis. I don't know whether it's middleware on the front or whether you can read and write to any node; I don't know how it works. Yep. Yes? His question is: in my scenario where you have Redis as a cache and then Postgres or whatever as the primary storage, who's responsible for maintaining the cache? The application code, right? It's a sidecar cache, not a write-through cache. A write-through cache sits in front of the database: you write to it, and the write gets propagated down to the database system. Redis doesn't support SQL and doesn't support any of that. So in the application code, you're responsible for doing that. That means, of course, if you do an update and something gets invalidated, you have to send a delete request to the cache. Yes.
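The sidecar ("cache-aside") pattern just described can be sketched in a few lines. This is a hedged illustration with hypothetical names, not a real Redis or Postgres API: two dicts stand in for the cache and the primary store, and the application code does both the cache fill on a miss and the invalidation on a write.

```python
# Sketch of the cache-aside ("sidecar") pattern: the APPLICATION, not the
# cache, keeps the cache in sync with the primary database.

class CacheAsideStore:
    def __init__(self):
        self.db = {}     # stand-in for Postgres, the primary store
        self.cache = {}  # stand-in for Redis / memcached

    def read(self, key):
        if key in self.cache:          # cache hit: skip the database
            return self.cache[key]
        val = self.db.get(key)         # cache miss: go to the primary
        if val is not None:
            self.cache[key] = val      # populate for the next reader
        return val

    def write(self, key, value):
        self.db[key] = value           # write goes to the primary first
        self.cache.pop(key, None)      # then the app sends a DEL to the cache
```

If the application forgets the `pop` in `write`, readers keep seeing the stale cached value — that is exactly the invalidation burden the lecture is pointing at.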
His question is: given that they provide some level of persistence, does that mean single-threaded is a bad idea? Not necessarily. If you're only getting a single key at a time, or getting data that's always within the same partition, then what you can do with Redis, and other systems take a similar approach, my PhD thesis was on this approach, is have every core run a whole instance of the database server. They don't need to talk to each other, because everything stays within that single single-threaded instance. Then something up above decides: I need to get this data, so hash the key and decide which core to talk to. So you can scale out Redis because the transactions don't need to talk to different partitions; you can scale those out very easily. But something above it needs to handle the routing, and I think the commercial Redis company has something that does that. It's when you want to touch two partitions in a transactionally consistent manner that things become problematic, because then it's a distributed transaction like we talked about before; you've gotta coordinate across each instance. All right, so it's in memory: in my demo here, if I shut the system off and turn it back on, it gets wiped out every single time, because everything is in memory. So the way they make things persistent and durable is a write-ahead log that keeps track of individual updates. And it literally is writing out the command that gets sent over the wire: when I typed in the console, they literally take that string and write it to a file. It's logical logging. When you load back up, they replay the log as if it was coming over the wire.
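The logical logging just described, persisting the literal command strings and replaying them on restart as if a client had sent them, can be sketched like this (an illustration, not the real Redis AOF code; the command format here is made up):

```python
# Sketch of Redis-style logical logging: append the raw command that came
# over the wire, then rebuild state on restart by replaying the log.

import io

class CommandLog:
    def __init__(self, log_file):
        self.log = log_file  # any writable text stream (a real file in practice)

    def apply(self, store, line):
        cmd, key, *args = line.split()
        if cmd == "SET":
            store[key] = args[0]
        elif cmd == "DEL":
            store.pop(key, None)

    def execute(self, store, line):
        self.log.write(line + "\n")  # log the literal command string first
        self.log.flush()             # a real system would fsync for durability
        self.apply(store, line)      # then apply it to the in-memory table

    def replay(self, raw):
        # recovery: re-run each logged command as if it arrived over the wire
        store = {}
        for line in raw.splitlines():
            self.apply(store, line)
        return store
```

Note the trade-off: replay cost grows with the number of commands ever executed, which is why a checkpoint mechanism (next) is needed to truncate the log.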
But what they do to handle checkpoints or snapshots is they basically have a background cron job, by default I think every 60 seconds, that forks the process. What happens in the OS when you fork a process? What's that? Copy-on-write, right? So you fork the process, and now the child process has a consistent snapshot of what was in memory at the time of the fork. The parent process can keep processing updates and making modifications to memory; the OS does the copy-on-write and makes new pages in memory, so the child process doesn't see any of those updates. And because it's single-threaded, you know there are no in-flight transactions while it's running, so the child again has a consistent view of the database. That child process just takes whatever's in memory and writes it out to disk as a checkpoint. It's basically a stop-the-world checkpoint, but because it's single-threaded and because you can do this fork, there's very little overhead involved; the pause isn't very long. So that part is very clever, and that is unique to Redis. The other things we've already talked about: it's single-threaded, it's a chained hash table, there are no secondary indexes, so I can only do lookups on the primary key, on the key, and there's no schema, no constraints preventing you from putting whatever data you want in. Yes? The question is: why did they choose a chained hash table, and how do you resize it? Why a chained hash table, I don't know. And how do you resize it? You don't. You basically allocate the amount of memory you want at the very beginning, and if you run out, I think you've gotta turn it off and turn it back on.
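The fork-based checkpoint can be demonstrated directly, since `fork()` is available from Python on POSIX systems. This is a minimal sketch, not the real Redis RDB code: the child sees the copy-on-write snapshot of the dict as it was at fork time and writes it out, while the parent goes on mutating its own copy.

```python
# Sketch of a fork-based, copy-on-write checkpoint (POSIX-only: os.fork).

import json, os, tempfile

def checkpoint(store):
    """Fork; the child dumps its frozen view of `store` to a file and exits."""
    fd, path = tempfile.mkstemp(suffix=".snapshot")
    os.close(fd)
    pid = os.fork()
    if pid == 0:                     # child: its memory is the CoW snapshot
        with open(path, "w") as f:
            json.dump(store, f)
        os._exit(0)                  # exit without interpreter cleanup
    os.waitpid(pid, 0)               # parent waits here; real Redis would not
    return path

store = {"x": "1"}
snap = checkpoint(store)
store["x"] = "999"                   # parent mutates after the fork
with open(snap) as f:
    frozen = json.load(f)            # the snapshot still shows the old value
os.remove(snap)
```

In real Redis the parent does not wait: it keeps serving requests while the child writes, which is exactly why the copy-on-write pages matter. The wait here just keeps the demo deterministic.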
Now, for the periodic checkpoints, because I think it's literally writing out the data structures in memory, I don't know whether you can resize based on the checkpoint or whether you have to replay the entire command log to repopulate it. All right, so they have some notion of transactions, but the way it basically works is client-side transactions. I don't know how to do this in the command line, we could look it up, because it's not BEGIN/COMMIT. When you call begin, the client begins on its side, and any operation you apply just gets batched. Then when you commit, everything gets sent over the wire to the database server, and because it's single-threaded, it just runs whatever commands you provided all at once. That's their notion of transactions. The problem, though, is they don't allow rollback. So if I send a batch of updates over and I make a mistake like I showed before, say I update two records and then the transaction gets aborted, it can't roll back the things it already did. So they're not truly atomic transactions, at least in the open-source version. For replication, they do asynchronous primary-replica replication. The primary just sends the op log of the commands that came in over the wire to the downstream replicas. And you can configure it so the primary waits until some quorum of the replicas acknowledge that they got the requests, not that they actually applied them, just that they got the network messages. By default, though, I don't think that's turned on. So it's eventually consistent. Again, for a cache, that's probably okay. For your primary database or primary storage, probably a bad idea, okay?
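The no-rollback behavior is easy to see in a sketch. This is a toy model of client-side batched transactions, with hypothetical names, not the real MULTI/EXEC implementation: operations queue on the client, ship as one batch, and a failure mid-batch leaves the earlier writes applied.

```python
# Sketch of client-side "transactions": the client batches commands and
# ships them at once; the single-threaded server runs them back-to-back,
# but there is NO rollback if one of them fails.

class BatchClient:
    def __init__(self, server_store):
        self.store = server_store  # stand-in for the server's hash table
        self.queue = None

    def multi(self):
        self.queue = []            # start buffering on the client side

    def set(self, key, value):
        self.queue.append(("SET", key, value))

    def incr(self, key):
        self.queue.append(("INCR", key))

    def execute(self):
        applied = 0
        for op in self.queue:      # whole batch sent over the wire at once
            if op[0] == "SET":
                self.store[op[1]] = op[2]
            elif op[0] == "INCR":  # raises if the value isn't an integer
                self.store[op[1]] = str(int(self.store.get(op[1], "0")) + 1)
            applied += 1           # earlier ops stay applied on failure
        self.queue = None
        return applied
```

If the second command in a batch blows up, the first one has already modified the database — that is the "not truly atomic" point above.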
So one thing I hope you get out of going over these systems very quickly is that there's a bunch of buzzwords I'm using that we've covered throughout the semester. You see now how you can look at any system, apply the concepts and methods we talked about throughout the entire semester, and say: okay, now I understand what the system is actually doing when they say they're this or they're doing that, and what the pros and cons are. All right, so the next one that got the most votes was CockroachDB. CockroachDB came out in 2015. Less so recently, but when it first came out it was sort of incorrectly described as the open-source version of Google Spanner, because the co-founders worked at Google before they started CockroachDB. But they didn't work on Spanner; they just said, let's go build a distributed database system. So it's a distributed relational database system written in Go. It's a decentralized, homogeneous, shared-nothing architecture that uses range partitioning to split the database across different nodes. They claim to be Postgres-compatible, both for the SQL dialect and for the wire protocol and the catalog, meaning that in theory, if I have my application written for Postgres, using Postgres SQL syntax and so forth and Postgres client drivers, I can just point it at CockroachDB and everything will work. Not entirely true: some things will be different, the semantics of certain things will be different because CockroachDB is distributed, and certainly if there are new features that just came out in Postgres, CockroachDB probably won't support them yet. So, we haven't really talked about licensing models this semester, but it's interesting to point out that CockroachDB is open source, but they use what is called the BSL, the Business Source License.
And this is sort of a trend over the last five to eight years, in databases in particular, probably in other open-source projects too: it's open source, and the idea is that if you're a researcher like us, or you're building a hobby project, or you just want to use it for your application, whatever it is, you can just download the software and use it. But the BSL prevents you from being a cloud vendor, taking their software and selling it as a service, like an Amazon, like a Google, like a Microsoft. And there's a bunch of systems that have changed their licensing model to switch to something like the BSL: Mongo has something similar, Elastic switched to something similar. Basically, there were a bunch of open-source projects where Amazon was making more money selling the database system as a service than the actual people building it, and so they switched licensing models to prevent that. It hasn't been challenged in court, but it's been enough to scare some cloud vendors away from doing certain things. Mongo has had dust-ups with Amazon; Elasticsearch had dust-ups with Amazon, so Amazon forked it, and that's called OpenSearch now. Everyone's trying to avoid getting taken over, eaten alive, by the cloud vendors. So, CockroachDB on every single node, even though it's distributed, uses log-structured storage; we'll talk about that in a second. The query engine uses a pull-based vectorized processing model. Even though it was originally designed for OLTP, they switched to a vectorized model a few years ago because they want to also support analytics on the system. And they do MVCC with OCC. What's interesting is that all transactions, even distributed transactions, run at the serializable isolation level.
Like if you say, I wanna run READ COMMITTED or some lower level, that's just an alias for serializable, right? They don't run anything lower than that, which is really interesting. So CockroachDB's architecture is the multi-layer stuff we've seen before, but at its core it's just a replicated key-value store, and there's this infrastructure on top of that key-value store that understands SQL, understands partitioning schemes, understands transactions, and so forth. So you can think of the storage layer as just one giant sorted map, and the upper layers figure out: okay, for this query you need data from here, so go up here, and so forth. When CockroachDB first came out, they were using RocksDB, which is Facebook's log-structured key-value store, a single-node embedded database system. And I remember they told me a few years ago when I went to visit them that the overhead of going from Go into C++ for RocksDB was quite high. So a few years ago they threw RocksDB away and wrote their own clone of RocksDB in Go called Pebble, and that way the whole stack is now entirely written in Go. And that's the logo for Pebble that I made, because they didn't have one, so I sent it to them, because I had to put something on the slide. Anyway, to do transactions, they're not gonna use two-phase commit alone, they're not gonna use Paxos; they're gonna use Raft, which again is just a variant of Paxos that we talked about last time. So the way they order transactions is interesting.
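The layered design, SQL on top of one giant ordered key-value map, can be sketched with a toy row encoding. The key format below is hypothetical (real CockroachDB uses an order-preserving binary encoding, not these slash-strings), but it shows why a primary-key lookup becomes a plain key lookup and a table scan becomes a prefix range scan:

```python
# Sketch: the "SQL layer" maps rows onto keys in one big ordered KV map
# (Pebble/RocksDB-style). Key format here is a made-up illustration.

def row_key(table, pk):
    return f"/{table}/{pk}"          # real systems use binary encodings that
                                     # sort correctly for all key types

class KVTable:
    def __init__(self):
        self.kv = {}                 # stand-in for the ordered KV store

    def insert(self, table, pk, row):
        self.kv[row_key(table, pk)] = row

    def select_pk(self, table, pk):
        # point lookup on the primary key == one key lookup in the KV store
        return self.kv.get(row_key(table, pk))

    def scan(self, table):
        # full table scan == range scan over one key prefix
        prefix = f"/{table}/"
        return [v for k, v in sorted(self.kv.items()) if k.startswith(prefix)]
```

Because the map is ordered, range partitioning falls out naturally: a contiguous span of keys is a partition that can live on one node.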
So they're gonna use what are called hybrid clocks, which I think we talked about a little bit earlier in the semester. Because they want to run transactions across nodes that could be in different locations around the world and in different data centers, you can't guarantee that the clocks are all tightly synchronized. So transactions get timestamps using a combination of the wall-clock time from the server you're running on, a logical counter, and a node ID to break ties. And that's enough to make every transaction's timestamp globally unique and to determine the order in which transactions should commit. The local clocks are still synchronized using NTP, but that's not super precise. When we talked about Spanner, they were using GPS satellite receivers and atomic clocks, which guaranteed their times would be within some tight bound of each other across the world. But CockroachDB can't assume that proprietary hardware; they rely on the wall clocks being loosely synchronized, and that's just good enough for them to coordinate without massive delays. Of course, if you have some node whose clock is way off, say by an hour, then every transaction on it will fail, because its timestamps will be way back in the past, and that will break things, so you manually intervene and kill that node, but that's beside the point. For OCC, the way they do it is the transaction stages all its writes as intents, and then when it goes to commit, that's when it checks whether it passed the validation phase and is allowed to install them. And all the metadata, the transaction state, the partitioning scheme, the catalog of the database itself, is just also stored as tables in the key-value store, replicated, and you run transactions on top of that.
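The hybrid-timestamp idea, wall clock plus logical counter plus node ID as a tie-breaker, can be sketched in a few lines. This is a simplified illustration of the concept, not CockroachDB's actual HLC implementation (which also updates from timestamps received in messages):

```python
# Sketch of a hybrid timestamp: (wall time, logical counter, node id).
# If the wall clock hasn't advanced past the last timestamp handed out,
# bump the counter; the node id breaks any remaining ties.

import time

class HybridClock:
    def __init__(self, node_id):
        self.node_id = node_id
        self.last_wall = 0
        self.counter = 0

    def now(self):
        wall = int(time.time() * 1000)  # milliseconds, loosely NTP-synced
        if wall > self.last_wall:
            self.last_wall, self.counter = wall, 0
        else:
            self.counter += 1           # clock hasn't moved: use the counter
        return (self.last_wall, self.counter, self.node_id)
```

Because Python compares tuples lexicographically, these timestamps give a total order across all transactions without requiring the nodes' clocks to be precisely synchronized.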
So here's a basic overview of how a transaction works. Again, you have some kind of catalog that keeps track of the partitioning scheme of the database, and I think by default in CockroachDB everything is range-partitioned, and this catalog is replicated across all the nodes. So if I want to do an update, say a simple case where I'm updating on the primary key, I check the catalog to see where I can find that partition, then figure out which node is the leader; they run leader election for a partition whenever they need to determine who the leader is. All the writes go to that leader node, and it's responsible for propagating the updates to the other nodes that have the replicated copies of that partition using Raft, getting everyone to agree that the change is allowed to happen, and then you go ahead and commit. When you do a read, by default you always go read from the leader, because again they only run at the serializable isolation level, so you can't do stale reads by default. If you always go to the leader, you're guaranteed to see the latest version of any object you're looking for. And obviously, as we discussed with primary-replica versus multi-master replication, if all your writes and all your reads are going to the leader, the leader can become a bottleneck and slow things down. They don't let you just set a lower isolation level to read from other nodes; you actually have to modify the SQL syntax to give a hint to the database system that you're okay with reading stale data from the replicas. In this case, you change the SELECT query to say something like SELECT ... FROM table AS OF SYSTEM TIME with a maximum staleness of 10 seconds.
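The catalog lookup in the first step, key goes in, owning range's leader comes out, is just a search over sorted split points. Here's a minimal sketch (hypothetical names, not CockroachDB's actual range descriptor code):

```python
# Sketch of a range-partitioned catalog: given sorted split keys, find
# which node leads the range that owns a given key.

import bisect

class RangeCatalog:
    def __init__(self, split_keys, leaders):
        # ranges: (-inf, s0), [s0, s1), ..., [s_last, +inf)
        assert len(leaders) == len(split_keys) + 1
        self.split_keys = split_keys   # must be sorted
        self.leaders = leaders         # one leader per range

    def leader_for(self, key):
        # binary search for the range containing the key
        idx = bisect.bisect_right(self.split_keys, key)
        return self.leaders[idx]
```

Every node can answer this locally because, as the lecture notes, the catalog itself is replicated everywhere; writes are then shipped to whichever node the lookup names as leader.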
This then allows the database to decide: okay, if this node is within 10 seconds of the leader, I can read the data from it. Although CockroachDB is open source, I think this feature is only available in the enterprise commercial version, right? So again, CockroachDB is a decentralized, homogeneous, shared-nothing distributed database system that uses Raft to do transactions across different nodes, with multi-versioning on top of that. All right, next one: Snowflake. If you take 15-721, we spend a lot of time talking about Snowflake. It is one of the first cloud-native OLAP database systems. Everything's written in C++. It's one of the first systems that did a shared-disk architecture, at least for the cloud. The query engine is a push-based vectorized query processing engine, and this is what actually made Snowflake different from a bunch of other systems that were trying to do analytics at that time. They leaned very heavily on SIMD, vectorized instructions on the CPU, to process things in parallel within operators. To avoid the overhead of interpreting the types of columns and data at runtime and having a bunch of indirection in the code, giant switch statements that say, if I'm adding two values together, if they're integers do this, if they're floats do that, they want to avoid all that indirection because it's not great for modern CPUs. So what they do is pre-compile all these low-level primitive operations for different data types. If I want to add two values, or maybe check whether one value is greater than another, I have different copies of that same function: one for floats, one for integers, one for strings, one for dates, and so forth.
And you pre-compile them when you're making the build for the system, and then at runtime, because you have a catalog, you can look in the schema and say: oh, I know I'm going to operate on a column of this type, let me go pick the primitive that knows how to operate on 32-bit integers or 64-bit integers and so forth. Yes? The question is: would this be implemented with C++ templating? Yes. In BusTub we don't do that. In BusTub, if you go look at the code for the value types, there's a switch statement. And that's how most systems do it; Postgres does this, MySQL does this, a lot of systems do this. But again, if I'm going to read a billion tuples and the type is always going to be the same, I don't want to execute that switch statement over and over again. It's a waste. His question is: you have all these different versions of the function anyway, so does templating save you much over a switch statement? Think of it as the dispatch code. Instead of going through the switch statement, when you generate the query plan you literally put in the pointer to the function that does 32-bit integer comparisons, and at runtime you call through that pointer instead of calling a function whose switch statement then dispatches for you. It's a software engineering thing. Do you save much time? Yes, you want to get rid of the switch statement. I guess you're asking, why would you not do this? Because there's a software engineering overhead. You have to design your system for this.
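The dispatch idea can be sketched even in Python: resolve the typed primitive once at plan time from the catalog's column type, so the per-tuple loop never re-checks the type. (In C++ the three functions below would be distinct template instantiations with different machine code; in this Python sketch their bodies look alike, which is exactly the duplication templates generate for you.)

```python
# Sketch of pre-compiled typed primitives: one specialized function per
# type, chosen ONCE at plan time, instead of a per-row type switch.

def gt_int(col, val):
    return [x > val for x in col]    # specialized for integers

def gt_float(col, val):
    return [x > val for x in col]    # specialized for floats

def gt_str(col, val):
    return [x > val for x in col]    # specialized for strings

PRIMITIVES = {"INT": gt_int, "FLOAT": gt_float, "STR": gt_str}

def plan_filter(column_type):
    # plan time: look up the column's type in the catalog and grab the
    # "function pointer"; runtime just calls it on each batch of tuples
    return PRIMITIVES[column_type]
```

The switch-statement alternative would branch on `column_type` inside the loop for every one of a billion tuples; here the branch happens once, before execution starts.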
And prior to Snowflake, there was another system from one of Snowflake's co-founders called VectorWise, which came out of MonetDB, one of the early column-store systems. The VectorWise people, I think, innovated on this; before them, nobody did it this way. Why not? I don't know. The alternative is code generation, like SingleStore was doing: you take the query plan, generate the op codes, and compile that to machine code for every single query. That's the alternative. But IBM did that in the 70s and then abandoned it because of the software engineering overhead. Also, column stores weren't really a thing until the 2000s, so it didn't really make sense before then. For OLTP stuff, in BusTub or Postgres or MySQL, you're grabbing one row, so this won't make a difference. But if you're doing OLAP, where you're ripping through a column of a billion tuples, that switch statement becomes a big bottleneck. All right, so they separate the table data from the metadata, and I'll talk about that in a second. There is no buffer pool at each node; there is a cache, but it's not what we think of as a traditional buffer pool. And the internal storage format for Snowflake is basically PAX, which we talked about before. But they do support reading data from Parquet, CSV, and all the other formats you'd have. All right, the architecture itself, this is the second time so I'll go through it real quickly, is disaggregated storage. You have the compute nodes, and then you have the storage, and the storage is just the cloud object storage stuff we talked about before.
So we talked before about how you don't want to use the OS for anything, but when it comes to cloud databases, maybe you do want to use Amazon's S3 or Azure Storage, and aren't you giving things up? In the original papers about Snowflake, they talk about how it made their life so much easier to offload all the storage stuff to Amazon and just worry about making a really fast engine. And it gives you the flexibility of scaling out compute separately from the storage. Since this is the second time, I'll skip most of this; I think we already talked about it. But let me tell you one thing they do that is actually interesting. Say a query shows up that wants to scan a lot of data. Say we're doing a simple join here, with a build side and a probe side, but this side of the query is doing a massive scan over maybe petabytes of data. If I have a fixed number of machines or workers to process this query, that scan becomes a bottleneck. So what they have the ability to do is dynamically rewrite the query plan to split it up into separate operations. In this case, they can basically add more nodes that do the table scan you need and write the results out to a bunch of temp files, and then the regular worker nodes load those back in, do a UNION ALL, and produce the correct answer. So they can add more nodes to scale things out dynamically, and the user or the customer doesn't know. And the intermediate results just get written back to S3, which for them is cheap and fast enough. Yes? The question: is this different from a materialized view? A materialized view would be: I compute the answer once, and then other queries can reuse that result.
And if the underlying table gets updated, the materialized view automatically gets updated. This here is just for a single query. Yes. Also, I don't think this mechanism supports incremental updates, deltas and things like that. His question is: how do they figure this out, is it done while the query is running or before it starts? Actually, I think I misspoke; it's done at query optimization time. They look at how much data you're going to read, they know this is going to be a bottleneck, and therefore they can give you more resources ahead of time. So it's an estimation, then? It's an estimation, yes. But the interesting thing is where the extra resources come from: they're not things you have to pre-allocate. They basically have a bullpen, a spare worker pool, and I think you can borrow nodes from other customers, because underneath the covers this is all hidden from you. So say you pay for five compute nodes; this query shows up, and they'll give you maybe a few extra just to finish the query, and you're borrowing them from somebody else. That's all hidden from you. Because otherwise those nodes would be idle. Okay, so again, Snowflake is a distributed, shared-disk, vectorized, push-based OLAP engine that runs entirely off of, well, they originally started on S3, but they support other cloud platforms as well. And remember, Marcin, the co-founder, said that in the early days people were telling them, you're crazy to be cloud-native; I want to run on-prem, make it run on-prem. They made the hard choice of deciding not to do that, even though some customers were offering them money, and in the end it paid off. They went IPO in 2020. It's amazing. Okay, so this is the database system that was voted for the most: MangoDB.
All right, that's the best logo I could find. It's the only logo I could find. So what is MangoDB? MangoDB is a satirical implementation of MongoDB, written in Python. I can't run it because it only supports an older version of Python and its libraries, and it only supports the MongoDB wire protocol up to version two. The key idea of MangoDB is that all the data is written to /dev/null. The joke is based on the fact that the original versions of MongoDB, this is like 2008, 2009, 2010, would put out blog articles and social media posts showing how much faster MongoDB was than every other database system. And part of the trick MongoDB was doing is that when you sent a write request, it would immediately come back with an acknowledgment: yep, we got it. And those were the benchmark numbers they were reporting, them versus MySQL versus Postgres. The obvious reason that's a bad idea is that it didn't even mean they had started applying the write; you just knew the server got the message. There was no guarantee that the write request would actually get written to disk. If you wanted to know whether your write was actually written to disk, you then had to send another request and say: hey, that thing I told you to write, did you actually write it? And it would block until it actually did. So this obviously meant that if you were doing a lot of writes in your application and there was a crash, and of course every system is gonna crash at some point, there was no guarantee that the things you thought were written within some time window actually made it to disk. There were other things they did in the early days too: because they were using mmap instead of a regular buffer pool manager, they had a single database lock.
So even though, again, going back to what you were saying before: if you can only have one writer or one reader, is that really multi-threaded? It was basically the same idea; they had a single exclusive lock on the database. If you had an update, it would lock the whole database system, because they couldn't handle concurrent updates. The mmap stuff was a whole other ball of wax. So I normally don't like to show you guys source code, but we can look at the source code for MangoDB, because this is it. This is literally the entire database system. The first thing to point out, in this line here: they open up a file descriptor to /dev/null. Fantastic, great, yes. Then down here, this is what happens when a command shows up: 'bye' tells it to disconnect, and 'wait' waits on that condition variable up there. So this is a command actually doing something. There's our single global lock for the entire database system; we acquire that. Then we write out whatever you sent over the wire to the database server... to /dev/null. Fantastic, right? And then, if you're in durable mode, they flush the output buffer and do an fsync on the file descriptor, which is /dev/null, so it comes back immediately. And if you're running in eventual persistency mode, then no matter what you send the server, you get back 42; otherwise you get a random value. And then you release the lock. So the joke is: is this as good as MongoDB? Well, if you care about your data, you obviously wouldn't want to use this, and you probably wouldn't want to use early versions of MongoDB either. What's that? "It's web scale." That video we cannot show here because of the language in it; we can post it later. Just Google "MongoDB web scale," you'll see it. It's a cartoon. So, I'm not trying to pick on MongoDB, right?
The playbook, if you will, of what MongoDB did to become successful has been done before: put out a, quote unquote, inferior product that people like to use because it's easy and it's new and exciting, and get enough traction that you get enough money to hire engineers to pay off the technical debt of the mistakes you made. MongoDB wasn't the first one to do that. MySQL did it in the 90s. MySQL's original storage engine was called MyISAM, and that thing was total crap: it would lose data, it couldn't handle transactions, and you would corrupt the database all the time. Then InnoDB came along and fixed all those things, and now you would not want to use MyISAM. And before MySQL, this is what Oracle did. Oracle was a garbage system in the 70s and the 80s. Ingres, the predecessor to Postgres, was considered the better database system, but Oracle got better traction because they made a bunch of claims about what it could do in the future, and eventually they got there. They got big enough that they could hire enough people to fix it, and now Oracle is a solid database system. So again, I'm not trying to pick on MongoDB. There's a marketing saying for this: you don't sell the steak, you sell the sizzle. So you sell your database system as the thing that's going to solve all the world's problems, yada, yada, yada. And yeah, people get burned along the way, but whatever, right? And then you get the money to fix things. Yes? MangoDB, okay, sorry; you pointed, so I was like, wait a minute. His question is: is it okay to use bleeding-edge database software for mission-critical applications? If it's your bank account, hell no, right? Of course not. But for some applications, sure, maybe.
But I would say the default choice for anyone here who wants to do a startup should be PostgreSQL. Then at some point, if your business grows, PostgreSQL won't scale and it won't be enough, and then you hire somebody to help you make it scalable, or you pick one of these open source projects or newer systems that can do it, right? But yeah, if it's mission-critical... I would also say that all the things the NoSQL guys said were a bad idea, like SQL (obviously, it's in the name), transactions, joins: all of them have added those things back. And so the intellectual distance between a NoSQL system and a relational database system keeps shrinking; at some point there's going to be no difference. So all the things they said were a bad idea turned out not to be. People called me a SQL maximalist or a purist when I was saying: you don't want to skip transactions, you don't want to skip SQL. Turns out they eventually came around and added support for it all. So the only NoSQL system that doesn't support SQL now is actually Redis; there's a clone for that, but again, it's primarily being used as a cache. I know the MongoDB co-founder, the CTO; he's not there anymore. He helped out in the early days when I was at CMU with research funding. And I remember him telling me at the office, "Oh, MongoDB will never support SQL, never gonna happen." They added it two years ago, right after he left, but still. So don't worry, they're not writing to /dev/null. And I think, again, for mission-critical data, shiny new things are not a good idea. The question is: Intel has this ADR stuff, and there's also RAID, like mirroring the disk. If the hardware does it, is that going to be enough?
Well, no, because this is what we've talked about the whole semester: things got written to disk, but first of all, can the hardware bring back the program state exactly? The program counter? I don't think it can. The problem there is, if there's a bug in your software and it starts going down some code path it shouldn't have and starts writing bad data to your disk, the hardware thinks that's fine; the software is what's broken, and you've got to recover from that. So if you can't guarantee strongly consistent transactions across different nodes, then that's going to be a problem. So as much as I seem like a person that kind of has loose morals (that's not the right word), when it comes to databases, I'm super cautious, because the database is like the operating system: it's the core foundation of a software system, and that's the kind of thing you don't want to mess around with. You want to use JavaScript at the application layer, that's fine, put it all over there. The database system should be written in something more serious. Oh, that's going to get me emails, okay. I'm going to show you one more fun system. This one makes me jealous, because I wish I had invented it. It's pure genius. It's called TabDB. It's a relational database system that stores the contents of your database in the titles of the tabs in your web browser. The way the guy did this was, he took the SQLite C code and ran it through Emscripten, which converts C code into JavaScript. So now SQLite is running in JavaScript in your webpage. And when it wants to do writes, it splits up your database file and writes the chunks into the titles of tabs in your web browser. And you can run SQL. It's amazing. So, oh, sorry, first we've got to make some tabs, right? So we run the CREATE statement, and now you see the tabs have been renamed, and now I can run the SELECT statement and I get results back.
I don't know where the results end up, but it ran, right? So if I add more tabs, I get more data. Oh, I know, I didn't do the inserts yet. That's why; I've got to do the inserts. Yup, and I get my results back. And that's stored entirely in the tab titles. Genius. I am very, very jealous of this. I don't know what my joke database would be; I've got to think of one, but this is very hard to beat. Yes? If you click on a tab, I think you just get... yeah. But what if you delete a tab, right? Oh, still got it. Delete the first one. Yeah. Oh, you know what it is? The data's so small... actually, I don't know, it's probably still in memory. Anyway, all right, that's a fun one, TabDB. We tried to get the guy to come give a talk, and I was going to try to promote it as a distinguished lecture in the department, like, here's a groundbreaking idea, but he was too busy. All right, so quickly, quickly: concluding remarks. Where did the name BusTub come from? I read this blog article that said the best way to name a database system is to pick two random one-syllable words and stick them together; that makes your database name guaranteed to be unique for search engines and stuff, like ClickHouse or GridGain. You search those words and it's guaranteed to be your database system. So one of my PhD students wrote a script where every morning on Slack it would take two random words, put them together, and post it, and we would try to figure out which one we wanted to use. They were screwing with me at first, because the first thing it spit out was "poop dish," but then the very next day, the real one it spit out was "BusTub." We just thought it was hilarious, so we went with it and hired someone to draw a logo. That's where the name came from. Why is the relational model superior?
I think I've already said this a couple of times, even today, but the relational model is the foundation for databases. You know, no one's going to invent a new arithmetic; no one's going to reinvent one plus one equals two. The relational model can be used to model any possible type of application or data set, whether it's arrays, lists, or, you know, JSON stuff. And although that goes against the original idea Ted Codd had in the 1970s, because he had no lists and no documents, over time the relational model has expanded to adapt to changes in how people program, to where today it can support everything. The next question was: why do the tech companies build so many different database systems? It's money, right? Plain and simple. All right, so, again, hopefully through the last day of class, having been here with us the entire semester, you now realize databases are awesome. They're super important. There's still a lot of activity in this space, and there's still a lot of money. We barely touched on all the different things you could actually do. I brought up in-memory databases today; that's the first time we've really talked about them, and that's a whole other category of databases that are super fascinating. There are embedded devices and large-scale systems, a bunch of these things, but the core ideas we talked about, even though we described them mostly in a single-node context, are applicable everywhere. So hopefully, going forward in the rest of your career, no matter whether you're actually building the internals of a database system or not, you'll understand what's actually going on when you run a query.
So when there are performance problems, at least now you can understand what's going on and decide: is it because we're using the database system wrong, or is it because the database system is just not implemented correctly or well enough? This is going to allow you to make better decisions throughout your life about what kind of system you should use. It goes back to the thing he was asking me before: why would anybody want to use one of these brand new databases, or should you use them for a mission-critical application? Now you can go look at the documentation and understand what claims a database system is actually making, and understand what things could be a problem later on. Because for the first five minutes, sure, the query might run fast; that's what MongoDB did. But as things scale, or as you do more complex operations, you could have a bunch of problems. Okay? All right, so before we go, we just got notice from the provost of the university that the voting is in, and DJTPL has been voted the most dank course DJ in all of Carnegie Mellon. So, yeah, it's like a rare photo. No, you grab this, it's yours. You're gonna like this. Got it? Okay, congratulations. All right, and again, for those who came late, DJTPL's agent sent over photos, so he's signing autographs right after class. Okay? All right, good luck with the rest of your classes. Take care, hit it.