So, any questions about project one? It's due today at midnight. Some of you have questions, and a lot of you have submitted. I know there's one of you I'm going to talk to after class. Anyone not started project one? OK, that would be really bad if you hadn't.

All right, so today's class is the last lecture we're going to have on concurrency control, but there's a lot of material I want to get through, and at the end we'll finish off by discussing project two. I'll start with an overview of the compare-and-swap atomic operation, because it's going to be useful not only for understanding how we do MVCC, but also for project two, where you'll need it in your index. Then we'll do an overview of MVCC and a deep dive into the different design decisions you have to make when you actually implement MVCC in a real system; this is the paper you were assigned to read. And if we have time, we'll talk about two modern MVCC implementations, Hekaton and HyPer, which will show how you wrap all of these things together in a single system.

OK, so as a real quick show of hands, who here knows what the compare-and-swap instruction does? Nobody? I mean, atomic addition, does that not ring a bell for anybody? OK, good, so one of you. All right, so compare-and-swap is not a new idea. It's been around in processors since, I think, the early 70s, but it's really come into vogue in the last 15 or 20 years, because everyone is trying to build latch-free or lock-free data structures, and the underlying primitive you use to make those work is compare-and-swap.

Compare-and-swap is a special instruction that the CPU provides that allows you to examine the contents of a memory location, check whether its current value is equal to some value you expect, and, if so, modify it with a new value; if not, the operation fails. So if we have a memory location M, we check whether it has a given value V, and if it does, we're allowed to install V'. The way you actually write this in code is with the atomic built-ins or CPU intrinsics; this is one example of how you would do it with GCC. It looks like a function call: you're given an address in memory that holds some current value, the compare value that you want to check the current value against, and the new value to install. The key thing to understand is that although it looks like a function call, the compiler rewrites the intrinsic into the single instruction that performs this operation, and that's why we say it's atomic. If you had to write this yourself in C, as "if the value equals this, then do that," it could not be atomic: you could have a race condition where someone changes the value after you evaluate your if clause. Compare-and-swap does the whole thing at once in one instruction. The one I'm showing here is __sync_bool_compare_and_swap; sometimes they have different names. This one returns a bool indicating whether it succeeded, and there are variants for different types: 8-bit, 16-bit, 32, 64. I don't think there's a 128-bit one.
So in this example here, we check: is the current value 20? If yes, then we're allowed to change it to 30, and that's all done in a single instruction. If you then come back and say, I want to use 25 as the compare value and install 35, the compare fails: it does not update the value, and the call returns false. There are other compare-and-swap intrinsics that do slightly different things; for example, instead of returning true or false, some return the current value, and you can use that to figure out whether you succeeded. It's really important to understand this, not only for project two, but also as we talk about how to actually implement MVCC and the OCC ideas from last class, because the underlying principle that makes it all work without having to set locks is compare-and-swap. We can use it as a building block for more complicated operations in our database system. So is it clear what this does? It's pretty simple, but the syntax looks a little weird because of the double underscores in front of it; again, it's a built-in that the compiler provides for you.

All right, so now we can discuss multi-version concurrency control. The basic idea is pretty easy to understand: the database system maintains multiple physical versions internally for a single logical object in the database. And again, we can be vague about what it means to be an object: it could be a table, a page, a tuple, or a single attribute. In practice, everyone treats objects as tuples, because that provides the right amount of parallelism with the right amount of overhead. So what happens is that any time a transaction wants to write to a tuple, the database system creates a new physical version for that logical object. And when I say logical, I mean that at the application level, when you run a query, you only see one tuple; internally, the system can store multiple physical copies of that tuple, and that's how it figures out how to interleave operations. Then, when a transaction reads an object, the database system has to figure out the correct version for it to look at, based on when the transaction started. We talked about snapshot isolation last class, and this fits it quite nicely: you want to see the version of an object that existed before your transaction started and that was made by a transaction that committed before you started. We'll see how Hekaton relaxes that a little to get better parallelism.

Again, the confusing thing when you read the thousand-core paper is that it talks about MVCC as a single protocol, and the protocol used in that paper is derived from the original work done in 1978 by a guy at MIT who wrote his dissertation on this. As far as we know, that is the first written explanation or definition of the MVCC protocol. The first implementation came a few years later in a system called InterBase at DEC. DEC no longer exists, but InterBase actually still does: it was open-sourced and is now called Firebird. So it's one of the original databases from the early 1980s.
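Here is a minimal, compilable sketch of that exact 20/30 and 25/35 example using the GCC built-in named above (std::atomic's compare_exchange_strong would be the portable C++ equivalent):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    int64_t m = 20;  // the memory location M

    // CAS #1: compare value 20, new value 30. The current value is 20,
    // so the swap succeeds and m becomes 30.
    bool ok1 = __sync_bool_compare_and_swap(&m, 20, 30);

    // CAS #2: compare value 25, new value 35. The current value is now 30,
    // so the compare fails, nothing is installed, and false is returned.
    bool ok2 = __sync_bool_compare_and_swap(&m, 25, 35);

    std::printf("ok1=%d ok2=%d m=%lld\n", ok1, ok2, (long long)m);
    return 0;  // prints: ok1=1 ok2=0 m=30
}
```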
I've never actually used it, so I don't know how good it is, but they still maintain it; there are people actively working on it, but it's certainly not as famous as Postgres or MySQL as an open-source system. The reason this is actually relevant to this class is that, although the protocol is from 1978, pretty much every single database system that has come out in the last 10 years, with the exception of H-Store and VoltDB, uses MVCC. So you have this protocol that's 30 or 40 years old, and now everyone is actually using it. Part of what the paper you read is trying to understand is which aspects of MVCC make it ideal for modern in-memory systems. Back in the 1970s, the first database systems doing transactions were Oracle, System R, and INGRES, and when they first came out, none of them were doing MVCC. Oracle added MVCC maybe later, in 1984 or 1985, but for a long time most of the systems that came out were doing two-phase locking. And actually, the first version of Postgres used two-phase locking, not a variant of MVCC. So we want to understand what's interesting about MVCC that makes it be used in every single database system that's come out in the last 10 years.

The main benefits we get from MVCC are, first, that writers don't block readers. If you have a transaction that's writing to an object, other transactions can read older versions of that object without having to acquire a shared lock and get blocked by the writer. We also get a nice benefit for transactions that are pre-declared as read-only; SQL allows you to do this: you can start a transaction and say, I'm read-only, I'm never going to write anything. When that happens, the database system can ensure that the transaction is given a consistent snapshot of the database, so it can do all the reads it wants without ever being blocked by any writer. That's a side effect of having snapshot isolation; it's something you get for free.

And the last thing, which is interesting from an academic standpoint, though I don't know how interesting it is in the real world, is that multi-version concurrency control allows you to easily support what are called time-travel queries. A time-travel query is when you say, I want to run this query on my database as it existed exactly two days ago, or ten days ago, or a year ago. Because we have snapshot isolation, you're guaranteed a consistent snapshot of the database that only sees the things that existed at that point in time. This is not a new idea: when Postgres first came out, this was one of the features they touted. They eventually took it out of Postgres, around 1999, because, as we'll see when we talk about garbage collection, if you maintain the entire history of every single version of every single object that ever existed, you run out of space pretty quickly, and in an in-memory system that's probably a bad idea. It's come back now, though; you're starting to see newer systems tout that they support it. There's a new database system called FaunaDB, and if you read their description of MVCC, they claim: we can do time-travel queries, look how great it is.
SQL Server will sell you a time-travel query package as well, and I think Oracle and IBM have something similar. As far as I know, though, every time I talk to somebody, nobody really does this. It only shows up in places like financial firms, Wall Street banks and so on, because they care about regulatory obligations. Because of Sarbanes-Oxley, you have to maintain the history of all your financial transactions for the last seven years, so if you have to go back and ask what was going on in my database six years ago, time-travel queries are the ideal thing to use. But beyond that, most people don't actually need something like this, and if you do, you create snapshots and shove them off to something like Hadoop or Spark. So it's nice to have, and it's an interesting property of MVCC, but I don't know how useful it is in the real world.

The thing I hope you got from today's assigned reading is that MVCC is more than just the concurrency control protocol we discussed with the thousand-core paper. Again, this is a byproduct of the terminology in concurrency control being kind of vague and overloaded. Remember, I talked about how there's optimistic concurrency control, but then there's a whole bunch of schemes that are considered optimistic that aren't the same thing. So MVCC is a protocol, but it's also this higher-level idea of how to do transaction management with multi-versioning. How you design your MVCC implementation affects how the database system manages transactions, how it stores different versions, and how it maintains its indexes. It affects every single part of the database system. The paper you were assigned to read looks at the four major design decisions you face when you implement MVCC in a modern system, and tries to understand the trade-offs for each of them. As I said, over the last 10 years all these different database systems have come out saying they do MVCC, but no one did a thorough evaluation to decide what the right implementation of MVCC actually is. For example, we'll talk about the Hekaton paper: it's a great paper that describes how they do MVCC, and they compare maybe one or two of the design decisions, but there are all these other things they never compare; they just say, this is how we did it, done. The HyPer guys basically did the same thing. So when we were designing Peloton, we knew we wanted to use MVCC to support the hybrid workloads we've talked about in this class, but for all these different design decisions we didn't actually know which was the right one to use, because everyone just picked their own and no one tried to understand why. That's what this paper is really about: since we knew we were going to implement MVCC but didn't know exactly how, let's implement everything, see what comes out best, and that's what we'll use in our system.
So that's the motivation for this paper. I will say, too, that part of the reason the assigned paper isn't public on the website yet is that we don't know what to call it; they didn't like the current title that you saw, but it's already been accepted at VLDB and it will be published later this year. So you're getting a sneak peek at it.

OK, so first we're going to discuss the concurrency control protocol, and again, this is how we decide how to interleave transactions; we've covered a lot of this in the last two classes. Then we'll discuss how to store the different versions in our system, then garbage collection, and then how to manage the pointers in our indexes. We'll go through each of these one by one. So this is the same table that you saw in the paper, showing the scope of the different design decisions; there's no one system that makes all the same decisions as another. From HyRise on down, these are the systems implemented in the last 10 years. Oracle actually originally came out in the 1970s, but it didn't support transactions at all, and it wasn't until around 1983 or 1984 that they finally added MVCC. With the original version of Postgres, when you read the early papers from the 80s and 90s, they talk about using two-phase locking, and then when it left Berkeley and became an open-source project that people actually used outside of Berkeley, that's when they switched to using timestamp ordering. And then NuoDB does MVCC with two-phase locking. So you can see there are all these different design decisions, everyone is doing something different, and we're trying to figure out what's actually the best way. That's the goal of this work.

Another thing to be mindful of is that in order for the MVCC protocol to work correctly, we need to maintain some extra information inside the tuple to figure out which version we're looking at and whether we're allowed to see it.
So in the header of every single tuple we allocate some number of bits to maintain this extra information, or metadata, about the transaction, and everything that comes after that is the actual data. This gets changed up a little when we talk about column stores, because in a column store you break the columns out into separate blocks of memory, and you wouldn't want to duplicate the header for every single column; it wouldn't make sense. For now, assume we have a row store.

First, we have some kind of unique transaction identifier, and again, this is typically some kind of timestamp; whether it's based on the system clock or a logical counter doesn't matter. Then we have the begin and end timestamps, and these tell us the version's lifetime: based on the timestamp your transaction gets when it starts, they determine which range you fall into when looking at each version, and therefore whether you're allowed to see it. Then you have a 64-bit pointer to either the next tuple or the previous tuple in your version chain; we'll discuss how that works and the different trade-offs. And then there's some additional metadata that we store for the different protocols. We'll see this because we'll discuss doing timestamp ordering with MVCC, but as you read in the paper, two-phase locking and OCC have to maintain their own additional metadata here too.

The one thing I'll point out is that each of these fields is 64 bits: a 64-bit transaction ID, 64-bit timestamps, and 64-bit pointers. That's actually a lot when you think about it. In this case, ignoring the additional metadata, we have four fields to maintain, at eight bytes per field, so that's 32 bytes per tuple. If you have a billion tuples, you have 32 gigabytes of data just for the metadata. That's a lot. We'll see in Hekaton how you can maybe eliminate one of these fields and save eight bytes, but in general you need to maintain all of this stuff. So again, MVCC sounds amazing and avoids the locks you would have to take in two-phase locking, but you're still paying a space cost to do this, and there's not really any way to compress it down.

All right, so for concurrency control, these are the same protocols we looked at in the previous lectures and talked about with the thousand-core paper and the OCC paper. The only thing that really changes is that now they maintain multiple versions. We'll go through how you do multi-version timestamp ordering, but for the other ones it's more or less the same idea: instead of overwriting the master version, the single version of a tuple, you create a new physical version of it. However you maintain the storage doesn't matter; there are always going to be multiple physical versions. When you think about it, in the case of OCC it's pretty straightforward to see how you make it multi-versioned: you have the private workspace, and when your transaction validates and commits, rather than overwriting the original version, you just plop your copies into the table and make sure you maintain the chain for all of them. Two-phase locking is basically the same idea.
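To make the tuple header layout from a moment ago concrete, here is a sketch, with illustrative field names rather than any system's exact layout, of the 32 bytes of version metadata described above; in practice these fields are updated in place with atomic compare-and-swap:

```cpp
#include <cstdint>

// A sketch of the per-tuple version header described above. Four 64-bit
// fields = 32 bytes of metadata per tuple, before any protocol-specific
// extras (e.g., a read timestamp for multi-version timestamp ordering).
struct TupleHeader {
    uint64_t txn_id;    // id of the txn holding the write "lock" (0 = free)
    uint64_t begin_ts;  // this version is visible from this timestamp...
    uint64_t end_ts;    // ...until this one (UINT64_MAX = infinity/latest)
    uint64_t pointer;   // next (or previous) version in the version chain
};
// The tuple's actual attribute data follows the header.
```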
The timestamp ordering one I'll walk through, because it's relevant: this is how we do it in Peloton, this is how it's done in Hekaton, and this is actually how the original MVCC protocol worked, but we're going to make it more modern and use compare-and-swap to make it high performance.

The first thing to point out is that our tuple has an additional field for the read timestamp, which records the timestamp of the last transaction that read this tuple. Then we have the transaction ID field, which holds the ID of the transaction that currently holds the lock on this tuple in order to modify it, and the begin and end timestamps that give the version's valid range. Under multi-version timestamp ordering, the transaction ID is always zero unless some transaction currently holds the lock, and you use compare-and-swap to try to set it: if you fail, someone else acquired the lock before you did; if you succeed, you hold the lock. The begin timestamp is the timestamp of the transaction that created this version, and the end timestamp is either infinity, meaning this is the latest version, or the timestamp of the transaction that created the version that comes after this one.

So let's do a really simple example of a transaction: I want to do a read on A and a write on B. When the transaction comes into the system, we assign it transaction ID 10. (I think the projector is a little off; the guy actually came and cleaned the filter after I complained so many times about it crashing, so it shouldn't crash anymore, but I guess it's a little off here. I sent him an email with all the YouTube links of me talking, it going down, and me getting disappointed. I think it's fixed; knock on wood.)

OK, so we're transaction ID 10 and we want to read object A. This transaction is allowed to read A if no transaction currently holds the lock and our timestamp falls within the range of the begin and end timestamps. Here that range is one to infinity, so obviously 10 fits, and we're allowed to read it. Then we go ahead and update the read timestamp to our timestamp, and again we can use a compare-and-swap to make sure that if someone came along at, say, timestamp 11 and read it, we don't go back and change it to 10. The read timestamp always has to move forward in time, and compare-and-swap lets us do that with a single instruction.

Now we want to do the write. The first thing we have to do is acquire the lock on the tuple we want to modify, to make sure that nobody else is trying to do it. Also, to be clear, the transaction IDs I'm showing here, like Ax and Bx, are not actually stored in the database system; they're for illustration purposes, so we understand what we're looking at. So the first thing we do is try to flip the transaction ID to be our timestamp using compare-and-swap. If we get it, we know we now hold the write lock on this tuple. Then we go ahead and create our new version, and when we store it, we initially set it with our transaction ID, and because of that we implicitly hold the lock on the new version as well.
Now we set the new version's end timestamp to infinity, saying this is the latest version, and its begin timestamp to the timestamp of our transaction. Then we go back and update the end timestamp of the old tuple, which we can do because we hold the lock on it, setting it to our timestamp. This says that the old version is valid from 1 to 10, so if your timestamp is greater than that, you shouldn't read that version. Once this succeeds, we release our locks, our transaction commits, and we're done; now this new version is visible to anybody else. It's basically the same thing as the timestamp ordering protocol we discussed two classes ago, but now I'm showing you how to use compare-and-swap to do all the updates to these fields atomically, so you understand how we do this in a multi-version context. Any questions about this?

Yes: would you release the write lock when the transaction completes the write, or after the transaction is committed? So the question is whether you release the write lock after the transaction commits or as soon as it completes the write. In this example, you release it after you commit. The idea is that you don't want anybody to read dirty data, so no one should be able to read anything from an uncommitted transaction. When we talk about Hekaton, we'll see that they allow speculative reads; you can do that, but then there's an extra check you have to do at the end. The scheme here doesn't require maintaining any centralized state: you know what your timestamp is, and as you read the tuple, you know whether it's committed or not, because you can look directly inside the tuple and see whether the flag has been flipped. For dirty reads, you'd have to go check some kind of centralized data structure: this tuple is locked by transaction 10, I'm going to read it anyway, but now I need to figure out whether transaction 10 is actually going to commit. So this is the decentralized way of doing timestamp ordering; we'll see the centralized way, which can do what you're describing, later.

Yes: your question is, what if the transaction ID is smaller than the read timestamp? In that case, assuming your timestamp fits within the begin/end range and you're allowed to read it, you just don't update the read timestamp, because it can't go back in time; it always has to go forward. You're still allowed to read it; you just don't update that field.

And what if that transaction is a writing transaction that wants to update this tuple? That's not possible. Let's be clear: say transaction five read this, and transaction four wants to write to it. That would fail, because you would end up writing something in the past that a future transaction did not read, and that would violate the serial order. So the writing transaction would abort.
Yes, another one: we have two transactions, one with transaction ID 10 and one with ID 11. Transaction 11 reads the tuple first and sets the read timestamp to 11; then transaction 10 comes along and wants to read the same thing. It's allowed to, because it's within the range, but it doesn't update the read timestamp, because 11 is greater than 10. In that case you don't care: all you need to know is that somebody in the future read this object, and therefore I can't update it if I'm in the past. So yes, that could cause a false abort, but the trade-off is that you don't have to maintain extra metadata to handle that one case.

Yes: I'm a little confused about the old tuple. When a write transaction creates a new version, does it still hold the lock on the old tuple? So the question is: at this point, my transaction wants to update this tuple, so I acquire the write lock on the old version, and when I insert the new tuple with my transaction ID, I implicitly acquire the lock on that too; why do I have to keep holding the lock on the old one? Because, although I'm not showing the pointer stuff here, I still have to go back and change the version pointer to point to my new version, and furthermore I have to update the old version's end timestamp. You release the locks when your transaction commits, and you have to be careful to do it in the right order: I think you go from the newest version to the oldest version, so you release the new one followed by the old one. You don't want someone to get to the old version, recognize that its end timestamp is earlier than their current timestamp, try to jump to the next version, and then be unable to read it because it's still locked. So you release this one, followed by that one.

Yes: so the reason you need a read timestamp is that you don't want an earlier write transaction to modify content that has already been read by a later read transaction? Right, the statement is that the reason you need the read timestamp is that you don't want someone to go back and write a new version with a lower timestamp underneath something that was already read by a transaction with a higher timestamp; because we're not actually executing in true serial order, we need this field to catch that case. Correct.

OK, awesome questions. Let's keep moving, because there's a lot to cover.
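Pulling the whole example together, here is a condensed sketch of the read and write paths just described. All names are hypothetical; a real system would also handle chain linking, validation, and aborts, but this shows where the compare-and-swap operations sit:

```cpp
#include <atomic>
#include <cstdint>

// Illustrative version header for multi-version timestamp ordering.
struct Version {
    std::atomic<uint64_t> txn_id{0};            // 0 = write lock not held
    std::atomic<uint64_t> read_ts{0};           // highest timestamp that read this
    std::atomic<uint64_t> begin_ts{0};          // start of valid range
    std::atomic<uint64_t> end_ts{UINT64_MAX};   // UINT64_MAX = "infinity"
};

// Reader: visible if nobody holds the write lock and our timestamp falls
// inside [begin-ts, end-ts). On success, advance read-ts monotonically.
bool Read(Version& v, uint64_t ts) {
    if (v.txn_id.load() != 0) return false;
    if (ts < v.begin_ts.load() || ts >= v.end_ts.load()) return false;
    uint64_t cur = v.read_ts.load();
    // CAS loop: only move read-ts forward in time, never backwards.
    while (cur < ts && !v.read_ts.compare_exchange_weak(cur, ts)) {
        // cur is reloaded on failure; stop once it is already >= ts
    }
    return true;
}

// Writer: create new_v superseding old_v, or return false (abort).
bool Write(Version& old_v, Version& new_v, uint64_t ts) {
    if (old_v.read_ts.load() > ts) return false;  // read in our "future"
    uint64_t unlocked = 0;
    // Acquire the write lock: CAS txn-id from 0 to our id.
    if (!old_v.txn_id.compare_exchange_strong(unlocked, ts)) return false;
    new_v.txn_id.store(ts);          // implicitly lock the new version
    new_v.begin_ts.store(ts);
    new_v.end_ts.store(UINT64_MAX);  // this is now the latest version
    old_v.end_ts.store(ts);          // old version valid up to our timestamp
    // ...link new_v into the version chain; then, at commit, release the
    // locks from newest to oldest:
    new_v.txn_id.store(0);
    old_v.txn_id.store(0);
    return true;
}
```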
All right, so the next design decision we have to deal with is the version chain: how we actually store these different versions. The version chain is what allows the database system at runtime to figure out which version of a tuple a transaction should be looking at. The version chain is basically a latch-free, singly linked list that you build from the pointers in the tuple headers. We'll see when we talk about skip lists and other indexes why it has to be a single direction: you can't do an atomic compare-and-swap on multiple memory locations, only one, so you can only guarantee a latch-free linked list going in one direction. This will be an issue for you in project two, where you need to go in the other direction. The indexes always point to the head of the version chain; the head can be either the newest version or the oldest version, depending on which ordering scheme you use, but we always consider the head to be whatever the primary key index points to.

In the paper we talk about how the different threads store these versions in what we call local memory regions. What I mean is that each thread allocates some kind of memory pool, and any time it has to create a new version, it gets a free slot from its local memory pool and uses that for the insert. Because each thread maintains its own memory pool, it doesn't have to take a lock to acquire a new location; it's not multiple threads sharing a single pool, each one has its own. This becomes more interesting when we talk about NUMA-aware allocation: you can figure out that this thread is running on this socket, so you allocate memory close to it, and that thread is running on another socket, so you allocate close to that. By doing this you get that behavior basically for free, and it's essentially the same thing the partitioned version of Silo was doing in the last paper: each thread owns a chunk of the database.

Now, the different storage schemes determine where and what information we have to store for each new version that we create. The three choices are append-only storage, time-travel storage, and delta storage. At a high level, when you squint at these, they look pretty much the same, but they have different implications and trade-offs, so we'll go through them one by one. With append-only, you insert the new version of the tuple into the same table where you store all the other tuples. With time-travel storage, you have a second table where you insert the older (or newer) versions. And with delta storage, you only store the diff of the attributes that got modified.

So for append-only, you have one table, one table space or heap, and all the different physical versions of a logical tuple are stored together, with our pointer maintaining the chain from one version to the next for that single logical tuple. Any time you want to update a tuple, say object A here, you find the latest version, copy it, and modify only the values you want to modify; so the key stays the same, but we update the value. Then we do another compare-and-swap to update the version chain pointer of the previous tuple to point to this new one. In this example I'm going from oldest to newest: Ax is the oldest, it points to Ax+1, and the newest one is Ax+2. With oldest-to-newest, this is really easy to do: when I want to create a new version, I append it to a free slot in the table and then update the previous newest version to point to my new newest version.
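A sketch of that last step, with illustrative types: installing the new version is one compare-and-swap on the previous newest version's chain pointer, and a failed swap tells you a concurrent writer got there first:

```cpp
#include <atomic>

// Minimal version record: just the chain pointer matters here.
struct Version {
    std::atomic<Version*> next{nullptr};  // null = currently the newest version
    // ...header fields and tuple data as sketched earlier
};

// Link new_version after prev_newest. The CAS succeeds only if prev_newest
// is still the end of the chain; otherwise another writer already appended.
bool LinkNewVersion(Version* prev_newest, Version* new_version) {
    Version* expected = nullptr;
    return prev_newest->next.compare_exchange_strong(expected, new_version);
}
```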
The downside of this, though, is that you have to traverse that chain at runtime to find the latest version. If you have an OLTP-heavy workload, where you're running a lot of transactions and always trying to look at the latest version of a tuple, then every single lookup has to traverse the chain and check the visibility with the begin and end timestamps over and over again. That sounds bad, but the plus side of this approach is that when you update a tuple, you don't have to update any of the pointers in the indexes, because the indexes always point to the head of the version chain, which in this case is always the oldest version. We'll see how you deal with secondary indexes later on. So this is actually the easiest to implement; it's just slower at runtime because you have to traverse that chain.

Newest-to-oldest is the flip of that: the head of the version chain is always the newest version. Any time you update a tuple, you take the latest version, have it point to the previous newest version, and then go update all your indexes to point to that latest version. But now this is super fast for transaction lookups, because when you traverse the index and find the version chain, chances are the first version you see is the version you want. So each of these chain orderings has different trade-offs, and we'll discuss that a bit when we look at some of the graphs.

With time-travel storage, it's basically the same thing as append-only, but instead of appending the new version to the main table with everything else, in this example we copy the old version that existed in the main table over to the time-travel table, update whatever pointers it needs to maintain the version chain, and then insert the new master version into the main table and update the pointer there. In this case I'm going from newest to oldest, whereas the last example went oldest to newest: the newest version of the tuple is always here in the main table, and the older versions are over in the time-travel table, with the same kind of chain maintained over there.

So take a guess why you'd want to do this. At a high level this looks basically the same as append-only, except that instead of storing everything in one giant table I have two tables. Can you guess why this is beneficial? Yes: you avoid index updates. The statement is that you avoid index updates, but that's a byproduct of going newest to oldest; I could just as well make this oldest to newest, keeping the oldest version here and maintaining the newest versions over there. Yes: easy for scans. The statement is it's easier for scans; again, yes and no, that's a byproduct of going newest to oldest, because I only need to scan the latest versions and I can just scan the main table. If I'm doing snapshot isolation and want to go back to an older version, I have to go over to the other table, and in some ways that's harder, because if I don't find what I'm looking for here, I may also have to do the scan over there. Yes: I think it's easy to do garbage collection. Bingo, that's it. She says it's easy to do garbage collection, absolutely.
If I'm going newest to oldest in this example, I don't have to scan anything in the main table; I just look in the time-travel table, find all the old versions, and blow them away. So it makes garbage collection much easier to implement, because you have to look at less data; you still have to do all the things we'll talk about later to make it go faster, but again, at a high level this looks roughly the same as append-only, while underneath the covers it's slightly different. This is what SAP HANA does, and if you buy the time-travel package from SQL Server (they're not doing MVCC normally), this is essentially what they do. I think SAP HANA goes from oldest to newest, and in SQL Server it's newest to oldest.

All right, the last one is delta storage, and the idea here is that the version in the main table is always considered the master version. For delta storage, I'll say it really only makes sense to go from newest to oldest; going from oldest to newest is something you could do, but you wouldn't want to. So we're going newest to oldest. Any time you update a tuple, you copy into the delta storage only the diff of the attributes that got modified. In this case, my tuple has two attributes, key and value, but I'm only going to update value, so I just copy the old value over to the delta storage, and once that's done, I can update the master in place and point it to the head of the version chain, or I guess the beginning of the delta storage chain, over there. Same thing when I want to update it again: I only copy over the old value. Now, if I need to go back in time and find an older version of a tuple, because that's what my transaction's timestamp requires, I have to traverse the version chain and reapply these deltas in reverse order to get back to the version of the tuple that I need. So this is sort of like log-structured storage; it's basically the same idea.

So we've done the concurrency control protocol and version storage; now we have to deal with garbage collection. Question? Her question is: what's the advantage of delta storage? One obvious advantage is that if you're not updating every single attribute in the tuple, you're storing less data. Say I have a thousand attributes and my transaction only updates one: with delta storage I only have to store that one, whereas in all the other schemes I have to copy the other 999 attributes into the new version. So you store less information; that's probably the key advantage. And the statement made before about garbage collection being easier actually applies here too, because, depending on how you organize it, you can say: I know all the delta records up to this point in memory belong to transactions that don't exist anymore and that nobody can see, so I don't even need to scan them and evaluate their timestamps; I just take that whole region and chuck it. MySQL does this, so it makes garbage collection a little easier, and it makes you store less data, but as you said, it's more expensive if you need to go back in time and reapply all these updates. So this is what MySQL does, this is what Oracle does, and this is what HyPer does; they're probably the only three I know of that do this, and we'll see how HyPer manages it later on.
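Whichever storage scheme you pick, the runtime lookup has the same shape: walk the chain from whatever the head is and test each version's lifetime against your transaction's timestamp. A minimal sketch, with illustrative names:

```cpp
#include <cstdint>

// Simplified version record: lifetime plus the chain pointer.
struct Version {
    uint64_t begin_ts;  // version valid starting here...
    uint64_t end_ts;    // ...until here (UINT64_MAX = still the latest)
    Version* next;      // next version in chain order, or nullptr
};

// Return the first version whose lifetime contains ts, or nullptr.
// Under newest-to-oldest ordering this usually stops at the head; under
// oldest-to-newest an OLTP lookup may have to walk the whole chain.
Version* FindVisible(Version* head, uint64_t ts) {
    for (Version* v = head; v != nullptr; v = v->next) {
        if (ts >= v->begin_ts && ts < v->end_ts) {
            return v;
        }
    }
    return nullptr;  // nothing visible in this transaction's snapshot
}
```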
All right, so garbage collection. We're creating all these different versions, we're appending things and adding new stuff all the time, and eventually some versions are not going to be visible to any active transaction. We need to go back and clean them up, because otherwise we're going to run out of space. This assumes you don't want time-travel queries, so you want to actively clear out all the old versions. We'll say the database system is able to reclaim a version when it knows that there is no active transaction running in the system that could possibly see that version (which is what you get when you're using snapshot isolation), or when the version was created by an aborted transaction, which nobody should ever be able to see, so we have to clean that up too.

There are two additional design decisions to think about for garbage collection. The first is how we actually find expired versions that we know are not visible. The second is how we can safely reclaim memory, knowing that no other thread could possibly be looking at the object we're about to free. And that's a little different from the logical transaction question: internally, some thread could be doing something with this tuple, and we need to make sure we don't free it and then get a segfault. I'm not going to discuss that now; we'll cover it more when we talk about indexes, because it will be really important for project two, and I think we alluded to it a little last class when we talked about Silo. The paper you read today also talks about the different ways to safely reclaim memory in a latch-free system.

So for garbage collection there are two approaches we identified in the paper. The first is what we call tuple-level garbage collection, and the idea is that the database system does not maintain any information about the status of the different versions, so it has to go scan the tables, or the tuples themselves, to find the versions it knows it can delete. The two ways to do this are background vacuuming, which is probably the most common approach, and cooperative cleaning; I'll go through examples of both. The other approach is transaction-level garbage collection, where every transaction maintains its own read/write set of the versions it saw or wrote while it was running. Then, when a transaction commits, rather than scanning entire tables to find old versions, you just look at the read/write sets of the transactions and say, these are the versions I can go ahead and delete. And again, you can do that with a separate background thread as well.

So for tuple-level GC, let's start with a quick example of background vacuuming. We have two transactions running in separate threads, thread one and thread two; the first transaction has timestamp 12, and the other one has timestamp 25. And then there's the vacuum thread; this was the best icon of a vacuum I could find. To me it looks like a vacuum; there were other ones, like a maid pushing one, and I didn't want to use that. So this is our vacuum thread.
We can have one or more of these; for now, assume we have one. What it does is look at the different threads running in the system, ask each one what the timestamp of its currently running transaction is, and pick the lowest timestamp of all the transactions it examined. That tells us the threshold it needs to compare against to decide whether versions could still be visible, and therefore whether they can be deleted. The vacuum thread then literally does a sequential scan of the table, examining the tuples one by one. In this case, it looks at the begin and end timestamps and sees that the range specified by a version is entirely below the smallest timestamp of all the active transactions, so it knows that no transaction could possibly be running that could ever see these versions. Now, depending on what chain ordering scheme you're using, another thread may still traverse the chain and pass over these versions, but in that case it would see that they're not visible to it and just skip them; that's what I was saying about being careful not to reclaim memory that another thread could be looking at, even if it isn't actually going to do anything with it. So in this case, assuming there are no other threads touching it at the same time, these versions are not visible to any active transaction, and it's safe to clear them out. And remember, I said each thread maintains its own memory pool, so the vacuum thread says, I know this memory belongs to that thread, and puts the memory location back in that thread's free-slot pool, so that the next time the thread needs to create a new version, rather than allocating a new memory location, it just reuses this one. So that's background vacuuming, and it works with any of the storage schemes we talked about.

This is usually what people get upset about when they use MVCC; it's the big gotcha when they actually run it. You're running your database system, and the vacuum thread decides to start running and slows you down. In Postgres, up until maybe ten years ago, you had to run the vacuum manually, which was a big pain; now they have autovacuum, which is a little better, but it can still kick in whenever it wants, and that can slow you down. You also have to be careful that if you have a high-throughput, highly concurrent application, with a lot of transactions and threads making new versions over and over again, a single vacuum thread may not be able to keep up. That means you have to use multiple vacuum threads, and make sure everything is latch-free, but now you're taking threads you could otherwise use to process transactions and using them for garbage collection, and that slows you down too. You don't have to do any of this in a single-version system with two-phase locking, but you do in a multi-version system. So this shows you that the multi-versioning stuff doesn't come for free: you pay a performance penalty to clean things out, because otherwise you run out of memory.
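Here is a small sketch of that vacuuming logic under stated assumptions (illustrative types; a real vacuum also has to synchronize with concurrent readers before freeing anything):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Version {
    uint64_t end_ts;  // this version's lifetime ended at this timestamp
    // ...rest of the header and the tuple data
};

// Hand the version's slot back to the free list of the thread that
// allocated it (stubbed out here).
void ReturnToOwnerPool(Version* v) { /* ... */ }

// Any version whose lifetime ended before the oldest active transaction's
// timestamp can never be visible again, so its memory can be reclaimed.
void Vacuum(const std::vector<Version*>& table,
            const std::vector<uint64_t>& active_txn_timestamps) {
    uint64_t oldest_active =
        active_txn_timestamps.empty()
            ? UINT64_MAX
            : *std::min_element(active_txn_timestamps.begin(),
                                active_txn_timestamps.end());
    for (Version* v : table) {
        if (v->end_ts < oldest_active) {
            ReturnToOwnerPool(v);  // no active transaction can reach it
        }
    }
}
```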
All right, the other approach is cooperative cleaning, and this is the approach used in Hekaton. The basic idea is that as transactions do lookups in an index and scan the version chain to find the version that's visible to them, they're already doing a visibility check on every single version as they go. So if, along the way, a transaction recognizes that it's looking at a version that cannot possibly be seen by any active transaction in the system, it goes ahead and cleans it up right then. Everyone helps out; there's no dedicated thread doing the garbage collection. As you scan the version chain, you prune out the things that are no longer visible. This only works going from oldest to newest, because obviously if you were going newest to oldest, you would always land on the newest version, which is visible, and you'd be done. It works in Hekaton's oldest-to-newest ordering because, in order to get to the newest version, you have to walk across a bunch of expired versions, and you just clean them up as you go, using a compare-and-swap to set a flag in the version so that nobody else tries to clean the same one you're cleaning at the same time.

One additional downside of cooperative cleaning is what happens if transactions never traverse a chain. Say some logical tuple gets a lot of action, a lot of updates, and then never gets updated or read again: under cooperative cleaning, nobody will ever clean it up, because nobody goes across that chain anymore. These are called dusty corners in Hekaton; think of a big house where the corners get dusty because nobody ever goes there. To deal with that problem, you still have to run a vacuum thread occasionally, doing a sequential scan across all the chains to find the things that should be cleaned up.

OK, transaction-level GC I think is pretty obvious: you maintain the read/write set for each transaction, so when it commits and finishes you know exactly which versions it touched and what you can clear out.

All right, the last design decision, which I think is really important, is index management. To the best of my knowledge, there haven't really been any papers in recent years that looked at this problem in the context of MVCC, but it's actually a big deal. The question we're trying to answer is: what should the indexes actually point to for our version chains? The primary key indexes always point to the version chain head, regardless of whether the ordering is oldest-to-newest or newest-to-oldest or which storage scheme you're using; you just always need to know you're at the beginning of the version chain. How often the database system has to update the index then depends on how often updates create a new chain head. And if a transaction updates an attribute that's indexed in the primary key, then rather than trying to be smart, making sure we can just flip that one attribute and have all the pointers still work nicely, what everyone typically does is delete the old primary key entry for that tuple and insert a new one. That's the easiest way to implement it.
The downside is that you lose the version chain, because you're now dealing with a new logical tuple rather than the same one as before. But this is what pretty much everyone does, because it's just so much easier.

Secondary indexes are way more complicated, and this actually shows up in the news; I won't say a lot, but when people discuss the trade-offs between MySQL and Postgres, this is one of the key things people don't realize you should be mindful of when you choose your database system. There was an article last year, right around the time we were finishing the paper, on Uber's engineering blog, titled "Why Uber Engineering Switched from Postgres to MySQL," and one of the key things they cite is how the system manages secondary indexes. The story is actually more convoluted than that: they started with MySQL, then they hired some guy who really liked Postgres and switched to Postgres, and then they realized one of the things Postgres does poorly for their workload is how it manages secondary indexes, so they had to switch back to MySQL. So they went from MySQL to Postgres back to MySQL, which I'm sure cost them millions of dollars, and had my paper been written before then, it could have saved them millions of dollars, right?

All right, so the two ways we're going to manage secondary indexes are logical pointers and physical pointers. With a logical pointer, we use some kind of immutable identifier for the tuple, a logical identifier that does not change no matter how many versions we make, and that's what we use as the value in the key/value pairs inside our secondary indexes. But now we have to be able to map this logical pointer to the actual physical address of the head of the version chain, so this requires an extra indirection layer to make it work. The two ways to implement that are to use the primary key or a tuple ID, and I'll show what those look like in the next slides. The other approach is physical pointers, where we just point directly to the head of the version chain.

So let's look at the example. For the primary key index, here's our chain; we're doing append-only, newest to oldest, but it doesn't matter which scheme we're using, it's just easier to understand this way. If we do a lookup on the primary key index for object A, the index tells us exactly the physical address of the head of the version chain, so we find what we want and we're done. In this scheme, any time we update the tuple and create a new version, we have to go update the primary key index to point to the new version chain head. That's not free, but I don't think it's that bad, especially in an in-memory system, and the benefit is that if most of your transactions only need to look at the latest version, that's an OK price to pay. If you're doing oldest-to-newest, you wouldn't have to update it that often, only on garbage collection, but then it requires every thread to do a lot of redundant work traversing the version chain. Now, for the secondary key index, if we're using physical pointers, it's the same thing: we're pointing to the physical location of the head of the version chain.
Again, that's not a big deal if we only have one secondary index, but if we have a lot of them, they're all going to point to the head of the version chain. That means any time you update the tuple, you have to go traverse all of those indexes and update the value for that key, regardless of whether you actually changed the attributes the index is built on; you still have to update them anyway, because you have to update the pointer location. This is one of the key things Uber cited for why Postgres was not performing well for their application: every single time you update a tuple, whether or not you touched the indexed attributes, every index has to be updated.

So the two alternative approaches use logical pointers. The first is to just store the primary key as the value in the secondary index; then, when you want the physical address, you do a lookup on the primary key index right after, and that gives you the version chain head. This is fine if your primary key is 64 bits or less: if it's just an integer, storing it is no worse than storing a 64-bit address anyway. If it's something larger than 64 bits, it can be problematic, because now you're using a lot more space to store the value just to do this extra indirection lookup. You can be smart about things, like if the primary key is a subset of the secondary key, or vice versa, you don't store the redundant information, but if the secondary index key is very large, this is a waste of space.

The other approach is to use logical pointers with a tuple ID. The basic idea is that you have some other mapping structure, which could be a latch-free hash table or another B+tree or whatever you want to use; when you do a lookup in the secondary index, you get a tuple ID, and then this mapping table tells you the physical address of the version chain head. The nice thing about this is that no matter how many secondary indexes I have, any time I update the tuple and need to change where the head of the version chain is, I just update this one mapping, and all the secondary indexes get the new address without having to update any of them. Typically what happens in a database system (we don't do this yet in Peloton, but eventually we will) is that if you don't declare a primary key, the system creates one for you internally. In MySQL, if you don't declare a primary key on a table, they create an internal record ID for you, and that's essentially what they use for the tuple ID. It's not something you can see from the application; it just makes the bookkeeping easier internally. That's what we'll add; for now we do it this way, storing a unique identifier for the tuple, but it's not embedded in the actual tuple itself.
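A sketch of that indirection layer, with hypothetical names and ordinary containers standing in for the latch-free structures a real system would use:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

struct Version;  // head of a tuple's version chain, as sketched earlier

// One shared mapping: immutable tuple ID -> current version chain head.
std::unordered_map<uint64_t, Version*> indirection;

// A secondary index stores the tuple ID as its value, never a raw address.
std::unordered_map<std::string, uint64_t> secondary_index;

Version* LookupSecondary(const std::string& key) {
    auto it = secondary_index.find(key);
    if (it == secondary_index.end()) return nullptr;
    return indirection[it->second];  // one extra hop to the chain head
}

// When an update installs a new chain head, only this one entry changes;
// none of the secondary indexes have to be touched.
void SetChainHead(uint64_t tuple_id, Version* new_head) {
    indirection[tuple_id] = new_head;
}
```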
Yes? So the question is: which version of the attribute value do you store in the secondary index? Excellent question. The indexes are agnostic to versioning information; they don't know anything about versions. What happens is you get the head of the version chain, and then it's up to the worker thread for that transaction to figure out which version is the right one. And this is what I was saying before: if you update the primary key, rather than trying to be smart about maintaining the same version chain even though the primary key is different, we just insert it as a whole new tuple. So whether you're looking for an older version or a newer version under that primary key, you'll find the right version chain, and that's semantically correct. As far as I know, no database system actually maintains extra versioning information inside the index; it makes things way more complicated. I might be wrong about that; I think one system might be doing something like this, but only for internal state.
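As a sketch of what that worker thread does (illustrative only; real systems differ in the details), it walks the chain from the head the index handed back and returns the version whose begin/end window contains the transaction's read timestamp:

    #include <cstdint>

    struct Version {
        uint64_t begin_ts;   // when this version became visible
        uint64_t end_ts;     // when it stopped being visible (UINT64_MAX = live)
        Version* next;       // older version in the chain
        // ... tuple payload ...
    };

    // The index is version-agnostic: it only returns the chain head.
    // Picking the right version under snapshot isolation is the
    // transaction's job. Returns nullptr if nothing is visible.
    Version* find_visible(Version* head, uint64_t read_ts) {
        for (Version* v = head; v != nullptr; v = v->next) {
            if (v->begin_ts <= read_ts && read_ts < v->end_ts) {
                return v;
            }
        }
        return nullptr;
    }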
Another question: even if you just had the primary key index, wouldn't it still take a non-trivial amount of time to scan the index to find the tuple every single time you update it? So the statement is: if you only have the primary key index, isn't it still expensive to traverse the index just to find the entry in order to update it? Yes, but it's an O(log n) lookup and it's in memory, so it's not great, but it's not the end of the world. The argument really comes into play when you choose between oldest-to-newest and newest-to-oldest, and it depends on the application. If you're OLTP-heavy and write-heavy, you're creating new versions over and over again, and you always want to find the latest version, so it's better to have one thread pay the penalty of updating the index to point to the latest version rather than having all threads traverse the version chain over and over again. And depending on how aggressive your garbage collection is, you're going to be pruning these versions anyway, so with oldest-to-newest you're still going to have to update the index eventually. That's the reason we're trying to figure this out; we want to test all these things. Any other questions? Okay, cool.

So I'm going to show only one graph. Although the paper has been accepted, I actually don't know yet what the right answer is, what really is the best combination of the different design decisions we talked about. This is just one graph from one workload in one scenario, so I don't want anybody to think it's applicable to all possible types of applications. This is the TPC-C workload running with 40 warehouses on a machine with 4 sockets and 10 cores per socket, so 40 real cores in total. What you see is that Oracle and MySQL (essentially the same scheme), NuoDB, and Hyper perform the best, and again, only for this one application. And there isn't one design decision that all of these share that makes them better: Oracle/MySQL use delta records, Hyper uses delta records, NuoDB uses append-only storage; Oracle/MySQL use MV2PL, NuoDB uses MV2PL, Hyper uses MVOCC. So the graph shows these are the best, but there isn't anything I can point to, except maybe using logical pointers instead of physical pointers, that explains it. And in the case of TPC-C there are only two secondary indexes, so that isn't a huge cost anyway. So this is kind of unsatisfying, right, because it's always nice to be able to say, after all this work, here's the one thing you should actually do whenever you build a database system. But again, it depends on the application.

Okay, all right, so we only have a few moments left, but I want to talk a little bit about what Hekaton does, and then maybe we'll save Hyper and HANA for a later time. Hekaton started as an incubator project at Microsoft in 2008, where they set out to build a new OLTP engine for SQL Server. The project was led by probably two of the best database researchers and developers in the world, Paul Larson and Mike Zwilling. Mike Zwilling is a Wisconsin database group alum; he helped build Shore in the 1990s, and he was one of the first guys they hired to complete the port of Sybase to run on Windows NT. Paul Larson just retired last year, but he was a researcher at MSR in the database group; he invented linear hashing, and he's invented a lot of other awesome things over the years.

What I really like about the Hekaton project is that it had a lot of design constraints that you don't really see in academia, because they were dealing with a real system and a real product. The first is that they had to make sure the Hekaton engine integrates with the whole SQL Server ecosystem. It's not like us here in academia, where we build a new database system from scratch and don't have any customers to worry about; in their world, they couldn't just build a new database system, because no one would actually use it. They wanted to be able to plop their thing in and have it work nicely with everything else. The other thing is that they wanted better performance, but performance gains that were predictable across all OLTP workloads. We saw last class, in the 1000-core paper, that the H-Store protocol used in VoltDB, the single-threaded partitioning approach, gets amazing performance for some applications: if your application partitions nicely and your transactions only touch a single partition, the H-Store protocol outperforms everything. But as you saw in that one graph in the Silo paper, once you have maybe 15% multi-partition transactions, the performance actually gets worse; it's terrible at the far right end of the graph. That would be awful for a product you actually want to sell: "Hey, buy our new database system; some of you will get a big performance improvement, but some of you will get a negative performance improvement." That would be terrible. So they made design decisions that may not achieve the absolute best performance possible, but that give you the best consistent performance gain across the board. Instead of a 50x speedup for a few workloads, they might get 5x for everyone, and that's still a pretty significant gain.

So, again, we're low on time, but basically they're doing the same begin- and end-timestamp visibility check that we talked about before. The only difference is that they have a timestamp for when the transaction starts and a timestamp for when the transaction ends, and they can use these fields to get rid of having to store that extra transaction ID field, so they save 64 bits per version. I'm sorry that I'm rushing through this; it's complicated, and I don't think I'll do it justice in the short amount of time that we have.
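One way to picture that 64-bit savings, as a sketch of the idea rather than Hekaton's actual memory layout: the begin/end slot can hold either a commit timestamp or the ID of the transaction currently operating on the version, distinguished by a flag bit, so no separate transaction ID field is needed:

    #include <cstdint>

    // Sketch: one 64-bit field that holds EITHER a commit timestamp OR a
    // transaction ID, told apart by the high bit.
    constexpr uint64_t kTxnIdFlag = 1ull << 63;

    inline uint64_t as_txn_id(uint64_t txn_id) { return txn_id | kTxnIdFlag; }
    inline bool holds_txn_id(uint64_t field)   { return (field & kTxnIdFlag) != 0; }

    // While a transaction is creating or invalidating a version, the field
    // temporarily carries that transaction's ID; at commit time it is
    // swapped (with a compare-and-swap) for the real commit timestamp.
    // Readers that find a transaction ID here check that transaction's
    // state instead of treating the value as a timestamp.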
So the last thing, one graph I'll show, is that they did the same kind of evaluation we did, comparing the OCC approach against the two-phase locking approach, and for this particular study they showed that the optimistic approach actually outperforms the pessimistic one. The difference doesn't look like a lot, but the scale is in millions of transactions: one is doing 1.5 million, the other 1.25 or 1.3 million. So you're getting an extra 100,000 to 200,000 transactions a second by using the optimistic approach instead of two-phase locking. Think about what MySQL and Postgres can do on a single box, maybe 30,000, 40,000, 50,000 transactions a second; getting a 200,000 bump on a single box is pretty significant.

I'm going to skip the rest of this because I want to get to the projects. The main takeaway I want you to get from all this is that MVCC is probably the best approach we currently have to support mixed workloads. The paper you read was all about pure OLTP; we're in the process of doing additional studies to understand how this works in a hybrid system, and some of the design decisions that were better for OLTP may not be the same for HTAP. We don't know the correct answer yet. And for people thinking about what to do for project three or other research projects in this area, I think there are still a lot of open questions to answer in the context of in-memory MVCC. Things like block compaction: instead of garbage collection always freeing up a slot and then reusing it, you may want to take two blocks that are half empty and merge them, so all the old tuples end up together in a single block and newer tuples go in another block. Version compression, or compression in general, is something we're interested in: as I said, you can't always maintain full time-travel information because you'd run out of space, so is there a way to compress the versions in a smart way so you can still keep all the previous history? And then online schema changes, ALTER TABLE and things like that: how to do this in the context of MVCC and still get really good performance is something we're looking into now.

So, for the remaining ten minutes, I want to talk about project two. Project two is to implement a latch-free skip list inside of Peloton. Quick show of hands: who here knows what a skip list is? One, two, a small smattering of people. A skip list is a probabilistic data structure. Unlike a B+ tree, where you know exactly how to traverse down the tree going left and right, in a skip list, when you insert new entries, you flip a coin to decide whether to store extra information at the higher levels of the tree. If you squint at it from a high level, it looks like a B+ tree, but the difference is this probabilistic part at the top. You're going to be required to implement a latch-free skip list that supports both forward and reverse iteration, and this is important: since we said it's going to be latch-free, that means we're going to use compare-and-swap, and as I said earlier, you can only do compare-and-swap on a single address location. That means you can only guarantee that you're latch-free going in the forward direction, so you have to come up with a way to do it in the reverse direction. You're never going to get the exact same performance in reverse, but you have to do it, again, without setting any locks.
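To give you a flavor of both pieces, the coin flips and the compare-and-swap, here's a toy sketch of a skip list node, the random tower height, and a latch-free insert at the bottom level (forward pointers only; it ignores deletion, the upper levels, and memory reclamation, which are the parts that make the project hard):

    #include <atomic>
    #include <cstdlib>
    #include <vector>

    constexpr int kMaxLevel = 16;

    struct SkipNode {
        int key;
        std::vector<std::atomic<SkipNode*>> next;  // one forward pointer per level
        SkipNode(int k, int height) : key(k), next(height) {
            for (auto& p : next) p.store(nullptr);
        }
    };

    // Flip a fair coin until tails: expected tower height is 2, which is
    // what gives the skip list its expected O(log n) search.
    int random_level() {
        int level = 1;
        while (level < kMaxLevel && (std::rand() & 1)) ++level;
        return level;
    }

    // Latch-free insert at the bottom level: find the spot, link the new
    // node, then CAS the predecessor's pointer. If another thread raced
    // us, the CAS fails and we retry the search.
    bool insert_bottom(SkipNode* head, SkipNode* node) {
        while (true) {
            SkipNode* pred = head;
            SkipNode* succ = pred->next[0].load();
            while (succ != nullptr && succ->key < node->key) {
                pred = succ;
                succ = pred->next[0].load();
            }
            if (succ != nullptr && succ->key == node->key) return false;  // unique keys
            node->next[0].store(succ);
            if (pred->next[0].compare_exchange_strong(succ, node)) return true;
        }
    }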
You're also going to have to support both unique and non-unique keys, so both primary key indexes and secondary, non-unique indexes, and we'll discuss different ways to implement both of these, I think next class or next week; there's a sketch of one common trick for the non-unique case at the end of this discussion.

What we're going to provide you with is header files that define the index API you have to implement. If you implement this API, you can just plop your index in and everything else will work: the same SQL queries and everything will run without you having to change anything. Some of the hard things we've already done for you, like data serialization, how you take a bunch of attributes and pack them into a compact byte form to store as the key, and how to do comparisons on those keys. So you're really just building the scaffolding of the data structure that makes this all work.

I'll discuss skip lists in detail next week, but as you start to think about this, there are a lot of design decisions you have to make, and I'm not going to tell you how to make them. I'm not going to tell you how to do garbage collection, and I'm not going to tell you exactly how to do reverse iteration; there are a lot of examples and papers out there. As far as I know, only one database system uses a skip list as the main data structure for its primary indexes, and that's MemSQL; I'll discuss more about that next week. We'll cover the canonical version of the skip list, but there are more optimized versions you may want to look at as well, because the generic skip list actually gets terrible cache locality, and that gives you bad performance; there are ways to fix that. Again, we're not going to hold your hand through the entire process; we can discuss trade-offs and different options, but it's up to you to decide what you actually want to implement.

Same as before, we'll provide you with C++ tests that not only check the correctness of your implementation in a multi-threaded environment but also check its performance. We also have a reference implementation of a BW-Tree that you can compare your skip list against. The BW-Tree, which I'll talk more about next week, is another latch-free, order-preserving, tree-based data structure that came from the Hekaton guys; I'll discuss why they picked it over the skip list, because it's actually kind of interesting, and it's relevant to why MemSQL picked skip lists. And just like before, we strongly encourage you to do your own testing. You can submit as many times as you want, and we're going to grade you on all the test cases that we run when you actually submit. We're also going to run additional test cases beyond the ones AutoLab runs: additional speed tests and additional correctness tests that AutoLab won't cover. That means you're going to have to learn how to write your own test cases and make sure you have good coverage of your data structure.
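And here's the sketch I mentioned for the non-unique case; one common trick (an assumption about how you might do it, not a requirement of the Peloton API) is to append the tuple's identifier to the key, so duplicate values become distinct composite keys that a unique-key structure can store:

    #include <cstdint>
    #include <tuple>

    // Hypothetical composite key for a non-unique secondary index:
    // (indexed attribute, tuple ID). Duplicates sort next to each other
    // but compare as distinct keys.
    struct CompositeKey {
        int64_t  value;     // the indexed attribute
        uint64_t tuple_id;  // tie-breaker that makes the key unique
    };

    inline bool operator<(const CompositeKey& a, const CompositeKey& b) {
        return std::tie(a.value, a.tuple_id) < std::tie(b.value, b.tuple_id);
    }

    // A lookup for all entries with value v is then a range scan from
    // (v, 0) up to (v, UINT64_MAX).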
We're also going to go back and look at your documentation and at what the code actually looks like, and this goes a little deeper than just running clang-format. We'll also do manual inspections to make sure you didn't just plop things in, that you're not referencing the BW-Tree inside your skip list to make it look like your skip list works when it doesn't. For grading, we're going to run additional tests beyond what we provide you, and we're going to give bonus points to maybe the top three or top five groups with the fastest implementations. It is my hypothesis that the BW-Tree the students built last year will outperform all of your skip lists, and that's not a knock on your coding abilities; we just think the BW-Tree is faster than the skip list, and the Microsoft papers show this, but we're actually curious to see how fast you can get compared to our best index. And again, we're going to do the same thing as before: if your implementation is correct, we're going to use Valgrind to make sure you don't have any memory leaks.

All right, so again, this is a group project, and everyone should contribute equally. I don't like to hear complaints about how so-and-so went on spring break and did not do anything; that happened last year, and we don't want it to happen this year. We want everyone to contribute equally, and if there's an issue, I can check and make sure that everyone is making the same progress. If you're not in a group yet, please email me so we can figure out whom to group you with. Who here is not in a group? Oh, that's nice. Okay, that's easy. Awesome, done.

The due date will be March 2nd, so you have a month to do this, and for three people in a group, I'd say that's a reasonable amount of time. The basic version of a latch-free skip list with forward iteration is actually very easy to implement; you can maybe do it in 500 lines or less. Making it run fast, adding reverse iteration, doing garbage collection, and all the other things that make this a real index you can actually use is a little more complicated, and that's why we're giving you a month. There are a lot of implementations out there that you can look at, but the same rules apply: please don't copy other people's code. You can be inspired by them; don't copy and paste. The instructions for project two and all the header files are not available online yet; I'll take care of that after class. Any questions about project two? Okay, awesome.

So next class we'll discuss index locking and latching, and that will be a precursor to the lecture on Tuesday next week, where I'll lay out what the skip list actually does. I encourage you to go look around; there are probably YouTube videos and blog posts about what skip lists actually do, and I think it's good preparation for this project to look at what these things are ahead of time. Okay, awesome, guys; see you all on Thursday.