So, all right, that's good. Thank you for that. Any announcements? I'm DJing on Friday for the Diwali party, as I told you. I think it's sold out now, but if you can get tickets, please do come and say hi. It's sold out, but do you have a few Swift-like tickets in the back that you can sell on the secondary market? Yeah. Okay, cool. So let's get started; we have a bunch to cover. I'm going to start by going back to the OCC material that we rushed through in the last class, plus a few things that need correcting. First, let's go back to this chart over here. I've updated the slides to correct one of the arrows in the figure. It didn't change anything in a material way, but I wanted to be true to what was presented there. So let's see if we can understand why this works. Remember, there were three conditions, and when a transaction is in the validation phase, it checks against these three conditions. The best way to think about this is that the ultimate check happens between pairs of transactions, Ti and Tj, where Ti checks against all the Tj's that are in the future. It's just looking at it from that direction; if everyone does that, everything works out. The first part of this, and some of it was a little confusing in the last class because I skirted the issue of precisely when these transaction IDs get assigned, is that Ti and Tj are assigned transaction numbers so that Ti precedes Tj in the equivalent serial schedule we are trying to enforce with this protocol. So let's start from the bottom here, case three, which is in some sense the most difficult case. The arrow goes from the end of the read phase of Ti, which is when the transaction's actual work ended; the rest is "can I commit" and so on. So the arrow goes from Ti to Tj. This means the read-write dependencies all go from Ti to Tj, just by that fact, because the write phase of Tj comes much later, and objects only become visible in the global database to anyone else when the write phase happens. Remember, in OCC all changes are made to a local copy of the objects, so that dependency direction is trivial. From just this schematic diagram we can't say anything about the other two anomalies, WR and WW, and so for those, in the validation phase we check that those two sets don't intersect. If you understand this part, the rest follows; the checks only get weaker. If the write phase of Ti finishes before the write phase of Tj, you basically don't have to check WW at all, because it's implicit in the definition of that condition. What this means in practice is that when transaction Ti is getting ready to commit, it asks: what are all the Tj's I need to worry about? There's some state kept in the system that tracks all the active transactions, and transactions get assigned transaction IDs, so Ti knows it only needs to check from its point forward, against everything in the future. As we discussed, transaction IDs get assigned in the simplest way: there's a global counter, everyone does an atomic increment on it and gets a transaction ID. So Ti knows everyone it needs to be concerned about, and then it does that check.
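To make that concrete, here is a minimal sketch of the three pairwise validation tests, assuming each transaction carries its read/write sets and its phase-boundary timestamps. `Txn`, `validate_pair`, and the field names are made up for illustration; this is the shape of the check, not any particular system's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    tid: int                      # from the atomic global counter
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)
    read_start: int = 0           # phase-boundary timestamps
    read_end: int = 0
    write_start: int = 0
    write_end: int = 0            # 0 means the write phase hasn't finished

def validate_pair(ti: Txn, tj: Txn) -> bool:
    """True iff Ti (with ti.tid < tj.tid) is safe to order before Tj."""
    # Case 1: Ti finished its write phase before Tj even began reading.
    if ti.write_end and ti.write_end < tj.read_start:
        return True
    # Case 2: Ti finished writing before Tj started writing, and Ti wrote
    # nothing that Tj read -- so no WR anomaly is possible.
    if ti.write_end and ti.write_end < tj.write_start \
            and not (ti.write_set & tj.read_set):
        return True
    # Case 3 (the one on the slide): Ti's read phase ended before Tj's did,
    # so RW dependencies already point Ti -> Tj; additionally Ti's writes
    # must touch neither Tj's reads (WR) nor Tj's writes (WW).
    if ti.read_end < tj.read_end \
            and not (ti.write_set & tj.read_set) \
            and not (ti.write_set & tj.write_set):
        return True
    return False   # none of the cases hold: the later transaction aborts
```

Note how case three is the weakest timing condition and therefore needs both intersection checks, exactly as described above.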
The other two checks are easier; if you get case three, the rest follows from it. Thanks for pointing out that arrow bug from the paper; that certainly helped, and it also made sense to start from case three rather than case one. If you get that, everything else is simpler. Okay. All right, we're going to keep marching on. The second thing I want to mention: I've said a couple of times over the last few weeks that 721, the advanced database class, is going to cover advanced transactions. That is not true. Two years ago, that's what 721 did. I know many of you are starting to figure out what to register for, but the version this year, in spring of '23, was all analytics, and the coming version is also going to be all analytics. Now, if you're still interested in transactions and want to pursue that, you can still do some sort of undergraduate research project; come talk to me and Andy and we can figure that out, if that's really where your heart is. But to be clear: if you're registering for 721, it's going to follow the spring '23 curriculum, so it's going to be all analytics. Questions on that? I know it came up multiple times, especially after the last class. All right. So we're going to look at the rest of OCC relatively quickly. So far we've covered the read phase and the validation phase, which is what that diagram shows. A transaction has validated itself and has been given the green signal to write all its stuff to the global database; it's only when you write your stuff, which so far was all happening in a local copy, that it becomes visible to everyone else. Now, multiple transactions could be entering the write phase at the same time, and you want to make sure correct things happen. I'm only going to go through this at a very high level here. There's a full-fledged paper on this; I put up a copy of the exact paper it refers to, the HTCUNC paper, and you saw the image in the slides. Go read that paper if you want more details than what I'm about to tell you; I'm certainly happy to talk about it offline. The simplest way to get correctness in the write phase is to impose a serial order on the writes: any transaction that wants to enter the write phase grabs a latch in memory and says, I'm the one writing; everyone else, please wait for me. It finishes all its writes, which can take a long time because the transaction may have updated a million objects that now have to be written out to disk, and once it's done, the next transaction can proceed. There's an advanced protocol that allows parallel writes. At a very high level, it plays off condition three: it does a lot of these read-set/write-set checks, and it even delays the transaction-number assignment so that read-only transactions never get a transaction number. That means there are fewer transactions to check against if you're a writer; it's sort of like reads go free, and we'll hit that theme again when we talk about MVCC, which is the bulk of today's lecture. And even for the writers, you make those checks happen outside the critical section for as long as you can avoid being in one.
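Here is a tiny sketch of that simplest, fully serial write phase: one global latch, so only one validated transaction installs its writes at a time. `GLOBAL_DB`, `write_phase`, and the transaction fields are illustrative stand-ins, not a real system's API.

```python
import threading

GLOBAL_DB: dict = {}                  # object id -> value, the "global database"
write_phase_latch = threading.Lock()  # the single latch serializing write phases

def write_phase(txn) -> None:
    """Install txn's local copies into the global database, one txn at a time."""
    with write_phase_latch:           # everyone else waits here
        for obj_id, new_value in txn.local_copies.items():
            GLOBAL_DB[obj_id] = new_value   # only now do writes become visible
        txn.committed = True
```

The parallel protocol mentioned above exists precisely to shrink this critical section, since a transaction with a large write set would otherwise hold the latch for a very long time.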
So effectively, one mechanism, and it applies in other places too, is that if a lot of us are writing together and our writes could interfere with each other, we establish some sort of an order: every object has an object ID, and we always go from low to high. Then you can start to play games where you do quote-unquote unsafe stuff with the writes but don't get in each other's way, because you've established a common order. I'll just leave it at that; I know you probably have a million questions, and I'll take those offline, but it takes about 30 or 40 minutes to really deeply understand that parallel commit protocol. The main thing I want you to know is that in the write phase itself you need parallelism, and if you're not careful, you can undo all the work we did in validation; we can't just overwrite each other's stuff. Right? If Ti finishes before Tj, Ti's write set has to be written before Tj's, because the other way around would mean an older transaction overwrites a newer transaction's writes. So you have to follow that order. Okay. So: optimistic concurrency control versus pessimistic concurrency control. This I do want you to understand, and it's fair material for the exam. The intent is: when do we think one works better than the other? There are trade-offs, as with every mechanism we've been talking about. Optimistic concurrency control doesn't do locking, and as a result it can do really well when locking would have become the bottleneck in getting the transaction's work done. But the cost it pays is that it makes a whole bunch of copies of the data it touches, and that's expensive. As I alluded to, read-only transactions can go through without getting in anyone's way; they can even slide past writers. You'll see that MVCC has similar traits; it's a mechanism rather than a full-fledged concurrency control protocol on its own, and it gets paired with things like OCC and 2PL, but the same idea of letting readers pass through applies. So: all these copies get expensive. If I have a workload with lots of updates happening to large portions of the database, OCC obviously runs into trouble, because it has to make all those copies. Furthermore, if there are real conflicts between transactions, you can't avoid them, which means some aborts must happen. In OCC, if I fail the validation phase, I have to abort all the work; imagine I'm a transaction that's been running for an hour and made a ton of changes, I go to validate, and I have to throw all of it away. In two-phase locking, the pessimistic approach, that wasted work doesn't happen, because the first time two transactions try to step on each other's toes, we stop them. The downside is that you're acquiring locks all along, including S locks just to read, whereas in OCC readers can just pass by. So that's the trade-off. In some workloads this is better; in some, the pessimistic approach is better. It depends on exactly these characteristics: how much contention is there?
And how much work do you have to undo if there's contention and you have to abort a transaction? [Question: where do aborts happen?] In OCC, they happen at the validation phase. For 2PL? Aborts happen when you have deadlocks; yep, exactly. I'm skirting a few details: if you have lock upgrades and you're trying to do funky stuff, there can be other reasons too. And cascading aborts, we talked about those; let's leave that aside, but that was the other tricky situation, where one abort caused other problems to happen. All right. So far, for about two weeks now, we have assumed that database objects are real, physical things, pages and records, that we're making copies of or acquiring locks on. But in real life, transactions also create new things. We have completely skirted the issue of what happens when something is being created. We've pushed it off to the side, and by and large everything we've said about locking still works, but there's an interesting problem that comes into play when you're creating things: you have to worry about one more anomaly beyond the ones we've covered, like dirty reads and the rest. The new anomaly comes from the fact that all these protocols operate on physical objects already present in the database, so if a transaction is creating new stuff, you will never have seen it when you acquired your locks or made your copies. Let me illustrate with an example. Here are two transactions; instead of read/write calls there's a SQL query, and you can see what's happening. The first SQL query finds how many records are in this people table with status equal to 'lit', and it repeats that a little later, but in between, a new record gets created. That new record should be in the answer, but the first execution of the query doesn't see it, because the record didn't exist when it ran; when the query runs again, it does. Now, the first transaction could have grabbed read locks on all the pages or records and followed the two-phase locking protocol at that granularity, and you would still get the wrong answer. If the locking were done at the database or table level, this query would be fine, but then you don't have granular locking, and we wanted granular locking precisely to allow parallelism. So this is called the phantom problem: say the first query returns 99; the second returns 100, and the record that appears in between is the phantom. That is obviously not serializable; repeatable-read-type semantics are being violated. Does that make sense? Yeah.
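Here is a toy illustration of the phantom with record-granularity locking only. `table`, `record_locks`, and the two functions are made-up stand-ins to show the mechanics, not a real engine's API.

```python
# T1 scans and S-locks every record it sees; T2 inserts a brand-new matching
# record. Record-level 2PL cannot block the insert: the record didn't exist.

table = [{"id": 1, "status": "lit"}, {"id": 2, "status": "lit"}]
record_locks: set = set()            # ids of records T1 has S-locked

def t1_scan() -> int:
    """T1: SELECT COUNT(*) FROM people WHERE status = 'lit'."""
    hits = [r for r in table if r["status"] == "lit"]
    record_locks.update(r["id"] for r in hits)  # locks only EXISTING records
    return len(hits)

def t2_insert() -> None:
    """T2: INSERT a new matching record; id 3 was never lockable by T1."""
    assert 3 not in record_locks
    table.append({"id": 3, "status": "lit"})

print(t1_scan())   # 2
t2_insert()        # sneaks in between T1's two scans
print(t1_scan())   # 3  <- the phantom: T1's repeated read changed
```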
So, what exactly counts as a transaction? We've talked about this a few times before, but it's worth repeating. In SQL you can put in an explicit BEGIN call, write a bunch of SQL queries, and end with a COMMIT or an ABORT; you can even put an explicit ABORT, so that's the other case. Or, if you don't put one in, and in your SQL assignments so far you never put a BEGIN or a COMMIT, the database engine implicitly puts a BEGIN at the start of the SQL query and a COMMIT at the end of it, unless you get aborted for all the reasons we've talked about. So those are the explicit transaction boundaries that get put into the system. Okay. Yep? So the question, if I understood it correctly, is: can I write an application as BEGIN and then do a nested BEGIN transaction? The answer is no. There are models that allow these types of nested transactions, but in practice you can only do one BEGIN. You can do these things called savepoints that we talked about, where you say BEGIN, savepoint, savepoint, savepoint, so that when you say, oops, I want to roll something back explicitly inside the user transaction, you can roll back only to the last savepoint, or the second-to-last, or the third-to-last. But you won't explicitly nest; SQL and practical database systems can't quite do that, even though there have been research models that use that type of idea in other ways. Was there a question here? Maybe it was the same one. Okay. All right. So you see why this happened: the first transaction was locking only existing records, so it won't see the new record that gets created after it has started running. Even if you're using OCC, it makes copies of everything it has read, and it will just reread from those copies, because it thinks that's all the records corresponding to status equal to 'lit'. So OCC and two-phase commit — sorry, two-phase locking — will both face the phantom with all the mechanisms we've talked about so far. We need a little more if we want to prevent this type of anomaly. Can you think of ways we might avoid it? [Student answer.] Yeah, exactly: if the first transaction had acquired a table-level shared lock, this wouldn't happen. You're absolutely right. But how do I know when I should do that? Or should I always do it? If I always do it, I don't have as much parallelism as I want in the system. No, no, I get that, but what I'm saying is: if we change the two-phase locking protocol to say you always grab an S lock on the table and never do any finer granularity of locking, then we've reduced the parallelism in the system, so that's not what I'd want. [Another suggestion.] Whenever you're trying to insert something, you take a write lock on the table. Got it. Okay, sorry, I misheard what you said at first. So you're saying: readers grab a read lock on the original table they're reading, and anyone writing new stuff takes a write lock on the table.
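For reference, here is the standard multigranularity lock compatibility matrix (the one Gray's hierarchical-locking work defines, with SIX included for completeness) written as a small sketch; the exchange that follows reasons from it. The `compatible` helper is just an illustrative lookup, not a real lock manager's API.

```python
# True means the two modes can be held on the same object simultaneously.
COMPAT = {
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,
    ("IS", "SIX"): True, ("IS", "X"): False,
    ("IX", "IX"): True,  ("IX", "S"): False,  ("IX", "SIX"): False,
    ("IX", "X"): False,
    ("S", "S"): True,    ("S", "SIX"): False, ("S", "X"): False,
    ("SIX", "SIX"): False, ("SIX", "X"): False,
    ("X", "X"): False,
}

def compatible(held: str, requested: str) -> bool:
    """Symmetric lookup: can `requested` be granted while `held` is held?"""
    return COMPAT.get((held, requested), COMPAT.get((requested, held), False))

# A table-level X (write) lock conflicts with everything, including the IS
# lock a record-granularity reader holds at the table level:
assert not compatible("IS", "X")
```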
But with a table write lock, that means if a reader is in progress, I have to stop. I was going to pull up that compatibility matrix, and this is exactly the game it captures: the table write lock will simply not be granted, because the reader holds at least an IS lock at the table level. So that approach runs into trouble: the update will not happen. What we're trying to do here is allow maximum parallelism in the system. We don't want an approach so harsh that even a one-record update takes a write lock on the whole table, because that blocks everything off; that transaction is locking out far more of the database than it needs. But you're on the right lines: we can do something logically akin to that. The way to do it logically is this thing called predicate locking, but let's start with re-executing scans. Yup, question? The database objects: just think of them as records for now. And why did we start down this path? Because we have a dynamic database in which new inserts are coming in and things are getting deleted. Everything we've talked about so far handles updates to existing records, but the example I showed you for the phantom was an insert, and the same thing happens for a delete. So if you have inserts and deletes, you start running into trouble, and that's what we're trying to prevent. There are three approaches, and this will be the theme throughout today's lecture and the next: for every little problem there will be multiple mechanisms, and they'll have trade-offs. The first scheme says: whenever I go to finish the transaction, I will go reread all the stuff I needed to read, as specified by the query, and check whether I got the same thing. This is easiest if you have OCC, because I already keep track of all the objects I've read. So imagine I'm just about to enter the validation phase: I go reread everything and ask, whoops, has my read set changed from when I actually ran this stuff?
If so, then you know a phantom happened, and we'll talk about that in a bit more detail on the next slide. Predicate locking says: this would never have happened if I just kept track of the predicate, status equal to 'lit', and anyone who tries to write something that is logically covered by the specification of that predicate gets special handling. Predicate locking was the first solution proposed when the phantom problem was discovered in the '70s, but as we'll discuss, it's really hard to do and no one does it, because it amounts to an NP-complete problem: Boolean satisfiability, for those of you who are theoreticians. The approach that ends up getting used is index locking. We'll have to change how we use the index to do this locking; an index points to actual things, but it can also represent ranges of things, so it can simulate a predicate lock. So let's jump into the details and see how this works. The simplest way is to re-execute the scan, as we said: I just go reread stuff and check whether it's exactly what I saw when I first read it. In the case we looked at, the second time we reread the table we see something different; there it was an insert, but if it were a delete we'd see that too, and we'd say: whoops, I can't go ahead. If I want phantom protection, I'm basically stuck and I'll have to abort. But that re-execution is an expensive check to push through the system, and the place it gets used is a subclass of transactional systems that work on data sitting in memory. Practically, today you can get a four-terabyte main-memory server, so even big, heavyweight transactional workloads can fit entirely in memory; this in-memory OLTP is a big thing. I was just talking to some of the folks at Oracle recently at a retreat, which is where I ran off to after the last class, and you'd be surprised: a four- or eight-rack Oracle system is what runs massive things like the New York Stock Exchange. It's not a big cluster; it's large memory on a small cluster. And I know we won't cover this in the advanced database class, but I'm happy to discuss it offline: there's a really cool paper showing that transactional systems actually get slower as you add more nodes, and you can prove they get quadratically worse, unlike analytic systems, where adding nodes lets you partition and get faster. So there's a huge incentive to keep fewer nodes in a transactional system, because more things trying to do stuff get in each other's way, and resolving that contention grows quadratically. I'll leave it at that, and I'm happy to talk about it offline. So the re-execute scan works when your data is in memory, because rescanning is much faster; if you're going to disk, you'll wait a long time to re-execute the scan. The golden way to do this is predicate locking: every time I have a query, I look at the WHERE clause of my SELECT and the WHERE clauses of my UPDATE, INSERT, and DELETE queries, the ones trying to make changes, and imagine I could resolve, just by looking at the predicates logically, whether they interfere. If yes, I could solve this problem by looking only at the predicates, and that's beautiful, because I really don't need to mess around with objects and lock tables and stuff like that; I can just look at the query predicates.
Now, this turns out to be really hard to do, because just checking for overlap between predicates, and predicates can be complex, with conjunctions and disjunctions in the WHERE clause, turns out to be a SAT problem. Obviously that's pretty hard, and most people don't do it, though it was the original proposed solution. HyPer, a system that came out of Germany, not new anymore but really cool, has a notion of predicate locks, and one of my students who is just graduating worked on using predicate locks in a limited setting for OLTP, recognizing that OLTP predicates have a certain structure, and because of that structure you can actually do predicate locking for an important class of OLTP workloads, though not in a fully general way; the general way still requires solving this SAT problem. Was there a question? Yeah, you'd do that separately. And the other part, and this can again go to an offline discussion: if you could do predicate locking correctly, you might not need the other types of locks, and we discuss that in that paper, but we could only get a limited class of OLTP working. It was a class we didn't think could be made to work before, but the general case is still super hard; that's why no one uses it. Happy to talk offline or point you to that paper. So, intuitively, how is predicate locking going to work? You have to create some sort of structure to find out which predicates overlap. One simple way is to imagine a two-dimensional or hierarchical structure; we'll take a hierarchical one here. The first query's predicate is simple, just status equal to 'lit'; no conjunctions or disjunctions, though we know real life brings a lot more complexity. So in this simple example I can say: everything I'm doing is covered by this predicate; it's effectively defining a range. Then the second predicate that comes in is a subset of that, and that's what you're trying to determine: what is a subset of what, and where the overlap is. With one simple predicate it sounds pretty easy, but imagine doing it with conjunctions and disjunctions in the general case; that's where all the hardness comes in. In the simple case you'd say: status is the field, status equal to 'lit' is one of the predicates on the status field, and the second query is a subset of that, so the first query covers the set of records that I am touching. Any time there's overlap, it's unsafe; the red stuff is the dynamic part of the database, the UPDATE, INSERT, or DELETE query, and you can start to make these checks. So you get the general idea that you can do this with predicate locking, but it's a very hard problem. Yep, question: how many locks do you have? In this figure I'm just showing the predicates; it's not like you're grabbing a lock on the predicate per se, even though it's called a predicate lock. You're simulating a lock by saying: I'm covering this range. And you can imagine this data structure sitting not in the traditional lock table we've talked about so far, but in some sort of predicate lock table that keeps track of these predicate structures; it's not a lock-table entry in the traditional sense.
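Here is a tiny sketch of the overlap test predicate locking needs, restricted to single-column predicates that are an equality or a closed range; that restriction is what keeps it easy, since with arbitrary AND/OR combinations the same question becomes Boolean satisfiability. `Pred` and `overlaps` are made up for illustration.

```python
from typing import NamedTuple

class Pred(NamedTuple):
    column: str
    lo: object        # inclusive lower bound (lo == hi means equality)
    hi: object        # inclusive upper bound

def overlaps(p: Pred, q: Pred) -> bool:
    """Could two simple interval predicates select a common record?"""
    if p.column != q.column:
        return True   # can't compare different columns: be conservative
    return p.lo <= q.hi and q.lo <= p.hi

reader = Pred("status", "lit", "lit")   # SELECT ... WHERE status = 'lit'
writer = Pred("status", "lit", "lit")   # INSERT of a row with status = 'lit'
assert overlaps(reader, writer)         # overlap: the insert needs handling
```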
Great. Yep? Yeah, just hold on to that question for a bit. You could certainly view it that way, and effectively the way it evolved is that people tried to make predicate locking work and realized: oh my gosh, we're not going to prove P equals NP, or we'd make the database system so slow it could never come back. So index locking is a cheap way of doing predicate locking, with a little more complexity than this, but it becomes practical. That's the short answer, and you'll see it in a second. Other questions? Yep: what happens if the re-executed scans don't match? You would then abort; that's correct. Other questions? Yeah: what happens if I insert an object and then delete the object? Assume objects have an object ID you can hold on to, for simplicity. We'll come to a version of that when we talk about MVCC: what happens if the same record got deleted and got reborn again, the values are the same, but it's a different record; what do I do with it? Hold on to that; the mechanisms will be similar to what we just talked about. Okay. All right, so that's predicate locking: awesome idea, very hard to do. Now we get to how people actually do it, and you're going to muck around with the B-trees. And immediately I'll tell you: if you don't have a B-tree on the attribute in the predicate you're trying to protect, you can't do this. You'd have to build a B-tree on all the predicates that show up that you want to protect with a predicate lock; otherwise you'll have to do things like grabbing X and S locks at the table level. So assume for now that every predicate of interest, every one we're trying to protect against the phantom, has a B-tree on it. With that, we'll go into index locking. By the way, that's not an unreasonable assumption, because in OLTP workloads the WHERE clauses of the UPDATE, INSERT, and DELETE queries, the ones making changes, usually key on very specific columns: I'm updating the home address of a customer ID; your shopping cart application is going to say WHERE customer_id equals the one you present. Even if the customer record has hundreds of columns, the update queries basically come down to one or two columns, so it's not unreasonable to say you'll have these B-trees. These B-trees also help because they identify, in this case, the customer ID of interest; that's what B-trees do, so there's a nice synergy with the index. But you need the index for everything we're going to talk about next. So remember B-trees; it's been more than a month, but hopefully you still remember. At the leaf level they look like sorted keys, and the keys point out to the records; I'm not showing the pointers here, but the leaves represent ranges. We know how to do physical locking, and we know how to read physical things in OCC and make all of that work with physical stuff. The problem is the ghost record that showed up in the example: we don't know how to protect against that.
Imagine the lock table: so far it could lock a page, a record, a table. It can also lock a key in a B-tree; we can easily make that extension in the lock table, it's just a new type of lockable thing. So now I can grab a lock on the key 14 in the index, which protects anyone else from touching key 14 while I'm working on it. I grab it as I'm scanning the index doing my range predicate, and as I touch each key I put a lock on it, so if someone tries to do something to that key, it won't work. Immediately you're going to say: this doesn't solve the problem, because the problem isn't the key itself but the stuff between the keys that doesn't exist yet. I've got 14 and 16, and someone is trying to insert 15; what I just told you won't catch that. So I also have to protect the gaps, and there's a really cool type of lock for that, called a gap lock: I can acquire a gap lock that protects the gap after 14. Now we'll go a little further and do this thing called the key-range lock: take those two ingredients, locking keys and locking the gaps between keys, bring them together, and you get key-range locks. In the simple example with status equal to 'lit', just the key lock would work, because there was no range. But let's make the example more complicated and say the SELECT query reads all records between 14 and 16, and we don't want a record 15 showing up in the middle, because that causes the same phantom problem we discussed. So we take a key-range lock on the key 14 and everything to the right of it; that's called the next-key lock. Logically it says: I'm locking 14, but it's a special type of lock covering 14 and everything to its right until the next real value shows up. If my predicate were, say, 12 to 16, or greater than 14, I'd grab 14, 16, and everything after; it depends on whatever range I want. Effectively, it's inclusive of 14 and exclusive of 16: anything from 14 up to just shy of 16 is the range. Does that make sense? It's an interval lock; we're basically creating an interval. Yep? The question is: why not just grab the lock on 14? Because if someone tries to insert a new key 15, they will never see the lock on 14; they'll just go ahead and make the change. If my query were exactly 'equals 14', then what you propose would be fine, like protecting the 'lit' value in the status column, but if I'm doing a range, I need to protect the gap. Yeah, that's coming up next, great. So there are two ways to do it: there's a completely symmetric mechanism called the prior-key lock. Now, you will implement only one of these mechanisms, not both; otherwise you'll end up with deadlocks. Remember, we just talked about everyone swimming in one direction. It depends on how you access your B-tree: for a range, you typically come down to the lowest key and then scan the leaves from low to high. You could do it the other way, high to low, but most people go low to high. Whatever access path you have for your B-tree is what you should match.
So if you scan low to high, your system implements the next-key mechanism; it has to match the way you traverse to be natural. You could make it work the other way around too, but you will not do both: you will not say some transactions do prior-key and some do next-key, because then you start to run into trouble. Okay. [Question.] Yeah: if you wanted to protect 14 to 16 with prior-key locking, you'd come down on 16, and the prior-key lock on 16 covers back to 14. Regardless, it's really simple, not that confusing: you either do prior-key or next-key locking, and depending on that, you set what your interval is. In the prior-key case you'd say it's everything greater than 12, just above 12, up to and including 14; that's the interval, and in the next-key case it's the other way. Whichever gap you want to protect, you can protect, as long as everyone protects gaps in the same direction. [Question.] No, no: I just said you only implement one of them. If you do both, you run into trouble with deadlocks, so pick which one you want, implement only that, and protect ranges the same way everywhere. You shouldn't have one query that wants to protect 12 to 16 going left to right and another going right to left; don't do that, just go all in one direction. No, you can only implement one of the two locking modes. [Question.] We just talked about that: you would lock 16 with a prior-key lock if you wanted to also include 14. Let's take the cases that are possible: say my predicate is greater than 14, up to and including 16, so 15 and 16, if this integer key is what I want to protect. Then I can lock 16 plus the gap I want to protect before it. And your question might be: can I have next-key locking, like we've seen on the screen here, where the lock protects the gap and 16; doesn't that kind of look like a prior-key lock? Yeah, it does, but you don't want to mix models. In that case you might lock a little more than you need, because that's the granularity you have: you'd lock 14, 15, and 16 even though 14 is not your true conflict. I think that's where you're getting hung up: yes, on the edge cases you may lock a little more than you need, and that's true. No matter which you pick, it's very simple: pick one of them and implement it. That edge case you're thinking about is legitimate; you will be locking something extra for a little while, but no more than one value, so you're okay with that. And the other question: should you implement only one of next-key and prior-key? Yes. Can you do gap locks alongside the other locks too? The answer is yes, but your implementation can get very complicated, so you'd probably do just one of them. That said, there's a reason real systems go further: B-trees are so important that they will do all of this stuff and make it work. You could even do a pure gap lock: just lock the gap after 14; you can have regular value locks plus gap locks, because what we're talking about here is a composition of the two. As long as you go in one direction, you can make all of that work.
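Here is a small sketch of what a next-key-locking range scan might lock, assuming sorted leaf keys and a half-open scan range [lo, hi). `leaf_keys` and the function are illustrative; a real system folds this into the B-tree traversal itself, and for simplicity this ignores the gap before the first key at or above lo, which a real implementation would also cover by locking the preceding key.

```python
import bisect

def next_key_locks(leaf_keys: list, lo: int, hi: int) -> list:
    """Return (key, covers_following_gap) pairs a scan of [lo, hi) must lock."""
    locks = []
    i = bisect.bisect_left(leaf_keys, lo)
    for key in leaf_keys[i:]:
        if key >= hi:
            break
        # Each next-key lock covers the key AND the gap up to the next key,
        # so inserting a brand-new key inside the range hits a lock.
        locks.append((key, True))
    return locks

# Scanning [14, 16) over leaves [10, 12, 14, 16]: lock 14 plus the gap after
# it, so an insert of 15 conflicts even though 15 doesn't exist yet.
print(next_key_locks([10, 12, 14, 16], 14, 16))   # [(14, True)]
```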
Yeah. So all of these will have some representation in the lock manager; these are locks, so they're requested from the lock manager, which has to check them. The question, then, is what goes into the lock manager, and it will say something like: key 14, protecting the interval from 14 to 16, and the lock manager can have different ways of representing that. There are details there, and again we can talk offline about what makes sense; it's not super complicated, but there are some interesting issues too. Okay, great. I was hoping to finish MVCC today, so that will keep us here till 6 p.m. No, I'm just kidding; we'll figure out how to adjust the material for the rest of the class. Great. So the other part we have to worry about: can we use the fun stuff we had with lock granularities, the IX mode and the intention modes, here too? Say I'm scanning a B-tree, reading it, and only some of the things I read I may want to update. Should I grab the X lock up front? As we said, that won't allow any parallelism. Everything we talked about in hierarchical two-phase locking applies beautifully here, because all hierarchical two-phase locking needs is some sort of containment hierarchy: as you come down, you can say, here's a bunch of things I need to operate on, organized into smaller sets of stuff, and you've got a beautiful containment hierarchy. You have exactly that here. So imagine a query reading from 10, inclusive, up to but not including 16. It can grab an IX lock on that range, the predicate, and if, after looking at a record, say it chased down key 14, looked at it, and decided it needs to update it, it can grab the X lock on just that key. Another transaction that took an IX lock on the same range but is updating a different key, like 12, is compatible and can go forward at the same time. So the gap locks and the prior- and next-key locks we talked about can also carry all the lock modes we talked about: the lock modes are orthogonal to what we're locking, and if the resource being locked has a hierarchy, you can apply all of those principles inside a B-tree too. It's beautiful that the whole theory holds up here, and you allow more parallelism in the system as a result. We talked about what happens if you don't have an index: you have to do something else, like lock every page where stuff of interest exists, or lock at the table level. The trade-off there is that you won't have as much parallelism; more transactions will block each other than with these more advanced schemes.
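To make the hierarchy concrete, here is a sketch of taking IX on a key range and X on just the key being updated, reusing the `compatible()` helper from the earlier matrix sketch. `RangeLockTable` and the resource encoding are illustrative, not any system's actual structures.

```python
class RangeLockTable:
    def __init__(self) -> None:
        self.held: dict = {}               # resource -> list of modes held

    def acquire(self, resource, mode: str) -> bool:
        """Grant `mode` on `resource` iff compatible with everything held."""
        if all(compatible(h, mode) for h in self.held.get(resource, [])):
            self.held.setdefault(resource, []).append(mode)
            return True
        return False

locks = RangeLockTable()
rng = ("people_idx", 10, 16)           # the range [10, 16) as one resource
assert locks.acquire(rng, "IX")        # txn 1: intends to update inside range
assert locks.acquire((rng, 14), "X")   # txn 1: updates key 14 only
assert locks.acquire(rng, "IX")        # txn 2: IX on the same range is fine
assert locks.acquire((rng, 12), "X")   # txn 2: different key, so compatible
```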
All right. So far we've talked about everything assuming we wanted a conflict-serializable schedule, and then we said: if you just do that, you're missing phantoms, so we added phantom protection. But in practice, it turns out you often need even weaker forms of protection. An example: a database operation that wants to read all the records to build statistics for the catalogs, which the optimizer uses to figure out how many records there are and what the histograms look like. Remember, a month or six weeks back we talked about the histograms that optimizers need to cost the plans they search through; where do they come from? Imagine building a histogram on a petabyte table: I have to scan that petabyte table, and that takes a very long time; even with a massive cluster it may take hours or days. So what do you do? If you grab an S lock on the table, you block everything else out. Instead, you say: this transaction is okay seeing dirty reads and the like; I'm just building an approximation structure, so it can run at a very low isolation level. That's a good example of why you might want these weaker forms of isolation. Essentially, you might say: I don't need protection against all the anomalies we've talked about, dirty reads, unrepeatable reads, phantom reads, and so on, and in exchange I get these different isolation levels. The SQL standard defines these isolation levels with specific terms. It turns out the terms are confusing; I have a bonus slide on this, and if you really want the details, the experts in the world who understand this wrote a paper on how SQL's definitions are all wrong. So we'll cheat a little and stay with the SQL definitions. You can actually set this when you start a transaction; when you install a database system, many systems have default isolation levels, and we'll see a slide on that, and in most systems you can also set the level per transaction. So, the different levels. Everything we've built up so far is serializable: the perfect world, everything serializable, phantoms protected against; that's called a serializable schedule, and that red SERIALIZABLE is actually a keyword in SQL. The next one is REPEATABLE READ: everything we talked about except phantoms. You'll get repeatable reads and all the other protections, but you won't protect yourself against phantoms, and in many database systems that's the default level. You're kind of giving something up and saying: I want more parallelism, I know what I'm getting into, and I'm okay with that. READ COMMITTED is where phantoms and unrepeatable reads may happen; the histogram query very likely runs at that level. It's fine: I don't care about phantoms, that's just a small change to the database; if I've read a record and someone updates or removes it, I don't get a repeatable read on that table scan, and I'm okay with it, I'm just building a histogram. And then READ UNCOMMITTED is where all of them can happen, and you're basically on your own at that point. I'm using this example from Andy's slides: many years ago, around 2012, there was this thing called Silk Road that did all kinds of shady things. The technology wasn't shady; the use of the technology for selling stuff was, and I think the intent in setting it up was. It all got shut down, and one big thing that started bringing it down was this: behind every data platform, every application of any serious worth, there's a database system, and theirs was super easy to hack, and someone figured that out. They effectively said: I can do Bitcoin transactions; imagine I've got one Bitcoin in my account.
I'm going to withdraw that one Bitcoin, but I'm going to have a hundred transactions do it at exactly the same time. If you run at one of the lower isolation levels and don't pay attention, all of them go through, like the debit account example from when we first started talking about ACID: you and your significant other trying to withdraw at the same time. If you're not running at the right isolation level, your database system is going to let it happen, and that started the huge collapse: something like billions of dollars' worth of Bitcoin that one guy could remove just by firing the same query and making it all go at the same time. It's not that hard: you say BEGIN TRANSACTION, fire my query; BEGIN TRANSACTION, fire my query. You don't even have to time it carefully, since you can put in explicit BEGINs. So you can do all kinds of crazy stuff. All right. So, in terms of these four isolation levels in SQL: I'm not going to go into all the details; I'll let you look at the slides, which should be pretty manageable, but this tells you precisely what is protected against. SERIALIZABLE guarantees no dirty reads, no unrepeatable reads, no phantoms, and as you can see, READ UNCOMMITTED means all of those may happen, so you'd better know what you're doing. There's also a way to map these into locking protocols; again, at a very high level. SERIALIZABLE says: obtain all the locks up front, plus the index locks, and use strong strict 2PL; all the stuff we talked about, if you really want all the guarantees. REPEATABLE READ is everything we covered up until the last thirty minutes or so: strong strict 2PL, but without the index locks, so phantoms slip through. READ COMMITTED is a bit weird: to get it with two-phase locking, you do everything as above except the S locks; after you read something, you don't hold on to the S lock until the end, you release it. So you're not doing strict two-phase locking for reads, and that's exactly how you get this unrepeatable-read behavior; the phantom protection went away because we weren't doing the index stuff. I'll let you go through this by yourself; if you've understood everything so far, the mapping will be trivial. And READ UNCOMMITTED says the only thing protected is writers on records interfering with each other; there are no S locks, so I can read anything at any point in time. Note it's not that there are no locks at all in READ UNCOMMITTED: you still take the write locks, because of data races and such, exactly, and I'll point you to a paper where you can look at the gory details. As I said, you can explicitly set these isolation levels in SQL; SQL supports that, with the keywords we saw in red on the previous slide. Systems differ on when you need to set it: in some systems you have to declare your isolation level, if you don't want the default, at the BEGIN; in others it can be done later. I think MySQL is at the end and Postgres is at the begin; you'll have to be careful about which system you're using, because it's not uniformly implemented in that way.
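As a sketch of setting the level explicitly, here's roughly what the withdrawal would look like at SERIALIZABLE, assuming a DB-API-style connection `conn` with autocommit enabled (so we control BEGIN/COMMIT ourselves), %s-style placeholders as psycopg uses, a system that accepts the standard SQL syntax (Postgres does), and a hypothetical `accounts` table; all of those are assumptions for illustration.

```python
def withdraw(conn, account_id: int, amount: int) -> None:
    cur = conn.cursor()
    cur.execute("BEGIN")
    # In Postgres this must come before any query in the transaction:
    cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
    cur.execute("SELECT balance FROM accounts WHERE id = %s", (account_id,))
    (balance,) = cur.fetchone()
    if balance < amount:
        cur.execute("ROLLBACK")        # insufficient funds: give up
        return
    cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                (amount, account_id))
    cur.execute("COMMIT")              # at SERIALIZABLE, the concurrent
                                       # double-withdrawal trick cannot succeed
```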
Not all database systems support all the isolation levels. Andy loves to collect details like this, which is awesome: he's gathered a number of database systems, their default level, which is the middle column, and the highest level they support. As you can see, most of the popular databases you've probably played with don't start with serializable; they start one level lower, so if you really want serializable, you have to put that call into your transactions. And some don't even support full serializable: Oracle gets pretty close, but not quite. It does something called snapshot isolation, which basically says: for every transaction, imagine I give you a full copy of the database; I'll simulate that. Imagine I check out the whole copy of the database whenever my transaction gets to run, I get my transaction ID, I make changes to my copy, and then I put it back in a safe way. That's snapshot isolation, and we'll talk about it a bit more later; there's also a small sketch of the idea right after this discussion. It is not full serializability, so it won't give you everything else we've covered, like phantom protection. Interestingly, many of the newer systems, including CockroachDB and Google Spanner, and Spanner is even stricter, start with the higher isolation modes, because we know that if you give application programmers rope to hang themselves, they will hang themselves. Yes, it comes at a cost in parallelism, but can we make the protocols better? The prime example of going even further than what we've covered in this class is strict serializability. So far we've skirted this issue: imagine I issue two transactions, T1 and T2, at the same time, or I issue T1 first and T2 right after. Everything we've talked about, including phantom protection, says that as long as the database returns results in some consistent order, it's fine; the serial schedule it emitted could be T2 followed by T1, and serializability does not prohibit that. The Google folks wanted an even stronger guarantee for their ad system, and Spanner is a globally distributed system. It gets weird if, for example, you ask for a report of your ad impressions and get it with a timestamp, and then someone else asks for ad impressions and the reports don't follow time order; if the second report, with the later timestamp, shows fewer impressions, you're going to freak out: whoa, how did it go back? For that they have an even stronger form, called strict serializability, and that name has evolved over time; if you're a systems or distributed systems person, you may have heard of linearizability, which has this property for a single object, while this does it for the whole database, which may be multiple objects. The timestamp order of what you get back on commit follows the serial schedule; the serial schedule is true to the commit timestamps. Again, there's a full paper on that; I won't talk much more about it. There's also this weird thing called cursor stability, which we haven't talked about; it's a weaker form, and old systems like DB2 start with it; they protect even less, so if you're using that, just be careful.
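Here is a minimal sketch of snapshot isolation's two rules, the frozen read snapshot and first-committer-wins on writes, under the simplifying assumption of a single "current" store with per-object commit timestamps; `DB`, `SITxn`, and the clock are made up for illustration.

```python
DB: dict = {}     # key -> (value, commit_ts of the writer)
clock = 0

class SITxn:
    def __init__(self):
        self.begin_ts = clock            # snapshot: see only commits <= this
        self.snapshot = {k: v for k, (v, ts) in DB.items()
                         if ts <= self.begin_ts}
        self.writes: dict = {}

    def read(self, key):                 # reads come from the frozen snapshot
        return self.writes.get(key, self.snapshot.get(key))

    def write(self, key, value):
        self.writes[key] = value

    def commit(self) -> bool:
        global clock
        # First-committer-wins: abort if anyone committed a newer version of
        # something we wrote since our snapshot was taken.
        for key in self.writes:
            if key in DB and DB[key][1] > self.begin_ts:
                return False             # write-write conflict -> abort
        clock += 1
        for key, value in self.writes.items():
            DB[key] = (value, clock)
        return True
```

Two concurrent transactions that both debit the same account illustrate the rule: whichever commits second sees a newer committed version of its write target and aborts, though, as noted above, this alone is still weaker than serializability.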
Hopefully by now you have enough tools to go look at some of this stuff and understand it as you read the papers. It's like teaching a person how to fish; that's what we're trying to do with all this transaction material, because there are people who, even to this day, will spend six years writing a thesis on making this better. This is not yet done; there are ways to make all of it better, because you have all kinds of crazy things, CXL, RDMA, the memory and storage hierarchy, and processors keep changing, so we keep revisiting this. If you understand transactions or query optimization, you'll have a job for life. Here's a very simple diagram; I'll just toss it up. The highest form of isolation is at the top, and as you go down, it gets weaker. Notice that repeatable read and snapshot isolation don't have an arrow between them: they're not quite comparable; each allows certain things the other doesn't. And you can see snapshot isolation sits below serializable. Again, I'll leave it at that; there are two more slides on this before we move on. Andy did a survey asking a bunch of database admins what the default isolation level in their database systems is, and as you can see, serializable is not very popular; read committed, one level below, is the default, as you saw in the previous slide. That's the world. So if you're going to do anything with databases, even if you're never going to build the internals of a database system, you can create all kinds of crazy problems for yourself and your application if you're not careful about this stuff. Hopefully you're getting a feel for how important it is to know how these protocols work, why they work, and what you get in return: what are you trying to protect against? All right. Every concurrency control protocol can be broken down into the basic concepts we've talked about. This is my bonus slide: there's a paper from the mid-'90s that criticized the ANSI SQL standard that defined these isolation levels, because the standard defined them in a vague way. Standards are done by committees: Oracle tries to get its stuff in and call it serializable, DB2 tries to get its stuff in, and ultimately it comes out in some form that doesn't look like anything reasonable. This paper is beautiful; it's written by the five-star gurus who understand isolation levels, of which there are only a handful on the whole planet, including people we've talked about, like Jim Gray and Phil Bernstein, and a whole bunch of others. They lay out how the standard's definitions are vague at times and how that leads to all kinds of problems: when database vendor A says serializable, does it mean the same as vendor B? They define it all more precisely, and they do a fantastic job, ending up with even more complicated graphs than the one I just showed you, with very precise statements saying, on this arc to that arc, this is exactly what gets violated. It's a beautiful paper; if you ever want to get deeper into isolation levels, it's a must-read. It's a bonus paper, so obviously we're not going to ask you exam questions on it; think of it as an apology from my side for telling you many times that 721 would cover this, because this paper definitely covers it. Happy to talk offline about what you find if you go read it. Okay.
All right, the next class is already here and this one is nearly over, but let's get started. Great. I think we might make up some time in the next two lectures, and I'll figure out how we'll do the rest of the material. Some really fun stuff now: multi-version concurrency control. We've talked about two-phase locking; we've talked about the timestamp-ordering protocol, which we said is mostly theoretical, just getting you used to timestamps, and we'll start to use those today; and we talked about optimistic concurrency control, which was very different: checking out objects, very much GitHub-style, and checking them back in, with a validation protocol that checks for conflicts. Unlike GitHub, where you have to fix the conflicts yourself, the database folks are nice; they handle it for you with the validation protocol. And now you get to this next thing, the multi-version concurrency control protocol. The best way to think about it is that, at a high level, it has two components. First: when I change things, how do I manage that change? So far we've seen two ways to do it. One is to update in place; in the two-phase locking stuff that was implicit: I grab a write lock, and no one else can touch the object while I make changes right there. In OCC, it was: I make copies, do everything in my own local workspace, and ultimately write it out in the write phase; every transaction had its own workspace. Now we're going to play around with some of those ideas and see how we can bring them together in the global database. This workspace checking-out, checking-in business is too much, so we'll try to do all of it in the global master database: we'll use timestamps and such, and we'll keep version chains around so we can go back and forth. Really interestingly, we'll let the readers go by without doing much work. We still need two parts: how do we manage the versions, the MV part; and the CC part, where we still need something like 2PL or OCC or TO to protect two writers from writing over each other's stuff, while readers are allowed to go through. What we're going to look at today is based on a thesis that came out of MIT in '78, and there's a huge, rich history behind it. For a while people weren't getting super excited about MVCC as a way to do these things, but essentially every database now does it. Andy has a little story about this: Jim Starkey, who is also the founder of NuoDB, started a bunch of companies with this type of implementation, and those companies got sold a bunch of times; the database eventually became this product called Firebird, which is still around to this day, and you can go to the website and see everything that's available. When Mozilla wanted to give their new browser and company a name, they wanted to call it Phoenix first but couldn't; then they wanted Firebird, but because this database is called Firebird, they ended up calling it Firefox. So database folks influenced Firefox's name, you know, via trademark infringement.
infringement. A little tidbit. Okay, so what are the main ideas behind multi-version concurrency control? We will set things up so that writers do not block readers: we'll create new versions instead of overwriting in place. And as a result, readers do not block writers either; readers can slide by. We will use this notion of a snapshot, and you'll see that with an example on the next slide. It's kind of what we were getting with the checking-out, checking-in business in the OCC-style protocols. Imagine I could make a copy of the database every time a transaction started, not literally but logically. Effectively I get a snapshot and I can work on it, which means others are not interfering with what I saw at the beginning; it's as if everything I did in the database happened at that beginning point. Of course, when I go to make writes, I'll follow validation-style protocols to figure out what the next snapshot needs to look like. That's the whole idea behind the snapshot, and there's a specific name for it: snapshot isolation. If you look at that little tidbit from the last bonus slide, snapshot isolation wasn't well defined either, and that paper actually defined it as an isolation level with very specific properties. And if I have that notion when I'm running transactions, I can also do things like time travel, which we'll see in a second.
So let's jump into MVCC as a mechanism. We'll look at a simple example with two transactions and a database. In the database I'm going to keep track of a value; think of it as a record for now, and in a little bit I'll tell you when that value is not a record but a column value. And I'm going to have a begin and an end timestamp associated with it. We kind of saw this previously: in the timestamp ordering protocol we had read and write timestamps; here it's begin and end, which largely say, I am valid from the begin time until the end time. In the diagrams there's also a first column labeled version A0. That's just to make the slides easier; it says, I'm record A, and this is the 0th version of me, which lets us refer to versions in the examples far more easily. There's a bonus slide from one of Andy's papers that shows what an actual record looks like when you do stuff like this; what's on the slide here is a schematic representation. The begin and end timestamps are what we'll mostly care about, but the version numbers make the examples easier to follow. These timestamps could be any of the things we talked about earlier: a logical timestamp, a counter that we grab, and often some mix of that plus an implicit transaction number; you'll see that in a couple of examples as we go.
So here's the version number, and the begin and end timestamps. Let's get going with transaction T1. Let's assume we are assigning transactions their transaction numbers up front, just to keep life simple, so the transaction number is going to be a proxy for the timestamp. Imagine a global counter that I read and increment atomically to get my number.
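To make the schematic concrete, here's a minimal sketch in Python of what such a versioned record could look like. This is my own illustration, not code from the lecture or the slides; the field names (`begin_ts`, `end_ts`, and so on) are assumptions.

```python
import dataclasses

INF = float("inf")  # stand-in for the "-" (no end timestamp yet) on the slide

@dataclasses.dataclass
class Version:
    key: str              # which record this version belongs to, e.g. "A"
    seq: int              # position in the chain, e.g. 0 for version A0
    value: object         # the record (or column value) itself
    begin_ts: float       # visible from begin_ts ...
    end_ts: float = INF   # ... up to, but not including, end_ts

# The initial state from the example: one version A0, valid from time 0 on.
a0 = Version(key="A", seq=0, value=100, begin_ts=0)
```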
So T1 gets transaction number 1 and T2 gets 2, and T1 starts with the read. The read is going to go look at the value; it's trying to read record A, and there's only one version of it. So it goes and looks at it, and it checks the begin timestamp: I'm transaction 1, am I allowed to read this? The end timestamp is infinity here, as indicated by the hyphen. Essentially, the state of the database, the snapshot as we see it right now, says this value started at time zero, sometime in the past, and as far as we can tell it will continue into the future until someone does something to it. That means transaction 1 can read it; 1 is in the range between the begin and end times. So it reads it.
Then T2 starts, and what it has to do is write. Now is where the fun begins. The big difference is that it's actually going to create a copy of that record in that same table. As you'll see, the copy could go in the same table, in a separate table, or in a diff table; there are three different ways to do it, but they all amount to behaving this way logically, so assume it's creating the copy in the same table. I did not overwrite the old version, but now, as transaction 2, I need to fix something on the old version: I set its end timestamp to 2. So any transaction with a timestamp less than 2 will read the first version, and anything that is 2 or greater will read the new version. But the new version is not visible yet; I'm not done. When I'm done, I'll finalize it, and everyone with a transaction timestamp of 2 or greater will have to read this version. So the version chain is just evolving. It's like the GitHub history on a specific file, saying what was valid for what time; it's very precise. Take any point in time, point it at a version chain, and you'll get exactly one version that you're allowed to read. And we're maintaining that in the table itself, not in a separate workspace.
Along with this we need a little additional machinery, a transaction status table that keeps track of what's happening to each transaction. For example, if I come in and start to read A1, the A1 version of record A, I need to know: is transaction 2 still alive, is it committed, or is it aborted? The transaction could be in an aborting phase, in which case the version is invalid; if it's committed, then I can go refer to it. So in addition to that number, the transaction number serving as a proxy for time here, I also need a transaction table that tracks what's happening to each transaction. Transactions start out active, and as they make progress they go into a commit phase or an abort phase. That commit or abort may take some time: if it's aborting, it has to clean stuff up, so it won't immediately leave the system; it updates its status to aborted. Even if it's committing, it may still have some work left, so it sets itself to committed first. Think of it this way: as a new transaction trying to access these version chains, I'll see these begin and end ranges, which tell me whether I'm allowed to see a version, but I also need to know whether the transaction that made those changes has committed or not. I don't want to stick that status in the version table, because it would be repeated per record; that's why it sits outside. So our records have already gotten a little fatter: every record now carries a begin and an end timestamp, and a bit more, as you'll see in that bonus slide.
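Putting the range check and the status table together, here's a hedged sketch of the visibility rule being described, continuing the `Version` sketch above. The `created_by` argument and the status strings are my assumptions, not the course's actual protocol:

```python
# Transaction status table: one entry per in-flight transaction.
txn_status = {1: "active", 2: "active"}

def visible(version, reader_ts, created_by, status):
    """May a transaction with timestamp reader_ts read this version?"""
    # 1. The reader's timestamp must fall in [begin_ts, end_ts).
    if not (version.begin_ts <= reader_ts < version.end_ts):
        return False
    # 2. A version written by a transaction that is still active (or
    #    aborting) is visible only to that transaction itself. Creators
    #    no longer in the table are assumed to be long committed.
    if created_by != reader_ts and status.get(created_by, "committed") != "committed":
        return False
    return True

print(visible(a0, reader_ts=1, created_by=0, status=txn_status))  # True
```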
By the way, I don't want the transaction status scattered everywhere; I want it to live in exactly one place, so when a transaction changes status I just go change the transaction status table. That table is basically sitting in memory. We won't need the transaction table in great detail, but I want you to know that you need that additional piece of information to figure things out.
All right, the next operation here is T1's read of A. This transaction can go ahead and read, but notice what it's going to do. It's not going to read the new version, which is not yet visible because T2 hasn't committed; it's going to read the old version. Remember, previously when that read happened for this same transaction T1, the state was 0 to infinity, so it just picked that. Now it actually encounters the version chain and says, oh, I have to pick A0, because 1 is between 0 and 2.
[Question from the class: what happens if T1 gets assigned 2 and T2 gets assigned 1?] Underneath all of this, locking is happening. T2 would have locked the version it stamped, and the transaction now numbered 2 would stop there. There are variants of this where you delay assigning the numbers as late as possible; ignore that if it doesn't land yet. In some implementations of these protocols, and remember MVCC is not one protocol, it's a mechanism you can do all kinds of fun stuff with, you could even allow the readers to go by without being registered in the system at all. But if it were 2 followed by 1, to your specific question, then the transaction that now carries 2 will not be able to go forward until the outcome of the other transaction is decided.
Okay, so in this case T1 reads the version that it saw before; it's getting a repeatable read. The whole illustration is that because you're carving up the life cycle of a record as time evolves, it changes state at very specific points, so given a time and my transaction number, I always know which version I should read; there's no ambiguity. And if a version is not visible, as in this case, I have to wait; there's a write lock on it. The version mechanism is just a mechanism: you still need a concurrency control protocol to protect the write path.
Here's a slightly different, slightly more complicated example where T1 is doing a bit more. It reads as before: when T1 starts, it gets an entry in the transaction status table and becomes active, and it does its read, which doesn't require any changes to the database yet. But when it writes, it has to do exactly what T2 did before; this is a close approximation of the question you just asked. It goes and creates that value, version A1, and fixes the chain; the chain is fixed by setting the end timestamp on the old version. You can imagine that logically a singly linked list is formed for record A. Now T2 begins: it becomes active, gets an entry in the transaction status table, and reads version A0, which is allowed because T1 hasn't committed yet. But you'll see what happens next: T2 goes to write, and that will not be allowed, because it has to create a new entry in the version chain, and it can't quite do that just yet, because someone else is already extending that chain.
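Here's a hedged sketch of that write path, again building on the `Version` sketch: install a new version at the head of the chain and fix the old head's end timestamp. The per-record `threading.Lock` is a stand-in for whatever CC protocol (for example 2PL) actually protects the chain:

```python
import threading

record_lock = threading.Lock()  # stand-in for the write lock on record A

def write_version(chain, writer_ts, new_value):
    """Append a new version; chain[-1] is the newest version."""
    with record_lock:  # two writers must not race on the same chain
        old = chain[-1]
        old.end_ts = writer_ts  # old version now valid only up to writer_ts
        chain.append(Version(key=old.key, seq=old.seq + 1,
                             value=new_value, begin_ts=writer_ts))

chain = [a0]
write_version(chain, writer_ts=2, new_value=200)  # installs A1, ends A0 at 2
```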
So essentially what will happen is that the write path is going to catch it on those objects, and T2 gets stopped at that point. [Question: is there a lock?] Yes, there's a lock on A2, a physical lock if you like; you can imagine that's the simplest mechanism. When T2 wants to write, it asks: does someone have a write lock on this? So you still need to protect the chain that is getting updated with locks. MVCC is just maintaining versions, with a two-phase-locking-type protocol, for example, to protect the chain. But what it gives you is that it lets readers through. For example, in the earlier case where I was just reading and a write came in, I could still go through; if it's reads against writes, I can proceed, because a copy was made, effectively like the OCC stuff was doing, and I get that parallelism.
It's basically a logical way of making that chain happen. So you might think: can I skip the end timestamp? But then imagine I'm trying to figure out whether A0 is a version I'm allowed to read. If I didn't have the 2 on it, and I'm transaction 1, I would also have to read the next thing; it's a singly linked list, so I'd have to read the next entry in the list, perhaps with an I/O, to figure it out. That's the trade-off you're making here.
[Question: can repeatable reads then return different values, because T2 can come in and commit, and now you're no longer able to read the older version?] If that transaction comes in and commits, it will not; just wait for it a little bit. This is similar to the earlier question of what happens when T2 becomes T1; it will have to go figure some of that out. Hold on a little and we'll get to it, and if I don't answer it, ask me again. As I said, we'll mostly ignore the transaction status table; I'm putting it up there for completeness. When you do the full protocol, and I can point you to a couple of papers on this, it actually is important to have it, because you need to understand what's happening to each transaction. And there's more technical stuff that happens. Sometimes the end timestamp is a dual representation: think of it as a union type between being a timestamp and being a transaction ID, where the high-order bit decides which. That lets a version say, I don't have a real timestamp yet, but I am identified right now by a transaction number. There are all these little tricks you can play, and they show up in actual implementations. [Question about conflicts.] Yes, write-write will get blocked, and read-write will depend on how you implement it, on what the concurrency control protocol allows and doesn't allow. Think of MVCC as a mechanism that allows you to do other things and gives you this nice property: versions maintained inside the table, while I still have my locking protocol.
So here's what I'm going to do: I'm going to go all the way to the very end, and we'll come back to the middle. There's a whole mesh here; you can take the multi-version MV part and combine it with all kinds of concurrency control to get a whole blend of designs. Again, as I said, this goes well beyond everything we can cover in the four minutes I have for fifty slides.
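For the dual-representation trick just mentioned, here's a small sketch of how one 64-bit word could serve as either a real timestamp or a transaction ID, with the high-order bit as the tag. The exact layout here is an assumption for illustration; real systems differ:

```python
TXN_FLAG = 1 << 63  # high-order bit: set => this word holds a txn ID

def as_txn_id(txn_id):
    """Mark the field as 'held by this transaction', not a timestamp yet."""
    return TXN_FLAG | txn_id

def decode(word):
    if word & TXN_FLAG:
        return ("txn", word & ~TXN_FLAG)
    return ("timestamp", word)

print(decode(as_txn_id(2)))  # ('txn', 2): identified by transaction number
print(decode(42))            # ('timestamp', 42): a real, final timestamp
```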
But we won't do that. I just want you to understand what the version mechanism means, what the chain mechanism means, what it means to access it, and the data structures associated with it; then we can talk offline about these different sets of protocols. There's also a bonus slide I have on this. Bonus means it won't be material for the exam, but it answers the questions you're asking, and I'm happy to take questions offline. What is in this data structure, what does a record actually look like? That's from one of Andy's papers; it's a beautiful paper, and I'd recommend reading it. And then there's the Hekaton paper, which has a very simple protocol. It came from Microsoft, and it showed how to do MVCC alongside SQL Server, which is an on-disk system; Hekaton is an in-memory system, as we just talked about, and it has a beautiful protocol that is really simple and optimized even further for the in-memory case than what we'll talk about. I had a chance to work with some of the Hekaton folks; I don't think I contributed to any of that stuff besides the experimental section, but I learned a ton from them. So all the questions you're asking make sense. All I want us to cover today is that multi-version part and all the data structures and other things we have to worry about to make it happen. There are different combinations for making all of this stuff happen, and they give you different blends and different parts of the isolation-level mess.
All right, so going back to versions, the MV part of it, we're going to address one more issue, which is the isolation stuff I already alluded to when we looked at that picture of the different isolation levels. Snapshot isolation was one level lower than serializable. So what is it missing? It's missing something called write skew. This is one more anomaly you need to know besides the ones we talked about, the non-repeatable reads and dirty reads and phantoms and such. Write skew is the following, and it's also beautifully explained in that paper I flashed as the bonus material in the previous lecture. This is an example Jim Gray used to cite to explain write skew to people in about thirty seconds; I'll probably take three minutes, because he was way smarter. Here's the example: imagine a database of marbles, two black and two white. We'll finish this slide and stop, so I promise I won't have you here till six o'clock. Transaction one wants to change all the white marbles to black, and the other transaction wants to do the opposite. If I just did snapshot isolation, each transaction makes a copy, perhaps at the same time. The first transaction flips the white marbles in its copy to black, and the second flips the black marbles in its copy to white. What ends up happening when you put those changes back is that the first transaction changed only the two white marbles and the second changed only the two black marbles, so the merged diff is a state where the colors are simply swapped. That's obviously wrong, because it's not serializable: if transaction one had happened before transaction two, you'd end up with all white marbles, and the other order would give you all black marbles. So write skew can happen, and snapshot isolation, taken naively this way, is going to end up with write skew.
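Here's a small self-contained simulation of the marbles example (my own illustration, not from the slides): both transactions read the same snapshot and write disjoint sets of rows, so a first-committer-wins check sees no conflict, yet the merged result matches neither serial order:

```python
db = {"m1": "white", "m2": "white", "m3": "black", "m4": "black"}

snap1, snap2 = dict(db), dict(db)  # both transactions snapshot the same state

# T1: flip every white marble (as seen in its snapshot) to black.
w1 = {k: "black" for k, v in snap1.items() if v == "white"}
# T2: flip every black marble (as seen in its snapshot) to white.
w2 = {k: "white" for k, v in snap2.items() if v == "black"}

# The write sets touch disjoint keys, so neither commit sees a conflict.
db.update(w1)
db.update(w2)
print(db)  # colors merely swapped; a serial order would give all one color
```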
Where we'll pick up next class is to get deeper into this multi-version concurrency control protocol, and we'll go and start working with it.