All right, so today is the next chapter on concurrency control protocols. We'll be talking about an alternative to two-phase locking called timestamp ordering. Just as a reminder for everyone, homework one is due a week from today, on the 12th. Project three is due two weeks from today, on the 19th, which is before the Thanksgiving holiday. I was debating whether to make it due before or after Thanksgiving, and the TAs convinced me that having it due before Thanksgiving, so you can relax over the break, was the right thing to do. If you feel like that was the wrong call, email them and complain. So to understand where we're at: the first lecture on concurrency control was all about understanding what serializability means, what it takes to have a serializable schedule, conflict serializability versus view serializability. Then last class we talked about two-phase locking. Note that we skipped isolation levels and phantoms at the end of the last lecture; we'll finish with those today, but you don't need them to understand the rest of this lecture. They're sort of extra. All right, so again, last class was two-phase locking, and I said this was a runtime protocol: something the system uses at runtime to generate schedules on the fly that end up being serializable. Because in the real world, we connect with our application and start issuing requests; the database system doesn't have the full schedule ahead of time. So it has to order these operations programmatically to make sure you don't have conflicts. That's what two-phase locking does. Timestamp ordering is an alternative approach that produces the same effect: it generates serializable schedules for transactions at runtime. The way to think about these at a high level is that two-phase locking protocols are considered pessimistic concurrency control protocols, meaning they assume transactions are going to have conflicts and therefore require them to acquire locks before they're allowed to read or write any object in our database. I get the lock on the thing I want to access before I can access it. Timestamp ordering is considered an optimistic protocol because we assume there aren't going to be conflicts: we don't require transactions to acquire any locks as they run, and only at the end do we figure out whether that was actually okay. Today's class focuses on the second category, and what's going to be really confusing is that the protocols we're talking about today are part of a broad class of concurrency control protocols based on timestamps, called timestamp ordering protocols, but there will also be a protocol specifically called the basic timestamp ordering protocol. Furthermore, there will be another protocol specifically called the optimistic concurrency control protocol. Both basic timestamp ordering and optimistic concurrency control are examples of timestamp ordering protocols, and both are examples of optimistic protocols.
That is extremely confusing, and I apologize; it's just the way it is, but as we go along I'll hopefully clarify things. All right, so the basic idea of any timestamp ordering protocol is that we're going to use timestamps assigned to transactions to determine the serializability order of their operations. We'll introduce a function TS: given transactions Ti and Tj, if TS(Ti) < TS(Tj), then the database system ensures it generates a schedule at runtime where the final state of the database after the interleaved operations is equivalent to one where we execute them in serial order with Ti before Tj. So Ti has an older, smaller timestamp than Tj, and it's as if we had a serial order where Ti executes first, followed by Tj. These timestamps are unique, fixed numbers assigned per transaction, and they have to be monotonically increasing: our timestamps always need to be marching forward in time, and we can never have them go backward, because that would mess everything up. An important aspect of this is that the timestamp is not necessarily the time when the transaction showed up in the system or started executing. The first protocol we'll see assigns timestamps when the transaction begins; the second protocol we'll look at assigns them only when the transaction goes to commit. That's just how these protocols work. In this intro class, we're going to assume every transaction has exactly one timestamp. In the advanced class we'll look at protocols where a transaction can have a timestamp for when it begins and another for when it commits, but for our purposes here we don't need to worry about that. So there are a bunch of different ways to generate these timestamps. The most obvious one: ask the system clock for the current time in milliseconds or microseconds, and assign that to each transaction. Every time a transaction shows up and needs a new timestamp, we check the current time and hand it over. What's one problem with this? We just had it this weekend: daylight saving time. We start assigning transactions timestamps based on the current time, and all of a sudden we have to roll the clocks back by an hour, or there's a leap second at the end of the year, and that screws up the whole scheme. So another approach is a logical counter: I have a single variable, and I add one to it every single time I hand out a new timestamp. What's one problem with this? Yes? Well, eventually the counter is going to get too big and wrap around to a lower number. Perfect; he got one of the two issues. The first issue is, as he said, wraparound: with a 64-bit timestamp it'll take a while, but with a 32-bit timestamp, as soon as you hit 2^32 - 1, it has to roll over back to zero.
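To make the logical-counter approach concrete, here's a minimal sketch of an allocator, assuming a single node and a simple mutex; the class and method names are my own inventions, not from any particular system. It also shows exactly where the wraparound problem bites:

```python
import threading

class TimestampAllocator:
    """Hand out unique, monotonically increasing timestamps from a logical counter."""

    def __init__(self, bits=32):
        self.counter = 0
        self.max_ts = (1 << bits) - 1     # 2^32 - 1 for a 32-bit counter
        self.latch = threading.Lock()     # one allocation at a time

    def next_timestamp(self):
        with self.latch:
            if self.counter == self.max_ts:
                # Wraparound: the next value would be 0, which looks like a
                # timestamp from the distant past and breaks the ordering.
                # Real systems need a scheme to handle this (more next class).
                raise OverflowError("timestamp counter wrapped around")
            self.counter += 1
            return self.counter
```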
Once the counter wraps around, all of a sudden you have transactions whose timestamps look like they're in the past even though they're really in the future, and that messes things up. We'll see how to fix that next class. Another issue is in a distributed system: if I have transactions showing up on different machines, how do I keep those counters in sync? That gets expensive. So there's no one magical way; the most typical approach in practice is a hybrid clock, a combination of a physical clock, like the system clock in the CPU, and a logical counter. Postgres uses a logical counter, and again it has the wraparound issue; we'll talk about that next time. All right, so today's agenda: we're first going to talk about the basic timestamp ordering protocol. Again, this is what I was saying before: these are all timestamp ordering protocols, but I'm calling this one the basic timestamp ordering protocol, and that's a very specific algorithm. Then we'll talk about OCC, the optimistic concurrency control protocol; again, all of these are optimistic, but this one is explicitly named optimistic concurrency control. Then we'll cover partition-based timestamp ordering, and we'll finish up with what was asked about before: the isolation levels and phantoms we didn't get to last class. All right, so let's jump right into this. The basic timestamp ordering protocol works like this: we allow transactions to read and write objects in our database without locks, and every object in the database now includes two additional fields to keep track of the timestamp of the last transaction that read the object and the timestamp of the last transaction that wrote to the object. In this protocol, transactions get their timestamps when they show up in the system: when I arrive, I get my timestamp, and I can use that to figure out what the serializable ordering of my operations should be. So every single time I do a read or a write, I first have to check the read/write timestamps of the objects I'm trying to access to see whether I'm allowed to. The idea here is that these timestamps are always moving forward in time, and we want to make sure our transaction never reads or writes anything from the future. If we see that we're violating this timestamp ordering, then the database aborts our transaction. So there are two cases we need to handle: reads and writes. The protocol for reads is: if my timestamp is less than the write timestamp of the object I'm reading, then I know some transaction with a higher timestamp than mine, one that's in the future even though it's running at the same time I am, has written to this object, and therefore I should not be allowed to see that updated version of the object, because I'm in the past. The way to think about this is that my timestamp defines the state of the data I should see: I can see anything that was modified before I arrived.
And so if I see a write from somebody in the future, I shouldn't be able to see it, and therefore the database has to abort me and restart me, and when I restart, I come back with a newer timestamp. If we don't violate this conditional, then we're allowed to read the object, and then we update the read timestamp of the object to the max of its current read timestamp and our timestamp. Now, one additional thing that I don't think is in the textbook's description of the timestamp ordering protocol: in order to make sure you have repeatable reads under this scheme, you always make a local copy of the object after you read it. That way, if I want to read the same thing again, I can read the version that I know I was allowed to read. Yes? The question is, what if the two timestamps are equal to each other: TS(Ti) equals the write timestamp of X. What does that mean; who wrote to X? Exactly: Ti did. Again, timestamps are unique; a timestamp uniquely identifies my transaction, I said that in the beginning. So if the write timestamp is equal to my timestamp, I wrote to it, and therefore I'm allowed to read it. All right, for writes, we have to check that my timestamp is not less than the read timestamp of the object, because if it is, some transaction in the future has read the older version; they should have seen my write, but they didn't, because I hadn't made the write yet. We also can't overwrite a change from a transaction that's in the future relative to me: I'm in the past, and if I wrote over it, my old version would become the latest version. That's not allowed to happen either. So if either condition is violated, I have to abort. If I don't violate these conditions, then I update the object and set its write timestamp to my timestamp. And same as before: if I want repeatable reads, and to read my own writes, I make a copy of the object I wrote and put it into my private workspace. Yes? So he asks, if the table has a billion tuples, how do you keep a local copy? You have to make a billion copies, right? The memory for this private workspace would have to be backed by the buffer pool manager so that you could swap it out to disk if necessary. Yes? His question is, why do we actually need a local copy? So say I write to an object; I don't violate the conditionals, so that's okay. Then a transaction in the future that's still active, still running at the same time I'm running, ends up writing to that object. It overwrites my write. But if I want to be able to read my own write and not have to abort, I can keep a local copy and read that again.
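Putting the two rules together, here's a hedged sketch of the basic T/O read and write paths. The Obj and Txn classes, their field names, and the Abort exception are my own inventions for illustration; this is not any particular system's code:

```python
from dataclasses import dataclass, field

class Abort(Exception):
    """Transaction must abort and restart with a NEW timestamp."""

@dataclass
class Obj:
    key: str
    value: object = None
    r_ts: int = 0     # timestamp of the last transaction that read it
    w_ts: int = 0     # timestamp of the last transaction that wrote it

@dataclass
class Txn:
    ts: int
    workspace: dict = field(default_factory=dict)

def to_read(txn, obj):
    # Read rule: a "future" transaction already wrote this object,
    # so we are not allowed to see that value.
    if txn.ts < obj.w_ts:
        raise Abort
    obj.r_ts = max(obj.r_ts, txn.ts)      # advance the object's read timestamp
    txn.workspace[obj.key] = obj.value    # local copy => repeatable reads
    return obj.value

def to_write(txn, obj, value):
    # Write rule: a "future" transaction already read or wrote this object.
    if txn.ts < obj.r_ts or txn.ts < obj.w_ts:
        raise Abort
    obj.value = value
    obj.w_ts = txn.ts
    txn.workspace[obj.key] = value        # local copy => read-your-own-writes
```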
Again, you don't have to keep these local copies; it just prevents you from aborting unnecessarily. It's an extra thing, and we'll see the same idea in OCC in a second. Because if I don't have a local copy and I try to read my object again, I'd come back to the read rule, my timestamp would be less than the write timestamp of the transaction that overwrote it in the future, and therefore I would abort. By making a local copy, I avoid that problem. Yes? So his comment is that with timestamp ordering, you can think of a transaction as taking zero time: you get one timestamp, all your updates carry that timestamp, and it's as if all your changes magically appear at once. That's what it looks like from the outside world; that's what a serializable schedule gives you. It looks like things happen in serial order, but in reality we have to interleave the operations, because we can't magically have all the changes occur at once. Partition-based timestamp ordering, which we'll see at the end, essentially does run things that way, but for our purposes here, we have interleaving and we have to handle it. All right, so let's look at an example with T1 and T2. T1 is going to do a read on B and then a read on A. T2 is going to read B, write B, read A, write A. We assign the transactions their timestamps: T1 gets timestamp one, T2 gets timestamp two. Over in our database, every object now has a read timestamp and a write timestamp, and they all start at zero, so assume some transaction with timestamp zero bulk-loaded everything. We're on a single core with a single thread, so we can only execute one operation at a time. T1 starts and does the read on B, so we update B's read timestamp to one. T2 then starts and does its read on B, so we update B's read timestamp to two, because that's the timestamp T2 got when it started. Then T2 does the write on B and we update B's write timestamp to two. We switch back over to T1's read on A and update A's read timestamp to one, then back to T2's read on A and update it to two, because again we're always moving forward in time, then T2's write on A updates A's write timestamp to two. Then both commit, and there were no violations of the timestamp ordering protocol, so we're allowed to finish. As you go along, you're always checking and updating the read and write timestamps inside the database for the objects you're reading or modifying, and as long as your timestamp doesn't violate the conditionals I showed before, you're allowed to do the operation. Yes? His question is, how did I determine that the timestamps assigned to T1 and T2 were one and two? And his follow-up is that if I had assigned them the other way around, one of them, T2, would have aborted, right?
Yes, so for this particular example, I just constructed it so that nothing aborts. In the basic timestamp ordering protocol, when you start is when you get your timestamp, and it's just a logical counter: T1 starts and gets one, we increment the counter by one; T2 starts, checks the counter, and gets two. It just adds one each time a transaction starts. So let's look at an example where a transaction would abort. Same setup: T1 starts and does a read on A, and we update A's read timestamp in the database. T2 starts, writes A, and we update A's write timestamp to two. Now T1 tries to do a write on A, but T1's timestamp is one and the write timestamp on A is two. One is less than two, so we're not allowed to make this modification; we're not allowed to overwrite that change. Notice I'm not even showing the values in the database; I don't care. All I care about are the read and write timestamps, because they determine in what order these operations are allowed to occur. So when we see this violation of our conditionals, we have to abort T1. Is this clear? All right, so we talked about something like this last class: what's one obvious optimization we can make for this particular schedule? Do we actually care what T1's write to A does to the final state of the database? Not really, because logically it gets overwritten anyway. Since we have timestamps and the timestamps tell us our serial ordering, T1 before T2, we want the final state to show T2's version of A, not T1's. So there's an optimization we can apply to the timestamp ordering protocol called the Thomas Write Rule, named after Robert Thomas, from work back in the late 1970s. It says: as before, when we're trying to do a write, if our timestamp is less than the read timestamp of the object, we have to abort, because a transaction in the future has read the version we haven't written to yet, and we can't update it. But if our timestamp is less than the write timestamp of the object, the Thomas Write Rule says we're allowed to just skip making the write to the database, keep the write in our local workspace, and allow our transaction to continue. Nobody else will ever see that write, because it would have been overwritten by T2 anyway. So this technically violates strict timestamp ordering, but it's a really simple optimization that gives us fewer aborts. Going back to our example: T1 reads A and updates the read timestamp; T2 writes A and updates the write timestamp. Now T1 wants to do its write on A, and T1's timestamp is one, which is less than two, but instead of aborting, we don't update the write timestamp on the object and just ignore the write; nobody else sees it, and our transaction is allowed to keep running and commit. Pretty simple, right? Okay, and this is not the same as blind writes, because under blind writes, the last writer is what we end up seeing. In this case, the last writer in real physical time is technically T1, but in logical time, based on our timestamps, T1's write actually occurs before T2's.
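Extending the earlier sketch, here's what the write path looks like with the Thomas Write Rule folded in (same invented names as before, assuming the Obj, Txn, and Abort definitions from the previous sketch):

```python
def to_write_thomas(txn, obj, value):
    if txn.ts < obj.r_ts:
        raise Abort                # a future txn read the old version: must abort
    if txn.ts < obj.w_ts:
        # Thomas Write Rule: a future txn already overwrote this object, so our
        # stale write would never be visible anyway. Skip the physical write,
        # keep it in the workspace (so we can read our own write), keep running.
        txn.workspace[obj.key] = value
        return
    obj.value = value
    obj.w_ts = txn.ts
    txn.workspace[obj.key] = value
```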
So the way to think about what we're describing here is that there's the physical time of how the events actually occur on the chip as it runs, and then there's the logical time that the outside world sees. The logical time says T1 followed by T2, but in reality the physical time was T2 followed by T1, and that's all correct. This is another good example of what I was saying before about how serializability allows us any possible interleaving we want, as long as the end state of the database is equivalent to one with a serial ordering. By interleaving in a certain way, we can get better parallelism and don't have to abort things unnecessarily, but we may not end up with exactly the state of the data you'd expect. Yes? The question is, do you have to keep a local copy of A here? Correct, because otherwise if I then go to read it again, I'm not reading my own write. So, the basic timestamp ordering protocol. It generates schedules that are conflict serializable, as long as you don't use the Thomas Write Rule. There won't be any deadlocks like in two-phase locking, because it acts like deadlock prevention: transactions check as they go along whether they're allowed to read or write something, and they abort themselves if they find they'd violate the ordering. One issue you do have with the basic timestamp ordering protocol is starvation for really long transactions, because I'm aborting any time I'd violate my timestamp ordering. If my transaction is going to update a billion tuples, then as I'm going along, all these little transactions come through, update one tuple, commit right away, and are done; by the time my long transaction gets to those tuples, its timestamp is less than the little guys', so it gets aborted and rolled back, and it'll keep doing this over and over and never actually complete. I would say that in practice, I'm not aware of any database system that does timestamp ordering exactly the way I'm describing here. We were doing this in our own system, Peloton, here at CMU; that was a bad idea, we undid it, and we're not doing it anymore. We can talk offline about why. The other thing to point out about timestamp ordering is that it allows schedules that are considered not recoverable. A recoverable schedule is one where a transaction is only allowed to commit after all of the transactions whose changes it read have committed. It's like the cascading rollback stuff we talked about before: if I read something that was modified by another transaction, I have to make sure that transaction commits before I'm allowed to commit. The basic timestamp ordering protocol does not check this for you. The reason you want recoverability is that if you come back after a crash, you can replay all the transactions that modified the database and end up with all the committed changes intact, without worrying about having read data that got rolled back or went missing. Here's what I mean. In this case, T1 does a write on A, then T2 does a read on A and a write on B. Under the basic timestamp ordering protocol, T2 is allowed to read the write that T1 made to A, right?
Because T1 has timestamp one and updated A's write timestamp; T2 comes along and reads it with timestamp two, which is greater than one, so it's allowed to read it. But then T2 goes ahead and commits, and we send a response to the outside world: hey, you're done. If T1 later aborts, we have to roll back its changes, but we have no way of rolling back the changes T2 has made, and no way of telling the outside world that T2's commit never should have happened. So the textbook definition of the basic timestamp ordering protocol permits this schedule, which is considered not recoverable. The way you fix this is to maintain the read/write sets of transactions, so you know that T2 read T1's modification, and when T2 goes to commit, it has to wait to see whether T1 actually finishes. Strict two-phase locking doesn't have this problem, because I would get the exclusive lock on A and hold it until I actually finish, so T2 would never be allowed to read it in the first place. You can think of T2's read as a speculative read: I think you're going to commit, so let me go ahead and read something you wrote, and we'll figure out later whether I was actually allowed to do that. So, the performance issues with the basic timestamp ordering protocol. Again, as was asked earlier: what if I update a billion tuples, do I have a billion tuples in my workspace? The answer is yes. You pay the penalty of copying everything into your workspace just to get repeatable reads. Two-phase locking doesn't have this issue, because you acquire the locks and then you're allowed to do whatever you want to the data. And as we talked about before, long-running transactions can get starved, because their timestamps are in the past and they get tripped up by newer transactions that come and go, updating the same things they're trying to update. One observation we can make is that with both the basic timestamp ordering protocol and two-phase locking, you're assuming conflicts are going to occur, and you're making sure nobody causes problems as you go along. The basic timestamp ordering protocol is a little more optimistic than two-phase locking, but you're still checking on every single access or modification. So what if we assumed that conflicts are mostly going to be rare, and that our transactions are mostly short-lived, meaning they don't stick around doing huge numbers of updates? Most of the time you're only reading or updating a small number of records. We talked about this before with OLTP applications: when you log into Amazon, you only access the data for your account; maybe you update your payment information, update your address, buy some items, make payments. You're only touching a small number of tuples. It's very rare for a transaction to come along and touch a billion tuples all at once. So if you assume most transactions are small and conflicts are rare, then making transactions acquire locks before they do anything, or update these read and write timestamps as they go along, is adding overhead for conflicts that are never going to occur anyway. You're being overly cautious.
So a potentially better approach is to optimize for the non-conflict case, because that's the most common thing. This is what the optimistic concurrency control protocol does. And again, it's a timestamp-ordering-based protocol that is optimistic, but it also carries the name: the optimistic concurrency control protocol, or OCC. This was actually invented here at CMU, in 1981, by H.T. Kung, who is now a professor at Harvard; he went there to help fix up Harvard CS. Funny story about that: the premier expert in the world on concurrency control and transactions was once a professor at Harvard, a guy named Phil Bernstein. All the seminal papers on transactions in the late 1970s and early 1980s came from him. He got denied tenure at Harvard because they didn't understand what a database was, and for a long time nobody wanted to go to Harvard as a CS professor, because if Phil Bernstein couldn't get tenure, what chance does anyone else have? Anyway: OCC is H.T. Kung's work from his time at CMU. All right, so the way OCC is going to work is that, just like before, we have a private workspace per transaction, but whereas in the basic timestamp ordering protocol I said that was optional, in OCC you have to have it; that's how the protocol works. Any time you read an object, you copy it into your private workspace and read that copy from then on. Any time you make a modification, you first copy the object into your workspace and then modify it there. Then, when you go to commit, the database system figures out what changes you made in your workspace, looks at the other transactions running right now and their workspaces, and checks whether there's a conflict. If there are no conflicts, then you're allowed to install all the changes from your private workspace into the global database, and everyone can see them. This is broken down into three phases, and here's more confusing terminology, because the phases are called the read phase, the validation phase, and the write phase, but the read phase is more than just reads: you actually do both reads and writes in what's called the read phase. In the read phase, the transaction tracks everything it has read and all the changes it has made in its private workspace. Then, when you go to commit, you automatically enter the validation phase, where the database figures out whether any of the changes you made conflict with other transactions. If there are no conflicts, you enter the write phase, where you apply all the changes from your private workspace to the global database. Again, just to be clear: the read phase includes both reads and writes. So let's look at an example. The first thing to point out is that in our global database I've gotten rid of the read timestamp; there's only the write timestamp field, because I don't care about the timestamps of transactions reading objects; all of that happens down in their private workspaces. The other thing I've added to the schedule are these little boxes labeling the three phases: read, validate, and write.
So just like in two-phase locking, where I said you don't really have explicit lock and unlock commands because the database does that for you, here you don't explicitly tell the database, put me in the read phase, put me in the validation phase. That happens automatically under the covers; I'm just showing the boundaries to say what phase each transaction is in at these different time steps. So we start off: T1 begins and automatically enters the read phase, and we instantiate a private workspace for our transaction, initially empty. And as I said before, this private workspace is backed by the buffer pool manager, because if it gets too big and we run out of memory, we want to be able to swap it out to disk. So now T1 is all set up; it does the read on A, and we copy the current value of A from the global database into our private workspace, along with its write timestamp. Then there's a context switch: T2 starts, enters the read phase, begins by instantiating its private workspace, does its read on A, and copies A into its workspace. Now T2 tells the database system to commit. I'm drawing commit here as the point where the database says you're actually committed, but assume that from the terminal or the application, I'm telling the database, go ahead and commit me. When this occurs, we automatically enter the validation phase, and we need to figure out whether we have any conflicts with other transactions. This is also the point where the database assigns the transaction its timestamp. So, to point out: T1 started first and T2 started second, but T2 entered the validation phase before T1, so T2 gets timestamp one, and that's the serial ordering we're going to guarantee. T2's validation is super easy: both transactions have only read this object, and there's no such thing as a read-read conflict, so we're fine. We pass the validation phase and enter the write phase, where we can apply any changes to the global database, but this is a read-only transaction, so there are no changes to make. We go ahead and commit and blow away its private workspace. Then we switch back over to T1, which does its write on A. Again, we apply the change directly in our private workspace, but the thing to point out is that along with updating the value, I set the write timestamp to infinity. Why? Exactly: you don't know when T1 will commit; in the read phase, transactions don't have timestamps yet, so you set it to infinity, some forever-distant point in the future. Then T1 enters the validation phase and gets its timestamp, two. During validation the timestamp in the workspace is still infinity, but if we pass validation and enter the write phase, we update the global database, and only now do we set A's write timestamp to two. So at a high level, hopefully this is clear; obviously I'm being very hand-wavy about the validation phase, but we'll go over that in more detail next.
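Before we get to validation, here's a hedged sketch of the read-phase bookkeeping we just walked through; the txn fields (workspace, read_set, write_set) and the db dict mapping keys to (value, write-timestamp) pairs are my own illustrative names, in the same spirit as the earlier sketches:

```python
def occ_read(txn, key, db):
    # First access copies the object (and its write timestamp) from the
    # global database into the private workspace; later reads hit the
    # local copy, which gives us repeatable reads for free.
    if key not in txn.workspace:
        value, w_ts = db[key]
        txn.workspace[key] = (value, w_ts)
        txn.read_set.add(key)
    return txn.workspace[key][0]

def occ_write(txn, key, value):
    # Writes only ever touch the workspace. The write timestamp is infinity
    # because the transaction has no timestamp until it enters validation.
    txn.workspace[key] = (value, float("inf"))
    txn.write_set.add(key)
```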
Okay, so validation. The basic idea is that the database system must ensure it only permits serializable schedules for our interleaved operations. What happens is that the transaction that wants to commit, the one in the validation phase, checks against the other active transactions for read-write or write-write conflicts, and makes sure any conflicts always go in one direction. You can do different types of validation, in the forward direction or the backward direction; we're only going to go in one direction here to keep it simple. We're going to say that older transactions must not conflict with younger transactions, and if we find such a conflict, we abort ourselves. That means we only need to check against the transactions that are actually active right now. The way this works is that we need a global view of all the active transactions, and the database's internal threads have to be able to go inside each transaction's private workspace and see what it has in there. So we keep track of the read/write sets of transactions, which we're doing anyway in our private workspace, and then we check whether our write set conflicts with the read/write sets of the other transactions. For our purposes, to understand the protocol, we'll do what's called serial validation: we have one big fat latch in front of the validation phase, and only one transaction can do validation at a time, just to keep it simple. In more complex scenarios, in real systems, you can actually do this validation in parallel, but it's a bit more complicated and we're not going to cover that here. For our purposes, again, only one transaction is ever in the validation phase. The write phase doesn't actually have to be single-threaded; you can order the writes in such a way that you don't have conflicts or deadlocks, but for understanding the protocol we don't care about that at this point either. All right, so when we enter the validation phase, the transaction gets a timestamp, and then we check whether our transaction conflicts with the other actively running transactions. If our timestamp TS(Ti) is less than the timestamp of another transaction Tj, and note that Tj here means the actively running transactions that are not yet in the validation phase, so their write timestamps are infinity and they're always in the future, then one of the following three conditions must hold for Ti to pass the validation phase. The first one is the most obvious: if Ti completes all three phases before Tj even begins, then Ti is allowed to commit. Ti starts, does its read, validate, and write phases, commits, and is done; then T2 starts and does its read, validate, and write. At no point did T2 start before T1 got through its validate and write phases, so there can never possibly be a conflict. This is sort of trivial: it's just serial ordering, that's all this means.
The way to think about this, though: I'm showing two transactions, but imagine you had hundreds running at the same time; you'd work through them and conclude, I don't conflict with this other transaction, and be done. The second condition is: if Ti completes before Tj starts its write phase, and Ti has not written to any object that has been read by Tj, then Ti may commit. You take WriteSet(Ti) intersected with ReadSet(Tj): if the intersection is the empty set, then I know I don't conflict with Tj, and Ti is allowed to commit with respect to Tj. So going back here: we want T1 to commit, and T2 has only read one object, A. When we do our validation, we check whether our write set conflicts with the read set of the other transaction. In this case, yes: T1 wrote A, and T2 read A. So here we have to abort T1, even though T2 is never going to modify this object and never actually read T1's write, because T2 is still active when T1 is trying to commit, and T1's write set intersects T2's read set. Therefore T1 can't be allowed to commit. Look at what would happen otherwise: if T1 were allowed to commit, it would apply its change to the global database with timestamp one; T2 would then validate and commit and be allocated timestamp two, but it didn't read the version of the object written at timestamp one, it read the version written at timestamp zero. It missed the write from T1. That's why T1 has to abort: older transactions, the ones with lower timestamps, should not be able to write anything that a newer transaction would end up missing. Now take the same example but switch the validation order around, so T2 validates first: T2 gets timestamp one, and when T1 validates, it gets timestamp two. Two is greater than one, so now we're okay: we can apply our change, because T2 did not miss a modification it should have seen. Logically T2 comes before T1, so it wouldn't have seen T1's write anyway, even though T2's read occurred physically after T1's modification of A. That's a correct serializable ordering. I'm seeing blank faces; is this clear? Again, this is a good example of the difference between logical time and physical time. Logically, T2 commits first. Physically, T1 wrote to A before T2 read it, but that's okay, because T1's write happened in its private workspace: when T2 did its read on A, it read from the global database and didn't see that change, and we're allowed to commit both, because T2 wouldn't have seen that write anyway based on the logical ordering of the timestamps. Yes? Okay, and I know you're not the only one who didn't follow that, so let me walk through it once more. Say T1 starts, does a read on A, brings it down into its workspace, then does a write on A and updates it in there.
Now there's a context switch: T2 starts and does a read on A, and it gets the state of A from the global database, the version at timestamp zero, and brings it down into its private workspace. Now T2 validates. It's a read-only transaction and conflicts with nobody, so it's allowed to complete validation, it gets timestamp one, and it proceeds through its write phase. When T1 then goes to validate, it gets timestamp two. Physically, the write to A occurred before the read of A: T1 wrote it first, but T2 read the version up in the global database. And because T2 is logically in the past and doesn't conflict, T1 is allowed to install its change. So again, this is a good example of how things physically occur one way, but the end state of the database reflects a different, logical ordering. Whether T1's write to A happens before or after T2's read doesn't really matter, because that write lands in the private workspace. Yes, correct: his statement is that it doesn't matter whether T2's read of A occurs before or after T1's write, because the change T1 makes always lands in its private workspace, and T2 will never actually see it. That's exactly what the private workspace does for us. So the final validation condition is: Ti completes its read phase before Tj completes its read phase, and Ti's write set does not conflict with the read set or the write set of the other transaction; that is, WriteSet(Ti) ∩ (ReadSet(Tj) ∪ WriteSet(Tj)) is empty. It's an extension of what we had before. So in this example, T1 is going to read A and write A, while T2 reads B and then later reads A. T1 enters its validation phase and gets timestamp one. Then we check whether we have a conflict with anybody else: T2 has read something, but it didn't read A, so that's fine, and at this point it's safe for T1 to commit and apply its change to the global database. Then T2 comes along and does its read on A, and it gets the version that T1 wrote. And that's correct, because it's seeing the state of the database logically after T1 has already committed, and T2 has not committed yet. Now, I am hand-waving a bit about how the changes get applied in the write phase. What if T1 updated five tuples, and T2's read ends up seeing the first two before the other three have been applied? You have to use latching, or additional checks, to prevent that, but we don't care about that at this point. Yes? His question is whether I'm assuming the two transactions are issued by two different threads. Two different connections are sending these requests, yes, but in my example, I only have one core, one thread, actually running the transactions.
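To pull the three validation conditions together in one place, here's a hedged sketch of serial validation. The Txn fields and the phase-boundary bookkeeping are my own invention for illustration; real implementations track this differently:

```python
import math
from dataclasses import dataclass, field

@dataclass
class Txn:
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)
    begin: float = 0.0                  # when the read phase started
    read_phase_end: float = math.inf    # when it entered validation
    finish: float = math.inf            # when its write phase completed

def validate(ti, now, others):
    """Serial validation for ti: runs under one global latch, so only one
    transaction is ever in here. 'others' are the concurrent transactions
    that will receive a later timestamp than ti. Returns True if ti may
    proceed to its write phase, False if it must abort."""
    for tj in others:
        # Condition 1: ti finished everything before tj even began.
        if now < tj.begin:
            continue
        # Condition 2: tj has not started its write phase (it is not even
        # in validation yet), and ti wrote nothing that tj read.
        if tj.read_phase_end == math.inf and not (ti.write_set & tj.read_set):
            continue
        # Condition 3: ti ended its read phase before tj ends its read phase,
        # and ti wrote nothing that tj read or wrote.
        if ti.read_phase_end < tj.read_phase_end and \
                not (ti.write_set & (tj.read_set | tj.write_set)):
            continue
        return False   # none of the three conditions hold: ti must abort
    return True
```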
Okay, so OCC works really well when the number of conflicts is low. Yes, I'm copying things, but I'm not checking at every single step as I read and write; I only check at the very end. So if I don't have a lot of conflicts, this is faster than two-phase locking, because I don't have the extra overhead of acquiring locks as I go along. The downside, though, is when you have a lot of conflicts: I execute my whole transaction, run all its queries, and only at the end do I figure out whether I have a conflict. If I have to update a thousand tuples, and those updates are expensive, it's only when I finally go to commit that I realize, oh shit, I conflict with the first guy, I can't commit, and now I have to undo all my work and roll it back. Whereas in two-phase locking, you get the locks as you go along, so you'd never get that far. The research basically shows that under very high contention, two-phase locking and OCC are degenerate cases of each other and perform about the same: in two-phase locking you basically spin trying to acquire locks forever, and in OCC you keep applying your changes, failing validation, rolling back, aborting, and restarting, over and over. Under extremely high contention, neither is better than the other. But when contention is low, the research shows that something optimistic actually wins. Where the sweet spot is depends on the application. So again, the OCC performance issues: there's high overhead from copying data locally; the validation phase can be a bottleneck unless you parallelize it, and even then you still have multiple threads trying to update the same database state at the same time; and as I said, aborts are more expensive in OCC than in two-phase locking, because you only find out you have a conflict at the very end, after you've already made all the changes you wanted to make. Okay, so another observation we can make about OCC: when a transaction goes to commit, it has to check for conflicts with the other transactions. That means using latches to protect the transactions' internal data structures, because one thread might be writing something at the same time you're reading it. So if you have a lot of concurrent transactions, even if conflicts are rare at the logical level, meaning they're not actually updating the same tuples, physically they can all still hit the same small set of data structures: I have to read your workspace to figure out what you've actually written, and I have to take latches to do that. When you have a lot of transactions, that becomes very expensive. Another way to think about it: you did latch crabbing in your B+ tree project. Even when there are no conflicts on the keys, you're taking latches to enter the data structure and keep its internal representation physically sound. All of that becomes very expensive. So an alternative approach is to have transactions execute in serial order, literally one after another, physically, and get rid of all the low-level latching in your data structures, your B+ trees and hash tables, whatever, and stop worrying about conflicts between transactions as well.
Because if you're running in single-threaded mode and only one transaction can run at a time, there can never be a conflict. This is what partition-based timestamp ordering does. The basic idea is that we split up the database into disjoint subsets called partitions, the horizontal partitioning we've talked about before; in the NoSQL world, these are called shards. We still assign timestamps to transactions when they show up, and then, for each partition, when your timestamp is the smallest among all the transactions waiting for that partition, it's your turn to run. You run in single-threaded mode and don't worry about conflicts with anybody else, physically or logically. Basically, I'm running at bare-metal speed here, because there's no lock manager overhead and no latching overhead. The first thing we need to understand is how this can actually work: it seems like it would be slow, because I'm running transactions on a single thread. The way we're going to get parallelism back is through partitioning, and when we look at a lot of OLTP applications, they're actually quite amenable to partitioning, which makes using this particular protocol feasible. So let's look at a really simple schema for something like Amazon, a storefront for an online shopping system. We have three tables: customers, orders, and order items. A customer has orders, and orders have order items, so you have foreign key dependencies going up the chain between these tables. And as I said before, most of the time, your transactions on Amazon only update your own account. Actually, you can only update your own account: I can't log into my account and start updating your payment records or your credit card information; I can only update mine. The application enforces this. So what I can recognize here is that the customer ID runs all the way down through this hierarchy. If I partition my tables based on the customer ID, then I can let transactions that only touch one customer run on a single partition in this single-threaded mode. Visually it looks something like this. Say I have two partitions, whether on different machines or on the same machine on different cores or sockets, I don't care. The partition at the top has customers 1 to 1,000; the partition at the bottom has customers 1,001 to 2,000. Now my application server issues a request that touches a customer in the first partition: I call begin, start my transaction, run my query (the slide is missing a little thing here; it should say get customer ID one), so I can read and update customer ID one, and then I go ahead and commit. At no point did my transaction ever have to communicate with the other partition. Having this single-threaded mode for this particular type of workload can make things run really, really fast, because I don't have to worry about locking tuples, and I don't have to worry about latching data structures. This partition-based timestamp ordering is used in a couple of systems; it's an old idea, first proposed in the late 1980s and early 1990s.
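As a tiny illustration of the routing idea (my own sketch, with an assumed 1,000 customers per partition to match the example above), the partition key is just derived from the root of that foreign-key hierarchy:

```python
def partition_for(customer_id, customers_per_partition=1000):
    """Map a customer to its partition. Every table keyed (directly or via
    foreign keys) on the customer ID is split the same way, so a transaction
    that touches one customer lands entirely on one partition."""
    return (customer_id - 1) // customers_per_partition

# partition_for(1)     -> 0  (customers 1..1000)
# partition_for(1001)  -> 1  (customers 1001..2000)
```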
And again, the idea is that when a transaction shows up, it gets a timestamp, and when it has the lowest timestamp at its partition, it's allowed to run. So this is sort of a locking scheme, but it's not two-phase locking: there's only one giant lock for the whole partition, and when my timestamp is the lowest, I get it. This works really well if you only touch one partition. If you have to touch multiple partitions, then I have to wait; I can't run my transaction until I've acquired the locks for all the partitions I need to touch. Of course, this means I need to know what my partitions are ahead of time. The oldest system that does something like this is KDB, a financial time-series database out of New York. H-Store is a system I helped build when I was in grad school, which was later commercialized as VoltDB. H-Store was also worked on by Daniel Abadi, a professor at the University of Maryland; when he was at Yale, he invented a system called Calvin, which does a variant of this, and that idea has since been commercialized in FaunaDB, a newer startup. And in all of these systems, well, I don't think KDB actually works this way, but for Fauna and VoltDB and H-Store, you have to know what the partitions are ahead of time, in order to know what you need to lock before you're allowed to run. Because otherwise, if I'm running on partition one and I say, oh, I need partition two, I can't just go grab that lock, because that might put me in a deadlock. I have to abort my transaction, roll back, and come back saying, give me one and two, and then when my timestamp is the lowest at both of those partitions, I'm allowed to run. So for reads, I can read anything at a partition I hold the lock for, and if I try to read something at a partition I don't hold the lock for, I get aborted and restarted. For writes, same thing, but note there's only a single version of each tuple; there's no private workspace. I maintain an undo buffer to keep track of the changes I made, so if my transaction aborts I can roll them back, but I overwrite the master version of the tuple in place. And again, if I try to write something at a partition I don't hold the lock for, I get aborted. Visually it looks like this. Say we have two application servers, and we partition the customer table by customer ID. Both servers want to start a transaction on the first partition. There's a queue that keeps track of these requests as they show up, and they're assigned timestamps: server one's request gets timestamp 100, server two's gets 101. Whatever transaction has the smallest timestamp is granted the lock for this partition and is allowed to run and do whatever it wants: it gets customer ID one, makes whatever changes it wants, and then goes ahead and commits. And there's no validation, no checking for conflicts, because there can't be any conflicts: only one transaction was running at this time. After it commits, the second request now has the oldest timestamp, so it's given the lock and allowed to run and do whatever it wants to do.
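Here's a hedged sketch of that per-partition queue (my own names again; assuming timestamps are unique, so the heap never has to compare the function objects on ties):

```python
import heapq

class PartitionQueue:
    """One of these per partition: requests wait in timestamp order, and the
    transaction with the smallest timestamp effectively holds the partition
    lock and runs alone."""

    def __init__(self):
        self._heap = []

    def submit(self, ts, txn_fn):
        heapq.heappush(self._heap, (ts, txn_fn))   # e.g., (100, ...), (101, ...)

    def run_next(self):
        ts, txn_fn = heapq.heappop(self._heap)     # lowest timestamp runs next
        txn_fn()   # single-threaded: no latches, no validation, no 2PL

q = PartitionQueue()
q.submit(101, lambda: print("txn from server two"))
q.submit(100, lambda: print("txn from server one"))
q.run_next()   # runs timestamp 100 first
q.run_next()   # then 101
```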
So we are doing locking here, but it's not two-phase locking: the timestamps tell us in what order the locks have to be acquired. One additional thing I haven't talked about: since I may need to touch data at two partitions, I have to send requests to those two different partitions to get the locks there. And before a partition says, all right, you're now the oldest timestamp, go ahead and start running, it has to wait a little bit, say five milliseconds, to see whether anybody else shows up with a timestamp lower than mine who should have been given the lock first. And if I've already given the lock to timestamp 101 and somebody shows up with timestamp 99, that timestamp-99 transaction is going to get aborted, because timestamps always have to move forward in time: it missed its window, couldn't acquire the lock, and has to come back and try again with a higher timestamp. So again, this is super, super fast if you only touch data at one partition: you don't need latching in your indexes, you don't have any latching in your data structures or your buffer pool manager, everything runs at full speed. The downside is what I was showing in my example with the network round trips: I begin the transaction, send a request, get a response, then commit. If you're going over the network for those round trips, that's actually going to be really slow, because the database is basically idle at that partition, waiting for your next statement to arrive over the network. The way systems like H-Store and VoltDB get around this is that you run all your transactions as stored procedures: there are no network round trips, and everything runs really fast inside the machine. Calvin and Fauna do something different: they run your transaction once in a sort of reconnaissance mode to figure out all your reads and writes; then, when you go to commit, they run it for real and check whether your reads and writes actually match what they saw before. That lets them figure out what you're going to touch ahead of time. Yes? His question is whether we can view this as a type of lock escalation: we're doing this because we don't want to issue a lot of lock requests, and we don't want the latching to protect our data structures, so isn't grouping a bunch of tuples together and locking them all at once kind of the equivalent? So, I wouldn't call it escalation, because escalation means starting with fine-grained locks and trading up to a coarser one at runtime. But yes, you can think of a partition as a bunch of tuples locked together, and that would fit in the lock hierarchy: a table can have partitions. The difference is that here, a partition spans multiple tables: in my example, the first partition has all the customer, order, and order item records where the customer ID falls within its range. So it's not quite the same as the hierarchy we talked about last time. Okay, so again, the VoltDB guys are coming at the end of the semester, and they'll talk about this protocol more. This works great for transactions.
This works great for transactions. You can't run any faster than this, because it's single-threaded and single-versioned. It does not work well for analytical queries: if you have to touch data at multiple partitions and you want to run in serializable mode, you're now basically holding locks on multiple partitions or the entire database.

Okay, so this is the stuff that we skipped last time. One thing to point out is that everything we talked about in the last three classes assumed transactions that are just reading and writing data. We didn't worry about doing inserts, we didn't worry about doing deletes, right? We didn't worry about the number of tuples changing in our tables. But if you start doing that, then you have problems that two-phase locking and timestamp ordering actually don't solve. So let's look at the example here. Now, instead of showing reads and writes, I'm actually showing you SQL queries. So the first transaction, T1, starts and says, give me the max age from the people table where status is 'lit', right? Let's say there's some 72-year-old geezer and he's lit, right? So the value we get back is 72. But then we do a context switch, T2 starts, and it inserts a record for somebody who's lit with age 96. Now I come back, do the same query as before, and I get 96, not 72. So would two-phase locking solve this problem? I've already said no, but why? Say you're locking at the tuple level, right? I can't lock a tuple that doesn't exist yet. So this is an anomaly called a phantom. And it occurs because T1 can only lock things that exist when it runs, and another transaction comes along and starts inserting and deleting stuff, so things start appearing and disappearing, right? That can cause problems. So conflict serializability in the way we've talked about so far only works over individual items, where the set of objects is fixed. If you start doing more complicated things like counts and aggregations, two-phase locking will not help you.

So there are a couple of ways to solve this. One is to do what's called predicate locking. The idea here is that instead of locking tuple A or tuple B, I declare a predicate, the WHERE clause of my query, and say I want to lock anything that satisfies this predicate, even if it doesn't exist yet, right? So in my example, I do my lookup where status equals 'lit', and if I declare that as a predicate lock, anybody that tries to insert something satisfying that predicate would not be allowed to proceed, because I hold the lock for it. So this sounds like exactly what we need, and in theory you'd be correct. These are ideas they came up with in the 1970s at IBM. But in practice this is really hard to do, and nobody does it, right? For simple equality predicates it's fine, but once you start talking about ranges, conjunctions, and disjunctions, it becomes a big mess, because it's a multi-dimensional space and you're trying to decide whether your predicate intersects with some predicate lock that somebody else already holds. So nobody actually does this. The HyPer guys, the Germans, do something slightly similar to this, but not exactly the same. The general idea of locking based on conditionals like this, nobody actually does.
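To see the phantom from the example above end to end, here's roughly what it looks like from an application, sketched with psycopg2 against a hypothetical people(name, age, status) table. The connection string, table, and data are placeholders matching the lecture example, and whether you actually observe the phantom depends on the isolation level: Postgres's default READ COMMITTED shows it, while REPEATABLE READ would hide it.

```python
import psycopg2

# Placeholder DSN and schema: assumes a table people(name, age, status)
# that already contains a 72-year-old with status 'lit'.
conn1 = psycopg2.connect("dbname=test")
conn2 = psycopg2.connect("dbname=test")
t1, t2 = conn1.cursor(), conn2.cursor()

t1.execute("SELECT MAX(age) FROM people WHERE status = 'lit'")
print(t1.fetchone())   # -> (72,)

# T2 sneaks in a new row that satisfies T1's predicate and commits.
t2.execute("INSERT INTO people VALUES ('grandma', 96, 'lit')")
conn2.commit()

# Tuple-level 2PL could not have locked this row for T1, because it
# did not exist yet. Rerunning the same query in the same transaction:
t1.execute("SELECT MAX(age) FROM people WHERE status = 'lit'")
print(t1.fetchone())   # -> (96,) at READ COMMITTED: a phantom
conn1.commit()
```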
But we can actually rely on the indexes we built, the B+trees, to do something that's very similar to predicate locks, or almost exactly the same thing. The idea is that with index locking, we lock individual nodes in our index and use that as a stand-in for predicate locks. So in this case, status equals 'lit': if I have an index on status, I can find where all the entries with status equals 'lit' live in my index, take a lock on that node in the index, and that's essentially the same as taking the predicate lock. Again, this only works if an index already exists on the attribute you want to lock, right? Where things get tricky is when there's no record with status equals 'lit' in your index. Then you end up having to take a virtual lock that says, had there been a record with status equals 'lit', here's the location, the gap, where it would have been, and you take locks on those gaps. And when I say a lock, I'm talking at a physical level inside the index, but it falls under the same logical locking protocol of two-phase locking, so I still need deadlock prevention or deadlock detection to make sure everything turns out okay. If I don't have an index already created, I can't do this. The alternative is to just take a lock on every single page in the entire table and prevent anybody else from inserting anything while I'm doing my scan. Of course, that's not really feasible, right? I could also take a lock on a page that lets you read things but not do any inserts or deletes; you can update things, and I can check that as you go along. But as far as I know, nobody actually does this either.

The third approach is to do what are called repeated scans. Basically, when I go to commit, either in two-phase locking or timestamp ordering, I just rerun all of the scans my transaction did when it ran and see whether I get back the same result. If I do, then I know nobody else has changed anything and I'm allowed to go ahead and commit. If I don't, then somebody inserted something I should have seen, and I have to roll back and restart. The only commercial system I know of that does this last approach is Hekaton, which is an in-memory storage engine for Microsoft SQL Server; it's an option there. And Silo, an academic system out of Harvard, also does this.
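Here's a minimal sketch of that repeated-scans idea, in the spirit of what Hekaton and Silo do but with entirely made-up names and a toy in-memory table, just to show the shape of the commit-time check.

```python
# Toy in-memory "table": a dict of key -> row, where a row is any
# object the predicate can inspect. All names here are illustrative.

def run_scan(table, predicate):
    # Return the set of keys matching the predicate right now.
    return {key for key, row in table.items() if predicate(row)}

class Txn:
    def __init__(self, table):
        self.table = table
        self.scan_set = []   # (predicate, result) pairs recorded at runtime

    def scan(self, predicate):
        result = run_scan(self.table, predicate)
        self.scan_set.append((predicate, result))
        return result

    def validate(self):
        # At commit, rerun every scan the transaction performed. If any
        # result set changed, a phantom slipped in: abort and restart.
        return all(run_scan(self.table, pred) == old
                   for pred, old in self.scan_set)
```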
So, as I mentioned last class, this is an example of the anomaly called a phantom: I read something, it wasn't there, I read again, and it appears. That shouldn't happen if I'm running at the serializable isolation level. But as I said, the dirty secret about transactions in the real world is that most applications don't actually run in serializable isolation mode, right? Serializable isolation is sort of the strictest thing (it's actually not the strictest, but for our purposes it is), because it generates schedules that are equivalent to a serial ordering. But enforcing that can be very expensive, and as we saw, if we have conflicts that violate the serializable ordering, we end up having to abort transactions, and that's wasted work.

But in reality, a lot of applications are okay with some conflicts, because they won't actually affect the correctness of what's going on. So this is where isolation levels come in. Isolation levels allow you to control what behaviors, what anomalies or conflicts, transactions are potentially allowed to see. And the reason you'd want to run at a lower isolation level is that you allow transactions more parallelism, because the DBMS isn't gonna worry about, oh, you shouldn't be able to read that, let me kill you; it just says, yeah, go ahead and read it, that's fine. We've already talked about dirty reads and unrepeatable reads, and the example I just showed with the status and the index locking is an example of a phantom read: I read something, then it's not there when I read it again, or I did a scan on a range and things popped in and out within that range. On the slide I'm saying "no" in green, because from an academic standpoint it's a good thing that you don't have these anomalies. And instead of saying "yes," I'm saying "maybe," because it depends on what other transactions are running at the same time and also on how the data system actually implements these isolation levels. We don't have time for a demo right now, we'll do it next class, but in the case of Postgres, you can tell it you want read uncommitted, and it actually won't give it to you, right? Because of the way Postgres is implemented, it would have to do extra work to make it truly read uncommitted, and that's pointless; who cares, right? You're sort of getting something for free.

So the way you'd actually implement this, at least in the context of two-phase locking: for serializable isolation, you acquire all the locks, plus the index locks to avoid the phantoms we talked about before, plus you do strict 2PL. If you do all three of those things, you get the serializable isolation level, and that's what we've talked about so far. If you want repeatable reads but not serializable, you skip the index locks, because you don't care about the phantom example I showed before. Then as you go down the levels, you turn off more and more stuff. So the way to set this when you run transactions, you can do it in two ways: you have the SET command, or when you begin a transaction, you can say what isolation level you want to start with. I believe this is in the SQL standard. Most systems will support this, except, I think, MySQL, at least in 5.7, and I don't think SQLite supports it, right? And as I said, not all systems actually support all of these isolation levels. Some will lie to you and say, yes, I have the serializable isolation level, when you're really at a lower one (Oracle does this), and some will lie to you and say you're running at a lower level when you're actually at a higher one. The only way you actually know what level you're really at, based on the textbook definitions of these anomalies, is to read the manual, right?
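As a concrete sketch of those two options, again using a placeholder psycopg2 connection. Postgres accepts both forms; as just noted, exact support varies by system.

```python
import psycopg2

conn = psycopg2.connect("dbname=test")   # placeholder DSN
cur = conn.cursor()

# Option 1: the SQL-standard SET command, issued before the
# transaction does any other work.
cur.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")

# Option 2 (where supported): declare the level when the transaction
# begins. In Postgres this would be:
#   BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
# (Shown as a comment here because psycopg2 issues BEGIN implicitly.)
```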
There are also scenarios, when you start looking at distributed systems, where a system runs at one isolation level when all your transactions touch a single box, but drops to a lower isolation level when you touch another machine. These things get very complicated very quickly. So this is a table that Peter Bailis, a professor at Stanford, put together on his blog in 2013. He looked at a bunch of very popular database systems and asked: what is the default isolation level they provide, and what's the maximum isolation level they provide? The first thing to point out is that only two of them, Ingres and VoltDB, give you serializable isolation by default, right? In VoltDB, at least last I checked, you actually can't run at a lower isolation level; it won't do that for you. For Ingres, I actually don't know. Now, both of these systems, Ingres and VoltDB, were companies founded by my PhD advisor, Mike Stonebraker. He's an academic who built database systems, so he went for serializable. Everybody else is running at a lower isolation level, right? Most of them run read committed. MySQL runs at repeatable read, which is actually stricter. And again, for a lot of applications, this is fine, right? Facebook is running at repeatable read on one of the largest, actually the largest, MySQL installation in the world, and for them that's fine. In terms of maximum isolation level, again, some of them support serializable, some don't. The one thing to point out here, though, is Oracle: their highest isolation level is this thing called snapshot isolation. We'll talk about this next class, but I don't think you can declare "I want snapshot isolation" in SQL, right? You say you want the serializable isolation level, but what it actually gives you is something lower called snapshot isolation. The details of this don't matter for now.

The last thing we can do is provide hints to our database system about whether a transaction is actually going to modify the database or not, right? So when you start a transaction, you can say you want it to be in read-only mode or read-write mode. If you're in read-only mode and you try to do an update, the database system will recognize that you're not allowed to do that, abort you, and roll you back. And the reason you'd want to do this is that if you can tell the database system ahead of time that you're read-only, it doesn't have to do a lot of the checks we've talked about so far. It doesn't need to maintain a write set for me, and it doesn't need to worry about any transaction conflicting with me, because I'm never gonna modify the database. So, like the SELECT FOR UPDATE I showed last class, this is an additional hint to the database system that allows it to improve parallelism.
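Here's what that hint looks like in practice, again sketched with a placeholder psycopg2 connection; the syntax shown is Postgres-style.

```python
import psycopg2

conn = psycopg2.connect("dbname=test")   # placeholder DSN
cur = conn.cursor()

# Mark the transaction read-only before it does any work.
cur.execute("SET TRANSACTION READ ONLY")
cur.execute("SELECT COUNT(*) FROM people")   # reads are fine
print(cur.fetchone())

# Any write is now rejected and the transaction gets rolled back:
#   cur.execute("UPDATE people SET age = age + 1")
#   -> ERROR: cannot execute UPDATE in a read-only transaction
conn.commit()
```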
All right, any questions about isolation levels? Okay, I'll give demos on Postgres and MySQL next class. You'll see this in further detail when we start talking about multi-versioning.

All right, so the main takeaway from the last two classes, two-phase locking and timestamp ordering, is that every concurrency control protocol I'm aware of in the literature has to be one of these two types, right? You either use timestamps to figure out the order of things ahead of time, or you use two-phase locking, locks, to prevent people from accessing stuff before they're allowed to touch it, right? The other thing I didn't include in this lecture is any benchmark results, because I don't wanna give you the wrong idea that one protocol is inherently better than another. It depends on the application, and it depends on the implementation. We'll look at some examples later in this course. They'll also show you that it's not just the concurrency control protocol that matters; all the other stuff about how you implement your system affects the runtime performance of transactions, right? If you have a crappy index, a crappy B+tree, then all these transactions are gonna run really, really slowly, no matter whether you're doing two-phase locking or timestamp ordering. So there are a lot of other factors involved in determining whether one protocol will be better than another for an application. Okay?

All right, next class we'll talk about multi-version concurrency control. This is another confusing aspect of this, because there's no single protocol called multi-version concurrency control; there is, but there isn't. Multi-versioning is the idea of maintaining different versions of objects inside our database. It's sort of like the private workspace you saw for OCC, but instead of having just one private workspace and the global database, we can have multiple versions all over the place. Okay? All right, guys, see you on Wednesday.