Real quick, some interesting events coming down the pipeline that you guys might be interested in checking out. This Thursday we have another talk in our time series database series. We have the CEO of QuasarDB, a startup out of, I think, France and San Francisco. They're a time series database built on top of Postgres, so he's coming on Thursday to give a talk, and there'll be pizza there. The other thing is that on Friday, November 10th, we're having what we're calling a hackathon for Peloton. If you don't know, Peloton is the new system we're building here at CMU, and all the projects in the spring class, 15-721, will be based on Peloton. So if you want to get familiar with the system and prepare yourself for the class in the spring, you should come check it out. The idea is that we're all going to get in a room, we'll teach you how to write a SQL function in our system, and then you team up and go off and implement something, and the goal is that by the end of the day we'll have a little party and try to merge everything in. So think of things like substring, string length, all the standard built-in SQL functions; we're going to try to support those. The other thing I'll say to motivate you to come: in the same way that in this class the first assignment determined how you got off the wait list into the course, the first assignment will do the same thing in the spring class, because our wait list is another 100-something people. So rather than just picking people at random, it'll be the order in which you complete the first assignment. So if you already understand how to write code in Peloton, you'll be at an advantage over other people. And also, if you want to do research or a capstone project or things like that, it's good to get your hands dirty on this, because it's a bit of a steep learning curve. And then the other time series talk we'll have, the last one for this semester, will be on Thursday, November 16th. We're having TimescaleDB; one of the co-founders, who is a professor at Princeton, is going to come and give a talk about their system. Timescale is another one that's built on top of Postgres. So where we're at in the course now: we've already gone through the stack on the far side here, from the disk manager up to query planning, and what we're looking at for the next couple of weeks are these auxiliary things that you have inside a database system to do concurrency control and recovery and things like that. So where we're at now for the next couple of weeks is concurrency control, and the reason why concurrency control doesn't fit directly into the hierarchy that I had before is that it permeates throughout the entire architecture of the system. It's something that all parts of the system need to understand, because in order to make sure your transactions and queries run correctly, it's not just a standalone component, right? The query executor needs to know about this, the buffer pool manager and the disk manager need to know about these things. So that's why it's off on the side here. And this machine is now back up, so I'll start Panopto over here. So again, we're hanging out over here.
The logging and recovery stuff is essentially the same thing, because, again, in order to make sure that any changes that transactions or queries make show up correctly at these different levels, we need to understand exactly what's going on, okay? The other thing I want to go over again real quickly from last class: last class, when we talked about concurrency control, we spent a lot of time talking about the difference between conflict serializable and view serializable schedules. And I said that conflict serializable is essentially what you get when any database system says it supports serializable isolation. We'll do some demos of Postgres and MySQL and we'll see this. And what we talked about last class was really about these protocols for reasoning about a schedule of transactions: if you want to interleave their operations, can you figure out whether that interleaved schedule is equivalent to a serial schedule? Does it produce a state of the database that's equivalent to one where you executed the transactions in serial order, one after the other, even though we were interleaving the operations? And I said before that in the case of view serializability, the main example we showed was one with blind writes, where the last write is the one that wins; we don't care what happens in between, and that's enough to know that we ended up with the correct result. And we said that the only way for the database system to know whether it's actually doing the correct thing and achieving view serializability is if it understands what's going on inside the program. That means you need to do program analysis or understand at a higher level what it means for something to be correct in the application, and as far as I know, no database system can do that. All right, so again, last class was really about understanding the background of concurrency control, the background of what we're trying to achieve when we want a protocol that achieves serializability. And today's class is really about understanding what one of these protocols looks like. What do we do when we don't have the schedule ahead of time and transactions are showing up, coming and going, executing arbitrary queries we don't know about in advance? That's the goal today. So we looked at this simple example before: we have two transactions in our schedule, and we have a read-write conflict here, because transaction T1 does a write on A and then tries to read it again, but in between that time, transaction T2 wrote over T1's modification to A. So when T1 reads it, it gets back a different value. And we said that this wouldn't have happened in a serial order, so this is technically not a serializable schedule. What we need now is a way to guarantee that all resulting schedules the database system is going to execute for your transactions are correct. And again, I mean this in terms of it not being a fixed schedule: transactions can show up at any time and do anything. How do we ensure that any interleaving the database system comes up with, on the fly, is a serializable schedule, or is going to be equivalent to a serial ordering?
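As a reminder of how that check works when you do have a fixed schedule in hand, here is a minimal sketch, in Python, of the precedence-graph test from last class. The schedule encoding and the names are just illustrative; they don't come from any particular system.

```python
from collections import defaultdict

def precedence_graph(schedule):
    """schedule: list of (txn, op, obj) tuples with op in {'R', 'W'}, in execution order."""
    edges = defaultdict(set)
    for i, (ti, op_i, obj_i) in enumerate(schedule):
        for tj, op_j, obj_j in schedule[i + 1:]:
            # Two operations conflict if they come from different transactions,
            # touch the same object, and at least one of them is a write.
            if ti != tj and obj_i == obj_j and "W" in (op_i, op_j):
                edges[ti].add(tj)   # Ti must come before Tj in any equivalent serial order
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle; a cycle means the schedule is not conflict serializable."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)
    def visit(node):
        color[node] = GRAY
        for nxt in edges[node]:
            if color[nxt] == GRAY or (color[nxt] == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False
    return any(color[node] == WHITE and visit(node) for node in list(edges))

# The unrepeatable-read style schedule from the slides: T1 writes A, T2 overwrites it,
# then T1 reads A again.  The graph has both T1 -> T2 and T2 -> T1, so it is not
# conflict serializable.
schedule = [("T1", "W", "A"), ("T2", "W", "A"), ("T1", "R", "A")]
print(has_cycle(precedence_graph(schedule)))   # True, i.e. not conflict serializable
```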
And so for today's class, the way we're going to do this is using locks on the individual database objects. So now we're going to introduce a new component in our database system called the lock manager. The lock manager is essentially the traffic cop or the coordinator that makes decisions about whether a transaction is allowed to have a lock or not when it asks for one. And what we're going to do in our protocol here is say that any time a transaction needs to access an object, either to do a read or a write, it first has to go get that lock from the lock manager. The lock manager has a global view of what's going on inside the system, what threads are running, what transactions are running, who holds what locks, and then it can decide whether you're allowed to have that lock or not. So let's say we start off in T1, and it wants to do a read on A, so it has to go to the lock manager and get the lock on A first. In this case here, assume there's no other transaction running at the same time, so the lock manager grants that lock on A to T1. And then T2 starts and it wants to get the lock on A. It goes to the lock manager, and the lock manager is going to deny this because that lock is already being held. So in this case, the request will stall inside the lock manager, and essentially this transaction's thread is just blocked; it's waiting to be notified later on that it can go ahead and get that lock. So then control goes back to T1, T1 does the write on A and then the read on A, and when it's finished, it goes and releases that lock. And now at this point the lock manager can say, all right, T2, you were asking for A, you can go ahead and have it. T2 does its read and write on A and then releases it. All right, this is essentially what we're going to talk about today, but we're going to talk about how to do this for real, because what I'm describing here is actually incorrect; it's not going to solve all the problems we need to solve. So for today's class, we're going to first start off talking about the different types of locks you can have. Then we're going to talk about the main protocol for today's class, which is called two-phase locking. This was actually the first concurrency control protocol, developed in the 1970s by, of course, IBM on System R. It's the first provably correct concurrency control protocol, and it's probably the most widely used protocol as well. Then we'll talk about how to deal with deadlocks, and then we'll finish up talking about extending two-phase locking to do what's called hierarchical locking, or multi-granularity locking, where we can take locks not just on individual tuples but on tables or pages; we can have different granularities. So real quick, I don't think we discussed this at the beginning of the semester, but I might have mentioned it: it's important from a database perspective to make the distinction between locks and latches. Everything we're going to talk about today, when we say locks, we mean logical locks on database objects. So in database parlance, if you say I'm taking a lock, you really mean a lock on a logical object.
And so the idea of what the lock is going to do is protect (the slide says index, but really it's the logical contents of some database object) from other transactions running in the system. And we're going to hold our locks for the duration of the entire transaction. That's not exactly true, as we'll see as we go along, but at a high level the transaction holds the locks, right? And the key is that if there is a problem with the locks we're trying to acquire or any changes that we make, then it's the database system's responsibility to roll back those changes. And what I mean by that is there isn't a low-level thread that needs to go ahead and roll back these changes; at the moment there's a problem, the database system can do this at a higher level. And then I mentioned latches earlier in the semester: latches are the low-level mutex primitives that we're going to use to protect the critical sections of the database. You guys had to do this when you built your hash table, right? You can have multiple threads coming in at the same time, so you take latches to protect the hash table so you can have concurrent readers and writers and do things in a safe manner. And the way to think about this is that our lock manager is going to have its own hash table to store information about the locks it has handed out. So you take latches to get into that hash table, and through it you acquire locks on other parts of the database. In the case where we take latches, we don't have to roll back a bunch of changes, because we just go in, do the one thing we want to do, and if we're denied because the latch is being held by somebody else, then we just spin and wait or back off and try again. So there's this great table from a survey paper by Goetz Graefe, the guy who came up with the Volcano stuff we talked about earlier, where he goes through all the different aspects of the difference between locks and latches. In today's class we're really focusing on this: the locks protect database objects from other transactions, we hold them for the entire transaction in these different modes, and we have to use different techniques to handle deadlocks. Whereas a latch is a low-level mutex that you put inside your data structure. So for this class, we're focusing on this column here; in the advanced class, we'll talk more about how you implement different kinds of latches. Okay, so in my first example, when transactions were taking locks, we just had a single lock per object. But in actuality, you can have different types of locks, and this can increase the amount of parallelism in your database system, because now you can possibly have two transactions reading the same object, and as long as they're just reading it, they can do that in parallel. And so in our database system, we're going to have a notion of shared locks and exclusive locks. A shared lock, again, is for reads; exclusive locks are for writes. And then there's this compatibility matrix that says, if one transaction holds one kind of lock on an object and another transaction wants to acquire another lock on that object, which ones can be held at the same time? And it's sort of obvious, right? The shared lock can be shared with other shared locks, and the exclusive lock can't be shared with anything, all right?
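To make that compatibility check concrete, here is a toy sketch of the bookkeeping a lock manager might do, assuming only shared and exclusive modes. It ignores blocking, queueing, and the latches that would protect this table in a real multi-threaded system; the class and method names are made up for illustration.

```python
# Compatibility matrix for the two basic modes: which request modes can coexist
# with a lock that is already held on the same object.
COMPATIBLE = {
    "S": {"S"},    # shared is compatible only with other shared locks
    "X": set(),    # exclusive is compatible with nothing
}

class LockManager:
    def __init__(self):
        self.table = {}   # object id -> list of (txn, mode) currently granted

    def acquire(self, txn, obj, mode):
        holders = self.table.setdefault(obj, [])
        for other_txn, held_mode in holders:
            if other_txn != txn and mode not in COMPATIBLE[held_mode]:
                return False          # caller must block (or abort) and try again later
        holders.append((txn, mode))
        return True

    def release(self, txn, obj):
        self.table[obj] = [(t, m) for (t, m) in self.table.get(obj, []) if t != txn]

lm = LockManager()
print(lm.acquire("T1", "A", "S"))   # True: nobody holds A yet
print(lm.acquire("T2", "A", "S"))   # True: shared is compatible with shared
print(lm.acquire("T3", "A", "X"))   # False: exclusive conflicts with the shared holders
lm.release("T1", "A"); lm.release("T2", "A")
print(lm.acquire("T3", "A", "X"))   # True once the readers have released their locks
```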
So now, when transactions want to execute, they're going to go to the lock manager and say, I need a lock on this particular object, and they have to say what type it is. And then the lock manager will make a decision about whether to allow that transaction to acquire that lock, so it can go off and do whatever it needs to do. At some point, the transaction will say, all right, I'm done with what I need to do, and it'll release those locks back to the lock manager. The lock manager can then hand them out to other transactions. And the lock manager has this internal lock table, which you end up building in the third project, that basically says: for this database object, here's the queue of transactions that either hold the lock or are waiting to acquire it. So if we go back to the example we had before, now we introduce different types of locks. We can see that T1 takes an exclusive lock on A, goes ahead and does what it needs to do, and when it's done it unlocks it. Transaction T2 then starts, takes the exclusive lock on A, does its write, and then unlocks it. And now transaction T1 wants to do a read on A, so it takes the shared lock on A, gets it because nobody else holds it, does the read, and unlocks it. What's the problem here? The same thing we saw earlier, right? It's an unrepeatable read. T1 is going to read A here, but it's not reading the A that it wrote; it's reading the A that T2 wrote. But in the locking protocol I showed before, where you just lock and unlock as you need, this is technically allowed to happen. But we know it's not serializable, right? We have a conflict here. We should really be reading our own write, and this wouldn't happen if you executed these in serial order. So this is telling us that just having a lock manager and different types of locks isn't enough; we actually need, my battery's running low. Awesome, all right, should be good, sorry. All right, so this is telling us again that we can't just let transactions lock and unlock as they go along. We actually need a protocol that says when things are allowed to happen, and this is what a concurrency control protocol does. And this is what two-phase locking is going to do for us: it's the set of rules under which transactions are allowed to acquire and release locks while they're running. And we're going to enforce this in such a way that we can guarantee that the end state of the database will be equivalent to one where we executed the transactions in serial order. And the key thing today versus last class is that we don't have to know what the schedule is at the very beginning in order to make any decisions; we can do this on the fly as transactions come and go and execute queries, okay? So two-phase locking, as the name implies, has two phases. The first phase is called the growing phase, and this is where the transaction requests all the locks that it needs from the lock manager. The lock manager then decides whether it's allowed to have them or not, based on what other transactions are holding those locks and what types of locks they are, right?
If you're trying to acquire a shared lock and it's being held by somebody else with a shared lock, it lets you have it. The second phase is called the shrinking phase, and it begins immediately after the transaction releases its first lock. So if I acquire a bunch of locks and do whatever I need to do, and then go back to the lock manager and say, all right, I don't need this lock anymore, I immediately enter the shrinking phase. And when you're in the shrinking phase, you're not allowed to acquire any new locks; you can only release locks. If you try to acquire a lock, the lock manager will basically deny you and abort your transaction, right? So visually it looks like this: say along the y-axis we have the number of locks that my transaction has acquired over time. In the growing phase, I can keep acquiring locks, adding more and more. At some point I release a lock, and then I enter the shrinking phase. The way this is actually implemented, it's not like you give hints to the database system in SQL and say, I'm now in the growing phase, do this, I'm now in the shrinking phase, do that. This is an internal thing the database system maintains to figure out what phase you're in. The lock manager can then decide: if you've never released a lock before, I'll let you acquire this new one; but as soon as you've released one, if you try to acquire a new one, I'll stop you and abort you, all right? So the way to think about this is that you can't do this: you can't release a bunch of locks and then try to acquire more, because that would be a violation of the two-phase locking protocol. So now let's go back to the example I showed before. Now what will happen is T1 starts, takes the exclusive lock on A, does the read and write. Then there's a context switch over to T2; it tries to acquire the exclusive lock on A, but T1 hasn't given that up yet, so it's denied and it stalls. Once T1 releases that lock, T2 is then allowed to start running, do its update, and then release its own lock, right? Is this equivalent to a serial ordering? Yes, right, because there are no unrepeatable reads (T1 reads its own write) and there are no lost updates. So this is good. So on its own, two-phase locking as I'm describing it is sufficient to guarantee that any schedule, any interleaving of the operations for an arbitrary number of transactions, will be conflict serializable. And the way to think about this in terms of what we talked about last class: it means that any schedule we generate will have a dependency graph, or precedence graph, that is acyclic, with no cycles between the different transactions in that graph.
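Here is a minimal sketch of how the lock manager side of that rule could be tracked per transaction: once a transaction releases any lock, it is in the shrinking phase, and any further acquire is refused. This only tracks the phase; the compatibility checks from before are left out, and the names are illustrative.

```python
class TwoPhaseTracker:
    GROWING, SHRINKING = "growing", "shrinking"

    def __init__(self):
        self.phase = {}     # txn -> current phase
        self.held = {}      # txn -> set of objects it has locked

    def acquire(self, txn, obj):
        self.phase.setdefault(txn, self.GROWING)
        self.held.setdefault(txn, set())
        if self.phase[txn] == self.SHRINKING:
            # In a real system this would abort the transaction.
            raise RuntimeError(f"{txn} violated 2PL: acquire after a release")
        self.held[txn].add(obj)

    def release(self, txn, obj):
        self.phase[txn] = self.SHRINKING   # the first release flips us into shrinking
        self.held[txn].discard(obj)

tracker = TwoPhaseTracker()
tracker.acquire("T1", "A")
tracker.acquire("T1", "B")        # still growing: fine
tracker.release("T1", "A")        # now in the shrinking phase
try:
    tracker.acquire("T1", "C")    # not allowed to grow again under 2PL
except RuntimeError as err:
    print(err)
```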
But the downside of two-phase locking as I'm describing it here is that it is susceptible to what are called cascading aborts. So let's go back to the example. Transaction T1 wants to do a read on A, then a write on A, then a read on B, then a write on B. Transaction T2 wants to do a read on A, then a write on A. So in this case, T1 starts and acquires the exclusive locks on A and B at the very beginning. And then when it does its update on A, it goes ahead and releases that lock, and this allows transaction T2 to then acquire the exclusive lock on A. But then later on, after T1 does the write on B, it's going to abort. And at this point, T2 has already read the modification made by T1, because it was allowed to get the exclusive lock on A and read the change, even though that transaction had not committed yet. So in this case, when T1 aborts, the database system is going to know that T2 read the value of A that was written by T1, but T1 aborted, so that value should never have existed in the first place, and now we have to go back and abort T2 as well, yes? Right, so this question is: how is the database system going to know that T2 read something from T1 that hasn't committed yet? Internally, the database system knows which transaction last modified this tuple; it stores that in internal metadata, so it knows this tuple was modified by that transaction. If T2 tried to commit, say right at this point before the read on B, the database system would actually stall it and say, well, you can't commit until we find out what happens with T1, because otherwise this would be, this is technically a dirty read, right? You're reading something from a transaction that hasn't committed yet. So if we want to be serializable, we can't have T2 commit and tell the outside world, yeah, here's the value of A, because technically that value doesn't exist yet; T1 hasn't committed. If we're trying to be serializable, then we don't want this to happen. When we talk about isolation levels, you can allow that to happen, because you assume these things are rare. But in this case, if we want to be truly correct and follow serializable schedules, then we can't have T2 tell the outside world about any value it read from T1, because T1 technically has not finished, and in this case we know that it aborts. So the issue, yeah, this question is: do you abort T2 or do you restart it? It's a good question. It depends on how it was executed. If it's a stored procedure, then you can just go ahead and restart it, because you know exactly all the program logic you need to re-execute. If it's done through SQL, then you can try to be smart, maybe roll back to right before the read, and if there were queries before that, you may roll those back too. In general, if you're not using a stored procedure, it'll just abort you; if you're a stored procedure, it can restart you, right? So the issue again here is that this is all wasted work. We assumed that T1 was not going to abort, so we let T2 go ahead and get that lock and do its read and write on A. But now that T1 aborts, we've got to roll that back and maybe do it over again. So if we always had this pattern, where T1 aborts and T2 keeps reading the changes that T1 made, then we're just burning cycles doing nothing, all right? So this is what a cascading abort is. So in the case of 2PL, we have the possibility of these dirty reads and these cascading aborts, so we have to do something to handle that, and that'll be strict two-phase locking, which I'll show on the next slide. The other thing I'll say about 2PL is that there will be some situations where you could allow an interleaving that would still be serializable, but the two-phase locking protocol itself would not let you do it.
We can easily come up with a bunch of examples on the whiteboard and say, here's how, if you interleave things in this way, it would still be serializable, but the two-phase locking protocol won't allow it. The last problem we have to deal with is deadlocks, and we'll talk more about that in a few slides. So the thing we want to handle is these cascading aborts, these dirty reads, which are allowed under regular two-phase locking. Under regular two-phase locking, you'd have to make sure that no transaction actually commits until the transaction that wrote the data it read has committed; you'd hold them in a waiting area to figure out the outcome of the writing transaction. But with strict two-phase locking, what we're going to do instead is never release any locks while the transaction is running. We only release them at the moment that it commits. So this is always a bit confusing for students: there's still a growing phase where you keep acquiring locks, but you never give them up along the way. Only when the transaction commits, at that moment, do you release all the locks you hold. So there's not really a shrinking phase, because you're not incrementally giving these things back as you run; you go ahead and commit, and then poof, everything goes back, all right? So the first thing we need to define is what it means to be strict, because this is actually a real term in concurrency theory. We're going to say that a schedule is strict if any value written to an object by a transaction is not read or overwritten by any other transaction running at the same time until that first transaction commits. So in my case before, where transaction T2 was allowed to read the write from transaction T1, under strict two-phase locking that's not going to be allowed, because transaction T1 hasn't committed yet. And we can enforce this because we take the exclusive lock on the object in order to do our write; strict two-phase locking says you hold that to the very end, so nobody can read it until you actually commit. So again, the advantages of strict two-phase locking over regular two-phase locking are that we're not going to have any cascading aborts, because nobody can read our dirty writes, and when we abort a transaction, it's really easy to do the rollback: you just take the before-images of the objects you wrote to and put them back over top of the objects you want to roll back. In the case of regular two-phase locking, because you could have these different dependencies, this transaction read from that transaction, and that transaction committed but read from some other transaction, you could have to do a bunch of chaining to go back and figure out how to roll back to the actual correct value. So that's a bit tricky, yes? Does it mean that strict two-phase locking, I think your question is, could it increase the latency of transactions? Because now T2 has to wait for T1 to finish before it's even allowed to read. Yes, so another way to phrase this is that strict two-phase locking is more restrictive in what schedules it allows, what interleavings it allows, and by doing that it reduces the amount of potential parallelism you can have.
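And here is a sketch of the strict variant on top of the same idea, assuming exclusive locks only: a transaction's locks are released in one batch when it finishes, so nobody can read its uncommitted writes and a rollback never cascades. Again this is just bookkeeping, no blocking or real data access, and the names are made up.

```python
class StrictTwoPL:
    def __init__(self):
        self.owner = {}   # object -> transaction currently holding its (exclusive) lock
        self.held = {}    # transaction -> set of objects it currently holds

    def acquire(self, txn, obj):
        if self.owner.get(obj, txn) != txn:
            return False                      # somebody else holds it: wait until they finish
        self.owner[obj] = txn
        self.held.setdefault(txn, set()).add(obj)
        return True

    def finish(self, txn):
        # Commit or abort: every lock goes back in one batch; only now can others see the data.
        for obj in self.held.pop(txn, set()):
            del self.owner[obj]

s2pl = StrictTwoPL()
print(s2pl.acquire("T1", "A"))   # True: T1 takes the exclusive lock to write A
print(s2pl.acquire("T2", "A"))   # False: T2 cannot read T1's uncommitted write
s2pl.finish("T1")                # T1 commits (or aborts) and releases everything at once
print(s2pl.acquire("T2", "A"))   # True: T2 only ever sees committed state
```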
The advantage is what, that there are no cascading aborts? So again, you can easily contrive a scenario like my example before, where you always have cascading aborts and the database system keeps burning through cycles and never actually gets anything done. Right, but I'm saying you could have a workload that does that. Okay, so let's look at the example here. This is my favorite example. I have a gambling problem, so we want to take $50 out of my account and put it into my bookie's account. That's the first transaction, and in the second transaction we're going to compute the total amount across the bookie's accounts and return that to the application. So in this case we have a bit more logic, and a few more locks and unlocks. And then there's this little bit here, as I showed in the last example, where I'm saying echo A plus B; this is a way to spit the result out to the client. Again, this is not something you actually do in SQL; think of it as the return statement of the transaction. This will happen inside the transaction in a safe manner, though maybe it should actually come after the commit, right? All right, so our initial database state is that my account has $100 and my bookie's account has $100. So the first thing we see, and I should say this is running the example without any two-phase locking, is that T1 acquires the exclusive lock on A, so when T2 wants to acquire the shared lock on A, that gets denied, because somebody else already holds the lock. And then when this guy tries to get the exclusive lock, that gets denied too. But now our final output is A plus B equals 150, right? Because what will happen is I'll take the money out in T1 here, then T2 reads A there and updates its running total; at that point there's $50 missing from A's account that hasn't been put into B's account yet. So when I get my final output, I'm missing $50, and the output is not correct. Now, with two-phase locking, what we get instead is that the shared lock on A gets denied. T1 is then able to take the money out of A, get the exclusive lock on B, then unlock A, put the money into B, and unlock B. So at no point under two-phase locking is the second transaction able to read the intermediate state of A and B during this operation, so it always produces the correct output. And then strict two-phase locking essentially looks like a serial ordering, because T1 gets the exclusive locks on A and B, and T2 basically has to sit and wait for T1 to unlock both of them, and then it's allowed to do its reads on A and B and produce the correct output, right? So again, regular two-phase locking is going to allow a little more interleaving, but it's susceptible to cascading aborts. It still guarantees that the interleaving will always be equivalent to a serial ordering; it always produces a serializable schedule. With strict two-phase locking, you want to avoid cascading aborts because you assume they're going to be very expensive, wasted work, so you limit concurrency, and the result essentially always ends up looking very similar to a straight serial ordering. Yes? So, regardless of the isolation level, where does this actually get enforced, inside the lock manager? So this question is, where does this actually get enforced?
It's enforced inside the lock manager, right? So the lock manager will know: all right, you're transaction T1, I've seen you before, you hold locks on these things, and now you're trying to acquire a lock or release a lock; and based on what you're trying to do and who you are, it can decide, yeah, here's the lock, you're allowed to have it, or you have to stall and wait. Say it again, what do you mean by schedule? So this question is, how do I say this: in SQL, you're not going to write lock and unlock, right? The way to think about this is that the database system sees these reads and writes, so a read basically gets expanded to be: get a shared lock on this object, then go ahead and do the read. And the lock manager isn't seeing, I need a shared lock on this and, by the way, I'm going to do a bunch of other stuff later on. It only sees that single request for that one lock at that moment in time, right? So this question is, can you look ahead in your transaction and say, I'm reading this now, but I know I'm going to need to write it later, so I'll get an exclusive lock now? Yes, so what happens is you do what's called lock upgrading, or lock escalation. Basically, if T1 takes a shared lock on A, reads it, and then says, I actually need to write this now, it can go to the lock manager and say, hey, I had a shared lock on A before, but now give me a write lock. And the lock manager can decide at that point whether to grant you that lock, because nobody else holds a shared lock; or it can decide that somebody else holds a shared lock and has a higher priority than you, which we'll discuss in a second, and therefore you're denied and you get aborted; or it can commit murder, kill the other transaction, steal its lock, and give it to you. It can do all three of those things. In some cases as well, as we'll see at the end, you can provide hints to the database system to say, I'm doing a read now, but I know I'm going to update it later in my transaction, so actually take a write lock now. That's called select for update; we'll show that in a second, yes. So the statement is that the database system doesn't know exactly when the transaction, okay, so a couple of things. How do I start a transaction, is that the first question? Right, in Postgres you just call begin. I think what you're maybe asking is, are there any cases where the database system can be told, here's exactly what my transaction is going to do ahead of time, and therefore it can make decisions with that knowledge? That's a whole other area of databases called deterministic databases. This actually came out of Yale; they were sort of the early pioneers in this, and there's actually one startup called FaunaDB that does this. What I'm talking about here are dynamic protocols, where you don't know anything about what transactions are going to do later on; you just know that one wants to do a read or a write right now, and you make a decision about whether it can have those locks or whether to deny them, okay? Okay, all right, so we go back to this view of the universe of schedules we had before. The schedules with no cascading aborts are this middle area here. And then, inside the conflict serializable region, we have strict 2PL, because strict 2PL, again, doesn't have cascading aborts.
And it's always conflict serializable, but it's not always going to be an exact serial schedule; it's a superset of the serial orderings. And regular two-phase locking covers the conflict serializable regions that are outside the no-cascading-aborts region as well: sometimes it will have no cascading aborts, sometimes they can occur, okay? All right, so that's the observation I wanted to make about 2PL. The issue you've got to deal with now is deadlocks, and we'll talk about how we're going to handle those. All right, so let's say we have a simple example with two transactions, T1 and T2. T1 takes an exclusive lock on A, and T2 takes a shared lock on B. Then T2 wants to get a shared lock on A, but it can't, because T1 holds the exclusive lock, so it gets denied and has to wait. And then T1 wants to get an exclusive lock on B, and it gets denied because T2 holds a shared lock on it. So at this point we have a deadlock, because they're both waiting for locks held by the other guy, and they're not going to give up their own locks until something happens, until they commit. So this is what we actually need to deal with: both strict two-phase locking and regular two-phase locking can have this problem, so we need something for it. And an abstract way to think about this is that a deadlock is just a cycle in this graph of which transactions are waiting for which other transactions, and we need to figure out how to break it. There are two ways we're going to deal with this in two-phase locking. The first is called deadlock detection; the second is called deadlock prevention. Deadlock detection is basically: you let transactions acquire locks in any way that they want; I mean, you obviously deny them if someone else already holds the lock, but you just let them wait indefinitely, because you hope that the transaction holding the lock they need will eventually finish up, and then they can go ahead and acquire that lock and run. And so what we're going to have to do is have a separate thread running in the background that periodically checks the lock table inside the lock manager and looks for cycles in this waits-for graph, looks for these dependencies. And then, when it detects a cycle, it has to make a decision about which transaction it should abort in order to break that cycle. Because when that transaction gets aborted, it will release all its locks, and any other transaction that was waiting to acquire those locks can now go ahead and acquire them. So, visually, look at an example like this: T1 is waiting for T2, so we put an edge in this waits-for graph to say T1 has a dependency on T2. T2 has a dependency on T3, and T3 has a dependency on T1, so this is obviously a deadlock. What we want to figure out how to do is pick one of these three transactions, kill it, and that releases its locks, which breaks this cycle, and then we can start making forward progress again.
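A sketch of what that background detection pass might look like: walk the waits-for edges the lock manager already has, find a cycle, and pick one transaction in the cycle as the victim. The victim policy here is simply the youngest transaction by begin timestamp; real systems weigh many of the factors discussed next, and all the names and timestamps are illustrative.

```python
def find_cycle(waits_for):
    """waits_for: dict mapping txn -> set of txns it is waiting on.  Returns a cycle or None."""
    def dfs(txn, path, on_path):
        for other in waits_for.get(txn, ()):
            if other in on_path:                      # found our way back: a deadlock
                return path[path.index(other):]
            cycle = dfs(other, path + [other], on_path | {other})
            if cycle:
                return cycle
        return None
    for txn in waits_for:
        cycle = dfs(txn, [txn], {txn})
        if cycle:
            return cycle
    return None

# The example from the slides: T1 waits for T2, T2 waits for T3, T3 waits for T1.
waits_for = {"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}
cycle = find_cycle(waits_for)
start_time = {"T1": 100, "T2": 200, "T3": 300}        # pretend begin timestamps
victim = max(cycle, key=lambda t: start_time[t])      # abort the youngest in the cycle
print(cycle, "-> abort", victim)                      # e.g. ['T1', 'T2', 'T3'] -> abort T3
```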
But as you can imagine, this is actually complicated to do, because there are a bunch of different decisions we have to figure out in our implementation in order to do it correctly. There are things like: how often should we actually run the algorithm? Should we be running it every ten minutes? No, because any time there's a deadlock, then we're waiting ten minutes. But maybe you don't want to run it every microsecond or every millisecond either, because then you're just burning cycles scanning the lock table all the time to figure out whether you have a deadlock. In Postgres, I think the default is one second; in MySQL, the default is ten seconds. And this is something you can tune based on what you think your application needs. And then there are questions like, how many transactions could possibly be in a deadlock, and how far should I be looking? And then we have to figure out, once we find a deadlock, which transaction do we actually want to kill? All right, so the first thing is that when we pick a victim, there are a bunch of different metrics or characteristics of transactions that we can use to decide which transaction is going to be our victim. And then, related to Gus's question earlier, what happens when you abort a transaction? Do you restart it from the beginning, or do you just restart the last query, or maybe the few queries that are holding you up? It depends on how the application actually invoked your transaction. If it's a stored procedure, you just roll back the whole thing and restart it from the very beginning, because you know exactly the entry point of the transaction. If it's done through SQL, like JDBC or ODBC, then you have no way of rolling back the program code to get back to the point where they started the transaction in their PHP or Java code. So you send back an exception that says, your transaction got aborted because of a deadlock, please restart it, and then your application code has to handle that and restart it. I can't prove this, but it's my conjecture that most people don't actually do this, and sometimes you see 500 errors or random errors on websites because they may not be handling deadlocks in their transactions correctly. All right, so for victim selection, there are a bunch of different things we can consider. We can do things like picking the transaction that is either the oldest or the newest. The idea here is that if it's the newest transaction, you can maybe restart it a couple of times without anyone really noticing that it's getting delayed; if you pick the oldest, that has other implications. You can decide based on how many queries they've already executed. So again, if I've executed a million queries in my transaction, and I'm deadlocked with another transaction that's only executed one query, it's probably better to abort them, because rolling me back and rerunning everything would be a lot of work. But it also could depend on what each one actually did: if one is only doing reads and the other is doing writes, you can make decisions based on that. You can make a decision based on the number of locks the transaction already holds. So if I have a million locks already, it was a big pain to acquire all of those, because you have to go into the lock manager and run the protocol for each one; if another guy only has one lock, then it may be cheaper to abort them instead. But they may hold a lock that nobody actually cares about, while I hold a ton of locks whose release could break a lot of deadlocks.
Maybe then you want to abort the other guy instead, right? So there are a bunch of those things you have to consider. Then there's the fact that, if we're not doing strict two-phase locking and we can have cascading aborts, rolling back one transaction may cause a bunch of other transactions to get rolled back, and therefore it would be a lot more work to abort me versus the other person. And then, above all else, related to the age: if I've been restarted a ton of times, then maybe I don't want to be restarted again. There's some threshold where I say, I've got to get this thing through to reduce my latency, so I'll let it proceed and maybe kill the other guy, all right? So the main takeaway is that there are a whole bunch of different choices you can make for picking a victim, and there isn't one that I can point to and say, this is exactly what you want to use, this will be best for all cases. It depends on the application, on what other transactions are running at the same time, on what your SLAs or SLOs are for transaction latency. It's a whole bunch of different things, and at least in the commercial systems, you can basically tune all these knobs: you can say exactly what deadlock detection or victim selection policy you want to use. It's a very complicated piece of the system. Then we had this other question: how do we decide how far we actually want to roll back? The two choices are basically, again, abort the transaction in its entirety and rerun it, or roll back just the one query that's causing the deadlock, keep all the other queries it has already executed, and let the transaction keep running. Then, when it tries to acquire that lock again, maybe in between the other transaction it was deadlocked with has already gotten the lock, done its update, and released its locks, so now when you come back and try again for that same query, you're allowed to acquire it. Different database systems do different things. So with that, let's do a demo. What I'm going to do is log into, in theory, if it loads... Have you ever seen that? Cool. All right, so I'm going to do this in Postgres first, and then MySQL. Actually, let's do MySQL first. So I have a simple table that has two columns, an ID and a value, and I'm going to do this in two terminals because we want to run transactions at the same time. And the contents are just two tuples. I've got to mute that, sorry. So what we'll do first in MySQL is set our isolation level to be serializable. By default MySQL is not serializable, so we're going to say we want all transactions to run with full serializable protections. And then we're also going to set our timeout for deadlock detection to be 10 seconds, okay? So we'll start a new transaction at the top and a new transaction at the bottom. And the first transaction is going to update the first tuple in our table, and the bottom one is going to update the second tuple. All right, so now you can see here, if I go do select star from the table...
Actually, it doesn't let me do this, because I can't read the table: the other transaction holds an exclusive lock on the tuple that it modified, so it denies my guy at the bottom here. And then we got timed out, because we couldn't acquire the lock held by the other guy. So we can roll that back now. I need to mute this, sorry. All right, so let's roll this guy back too and try it again. So we'll do begin, begin. This guy updates the first tuple, this guy updates the second tuple, and now the top guy is going to try to update the second tuple as well. So what should happen here? It has to wait, because the bottom transaction holds an exclusive lock on record two, so it tries to acquire it and gets stalled. Then we go down here and try to do the same thing: the bottom one tries to acquire the lock held by the other guy, but MySQL recognized right away that there would be a deadlock, so it tells you that the statement got denied, right? And then my guy at the top was able to acquire that exclusive lock, so it was allowed to run. So real quick, I'll say: in MySQL, when they do a rollback like this, they actually only roll back the last statement that caused the problem. So my transaction here at the bottom is actually still active; I can still insert a new value into the table. In Postgres, as we'll see in a second, it rolls back the entire transaction. And now if I also go back and do a select from the table and look at the thing I modified, I modified the second tuple, I can read my own write, right? This is all still inside the same transaction, so my change is still there. Yes? In this case here, maybe I'm setting it wrong, it killed it right away. Yeah, they might be doing that, they might be checking real quickly to see whether something's there. Question over there, yes? You want to do what, sorry? How did this thing get delayed and then get released? Okay, so what happened was: rollback here, rollback here. So now if I go back and do a select star, we're back to where we were before. Like both, maybe autocommit, all right, that's fine, okay. So let's start a new transaction, start a new transaction. This guy's going to update the first tuple, this guy updates the second tuple. Again, both transactions acquire exclusive locks on those two distinct records, so there are no conflicts. All right, so now the top guy is going to try to acquire the lock on the second tuple in order to update it, and it stalls, right? And so down here, if I now roll back my transaction at the bottom, I kill it, and immediately you'll see the top terminal gets released, because the exclusive lock that the bottom transaction was holding gets released. The lock manager says, I know the top guy was waiting for this, now you can have it, and then it immediately gets to do its update, right? Again, internally the lock manager has a table that says, I know which transactions are waiting for this lock; as soon as that lock from the bottom is released, the top guy can go get it and do the update. Say it again, sorry. Yeah, I think, I don't know if that was actually still in the same transaction. It might have been the fact that I rolled back my insert and it still allowed it to happen. No, so the top guy still holds the locks for both of those tuples.
So now the bottom, again, should be running in serializable isolation; let me make sure I do this correctly. So I started a new transaction here. Now when I do this select from the transaction demo table, I think this will get denied, or rather it will stall, because it can't acquire any of the locks held up there, right? So now if I commit the top transaction, the bottom guy will get released, and you'll immediately see all the changes that the top guy made. Well, it timed out now, so I missed it. But now here, right, I see one and two there, I see two and two there, right? All right, so this is MySQL. Let's go to Postgres now, and we'll do the same thing. Yes, question? Some things are in transactions, some things are not; is there just no guarantee, or what is the guarantee? So the question is, what happens when some things are done in the context of a transaction and other things are done outside of a transaction? Well, the way it actually gets implemented is that any query that's by itself is a single-query transaction. So all the same protections that you would expect will be there, and I can show that real quickly. All right, so now what we're going to do, I forgot to roll back my transaction, sorry. In Postgres, I'm going to set my deadlock timeout to be 10 seconds, right? And now I'm going to do the same thing. In Postgres the syntax is different: the way you start a transaction under the serializable isolation level is you just say begin transaction isolation level serializable, all right? So I'll do that here, and then I'm going to do the same thing I had before. Let me make sure my database looks correct: the table looks like that, we're back in the original form. So let me start over. All right, so the first guy is going to do the update on the first tuple, the second guy does the update on the second tuple. The top guy then wants to do an update on the second tuple. What should happen? It has to wait, right? The bottom guy wants to do an update on the first tuple. What should happen here? A deadlock, and in this case it eventually got detected and killed; it looks like, sorry, it killed the bottom one. The top guy then gets the exclusive lock and is allowed to do its update, all right? And in this case, Postgres rolls back the entire transaction, so we actually can't run any query in the bottom session; it denies us and says, you're in an aborted transaction, you can't do anything until you roll back. Whereas MySQL was sort of letting me keep going. So here's another good point too; his question was, what happens if you execute a query that's not in a transaction? So I do select star from the transaction demo table. What should happen here? All right, what's that? Yeah, so the top guy still has the write locks. It's actually a bit more complicated: Postgres is using timestamps to figure out the ordering of snapshots. So it's saying, at this point, the top guy hasn't committed yet; therefore, reading this is actually correct, because it's the state of the database as it existed before the top guy made any changes, right? So now if I go to the top and commit this, then when I go back to the bottom, I see the updates, all right?
So literally what Postgres does when you send it a query by itself is wrap it in a begin and a commit, as if it were a transaction; that's essentially what's happening, all right? Because Postgres has snapshots; we'll talk about multi-versioning and timestamp ordering in a few classes. The way Postgres implements its version of this is that it allows you to read the old data, whereas MySQL would block you. Yeah, so this question is: if you don't care about acquiring locks, because you want to execute a complex query that reads a bunch of data and would block everybody else, can you turn this off? Yes. We're not going to have time today, but I'll talk about this on Wednesday: they're called isolation levels. So you can say which anomalies, like dirty reads or unrepeatable reads, you're allowing yourself to have in order to get better concurrency. So here's what I can do; let's use MySQL, because that was the one that would block the other guy. We'll execute this transaction and make sure that this one runs in serializable isolation, and then the guy at the bottom here will run at what's called read uncommitted. This is basically zero protection in the database system: I can have dirty reads, I can have unrepeatable reads, I can have phantoms, which I'll define later on, right? So now, if I start a transaction here and start a transaction there, this is my state of the database, right, via select star from the table; it looks like this. If I do the same thing, I'll see the same result on the bottom. So now I'll do an update on the top. Okay, this one is serializable, right? The bottom guy doesn't know that the other guy is going to change something; the bottom guy has a shared lock, and this guy's trying to acquire an exclusive lock, so it's not going to let him do that, right? The bottom one doesn't care about the protections, but the top one does, and that's why the top guy got denied. So now let's try it this way instead. Roll back my change, start the top one, start the bottom one. Actually, we might need to do this here; that might be the problem. All right, so this guy at the top will do an update. The bottom guy does a select, and we'll see that update, right? See, the value is 103, right? So now, what happens here if I roll back the top transaction? That change to 103 gets wiped away. But now if I run this query again at the bottom, what will it see? It'll see 102, right? Because I said I'm running at a lower isolation level; I'm allowed to have dirty reads and unrepeatable reads, so I was able to read what the transaction at the top wrote to this table before it actually committed. And then after it rolled back, I went back to the earlier state; the change got rolled back, and now my value changed. So in some situations, for OLAP queries, if you don't care about strong consistency or correctness of the data and you're willing to have all these different anomalies, you can run at these lower isolation levels and get better parallelism. Yeah, but I'll skip that, I don't want my session blocked. Yeah, so in this case here, it looks like they were grabbing locks. Let's see what happens if I do this.
But no, you can't read at the bottom without being in a transaction without the other guy stalling, right? Again, in this case MySQL blocks and Postgres doesn't. All right, any questions about this? We'll play with this more when we do isolation levels next class. Nope. So that's deadlock detection. The alternative is deadlock prevention, and this is actually what you're implementing in project three. With this, instead of letting transactions acquire locks and then figuring out in the background later whether there's a problem, what you do is, at the moment a transaction tries to acquire a lock, you check whether somebody else holds that lock, and then you make a decision right there about what to do. So with this approach, you don't have to have a separate thread or a waits-for graph or anything in the background. At the moment they go to the lock manager and say, can I have this lock, you decide right then and there whether they should abort, or wait a little bit and maybe get killed later, or kill somebody else that holds the lock they want. The way this is going to work is that all the transactions are assigned timestamps that determine their priority, basically timestamps for when they arrive in the system. And we'll say that an older transaction has a higher priority, meaning its timestamp comes before the other one's. And there are essentially two variants of deadlock prevention: wait-die and wound-wait. The way this works in wait-die is that any older transaction that wants to acquire a lock held by a transaction younger than it is allowed to wait (and at some point you maybe time it out and kill it, but it is allowed to wait). But if you're younger than the holder, you're not allowed to wait; you get aborted. And the other one is sort of the reverse: if the younger guy comes along and the lock it needs is held by a transaction that's older, then it's allowed to wait; otherwise, if the requester is the older one, the younger transaction holding the lock gets killed and has to give it up. All right, so think about it visually like this. At the top we have T1 and T2; T1 starts first, its begin comes before the begin of T2, so it's going to have a higher priority because its timestamp is older. But T2 acquires the exclusive lock on A first, and then T1 requests the exclusive lock on A, which is being held by T2, so we have to figure out what to do. Under wait-die, T1 is allowed to wait, because the old is allowed to wait for the young. Under wound-wait, because T1 is older and therefore has higher priority than the younger transaction, the lock manager will kill T2 and hand its locks over to the older guy. In the bottom example, T1 starts earlier and gets the lock first, and T2 starts later and wants to acquire that lock. Under wait-die, T2 has to abort, because the younger transaction is not allowed to wait for the older one. Under wound-wait, T2 is allowed to wait for T1.
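Here is a small sketch of the decision both policies make when a requester hits a lock held by someone else, assuming each transaction keeps the timestamp it was assigned when it first started and that a smaller timestamp means older and therefore higher priority. The function and constant names are just for illustration.

```python
def on_conflict(policy, requester_ts, holder_ts):
    """Decide what happens when `requester` asks for a lock that `holder` currently owns."""
    requester_is_older = requester_ts < holder_ts     # smaller timestamp = started earlier
    if policy == "wait-die":
        # Old waits for young; a younger requester dies (aborts itself).
        return "WAIT" if requester_is_older else "ABORT_REQUESTER"
    if policy == "wound-wait":
        # An older requester wounds (kills) the younger holder and takes the lock; young waits.
        return "ABORT_HOLDER" if requester_is_older else "WAIT"
    raise ValueError(policy)

# T1 began before T2, so T1 has the higher priority.
print(on_conflict("wait-die",  requester_ts=1, holder_ts=2))   # T1 asks, T2 holds -> WAIT
print(on_conflict("wait-die",  requester_ts=2, holder_ts=1))   # T2 asks, T1 holds -> ABORT_REQUESTER
print(on_conflict("wound-wait", requester_ts=1, holder_ts=2))  # T1 asks, T2 holds -> ABORT_HOLDER
print(on_conflict("wound-wait", requester_ts=2, holder_ts=1))  # T2 asks, T1 holds -> WAIT
```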
So why can't there be any deadlocks with this? The issue with deadlocks before was that any transaction, in any order, was allowed to wait for any other transaction. But under wound-wait and wait-die, we're enforcing that waiting only goes in one direction: under wait-die the old is allowed to wait for the young and never the reverse, and under wound-wait the young is allowed to wait for the old and never the reverse. If we do that, there are guaranteed to be no deadlocks, because waiting always goes in that one direction. The next question is what we do when we restart a transaction. Here we always want to reuse the original timestamp we assigned the transaction. When the transaction restarts, we give it that same timestamp, because maybe it was the youngest before, but when it comes back around it will eventually be the oldest and will have priority. This avoids starvation: if a transaction keeps getting restarted and gets a new timestamp each time, it's always going to be the youngest, and therefore it's always going to get aborted. And again, this relates to Gus's question before. If the transaction is a stored procedure, this is easy to do, because when it comes back around the second time and invokes that stored procedure, you know it's the same one you had before, so you just give it the same timestamp. If it's plain SQL over a connection, that's a little more tricky, because you don't know whether a new connection starting a new transaction is the same one you had before or something completely different, so you don't really know how to reuse that timestamp. In the case of Project 3, we'll take care of this for you. All right, so that's basically deadlock prevention, and I think you're doing wait-die in Project 3, so just follow this basic protocol. All right, we have ten minutes left, so we'll sort of speed through this. The one observation to make is that in all the examples I showed before, there was a one-to-one mapping between an object in the database and a lock; the easiest way to think about this is one lock per tuple. But now the problem is, say I want to update a billion tuples. That means I have to go to the lock manager a billion times and acquire locks for all those billion tuples, and as we said, that's an expensive process, because you're setting latches on the hash table or whatever structure implements the lock table, and you have to do that for every single lock you need to acquire. With lock granularity, what we're going to allow is that when a transaction acquires a lock, it's not just on an exact tuple; it can be on some arbitrary object or element in our database. It could be an individual attribute, a tuple, a page, a table, or the whole database. The idea is that we want to allow a transaction to obtain the fewest number of locks it needs while still allowing a lot of parallelism. So there's a trade-off between coarse-grained locks and maximizing the amount of parallelism you can have. For example, when MongoDB was first implemented, it had a single database lock: any time you needed to write to the database, only one thread could do it, because it had to acquire the lock for the entire database. Whereas PostgreSQL and MySQL and the other database systems can do it at the tuple level. In the case of MongoDB, that's really simple to implement, but you reduce the amount of parallelism you have.
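Before moving on to the lock hierarchy, one quick aside on the restart rule from the deadlock-prevention discussion above: a tiny sketch of reusing the original timestamp so a retried transaction keeps "aging". The transaction-manager shape here is invented for illustration, not Peloton's or the project's actual code.

```python
# Sketch: a restarted transaction keeps its original timestamp, so it cannot
# be starved by always being the youngest (lowest-priority) transaction.
import itertools

class TxnManager:
    def __init__(self):
        self._clock = itertools.count(1)
        self._timestamps = {}            # txn_id -> timestamp assigned at first BEGIN

    def begin(self, txn_id):
        # If this transaction was aborted and is retrying, reuse its old
        # timestamp instead of handing out a fresh (younger) one.
        if txn_id not in self._timestamps:
            self._timestamps[txn_id] = next(self._clock)
        return self._timestamps[txn_id]

tm = TxnManager()
print(tm.begin("T1"))   # 1
print(tm.begin("T2"))   # 2
print(tm.begin("T1"))   # still 1 after a restart, so T1 only gets older
```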
Right, so the way to think about this is as a hierarchy. At the very top you have the database, the database has tables, tables have tuples, and tuples have attributes. We can acquire locks at any of these different levels in the tree. So if a transaction T1 comes along and wants to make a modification, maybe it takes a lock on the table, and that implicitly acquires the locks for all the elements below it. It's a hierarchy: if you acquire a lock on something's parent in the tree, you implicitly have a lock on everything below it. So let's look through an example. Say we have two transactions, T1 and T2. T1 wants to get the balance of my offshore Cayman Islands bank account, and T2 wants to increase Joy's bank account balance by 1%. The question is, what kind of locks should we acquire? Again, there's this trade-off between minimizing the number of locks we need to acquire and maximizing the amount of parallelism we can have across our concurrent transactions. For this, we're going to use the exclusive and shared locks for the leaf nodes in our lock tree. In this example the leaves are attributes, but typically you don't take locks on attributes; it's usually tuples. Now we're going to introduce a new type of lock, called intention locks, that we're going to use at the higher levels of the tree. They provide hints to other transactions that come along, saying there's another transaction doing something down below, so you can get an idea of what's going on without having to go down and scan everything. An intention lock is basically a way to say, at a higher level, that down below there will be locks in shared or exclusive mode, without having to go down and check the individual elements to figure out what's going on. This lets the lock manager figure out early whether there's something going on down below that would cause a conflict, so it can abort the transaction or deny the request right now. Because what you don't want is this: maybe there are a billion tuples, and someone needs to acquire a lock on just one tuple, but it's the last tuple in the list, some other transaction holds locks on everything else, and you don't find out until the end, when you get all the way down there, that that one transaction prevents you from acquiring the lock you need. So we have basically three types of intention locks. Intention-shared (IS) says, at a higher level, that there is explicit shared-mode locking going on down below. Intention-exclusive (IX) says the same thing for exclusive locks: there are explicit exclusive-mode locks down below. The one that always trips up students is the shared-intention-exclusive (SIX) lock, which says that the node itself is explicitly locked in shared mode, and below it there will be exclusive-mode locks. I'm rushing this, but hopefully an example will make it clearer. In terms of a compatibility matrix, it's the same idea as the one I showed at the beginning, but now it's much bigger.
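One way to write down that bigger matrix is as a small lookup table. The sketch below follows the standard IS/IX/S/SIX/X compatibility rules rather than being copied from the lecture slides; True means two transactions can hold those two modes on the same node at the same time.

```python
# Standard compatibility matrix for hierarchical lock modes.
COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

def compatible(held, requested):
    return COMPAT[held][requested]

print(compatible("IX", "IS"))   # True: the two bank-account transactions coexist
print(compatible("SIX", "IS"))  # True: a point read can proceed under SIX
print(compatible("SIX", "S"))   # False: a full table scan has to wait
```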
Now we have all these different lock modes, and what we see is that the exclusive lock is always incompatible with everything else, but with these intention locks, in some cases we can intermix them and in some cases we cannot. So let me go through one example, then we'll finish up for the day and I'll pick this up next class. Say again that my first transaction wants to read my bank account record, my balance. I want to read a single record in table R. For this we'll use only a two-level hierarchy: no database locks, no attribute locks, just tables and tuples. All I need to do is read this one tuple here. So in my lock manager, when I start acquiring locks, I take an intention-shared lock on the table, and that says that down below there's now a shared lock on that single tuple I need. You always take locks on the way down the tree. Now if transaction T2 wants to update Joy's bank account, it gets here and takes an intention-exclusive lock on the table, which is compatible with the intention-shared lock; these are hints saying that down below there are exclusive and shared locks going on. Then it can take the exclusive lock on that other tuple to do the write, and these don't conflict, so everything's fine. Any questions about intention locks? Again, I'll pick this up next class and cover it in more detail, yes? Okay, so maybe I'll jump ahead to this. So say we have three transactions, I think that's what you're asking. Transaction T1 wants to scan table R and update a few tuples, T2 wants to read a single tuple, and T3 wants to scan all the tuples. So T1 comes along and wants to scan and update a few tuples: say it reads the first couple of tuples and then does a read and a write on the last one. For this it takes a shared-intention-exclusive lock on the table, which says I'm taking a shared lock on all the tuples below because I want to read them, and then it also takes a single exclusive lock on the one tuple it actually needs to modify. So implicitly, the other tuples are covered by shared locks, but I don't have to go to the lock manager and ask for each of them, because I have the shared-intention-exclusive lock above them. In this case I only had to go to the lock manager twice, whereas if I had to take shared locks all the way at the bottom, I'd have to grab all of those locks individually. Now T2 wants to read a single tuple in R, and it's not the one we're writing to, so it can take an intention-shared lock on the table, which is compatible with the shared-intention-exclusive lock, and therefore it's allowed to go down further and get the shared lock on that one tuple. But my third transaction actually wants to scan the entire table, so it wants a shared lock on the whole table, because it's going to read everything. That is not compatible with the shared-intention-exclusive lock, so it gets denied at this point and has to wait until that lock gets released before it can go down and acquire everything else.
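Here is a rough, self-contained sketch of how those three table-level requests play out in order. Only the modes and the T1/T2/T3 request order come from the example; the lock-manager shape, function names, and the idea of printing outcomes are made up for illustration.

```python
# Sketch: the three-transaction example at the table level.
# Pairs of lock modes that are compatible on the same node (symmetric).
COMPATIBLE = {("IS", "IS"), ("IS", "IX"), ("IS", "S"), ("IS", "SIX"),
              ("IX", "IX"), ("S", "S")}

def compatible(a, b):
    return (a, b) in COMPATIBLE or (b, a) in COMPATIBLE

granted = []  # (txn, mode) locks currently held on table R

def request(txn, mode):
    if all(compatible(held_mode, mode) for _, held_mode in granted):
        granted.append((txn, mode))
        print(f"{txn}: {mode} on R granted")
    else:
        print(f"{txn}: {mode} on R blocked, must wait")

request("T1", "SIX")  # scan the table and update a few tuples
request("T2", "IS")   # read one tuple T1 is not writing: IS is fine under SIX
request("T3", "S")    # scan the whole table: S conflicts with SIX, so T3 waits
```

T3 finds out at the table level, with a single lock-manager call, that it has to wait, instead of discovering the conflict only after latching its way through a billion tuple locks.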
Right, so think about this without multiple granularities: if I want to scan the entire table, I'd start over here and take every single shared lock as I go along, and then finally get to the last one and go, oh, this one has an exclusive lock on it, I can't acquire the shared lock I need. So my transaction gets aborted, and I've done wasted work by acquiring all those locks when the last one I need is the one I can't acquire anyway. This is providing hints ahead of time that there's something going on down below, so you can figure out early whether you're going to be able to do the thing you want to do, and get denied the locks up front rather than doing a bunch of wasted work. It's sort of like cascading aborts: there, we want to avoid doing a bunch of work that we're going to have to roll back later on. In this case, the work we're trying to avoid is trips to the lock manager. Okay, so I'm going to stop here. This was a bit rushed at the end; I'll pick it up next time. And I think homework five is going out tonight, and it will be due much later. Okay, any questions? All right guys, see you on Wednesday.