So we actually have quite some content to cover today, so let me just give you a heads up: we might go slightly over today, like five minutes or so, because we have quite some content. First, quickly, administrative stuff. There was actually some miscommunication between the TAs and me on the due date of the homework. To make things simple, we are going to make the due date of homework four November the 10th. It's the same on the write-up, on Gradescope, and on the slides: November the 10th is the due date for homework four. And then for project three, the due date is the same: it's going to be November the 14th, a Sunday. And again, we are going to hold an additional office hour on November the 13th. But please don't wait until the last day to even get started, because again, many of you have used all your late days already, right? So just keep that in mind. All right. Again, today, right after this class, there will be a database talk from Pinecone about their vector database system, which is essentially used for machine learning. I think it's a co-founder of Pinecone coming to give this talk; I think the person was previously the head of an Amazon AI lab, and now he's doing this database startup. Okay, and on to our content today. Last class, we talked about the canonical concurrency control protocol called two-phase locking, as well as variants of it, such as strong strict two-phase locking, which ensure that our schedule of transactions is serializable, right? We always generate a, quote unquote, correct schedule. And beyond that, we also talked a little bit about hierarchical locking. The motivation is that it would be very costly, for example, if a transaction, or a query from a transaction, needs to scan a table with a billion tuples and has to acquire a billion locks, one for every tuple inside this table. That would be very, very costly and also kind of wasteful work. So last class, we talked a little bit about this concept of hierarchical locking, where we can choose at which level of this lock tree, if you will, a transaction is going to acquire its locks. For example, if a transaction needs to read all the data from a table, then it can just acquire a single higher-level lock on the table instead of acquiring many, many small locks on tuple 1, tuple 2, up until tuple N, right? And of course, to do that, we need locks at different levels in our system, in other words, the lock tree, and to coordinate those locks at the different levels we introduced a few concepts called intention locks. Specifically, there are three types of intention locks. One type is called intention shared (IS), which means that we are not locking anything at the current level, for example a higher level such as the table, but we are going to lock a few leaf nodes below this level, for example tuples, with shared locks. That's intention shared. And similarly, intention exclusive (IX) means we are not locking anything at the current level, but we are going to lock a few leaf nodes below the current level with exclusive locks. And lastly, shared intention exclusive (SIX) means that we are going to grant, or acquire, a shared lock at the current level, for example a shared lock on the table.
But in the meantime, I may also want to acquire a few exclusive locks on the levels below, all right? And we are going to use these three intention locks to coordinate the locks among different transactions, at different levels of the database system. So here, I'm going to talk about the specific locking protocol using those intention locks, and it's actually not that complicated. Essentially, a transaction just acquires a lock at the highest level of the database first if it needs to access all the tuples. For example, if a transaction needs to read all the tuples in a table, it will first try to acquire a lock at the table level, right? But there are a few circumstances that would prevent such a lock from being acquired; that's how we use intention locks to coordinate things. Essentially, if we want to get a shared lock or an intention shared lock on a specific node, at a specific level in our lock tree, then the transaction must hold at least an intention shared lock on the parent node. Again, to give you an example: if I want to hold a shared lock or an intention shared lock on a specific tuple, then I must hold an intention shared lock on the parent, the grandparent, et cetera, for example the table, the database, up to the root node. I need to hold intention shared locks on all my ancestors. Makes sense, right? Because on my ancestors, I declare the intention to acquire a shared lock further down, on the tuple. And similarly, if I want to get an exclusive lock, an intention exclusive lock, or a shared intention exclusive lock on a node, then I must hold at least an intention exclusive lock on the parent, grandparent, et cetera, up to the root. Using that, a transaction can coordinate with other transactions: hey, I'm accessing this part of the subtree of the entire data hierarchy, so other transactions may or may not acquire certain locks accordingly, right? And there's a slide from last class I didn't show today that has the compatibility matrix of the shared locks, intention shared locks, et cetera. When a transaction wants to acquire a lock, it follows these rules, and it also checks the compatibility matrix I showed last class to see whether it can acquire the lock or not.
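Just to make this concrete, here is a minimal sketch of how a lock manager might encode that compatibility matrix and the parent-intention rule. The matrix values follow the standard IS/IX/S/SIX/X table; the function and dictionary names here are made up for illustration, not from any particular system.

```python
# Compatibility matrix for the five modes: COMPAT[held][requested] is True
# when the requested lock can be granted alongside a lock another txn holds.
COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

# Parent rule from the protocol above: S/IS requests need at least IS on
# every ancestor; X/IX/SIX requests need at least IX on every ancestor.
REQUIRED_ON_PARENT = {"IS": "IS", "S": "IS", "IX": "IX", "X": "IX", "SIX": "IX"}

def can_grant(requested, modes_held_by_others):
    """True if `requested` is compatible with every mode already held here."""
    return all(COMPAT[held][requested] for held in modes_held_by_others)
```

For instance, `can_grant("IS", {"IX"})` is True, while `can_grant("S", {"SIX"})` is False, which is exactly the situation in the example coming up next.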
So now, to give you a specific example: we have two transactions. Transaction T1 wants to check the balance of my bank account, and transaction T2 increases Andrew's bank account balance by 1%, right? And again, we'll use exclusive and shared locks for all the leaf nodes of the lock tree I showed you before, and for everything else, the internal nodes, we'll just use intention locks to coordinate with other transactions on what we want to do. Okay. For simplicity, we're assuming that we only have two levels here. In practice, the tuple level is usually the finest, or lowest, level; people usually won't go down to the attribute level to acquire super fine-grained locks, since that just adds complexity and overhead to the overall lock management. But there may be some middle levels; for example, the page level could also be a level that people use, and of course there could be a database level. But here, assume that we only have a table level and a tuple level.

Then transaction T1 comes along. It just wants to read the record for my bank account. So to perform this read, it will first acquire an intention shared lock at the table level, because it doesn't need to read the whole table; it reads a record at the lower level under this table R node. And after that, it will acquire a shared lock on tuple 1. Then transaction T2 comes along. It wants to update Andrew's bank record; say Andrew's record is denoted as tuple N there. It will first acquire an intention exclusive lock on this node, table R, and then acquire an exclusive lock on tuple N, to get exclusive access to tuple N and update it. And here, the intention shared lock is compatible with the intention exclusive lock, right? So on table R, both locks can exist at the same time and there's no conflict. Makes sense? Any questions?

Okay, this was easy, right? But in this trivial example the intention locks seem a little bit useless. So now I'll give you a use case where these intention locks are actually useful for coordinating the scheduling of different transactions. Here, we assume that we have three transactions. T1 will scan the entire relation R and update a few tuples, okay? T2 will read a single tuple from R, just one. And T3 will scan everything in R without updating anything. Let's see how that works. So again, T1 comes along; it wants to scan everything and then update a few tuples, for example tuple N. What will it do first? It will first acquire a shared intention exclusive lock on table R. Because it needs to scan everything in table R, it would be a waste of work to acquire a shared lock on every tuple in R, so at minimum it needs a shared lock on relation R. But in the meantime, it also needs to update a few tuples, so the combination of the two makes transaction T1 acquire a shared intention exclusive lock on table R. Makes sense? And after that, it has the ability to read every tuple there, but in order to update tuple N, it needs to acquire a specific exclusive lock on tuple N. Then transaction T2 comes along; it wants to read a single tuple in R. So what does it need to do? It first needs to acquire an intention shared lock, because it's not reading everything, so it does not need a shared lock on table R, but it needs to read something down below; so it needs an intention shared lock. And it also needs to acquire a shared lock on tuple Y, because that's the specific tuple it needs to read. And again, the shared intention exclusive lock is compatible with the intention shared lock, so both locks can exist on table R. And lastly, transaction T3 also comes along, and it wants to scan every tuple in table R. In this case, what it needs to do is acquire a shared lock on table R. But what happens here is that this shared lock on table R is not compatible with shared intention exclusive: because transaction T1 is updating something down below, you cannot grant a shared lock to T3 on this table R node at the same time.
So transaction T3 has to wait. And say, for example, after a while transaction T2 finishes and its locks are gone. But at this point, T3 is still blocked, because its request is still not compatible with the shared intention exclusive lock. Only after a while, when transaction T1 finally finishes and its lock is released, can transaction T3 come in, grab this shared lock on table R, and finally read everything in the leaves, the tuples. (There's actually a small lock mark on the slide that I forgot to remove; that lock should also be gone once transaction T1 releases all its locks.) All right? Does all this coordination between intention locks and lower-level locks make sense? Okay, no questions? Yes, please. Oh, that just means that the lock is finally granted, right? Sorry, yeah. And for your earlier question: you can always upgrade. An existing lock can be upgraded to a stronger mode if that stronger mode is compatible with the locks other transactions currently hold on that particular node. You can always upgrade; we can talk about the details offline, but in general, yes, you can upgrade.

Okay. So all these hierarchical locks, to reiterate, give us the ability to acquire as few locks as possible while making sure a transaction protects all the records it needs to protect and avoids conflicts. But in the meantime, the intention locks give different transactions a way to coordinate with each other and allow as much parallelism, or concurrency, as possible, right? So, again: an intention shared lock means we intend to get a shared lock at a finer granularity. Intention exclusive means we intend to get an exclusive lock at a lower granularity. And lastly, shared intention exclusive means we are going to get a shared lock on this node but intend to get exclusive locks at the lower levels. All right?

Another thing some systems do, beyond automatically figuring out, hey, I want to acquire a shared lock on the entire table: if the system realizes that a transaction is reading many, many tuples at a particular level of the tree, it also has the option to go back to the parent node and upgrade the lock on the parent to a stronger level. For example, if a transaction did not acquire a shared lock on a table in the first place, but while it is executing the system realizes, hey, you are reading too many tuples in this table, you'd be better off just acquiring a shared lock on the entire table, then it can go back and upgrade the lock on the entire table to a shared lock, for example. This is usually called lock escalation, and it's an optimization that some systems apply.
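As a rough illustration of that escalation idea, here is a hedged toy sketch; the threshold value, class, and method names are all made up, and compatibility checks are omitted for brevity.

```python
class EscalatingLockTable:
    """Toy bookkeeping for lock escalation: count a transaction's tuple
    locks per table and swap them for one table lock past a cutoff."""

    def __init__(self, threshold=1000):  # made-up cutoff; real systems tune it
        self.threshold = threshold
        self.tuple_locks = {}   # (txn, table) -> set of locked tuple ids
        self.table_locks = {}   # table -> {txn: mode}

    def lock_tuple(self, txn, table, tuple_id, mode):
        held = self.tuple_locks.setdefault((txn, table), set())
        held.add(tuple_id)
        if len(held) > self.threshold:
            # Escalate: drop the fine-grained locks and take one coarse
            # lock of the same strength (compatibility checks omitted).
            del self.tuple_locks[(txn, table)]
            self.table_locks.setdefault(table, {})[txn] = mode
```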
So, another thing about these different locks in the system: generally, a system would not allow you to specify these locks manually. Like I mentioned before, the whole point of the database system is mostly to guarantee the ACID properties and deal with all the conflicting updates, power failures, et cetera, for the programmer, so the programmer can just focus on the core logic of the application and generate value. But in some cases, some systems will actually allow you to specify the acquisition of locks yourself, if you know what you are doing, essentially. If you know that, hey, I need a bulk change to this table, I want to, for example, calculate the interest over the entire table or update some field of the entire table, and I know what I'm doing, then the system will allow you to specify the lock as well, if you want to help the system optimize performance. There are different ways to do this in different systems: some systems would call these shared or exclusive table locks, other systems would call them read locks or write locks. But essentially, you can specify these locks with various commands. I think this is actually not in the SQL standard, so it's something many systems allow as an option if you know what you are doing, and different systems have different commands to let you do it.

But what is actually more common, and used by more people, is a specific command, or syntax, called SELECT FOR UPDATE. Instead of directly telling the system to lock the entire table or a specific tuple, what many people do, if you are writing an application or a transaction accessing a database where you know you first want to read a few records and then immediately update those records, is to use this clause to select a few tuples, read their values, and ask the system to grant exclusive, or write, locks on those tuples at the same time. That way, when the next query comes to update those tuples, you don't have to go to the lock manager again and re-acquire an exclusive lock. When you are reading those tuples, originally you would only need a shared lock, but if you use this command, SELECT something with the FOR UPDATE keyword after it, the system will try to acquire exclusive locks on those tuples right away, preparing the transaction for the upcoming update queries, if you have that intention. And again, this is just a way for programmers to hint to the system which locks it needs to acquire, to reduce overhead and potentially improve performance.
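As a small sketch of that pattern, assuming a PostgreSQL-style system and the psycopg2 driver; the table and column names here are made up for the example.

```python
import psycopg2

conn = psycopg2.connect("dbname=bank")  # hypothetical database
with conn:  # one transaction; commits on clean exit
    with conn.cursor() as cur:
        # Read the row AND take an exclusive (write) lock on it now,
        # instead of a shared lock that must be upgraded later.
        cur.execute(
            "SELECT balance FROM accounts WHERE name = %s FOR UPDATE",
            ("Andrew",),
        )
        (balance,) = cur.fetchone()
        # The subsequent update reuses the lock we already hold.
        cur.execute(
            "UPDATE accounts SET balance = %s WHERE name = %s",
            (balance * 1.01, "Andrew"),
        )
```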
So, any questions so far about hierarchical locking and these commands for telling the system which locks to acquire? Okay, no questions. We are going to move on to the next topic, a different concurrency control approach. So far in this class we have focused on the two-phase locking concurrency control protocol as well as a few of its variants, for example strong strict 2PL, or rigorous 2PL, and we also talked about hierarchical locking. The larger category that these types of concurrency control protocols belong to is called pessimistic concurrency control. By pessimistic, we just mean that you assume conflicts are going to happen very often, so before you access any record you put a lock around it, whether it's a shared lock or an exclusive lock; you lock it anyway to protect this transaction from potential conflicts. But there's another category of concurrency control protocols considered optimistic. What that means is that instead of assuming conflicts happen very often, you assume that in most cases transactions don't really conflict with each other: they may access different tuples in the database, et cetera, or the transaction rate may just not be that high to begin with. So conflicts are rare, and what you do is not grant locks at all. Instead, in most cases you pre-assign a scheduling order to the transactions, usually based on timestamps, and you just assume that these transactions can finish their execution without any conflict, in other words, equivalent to a serial execution, and then all commit. And if there actually are conflicts with the predefined serial execution order, you just deal with the conflicts afterwards. But in this case, you don't acquire locks before you access the data. So that's what we are going to focus on today.

So again, for this broader category of timestamp ordering, or optimistic, concurrency control, the fundamental intuition is to use the timestamps of different transactions to determine their serialization order. And once you determine the order, you just follow it and make sure that all the transactions get a schedule whose execution result is equivalent to this predefined order. Well, "predefined" is not entirely true; in some cases you can actually come back and modify the timestamps, but in most cases the timestamp is fixed. And to define it more formally: a transaction Ti has a timestamp smaller than a transaction Tj, TS(Ti) < TS(Tj), when the system generates a schedule of the transactions such that, in the equivalent serial order of execution, Ti always appears before Tj.

A little bit about how we assign these timestamps. So again, in most cases the timestamp of a transaction is fixed, although in some advanced schemes you can tweak it afterwards to allow potentially a little more parallelism. And an important property of these timestamps is that they always increase. You can't really first assign one transaction a timestamp of, say, 5, and a while later come back and assign another transaction a timestamp smaller than that, say 2; that would make scheduling very difficult. In practice, the timestamp is always monotonically increasing. Different systems also assign this timestamp at different points in a transaction's execution: most systems assign the timestamp of a transaction when the transaction begins, which is very straightforward, but some systems assign an additional timestamp when the transaction finishes, and some systems use both timestamps and coordinate between them to figure out the best way to schedule these different transactions.
But for simplicity here, unless otherwise specified, we just assume that the system allocates a timestamp for a transaction when the transaction begins, and sticks with that. Now, there are also different choices, or implementation strategies, for allocating this timestamp. The first would be to use the system clock: you just use the get-timestamp function of the operating system. But most systems would actually not use this approach. As you can imagine, first, it's very dependent on the operating system, and you don't really have control over that. And second, I think a week from now we will have the daylight saving shift, when we shift the clock by an hour; the timestamps in the system may then change, and that can cause lots of problems if you just rely on the system clock. So what most people do is use a logical counter: internal to the system, you just keep a counter to track the timestamps of different transactions, and when the next transaction comes, you increment this counter by one. That's what most systems do. But of course, now you have a centralized location allocating all the timestamps, and sometimes this centralized timestamp counter can become a contention bottleneck of the system; there are also ways to deal with that, but a logical counter is what most systems use. And the next option, what we call a hybrid timestamp, is more common in distributed systems, because in a distributed system, if you want a centralized logical counter, it has to live on a specific machine and you pay network traffic; but in the meantime, if you want to use the system clocks on the different machines, you have a synchronization problem: it's difficult to make sure that the clocks on all the machines in a distributed system stay synced. So in that case, oftentimes you use a hybrid clock: you use the system clock of each machine, but you also have some sort of logical counter to coordinate things without relying on all the physical clocks being synced. For this class we are not going into the details; the advanced class probably has some discussion of this.

So, today's agenda. We first have a little bit of a naming issue: all the concurrency control protocols we're going to talk about in this class belong to the category of optimistic concurrency control, and they all rely on timestamps to perform concurrency control. But there are also specific algorithms, or implementations, within this category: the first is called the basic timestamp ordering protocol, and the second is called the optimistic concurrency control protocol. These are actually the names of two specific implementations, even though all of them are optimistic and all of them use timestamps. Looking back, people probably should have come up with better names, but these are the names used to this day. And lastly, we are going to talk a little bit about isolation levels. Any questions so far?
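Just to illustrate the logical-counter option mentioned a moment ago, here is a minimal thread-safe sketch; the class name and API are hypothetical, not from any particular system.

```python
import itertools
import threading

class TimestampAllocator:
    """Monotonically increasing logical timestamps for transactions."""

    def __init__(self):
        self._counter = itertools.count(1)  # 1, 2, 3, ...
        self._lock = threading.Lock()       # the single contention point
                                            # noted above

    def begin_transaction(self):
        with self._lock:
            return next(self._counter)
```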
So first, let's talk about the very basic timestamp ordering concurrency control protocol. Again, the premise of timestamp ordering, or optimistic concurrency control, is that transactions don't need to acquire locks before they read or write objects. Instead, the system maintains timestamps on the records and uses the timestamps to guide their scheduling, or serialization. Specifically, for every object X, the basic timestamp ordering protocol maintains two timestamps: one is the write timestamp, W-TS(X), and the other is the read timestamp, R-TS(X), and it updates the read or write timestamp every time a transaction performs a read or write operation. The basic intuition of basic timestamp ordering is this: when a transaction accesses a record, if it figures out that it is trying to read or write a record that has already been read or written by a transaction that would logically happen in the future, which means the object already has a higher read or write timestamp, then my current transaction should abort, because it violates the predefined serial order of the schedule given by the timestamps. That's the basic intuition, and we'll get to the details with an example.

So again, the specific check a transaction needs to do: assuming that transaction Ti needs to perform a read operation on object X, it checks whether its timestamp TS(Ti) is smaller than the write timestamp W-TS(X) of the object it needs to read. If that happens, it means this transaction Ti is trying to read a record that has been written "in the future," which violates the serial order we defined earlier, so this transaction needs to abort. Otherwise, the transaction can read that tuple, and it updates the read timestamp of object X to be the maximum of the original read timestamp and its own timestamp: R-TS(X) = max(R-TS(X), TS(Ti)). And then, this is actually not specified in the textbook, but in practice, transaction Ti oftentimes also needs to make a copy of object X into a private workspace, so that the next time it reads object X it reads back the same value. Because if some transaction from the future modifies this value, and even though that modification could be valid, when Ti comes back and reads it again, that would be an unrepeatable read, and if that happens it has to abort. So in this case, if it has a local copy, then from then on it can just work on the copy, and it doesn't need to worry about whether the value gets modified later or not.

All right, so that was the read; now let's talk about the writes. Writes are a little more strict, because we are modifying the record. Essentially, if you want to write object X, your timestamp cannot be smaller than either the read or the write timestamp of that particular object, because you cannot write to a record that a "future" transaction has already read, and you can't really modify a record that a "future" transaction has already modified; both would violate the serial order of execution of these transactions. So in both cases you need to abort: abort if TS(Ti) < R-TS(X) or TS(Ti) < W-TS(X). And if your timestamp is greater than or equal to both, then you are allowed to write that record; you update the write timestamp, W-TS(X) = TS(Ti), similar to what happens with the read timestamp, and you also make a local copy so that later on you can read that record as well.
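Here is a compact sketch of these two rules, exactly as stated above; the data structures and names are made up for illustration, and the private-copy bookkeeping is reduced to a plain dictionary.

```python
class Abort(Exception):
    pass

# Per-object timestamps: obj -> [read_ts, write_ts]
ts = {}  # e.g. ts["A"] = [0, 0] at the "beginning of the universe"

def read(txn_ts, obj, db, private):
    r_ts, w_ts = ts[obj]
    if txn_ts < w_ts:                 # written by a "future" transaction
        raise Abort
    ts[obj][0] = max(r_ts, txn_ts)    # R-TS(X) = max(R-TS(X), TS(Ti))
    private.setdefault(obj, db[obj])  # keep a private copy for rereads
    return private[obj]

def write(txn_ts, obj, value, private):
    r_ts, w_ts = ts[obj]
    if txn_ts < r_ts or txn_ts < w_ts:  # read/written in the "future"
        raise Abort
    ts[obj][1] = txn_ts               # W-TS(X) = TS(Ti)
    private[obj] = value              # buffered in the private workspace

# Usage: db = {"A": 10}; ts["A"] = [0, 0]; read(1, "A", db, {})
```

Running this with TS(T1) = 1 and TS(T2) = 2 reproduces the schedules walked through next.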
Now let me give you an example. In the first example, transaction T1 does a read on B, a read on A, and then a read on A again; transaction T2 does a read on B, a write on B, a read on A, and a write on A. Assume that at the beginning of the universe there was some initial transaction 0 that already wrote and read all the records, so all the records' read timestamps and write timestamps start out denoted as 0. At the beginning of each transaction, we assign it a timestamp: say we assign transaction T1 the timestamp 1 and transaction T2 the timestamp 2. First, T1 reads B; we look at the read timestamp of record B, and because we are in transaction T1, we update this timestamp to 1. Then transaction T2 comes along and reads B, and similarly we update the read timestamp of record B to 2. Next, T2 needs to write B. Of course we need to check whether there's a conflict, a step I actually skipped before: before writing record B, it checks that both the read and write timestamps of B are smaller than the timestamp of transaction T2, which holds here, so it modifies B and sets the write timestamp. Here again, a read on A; you check, and it's valid, because the write timestamp is smaller than the timestamp of transaction T2, so it is allowed to read this record and update the read timestamp. And similarly, this next read is also valid: even though the read timestamp of A is higher than T1's timestamp, a read operation places no restriction on the read timestamp, only the write timestamp matters, so T1 can do that; and because the read timestamp, 2, is already greater than 1, we don't need to modify it. Lastly, there is a write on A again, and again it satisfies all our conditions, so we are allowed to proceed; essentially, both transactions commit, and there is no violation of the serial schedule we specified earlier. Makes sense?

Now another example, with somewhat similar transactions: transaction T1 does a read on A and then a write on A, and transaction T2 only performs a write on A. What will happen here? Let's go through it. First, again, T1 reads A; that's super straightforward, totally valid. Then T2 writes record A; also totally valid. But when transaction T1 comes back and also wants to write a value to A: transaction T1 has a timestamp, I didn't denote it here, but essentially timestamp 1, which is smaller than the write timestamp of record A. So this violates the serial order we defined earlier based on the timestamps of these transactions; we cannot let this commit, and we have to abort T1. Makes sense?
I assume this makes sense; no response. But one issue we can observe here is that even though, according to this basic timestamp ordering protocol, transaction T1 is going to be aborted and cannot continue, in actuality it doesn't really matter that transaction T2 did a write on A in between. Because what transaction T2 does is just a blind write: it writes a record to A and never reads it, and after it has written that record, the value of A written by T2 will be overwritten by T1 anyway. So this schedule would be perfectly equivalent to one where transaction T2's write just had no effect: you let transaction T2 happen, you tell T2, hey, you can commit, your effect is already reflected in the database, but at the end of the day it's going to be overwritten by another transaction anyway, so transaction T2 doesn't really need to know. And if transaction T2 never needs to read that record, then at the end of the day it doesn't receive any different result. That's what we talked about last week with blind writes and view serializability: this transaction schedule would be valid under view serializability, but because of this strict protocol of basic timestamp ordering, we don't allow it to happen; we always ensure conflict serializability with the basic timestamp ordering protocol.

So one optimization people have come up with that helps a little with this scenario is called the Thomas write rule, essentially invented by some guy named Thomas. What the Thomas write rule specifies is this: when a transaction wants to write, if the writing transaction's timestamp is smaller than the read timestamp of that tuple, then obviously I have to abort, because that read timestamp was set by a transaction that logically happens in the future, and if that transaction doesn't see my write, this is wrong; this case always aborts. But if my writing transaction's timestamp is only smaller than the write timestamp of that record, and doesn't conflict with the read timestamp, then I would actually just ignore this write and allow the transaction to continue, because this write, if it happened, would be overwritten by the future write anyway. If you look at the result, if you know the semantics of the transactions, the result of this execution makes no difference. Of course, if this transaction needs to read this record again in the future, it needs to check the timestamps accordingly. This definitely violates the timestamp order we specified earlier for the serial order, but from the end-result perspective, the view serializability perspective, the end result is still correct. So this is just an optimization that you can do; in all other cases, you just write the record and update the timestamp as before.
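In terms of the earlier write-rule sketch, the Thomas write rule is a one-case change; again a hedged sketch, continuing the made-up names (ts, Abort, the private dict) from before.

```python
def write_thomas(txn_ts, obj, value, private):
    r_ts, w_ts = ts[obj]
    if txn_ts < r_ts:         # a "future" transaction already read X: abort
        raise Abort
    if txn_ts < w_ts:         # a "future" write has already overwritten us:
        private[obj] = value  # keep our own copy for later rereads, but
        return                # silently skip the install (Thomas write rule)
    ts[obj][1] = txn_ts       # otherwise the normal write rule applies
    private[obj] = value
```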
So again, coming back to our earlier example, it's the same example: we have a read on A with an updated timestamp, then a write on A from T2, also with an updated timestamp, and now we have a write operation on A from transaction T1. The read timestamp of A is the same as the current transaction's timestamp, both are 1, so that part is totally valid; and even though my write timestamp, that is, the timestamp of my current transaction T1, is smaller than T2's, we would still allow it to happen in this case, right? We just blindly consider the write from transaction T2 to hold, assume that our record is already overwritten, ignore our own write, and allow both T1 and T2 to continue. And of course, when T1 continues, it needs to read the record from its private copy, because otherwise it would be wrong. As long as it only reads the record from its private copy, this transaction is still valid, and it's equivalent to a result where you first execute transaction T1, read on A, write on A, read on A, and later let transaction T2 come after and override the value written by T1. So this is still equivalent to a serial order, even though it violates the order specified by the timestamps we determined earlier. All right, does that make sense? Any questions? Do we need to go through this again? Because I don't see many responses. Yes, please. Well, T2 actually physically happened first, but logically T2 happened after. The execution of this transaction: T1 reads A, writes A increased by 1, then reads A again; after the first read, everything happens in the private copy, so this is perfectly valid, and the value written by T1 is just overwritten by the value of T2, which logically happens later even though the actual write happened earlier. Cool, sounds good.

All right, so again: the basic timestamp ordering protocol generally produces schedules that are conflict serializable, but of course, like I mentioned earlier, it doesn't allow blind writes, so it disallows certain serializable schedules that could have been allowed; essentially, the Thomas write rule makes the scheduling of transactions a little more flexible and allows a little more concurrency. And there are no deadlocks, because no transaction is ever waiting for anyone; that's part of what it means that this is an optimistic concurrency control that does not hold any locks. But there are actually a few problems. One problem is that it's possible for transactions to starve. Unlike, for example, the deadlock prevention protocols we talked about, where we have a mechanism to always prioritize transactions that started earlier, here the longer a transaction runs, the more likely it is to conflict with other transactions, and if you abort the transaction and it comes back and gets aborted again, there's no bound; long transactions have a starvation issue with this protocol. And another potential issue with the basic timestamp ordering protocol is that it also permits schedules that are not recoverable, given that we are not making any transaction wait; we just let a transaction proceed whenever the timestamps don't violate the rules we specified earlier. This is a little bit similar to what we talked about last class with the basic two-phase locking protocol. Essentially, if you only look at the timestamps, and you do not let transactions keep track of which transactions have read which records from whom, then you will generate certain schedules that are not recoverable. A schedule being recoverable means that a transaction can only commit after all the transactions whose changes it read have committed.
Otherwise, if the system does not enforce that, when a transaction reads a record from another transaction that has not committed, and that other transaction then aborts, or the system crashes at the same time, that value is simply invalid: after the system comes back, it cannot reconstruct a correct state for the transaction that read from the one that was aborted, so that result would just be lost. So again, let me give you an example. Transaction T1 does a write on A, and transaction T2 does a read on A and a write on B. Transaction T2 can perform its read and write because it has a timestamp higher than T1's, so according to the basic timestamp ordering protocol it is allowed to perform the read. But assume T2 then commits, and after a while, for example, there's a power failure and transaction T1 has to abort. Then T2 has already read a value that was written by an aborted transaction, and this is definitely invalid: transaction T2 is not recoverable if, for example, the power failure is resolved and you try to restore the database to a consistent state.

So in practice, in fact, as far as I know, no system actually implements this basic timestamp ordering protocol. Many systems implement the strong strict 2PL that I mentioned last class, and many systems implement another variant of the optimistic concurrency control protocol, which I will talk about later in this class, but as far as I know, nobody really implements basic timestamp ordering. And besides the logical issues I just described, there are also performance issues. For example, every time a transaction needs to access a record, it first needs to make a copy, and it also needs to update the read or write timestamp. This is especially problematic for transactions that only need to read a few tuples, because a read is supposed to happen much faster than a write; no matter whether you operate in memory or on disk, the read operation is always faster than the write. But here, for example, if I have a transaction that only needs to scan an entire table of, say, a billion tuples, originally I only needed to read them, but now, beyond the reads, I need to update the read timestamp of every one of those billion tuples in my table. This adds significant overhead. And of course, like I mentioned earlier, longer transactions also have the starvation issue, unlike the deadlock prevention mechanisms we talked about for the two-phase locking protocol last class. So there are quite a few performance issues and logical issues with basic timestamp ordering, and in practice people generally don't use it.

So one observation here: if the premise of optimistic concurrency control is that conflicts are rare, that in practice most transactions don't really have conflicts, so you don't need to lock things, then why are we still updating so many timestamps while these transactions are executing? If we are already assuming conflicts are rare in practice anyway, then why not have an alternative approach where we at least don't update the read timestamps of these transactions, and don't do so many conflict checks during the execution of the transaction to block operations?
Instead, we still maintain the write timestamp, because we need something to ensure correctness, but we only check whether the transaction conflicts before it commits, essentially at the end of its execution, and we just do all the checks together in a batch, instead of doing all the checks in between and installing all those timestamps, especially for read operations. That's essentially another type, another implementation, within the category of optimistic concurrency control using timestamps, and the name of this concurrency control protocol, again with a little bit of a naming issue, is also "optimistic concurrency control": that's the name of a specific implementation, not just the general category.

The interesting thing about this optimistic concurrency control algorithm, or implementation, is that it was actually invented by a person from CMU, a professor called H. T. Kung, and he invented it in 1981. He's actually not even a database professor; based on his specialties, he does more networking-related work, but he just happens to have invented one algorithm in this area. And at some point Harvard wanted to improve their computer science program, so I think they hired this renowned professor; right now he's at Harvard, even though at this point he's around retiring age; I don't know whether he has retired or not.

So in this optimistic concurrency control protocol, the database system creates a private workspace for every transaction, a little bit similar to the basic timestamp ordering protocol we talked about, and all the records a transaction operates on are copied into this private space. Only when the transaction commits does the database system start to look at which records have been read or written in this private space and whether they conflict with other transactions running at the same time. If there are no conflicts, the system finally applies all the changes to the database; and if there are conflicts, it's also simple: the system just throws away everything the transaction has done, so in that case the overhead of an abort is also low.

So let's get to the details. There are three phases in this optimistic concurrency control protocol, and there's still a little bit of a naming issue. In the first phase, called the read phase, the transaction can actually both read and write, but only in its private copies, its private workspace. The read phase just means that the transaction is not applying any changes or writes back to the database system, to the global database content, but it can both read and write records in its private copy of the workspace. The second phase is called the validation phase, where, after the transaction finishes all its read and write operations, the system just checks whether it has conflicts with other transactions. And if there are no conflicts, it enters the third phase, the write phase, to install all the updates this transaction made back into the database system, and then commit.
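Putting the three phases together, here is a minimal single-threaded sketch of an OCC-style transaction. The structure follows the read/validate/write split just described, while the class names and the simplified validation, which compares against the write sets of transactions that committed while we were running, are my own illustration, not any real system's design.

```python
class Database:
    def __init__(self):
        self.data = {}
        self.commit_log = []  # write sets of committed txns, in commit order

class OccTransaction:
    def __init__(self, db):
        self.db = db
        self.start_point = len(db.commit_log)  # txns committed before us
        self.workspace = {}                    # private copies
        self.read_set, self.write_set = set(), set()

    # --- read phase: everything goes through the private workspace ---
    def read(self, key):
        self.read_set.add(key)
        return self.workspace.setdefault(key, self.db.data.get(key))

    def write(self, key, value):
        self.write_set.add(key)
        self.workspace[key] = value

    # --- validation phase + write phase ---
    def commit(self):
        # Did any txn that committed while we ran write something we read?
        for write_set in self.db.commit_log[self.start_point:]:
            if write_set & self.read_set:
                return False            # conflict: abort, drop the workspace
        for key in self.write_set:      # write phase: install the updates
            self.db.data[key] = self.workspace[key]
        self.db.commit_log.append(frozenset(self.write_set))
        return True
```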
So here, let me give you an example. Again, assume we have transactions T1 and T2, and I denote the different phases of these transactions in this illustration. For simplicity, we also assume we just use the keys A, B, and C to identify the objects, or tuples, each with a value. And again, in this case we don't need the read timestamp anymore, I don't need to update that, but we actually still need the write timestamp, because at the end of the day we still need to check for conflicts; that is just, unfortunately, unavoidable. So here, each transaction has a read phase, a validation phase, and a write phase. Let me go through it. Transaction T1 first enters its read phase; it reads the value of record A, and later on it will write the value of record A. T1 has a private workspace where it maintains the IDs, values, and write timestamps of the objects; it first reads A and just directly copies everything into this private workspace. Similarly, transaction T2 reads record A and copies everything into its private workspace. And here, transaction T2 immediately enters the validation phase. I will talk about the details of validation later, but essentially, as you can imagine, T2 only reads a record; it doesn't modify anything, so T2 passes the validation phase, and it proceeds to the write phase, where it doesn't actually need to write anything. But once the validation phase succeeds, the transaction is already ready to commit: everything is going to be successful unless there's a power failure right in between. So at that point in time, we assign a timestamp to the transaction. This is actually a little different from the basic timestamp ordering protocol we talked about earlier, because in this case we just let each transaction execute; we don't need a timestamp during the execution of these transactions, and only after validation, when we know we are going to commit that transaction, do we assign it a timestamp. T2 gets assigned timestamp 1, its write phase has nothing to do, and transaction T2 finishes. Now, coming back to transaction T1: it needs to write a value to record A. What it does here, if you look carefully, is change the write timestamp of record A in its private workspace to a placeholder, essentially infinity, because at that point it has not been assigned a timestamp yet; it's just a placeholder. Then, in its validation phase, again I will talk about the validation details later, T1 also passes, and it gets assigned timestamp 2 at that point; it's going to succeed unless there's a sudden power failure in between. After that, it comes to the write phase, writes this record back to A with timestamp 2 as the write timestamp, and commits. So, any questions on this illustration of the three phases, apart from the details of validation that I haven't talked about yet?

Now we are going to talk about the specific phases. First, the read phase; this is sort of straightforward, and we already kind of talked about it earlier. In the read phase, a transaction just reads or writes whatever records it wants, and the only thing it needs to do is keep a private copy, a private workspace, of all the tuples it accesses, and then just read and write on this private copy. That's the whole read phase; it doesn't need to do any checks.
The most important thing is the validation phase; all the logic happens here. Essentially, when a transaction invokes the commit command, when it's ready to commit, or rather wants to commit, it immediately enters this validation phase to check for conflicts. Obviously, the database system needs to check whether the current schedule, with the current updates or modifications of this transaction, still generates a serializable schedule, and how it does that is just by checking the read/write set of this transaction against the read/write sets of the other transactions. One important thing here, a little bit similar to the deadlock prevention mechanisms we talked about last class, is that there are generally two types of checks you can perform, and in each type of check you only allow conflicts in one direction. With these checks, we say the validation succeeds only when you either have conflicts only from younger to older transactions, or conflicts only from older to younger transactions; with anything else, it would be possible for a cycle to exist in the schedule of the transactions, and you are not going to allow those to commit. And again, this is similar to the deadlock prevention mechanisms we talked about earlier: by allowing only one direction of conflicts in the validation phase, you make sure that when every transaction commits, there is no cycle in the conflict graph, so the eventual schedule is conflict serializable; and if every transaction is guaranteed that, the entire schedule is guaranteed as well. These two options are called backward validation and forward validation, and I will give you an illustration.

So here, again: when a transaction is trying to commit, it checks whether its read and write sets overlap with those of other transactions. In the first example, backward validation, we check whether the read and write set of the currently committing transaction overlaps with the read and write sets of the transactions that have already committed; because we are checking against the older transactions, it's called backward validation. Assume we are looking at transaction T2; say this is the physical time where transaction T1 has committed, we are looking at transaction T2, and transaction T3 has not committed yet. In the backward validation, we look back, at the region called the validation scope, to see whether the read and write set of my current transaction has any conflict, or overlap in other words, with the transactions that committed earlier. So in this case I prevent any conflicts from a younger transaction to an older transaction: if there is any such conflict, I immediately abort; I only allow conflicts from older transactions to younger transactions. So again, in this case, it is guaranteed there is no cycle in the graph. And of course, the other direction is called forward validation, which means that when a transaction wants to commit, it only checks whether its read and write set conflicts, or overlaps, with the transactions that are still running; if there is any overlap, then it needs to abort. This way, it avoids any conflict from an older transaction to a younger transaction.
Again, in the conflict graph, or dependency graph, it is then guaranteed that all the edges point in one direction and there are no cycles, which gives us conflict serializability. Any questions on these high-level concepts before I get into the examples? Again, the validation scope here would be this small region. So here, for the purposes of discussion, we are going to focus on forward validation; backward validation is very similar, just reversed. Each transaction, again, is assigned a timestamp at the beginning of its validation phase, and then we check the timestamp ordering of this committing transaction against all the other currently running transactions. More formally defined: the currently running transactions, because they have not entered the validation phase yet, obviously have not been assigned timestamps, but for the purposes of our discussion, and of the formal analysis, we are going to assume that we have a timestamp for every transaction, and obviously the transactions that are still running, that are not ready yet, have timestamps higher than the current transaction. So here, we are just going to look at whether our transaction is allowed to commit, whether it satisfies the conflict serializability property, when looking only at the other transactions that have timestamps higher than my current transaction, TS(Tj) > TS(Ti); we are only looking at one direction of the conflict graph.

So here, the first condition of this optimistic concurrency control is the scenario where transaction Ti completes all three phases before transaction Tj begins. Again, this is actually not a case you need to check in practice; it's just for the completeness of the theoretical analysis, because if transaction Ti already completed, it already finished its validation and write phases. But in the textbook definition of optimistic concurrency control, as well as for the completeness of the analysis, we categorize this as one scenario: Ti already completes all its phases before Tj begins, and obviously this is always valid; there is nothing you need to check. It's a very naive, you could even say trivial, example: transaction Ti just finishes everything before Tj; this is obviously valid.

Okay, second: in the second scenario, transaction Ti has completed all its phases, but Tj has not started its write phase yet. Again, for our theoretical analysis this is a scenario, but in practice you would do this check in the validation phase. Here, assuming Ti already finished all its phases, you only need to check whether Ti has written to any object that is read by Tj; you actually don't need to check which records Tj writes, since Ti has already finished all its operations. So let me give you an example. Here we have transactions T1 and T2; assume T1 does a read on A and a write on A, and T2 does a read on A. When T1 enters its validation phase... actually, let me double-check this; maybe there's a mistake. Yes, exactly: at the point T1's validation phase starts, T2 has not started its validation phase yet.
So in this case, T1 has written to a record that is read by a currently running transaction, so this is actually not allowed, and T1 has to abort, because transaction T2, before its own validation phase, has already read that record. (This slide is correct.) Okay, but now assume another case, where transaction T2 finishes its validation phase earlier, and only after that does T1 come back and start its validation phase. What's the rule here? Because T2 has already finished the validation phase, T2 has already been assigned a timestamp, so in this case, even though I didn't specify exactly which timestamps they have, T1 must have a timestamp that is higher than T2's. So even though the physical order of these read and write operations has not changed, because the validation of T2 happened earlier, it has a timestamp that is earlier than the timestamp of T1, and this schedule is actually allowed under our rules of optimistic concurrency control, because at this point T2 commits logically before T1. All right, does that make sense? Any questions?

Okay, so let me just finish the third rule, and then I'll give you a more concrete example, hopefully. The third rule is actually the one you would use most commonly in practice when you check whether a transaction is ready to commit in the validation phase: transaction Ti completes its read phase, that is, it enters its validation phase, before Tj has completed its read phase. So in this case, Ti completes its read phase while Tj is still reading or writing tuples, and you don't know which other tuples Tj might read or write in the future. In this case, you actually have to check both the tuples read and the tuples written by Tj: essentially, any tuple written by Ti can be neither read nor written by transaction Tj. This is very strict, because you don't really know what Tj will do afterwards.

So here, coming back to this example; the example has changed a little. We have transaction T1 with a read on A and a write on A, and transaction T2 with a read on B and a read on A. Look at the validation phase here: we assign transaction T1 the timestamp 1, and we look at the other transaction. Because the other transaction has so far only operated on record B, there's obviously no conflict, and we can let transaction T1 commit; all of T2's operations on record A will only happen later, and at this point we haven't seen them yet. So this is straightforward; T1 is allowed to commit. And then, after a while, transaction T2 comes back, reads record A, and of course, in its private workspace it also needs to update its copy of the value of record A. After that, when T2 enters its validation phase, it looks at whether it has written any record that is read or written by any other concurrently running transaction. In this case, T2 only did reads, a read on B and a read on A, so when transaction T2 tries to validate, it also satisfies all the rules, and it can commit as well. Alright. Okay, so we have finished talking about all the rules of this optimistic concurrency control.
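Before moving on, here is a compact sketch summarizing those three validation conditions in the style of Kung and Robinson's formulation; the Txn fields marking when each phase starts and finishes, and the set names, are my own notation for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    start: int           # when the read phase begins
    start_write: int     # when the write phase begins
    finish_read: int     # when the read phase ends
    finish_write: int    # when the write phase ends
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def validate(ti: Txn, tj: Txn) -> bool:
    """Assumes TS(Ti) < TS(Tj); True if the pair satisfies one of the
    three conditions and may therefore commit in timestamp order."""
    if ti.finish_write <= tj.start:        # (1) Ti finished before Tj began
        return True
    if ti.finish_write <= tj.start_write:  # (2) Ti done before Tj's writes:
        return not (ti.write_set & tj.read_set)
    if ti.finish_read <= tj.finish_read:   # (3) Tj still reading/writing:
        return not (ti.write_set & (tj.read_set | tj.write_set))
    return False
```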
Any confusion? Because I don't see much response. Okay, if you have any questions, I may not have time to answer too many right after class because there are other meetings, but you can either post on Piazza or come to my office hour tomorrow afternoon, and we'll be happy to discuss in more detail.

Alright, so now let's talk a little bit about the write phase. The write phase is also fairly straightforward: when the validation phase succeeds, you just write whatever changes are in the private workspace back to the database, that is, whatever records have been modified by this transaction. The only thing to note here is that most systems would only allow one transaction at a time to be in the write phase. Because even though it's called optimistic concurrency control, in the write phase we are modifying the content of the database, so you still need to take a lock, essentially, to make sure that you apply all the modifications from this transaction to the database in an atomic fashion. There are methods to make this write phase concurrent, but they are pretty complex, and in practice most people don't really do that. The argument is that because we have already finished all the operations of the transaction, and we already know which records we read and wrote, we only need to apply the new values instead of re-running joins or other query logic, so the write phase should be short enough that we can afford to take this lock. That's the argument there.

So again, the scenarios where this optimistic concurrency control would be beneficial are mostly ones where the conflict rate is low. For example, when all the transactions are read-only, even in the validation phase you don't really need to check anything. Or when the transactions are accessing disjoint subsets of the data, the validation overhead is also low, because there's nothing overlapping to check. So if the database is pretty large and the workload is not skewed, that is actually an ideal case, because the database has many records and different transactions are likely to access different parts of it. If the workload is very skewed, then it kind of degenerates to the case of two-phase locking: if every transaction is just accessing one single record, then most of the transactions would conflict and we would have to abort most of them anyway, so at the end of the day probably only one or a few transactions can commit, and under two-phase locking it would be similar. So this is beneficial when the conflict rate of the transactions is low.

Of course, there's no free lunch here. Even though it has certain benefits when the conflict rate of the transactions is low, the first performance implication is that at the end of the day you still need to copy all the data into the private workspace to operate on; otherwise you cannot just let the transaction continue without checking anything with others. This copying overhead may also give you performance degradation. The second is lesser, but could sometimes also be a bottleneck: the validation and write phases. Especially during the write phase, as I said, most systems only allow one transaction to write at a time, so if the write phase gets longer, this may at some point, in some scenarios, become a contention bottleneck.
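As a toy model of that serialized write phase, here is a sketch assuming a single global latch and a simple dictionary standing in for the database; both names are hypothetical:

```python
import threading

# A toy model of the serialized write phase: one global latch so each
# validated transaction installs its private workspace into the
# database atomically, one writer at a time.

write_latch = threading.Lock()
database = {}  # key -> committed value, standing in for the real store

def write_phase(workspace: dict) -> None:
    """Apply a validated transaction's private workspace atomically."""
    with write_latch:                 # only one transaction writes at a time
        for key, value in workspace.items():
            database[key] = value     # just installing precomputed values
```

The design bet, as discussed above, is that this critical section only installs already-computed values, so holding one global latch stays cheap.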
And lastly, an abort under this optimistic concurrency control protocol is actually more wasteful than under two-phase locking. In two-phase locking, the moment you identify that a lock acquisition causes a conflict, you immediately abort the transaction and roll back the changes. But in the case of optimistic concurrency control, no matter whether the transaction is going to be aborted at the end of the day or not, you're going to finish every operation anyway, because you assume most transactions don't need to abort. You just do all the operations, and then at the validation phase, if you realize that you need to abort, you have to throw away all the work you have done. Because you only realize that you need to abort later, you can potentially waste more work under optimistic concurrency control. So again, there's no free lunch here; there's a trade-off, pros and cons. In practice, there are many systems that use either optimistic concurrency control or the two-phase locking protocol, but as far as I know, nobody really uses the basic variant of timestamp ordering that I mentioned at the beginning of the class.

All right, so we have actually finished talking about all of the fundamental concepts of concurrency control protocols already. Like I mentioned, we need to go a little bit over, maybe five minutes, in this class, to talk about a few additional issues that people have dealt with beyond the fundamental concepts of concurrency control, and those are related to the different isolation levels database systems actually use today. But before that, any questions about either timestamp ordering or optimistic concurrency control, or the examples I talked about earlier? No? All right.

So we'll finish up with a few additional issues and the different variations of isolation levels. So far, we have only been talking about concurrency control protocols where we assume that all the transactions just read and update existing objects; we assume the set of objects is fixed, and we haven't really dealt with scenarios where we can insert or delete tuples. So let me give you an example; this problem is called a phantom, by the way. Assume that we have this table of people with ID, name, age, status, and so on. Here, transaction T1 is going to select the maximum age from the table where the status equals 'lit', and assume that we get a value of 72 as the maximum age. Then assume transaction T2 comes along and inserts into the table a record with age 96 and status also equal to 'lit'. What happens here is that transaction T2 can commit, and after that, when transaction T1 executes the same query again, it will actually get a different value. It's an unrepeatable read: even though transaction T2 is not modifying any record that T1 has read, it just inserts a new record, transaction T1 still has an anomaly, where it reads a value for the second time and the value has changed. This is not what we would call a correct read operation. This problem is called the phantom problem. So what's really going on here? We spent these two classes talking about all these different concurrency control protocols, protecting this and that, and the confusing optimistic concurrency control, and at the end of the day we are not guaranteeing a correct result in our database? The reason is that the fundamental theory and protocols of serializability only guarantee that a schedule is correct, or serializable, if the set of objects is fixed.
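As a toy replay of that interleaving, here is the anomaly against a plain in-memory list standing in for the people table; this shows only the ordering of operations, not a real DBMS, and all rows and names are made up:

```python
# T1 reads MAX(age) WHERE status = 'lit', T2 inserts a matching row and
# commits, then T1 re-reads and sees a different answer: a phantom.

people = [
    {"name": "Ann", "age": 72, "status": "lit"},
    {"name": "Bob", "age": 50, "status": "lit"},
]

def t1_query():
    # T1: SELECT MAX(age) FROM people WHERE status = 'lit'
    return max(p["age"] for p in people if p["status"] == "lit")

assert t1_query() == 72                                      # T1's first read
people.append({"name": "Dan", "age": 96, "status": "lit"})   # T2 inserts, commits
assert t1_query() == 96                                      # T1 re-reads: a phantom
```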
But in practice, of course, things are not this way. There are situations that are not handled by the fundamental concurrency control theory and protocols, and the phantom problem is one example; it arises when you insert or delete records in the database. In general, there are three ways to deal with the phantom problem, and I will just quickly go through them.

The first is to re-execute scans. This is actually kind of straightforward. When you first execute any query with a WHERE clause, you record the read set of that query. At the end of the transaction, before you try to commit, you execute all the queries with WHERE clauses again, and you check whether the read sets of all these queries are the same as the read sets from the first time you executed them. If yes, you commit; if no, you abort. Straightforward; this works, and it is correct, but of course it is very costly. Still, it is one way to deal with it, and I'll show a little sketch of this after we go through all three methods.

The second is called predicate locking. Again, this is rarely really used, because it's very difficult to implement; the only system that sort of implements it is an advanced system called HyPer, and even HyPer doesn't implement it exactly. The basic idea is that you try to map out the high-dimensional space defined by the WHERE clauses of your scans, and you lock that hypothetical high-dimensional subspace of the entire space of the table. So again, for the same table, because we only have two dimensions here, age and status, we can draw a rectangle representing the entire space of records. When the first transaction comes along trying to read records with a WHERE-clause selection, that selection corresponds to a hypothetical subspace of the high-dimensional space containing all the records with status 'lit'. Under this method, the system locks this entire mathematical space. Then when the next transaction comes along, because it's trying to insert a record with age 96 and with status also equal to 'lit', that record belongs to the space that has already been locked, so the system detects that this is invalid and aborts it. But again, in practice it's very difficult to compute and maintain those hypothetical subspaces of the space of all records, so people rarely use it, and even the one system that does it doesn't implement it exactly.

What is actually very common is something called index locking, where you assume the existence of an index. What you do is look at the index data structure, and for a particular predicate, for example status equals 'lit', if you have an index on the attribute status, then you just lock the pages within that index containing entries with status 'lit'. Essentially, with the assistance of the index, you lock a subtree in this index, or a few pages if it's a leaf node, and then you can prevent other transactions from accessing those same records through that index. Of course, there could be cases where there is no entry with status equal to 'lit' in that index; in that case, you have to do a little calculation to figure out which node such a record would be inserted into if a record with status equal to 'lit' existed, and you lock that higher-level node and its entire subtree.
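Here is that promised sketch of the re-execute-scans idea in Python: remember each WHERE-clause predicate a transaction ran and the row ids it matched, then re-run every scan at commit time and compare. The Txn and scan names and the dict-based rows are all hypothetical, for illustration:

```python
# Re-execute scans: record (predicate, read set) pairs during the
# transaction, then re-run each scan before commit. Any difference
# means a row appeared or vanished under a predicate we depend on.

def scan(table, predicate):
    """Return the ids of the rows matching a WHERE-clause predicate."""
    return {row["id"] for row in table if predicate(row)}

class Txn:
    def __init__(self, table):
        self.table = table
        self.scans = []                       # (predicate, read set) pairs

    def query(self, predicate):
        ids = scan(self.table, predicate)
        self.scans.append((predicate, ids))   # remember what we saw
        return ids

    def can_commit(self):
        # Re-execute every recorded scan and compare read sets.
        return all(scan(self.table, p) == ids for p, ids in self.scans)

table = [{"id": 1, "age": 72, "status": "lit"}]
t1 = Txn(table)
t1.query(lambda r: r["status"] == "lit")
table.append({"id": 2, "age": 96, "status": "lit"})  # concurrent insert
assert not t1.can_commit()                           # phantom caught at commit
```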
Then, when another transaction comes along, it needs to check whether that subtree has already been locked. And, of course, what if you say, hey, there's no index on status at all? Well, then you just fall back to the original methods: you either do the re-scanning method, or you just lock the entire table. That's what you would do as well.

So that brings us, and let me just use two more minutes to finish this, to a little discussion of isolation levels in practice. Because, you see, in practice, even if you guarantee serializability, there are these and other problems beyond the fundamental theory of serializability, and even for the serializable isolation level there are lots of checks you have to do and lots of performance implications, like copying records, doing lock checking, and so on. So in practice, most systems don't just provide a single isolation level; they actually allow multiple isolation levels. They implement those features and let you pick and choose how strong you want the isolation level to be, essentially, which types of anomalies you are willing to allow. For example, there are a few anomalies we talked about earlier: dirty reads, unrepeatable reads, and phantoms. The system can let the user specify, hey, I allow dirty reads, or I allow unrepeatable reads, or I allow phantoms, and in those cases the system can restrict its checks and potentially get better performance. These combinations have different names as the different isolation levels. For example, at the highest level, serializable, you don't allow phantoms, you don't allow unrepeatable reads, in other words all reads are repeatable, and you don't allow dirty reads. In the repeatable reads isolation level, you actually allow phantoms to happen, so the system does not need to do the predicate locking or the re-scanning I mentioned earlier. In the read committed level, you allow not only phantoms but also unrepeatable reads, which allows more flexible scheduling. And in read uncommitted, you allow anything to happen, including dirty reads. The system actually gives you the option, because in some cases you may just not care about a specific protection.

I will stop here. Next class I will give you a few specific examples of isolation levels, and then we'll start from there and talk about the last type of algorithm and implementation in the concurrency control topic, which we call multi-version, and then we'll finish up the discussion of concurrency control. Alright, thanks everyone.
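For reference, here is the textbook ANSI SQL hierarchy of those four levels, written as a small Python table of which anomalies each level permits; real systems differ in the details, so treat this as the standard version rather than any particular system's behavior:

```python
# Textbook ANSI SQL isolation levels; True means the anomaly may occur
# at that level. Real systems vary in how exactly they implement these.

ISOLATION_LEVELS = {
    "SERIALIZABLE":     {"dirty read": False, "unrepeatable read": False, "phantom": False},
    "REPEATABLE READ":  {"dirty read": False, "unrepeatable read": False, "phantom": True},
    "READ COMMITTED":   {"dirty read": False, "unrepeatable read": True,  "phantom": True},
    "READ UNCOMMITTED": {"dirty read": True,  "unrepeatable read": True,  "phantom": True},
}
```

In standard SQL you would typically pick a level per transaction with a statement like SET TRANSACTION ISOLATION LEVEL READ COMMITTED.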