Dear students, today we are in lecture number 45 of the database management system course. As the final topic of our course, we were discussing transaction management. In the first lecture on transaction management, we studied that it has basically two parts: one is database recovery and the second is concurrency control. In our previous lecture, we were discussing concurrency control, and we studied the different problems we face if we do not control concurrent access properly. We studied that there are three major problems due to concurrent access, and if we do not control it properly, those three problems may transform the database into an inconsistent state. And I told you that as a designer, as a database developer, as a database professional, it is your prime responsibility to maintain the consistency of the database, because a consistent database is a correct database, and only then is it of use to the organization, to the users and to yourself; otherwise it is of no use. So, if we control concurrent access properly, then we will not encounter those three problems. We studied that the major techniques applied for concurrency control are the locking mechanism and time stamping, and in our previous lecture we were discussing the locking mechanism. Locking is based on the idea that if a transaction wants to perform any operation on an object, it should first apply a lock on it. We studied that there are basically two types of locks: one is the shared lock and the other is the exclusive lock. The shared lock is applied on an object if you want to perform a read operation, and if you want to perform a write operation on the object, then we apply the exclusive lock. Then we studied the compatibility matrix.
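The shared/exclusive compatibility matrix just mentioned can be sketched in a few lines of Python. This is a minimal illustration, not from the lecture slides; the names `COMPATIBLE` and `can_grant` are my own.

```python
# Basic lock compatibility: only shared + shared is compatible.
# 'S' = shared (read) lock, 'X' = exclusive (write) lock.

COMPATIBLE = {
    ('S', 'S'): True,   # two readers may share the object
    ('S', 'X'): False,  # a writer must wait for readers
    ('X', 'S'): False,  # a reader must wait for a writer
    ('X', 'X'): False,  # two writers can never coexist
}

def can_grant(held, requested):
    """Return True if `requested` can be granted while `held` is in place."""
    return COMPATIBLE[(held, requested)]

print(can_grant('S', 'S'))  # True  - both transactions only read
print(can_grant('S', 'X'))  # False - the writer has to wait
```

Only one of the four combinations is compatible, which is exactly the matrix described in the lecture.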
That means, if a lock has been applied on an object and another transaction applies a lock on the same object, what is the possibility that we may grant both locks at the same time? We studied that only shared and shared locks are compatible with each other; the remaining three combinations are not compatible. That is what we studied in the lock compatibility matrix, and that is the point we were discussing at the end of the previous lecture. From that point onward, we will continue in today's lecture. What is the idea behind locking? Before performing an operation, a transaction applies for a shared or exclusive lock on an object. If the object is free or the lock is compatible, the lock is granted; otherwise the transaction will wait. And I told you, the transaction that applies for the lock second will wait until the lock is released by the first transaction. When the lock is released by the first transaction, the second transaction receives the lock on that object and may then continue with its processing. Given that a transaction has to wait if the lock it applied for is not compatible with a lock already granted on the object, this phenomenon of waiting may lead to a situation that is called deadlock. What do we mean by deadlock? Deadlock is a situation when two transactions are waiting for each other to release a lock. Consider a scenario: transaction A applied a lock on an object and was granted the lock. Transaction B applied a lock on another object and also got its lock. But after some time, transaction B applied for a lock on the object held by transaction A; because that object was held, transaction B starts waiting. Then, in its further processing, transaction A applies for a lock on the object that has been locked by transaction B, and now transaction A waits as well.
So now this is a situation where both transactions are waiting for each other to release a lock, because each has locked an object the other needs. This situation is called deadlock, and if they remain in it, they will remain there forever until we find some solution. The transactions involved in a deadlock keep on waiting; unless you or the DBMS does something about the deadlock, it cannot resolve by itself. Here is an example of deadlock, as you can see on the screen. Transaction A applies for an exclusive lock on item X, and because item X was not locked before, transaction A gets the exclusive lock on X and proceeds. After that, transaction B applies for a lock on item Y, and because item Y was free, transaction B gets that lock. At time T3, transaction A applies for a lock, let us say a read lock, on item Y; because item Y has already been locked by transaction B, transaction A enters a wait state. At time T4, transaction B requests item X, and because it is held by transaction A, it also enters a wait state. At times T5 and T6, you can see that both transactions are in the wait state. So, as I told you, they will remain in the wait state forever until and unless someone does something about it, and that someone is either the DBA or the DBMS, and generally it is the DBMS. How do we handle deadlocks? There are two approaches: one is deadlock prevention, the other is deadlock detection and resolution. Dear students, it is not always possible to prevent a deadlock. Why? Because to prevent a deadlock, we have to judge beforehand.
Beforehand means: before the execution of a schedule, we would need to know that this schedule involves a deadlock, and to know that, we need to know in advance precisely which data items and which types of locks are required by a transaction. Well, in some situations you may know in advance all the data items that will be required by a transaction and all the operations that will be performed on them; in such cases you can know in advance what sort of locks on what items a transaction will need. But in many cases this cannot be known in advance. Why? Because in many situations the execution of the transaction depends on some condition: if this, then do this, otherwise do that; if this, perform this loop, otherwise perform that loop. When the execution is condition dependent, dependent on some input and so on, you cannot guess what sort of locks on what items a transaction will require. If you say, lock every data item that appears anywhere in the transaction, well, in many cases that may require locking the entire database, and that would be very inefficient. As I told you, when efficiency is required, you cannot blindly go for a single option unless it is some very specific situation. Most of the time you have to find a balance: if you are gaining an advantage, you have to sacrifice something as well, so do not go to one extreme to take an advantage while sacrificing many other things; always try to keep a balance. So what I am telling you is that deadlock prevention is not possible in most cases. That is why the approach generally used for deadlock handling is detection and resolution. So let us discuss how we detect a deadlock and how we resolve it.
Deadlock is detected by the wait-for graph. So, what is a wait-for graph? A wait-for graph is a graph that consists of nodes and links: a node represents a transaction, and a directed arrow represents which transaction is waiting for which transaction. As you are seeing on the screen, if transaction A applies for an object that has already been locked by transaction B, this fact is shown in the wait-for graph by a link between TA and TB with the arrowhead toward TB. It means that transaction A is waiting for transaction B. Waiting for what? Waiting for transaction B to release its lock on a particular data item: transaction B has got a lock on some data item that is required by transaction A, so transaction A waits for transaction B to release that lock. Then, let us say transaction B was processing and reached a point where it required a data item that was held by transaction N. So now transaction B has to wait, because that item is held by transaction N, and this fact is again shown in the wait-for graph by a link from the node representing transaction B to the node representing transaction N, with the arrowhead toward transaction N. So this graph shows that transaction A is waiting for transaction B, and transaction B is waiting for transaction N. You can see from the graph that until and unless transaction N releases its locks, transaction B cannot move forward; according to this graph, transaction N must release its locks first.
So, when transaction N releases the locks on the data item required by transaction B, transaction B gets the lock on its required data item, the edge from B to N is deleted from the graph, and transaction B can move forward. When transaction B finishes, it releases its locks, transaction A receives the lock it was waiting for, that edge is also deleted from the graph, and transaction A can move forward. At that point there is no waiting left in the graph, because no transaction is waiting for any other transaction. Now consider a different situation, one that represents a deadlock. Whenever you find a cycle in the wait-for graph, it means there is a deadlock. A cycle means that if you follow the links from a node, following the arrowheads, you reach back to a node you have already passed through. This represents a cycle in the graph, and it represents a deadlock situation. A deadlock may be between two transactions, or even among more than two transactions: TC is waiting for TA, TA is waiting for TD, and TD requires a lock that is held by TC; this is again a deadlock. Now, the next question is how we handle deadlocks. As I told you, in most situations deadlock prevention is not possible, so first of all we have to detect the deadlock. How is the deadlock detected? The DBMS periodically, after certain intervals, keeps checking the wait-for graph, and if and when it finds a cycle in the wait-for graph, it knows that there is a deadlock.
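The cycle check the DBMS performs on the wait-for graph can be sketched as a small depth-first search. This is an illustrative Python sketch, not an actual DBMS implementation; a real DBMS runs such a check periodically inside its lock manager.

```python
# Deadlock detection on a wait-for graph: nodes are transactions,
# an edge A -> B means "A is waiting for B". A cycle means a deadlock.

def has_cycle(wait_for):
    """wait_for: dict mapping a transaction to the set of transactions it waits for."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / on the DFS stack / finished
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:                 # back edge: cycle found
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in list(wait_for))

# The example from the slide: A waits for B and B waits for A -> deadlock.
print(has_cycle({'TA': {'TB'}, 'TB': {'TA'}}))                  # True
# A chain of waits with no cycle is not a deadlock.
print(has_cycle({'TA': {'TB'}, 'TB': {'TN'}, 'TN': set()}))     # False
```

The same check also catches cycles among more than two transactions, such as TC waiting for TA, TA for TD, and TD for TC.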
Now, once the deadlock has been detected, the next thing is to resolve it, because as I told you, if the deadlock is not resolved explicitly by some intervention, it will not resolve by itself; all the transactions involved in the deadlock will keep on waiting. The strategy is that among all the transactions involved in the deadlock, a victim transaction is identified. How does the DBMS choose a victim transaction? It can be based on any logic: on, let us say, the priority of the jobs, or on some other criterion, or it can be decided randomly, or maybe the DBMS estimates which transaction is holding the maximum number of locks and declares that transaction the victim. In any case, a victim transaction is nominated, and that transaction is aborted. Its locks are released, and in this way the remaining transactions involved in the deadlock are able to proceed. It is quite possible that all or some of the remaining transactions get involved in yet another deadlock later, but if a deadlock happens again, it will be resolved again using the same strategy. So this is the policy for deadlock handling. Next, there is a very well-known protocol called two-phase locking, and this is the protocol generally adopted in most DBMSs. It is a protocol for granting and releasing the locks requested by different transactions on different data items. So, what is two-phase locking? Briefly, we also call it 2PL. 2PL ensures serializability, but it may generate deadlocks. The locks are granted and released in two phases; as the name suggests, there are two phases, the growing phase and the shrinking phase. The growing phase is the period during the execution of a transaction when locks are granted, when locks are given to the transaction.
See, the transaction applies for a lock, and the DBMS, that is, the lock manager or the concurrency control sub-system of the DBMS, grants the lock, denies it, or asks the transaction to wait. So, what precisely is 2PL? The first requirement of two-phase locking is that before performing any operation on a data item, a transaction must acquire a lock on it. The second condition is that locks are only granted in the growing phase: once a transaction enters the shrinking phase, it can acquire no further locks. In the growing phase, the transaction may request locks, and it may be granted the lock or asked to wait, because the data item it has requested has been locked by some other transaction. Then there is a point at which the transaction releases a lock for the first time; right at that point, the transaction enters the shrinking phase, and once it is in the shrinking phase, it cannot get any further locks. This is called the two-phase locking protocol. Now let us talk about the level of locking. A lock can be applied on an attribute, a record, a file, a group of files, or even the entire database. You can consider this as a sort of hierarchy: you have got different areas, for an area you have got different files, for a file you have got many records, and for a record you have got many attributes. So this is a sort of hierarchy, in fact a tree structure. When we say level of locking, we mean at what level, on what particular data item, we have applied the lock. This is important.
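The two phases described above can be sketched as a small guard class. This is an illustrative Python sketch, not an actual DBMS implementation; it only enforces the 2PL rule that no lock may be acquired after the first release.

```python
# Two-phase locking: a transaction may acquire locks only while it has not
# yet released any (growing phase); after its first release it is in the
# shrinking phase and may release locks but never acquire new ones.

class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False      # becomes True at the first release

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError(self.name + ": 2PL violation - "
                               "cannot acquire after releasing a lock")
        self.locks.add(item)

    def release(self, item):
        self.shrinking = True       # the shrinking phase begins here
        self.locks.discard(item)

t = TwoPhaseTransaction("T1")
t.acquire("X")
t.acquire("Y")      # still in the growing phase: fine
t.release("X")      # first release: the shrinking phase begins
try:
    t.acquire("Z")  # violates 2PL
except RuntimeError as e:
    print(e)
```

A real lock manager would additionally check lock compatibility before granting; this sketch isolates only the two-phase rule itself.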
This is also called the granularity of locks. Understand that the finer the granularity, the more concurrency, but the more overhead as well. In area A, we have got files: file A1 means the first file of area A, file A2 means the second file of area A, and so on. Again, record A12 means the second record of file A1, and so on. So this is a tree structure. Now, suppose we only allow locks at the file level: even if you want to read one attribute, you will have to lock the entire file. In this case, it will be easier to manage the concurrency, because you have to grant and release fewer locks for fewer items; you have got a small number of files in your database, so you have relatively few items to manage. But all the transactions that need to perform any operation on any part of that file will have to wait, so you are not providing much concurrency. Now suppose instead I allow locks at the attribute level; that is the maximum level of concurrency, because you have taken the locks down to the individual attribute. In that situation you are providing the maximum concurrency, but mind it, it involves too much overhead. When we kept the locking level at the file level, the number of items to manage was small; if you go to the record or attribute level, the number of items to be managed increases enormously. Why? Because there could be thousands of records in one file; think of the number of records in 50 files, it could be in the millions, and if you go further down to attributes, you have crossed even the millions. So you can understand that in that case the overhead involved is too much, and you always have to maintain a balance.
You do not want to restrict the users too much, but the finer the level of concurrency you provide, the more overhead is involved in maintaining it. So you have to maintain a balance, and generally you can consider that locks are applied at the record level. Dear students, you have seen the levels of locks arranged as a tree. Now consider this: suppose a transaction has to perform a write operation on a record, which means the transaction has applied an exclusive lock on that record. Now let us say another transaction wants to perform write operations on many records of the same file, so it applies a lock on the whole file. It is possible that this file contains the very record which the first transaction has already locked for writing, and if the second transaction proceeds, we are back in those three problem situations of concurrency control that we studied. This means that when a lock is applied at a lower level, there should be an intimation at the higher levels that a lock has been applied below. So the rule is: if a lock is applied at a lower level, then from that particular item up toward the root node, an indication is recorded that a lock has been applied at a lower level. For example, if you have applied a lock on a record, then the file containing that record will have an intimation, the area above it will have an intimation, and the database above that will also have an intimation that a lock has been applied somewhere below. Now look at the inverse of this. Suppose you want to perform an operation on a record, and that record itself is not locked; it may still be that the file is locked, and when the file is locked, all of its records are effectively locked along with it.
So it means we must also ask: is there a lock on the file containing this record? And is the file's area locked? And is the database locked? In other words, if we want to apply a lock on a data item at a particular level, we must make sure there is no conflicting lock at any higher level, because we have to check the compatibility of the locks. This part is easy, because from any node up to the root there are only two or three or four nodes to check, and you can quickly find out whether a lock has been applied or not. But the question arises: if you apply a lock at a higher level, you have to take care of the lower levels as well. As I told you earlier, suppose you want a write lock on a file; the file itself is not locked, and neither is the area or the database, but it may be that five or six records below it hold locks. So the question is: when we are applying a lock at a higher level, how do we know that there is no lock at the lower levels? One solution is to check all the lower-level locks individually, but that involves too much overhead: if you apply a lock on a file, then to confirm there is no lock at a lower level you would have to check all its records, and after the records, all the attributes. That is far too difficult. The practical solution is what is called the intention lock. There are two types of intention locks: just as we have shared and exclusive locks, we have intention shared and intention exclusive locks. The logic is that when you apply a lock at a lower level, you apply the corresponding intention lock on all the nodes up to the root. For example, if you want to apply a shared lock, and the shared lock is available on that node, then take it and apply intention shared on all the nodes up to the root. And if you apply for an exclusive lock and can get the exclusive lock on that node, then you apply intention exclusive on all the nodes up to the root.
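The upward propagation of intention locks can be sketched as follows. This is a hypothetical Python illustration using a parent map for the database/area/file/record hierarchy from the lecture's example; the names are my own.

```python
# When a lock is placed on a node, the matching intention lock (IS for a
# shared lock, IX for an exclusive lock) is placed on every ancestor of
# that node up to the root of the hierarchy.

PARENT = {
    'record_A12': 'file_A1',    # second record of file A1
    'file_A1':    'area_A',     # first file of area A
    'area_A':     'database',
    'database':   None,         # root of the tree
}

def lock_with_intentions(node, mode):
    """mode is 'S' or 'X'; returns the locks set on the node and its ancestors."""
    intention = 'IS' if mode == 'S' else 'IX'
    locks = {node: mode}
    parent = PARENT[node]
    while parent is not None:           # walk up to the root
        locks[parent] = intention
        parent = PARENT[parent]
    return locks

print(lock_with_intentions('record_A12', 'X'))
# {'record_A12': 'X', 'file_A1': 'IX', 'area_A': 'IX', 'database': 'IX'}
```

So an exclusive lock on record A12 leaves an intention exclusive marker on its file, its area, and the database, which is exactly the intimation described above.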
And mind it, when we talk about the compatibility of locks, the compatibility matrix now involves more types of locks. Earlier you had to check compatibility only between shared and exclusive; now you have to check among shared, exclusive, intention shared, and intention exclusive, and there will be one more, which we will come to shortly. So among these four, you have to check compatibility. Look at the sequence again. Suppose you want to apply a shared lock on a node. First of all, you check whether that node itself is locked; if it is not, that stage is clear. After that, you go to the parent node and check whether there is a lock on it. The parent node can hold any of the four types of locks: shared, exclusive, intention shared, or intention exclusive. So the lock you are applying must be checked for compatibility against all four. For example, suppose you apply a lock on a record, and that record itself is not locked, but there is already a lock on another record of the same file. The lock on that other record makes no direct difference to the record you are locking, but the common file will carry an intention lock. What type of intention lock? It depends on the lock on the record that was already locked: if that record holds a shared lock, the file carries intention shared; if it holds an exclusive lock, the file carries intention exclusive. Now when you apply the lock on your record, you also check the compatibility of your lock with the lock on the file, be it intention shared or intention exclusive, and so on. In this way, you check the compatibility matrix among the four locks. And there is another type of lock: shared intention exclusive. It applies when you want to read a whole file and also lock some of the records of that file for writing; in general, when you want to read a node of any type.
You want to read the node and perform write operations on some of its child nodes. This type of lock is called shared intention exclusive: the children you want to write will be exclusively locked, while the rest of the node is effectively shared. So now the compatibility matrix is among these five locks. How will a lock at a higher level know about a lock at the lower level? Individual checking is not feasible; it is very time consuming. The answer is intention locks: when a lock is applied on a node, an intention shared or intention exclusive lock is applied on all nodes up to the root, so an intention lock at a node indicates a lock at a lower level, and while applying for a lock on a node, compatibility with the intention locks is also checked. As for the shared intention exclusive lock, because both shared and intention exclusive are mixed into it, it conflicts with every lock that conflicts with shared or conflicts with intention exclusive. This is our extended compatibility matrix; you can see the same five locks in the rows and columns. Let us discuss one of the rows, say the second row, the intention shared row; the rest you can work out yourself in the same manner. The second row means that an intention shared lock is already applied on the node, which means a shared lock is applied on some child node. Now, if a shared lock is applied on a child node and you apply for a shared lock on this node, it will be a yes: the child nodes are only being read, and you want to read as well, so fine, go ahead. If you have got intention shared and you are applying intention shared, what does this mean? On this node, some child already holds a shared lock, and you want to apply a shared lock on another child, so you apply intention shared on this node. Fine, go ahead; both are reads. After that, intention shared is applied on a node and you want an exclusive lock on it.
And obviously, it should be no. Why? Because intention shared means that something below it is being read, and if you apply exclusive on this node, it means you may write any of its children, obviously including the very node that is being read. Similarly, when there is intention shared and you want to apply intention exclusive, what is the situation? The situation is that one child node is being read by a transaction, and another transaction wants to write a different child node of the same parent. You should not be confused into thinking that you are writing the very child node that was already shared-locked; that cannot happen. Why not? Because the node being read already holds a shared lock, so the exclusive lock you apply on that child will be compared against its shared lock and refused there, before the higher level is even reached. Here, one child is being read and an exclusive lock is being requested on a different child; fine, this is allowed, so in this case IS goes yes with IX. You can think of it as a file in which record number 20 is being read by one transaction, while for the same file, let us say record number 21, or record number 100, another transaction wants to write. Fine, go ahead, no problem. After that, if you have intention shared and you want shared intention exclusive, it will also be yes. Why? Because the shared part only wants to read the whole node, and the existing child is only being read.
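The row-by-row arguments above build up the standard five-lock compatibility matrix, which can be written out as a small Python table for checking. The matrix values are the standard ones for multiple granularity locking; `True` means the requested lock can be granted alongside the held one.

```python
# Extended lock compatibility matrix for multiple granularity locking.
# IS = intention shared, IX = intention exclusive, S = shared,
# SIX = shared intention exclusive, X = exclusive.

MODES = ['IS', 'IX', 'S', 'SIX', 'X']

COMPAT = {
    # held:             IS     IX     S      SIX    X      <- requested
    'IS':  dict(zip(MODES, [True,  True,  True,  True,  False])),
    'IX':  dict(zip(MODES, [True,  True,  False, False, False])),
    'S':   dict(zip(MODES, [True,  False, True,  False, False])),
    'SIX': dict(zip(MODES, [True,  False, False, False, False])),
    'X':   dict(zip(MODES, [False, False, False, False, False])),
}

def can_grant(held, requested):
    return COMPAT[held][requested]

print(can_grant('IS', 'X'))    # False - some child is being read
print(can_grant('IS', 'IX'))   # True  - different children may be written
```

Notice how the SIX row says yes only to IS, exactly because SIX conflicts with everything that conflicts with either S or IX.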
As for the intention exclusive part of that shared intention exclusive lock, when you actually go to lock individual child nodes exclusively, you will compare the exclusive lock against those individual nodes, and you will not get an exclusive lock where an item is already shared-locked. So anyway, this is the full matrix. Basically, the idea is: imagine a situation where a lock has been applied on one item and a second transaction applies for a lock on that same item; according to that situation, work out what should happen. It is very simple and logical, and you can construct the matrix yourself. Well, that was it about locking. We studied that the locking mechanism is used to implement concurrency control. We studied that there are two types of locks, the shared lock and the exclusive lock, and for each type of operation we apply the corresponding type of lock. Then we saw the two-phase locking protocol: first a transaction acquires its locks, locks can only be granted in the growing phase, and once a transaction releases a lock, it enters the shrinking phase, after which it cannot get any lock. Then we studied the granularity of locks; we said the data items form a tree structure, and the finer the locks, that is, the lower the level, the more concurrency you have, but the more difficult it is to manage. From this we saw that we need some other types of locks as well: intention shared, intention exclusive, and shared intention exclusive, and we studied their purpose. Then we saw the compatibility matrix, how they are compatible with each other. That was the locking protocol. What we are going to study next is the second approach to handling concurrency, and that is time stamping. In time stamping, no locks are used. Time stamping guarantees consistency, and there are no deadlocks. However, certain transactions are cancelled.
In locking, when there was a deadlock, you would detect it and cancel the victim transaction; here there will be no deadlock, but still, some transactions may be cancelled. These properties are based on the concept of the time stamp. A time stamp is basically an identifier that is allocated to a transaction, and it indicates a relative order: which transaction started earlier and which started later. It can be anything: for example, a serial number, as in the examples where we show 1, 2, 3, 4, or the value of the computer clock. The basic idea is that when a transaction is initiated, it is assigned a time stamp. Suppose we use serial numbers: 1, 2, 3, 4, 5, and so on. Then obviously the transaction that starts first will have a smaller number, and the transaction that starts later will have a bigger number. This means the larger the number, the younger the transaction, and the smaller the number, the older the transaction. Transaction operations are allowed in time stamp order: the logic is that operations, and especially your conflicting operations, must take place in the order given by the transactions' time stamps. Time stamps are also allocated to data items. For a data item on which you perform a read operation or a write operation, there are two time stamps: one is the read time stamp and one is the write time stamp. The read time stamp of an item is the time stamp of the transaction that read the item last, and the write time stamp is the time stamp of the transaction that updated it last. Now, time stamping basically has two problems. One problem is when a transaction wants to read an item that has been updated by a younger transaction. Suppose you have transactions one, two, and three: one started first and should effectively be executed first, then two, and then three.
Now, transaction three wrote an item, and then transaction two wants to read that item. The problem is that transaction two is older: it should have been executed before transaction three, yet the item has already been updated by transaction three. So what should happen? Transaction two should get the previous value, the one from before transaction three's update. That is the first problem. The second problem is when a transaction wants to write an item that has already been read or written by a younger transaction. Say your transaction five wants to write an item, but transaction eight has already read or written it. According to the serial order, transaction five's write should come first and transaction eight's operation afterward; the rule is that the transaction initiated first should operate first, and here your serial order is breaking. These are the two problems of time stamping, so let us see how they are managed. The first problem is met by maintaining multiple versions of the data items. You store, for each item, a structure of this form: first the value of the item, for example 742; then its read time stamp, that is, which transaction read it last; and similarly its write time stamp, that is, which transaction updated it last. Whenever you perform a write operation on a data item, it creates a new version. In the example, version one of the item has value 742; it was last read by transaction six and was also written by transaction six. Then transaction nine came and changed it to 1096; this became version two of the same data item, and its read time stamp and write time stamp both became 9. So whenever you write a data item, a new version is created with the appropriate time stamps. Now suppose transaction eight wants to read the data item. You look at the latest version, which is version two, and examine its read time stamp and write time stamp.
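The version store in this example can be sketched in Python. This is a minimal illustration of the read rule only, assuming each version carries a value, a read time stamp, and a write time stamp; the write-rejection case is handled separately by cancelling the transaction.

```python
# Multiversion time stamping: each write creates a new version carrying a
# read and a write time stamp; a reader is served by the newest version
# whose write time stamp does not exceed the reader's own time stamp.

class Item:
    def __init__(self, value, ts):
        # each version is [value, read_ts, write_ts]
        self.versions = [[value, ts, ts]]

    def write(self, value, ts):
        self.versions.append([value, ts, ts])   # a new version, never in place

    def read(self, ts):
        # newest version written no later than the reading transaction
        for v in reversed(self.versions):
            if v[2] <= ts:
                v[1] = max(v[1], ts)            # update that version's read ts
                return v[0]

item = Item(742, ts=6)      # version 1: written by transaction 6
item.write(1096, ts=9)      # version 2: written by transaction 9
print(item.read(8))         # 742  - transaction 8 must not see T9's write
print(item.read(10))        # 1096 - transaction 10 sees the newest version
```

After `item.read(8)`, version one's read time stamp becomes 8, exactly as in the lecture's example.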
Now, if the write time stamp of the latest version is bigger than the time stamp of the requesting transaction, it means that version was written after the requesting transaction started. So you go back to a version whose write time stamp is smaller than the requesting transaction's time stamp. Thus transaction eight will get the value that transaction six wrote, because transaction six is the older transaction: instead of 1096, the current value, it will get 742, and the read time stamp of version one becomes 8. This is how you handle the first situation. The second situation is a write request on an item that has already been read or written by a younger transaction. In that case, the writing transaction is cancelled, because its write would come too late in the serial order; it is then restarted with a new time stamp. And with this, our course has come to its end. I sincerely hope you enjoyed this course; I have learned a lot myself in this course. I wish you all good luck. Thank you very much, and Allah Hafiz.