So, I think I have confused you enough with a number of terms without actually giving you a concrete recovery algorithm. I have beaten around the bush about how to undo and redo transactions, but that is not a full recovery algorithm. The idea was to introduce all the concepts you need, so that once you roughly understand them, the specific recovery algorithm I am going to present makes a lot of sense. Now, I should point out that earlier editions of the book actually had multiple recovery algorithms, and we realized there is not much point in students learning so many of them. Some textbooks go straight to one recovery algorithm called ARIES, and one of the participants had asked me if I could cover ARIES in this course. It turns out that ARIES is a very, very complicated recovery algorithm. Some textbooks, like Ramakrishnan's, do present ARIES, but it is a lot of work to understand everything it does. You can read the text and say ARIES does this, ARIES does that, but understanding why it works, or why it is correct, is actually very non-trivial. In fact, the original ARIES paper is something like a 50-page paper with so many details that just abstracting it into a form that fits in a textbook is not trivial. So, what we decided is to first present a simpler algorithm which has some of the ideas of ARIES, but is not as complicated. That is what we are presenting here. Hopefully, you will understand all of this completely, and it will give you a flavor of what ARIES does. Then I will mention what ARIES does after this, and you can read it up. So, we removed all the other algorithms which were there before. In fact, let me point out one of the reasons for the earlier confusion. In the recovery algorithm we use here, there are two phases: a redo phase and an undo phase. Redo happens first, then undo. In the earlier edition, we had one algorithm which did undo first and then redo, and another which did redo first and then undo, and that was getting very confusing: why should you flip it, why does one do redo first and the other do it the other way around? So, in this algorithm, we are going to do redo first and then undo. This matches ARIES, by the way. ARIES has one extra pass called analysis, which is very useful to speed up the redo pass, but we are going to skip it to keep life simple. So, coming back: during normal execution, logging is done exactly as we described it. You write <Ti start>, then you write <Ti, Xj, V1, V2> every time Xj is updated, and <Ti commit> at the end. Now, while you are running normally, if a transaction decides to roll back, or the system decides to roll back a transaction, what do you do? Here is what you do. You scan the log backwards from the end, and for each log record of that transaction, of the form <Ti, Xj, V1, V2>, you perform the undo: you restore Xj to V1, and you also write a log record <Ti, Xj, V1>. We saw this earlier; this log record is called a compensation log record. So, when repeating history later, we will first redo the original record, which sets Xj to V2, and then redo this compensation log record, which sets Xj back to V1. So, it compensates for the earlier update. A small sketch of this rollback procedure follows below.
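To make this concrete, here is a minimal sketch in Python of rolling back a single transaction during normal operation, including the abort record described just below. The record shapes and the helper names write_to_db and append_log are illustrative assumptions, not the book's notation.

```python
# Minimal sketch of rolling back transaction ti during normal operation.
# Assumed log record shapes (illustrative, not the book's exact notation):
#   ("start", ti)                    -> <Ti start>
#   ("update", ti, xj, v1, v2)       -> <Ti, Xj, V1, V2>
#   ("clr", ti, xj, v)               -> compensation log record <Ti, Xj, V>
#   ("commit", ti) / ("abort", ti)

def rollback(ti, log, write_to_db, append_log):
    """Scan the log backwards, undo ti's updates, and log compensation records."""
    for rec in reversed(list(log)):           # snapshot: new records are not rescanned
        if rec[0] == "update" and rec[1] == ti:
            _, _, xj, v1, _v2 = rec
            write_to_db(xj, v1)               # restore the old value
            append_log(("clr", ti, xj, v1))   # compensation log record <Ti, Xj, V1>
        elif rec[0] == "start" and rec[1] == ti:
            append_log(("abort", ti))         # Ti is now treated as completed
            return
```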
When the <Ti start> record is found during this backward scan, we stop and write the log record <Ti abort>, and that is it. Ti is done. We do not have to look at it again; it is treated as a completed transaction from then on. There is never going to be a need to undo Ti in the future. We will only redo Ti. And what does the redo do? It redoes the original operations and the compensating operations which have been recorded, so the redo of Ti is essentially going to be a no-op on the database state. Given that the database on disk could have been in a jumbled-up state, with only some updates present, the redo of this Ti will actually update the disk database to bring it back to the state it was in before Ti ran. That is the idea. So, that was normal operation. Now, if the system crashes and you have to recover, there are two phases. The first is a redo phase, which replays the updates of all transactions, whether they committed, aborted or are incomplete. It is a blind redo of everything. So, what does it do? First of all, it finds the last checkpoint record. Why? Because it does not have to redo anything before the checkpoint record: all updates done before the checkpoint were written to disk as part of the checkpoint, so no redo is required earlier than that. The checkpoint record contains a list L of the transactions that were active at the checkpoint; we take that list and put it in the undo list. Now we scan forward from the checkpoint. Whenever we find a record of the form <Ti, Xj, V1, V2>, we redo it by writing V2 to Xj. No further logging is done; we simply write the value V2 to Xj. Whenever we find a log record <Ti start>, we add Ti to the undo list. The undo list is our current information about which transactions were incomplete and have to be undone. So, when a transaction starts, for the moment we assume it will be incomplete. But whenever we find a <Ti commit> or <Ti abort>, we remove Ti from the undo list. Ti might have been in the initial list L from the checkpoint record, or it may have been added later; but the moment we find <Ti commit> or <Ti abort>, it is out of the undo list. That transaction completed, so there is no need to explicitly undo it anymore. We do this till we hit the end of the log. At this point, what do we have? We have an undo list, which is a list of all transactions that had not completed, that is, they neither committed nor aborted. All these transactions have to be undone. Now, you might think we will keep this list and undo one transaction, then the next transaction, then the next. That is not efficient; in fact, under certain extensions of recovery, it would not even be correct. So, what we do is scan the log backwards from the end. Every time we find a log record for a transaction which is in the undo list, we have to process it. If we find a log record for a transaction which is not in the undo list, we skip it; we do not care about it. So, if we find an update log record for a transaction in the undo list, what do we do? What I said earlier: we perform the undo and write the compensation log record. The only difference is that now we are undoing many transactions together; we are doing a single backward scan of the log and concurrently undoing all these transactions. Now, whenever we find a <Ti start> where Ti is in the undo list, that means at this point we have undone all its operations; we have reached its <Ti start>.
So, at this point, we write a log record <Ti abort>, because we have just aborted it, and we remove it from the undo list. And we continue doing this. When do we stop? We stop when the undo list is empty. What does that mean? For all the transactions which were incomplete, we have now succeeded in performing the undo completely. We have written all the log records required for the undo, we have written the <Ti abort> for all of them, and we can stop. At this point, recovery is complete and you can restart processing. As a small optimization, many databases will perform a checkpoint just after this point and then allow transactions to run. So, that in short is our basic recovery algorithm. Let me illustrate it with an example. We have a <T0 start>. T0 updated B from 2000 to 2050. Then T1 started, and then there was a checkpoint. The checkpoint record lists T0 and T1, because they were active. Then T1 updated C from 700 to 600 and committed. T2 started and set A from 500 to 400. Then T0, which was sleeping till now, woke up again. And now what happened? We decided to roll back T0. So, how do you roll back T0? Going back, the only update log record for T0 is the first one, <T0, B, 2000, 2050>. So, to roll it back, we write 2000 back to B and write the compensation log record <T0, B, 2000>. Going further back, we find <T0 start>; therefore, we write out the log record <T0 abort>. So, all this is in the log, and all of this was before the crash. Suppose the crash happened just after this. What happens during recovery? At the time of recovery, we go back to the checkpoint record. So, T0 and T1 are in the undo list at this point. Then we go forward. C is set to 600, and T1 commits; therefore, we remove T1 from the undo list. The undo list is now just T0. Then comes <T2 start>, so T2 is added to the undo list. Continuing on, T2's update is redone, so A is set to 400. Then the compensation record <T0, B, 2000> is redone, so B is set to 2000. Then comes <T0 abort>, so T0 is removed from the undo list. At this point, we have finished all the log records, and our undo list consists of only T2; that is the only thing in it. So, we scan the log backwards and look for the records of T2. We find the one where A was updated from 500 to 400. So, we set A back to 500 and write the compensation log record <T2, A, 500>. Then we find <T2 start>, so we write <T2 abort>, T2 is removed from the undo list, the undo list is empty, and recovery ends. So, note: the redo pass starts from the checkpoint and goes forward, and the undo pass starts from the end of the log and keeps going until the undo list is empty, which is at this point, and then it stops. Since we are out of time, I am going to tell you what more is in here, but I am not actually going to cover all these topics; I am just going to introduce the titles of these slides. The first slide is on buffering log records, which means you can postpone writing a log record to disk; you can write it later. The second slide is also on the same topic. The only extra thing in that slide is something called write-ahead logging, which means that if you postpone writing a log record, you cannot postpone it forever; you have to write it to disk at some point. When do you write it? First of all, you have to write it if the transaction commits. But moreover, even if the transaction has not committed, if you ever output a modified buffer block to disk, any log record relevant to that block has to be output to disk first.
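Before going into those topics, to tie the two phases together, here is a rough sketch of the recovery procedure we just walked through, reusing the illustrative record shapes from the earlier rollback sketch. The helper find_last_checkpoint, which returns the position of the last checkpoint record and its list L of active transactions, is an assumption made for readability.

```python
def recover(log, write_to_db, append_log, find_last_checkpoint):
    """Simplified redo-then-undo crash recovery (no ARIES-style analysis pass)."""
    # --- Redo pass: repeat history forward from the last checkpoint ---
    ckpt_pos, active_list = find_last_checkpoint(log)    # <checkpoint L>
    undo_list = set(active_list)
    for rec in log[ckpt_pos + 1:]:
        kind = rec[0]
        if kind == "update":                  # <Ti, Xj, V1, V2>: blindly redo
            _, _ti, xj, _v1, v2 = rec
            write_to_db(xj, v2)
        elif kind == "clr":                   # compensation records are also redone
            _, _ti, xj, v = rec
            write_to_db(xj, v)
        elif kind == "start":
            undo_list.add(rec[1])
        elif kind in ("commit", "abort"):
            undo_list.discard(rec[1])

    # --- Undo pass: scan backwards, undoing every incomplete transaction ---
    for rec in reversed(list(log)):           # snapshot: skip records appended below
        if not undo_list:
            break                             # all incomplete transactions are undone
        kind = rec[0]
        if kind not in ("update", "start"):
            continue                          # CLRs and others are never undone
        ti = rec[1]
        if ti not in undo_list:
            continue
        if kind == "update":
            _, _, xj, v1, _v2 = rec
            write_to_db(xj, v1)               # restore the old value
            append_log(("clr", ti, xj, v1))   # and log a compensation record
        else:                                 # <Ti start>: transaction fully undone
            append_log(("abort", ti))
            undo_list.discard(ti)
```

On the example above, the undo pass would find only T2's update, restore A to 500, and log the compensation record and <T2 abort>, exactly as described.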
Coming back to write-ahead logging: it means that before you output any actual updated block to disk, the relevant log records have to be written to disk first. So, that is called write-ahead logging. If you see the term write-ahead logging anywhere, it is a very simple concept: you have an updated buffer block, and you have log records corresponding to those updates; before you write this block, you have to write the log records. Why? Because if the transaction has not committed, you have to be able to undo it. If you wrote the block to disk without writing the log records, your disk does not have enough information to recover. But if you write the log records first, the undo information is there, so you can undo it. There is also a fair amount of material on how the database buffer works, buffer management. Again, for lack of time, I am going to skip all of this. Then there is some material on fuzzy checkpointing. I told you earlier that checkpointing means no update should be in progress while the checkpoint happens, which is very intrusive; it affects all transaction processing. So, in reality, what systems use is a form of checkpointing called fuzzy checkpointing. Again, I am going to skip the details. There is yet another topic, which deals with what happens if the disk fails. Then you will have to periodically dump the database. Now, first of all, with RAID systems, disk failure is rare. But since it may still happen, we have to be prepared to recover, in which case we dump a copy of the database, keep dumping the log records all the time, and we can recover from that. I am going to skip the details. The next topic is this: so far, we assumed strict two-phase locking. But I already told you that certain data structures are not actually locked in a two-phase manner; by data structures I mean disk data structures like B-trees, B+-trees and so on. So, the problem is, what about updates to those data structures? If you are going to release locks early, the recovery mechanism we have described so far will not work, because it depends on strict two-phase locking. It turns out there are some tricks we can use, based on something called logical undo logging, and by using them we can allow certain types of locks to be released early and still guarantee that recovery will work. There are a lot of details here, and I am not going to cover them. But there is a whole section of the book which deals first with why we need early lock release, and then with how we modify the recovery algorithm we have seen so far to allow early lock release. A fundamental concept there is the notion of operation logging, as opposed to logging physical updates. What is an operation? An insert into a B+-tree, for example, is logged as an operation. How do you undo an insert? By doing a delete. That is what is called a logical undo, as opposed to an undo by restoring the old value. There is a lot of material in the book on that, and we are going to skip it. The last-but-one part of the chapter is on the ARIES recovery algorithm, which is a famous and widely used algorithm. The ARIES algorithm was developed over many years at IBM, and one of the key people involved in developing it is Dr. C. Mohan; he is very famous for this. The ARIES algorithm has a huge collection of optimizations which can speed up recovery tremendously. For example, in our redo algorithm, everything which was done is redone, but many of those things might have been written to disk already.
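To connect the write-ahead logging rule with the kind of optimization ARIES performs, here is a minimal sketch that assumes an ARIES-style log sequence number (LSN) stored on each page. The class names and the log-manager interface (flushed_lsn, flush_up_to) are assumptions made for illustration, not part of the simplified algorithm above.

```python
class Page:
    """Illustrative page carrying an ARIES-style page LSN (log sequence number)."""
    def __init__(self):
        self.data = {}
        self.page_lsn = 0        # LSN of the last log record applied to this page

class BufferPool:
    """Sketch: enforce write-ahead logging on eviction; page LSNs let redo be skipped."""
    def __init__(self, log_mgr, disk):
        self.log_mgr = log_mgr   # assumed to expose flushed_lsn and flush_up_to(lsn)
        self.disk = disk         # assumed to expose write_page(page_id, page)
        self.pages = {}

    def apply_update(self, page_id, key, value, lsn):
        page = self.pages.setdefault(page_id, Page())
        page.data[key] = value
        page.page_lsn = lsn      # the corresponding log record already exists in memory

    def evict(self, page_id):
        page = self.pages.pop(page_id)
        # Write-ahead logging rule: log records up to this page's last update
        # must be on disk before the modified page itself is written out.
        if self.log_mgr.flushed_lsn < page.page_lsn:
            self.log_mgr.flush_up_to(page.page_lsn)
        self.disk.write_page(page_id, page)

def redo_if_needed(page, rec_lsn, key, value):
    """ARIES-style check: skip the redo if the page already reflects this record."""
    if page.page_lsn >= rec_lsn:
        return                   # update already on the page; no redo needed
    page.data[key] = value
    page.page_lsn = rec_lsn
```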
So, ARIES can track which pages were actually written out between checkpoints and avoid redoing updates in many cases. ARIES supports a form of logging called physiological redo, which can greatly reduce logging overheads for certain operations. For example, a page can have variable-length records; if you delete a record, you want to compact all the remaining records in the page, and that would normally require a lot of logging. ARIES has some tricks which reduce that logging. ARIES also supports high concurrency for index updates, and a number of other features. So, that is the reason it is very widely used. What we described is a greatly simplified form of ARIES: we basically omitted all those optimizations which ARIES does, to keep life simple. But once you have understood the concept of repeating history from here, if you read the ARIES algorithm, the rest of it makes sense. In our book, we do describe the ARIES algorithm, so if you are interested, go read it. Somebody asked me to cover ARIES in this course, which brings up an interesting question. There was a panel discussion in some SIGMOD conference many years ago titled "How many Mohans does the world need?" What does that mean? Why should the world need fewer Mohans or more Mohans? Well, the reference was to C. Mohan, who invented ARIES, and the question the panel was discussing is: how many people really need to understand recovery in detail, to the level of detail that Mohan understands it? Every database vendor needs a few people who understand recovery to that level, but there are only five or six major vendors and then three or four major open-source projects, so maybe 50 people in the world need to understand recovery, and that is it; nobody else needs to. Well, that was what people were loosely saying, but in reality, here is what I would say. Fifty is probably not enough; there are others who have worked on extensions, so maybe a few hundred people in the world need to understand ARIES in full detail. Probably another few thousand need to understand ARIES to some extent, because their system does ARIES-style recovery, and if they want to tune it or do whatever, they need to understand what ARIES does. The vast majority of people who take database courses don't really need to understand ARIES. However, it is very important that they understand what recovery is. They need to understand why recovery is an issue; otherwise, they cannot appreciate what is going on in a database. They will not understand why a database maintains a redo log, what archive logging is, what this or that is. So, pretty much anyone who will ever use a database or manage a database needs to have some idea of what recovery is. What we have covered so far is the elementary material which every student needs to know, and I believe it should be in the syllabus of any database course; but the gorier details of ARIES, logical undo logging and so on are probably not required for everybody else. So, if you are interested, by all means read those chapters, but they should not be in any basic syllabus, if you ask me. The last topic for today, before I go back to the quiz, is remote backup systems. This is a slight digression; it is no longer about the recovery algorithm itself. Again, I am going to go over this very fast, but the idea is one I have been mentioning many times.
If you have a disaster, you had better have a copy of your data somewhere else, far away, so that the disaster will not affect that site. This is shown here pictorially. There is a primary site with data and log records, there is a network, and then there is a backup site. The database is copied to the backup system initially, and then all log records written at the primary are sent over the network to the backup, so that the backup can apply the log records and recover whenever required. Such high-availability systems are very widely used. Every critical installation of a database, at banks, financial firms, you name it, today has to have high availability with a remote backup; that is mandatory. There are several issues in building such a high-availability system. First of all, the backup site should be able to detect when the primary has failed and take over from the primary, and other users have to know that at that point they should no longer try to talk to the primary but should instead talk to the backup site. So, how do you distinguish failure of the site from some temporary network glitch? There are some tricks, including what are called heartbeat messages, which are periodic messages saying I am alive, I am alive, I am alive. If the heartbeats stop coming, that is a problem and you have to do something. The next issue is transfer of control: how does the backup system take over, and what actions does it need to perform before it starts processing new transactions? Again, I am going to skip the details, but that is something important. The third issue is, when the backup takes over, how long will it take to perform recovery actions before it can process new transactions? If you do not do this carefully, it may take a long time to recover, which most banks and others will not allow; they want very fast recovery. So, there is something called a hot-spare configuration, where the backup is continually processing log records as they come from the primary. If there is a failure, the backup takes literally a few minutes, one or two minutes, to take over. That is what high-availability systems use. And finally, there are some details on when you can declare a transaction committed. I am going to skip these details, but if you are interested, read up on one-safe and two-safe. So, that is it for recovery. Let me wrap up with the quiz. The quiz will hopefully test something I have been telling you multiple times. I will give you a couple of minutes for the quiz. All of you please press the ST button and be ready. So, the first question is: repeating history performs redo on (1) all transactions, (2) only committed transactions, (3) only aborted transactions, or (4) only incomplete transactions. Those are the four options. Pick the right answer, but do not answer yet. Press the ST button; I hope all of you have pressed it. Wait a few seconds for your local server to receive the sync from here. Hopefully all the local servers are now active. So, please go ahead and choose the option for the first question, and time is up. As I have been saying several times during today's lecture, repeating history means you redo everything. So, it has to redo all transactions. Whether they committed, aborted or are incomplete, every action they did, starting from the checkpoint, is redone. That is part of repeating history. Quite a few centers have not responded. Let us see the response rate. Only 93 responses, a new low.
The winner, according to the audience, is option two, only committed transactions. Now, if you are familiar with the earlier recovery algorithms which were in the book before, redo was done only for committed transactions and undo only for failed transactions. But repeating history is different: everything is repeated, whether it was committed, aborted or incomplete. Everything is redone in the redo phase. So, that was the first question. Now, be prepared for the second question. Press your ST button and let us see the question. Repeating history performs undo on: the four choices are the same as before, all transactions, committed, aborted, and incomplete. Do not answer yet; please wait. Now, please choose the option. Time is up. A very tempting answer is to say that undo has to be performed on aborted transactions. But as I explained, aborted means the abort is already done; it is all over, and we have written those compensating log records, so there is no need to perform undo anymore. The redo phase already has the information required to undo it. So, the undo processing is applied only to incomplete transactions. Of course, it does not make sense to undo committed transactions. So, undo applies only to incomplete transactions; the answer is the fourth option. Let us see the responses. Again, many centers have not responded. This time we have 118 responses, a little better. Let us see what people have said. This time the audience has a narrow win, in the sense that the correct option, four, has the highest number of votes. But if you add up the votes for one, two and three, it is more than the number of votes for four. So, not great. Quite a few of you want to undo committed transactions. Some of you want to undo all transactions; not a good idea, this database is going to get wiped out. So, undo is only on incomplete transactions. Okay, with that, the lecture is over, and let us take some questions. Let us start with Amrita Kollam; Amrita, over to you. Sir, you answered my question; I had asked it on chat before this session. If there are no specific explicit commands for unlocking, how do we implement two-phase locking in a transaction? So, if you have no explicit instructions for unlocking, how do you implement two-phase locking? The answer, like I said, is that commit or rollback automatically releases all locks. That is pretty much the only control you have; in other words, in most database systems there is no way for you to control this yourself, it is all up to the database system. We covered the theory because two-phase locking is a very important concept. But in industry, since nobody releases locks before commit or rollback anyway, when people say two-phase locking they are not really talking about a release phase; they are really talking about a situation where you keep obtaining locks till the end and then release all the locks at the end. That is what people mean when they say two-phase locking in industry. But technically, the correct definition is what we have in the book, and other books say the same thing. So, if you are asked in an exam what two-phase locking is, what is in the book is technically correct; what people in industry mean is probably rigorous two-phase locking. I hope that answered your question.
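As a minimal sketch, and purely as an illustration with a hypothetical lock-manager interface, this is roughly what two-phase locking amounts to in practice, that is, rigorous two-phase locking: locks are acquired as data items are touched, and everything is released only at commit or rollback.

```python
class Transaction:
    """Hypothetical sketch: locks acquired on demand, released only at the very end."""

    def __init__(self, lock_manager, db):
        self.lock_manager = lock_manager   # assumed to handle waits and lock upgrades
        self.db = db
        self.held = set()                  # growing phase only, until commit/rollback

    def read(self, item):
        self.lock_manager.acquire(self, item, mode="S")
        self.held.add(item)
        return self.db.read(item)

    def write(self, item, value):
        self.lock_manager.acquire(self, item, mode="X")   # upgrade if S already held
        self.held.add(item)
        self.db.write(item, value)

    def commit(self):
        self.db.commit(self)
        self._release_all()                # the entire "shrinking phase" happens here

    def rollback(self):
        self.db.rollback(self)
        self._release_all()

    def _release_all(self):
        for item in self.held:
            self.lock_manager.release(self, item)
        self.held.clear()
```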
So, that means the shrinking phase in two-phase locking is just limited to the commit statement or the rollback? Exactly, that is precisely right. The shrinking phase is exactly when commit or rollback happens, and everything is released in one go. I have one more question, sir. What happens if there is a failure during checkpointing, before writing the data blocks to disk but after writing the checkpoint log record to disk? That is a good question: what happens if a failure occurs during checkpointing? One way to deal with it, with the basic checkpointing algorithm, is that you first stop all updates on buffer blocks, write them all to disk, and only after that do you write the checkpoint record. That deals with this. But like I said, this is too intrusive, so real systems do not do this. They implement fuzzy checkpointing, where in effect they write out a checkpoint-begin log record, then perform the checkpoint, and then write out a checkpoint-end log record. Or there are some variations: if you read the book, the way they keep track of it is by recording which was the last completed checkpoint. They write out the checkpoint log record, then start writing out the buffer blocks one at a time, and at the end they record that this checkpoint is completed. If the record which says the checkpoint is completed is not there, it is treated as an incomplete checkpoint. In fact, the ARIES recovery algorithm does something even cleverer. Its checkpoints don't actually write out any blocks at all. ARIES continuously keeps writing out blocks in the background; in fact, all databases do that. Whatever recovery algorithm they use, they keep writing out updated buffer blocks all the time. When a checkpoint occurs, ARIES only writes out a list of which blocks are dirty, that is, which have not been written out yet after being updated. Using this information, ARIES is able to handle recovery correctly without writing out any blocks at that point. But since I did not cover ARIES, I will skip the details. I hope that answers your question. One more, last question: does a database support multiple concurrency control protocols, like two-phase locking, timestamp-based and validation-based, or does it implement only one of them? That is a very good question: can a database support multiple concurrency control mechanisms, all within the same database? Actually, you cannot arbitrarily mix different concurrency control mechanisms. If one transaction is getting locks and another is using timestamps, they may conflict on an item and you will never detect it. So, you cannot mix these up within a single database. There has been some work on one transaction using snapshot isolation while another uses locking; there are some tricks involved here. If you understand how snapshot isolation is implemented, you can have some variations where some transactions do things slightly differently. But the only variant which is actually widely supported is the following: one transaction can run in serializable mode while another runs in read committed mode. Now, the transaction which runs in read committed mode may mess up serializability, but if you have five transactions which all run in serializable mode, then among themselves they should be serializable, even if another one runs in a non-serializable mode.
So, that level of control is provided, but completely changing the concurrency control algorithm per transaction is not supported by anybody. Okay? Any other questions? You have asked a lot of questions, but they are very good questions, so if you have more, feel free to ask. Yeah, go ahead. See, in the case of a client-server architecture, how does the server know if a client crashes? For example, if a client has submitted a query to the server and afterwards the client crashes, how does the server manage it? That is a good question: in client-server mode, if a client crashes, how does the server detect it? The way it is usually done is that the connection is a network connection over TCP/IP, and the server keeps track of which transactions are active on that connection. If it detects that the connection is broken, it assumes the client is gone and rolls back whatever transaction is active on that connection. So, that is a fairly straightforward thing to do, and as far as I know, that is all that databases do. If you have other situations, like the remote backup, then it is slightly different, because there, somebody taking over is a major operation. If somebody takes over while the original one is still alive, think of this: in the old royalty days they used to say the king is dead, long live the king. You must have heard that phrase. So, the original king had better be dead. If the new king says the original king is dead and I am king, and the original king is also alive, they are going to have a fight. That situation should not arise. So, there it is more complicated; in the simple client-server case, it is much simpler. Any other questions? No sir, thanks. Let us go to another site. Warangal, you have a question; please go ahead. It is about log records. We are storing the log records on disk; why don't we right away write the original records that are updated, rather than writing them into log records and then following the write-ahead log protocol? Thank you, that was a very good question. The question was: when you write a log record, you are writing the log record and then, at some point, also writing the updated record. Instead of doing that, why not write the updated record directly, instead of wasting time writing the log record? There is a two-part answer to this question. First of all, the log record is required in case a rollback happens, so it is not enough to write only the updated record. But then you could say the log record only needs the undo information, with no need for redo; and in fact, that can be done. The real reason is that a transaction usually does multiple updates; it is usually not just one, it updates five or six things. If you had to write out all those five or six things, every transaction would require multiple I/Os. At least as long as our databases stay on magnetic disk, this is an issue; maybe with flash it will not be as important. But think of it this way: if you have to write five different blocks, that is five I/Os. But the log records for all those five updates can go into one single log block, and if you write out that one log block, you are done; the transaction is committed. So, you can commit a transaction very fast, in one-fifth of the time required in the other case. In fact, it is much better than one-fifth, for various reasons I will not get into: the log disk is written sequentially, so its delays are much smaller.
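To see roughly why this pays off, here is a small sketch, with an assumed 4 KB block size and a JSON encoding chosen purely for illustration, of how the log records for several updates by one transaction fit into a single log block, so that one sequential write is enough to commit.

```python
import json

LOG_BLOCK_SIZE = 4096   # assumed block size, for illustration only

def pack_log_block(update_records):
    """Pack many small log records into one log block (one sequential write),
    instead of writing each updated data block separately (one random I/O each)."""
    payload = b""
    for rec in update_records:              # e.g. ("update", "T7", "A", 500, 400)
        encoded = json.dumps(rec).encode() + b"\n"
        if len(payload) + len(encoded) > LOG_BLOCK_SIZE:
            raise ValueError("illustration only: records exceed a single block")
        payload += encoded
    return payload.ljust(LOG_BLOCK_SIZE, b"\0")

# Five updates by one transaction: five data-block writes can be deferred,
# but a single log-block write is enough to make the transaction durable.
records = [("update", "T7", item, old, new)
           for item, old, new in [("A", 500, 400), ("B", 10, 20),
                                  ("C", 7, 9), ("D", 1, 2), ("E", 0, 5)]]
block = pack_log_block(records)
assert len(block) == LOG_BLOCK_SIZE
```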
And so, by using logging, you can release locks very fast: the moment the transaction has finished its updates, it can commit and all its locks can be released very quickly. If you had to write out all the data items before it commits, it would be much slower. There is one other reason: certain data items are frequently updated by many transactions, and you can accumulate the updates of, say, 20 transactions to such a data item before it is ever written to disk. So, 20 I/Os have now been reduced to one I/O, which obviously helps a lot. These are the two major reasons why you do not want to write data blocks immediately, and instead just write the log record. So, back to you. Thank you, sir. Over to you. The last question pending is from Amrita Bangalore, and then I have a few questions sent by chat. So, let me take Amrita Bangalore. Why are multi-version timestamp ordering protocols not commonly used? I thought I had answered that, but since you ask: basically, the multi-version protocol has the overhead of actually keeping many versions, so the implementation has some overheads. Some databases do this anyway; so why don't they just use the multi-version timestamp protocol? Well, they also have to deal with recoverability, which means they have to do some kind of locking, and when you are done with all that, what you have is close to snapshot isolation. But there is one other reason. Timestamp ordering has the problem that the serialization order has to exactly match the order in which transactions start, and it has been found experimentally that this is not necessarily a good idea. Validation protocols instead allow... sorry, I am confusing myself; let me restart. We have a serialization order which has to be respected. That serialization order can be the same as the commit order, or the same as the starting order of the transactions, or something else. With timestamp ordering, the serialization order has to exactly match the starting order, and this has been shown to have fairly bad performance. In contrast, if you allow the serialization order to be the same as the commit order, which is what many other protocols support, you get much better performance; that is, fewer transactions get rolled back. So, that is another reason why the basic multi-version protocols are not used that much. Now, the snapshot isolation protocol, which is very widely used, does not actually guarantee serializability, but if you look at its approximate serialization order, it corresponds to the validation-based optimistic concurrency control protocol, and those have been shown to do better than the original timestamp ordering, where the timestamp is the starting timestamp. So, that could be another reason why nobody implements the original multi-version timestamp ordering, although snapshot isolation is supported. Back to you if you have any follow-up question. Sir, one more question, beyond the scope of what you have taught. Multilevel secure databases use the Bell-LaPadula model, right? In that context, what are covert channels? That question is on secure databases; it is actually about multilevel secure databases, and there is a Bell-LaPadula model for them.
And I cannot give you that answer within a few minutes; it would take some time to explain what the Bell-LaPadula model is. But, as a first approximation, it basically says that you have top secret documents, secret documents, then public documents and so on. Now, all of you are familiar with the WikiLeaks affair which is going on. All those documents in WikiLeaks were pretty much classified, meaning not top secret, but you should not be releasing them to the public. So, multilevel security is basically a way of enforcing this model, where you have different levels of security, classified and so on. Obviously, in the WikiLeaks case, multilevel security failed, because somebody, in fact a very junior, 22-year-old guy, got access to all this classified data, could copy it onto a pen drive and send it to the WikiLeaks people. So, it is surprising that the US army, or whoever it was, was so careless with their data. Companies today are not so careless; many companies have very strict policies where all their computers have USB drives disabled, and so on. It is possible to work around such measures, but it requires a lot of work; so they can be compromised, but they do provide protection. But coming back, covert channels are a more complicated thing in that model, where you have people who have read access to some level of data and write access to some other level of data, and there is a whole bunch of machinery around that. A covert channel, in all these security models, is some way of making information visible to somebody to whom you are not supposed to be able to communicate it. You can do tricks like sending out packets at a certain rate, which indicates go ahead, or at some other rate, which indicates do not go ahead. They cannot see the contents of the packets, but they can see how fast packets are coming, and you have thereby communicated a message to them. But all that is way outside the scope of this course, so I will stop there and not say anything more about it. What I am going to do instead is take a few of the questions which came on chat. The first question is: is there any software to run relational algebra or relational calculus queries, or are they used only internally for query processing? I do not know of any software which directly runs them; it is internal. But if you wish, it is not hard to implement: it is actually fairly simple to take a relational algebra query, translate it to SQL, and run it on a database back end. That could actually be a nice project for students: write a relational algebra interpreter. Of course, relational algebra as we write it in the book requires subscripts and so forth, so you have to represent it in some way which does not require subscripts; a common convention, for example, is to enclose whatever would be in the subscript in square brackets. So, there are ways of representing it; you can do that and build your own interpreter. The next question is: how is compensation done during the recovery redo phase? Well, remember that when we did an undo, we wrote compensation log records. When you do a redo, those log records are executed automatically, so you do not have to do anything specific; the log records ensure that the compensation is done. The next question is from Dhirubhai Ambani. What about redo when there are nested transactions, in which an inner transaction committed but the outer one aborted? Will it abort the committed inner transaction also? That is a hard question.
Multi-level nested transaction models are fairly complex beasts, and implementing recovery for a general-purpose nested transaction model is actually fairly tricky. If you look at the recovery algorithm in our book which deals with early lock release, it is a special case where you have two levels: it is not arbitrarily nested, but has outer-level and inner-level transactions. The logical operations can be thought of as the inner-level operations, and that answers your question with respect to two levels of nesting. So, I will not answer it fully, but I will say: go read the material on logical undo, think of the logical operations as the nested operations, and that will partly answer your question, though not fully. Goodbye.