This is a very important step to guarantee the ACID properties we talked about earlier. So again, the same administrative stuff first: Project three is due on Sunday, November 14th, so hopefully everyone has started on that. And Homework four is due pretty soon, this Wednesday, so hopefully everyone is on track for both. Also, today right after the lecture there will be a talk from a database company called Vertica, which is actually a spinoff from my advisor's advisor, Mike Stonebraker from MIT, and they are going to talk about the new things they are doing with their column-store database, all right?

So today we're going to talk about logging. As I mentioned earlier, the important properties the database system wants to ensure for programmers, so they don't have to deal with all the headache issues like dirty reads and power failures themselves, are the set of properties called ACID. We talked a lot about concurrency control algorithms and protocols over the last two weeks, and those concurrency control mechanisms were mostly about achieving the isolation property. Today we're going to talk about logging and recovery, which are what help the database system achieve the rest of the properties in ACID, namely atomicity, durability, and consistency. For consistency, we'll say more when we get to the distributed database setting; today we're focusing on atomicity and durability.

So let me give you a motivating example about the importance of logging and recovery. By the way, logging and recovery is a specific component, but it touches many of the other components of the database system we discussed earlier in the class, especially the buffer pool manager, in order to achieve the atomicity and durability properties. So, the motivating example: say I have this transaction that just reads and writes a record. I'm also showing you what the content of my buffer pool is and what the content on disk is. Let's say we start this transaction with a read on A, and assume there's nothing in the buffer pool at the beginning. The database system first needs to bring the page containing A into the buffer pool and then perform the read. Then the transaction writes A, say it changes the value of A from 1 to 2. Now assume the transaction wants to commit. Without any protection, any logging and recovery mechanism, what do we do? Based on what we talked about earlier in the class, when the transaction commits, we just report back to the client, the outside world: hey, we already committed. But what if, after we tell the outside world the transaction committed, but before we flush this dirty page out to disk, there's a power failure? Say there's a storm, or somebody kicks the power plug. The database system loses everything in memory, all its processes, and so on.
So now, if you restart the database system in this case, what you'll see on disk is still the old value 1. You won't be able to see the latest committed value from this transaction, because we never managed to write that page to disk. And that's incorrect, right? Because we already reported back to the outside world. Say this transaction was depositing $100 for you and it committed; you got notified that the transaction committed, but the $100 never actually got added to your bank account on disk. That's bad, and that's exactly what logging and recovery deal with.

More formally, logging and recovery are techniques that ensure the atomicity, consistency, and durability of the database system, especially across failures. At a high level there are two parts to the mechanism, and the naming pretty much tells you what they are. The first part is the logging part: while the database system is running normally, executing transactions, it records certain metadata about the changes made by those transactions at runtime, sometimes in memory and sometimes on disk. The second part is the recovery protocol or algorithm: after a crash, when the database comes back up, it looks at the information, the metadata and the changes recorded by the logging part, and restores the state of the database back to a correct state. So there's a logging part and there's a recovery part. Today we focus on the first part: the additional operations the database system performs while it is running normally, so that there is enough information and metadata for it to restore itself to a correct state when it comes back from a power failure. The second part, recovery, is what we'll talk about on Wednesday.

There are actually quite a few topics to cover even just for this first part. The first is what kinds of failures logging and recovery can deal with, because the database system cannot handle every type of failure. For example, if a fire burns all the disks in the data center, everything is lost; nothing brings you back from that unless you have some redundancy, which is a separate topic. Second, as I mentioned, logging and recovery touch many parts of the system we discussed earlier, especially the buffer pool manager, and we'll need to modify or enhance the buffer pool policies to collaborate with logging and recovery to achieve atomicity and durability. We'll also talk about two specific methods: shadow paging and write-ahead logging. We'll talk about logging schemes, that is, what content we actually write to the log. And lastly, we'll give a little heads-up on checkpoints; we'll get into checkpoints more on Wednesday when we talk about recovery. All right?
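To make the failure concrete, here is a tiny, hypothetical simulation of the scenario above. This is a sketch, not any real system's code; the class and method names are made up. The "database" acknowledges the commit while the dirty page is still only in memory, so a crash before the flush silently loses the committed write.

```python
# A minimal, hypothetical simulation of the motivating failure: the database
# acknowledges a commit while the dirty page is still only in volatile
# memory, so a crash before the flush loses the committed update.

class NaiveDB:
    def __init__(self):
        self.disk = {"A": 1}   # non-volatile storage
        self.buffer_pool = {}  # volatile memory

    def write(self, key, value):
        # Fetch the value into the buffer pool on first access, then modify.
        if key not in self.buffer_pool:
            self.buffer_pool[key] = self.disk[key]
        self.buffer_pool[key] = value

    def commit(self):
        # BUG: we tell the client "committed" without flushing to disk.
        return "committed"

    def crash_and_restart(self):
        # A power failure wipes everything in volatile memory.
        self.buffer_pool = {}

db = NaiveDB()
db.write("A", 2)
print(db.commit())      # the client is told the transaction committed
db.crash_and_restart()
print(db.disk["A"])     # prints 1: the committed update is gone
```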
So one important concept to cover before I get into the types of failures is the different types of storage devices the database system can use. Based on the properties of the storage devices and how the database system uses them, we can categorize the kinds of failures the system may encounter, and which of those failures the system can or cannot address, all right? The three basic types of storage we'll classify are volatile storage, non-volatile storage, and stable storage. The concepts are fairly straightforward. Volatile storage is storage that loses all its data on something like a power failure or a program exit, such as DRAM or SRAM. Non-volatile storage, things like HDDs or SSDs, persists all its data when a power failure happens. The third type, stable storage, is a hypothetical or conceptual device: a storage device that would persist its data through every kind of failure, burning data center and all. This type of device is mostly for discussion purposes; it doesn't really exist. So in practice we're mostly dealing with either volatile or non-volatile storage, all right?

Following that, there are three types of failures the database system may encounter. The first is transaction failure, the second is system failure, and the third is storage media failure. Let me explain. The first type, transaction failure, covers the failures associated with the execution of transactions. Actually, a little heads-up before I go into detail on the three types: the failures that logging and recovery can deal with are the first two. The third type, storage media failure, is not something the database system can deal with by itself; it takes external help, some redundancy set up by humans. Okay?

So for the first type, transaction failures, failures related to the execution of transactions, we classify them into two categories. The first category is logical errors. That essentially means a transaction, while executing, violates some integrity constraint or consistency specification required by the user. An earlier example I used in class: suppose you have a database handling all the accounts for a specific bank, and you only move money between accounts within this bank. Then no matter how you move money around, assuming no interest has been applied yet, at the end of the day all the money should sum to the same number. If some transaction comes along, moves money around, and the total amount of money changes, that violates a constraint. That would be a logical error, and that transaction needs to be aborted. The database system can handle that and roll back all the changes.
The second category of transaction failure is internal state errors, and the most straightforward example is all kinds of deadlocks, the aborts we talked about when we discussed concurrency control protocols. For example, if the database system is scheduling a set of transactions and there's a deadlock, the system needs to abort some transaction and make sure the database stays in a correct state. That abort is an internal state error, and the system handles it with logging and recovery as well.

The second type of failure is system failure, and again, at a high level, we categorize it into two kinds. The first is software failure, which typically means a bug either in the OS or in the database system causes the process to crash or the OS to panic. For example, if there's an uncaught division-by-zero exception somewhere in the logic of the database system, then when a transaction hits it, the whole program exits and crashes, and everything in the process's temporary memory space is lost. We have to deal with that. The second kind, and a very common type of failure, maybe the most common, I don't know, is power failure: there's a storm, heavy rain, someone kicks the power plug, and you lose everything in memory. The database system also deals with that. One assumption we're making here is that these failures are not the kind that cause loss or corruption of the data on disk. Even though there's a power failure, we assume the data on disk is uncorrupted. If the data is corrupted, if some value on disk got changed by some force alongside the failure, then the database system by itself has a hard time fixing that. That said, there are mechanisms a database system can apply to detect that kind of corruption, but it cannot repair it by itself unless a human has applied some redundancy external to the system, all right? So the first type, transaction failures, as well as the second type, system failures, are the failures the database system can deal with through logging and recovery.

And lastly there's storage media failure: your data gets corrupted or destroyed on disk, say a fire burns all the disks in your data center. There are ways to deal with that, but not through logging and recovery. You could use redundant disks, for example a RAID array across your disks; and next week when we talk about distributed databases, you'll see you can keep additional copies. But that's not the job of logging and recovery. So for our purposes today, the mechanisms we discuss are not going to address this type of failure, all right? Any questions on the different types of failures and what the responsibilities of our database algorithms, especially logging and recovery, are? All right, makes sense? Okay. So why do we need to talk about all this?
Okay, so a fundamental observation about database systems is that they mostly deal with data stored on disk. The system wants to make sure that all the data users put into it is durable and persistent on a non-volatile storage device. But as we observed earlier in this course, if we always read and write data directly on disk, it's going to be very slow. So ideally we don't want to perform every single operation from and onto the disk. We want a staging area, which is essentially the buffer pool in memory, where we can read and modify pages in a temporary in-memory space, which is much, much faster. Of course, this creates the issue that if you haven't written the dirty pages to disk when a power failure hits, you don't have the records of committed transactions on disk. So it's precisely because we have this buffer pool, with these temporary changes sitting in memory, that we need an additional mechanism to protect the database from corruption on a power failure.

And just to recap how the buffer pool works: every time you want to, for example, modify a data record in the database, you first retrieve a copy of the page containing it from disk, your non-volatile storage device, into memory. Then you perform all your modifications there, and after you finish, you flush the dirty page back to persistent, in other words non-volatile, storage. That's the high-level procedure, and with logging and recovery we're going to modify or extend these steps a little, in collaboration with our algorithms, to achieve durability and atomicity without performing all our operations on disk all the time, which would be slow.

So to recap the purpose of logging and recovery under the ACID umbrella, what we're trying to achieve today, especially atomicity and durability, is the following. Durability means we want the changes of every committed transaction to be persistent on non-volatile storage before we tell the outside world, the client, that the transaction has committed: make sure everything is persistent before we say it's committed. Atomicity means we don't want the changes of any uncommitted transaction to be persistent on disk. We can temporarily apply the changes of an uncommitted transaction, but we don't want its effects to be durable if the transaction is aborted. Make sense? All right.
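As a reference point for the rest of the lecture, here is a minimal sketch of the fetch/modify/flush cycle just recapped. All the names are illustrative; a real buffer pool also handles eviction, pinning, latching, and so on.

```python
# A minimal sketch of the buffer pool cycle: fetch a page from non-volatile
# storage into memory, modify the in-memory copy (marking it dirty), and
# flush it back to disk later. All names here are illustrative.

class BufferPool:
    def __init__(self, disk):
        self.disk = disk      # page_id -> page contents on disk
        self.pages = {}       # page_id -> in-memory copy
        self.dirty = set()    # page_ids modified since they were fetched

    def fetch(self, page_id):
        if page_id not in self.pages:
            self.pages[page_id] = dict(self.disk[page_id])  # copy from disk
        return self.pages[page_id]

    def modify(self, page_id, key, value):
        page = self.fetch(page_id)
        page[key] = value
        self.dirty.add(page_id)   # this page now differs from disk

    def flush(self, page_id):
        self.disk[page_id] = dict(self.pages[page_id])      # write back
        self.dirty.discard(page_id)

disk = {"page0": {"A": 1}}
bp = BufferPool(disk)
bp.modify("page0", "A", 2)   # fast: only touches memory
bp.flush("page0")            # slow: one disk write, done lazily
```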
So the two fundamental primitives, the two types of records we're going to have the database system keep to help us guarantee those properties, are called undo records, sometimes called undo log records, same thing, and redo records. Undo records store the effects of uncommitted transactions, the old values, so that when a transaction aborts, the database system can restore any changes made by that aborted transaction. Redo records, the second type, store the new values, the effects of the changes each transaction applied, so that when a power failure happens we can look back and see the modifications made by the committed transactions, and these redo records help the system restore all the modifications of the committed transactions. We're going to use these two kinds of records to help the system achieve the ACID properties. And how we record them, and how exactly we achieve those properties, actually depends on what modifications or extensions we make to the buffer pool management policy. Like I said, it's tightly related to how the buffer pool manager and the logging and recovery algorithm collaborate, all right? So, any questions about undo records and redo records? Okay.
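Here is one plausible way to represent the two record types in code, a sketch under the assumption that each record captures a single change to a single object; the field and function names are made up for illustration.

```python
# A sketch of the two record types, assuming each record captures one change
# to one object. Field and function names are illustrative, not from any
# real system.

from dataclasses import dataclass

@dataclass
class UndoRecord:
    txn_id: int
    object_id: str
    before_value: object   # old value: restore this if the txn aborts

@dataclass
class RedoRecord:
    txn_id: int
    object_id: str
    after_value: object    # new value: reapply this if a committed txn's
                           # dirty pages were lost in a crash

def undo(db, records, txn_id):
    # Roll back an aborted transaction by restoring old values, newest first.
    for r in reversed(records):
        if isinstance(r, UndoRecord) and r.txn_id == txn_id:
            db[r.object_id] = r.before_value

def redo(db, records, committed_txns):
    # After a crash, reapply the changes of committed transactions in order.
    for r in records:
        if isinstance(r, RedoRecord) and r.txn_id in committed_txns:
            db[r.object_id] = r.after_value

db = {"A": 3}
records = [UndoRecord(1, "A", 1), RedoRecord(1, "A", 3)]
undo(db, records, txn_id=1)
print(db)   # {'A': 1}: transaction 1's write has been rolled back
```

Notice that undo walks the records backwards while redo walks them forwards; that asymmetry comes up again when we get to the recovery algorithm on Wednesday.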
So to give you an example and put everything into context, here's a use case for logging and recovery. Similar to what we had before, we have two transactions, T1 and T2. By the way, for today's discussion we're going to ignore the concurrency control part; we just assume there's some concurrency control mechanism, either pessimistic or optimistic, that already gives us a correct schedule, and we'll drop all the annotations for it. We're only talking about things like power failures, okay? So again, two transactions: T1 reads A and writes A; T2 reads B and writes B. Assume that at the beginning there's nothing in the buffer pool, and both records live on this one specific page on disk. When T1 starts, since there's nothing in the buffer pool, the database system needs to bring this page into the buffer pool and perform the read on A. Then T1 wants to modify A, so it changes the value of A in this buffer pool page. After that, T2 wants to read B, but the page containing record B is already in memory, so it can just read it directly. Then T2 changes the value of B. At this point, T2 wants to commit.

Now, to ensure our ACID properties, especially the durability property we talked about earlier, we probably want to write this page back to disk before we say the transaction is committed, because otherwise we could hit the power failure scenario from before: if we didn't write the page to disk before acknowledging the commit and there's a power failure in between, then when the user boots the database back up, they won't see the effects of a transaction we said had committed. But the problem is that this page doesn't just have T2's modification to record B; it also has T1's uncommitted modification to record A. So if we flush this page to disk right now, it will contain the value written by the uncommitted transaction. Well, let's say we do that: we flush the page to disk. Now the value of B is safely persistent on non-volatile storage, but the uncommitted value of A is there too. And say the first transaction, T1, later needs to abort, through a logical error or some internal state error. What do we do now? Essentially we need to roll back T1's change to A. But we've already written the page containing it to disk. So depending on whether the buffer pool manager has already evicted this page from the buffer pool, rolling back the change to A could be a lot of work: we'd have to bring the page from disk back into the buffer pool, restore the value of A from 3 back to its original value, I think it was 1 or 2, I forgot, and then write the page back to disk again. That's a lot of overhead, and you probably want something smarter to deal with this. Any questions on this example? No? Cool.

So, to talk about the different ways to deal with this, the modifications to the buffer pool that help us achieve durability and atomicity, we essentially talk about the different policies or decisions the system can make when handling pages touched by committed and uncommitted transactions, okay? The first policy is something called the steal policy, which decides whether the system is allowed to let an uncommitted transaction overwrite the committed values on the non-volatile storage device. It's just like the earlier example: if we allow the uncommitted transaction T1's value of A to be written to non-volatile storage before T1 even commits, that's the steal policy; essentially, the page is "stolen" from the buffer pool on behalf of a transaction that hasn't committed yet. The opposite is the no-steal policy, which means we don't allow that: an uncommitted transaction cannot get its values written to disk. That's one decision point the system needs to make, okay? The other very important policy, or decision point, is whether the system wants a force policy. The force policy requires that the changes of every committed transaction be persistent on non-volatile storage before you tell the outside world the transaction has committed. If we require the database system to install all of a committing transaction's changes to their corresponding pages on disk, on the non-volatile storage device, that's called force: force the writes before you commit. Otherwise it's called no-force. That's the other very important decision point the system makes for logging and recovery. Yes, please. [Student: does force mean writing the actual pages, or just writing to a log?] Yeah, very good question. The question is whether this policy requires the system to write the changes to the specific pages in the table, with all the content, or to write them to some other location, like a log. The answer gets us a little ahead of ourselves.
We'll talk about logging later, but essentially the answer is the first one: the force policy requires the system to write the committed transaction's values to the specific pages that hold those data records in the database. Under force or no-force, nothing is specified about whether the system also writes to a log; it's not about the log, all right? Okay, and for people not familiar with log records or logging, we'll get to logs soon.

So, to better understand the tradeoffs between the different policies, I'm going to walk through one combination of these decisions, no-steal plus force, and see what the database system does under it. It's exactly the same pair of transactions, T1 and T2: read A, write A; read B, write B. So, similar story. First there's nothing in the buffer pool, so the database system has to bring the page into the buffer pool, perform the read on A, and then install the change to A. Later, transaction T2 comes along, reads B, and then writes its value of B. Now assume transaction T2 is going to commit, okay? And we want to apply, let's not say "enforce" since we're already using the word force, we want to apply the no-steal plus force combination, okay? What does this mean? Force means the database system needs to flush the changes of the committing transaction T2 to disk before it can say commit. But the no-steal policy means we cannot write the changes of the uncommitted transaction T1 to the corresponding page on the non-volatile storage device. So essentially, we both need to write this page out and cannot write this page out. What do we do?

Well, one simple way, not necessarily the best way, and we'll talk about better ways, is to make a copy of this page containing only the modifications from the committed transaction, which at this point is T2. After we make this copy, we can write the copied page, with only the committed values, back to disk, and still keep the original modified page in the memory buffer with the uncommitted transaction's values. Say after a while we find that transaction T1 has to abort; then it's actually trivial to roll back T1's change, because now we only need to go to the dirty page in the buffer pool and flip the value of A from 3 back to 1, and we don't need to do anything else. So rollback, which is what guarantees atomicity, is easy. And if there's a power failure in between, it's also very easy to guarantee atomicity when the database system comes back up, because everything on disk, on my persistent non-volatile storage device, only contains values from committed transactions.
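Here is a minimal sketch of that no-steal plus force commit path: copy the page, strip out the uncommitted values, and force the copy to disk. It assumes we know the before-images of the uncommitted writes on the page; everything here is illustrative.

```python
# A sketch of the NO-STEAL + FORCE commit path described above. The copied
# page holds only committed values; the dirty original stays in memory.

def commit_no_steal_force(disk, buffer_page, page_id, uncommitted_before_images):
    # Start from the dirty in-memory page, then restore the before-images of
    # the values written by still-uncommitted transactions (NO-STEAL).
    copy = dict(buffer_page)
    for key, before_value in uncommitted_before_images.items():
        copy[key] = before_value
    # FORCE: committed changes must reach disk before we acknowledge.
    disk[page_id] = copy
    # The original dirty page, still holding the uncommitted values,
    # remains only in the buffer pool.
    return "committed"

disk = {"page0": {"A": 1, "B": 5}}
buffer_page = {"A": 3, "B": 9}   # A changed by T1 (uncommitted),
                                 # B changed by T2 (committing)
print(commit_no_steal_force(disk, buffer_page, "page0",
                            uncommitted_before_images={"A": 1}))
print(disk["page0"])   # {'A': 1, 'B': 9}: only T2's change is durable
```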
So with no-steal plus force, it's very easy to roll back, and very easy to recover and guarantee atomicity when you come back from a power failure. Make sense? But what would be the potential issues with this approach? Any ideas? It's kind of obvious, right? First, what we did here means that every time we want to write something to disk, every time we commit a transaction, we have to make a copy of the entire page, since it may contain values from uncommitted transactions. That copying is extra work. Second, even though we avoid having to read the page back into memory like in the first simple scenario, we may still need to write a single page to disk multiple times. As we just saw, when T2 commits, we write this page to disk for the first time; assuming transaction T1 doesn't abort but actually commits later, we need to write the page to disk again. So there's quite some write amplification there.

Third, and probably the bigger problem, is that we have to keep the pages with all the values modified by a committing transaction in memory until it commits, because we don't allow steal here, and we have to force all of a committing transaction's pages to disk. Essentially, before a transaction commits, we have to make copies of all the pages it changed and then write all those pages out at once. But imagine a scenario where a transaction needs to modify lots and lots of pages in the database. The buffer pool may not even be big enough; there may not be room for the transaction to make copies of all the pages it has modified. In that case the transaction can't even commit. Furthermore, on many systems there may not even be a mechanism to write the contents of many pages to disk together atomically. Say a transaction modifies values on three pages of your database; when it tries to commit, what if you hit a power failure after the transaction has written the first one or two pages to disk, but not yet the third? Then when you come back from the power failure, you face a partial change: part of a committed transaction never made it to disk. So there are still lots of issues with this simple policy combination that either slow down the performance of the system or outright limit its functionality, unless you have some special hardware support. Make sense? Cool.

So this pretty much summarizes what I just said, in written form. With this simple, easy approach of no-steal plus force, you don't need any undo log, because you just never write the changes of uncommitted transactions to disk.
And similarly, you don't actually need redo log records either, because all the changes of a committed transaction get written to disk, assuming you can do it, before you say commit. But there are lots of performance implications, as well as the functionality limitations this mechanism brings: in particular, if you have a transaction that writes more pages than fit in the buffer pool, you just cannot handle it.

So there's a variant of this approach, which we actually touched on a little when we talked about the fundamental ACID properties, called shadow paging. It makes things a bit better, but as we'll see, it also has its own limitations. Essentially, what shadow paging addresses is the functionality limitation. It's an extension of the simple copying mechanism we just talked about, done incrementally, so that you can actually install changes bigger than the buffer pool. Under shadow paging, you maintain two copies of the database: one called the master copy and one called the shadow copy. You apply all the changes of the uncommitted transaction to the shadow copy, and when the transaction wants to commit, you flip the two copies around: you switch a pointer so the database now points at the shadow copy, which becomes the new master, and then you clean up the obsolete records in the original master copy. This again achieves the no-steal plus force policy, which makes recovery pretty easy: no redo needed, no undo needed, okay?

We'll walk through the specific steps in a moment, but at a high level, here's the data structure you maintain for shadow paging. You have a special page, both on disk and in memory, that stores the root pointer of the database, and you use it to control which version of the database, master or shadow, you're currently using. Then, in memory as well as on disk, you have a page table pointing to the current version of each page the database system is using right now; for each page that could be its master version or its shadow version, depending on what modifications have been made. So at a high level: to install any change into the database, the system overwrites the shadow pages it copied for the currently running transaction, and then switches the per-page pointers I mentioned from the original master copies to the new copies. The system typically organizes this page table, the pointers to the different pages, in a tree structure. Lastly, when all the changes are done, it overwrites the root pointer of that tree to point to the new, shadow copy of the database. So let me give you a specific walkthrough of this.
So here, assume the database system has this root page, several pages on disk, and the current page table, call it the master page table, whose entries point to the corresponding master pages on disk, one by one, okay? Say a transaction T1 comes along. The first thing it does is make an entire copy of the master page table; that copy is called the shadow page table, and it will point to the shadow pages. At the beginning, because the transaction hasn't made any changes yet, all of its pointers still point to the original master copies, all right? Then, whenever the transaction installs a change, it applies the change to a copy of the page on disk and flips the corresponding pointer in the shadow page table to point at the new page containing the modification. That's for transactions that modify pages; a read-only transaction just goes through the master page table and reads the original master copies.

So here, say transaction T1 updates a record in the first page. It first makes a copy of the page containing that record on disk, then flips the pointer in the shadow page table for that page from the original master copy to the new shadow page. Similarly, if it needs to modify another record, it makes a copy of the page containing it on disk and flips the pointer in the shadow page table, and the same again for yet another update. Finally, when the transaction wants to commit, all it does is update the value in the database root page: instead of pointing at the original master page table, the root now points at the shadow page table. Even if the database system crashes a while after that, when it comes back up it just sees that the root already points at the new page table, and everything is still correct. And of course, afterwards you need to clean up all the obsolete records in the original master copy, because they're not useful anymore.

Any questions on this? Yes, please. [Student: what if a second transaction, T2, also wants to modify the same page as T1? How does that work? Would we copy T1's page table?] Yeah, essentially the question is: what about concurrency control? What if multiple transactions modify the same records, whichever goes first or second, and so on? That's actually one of the problems with shadow paging that I'll talk about later, but since you brought it up: there are different ways to deal with it. One simple way is to only allow one writer transaction at a time. That limits the concurrency of your schedule to a single writer, so you don't have the issue, but of course that has lots of performance implications. That's one option.
Another potential solution is to allow multiple writer transactions, but then you need an additional mechanism to keep track of which transaction copied and modified which page; and if two transactions modify the same page and one commits first, flipping the pointer, the second may not be able to commit at all and has to abort. There are performance implications there as well. Yes, please, second question. [Student: if we want to delete a page, do we need the shadow mechanism to keep a reference count of how many page tables point to it?] So the question is: if we want to delete a page, does the shadow mechanism need to keep a counter for the page? Let's see. I think it depends on the mechanism. If you only allow one writer transaction, I don't think you need it, because only one transaction modifies anything and the others can only read. If you allow concurrent writers, then yes, you'd need an additional mechanism for that. Any other questions? All right.

Okay, so similar to the benefit of the straightforward copying approach earlier, the benefit of shadow paging is that undo and recovery are both very, very easy. It's very easy to roll back the changes of an uncommitted transaction: you just throw away the shadow page table and the shadow pages on disk. It's also very easy to come back from a crash: if there's a power failure and you haven't flipped the root pointer yet, none of the changes take effect, so you still have the original master version of the database intact, and the only thing you need to do is clean up the leftover shadow records, okay?

But similar to the earlier simple mechanism, and related to what we just discussed, shadow paging has quite a few disadvantages. The first disadvantage is that before a transaction can do any writes, you first have to copy the entire page table; otherwise you can't switch to the new shadow copy of the database with a single root-pointer flip. You can be a little smarter about it, for instance by doing shadow paging on the page table itself in memory, but you still have to copy quite a lot of page table entries to manage all the shadow pages, and of course you have to copy the shadow pages themselves as well. The other disadvantages, particularly around commit, are in the last bullet here: there are costs with performance implications you pay when you try to commit, but there are other problems too. For example, every time you modify the database you end up making a lot of scattered modifications, to the shadow pages you copy and to the page table, which is also lots of pages you need to change, and those pages can reside in many different places on disk.
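Before getting into those performance problems, here is a minimal sketch of the scheme we just walked through, assuming the whole database is reachable from one root pointer to a page table; all names are invented for illustration.

```python
# A minimal sketch of shadow paging: writes go to fresh shadow pages via a
# copied page table, and commit is a single atomic root-pointer flip.
# Nothing in the master copy is ever modified in place.

class ShadowPagingDB:
    def __init__(self, pages):
        self.pages = dict(pages)                       # "disk": id -> contents
        self.root = {"table": {p: p for p in pages}}   # logical -> physical
        self.next_id = 0

    def begin(self):
        # Copy the entire master page table; until the transaction writes
        # something, every entry still points at a master page.
        return dict(self.root["table"])

    def write(self, shadow_table, logical_page, contents):
        # Write to a fresh shadow page and repoint the shadow table entry.
        new_physical = f"shadow{self.next_id}"
        self.next_id += 1
        self.pages[new_physical] = contents
        shadow_table[logical_page] = new_physical

    def commit(self, shadow_table):
        # The atomic step: flip the root pointer to the shadow page table.
        # (Garbage collection of the old master pages would follow.)
        self.root["table"] = shadow_table

    def read(self, logical_page):
        return self.pages[self.root["table"][logical_page]]

db = ShadowPagingDB({"p1": {"A": 1}, "p2": {"B": 5}})
t = db.begin()
db.write(t, "p1", {"A": 2})
print(db.read("p1"))   # {'A': 1}: readers still see the master copy
db.commit(t)
print(db.read("p1"))   # {'A': 2}: visible only after the pointer flip
```

In a real system, each of those page and page-table updates is a write to some arbitrary location on disk, which is exactly where the trouble starts.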
All those scattered modifications involve a lot of random writes of those pages, which can be very slow. Second, while you're doing this shadow-page copying, you make copies of lots of pages of your database content at various random locations, so the data storing your database becomes very fragmented. For example, suppose you originally have a clustered index, so the records are all sorted on disk and laid out nicely, increasing or decreasing one after the other. After a while of applying shadow paging, the different copies of the pages end up all over the place; if you want to perform a sequential scan of the table, especially in a certain order, the pages won't be in order at all. So lots of fragmentation, which can also lead to a performance penalty. Furthermore, you need to collect garbage: there can be lots of garbage, either from the original master copy that is no longer useful, or from the shadow pages of aborted transactions, and you have to clean all of it up. And we talked about the transaction commit performance limitations as well.

So, does anyone actually use this? At a high level, existing systems rarely use the shadow paging technique for logging and recovery. For the original simple copying method I talked about at the beginning of class, as far as I know nobody does something that simple, because it's just too costly, with the performance and functionality limitations. As for shadow paging, because of the many disadvantages we discussed, systems rarely use it, but certain systems still use it in specific scenarios. In fact, at the very beginning of the database industry, remember System R, the very first relational database implementation, from IBM Research: initially they actually used shadow paging, again because rolling back transactions and recovery are easy. But once they figured out it had lots of performance implications, they switched to the other logging mechanism we're about to talk about. Similarly, another very famous system, SQLite, started with a technique that's not exactly shadow paging but something very similar, again because it's more straightforward and easier to implement. Later on, around 2010, they also switched to the logging-based mechanism we'll talk about next.

But let me quickly describe that initial variant of shadow paging that SQLite used first; it's a technique called journaling. It's a little different from shadow paging, where you make copies before you modify a page. With SQLite's journaling, before modifying a specific page, the system first copies the original value of that page into a separate journal file: an extra copy of the original version rather than the new version. Then it actually modifies the values of the records in those pages in place as transactions change them. When a transaction wants to commit, it just commits. And if there's a power failure, then after restarting, only if a journal file exists does the database system need to undo the changes of uncommitted transactions, using the original page images stored in that separate journal file; otherwise it just keeps going. So it's kind of the opposite, the conjugate method, of shadow paging, but the intuition is similar.
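Here is a sketch of that journaling idea, simplified and illustrative: before-images go to a separate journal, pages are modified in place, and a journal that survives a restart triggers rollback.

```python
# A sketch of the SQLite-style rollback journal described above. Before a
# page is modified in place, its original contents are copied into a
# separate journal; a non-empty journal on restart means we must roll back.

class JournalingDB:
    def __init__(self, pages):
        self.pages = dict(pages)   # the database file (modified in place)
        self.journal = {}          # page_id -> original contents

    def write(self, page_id, contents):
        # Save the before-image to the journal first, then update in place.
        if page_id not in self.journal:
            self.journal[page_id] = self.pages[page_id]
        self.pages[page_id] = contents

    def commit(self):
        # On a successful commit the journal is no longer needed.
        self.journal.clear()

    def recover(self):
        # A surviving journal on restart means we crashed mid-transaction:
        # restore every journaled page to its original contents.
        for page_id, original in self.journal.items():
            self.pages[page_id] = original
        self.journal.clear()

db = JournalingDB({"p2": "two", "p3": "three"})
db.write("p2", "two'")
db.write("p3", "three'")
# ... power failure before commit ...
db.recover()
print(db.pages)   # {'p2': 'two', 'p3': 'three'}: rolled back
```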
Let me walk through a quick example of this journaling. Say we have three pages in memory, many pages on disk, and a separate journal file. Say a transaction wants to modify page two. What SQLite does, a little differently from shadow paging, is make a copy of the original value of page two in the journal file, and then go ahead and modify page two in place, call it two-prime. Similarly, if it wants to modify page three, it copies the original version of page three into the journal file and then makes the modification in place, three-prime. Then, when the transaction wants to commit, it writes the modified pages out to disk. But suppose there's a power failure before the transaction is able to commit: the system crashes, and after a while it comes back up. When that happens, nothing in memory survives, and there are dirty records on disk, pages with values from the uncommitted transaction, like page two. So what SQLite does in this case is look at the journal file and see: hey, there are journal entries from an uncommitted transaction here. It then restores the original copies of those journaled pages back onto the disk that stores the actual database content. So it's a bit similar to shadow paging, but handled somewhat differently. At the end of the day, though, SQLite also moved on to a logging approach, which is more performant, all right? Any questions? Cool. Okay.

So, one nice thing about the shadow paging approach: you no longer have the limitation that the database system needs a buffer pool big enough to hold all of a transaction's modifications, because you can incrementally write changes out to the shadow pages. But the fundamental performance limitation of shadow paging is that, at the end of the day, you still need to copy or look up many of these pages at random locations on disk, modify pages in random locations, whether in the content of the database or in the page tables, and then write out those changes when you commit the transaction, again in a random fashion. And we know random disk writes are very, very costly. So the central idea we want to apply here, to improve the performance of the database system, the efficiency of logging and recovery, is to make the writes of the database system as sequential as possible. Essentially, we want to record the changes made by the different transactions, plus the metadata about those transactions that we'll need later if we have to recover.
And we want to write those changes out to disk as sequentially as possible, so that by reducing the number of random writes we can significantly improve the performance of the system. That's one fundamental thing we want to achieve. Another thing we want to achieve: with the shadow paging technique, one important limitation is that it modifies a page at a time. Even if you only modify one record in a page, under shadow paging you still end up writing the entire shadow page out to disk. That's also costly, and it's another performance limitation we want to get rid of.

So the technique we apply here to get this performance improvement, and to remove those earlier performance limitations, is called write-ahead logging. The idea is that before a transaction commits, we write the changes made by that transaction, including all the metadata, the information for redo records, undo records, and so on, we'll talk about the details later, sequentially into a log file, before we commit that transaction. In that case, we don't need to write to all those random locations on disk for the actual content anymore. And if there's a power failure, or if we want to roll back changes, we just look back into this log file and see what we need to restore. Another thing we get from this approach is a steal plus no-force policy. Steal: we're able to write the changes of an uncommitted transaction out to disk, without waiting for the page to become clean or for the transaction to commit. And no-force as well: we don't have to wait for the buffer pool manager to flush all the pages containing a committing transaction's modifications to disk before we can commit, because we have the log records written sequentially to disk instead.

And then the most important thing write-ahead logging must guarantee, the thing that gives it its name, is that it writes all those changes, those modifications, sequentially to disk before the transaction can commit; or, in other words, before the transaction's actual content, the pages containing the real data, can be written back to their original locations on disk. Otherwise, if you've already modified the content in its original location on disk but the log doesn't have the corresponding records yet, then after a crash, examining the log records alone will not restore the database system to a correct state. So that's the essential guarantee: everything we persist to the data pages on disk must be reflected in the log file first, okay? That's why we call it write-ahead logging. This is a little abstract, but we'll see more examples shortly, so hopefully it becomes clearer.
So here's the basic protocol to achieve write-ahead logging, okay? When the database system is going to modify any record, instead of first making the modification on the original page that contains the record, it first writes that modification into a separate log record. And in this case, it doesn't need to write the contents of the entire page; it only writes to the log whatever it changed, okay? Then, before the transaction commits, or before it overwrites the modified records on the pages holding the database content, it first makes sure all the log records are flushed to disk, so that later, if there's a crash, the system can come back and see what's contained in the log. And when the transaction is trying to commit, it only needs to make sure that all the pages containing the log records have actually been flushed, written to disk. It does not need to ensure that the modified pages containing the actual database content are flushed to disk. In other words, we allow a no-force policy, all right?

To give you a more concrete specification, and after this I'll show you examples: every time a transaction begins, it first tells the database system the transaction has started, so at that point it writes a BEGIN record into the log file. Again, this doesn't specify when the begin record must be flushed to disk; at this point it may still be in memory, okay? Then when a transaction finishes, it appends a COMMIT log record to the end of the log file, and then it must ensure that all the log records containing the changes made by this transaction are flushed to disk. That includes the begin record, all the modification records written in between, and this final commit record, all written to disk, hopefully in one sequential write, before it can tell the outside world, the client, that the transaction has committed. And again, note that it does not require the dirty pages containing this transaction's modifications to the actual content to be written to disk; it doesn't force that, all right? Make sense? Cool.

So at a high level, what's inside each log record in this write-ahead logging mechanism? At the implementation level there's actually much more metadata you need to maintain, but at a logical level, a log record contains: first, which transaction it is, a transaction ID, kind of straightforward; second, which object it's trying to modify, usually an identifier for the tuple, for example a page number plus slot number; third, the original value of this particular object, say a tuple, before the transaction modified it, used for the purpose of undoing the changes of this transaction if it aborts; and lastly, the new value of the object after the modification. The purpose of this redo information, the after-value, is to reapply the changes of the transaction if the transaction commits but there's a power failure before the database system has written the transaction's dirty pages to disk. Does it make sense what the before- and after-values are for? Okay.
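Putting the protocol together, here is a minimal write-ahead logging sketch. It is a toy under the rules just stated, log record first, then the data change, and at commit flush the log (not the data pages); all the names and structures are illustrative.

```python
# A minimal write-ahead logging sketch: append a log record (with before-
# and after-values) BEFORE touching the data, and at commit flush the log,
# not the dirty data pages.

from dataclasses import dataclass

@dataclass
class LogRecord:
    txn_id: int
    object_id: str = None
    before: object = None    # undo information
    after: object = None     # redo information
    kind: str = "update"     # "begin", "update", or "commit"

class WALDatabase:
    def __init__(self, data):
        self.data = data           # in-memory (buffer pool) copy
        self.log_buffer = []       # volatile log tail
        self.log_on_disk = []      # the sequential log file

    def begin(self, txn_id):
        self.log_buffer.append(LogRecord(txn_id, kind="begin"))

    def write(self, txn_id, object_id, value):
        # Write-ahead rule: the log record is created before the data change.
        self.log_buffer.append(LogRecord(
            txn_id, object_id, before=self.data[object_id], after=value))
        self.data[object_id] = value

    def commit(self, txn_id):
        self.log_buffer.append(LogRecord(txn_id, kind="commit"))
        # One sequential write of the log; dirty data pages stay in memory.
        self.log_on_disk.extend(self.log_buffer)
        self.log_buffer.clear()
        return "committed"   # safe to acknowledge only after the log flush

db = WALDatabase({"A": 1, "B": 5})
db.begin(1)
db.write(1, "A", 8)
db.write(1, "B", 9)
print(db.commit(1))
print(len(db.log_on_disk))   # 4 records: begin, two updates, commit
```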
Okay. So here, finally, is a specific example. For demonstration purposes I have a very simple case: one transaction that just writes two values, okay? Initially the buffer pool is empty; well, here I'm showing that the buffer pool has already brought this page into main memory. And at the beginning we have an empty log record buffer, because you need a buffer in memory to hold the log records before you can write them to disk; that's kind of straightforward. So first, when this transaction begins, it installs a BEGIN log record into the write-ahead log buffer, to tell the system: hey, a transaction has started. Then there's a write on A, and the transaction installs a log record into the write-ahead log buffer for this modification: it has the transaction ID, the object ID, and the before- and after-values of this specific object. And then we make the change in the buffer pool. Actually, there's a specific reason why we have to first make the entry in the write-ahead log record and only then make the modification in the buffer pool; I'll talk about the details next class. For now, just know that we first write the record into the write-ahead log buffer, and then, once the log record has been created, we make the modification in the buffer pool as we discussed earlier in the course, changing record A from 1 to 8. Now we have a new dirty page in the buffer pool, okay?

Later on there's a write on B, and similarly we install a new log record in the write-ahead log buffer, with the before-value, the after-value, and the other metadata, and then make the modification in the buffer pool. Okay, makes sense. Later the transaction commits; at that point we don't make any more modifications in the buffer pool, we just append a COMMIT log record to the log buffer, all right? And now, instead of doing random writes of all the pages the transaction has touched, we write out all these log records, which contain only the values of the specific tuples this transaction modified, sequentially to disk, all at once. That's potentially much more efficient than the page copying we saw with shadow paging, all right? And after this sequential write, the transaction can safely commit: we can tell the outside world, hey, this transaction has finished.
Even though we still have the dirty page in memory, if the system crashes right now, we still have the log records written on disk, so we can recover the values of this committed transaction even after a power failure, right? The whole state of the database system would still be correct, all right? Is this cool? Yeah, the transaction commits, and like I just said, if there's a power failure, we can still find everything we need on the disk. So there are a few interesting questions or clarifications we can look into about this write-ahead logging. The first question is: when should the database system write the log record entries to disk, all right? Well, obviously you need to write all the log records to disk when a transaction commits, because otherwise you cannot guarantee the durability property. But there's nothing preventing the system from writing some of the log records to disk earlier, right? If the system figures out, hey, there are already enough records in the log buffer to perform a big enough sequential write that one round trip to the disk is worth it, it can go ahead, even though the transaction has not committed yet. You can always write things out earlier if the system decides it's worth it. Second, even though we can write the changes of an entire transaction to disk at once, there can still be cases where you have many short transactions, right? Each transaction may not modify many pages, maybe just one record. In that case, even if we write out the changes at every commit, if each transaction only modified a small number of records, there would still be lots of small writes, right? So another thing we can do is batch, or in other words group, the commits of multiple transactions together, so that we accumulate a big enough chunk of log records in memory and write the changes of many transactions together to disk in one sequential write, amortizing the cost. But, similar to the earlier point about sequential writes, the potential disadvantage of this approach is that each transaction needs to wait a little before it can commit, right? Because before the commit record and all the changes of a transaction are written to disk, you cannot tell the outside world it has committed. So if you want to use this group commit approach, you have to let a set of transactions wait until the log buffer is big enough or a certain time has passed, write the changes of all of them to disk, and only then tell the clients that issued these transactions that everything has committed. A transaction that executed earlier has to wait, but in exchange you amortize the cost of many potentially small random writes, all right?
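Here is a minimal sketch of the group-commit idea, extending the hypothetical WALManager from above: a committing transaction either triggers the batched flush itself or waits for another committer (or a timeout) to do it. A real implementation would track per-record LSNs and use more careful synchronization; this only illustrates the batching.

```python
import threading

class GroupCommitWAL(WALManager):
    """Sketch: batch the commit-time log flushes of many transactions."""

    def __init__(self, path: str = "wal.log", max_batch: int = 64,
                 timeout_s: float = 0.005):
        super().__init__(path)
        self.lock = threading.Lock()
        self.flushed = threading.Condition(self.lock)
        self.max_batch = max_batch
        self.timeout_s = timeout_s

    def commit(self, txn_id: int) -> None:
        with self.lock:
            self.append(LogRecord(txn_id, "COMMIT"))
            if len(self.buffer) >= self.max_batch:
                self.flush()                 # batch is big enough: flush now
                self.flushed.notify_all()    # wake everyone waiting to commit
            else:
                # Wait for some other committer (or the timer) to flush us.
                if not self.flushed.wait(timeout=self.timeout_s):
                    if self.buffer:          # timer expired: flush ourselves
                        self.flush()
                        self.flushed.notify_all()
        # Only after our commit record is durable do we ack the client.
```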
One more thing to note with group commit: even if some transactions have not committed yet, like I mentioned, you can still write out their log records, either because the time window has passed or because you've accumulated a big enough batch of log records that a round trip to the disk is worth it. Let me give you an example that combines these two ideas. Say we have two transactions, T1 and T2, each making some modifications. Going through it quickly: T1 begins, so we append a begin record; then T1 performs a write and we create a log record for it; then T1 performs another write and we create another record. Now say T1 stalls because of some other operation it's performing, and we context switch to T2. With group commit, we can start writing T2's records already: T2 appends its begin record directly to the same log buffer, and then, for example, it appends a log record for its change to the value of C. Now say the database system notices that the log buffer is already big enough. Then it can already flush these records to disk, right? It doesn't need to wait for either T1 or T2 to commit, and it doesn't need to wait for any specific time either; it can just write this big enough batch sequentially onto the disk. Later on, T2 comes back and writes the log record for its change to D into the write-ahead log buffer. And then say both T1 and T2 stall again, maybe because they are performing some read operation waiting for other records to be brought into the buffer pool, some other expensive operation. If a certain amount of time has passed, say there's a periodic timer the database system keeps for writing log records in batches, it can still write this new batch of log records to disk, even though there may not be many records in it, right? That's totally fine. And even though T1 and T2 have not committed, the log tells us that T1 and T2 have not committed anyway; we are not writing commit records for them in advance, so that is totally fine as well. Any questions about this timing of writing log records, or about the concept of group commit? Okay. All right. The next question to clarify is: when should the database system write the dirty pages, the ones that actually contain the transactions' modifications to the database content, to disk? Well, the answer is: at any time, actually, as long as the database system first ensures that the log records containing those modifications have been written to disk, right? As long as the system flushes those log records first, the dirty pages can be written at any time: while the transaction is executing, when it commits, or after it commits, whenever the database system has spare cycles and the hardware resources to do it; that's all fine, right?
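In practice this "log first, then page" rule is usually enforced with log sequence numbers (LSNs). A minimal sketch, with invented names, assuming each page remembers the LSN of the last log record that modified it and the log manager tracks how far it has flushed:

```python
class FakeWAL:
    """Stand-in for the log manager; only the assumed interface matters."""
    def __init__(self):
        self.flushed_lsn = 0            # highest LSN durably on disk

    def flush_to(self, lsn: int) -> None:
        # Pretend to flush the log up to (at least) this LSN.
        self.flushed_lsn = max(self.flushed_lsn, lsn)

class Page:
    def __init__(self, page_id: int):
        self.page_id = page_id
        self.page_lsn = 0               # LSN of the last record that touched us
        self.dirty = False

def disk_write(page: Page) -> None:
    ...                                  # stand-in for the actual page I/O

def write_back(wal: FakeWAL, page: Page) -> None:
    """A dirty page may be written at any time, provided the WAL rule holds."""
    if page.page_lsn > wal.flushed_lsn:
        # The log records describing this page's changes are not yet
        # durable, so flush the log first, and only then write the page.
        wal.flush_to(page.page_lsn)
    disk_write(page)
    page.dirty = False
```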
So, to summarize a little bit the trade-offs between the different combinations of these steal and force policies: in the earlier shadow paging example, we had the combination of force and no-steal, right? That is, we force the changes of every committed transaction onto the pages of the database content on disk before we can commit it, and at the same time, with shadow paging, we never allow changes of uncommitted transactions to be persisted on the non-volatile storage device. With that, shadow paging has the slowest runtime performance, because you have to make page copies, do random writes, et cetera. But it's very easy to undo things, and recovery is very fast: at recovery you only need to read the pages from disk back into memory, and everything there is committed and correct. On the other hand, with write-ahead logging, we achieve the no-force and steal combination, right? With that, we get faster runtime, especially because we only need to store the changes the transaction made to the tuples, not whole pages, and we can write to the disk sequentially. But at recovery time, when things crash, we have to go through the log records on disk, because we may not have propagated the transactions' modifications to their original locations in the database content. We have to scan those logs and apply the corresponding operations, undo the changes of uncommitted transactions, et cetera, to make sure things are correct. We'll talk about the recovery protocol next class; essentially, recovery will be relatively slower.
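As a compact summary of the two combinations discussed here (a recap only; the labels and phrasing are mine):

```python
# force = dirty pages of a txn must be flushed at commit time
# steal = pages with uncommitted changes may be written to disk
POLICY_TRADEOFFS = {
    ("force", "no-steal"): {            # shadow paging
        "runtime":  "slowest (page copies, random writes at commit)",
        "recovery": "fastest (the disk state is already correct)",
    },
    ("no-force", "steal"): {            # write-ahead logging
        "runtime":  "fastest (small, sequential log writes)",
        "recovery": "slower (must redo/undo from the log)",
    },
}
```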
That said, in practice, and this is a very important comment I want to make here, 99% of the time people prefer faster runtime performance over faster recovery performance. The reason is simply that crashes are rare in most applications, right? There's very little chance you'll actually need to run the expensive recovery procedure, even if you use an approach like write-ahead logging. So most of the time you want to optimize for the common case, when the database is running normally, and pay for the expensive recovery when it happens, which in most cases is not often. But there are scenarios where the database system crashes frequently, and in those cases it may be more beneficial to use the shadow paging approach. In fact, there's an example I heard from my advisor, Andy: many years ago there was a phone company down in Costa Rica where the power was not stable at all. Essentially, the power supplying the database system would go out once every few hours, so the system was in a constant cycle of crashing and being brought back up. In that case, it might actually be better to use an approach like shadow paging, with force and no-steal, so that you optimize for recovery time instead of runtime. But as you can imagine, that's kind of rare; most systems use write-ahead logging and optimize the other way, okay? So in the last few minutes, let's quickly go through what content to record in those log records. We talked about it at a high level: the before value, after value, transaction ID, object ID, et cetera. But at the implementation level there are a few different choices. The first choice is called physical logging, which means you record the exact change the transaction made, right? One way to think about it is the diff command in git: exactly what changed at which location in this page, and you just store that. The other type of content you can record is called logical logging. Here, instead of recording the exact change the transaction made on every page or every tuple (imagine your transaction modifies a billion tuples; with the first approach you would need a billion records), you only record, logically, what the transaction did, and in many cases that's just the SQL command of the transaction, right? Essentially, you record the original SQL statement, say, update this table and set a equal to a plus one, something like that. Then you simply replay that command when you want to redo it during recovery. A little bit about the trade-off, which I've already hinted at: with logical logging, you record one command even if it modifies a billion tuples, so obviously you write much less data than with physical logging, which records all the changed contents. But in actuality, almost nobody uses the logical logging approach; there are extreme exceptions, but almost nobody uses it. Firstly, because with logical logging it's very difficult to determine which part of the database content was modified by which query, and in which order. Remember, with concurrency control, transactions have interleaving operations; it's very difficult to keep track of all of that and restore the database, page by page, to a correct state. That's one challenge. The second, probably more important challenge is that with logical logging you have to re-execute all the operations of the transaction, right? If a transaction has an expensive join over ten tables, you have to re-execute that entire join query again during recovery, which would also be very expensive. So in practice, what people do is essentially a slight tweak of physical logging called physiological logging; I would say it's much closer to physical logging than to logical logging. It records the changes of a transaction at the page level, but instead of recording the exact change, at which byte offset a value changed from what to what, it only records, at a slightly higher level, which records in the page have been modified, with their before and after values. After a crash, when recovery needs to reapply such a change, the database then has the freedom to reinstall the values of that record anywhere in the page; it doesn't need to write the value back at the exact same byte offset, so it can freely reorganize the page, compacting empty slots and so on, while still restoring all the changes to the tuples, right? To give you an example: say we have a transaction that updates a value, something like set value equal to xyz. With physical logging, the system records the table, the page, and the specific offset of that value, with the before and after contents. And by the way, we don't have time to get deep into this in this class, but it also records the corresponding changes to the indexes, because you don't want to rebuild an entire expensive index every time you come back from a crash either. With logical logging, it's very straightforward: the record is just the query itself, and when you come back, you re-execute that query. And with physiological logging, it's very similar to physical logging, except that instead of recording the byte offset of the tuple, it records the slot number or identifier of the tuple: it tells the system, hey, I modified the tuple with this ID, and here are the before and after values. When the database comes back, it's free to put that tuple back into the page at whatever location it finds best. Again, just a small tweak of physical logging; at a high level they're pretty similar, all right? And in practice, most database systems use this third type, physiological logging, all right? Cool.
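To put the three options side by side, here is roughly what the same update could look like under each scheme. The record layouts are invented for illustration; real formats carry more metadata.

```python
# The transaction conceptually ran:
#   UPDATE foo SET val = val + 1 WHERE id = 1;

# Physical logging: diff-style, down to the byte offset inside the page.
physical_record = {
    "txn_id": 1,
    "table": "foo", "page": 4, "offset": 128, "length": 4,
    "before": b"\x00\x00\x00\x07", "after": b"\x00\x00\x00\x08",
}

# Logical logging: just the statement; tiny to write, but expensive to
# replay and hard to order correctly under interleaved transactions.
logical_record = {
    "txn_id": 1,
    "command": "UPDATE foo SET val = val + 1 WHERE id = 1;",
}

# Physiological logging: per-page like physical logging, but addresses
# the tuple by slot id, so recovery may place it anywhere in the page.
physiological_record = {
    "txn_id": 1,
    "table": "foo", "page": 4, "slot": 2,
    "before": {"val": 7}, "after": {"val": 8},
}
```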
So I think we don't have time to get to checkpoints today; we'll move that content to the next lecture. But to wrap up with the conclusions here: in most database systems, we favor write-ahead logging, preferring runtime performance over recovery performance, because we assume that crashes are rare in practice. Checkpoints we didn't get to, so we'll go over them next class. And at recovery time, what we do with those log records is essentially this: we use the undo records from the log files to roll back the changes of uncommitted transactions, and we use the redo records to reapply the changes of committed transactions, right? After that, the database system is restored to a correct state. Okay? So next class, we'll talk about checkpoint protocols, and then we'll talk about the recovery procedure. Yeah!
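As a preview of next class, the undo/redo idea can be sketched crudely like this, reusing the hypothetical LogRecord layout from earlier (the real recovery protocol, e.g. ARIES, is considerably more involved):

```python
def install(object_id, value) -> None:
    ...   # stand-in for writing the value back into its page

def recover(log_records):
    """Crude sketch: redo committed transactions, undo everything else."""
    committed = {r.txn_id for r in log_records if r.kind == "COMMIT"}

    # Redo pass: reapply after-values of committed txns, in log order.
    for r in log_records:
        if r.kind == "UPDATE" and r.txn_id in committed:
            install(r.object_id, r.after)

    # Undo pass: restore before-values of uncommitted txns, newest first.
    for r in reversed(log_records):
        if r.kind == "UPDATE" and r.txn_id not in committed:
            install(r.object_id, r.before)
```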