So today we're going to talk about one of my favorite topics, and I'm going to get super excited and start talking very fast, so tell me to slow down, okay: transactions and concurrency control. I spent six years of my life worrying about this in grad school; I f***ing love it.

Before we get to that, the things coming up in your schedule: homework four comes out today and is due on November 12th, and project three should also go out today. What you end up implementing will be discussed a little today but mostly on Wednesday, so don't feel like you're falling behind because we didn't put it out last week. I shouldn't say this, but it should be easier than project two. Still plan accordingly; there won't be a checkpoint this time, just a final deadline. You also saw my post last night about the extra credit. A lot of you already signed up for your favorite system, and some of you selected ones I didn't realize we already had articles for, because I gave accounts to some people in industry and they filled out articles, so those systems weren't on the list of unavailable ones. I went back and corrected that, and as of this morning everyone looks like they picked something that was available.

All right, so where we're at in the semester, in terms of understanding the architecture of a database system, is that we've gone up this entire stack. We started off with the disk manager and how to organize our files and data on disk. Then the buffer pool manager: how to retrieve records or pages from disk, bring them into memory, pin them, modify them, and write them back out as needed. Above that we had our access methods, how we read and write the data stored inside the pages in our buffer pool, and then how to implement the operators in a query plan to actually do further computation on that data. Two classes ago we talked about query optimization and query planning. So now we're going to discuss two additional topics, concurrency control and recovery, and I've drawn them as boxes on the side. That's actually a bit of a misnomer, because concurrency control, logging, recovery, and checkpoints really permeate the entire system. We just wanted to go through the stack first without worrying about these more complex topics, and now we're going back to discuss how to incorporate them into our database system. The idea is that at pretty much every level of the system you need to be aware of what concurrency control scheme and what recovery scheme you're using, because you don't want one component doing things while being unaware of how the others operate. For example, we don't want to write dirty pages out to disk before we write the log record that represents that change, because otherwise we could crash and lose data. So today's class, and probably the next two weeks in the schedule, is focused on concurrency control, and then we'll have a week discussing recovery protocols.
All right, so the motivation for what we're talking about today, and really for these two high-level concepts going forward, concurrency control and recovery, can be summed up by two scenarios. Say we have an application with multiple clients connected to our database system, and two clients try to modify the same record at exactly the same time. We have a race condition, and we want to know what the correct behavior should be. Then there's a second scenario: say we want to take a hundred dollars out of my bank account and put it in your bank account, but there's a power failure while we're doing this transaction. We take the hundred dollars out of my account and then we lose power. What should happen when we come back, and what should we see? The first example is what we call a lost update, and that's what we're going to use concurrency control to protect against. The second one is a durability issue, and that's what the recovery mechanism of the database system is going to prevent.

Concurrency control and recovery are two of the most important features a database management system provides: they allow multiple clients to operate on the same database at the same time, and they make sure any changes you make to that database are persistent and survive any possible failure. When you think about it, this is an excellent example of why, if you're building an application, you don't want to f***ing write your own database from scratch. Doing this is hard. Say you're a startup trying to get your product out the door: you don't want your engineers spending their time worrying about how to do concurrency control, or how to recover your data after a crash, because these aren't differentiating factors for your application. They're things you need to have, other companies already have them, and you'd probably implement them incorrectly anyway. This is why you always want to use a database management system, whether it's an embedded application where you'd use SQLite, or a really large-scale distributed application where you'd use a distributed database system, which we'll talk about later in the semester. As we'll see as we go along, it's really hard to get these things right.

So the core concept we're going to rely on throughout the next couple of weeks is the transaction. We want our database management system to execute transactions and provide what are called ACID guarantees. Quick show of hands: who has heard of the acronym ACID? A little less than 50 percent, that's fine. This lecture will be about ACID, but we need to understand what a transaction actually is first, and then we can understand how we achieve ACID. A transaction, in the context of a database system, is a sequence of one or more operations that are invoked on a shared database, and they're meant to perform some higher-level function in your application.
So the first thing to point out is that a transaction is one or more operations. These operations could be SQL queries, or if you're using a NoSQL system they could be gets or puts; from the database system's point of view it doesn't matter, it's just something that reads or writes data in our database. A shared database means we allow multiple transactions, think multiple threads and multiple clients, connecting to the single database system and modifying the same data. And this higher-level function is a concept that has no meaning inside the database system itself; it only has meaning in your application. My example of sending money from my account to your account would be that higher-level function. The database system has no function called "send money"; that's written by you, the application developer, in your application code. But the way you achieve that higher-level function is through the low-level read and write operations that the database system sees.

So transactions are the basic unit of change in a database management system that supports concurrency. If I have a single query or a single update statement, it still runs in the context of a transaction, so I can have a single-query transaction. I suppose you could even have a zero-query transaction; it doesn't mean anything and doesn't do anything, but technically it's still valid. What we really care about is one or more things that read and write the database.

Using that example from before: say I have a gambling problem and I need to send a hundred dollars from my account to my bookie's account. The transfer of money from my account to my bookie's account is the higher-level function of the application, but the way we're going to implement it is through three steps. First, check whether I even have a hundred dollars, which I may or may not. Then, if I do, take the hundred dollars out of my account. Then put the hundred dollars into my bookie's account. Those three steps are what comprise the transaction, and we want all of these steps to occur safely and atomically, without worrying about seeing the effects of other transactions running at the same time. These are the guarantees the database management system is going to ensure for us.

A really simple way to implement a database system that supports this would be the following architecture; I'm proposing it here so we can discuss what it's doing and see if we can do better. We're going to have a single thread in our database system, and that single thread can only execute one transaction at a time, in serial order. If I have multiple clients all connected to the same system and they issue transaction requests at the same time, I just have a queue: the thread picks whatever is at the front of the queue, takes it off, runs it to completion, then goes back for the next one. We execute in a single thread, so only one transaction is running at a time.
Then, when a transaction starts running, it's going to make a copy of the entire database. Assume our database is a single file on disk: I make a copy of that entire database file, put it in another location, modify the copy, and when I'm done and I know I've made all the changes I want to make, I just flip a pointer inside the system that used to point to the old file so it now points to the new one, and that becomes the current state of the database. Then the next transaction starts and does the exact same thing: make another copy, modify it, flip the pointer.

Is this a good idea or a bad idea, for both of these two parts? She's nodding, she thinks it's a good idea. Why? For the first part, the single thread, why is this a good idea? She says it's simple, and she's not wrong. You don't do any of the B+tree latching that everyone freaked out about, because there's only a single thread; you don't do pretty much anything we're talking about here today. But what's the downside? It's slow, exactly. If I have to read something from disk and it's not in memory, not in my buffer pool manager, I have to stall while I go fetch it, and now the whole system, because I only have one thread, looks unavailable and unresponsive. What about the second part, where before I start my transaction I copy the database file, modify the copy, and when I'm done just flip the pointer? What's the benefit? She already said it: it's easy, it's simple, I don't have to worry about anything. I know no other transaction is running at the same time, I know I'm not going to see any partial writes or intermediate state of the database file; I'm the only one touching it, I copy it, I make my changes, and then everyone can see them. Of course, what's the downside? Right: if your file is ten gigs or a petabyte, this is not realistic.

So can anybody think of a system that actually does this? We've already talked about it in this class. Postgres? No, think smaller. MySQL? No, there's one other major one. SQLite, exactly. This is SQLite, or rather the old version of SQLite. When you think about what SQLite was designed for, embedded devices like cell phones or even smaller things like little IoT devices, this is fine. The top part, the single thread, is still true for SQLite. The bottom part is what older versions of SQLite used to do; now they use write-ahead logging, the way we'll talk about today and next week, although I think you can still toggle a flag or something to get the old behavior. Again, think of SQLite running inside a cell phone application: your database is maybe a couple hundred kilobytes and you really only have one thread, one app, modifying it at a time, so this is fine. But once we start scaling up to more cores, more machines, and more complex applications, this is not going to work.
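Just to make that scheme concrete, here's a toy single-threaded sketch in Python of the "copy the whole database file, modify the copy, then flip a pointer" idea described above. The file names and the run_txn callback are made up for illustration; real systems, including old SQLite, are of course more careful than this.

```python
import shutil

# Toy sketch: one worker thread, one transaction at a time, copy-on-transaction.
current_db = "database_v0.db"   # the "pointer" to the live database file

def execute_serially(txn_queue):
    """Run each queued transaction to completion, one after another."""
    global current_db
    for i, run_txn in enumerate(txn_queue):
        shadow = f"database_v{i + 1}.db"
        shutil.copyfile(current_db, shadow)   # copy the ENTIRE database file
        run_txn(shadow)                       # all changes go to the private copy
        current_db = shadow                   # "commit" = flip the pointer
```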
A potentially better approach is to allow concurrent execution of independent transactions, where multiple clients can issue requests and read and write data in the database at the same time. I say "potentially" for the exact reason she gave before: the previous approach is simple to implement, and it actually turns out that in some cases it can be faster. We'll see this later in the semester when VoltDB comes to give a guest lecture; they implement a variant of the SQLite-style approach I'm describing here, because then you don't need to do any latching in your B+tree, since only one thread operates on the data at a time. The way they get scalability is that on a single box they have multiple cores, multiple threads, each running on its own separate partition or shard of the database. But at a high level it's essentially the same idea.

We've already said why we'd want concurrency: better utilization and better throughput, because if one thread has to get something from disk it stalls, and other threads can come along and still make forward progress, so the system looks more responsive. The downside is that now that we're interleaving operations, we need to make sure the database still ends up being correct after these interleaved changes, and I'll define what correctness means in a few slides. We also obviously want fairness: we don't want a transaction that takes an hour to run to block everyone else. We want a balanced approach that lets transactions interleave their operations and make the best use of the resources.

Both of these things are going to be really hard. It's hard to ensure the correctness of transactions that interleave their operations in arbitrary ways. Say I'm trying to pay off two bookies at the same time, I only have a hundred dollars, and I owe them each a hundred dollars, or they're going to break my thumbs. We don't want the application to issue two requests that modify the database at the exact same time without verifying that I only have a hundred dollars to give; we don't want it to hand out the hundred dollars twice and make money out of thin air. We want this to execute correctly. The easiest way is to have only one thread operating in the system at a time, but that's slow; so I want to interleave operations, and I want to do it in a way where the interleaved version is not slower than the single-threaded version.

This is a good explanation of why a lot of the NoSQL systems, maybe ten years ago when they came out, said look how much faster we are, we don't do SQL, we don't do transactions: because it's really hard to make all of this work efficiently. Only now are the NewSQL systems, the ones that survived that initial flood of interest, going back and adding transactions. MongoDB added transactions about a year ago, and Cassandra has lightweight transactions now. There are a lot of applications where you don't need transactions the way we're talking about here, but I would certainly say that any time you're dealing with data you can't lose, or data you can't allow to be incorrect, you want to use transactions. There was a famous case where a Bitcoin exchange was running MongoDB, if I remember correctly.
Somebody figured out that because the exchange was built on MongoDB and wasn't using transactions, they could basically bleed all the money out of the exchange.

All right, so once we start interleaving operations in our database, a bunch of problems come up. The first is temporary inconsistency. If I take $100 out of my account and put it in your account, there's a brief moment before I put the money in your account where the money isn't anywhere, because it can't magically move from one location to the other inside our database system. From the outside world it will appear as if it moved atomically, but internally that won't always be the case. This is okay, because it's unavoidable. What we don't want is permanent inconsistency: I take the $100 out of my account, something bad happens before I put it into your account, and the $100 disappears. That second one is the bad one, and we're going to avoid it at all costs.

So obviously, if I'm transferring $100 I don't want to lose that $100, but that's just one particular example. We need a way to define what it means for an interleaving of transactions, of their operations, to be correct; we need to define what it means for the data to be in a correct state. For now, a transaction is allowed to execute one or more operations on the database, reads and writes, and we want to make sure that interleaving those operations ends up being correct when multiple transactions run at the same time. But by correctness I only mean the things the database system can actually see, like read A, write A, read B, write B; it doesn't know anything about what's in your application code. Say I have a transaction that transfers money: I take the money out of my account, put it in your account, then send you an email saying hey, you just got $100, and then when I go to commit the transaction something bad happens, like I lose power, and I need to roll back the change. The database system can roll back the change to the data inside the database, but it can't retract that email, because that's something in the external world that it has no control over. So these guarantees only cover the operations that run inside the system itself; anything else is just you writing your application incorrectly, and the system can't undo it.

Now, we're going to define our database as a fixed set of named objects: A, B, C, D, whatever. Two key points about this. First, I'm saying the database is fixed: for today's lecture, assume that if I have a hundred tuples I always have exactly those hundred tuples. We're not worrying about inserts or deletes yet, just reads and writes. Second, I'm calling them database objects without saying whether they're tables, single attributes, or single tuples. All the protocols we'll talk about over the next couple of classes work the same no matter what the granularity of the object is; you can assume they're tuples, but the same concepts work for pages, for tables, and for everything else.
Then we define our transactions as a sequence of these read and write operations: read A, write A, read B, write B. This is the only thing the database system actually sees. SQL queries, at the end of the day, get translated into these read and write operations that you perform on the access methods, on either indexes or tables.

Now, in SQL the way you start a transaction is with the BEGIN command. You can issue this from the terminal to say I'm about to do a transaction, and the database system will set up a bunch of metadata to keep track of what you're going to do. The transaction finishes with either a COMMIT or an ABORT command; I think the SQL standard says ROLLBACK instead of ABORT, and some systems support both, which I believe is what Postgres does. If you issue COMMIT, and this is actually an important point, the database system can still decide "I can't let you commit, you're going to fail" and throw back an error message, so your transaction doesn't actually commit. Just because you told the system to commit doesn't mean it truly committed until you get the acknowledgement that it committed. At that point the transaction is fully persisted, fully saved, and if you crash and come back, all your changes will still be there. For ABORT, all the changes you made since the BEGIN statement get undone, and the state of the database is put back to the way it was before the transaction ran, as if the transaction never ran at all. As I said, this abort can either be issued by you in your application, or the database system can come back and say hey, whatever you're doing, I can't let you proceed because it would put me in an incorrect state, so you have to abort and I'm rolling back all your changes. So an abort can either be self-inflicted, like you shoot yourself in the head, or, as you'll see in project three, other transactions can force you to abort because they don't like what you're doing.
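To make the BEGIN / COMMIT / ROLLBACK syntax concrete, here's a minimal sketch of the three-step transfer from earlier wrapped in an explicit transaction, using SQLite from Python. The table and column names (accounts, owner, balance) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect("bank.db", isolation_level=None)  # manage transactions manually
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS accounts (owner TEXT PRIMARY KEY, balance INTEGER)")
cur.execute("INSERT OR IGNORE INTO accounts VALUES ('andy', 1000), ('bookie', 1000)")

cur.execute("BEGIN")                             # start the transaction
try:
    # Step 1: do I even have $100?
    balance = cur.execute("SELECT balance FROM accounts WHERE owner = 'andy'").fetchone()[0]
    if balance < 100:
        raise ValueError("insufficient funds")
    # Steps 2 and 3: two writes that must take effect together or not at all.
    cur.execute("UPDATE accounts SET balance = balance - 100 WHERE owner = 'andy'")
    cur.execute("UPDATE accounts SET balance = balance + 100 WHERE owner = 'bookie'")
    cur.execute("COMMIT")                        # everything becomes visible atomically
except Exception:
    cur.execute("ROLLBACK")                      # abort: undo everything since BEGIN
    raise
```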
So now, the correctness criteria we're going to have for our transactions is defined by the acronym ACID: atomicity, consistency, isolation, and durability. The story goes that there was a German researcher who came up with this in the early 1980s, and the lore is that he was in an argument with his wife, who didn't like sweets; he said she was a bitter woman or whatever, so he named it ACID after her. Take that for what it's worth. There's also BASE, which we'll cover later with distributed systems, and which is sort of the opposite of this.

Atomicity means that all the actions of a transaction have to complete, or none of them happen; this matches what we said before about no partial transactions. Consistency is sort of a weird one, and it won't really make sense for what we're talking about today, because we're focusing on single-node databases; when we talk about distributed databases, if you've ever heard of eventually consistent systems, that's where this concept applies. It basically means that if every transaction is consistent and the database starts out consistent, then after I execute a transaction the end result should still be consistent; I'll explain what consistent means in a second. Isolation is the one we'll spend most of our time on, and it's the hardest one to get right: as we execute a transaction, we want it to have the illusion that it's executing on the database all by itself, with no other transactions running at the same time, even though there really will be. Essentially, you shouldn't see any of the changes from other transactions, and they shouldn't see yours. The last one is durability, which says that if a transaction commits inside the database system and we acknowledge internally that it committed, then no matter what happens afterwards, whether the machine crashes or the database system crashes, when we come back we should still see all the changes made by that transaction. The reason I phrase it carefully is that you can tell the database system to commit, it will commit for you and save everything to disk, but before you get the message that it committed, it could crash, and you never get the message. That's okay; we can't prevent that. You just have to come back and check whether your transaction actually committed or not. We'll talk about this more later.

The shorthand way to think about all of these: atomicity means "all or nothing", consistency means "it looks correct to me", isolation means "I'm running as if I'm by myself", and durability means "I can survive all failures". We'll go through each of these one by one, focusing mostly on atomicity and isolation for now; durability we'll cover when we talk about logging methods and checkpoints, and consistency will come up more when we talk about distributed databases.

All right, atomicity. There are two possible outcomes of executing a transaction, as we said before. Either the transaction commits and all of its operations are applied successfully to the database in the order you requested them, or it aborts, either by your own choice or because the database system forced it, and then all of its changes get reversed. The database system guarantees that transactions are atomic, meaning that from the application's point of view the transaction either executed all the operations it wanted or none of them. It makes sense: if I want to send money from me to you, I don't want the money to disappear halfway because of a crash. Again, we take a hundred dollars out of my account and put it in my bookie's account; if we take the money out of my account but there's a power failure before we put it into the bookie's account, then when we come back the database should look as if none of the operations ever happened, so the hundred dollars goes back into my account, because I wasn't able to complete the transfer to the bookie. That's the core concept of what a transaction gives you in a database system. So how do we actually achieve this? We'll cover it more when we talk about recovery, but at a high level, one way to do it is through logging.
Every single time I make a change to the database, I write a little record in a separate log file that says, for instance, I'm taking a hundred dollars out of Andy's account, and when I put the hundred dollars into my bookie's account I write another log record saying I put a hundred dollars into the bookie's account. Then when I go to commit, I make sure that log is written safely out to disk and persisted. Once I know it's durable on disk, it's safe for me to tell the outside world that I committed, because now if I crash I can come back, look in the log to see what was going on before the crash, and make sure the state of the database reflects what was in the log. There's a bit more to it, but it's a lot like the black box in an airplane, except you don't put anything back together: if an airplane crashes, they go find the black box to see what was happening right before the crash and piece the disaster back together.

Logging is used by pretty much every modern system, and not just databases: file systems and other distributed systems are all doing some type of logging at a high level. One benefit is that it gives you an audit trail, so you can always go back and see what was going on in your system at different times; if you're a financial firm that needs to keep track of every financial transaction for the last seven years, you can use the log to do that. From an implementation standpoint it also makes writing data to disk more efficient, because now I can do sequential writes to disk for the log records and have my buffer pool manager flush dirty pages asynchronously in the background, which would otherwise be random I/O.

Another approach is a lot like the SQLite example I showed at the beginning, and it actually has a name: shadow paging. Typically the way it works is that you don't copy the entire file, you copy just the pages of the file being modified; these are called the shadow copies. Your transactions apply their changes to the shadow copies, and when a transaction commits, you flip some pointers inside a directory to say the latest version of this page is now the one I just modified. This is actually the original idea IBM came up with in the 1970s with System R: they did shadow paging, and it turned out to be a bad idea, difficult to implement, with performance that was not good compared to write-ahead logging, so they abandoned it when they switched over to building DB2 in the early 1980s. As far as I know, the only two systems that still do something that looks like the shadow paging IBM did are CouchDB, which is a NoSQL document store, and LMDB, which is the embedded storage engine for OpenLDAP. I actually tried using shadow paging in a newer project to implement database engines on the new non-volatile memory storage devices from Intel; I had this idea that shadow paging from the 1970s would work really well on Intel's latest devices in the 2010s, and it turned out not to be true, logging was always faster. We'll explain both of these methods in more detail later on.
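Here's a minimal sketch of the write-ahead idea in Python, assuming a made-up, human-readable record format and file name. The only point being illustrated is the ordering: the log record describing the change is forced to disk before the commit is acknowledged, and the dirty data pages can be flushed lazily afterwards.

```python
import os

log = open("wal.log", "ab")   # append-only log file (sequential writes)

def log_write(txn_id, obj, old_val, new_val):
    # Record the change BEFORE the modified page is allowed to reach disk.
    log.write(f"{txn_id} WRITE {obj} {old_val} -> {new_val}\n".encode())

def log_commit(txn_id):
    log.write(f"{txn_id} COMMIT\n".encode())
    log.flush()
    os.fsync(log.fileno())    # only now is it safe to tell the client "committed";
                              # the buffer pool can flush dirty pages in the background
```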
The main point is that you need something like this to ensure atomicity. Think about it: with the buffer pool, I might write to some pages and have them written out to disk because they got evicted, but the transaction hasn't committed yet; then you crash, come back, and fetch in those pages, and now you have changes from a transaction that never actually committed, torn updates from partial transactions, and we can't allow that. These methods are what handle that for us.

All right, the next one is consistency. The basic way to think about it is that the database is meant to represent something in the real world. Think of the Amazon storefront: that's mostly just a model of what a real brick-and-mortar store would look like, with items, customers that make purchases, things like that. The idea is that if our database is modeling the world in a logically correct manner, then any changes we make to that database should always leave us in a logically correct state, and any questions we ask about that data should get logically correct answers. Note that we're now talking about logical correctness, which is different from when we talked about latching: there we cared about the physical correctness of the data structures, whereas here we're talking about higher-level concepts that you'd define with things like referential or integrity constraints.

There are two types of consistency to worry about: database consistency and transaction consistency. Database consistency is what I was describing: if we're trying to model something in the real world, we have integrity constraints that we enforce to make sure the data inside the database is always correct, so any change we make keeps the database correct. If I make a purchase, a transaction records that purchase in the database, and I come back tomorrow, I should be able to see my purchase information, because future transactions can see the effects of previous ones. You may think this is kind of obvious, of course I should be able to see it the next day, but now think about it on a smaller scale: if I make a change, my transaction commits, and I come back the next millisecond, I should be able to see that change. And this is where the distributed database stuff starts to come into play, because now if I make a change on one node and a millisecond later I read from another node, should I see that change? If we're only talking about a single node it doesn't matter much; in a distributed database it matters a lot, and again we'll cover that later in the semester.

Transaction consistency is this sort of wishy-washy thing. It does mean something, but what it means is that if my database is consistent and my transaction is consistent, then after I run my transaction the database should still be consistent, and we have no control over that, because it's up to the application programmer. They can write code that we have no control over and that puts the database into an inconsistent state, and we can't stop them from doing that. So for example, say I have a list of people and their email addresses.
In the real world you can't have an email address without an @ sign, but I can write crappy code that inserts a record with an email address that has no @ sign, and now our database is inconsistent. We can't stop that, because it all happens in the application code; the database system will do whatever you tell it to do, so if you tell it to insert bad data, it inserts bad data. That's why I say we don't really care about this one: from the database system's point of view we can't control it, because it would require understanding what the application really wants to do, and that's impossible; a human has to make that value judgment. So we care about database consistency much more than transaction consistency.

Okay, isolation. This is what we'll spend most of our time on. The idea behind the isolation guarantee is that our application will submit multiple transactions simultaneously, because it has multiple threads and multiple user requests, and we want to execute them at the same time. But we want users to write their code as if their transaction is the only one running in the system at that moment, as if it has exclusive access to the shared database, even though it really doesn't, because we've already said we're going to interleave things. So how do we achieve this? We've dealt with this before, earlier in the semester, when we talked about modifying an index with different threads at the same time: we use a concurrency control protocol. The protocol allows us to interleave operations at the same time on a shared object, in this case a shared database, and still have the database end up in a correct state. From the point of view of concurrency control for transactions, it's logical correctness we care about: we assume that underneath the covers we're already using latches to protect our data structures and our buffer pool manager, so everything is physically correct, and now we care about making sure the data we put into those structures is logically correct, that I don't lose money when transferring it from one account to another.

After this class we'll have two lectures on different concurrency control protocols you can use for transactions, but at a high level there are essentially two categories. There are pessimistic approaches, where you assume transactions are going to interfere with each other, and therefore you make them ask for permission before they're allowed to do something. What does that sound like? It's like what we talked about with indexes and latch crabbing: before I was allowed to traverse to the next node, I had to acquire the latch for it. Same thing here: before I'm allowed to update a tuple, I have to acquire a lock for it. Then there are optimistic approaches, where you assume transactions are not going to conflict, and you just let them do whatever they want; underneath the covers we still make sure the data structures are physically sound, and only when we think there might be a conflict do we go back and rectify things. This is a lot like the optimistic latch coupling, where I assume I can make it all the way down to the leaf node taking only read latches without any problems, and if I get it wrong, if it turns out I needed an exclusive write latch at some point, I just undo what I did, come back, and do it again, this time taking write latches as needed. Same kind of thing: we assume transactions aren't going to interfere, and later on we check whether that was actually true. We'll talk about two-phase locking on Wednesday, which is what you're implementing in project three, and that's a pessimistic concurrency control protocol; after that we'll talk about timestamp ordering approaches, and those are optimistic.
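Just to give a crude flavor of the pessimistic "ask permission first" idea before Wednesday's lecture, here's a toy Python sketch with made-up names. This is not two-phase locking; it only shows a transaction acquiring a lock on each object before it's allowed to touch it.

```python
import threading

accounts = {"A": 1000, "B": 1000}
locks = {obj: threading.Lock() for obj in accounts}   # one lock per database object

def transfer(src, dst, amount):
    # Pessimistic: grab the locks up front (in a fixed order, to dodge deadlock)
    # so no other transaction can interleave its reads/writes on these objects.
    for obj in sorted({src, dst}):
        locks[obj].acquire()
    try:
        if accounts[src] >= amount:
            accounts[src] -= amount
            accounts[dst] += amount
    finally:
        for obj in sorted({src, dst}, reverse=True):
            locks[obj].release()
```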
All right, so let's take our example of moving money out of my account into somebody else's account, and mix it with another transaction running at the same time. T1 does the money transfer: take $100 out of A and put $100 into B. T2 computes 6% interest on all the accounts and applies that change. Assuming both accounts A and B start with a thousand dollars, what are the different possible outcomes for an arbitrary interleaving of these transactions? The number of possible interleavings is large: maybe the first query of T1 runs, then the first query of T2, and so on; we can interleave them however we want. The thing we care about, to know whether an interleaving is correct, is the final outcome: the total amount of money in the bank has to be $2,120. We start with $2,000, a thousand in A and a thousand in B; transferring money between them still leaves $2,000; and then computing 6% interest on that $2,000 should leave us with $2,120.

Now, this is a really important concept to understand about correctness for database transactions. The database is not going to guarantee that if you issue T1 first and then issue T2, it will actually execute T1 before T2. This is different from how you might think about parallel programming, or bulk synchronous parallelism in some machine learning programs. The database system is allowed to interleave them any way it wants; it could have T2 go first even though T1 was issued first, and that's still considered correct. The only thing we care about is that the end state of the database is equivalent to one where the two transactions executed in some serial order, either T1 followed by T2 or T2 followed by T1, because either way the final sum of the two accounts is $2,120. If I execute T1 first I get one pair of values, and if I execute T2 first I get a different pair, but add them up and you get $2,120 in both cases. If you actually cared about T1 executing before T2, then in the model we're talking about here you would run T1, wait for it to finish, and then run T2. There are systems that provide what is called external consistency, or strict serializability, which we're not going to cover here.
Those systems will guarantee that if you issue T1 first and then T2, they will execute in that order, but that's much stricter than what we care about here. The only system I know of that actually guarantees it is Google Spanner, with F1, and they need it for their ads business; I don't think CockroachDB or TiDB does that.

We can visualize this in terms of schedules. There are two columns, one for T1 and one for T2, and time moves forward as you go down: T1 starts with BEGIN, does its operations, then commits; then T2 starts, does its operations, then commits. The end result of these two serial schedules, even though A and B end up with different values in each, is that if you add the two amounts together you get $2,120, so from the database system's perspective both are considered correct even though A and B differ.

Why do we do this? We already said it: we want to mask the slowness of physical resources we have to deal with, like reading data from disk or going over the network. That's certainly what they cared about in the 1970s, but in modern systems we have a lot of cores and most of our data fits in memory, so we want those cores running at the same time with their operations interleaved, to take advantage of all the additional hardware Intel is giving us; clock speeds aren't getting faster, we're just getting more and more cores, and if one transaction stalls, another one can keep going.

So now let's start interleaving operations. Assume we're on a system with only one worker thread, so multiple transactions can be in flight simultaneously but there's only one program counter and we can only do one operation at a time. T1 starts and does the deduction on A; then there's a context switch; T2 starts and applies the interest to A; then we switch back and T1 updates B; switch again and T2 applies the interest to B; and then they commit. This is equivalent to a schedule where the transactions execute in serial order, even though they were interleaved, because all I care about at the end of the day is that the sum of the two values at the bottom is the same. The intuition for why this works, which should become obvious as we go along, is that I always made sure that whatever operation T1 wanted to do on an object occurred before the corresponding operation on the same object in T2: I always took the money out of A, or put the money into B, before the interest was computed on it. If you don't do that, you end up with something like this: take the money out of A, then compute interest on A, then compute interest on B, and only then add the money to B. Now the sum doesn't equal $2,120, and the bank is missing $6.
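Here's a quick sanity check of those numbers in plain Python, assuming the 6% interest rate from the example: both serial orders leave $2,120 in the bank, while the bad interleaving comes up $6 short.

```python
def serial_t1_then_t2(a, b):
    a, b = a - 100, b + 100          # T1: move $100 from A to B
    return a * 1.06, b * 1.06        # T2: then 6% interest on both

def serial_t2_then_t1(a, b):
    a, b = a * 1.06, b * 1.06        # T2: 6% interest on both
    return a - 100, b + 100          # T1: then move $100 from A to B

def bad_interleaving(a, b):
    a = a - 100                      # T1: debit A
    a = a * 1.06                     # T2: interest on A
    b = b * 1.06                     # T2: interest on B (T1's deposit isn't there yet!)
    b = b + 100                      # T1: credit B
    return a, b

print(sum(serial_t1_then_t2(1000, 1000)))   # 2120.0
print(sum(serial_t2_then_t1(1000, 1000)))   # 2120.0
print(sum(bad_interleaving(1000, 1000)))    # 2114.0 -- the bank is short $6
```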
You may think, all right, this isn't that big of a deal, the bank lost a little money, but you'd be pissed off if you weren't given the interest you thought you were owed. Now it's $100; think of a billion dollars. The reason this is tricky is that the database system doesn't see "A = A - 100" or "B = B + 100". It doesn't see any of the arithmetic; it just sees reads and writes, so it can't infer anything about what your application is trying to do in order to make decisions about how to interleave things. That "A = A - 100" is really just a read on A followed by a write on A; the system knows nothing more than that.

So we can see intuitively, at a high level, why one schedule seems correct and another doesn't, but we need a more formal way to judge this. We're going to say a schedule is correct if it's equivalent to a serial execution schedule. A serial schedule is the obvious one: the transactions don't interleave at all, you execute one, then another, then another. And we'll say one schedule is equivalent to another schedule if, for any possible database state we could ever have, for any possible values of A and B in our example, the state of the database after executing the first schedule is identical to the state after executing the second schedule. It doesn't matter what operations the transactions are doing internally, we only see the reads and writes; as long as the resulting states are absolutely identical, every value exactly the same, the schedules are equivalent. Building on that, a schedule is serializable if, no matter how its operations are interleaved, it's equivalent to some serial execution of those transactions. Going back to the consistency discussion from before: if the state of the database produced by executing the transactions in the interleaved schedule is identical to the state produced by some serial schedule, then the interleaved schedule is serializable. Is this clear? Lots of heads nodding, good.

This is what I was saying before: the notion of correctness in a database system is slightly different from how you might think of it in a regular programming language. The reason we're not going to say "you issued this transaction first, therefore we execute it first" is that this flexibility gives the database system more options for deciding how to interleave the operations of your transactions in a way that produces more parallelism. If I have to execute things in the order they were submitted, that's essentially the same as a single serial thread executing one transaction after another; but if I can interleave them without matching exactly the order you submitted them, that opens up more opportunities to mix things up and get better parallelism. Yes? He asks, wouldn't external consistency be easier on a single node? No, and give me a second: I'll show you an example where, even though you issue one transaction first, the final outcome is as if it ran in a different order.
So now we need a more formal definition of what it means for two schedules to be equivalent. We started at a high level with what it means for a schedule to be serializable, but we haven't pinned down what "equivalent" actually means. The way we're going to define equivalence is based on conflicting operations. We'll say that two operations conflict in a schedule if they come from different transactions, they operate on the same object, and at least one of them is a write. Put another way, there are different types of anomalies that can occur if we interleave our transactions incorrectly: we can have a read-write conflict, a write-read conflict, and a write-write conflict. Why no read-read conflict? In the back: right, there's no issue, who cares if someone reads the same object as you, there's no conflict. We only care when at least one of the operations is a write. Let's go through these one by one.

A read-write conflict is also sometimes called an unrepeatable read. The idea is that if I read the same object twice in my transaction and I get back different values each time, that's an anomaly that should not happen if I were truly executing in isolation from all other transactions. T1 does a read on A, pauses, then reads A again; T2 reads A and then writes A. So T1 starts, reads A, and gets back ten dollars; then there's a context switch; T2 starts, reads the ten dollars from A, adds nine dollars, and writes nineteen dollars back into A; then T2 commits, there's a context switch back to T1, and now when T1 reads A again it gets nineteen. That's not the same thing it read before: an unrepeatable read, I'm not able to read the same object twice and see the same value. This is easy to understand for a single object; it gets really hard when you have ranges, but we can ignore that for now.

The next type of conflict is a write-read conflict. This is basically allowing a transaction to read uncommitted data, exposing information about an inconsistent state of the database to the outside world; it's sometimes called a dirty read. T1 reads A, gets back ten dollars, and writes twelve dollars back into it; then there's a context switch; T2 starts, reads A, sees the twelve dollars that was written by T1, then writes fourteen dollars back, ignoring for the moment that it's overwriting what T1 wrote. At this point T2 commits and tells the outside world, hey, I read twelve and I wrote fourteen. But now when we go back to T1, T1 ends up aborting, so we need to roll back its write on A. The problem is that T2 already read that value, was allowed to commit, and told the outside world about a value of A that was never actually committed. That's a read of dirty, uncommitted data.

The last one is a write-write conflict. Here T1 and T2 each write to both A and B. T1 starts and writes ten dollars into A; then T2 starts, writes nineteen dollars into A, writes "Andy" into object B, and commits.
But then T1 picks back up, writes "Justin Bieber" into B, and commits. So what's the issue here? Yes: he says you end up with a mixed pair of values. To put it another way, the state of the database now contains updates from T1 and updates from T2, and our transactions should never let that happen, because they're supposed to execute atomically and in isolation from each other. The result should be either ten dollars and Bieber, or nineteen dollars and Andy; but at this point we have nineteen dollars and Justin Bieber, and that should never happen.

So given these conflicts, we can now pin down what it actually means for a schedule to be serializable, and what we'll go through next is how to take a given arbitrary schedule and determine whether it is serializable or not, true or false. The thing that always trips up students in this part of the lecture: we're assuming our schedules are static and given to us ahead of time. This is not about how, when five transactions come in, we generate a serializable schedule; I'm giving you the schedule ahead of time and we're just deciding whether it's correct, whether it's serializable. What we'll discuss on Wednesday is how to work in a dynamic environment, where transactions show up at arbitrary times and you may not know ahead of time exactly what they're going to do, and how to generate a serializable schedule there. For now we're only worried about judging the correctness of a schedule when we're given everything up front, which is usually not realistic; there are some systems that actually work this way, but most don't.

Now, where things get a little tricky is that there are different notions of serializability: conflict serializability, and a weaker, broader notion called view serializability. Most systems that support serializable execution of transactions do conflict serializability, even if they don't call it that. As far as I know, no system actually supports view serializability, because as we'll see it requires understanding what the application is doing in order to decide whether something is correct, and no system can do that because it would require program analysis.

We say two schedules are conflict equivalent if they involve the same actions of the same transactions, and every pair of conflicting operations is ordered the same way in both. A schedule is conflict serializable if it is conflict equivalent to some serial schedule. That sounds like a tautology, so what does it actually mean? It means that if we can transform a schedule into a serial schedule by swapping the order of operations from different transactions that don't conflict, then the schedule is conflict serializable. More blank faces, okay, so let's look at this visually. Say we have two transactions, T1 and T2, and both of them do a read on A, a write on A, a read on B, and a write on B. What we're going to do is take all the operations in T1 and try to push them up to the top, and all the operations in T2 and push them down to the bottom, so that we end up with a serial schedule.
Sorry, to be precise: we can swap any two operations, as long as they don't conflict. So in this case, the write on A in T2 and the read on B in T1 are operating on different objects in the database, so we can swap their order: the read on B goes up, the write on A goes down. Same thing with the read on A and the read on B: these don't conflict because they're both reads, so we swap their order. One more time: the write on A against the read on B and the write on B, different objects, so we swap again, and we're done. If two operations don't conflict, we can swap their order, and now we have something equivalent to a serial schedule. So we can say that our original schedule back here is conflict serializable, because it's conflict equivalent to this serial ordering; all we did was swap the order of operations.

Let's look at another one. Here the writes on A in the two transactions conflict with the other transaction's operations on A, so we can't swap their order; therefore this schedule is not conflict equivalent to any serial ordering, and it's not conflict serializable.

Swapping operations is easy when there are only two transactions in the schedule, but when there are more transactions it becomes a pain, so we'd like a better way to do this. The answer is to use what's called a dependency graph. The idea is that you have a node in the graph for each transaction, and you add an edge from transaction Ti to Tj if and only if there's an operation in Ti that conflicts with an operation in Tj and Ti's operation appears earlier in the schedule, because that means you would not be able to swap them. Whenever you find two operations whose order you can't swap, you add an edge from the transaction with the earlier operation to the other one. I think the textbook calls this a precedence graph. It's going to look a lot like the waits-for graph we'll talk about next class, which is part of what you have to implement in project three, but the waits-for graph is about waiting for locks; this one is just about the dependencies between transactions. So we take our schedule, generate one of these graphs, and if there's no cycle, then we know the schedule is conflict serializable.

Let's go through an example, again with read on A, write on A, read on B, write on B. First we have the write on A in T1 and then the read on A in T2. That's a conflict, we can't swap their order, so we add an edge in the graph from T1 to T2, because the write on A appears earlier in the schedule than the read on A in T2, and we annotate the edge with the object that caused the conflict. Then later on we have the write on B in T2 and the read on B in T1, so we add another edge going in the other direction.
Where this is really helpful is when you have more than two transactions: you can do the same thing, and it's much easier to identify what's going on. So here we have three transactions. T1 is doing a read on A, write on A, read on B, write on B; T3 is doing a read on A and a write on A; and T2 is doing a read on B and a write on B. The first conflict is the write on B in T2 and the read on B in T1, so we have an edge from T2 to T1 annotated with B. Then the write on A in T1 and the read on A in T3 give us an edge from T1 to T3. And then the question is: is this equivalent to a serial execution? The answer is yes, because there are no other conflicting operations that we care about; I guess we would have another edge for the read on A and the write on A here, but that's the same T1-to-T3 edge we already have. So with just this, the schedule is equivalent to a serial order where T2 is first, followed by T1, followed by T3. This also answers the question he asked me before: wouldn't it always be better to just execute things in the order that they arrive? In this case it turns out no. T1 arrived first but ends up sort of finishing last, and when we actually look at the state of the database, it's equivalent to T1 being executed in the middle if we ran these in serial order. So is it clear what's going on? It's pretty straightforward: you just add an edge from one node to another if there's an operation in the first transaction that conflicts with an operation in the second and appears earlier in the schedule. All right, so let's look at a more complex example. Here I'm actually introducing some program logic into the transactions. Again, as I said before, the database doesn't see any of this; it only sees reads and writes. So even though I have A = A - 10, or I'm declaring variables to compute the sum of something, the database doesn't see any of that; it's all done in the program logic of the application. The other thing I'm introducing is this echo. This is not a real thing, it's just to say I'm going to print out the sum, return the sum to the application. So when we start looking at the conflicts from the top to the bottom: we have a write on A and then a read on A, so we have an edge from T1 to T2 on A, and then later on we have a read on B and then a write on B, so we have an edge going in the other direction, because this is a read-write conflict. And we said that if there's a cycle in our dependency graph then the schedule can't be conflict serializable. My question to you guys, though, is this: is there a way to modify the application code, keeping the exact same read and write operations in the same order and just changing what the pseudocode does with them, to compute something different that will always still be correct, even though this schedule is not conflict serializable? So what is this doing? This guy over here is just computing the sum: it reads A and puts it into a variable, then reads B and adds the value of B into the sum, and then prints it out. So this is like the interest-computation example I showed before, where I need the exact values of A and B for the sum to actually work out to be correct.
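Just to see why that sum comes out wrong, here is a small simulation of an interleaving that's consistent with the two edges described above: the sum transaction reads A after the debit but reads B before the credit. The starting balances and the transfer amount are made up for illustration.

```python
accounts = {"A": 1000, "B": 1000}   # the true total is always 2000

# T1 transfers 10 from A to B; T2 computes the sum; the two are interleaved.
a = accounts["A"]             # T1: R(A)
accounts["A"] = a - 10        # T1: W(A)

total = accounts["A"]         # T2: R(A), sees the debit
total += accounts["B"]        # T2: R(B), does not see the credit yet

b = accounts["B"]             # T1: R(B)
accounts["B"] = b + 10        # T1: W(B)

print(total)                  # 1990: the echoed sum is short by 10
```

The database only saw reads and writes here; whether that 1990 is acceptable is entirely a question of what the application is computing, which is exactly where the next idea comes in.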
But maybe I don't care about exact values. Maybe I just care about counting the number of accounts that are greater than or equal to zero, and assume an account can never go negative. So now if I take the value of A, and if that value is greater than or equal to zero I add one to my counter, and then I do the same thing for B, it doesn't matter that I'm counting these things in between the transfer going on over here; I'm still always going to get the correct answer. This is an example of what I was saying before: there may be some schedules that won't be conflict serializable as we define it with the rigid dependency graph, but there may be times where, if we actually knew what the application was doing, we would be okay with it, and it would allow us to actually execute things in parallel this way. So this is another notion of serializability, called view serializability. It's a weaker notion than conflict serializability, and the way to think about it is this: if a transaction in one schedule reads an object and gets the initial value for it, then in the other schedule it gets the same initial value, and after whatever computation and writes happen on it I end up with the exact same high-level state of the database, then that's okay. I'm being very hand-wavy here on purpose, but I think if I show the next example it'll make more sense. Say I execute three transactions: T1 is going to read A and write A, and then T2 and T3 are going to do what I call blind writes on A, where they don't actually read the value beforehand, they just overwrite whatever's in it. It doesn't matter what's there, they just put a new value in. Now if I do my dependency-graph evaluation to check whether this is conflict serializable, we would see that we have a bunch of conflicting operations and a lot of edges: the read on A and the write on A, the write on A and the next write on A, and so forth for all these different operations. So is this conflict serializable? No, because we have a cycle between T1 and T2. But when you actually squint at what these transactions are doing, it doesn't matter that it isn't conflict serializable, because what do I care about? There's only one object A, and the only thing I care about at the end of this schedule is that whatever value T3 put into it is the final value of that object. T3 ends up overwriting the write by T2 and the write by T1 and just replacing whatever's in there with a new value. So under conflict serializability you can't allow this schedule, because of the cycle in the dependency graph, but it's actually equivalent to this serial schedule, T1 followed by T2 followed by T3; it's view equivalent, because again the only thing I care about is what T3 did, the final write. So view serializability encompasses all the schedules that are conflict serializable, plus these additional ones that support what are called blind writes.
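To make that hand-wavy definition a little more precise, here is a sketch of the usual textbook test for view equivalence: the same reads of initial values, the same reads-from relationships, and the same final write on each object. The schedule encoding is the same made-up tuple format as before, and the interleaving of the blind-write example is my reconstruction (T2's write landing between T1's read and write is what creates the cycle). No real system runs a check like this; it's only to show why the two schedules count as equivalent.

```python
def view_facts(schedule):
    """Collect what matters for view equivalence: for every read, which
    transaction wrote the value it sees (None means the initial database
    value), plus the final writer of each object."""
    last_writer = {}     # object -> transaction holding the most recent write
    reads_from = set()   # (reader, object, writer or None)
    for txn, action, obj in schedule:
        if action == "R":
            reads_from.add((txn, obj, last_writer.get(obj)))
        else:  # "W"
            last_writer[obj] = txn
    return reads_from, set(last_writer.items())

def view_equivalent(s1, s2):
    return view_facts(s1) == view_facts(s2)

# Blind-write example: T2 and T3 write A without reading it first.
interleaved = [("T1","R","A"), ("T2","W","A"), ("T1","W","A"), ("T3","W","A")]
serial      = [("T1","R","A"), ("T1","W","A"), ("T2","W","A"), ("T3","W","A")]
print(view_equivalent(interleaved, serial))   # True: T1 still reads the initial
                                              # value and T3 still writes last, even
                                              # though the interleaved schedule's
                                              # precedence graph has a cycle
```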
View serializability also covers the other corner cases I showed before, where your application may not actually care exactly what the final output is as long as it's correct, and I'm defining correctness in terms of what the application cares about, not what the database system sees. So as I said, view serializability allows slightly more schedules than conflict serializability, but no system actually does this, because you can't enforce it efficiently: it requires you to understand exactly what the application wants to do with the data. Furthermore, there are other schedules, not captured by view serializability or conflict serializability, that are technically still serializable, but because we don't understand what's going on in the application, we can't support them. So a spoiler: when we talk about two-phase locking next class, that's going to be conflict serializable, and when we talk about timestamp ordering protocols, those are going to be conflict serializable as well. They'll be more strict, and there will be cases where you could actually have been more parallel, and you end up aborting transactions when maybe you didn't really need to. But again, everyone does conflict serializability because you can actually enforce it efficiently; anything where you want more concurrency requires special-casing things in your application, which is hard to do. The way to think about this visually is that this whole region here is all possible schedules you could have for a set of transactions and their operations. A small portion of that is just the serial schedules, those are contained in the set of all conflict serializable schedules, and then a larger region includes all the view serializable schedules. We'll see this diagram again in future lectures, because there are going to be things that span in different directions and maybe include schedules that are not view serializable, or some but not the others, but we'll cover that next class. All right, so any questions about serializability, conflict serializability, view serializability? Yes? So the statement is, for this example here, when I'm back here and I'm reading A, I don't actually know whether this other transaction is going to commit or not, because at this point in time, going forward, I don't know anything about what it's going to do. So your question is, should you be allowed to do this? And the follow-up is, if T1 aborts, should T2 also be aborted? Yes. What would happen here, and we'll cover this next class, is what's called a cascading abort. When I actually go to commit here, this is what I was saying before: I can tell the system I want to commit, but I haven't really committed until it comes back and says yes, you committed. So this transaction will call commit, but the system will recognize, hey, you read data from transaction T1, you read object A and T1 wrote to it, but I don't know whether T1 has committed yet. So this transaction is actually going to pause, stall, and wait until it finds out whether T1 successfully commits or not, and in this case here, when we get the abort, that will cause a cascading abort: it causes this transaction to get aborted as well.
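Just to make the bookkeeping behind that answer concrete, here is a toy sketch of a cascading abort, assuming we track, for each transaction, which uncommitted transactions it has read from. This is only for intuition; how the real system prevents or handles this is exactly what the two-phase locking discussion next class is about.

```python
def cascade_abort(txn, reads_from, aborted):
    """Abort `txn` and, recursively, every transaction that read its uncommitted data."""
    aborted.add(txn)
    for reader, sources in reads_from.items():
        if txn in sources and reader not in aborted:
            cascade_abort(reader, reads_from, aborted)

# Hypothetical dependencies: T2 read A from uncommitted T1,
# and T3 read B from uncommitted T2.
reads_from = {"T2": {"T1"}, "T3": {"T2"}}
aborted = set()
cascade_abort("T1", reads_from, aborted)
print(aborted)   # {'T1', 'T2', 'T3'}: aborting T1 drags T2, and then T3, down with it
```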
Is this an optimistic approach? This has nothing to do with pessimistic versus optimistic; we're not talking about concurrency control protocols at all yet. This is just saying, if you have interleavings of transactions, what are the problems that can occur? Okay, any other questions? You guys understand it all so clearly, right? All right, so now to finish up real quickly. For durability, we've already discussed some of these key ideas before. The thing we're going to care about, again, is that if our database system crashes, there's a power loss, there's a software bug, the hard drive dies, something bad happens, we want to make sure that any transaction we told the outside world committed, where you got back the acknowledgement that it committed, stays committed. That's sort of like a promise from the database system that no matter what happens, I will make sure your changes are persistent. Someone may come along later and overwrite them with new values, and that's okay, because that's controlled by the application, but no matter what, I will make sure your changes are persistent. That's the durability mechanism we care about. Again, logging and shadow paging are ways to do this, because we can write things durably to disk. Your buffer pool manager is ephemeral: if I pull the plug, you lose everything, so we have to persist things to disk and make sure we can always come back and see our changes. We will cover this in two weeks when we talk about the logging protocols, but at a high level these are all intertwined: you don't want to write out data from transactions that have not committed yet and then come back and see only some of their changes. Everything has to be all or nothing, no matter what happens to the system. To summarize what we've talked about so far today: atomicity, consistency, isolation, durability. As you can see, isolation is really the important one, because we want transactions to execute as if they're running by themselves, without being interfered with by other transactions. Now, a quick aside. The NoSQL guys basically said, we don't care about transactions, and we're going to get better performance because we don't have to do all the extra checks, the ones we'll cover on Wednesday, to prevent transactions from interfering with each other. It turns out that if you do that, it makes it harder to write your application, because all the hard logic, the reasoning about whether I can read this or write that, has to go into your application, and then you have these one-off JavaScript programmers trying to figure out what the hell they're actually doing. Whereas if it's inside the database system, written by highly paid people who in theory know what they're doing, they can do a much better job than the average programmer at making sure these things run correctly and provide those ACID guarantees. And Google was one of the first companies really pushing the NoSQL movement in the early 2000s; around 2004 or 2005 they came out with BigTable and said, hey, we don't do transactions, we get better performance. But if you go read the Spanner paper, it's right there on the first page when that came out: it turns out it's better to have some really smart
people like Jeff Dean figure out how to make transactions work efficiently in the database system, and then provide that transaction abstraction to all the grunt programmers, rather than everybody else trying to figure out how to deal with incorrect data or the funky corner cases that we talked about today. So it's better to have really smart people who know what they're doing make your database system run transactions efficiently and provide that abstraction to everyone else, and then nobody else has to worry about how to do transactions. You can write your application code assuming you have exclusive access to the database, and that makes your life easier and makes you more productive. Think of it like Java: you don't have to worry about managing memory, it does it for you. It may not be as efficient, but it makes people more productive, and Python is another example. Does that make sense? All right, so, as I said, concurrency control protocols and recovery mechanisms are among the most important functions a database system will provide you. You don't want to write this yourself; you're probably going to get it wrong. And the thing we'll see next class, if it isn't clear already, is that the concurrency control mechanism is completely automatic. You're going to call begin and then write your SQL queries; you're not going to say lock this, lock that. You can in some cases, some systems let you provide hints for things, but most programs don't have to do this. It's all done for you underneath the covers, and again that makes you more productive, because you don't have to worry about these things. You just say, here are the queries I want to run; run them any way you want, but make sure they modify the database in a way that ends up being equivalent to a serial ordering. Okay, so next class we'll talk about two-phase locking. Homework four will go out today, and project three hopefully will be posted today with an update to the source code. We'll also talk about isolation levels on Wednesday, and this is actually kind of important, because I spent the entire lecture talking about how great serializability is, but in practice most people don't actually use it, and we'll talk about what the isolation levels are and how you can ask for something weaker than serializability. Okay, all right guys, hope everyone's doing okay, and I'll see you on Wednesday.