So today's class, we're going down the stack of the system and we'll start talking about how to actually store things. The first part of the semester was mostly about indexes and concurrency control. Now we're going down into the system and asking, how are we actually going to store data? But before we jump into this, there were two major announcements in the world of databases last week. I'm curious if any of you saw them and knew what they were. Yes? Right, so that was the first one — I had them out of order. The first one was Altibase, one of the first in-memory databases, out of Korea in the 1990s. They announced that they're now open source; everything's on GitHub. I spent some time looking at the source code. It's not very easy to read because all the directories have three-character names. So that's reminiscent of the old days, I suppose. And the second thing, as he said — my clicker's not working — is that MongoDB announced that they now support transactions. I've known about this for a while; they were talking about it in the summer when I went to visit them. The reason this is a big deal is that MongoDB was the vanguard, the stalwart of the NoSQL movement, which basically said: we don't care about SQL, we don't care about the relational model, we don't care about transactions. And pretty much every NoSQL system that people use widely has since added support for SQL and something that looks like the relational model. For MongoDB to add transactions is, to me, the last nail in the coffin of this idea that we don't need to do anything traditional databases have been doing for years. Now, some things the NoSQL guys did I think they got right, and we can borrow those ideas in our own systems. But not supporting transactions when you want to support updates was, in my opinion, a mistake, and you can see that MongoDB has now added them. They are aware of the public perception that MongoDB puts out features and fixes the bugs later, and they're aware of the reputation they've built up from issues in the past. So they assured me that in adding transactions now, they're going to spend the time to make sure it's done correctly before they put it out into the real world. Cassandra put out what they call lightweight transactions a few years ago — it's basically just compare-and-swap — and that turned out to have a bug where people could hit a corner case and get incorrect results. So: Altibase is open source, MongoDB is adding transactions, let's keep going. Okay, so for today's lecture, we're going to talk about, at different levels, how we're actually going to store data. We'll start off talking about how we want to represent the types in our database. Then we'll talk about how we're going to lay things out in memory. Then we'll get into the stuff you read about: how we represent tables in memory and what different storage models we want to use. And then we'll finish off talking about the catalog. All right, I'm going to keep clicking this unless I plug in the receiver. Let's see if that works now. No? Awesome. Awesome, okay. So I think we showed this picture at the beginning of the semester.
This is the basic overview of how an in-memory database is architected. We have our in-memory index, we do a lookup on some key, and we end up with a block ID and an offset. The block ID points to some fixed-length data block, and the offset tells us where to jump within it to find our tuple. And then any attribute in our tuple that is larger than 64 bits is stored as a pointer to some location in a variable-length pool where the actual data lives. At a high level, that's what we're doing. So today we're talking about how we actually want to represent these two things.

The way to think about the in-memory database is that it's essentially just a large byte buffer, a large byte array, and we're going to write code in our database system that allows us to interpret those bytes as the values of our tuples, the values in our database. So essentially what happens is you jump to some offset, then you look at the schema for your table, which is stored in the catalog, and it tells you how to interpret those bytes as the type you expect to be there. So if I jump to some offset and I know I'm at the fourth tuple, and this attribute of the fourth tuple is a 32-bit integer, I know how to interpret those bytes as a 32-bit integer and then do whatever it is I need to do. We talked about this before when we covered MVCC: every tuple is prefixed with a header that contains some metadata, things like timestamps you can use for visibility and flags that say whether a tuple has been deleted or not. But all the information we need in order to interpret the bytes and say "this offset is a 32-bit integer" is stored separately in the schema. It's stored in a centralized location, so we can do a lookup in the catalog and know it's this type. Contrast this with Mongo, where they use the document model and don't have a fixed schema, so they have to embed the actual type in every single document.

The reason we want to store all our tuples in this fixed-length manner, in fixed-length data blocks, is that it makes it really easy to compute the starting memory location of a tuple. We do simple arithmetic: we know the memory address where the block is, we know that we want the fifth tuple, and therefore we can use simple pointer math to jump to the right offset within that block, and that's the starting location of my tuple. Because of this, we don't need the extra indirection layer that you'd have in a disk-based system. In a disk-based system, the index gives you back a page number and a slot number, and then when you go inside the page there's another indirection layer: you look up the slot array, and that tells you what offset in the page holds the tuple you want. So it's an extra step on every lookup. Whereas in our case we go directly within the block to the offset that we need. One thing to be mindful about in all of this is that essentially what we're trying to do is map the database pages to the physical hardware pages. We want to do this in an efficient manner, with almost a one-to-one mapping, so that the operating system or the hardware doesn't have to do extra work any time you need to access data.
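To make that concrete, here is a minimal sketch of treating a fixed-length block as a byte array and using the schema from the catalog to interpret one attribute. The struct, field names, and offsets are illustrative assumptions, not actual code from any of the systems discussed:

    #include <cstdint>
    #include <cstring>

    struct Schema {
        uint32_t tuple_size;   // total bytes per tuple in this block, including the header
        uint32_t header_size;  // bytes of per-tuple metadata (timestamps, delete flag, ...)
        uint32_t id_offset;    // byte offset of the 32-bit "id" attribute within the tuple body
    };

    // Find where the tuple starts with simple pointer arithmetic, then interpret the
    // bytes at the attribute's offset as the type the catalog says lives there.
    int32_t ReadIdAttribute(const char *block, const Schema &schema, uint32_t slot) {
        const char *tuple = block + static_cast<uint64_t>(slot) * schema.tuple_size;
        int32_t id;
        std::memcpy(&id, tuple + schema.header_size + schema.id_offset, sizeof(id));
        return id;
    }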
So this shouldn't be news to anyone here, because everyone should have taken an OS course as part of their undergrad or as background for this course. Essentially, the operating system maps the physical pages that are in memory to the virtual memory pages that it maintains. Having this indirection allows the operating system to move virtual memory pages around in physical memory without having to notify or modify the application code. So if we had a virtual memory page that was mapped to one DIMM, the OS could move it next to another CPU and the application wouldn't know or care; we wouldn't have to change any code. The way this works is that the CPU's memory management unit maintains a translation lookaside buffer, the TLB, that maps virtual memory addresses to physical addresses: if I want this virtual address, here's the physical address that has it. The TLB is a special location in the CPU caches that the system maintains, but it obviously can't store entries for all possible memory pages, because that would be really, really big — your CPU cache would be taken up by just the TLB. So the hardware manages how things move in and out of the TLB, and in general we want to have as few TLB misses as possible.

Now, when you allocate a block of memory in our database system, the memory allocator is in charge of making sure that the memory block fits into page boundaries. Typically you have four-kilobyte pages, so even if I allocate eight kilobytes, it still gets broken up internally into two four-kilobyte pages. It does this to avoid fragmentation and large holes that can't be reused efficiently. So the question that often comes up for in-memory databases, once you start talking about memory pages, is: can we use larger page sizes to reduce the amount of metadata the system has to maintain for all the pages we've allocated? In Linux there's an optimization called huge pages, where the idea is that we can instruct the operating system to allocate pages larger than the default four kilobytes — you can say I want my pages to be two megabytes or one gigabyte. These pages are always stored as contiguous large blocks of memory. The idea is that by having larger blocks of memory as the underlying physical storage for our in-memory database, we have fewer entries in our TLB, which means fewer TLB misses and fewer cache misses. There's a way to do this manually in Linux: you can say I want to use huge pages for my process. But there's another feature called transparent huge pages, where underneath the covers the operating system reorganizes your pages in the background for you automatically. So if you turn this feature on and you start allocating things, what happens is it starts combining smaller pages into larger pages, or taking larger pages and splitting them into smaller pages, and it starts moving around where your memory is actually located. Because this is all virtual memory, it knows where the physical address is.
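For reference, here is a minimal sketch of the two Linux mechanisms mentioned here, assuming a Linux system with huge pages available; the sizes and error handling are only illustrative:

    #include <sys/mman.h>
    #include <cstdio>

    int main() {
        const size_t size = 2UL * 1024 * 1024;  // one 2 MB huge page

        // Explicit huge page allocation; fails if no huge pages are reserved by the OS.
        void *explicit_huge = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (explicit_huge == MAP_FAILED) {
            perror("mmap(MAP_HUGETLB)");
        }

        // Regular allocation, then a hint that the kernel may back it with
        // transparent huge pages in the background.
        void *thp_region = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (thp_region != MAP_FAILED) {
            madvise(thp_region, size, MADV_HUGEPAGE);
        }
        return 0;
    }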
And so you think you're accessing things just the way they were when you first allocated that memory, but underneath the covers the operating system could have moved things around unbeknownst to you. This sounds like what we would want in an in-memory database, because it would be nice to have fewer TLB misses and nice to have the operating system reorganize our memory for us. But the problem is that when it starts moving things around, any time you access memory that's in the process of being moved, it has to stall your process, or block your thread, until that operation finishes. And for really, really large in-memory databases this can become problematic, because people report that the stall from the operating system can take seconds. And that's just as bad as going to disk, right? So that's bad. Obviously the potential advantage is huge, because instead of four-kilobyte pages you can have two-megabyte pages and dramatically reduce the number of TLB entries you need. But in general this doesn't work out. So in, I think, Linux 4.6, they added a new option where, if you try to allocate memory with a huge page and the kernel can't find a contiguous block of memory, instead of just blocking while it reorganizes everything, it can defer the operation, fall back to smaller pages, and reorganize things some time later. But you still have to pay the stall if you hit it during that period.

So almost every single in-memory database — sorry, every database in general, not just in-memory databases — tells you in the documentation to disable transparent huge pages, and mostly huge pages in general. Again, this is because the overhead of having the operating system manage all of this is bad. We want the database system to do all of this for us, because we know best; we know what's best for our application and our workload. The other problem with huge pages is that even if you only touch one byte, you still allocate the entire page. If I only touch one byte in a one-gigabyte page, that's a large chunk of memory that gets tied up. So these are just links to some documentation pages where they all say basically the same thing: do not run your database system with transparent huge pages or huge pages. Some of these systems will actually throw an error if you try to turn them on — they look at the operating system, see huge pages are enabled, and crap out rather than run. The only systems I know of that let you run with huge pages, and in some cases recommend it, are Postgres and Vertica. Vertica is based on the original Postgres code, so it doesn't surprise me that it lets you do that, or copies what Postgres does. In Postgres's case, as far as I can tell, they only use huge pages for the buffer pool, which lives in shared memory. So you can have huge pages for shared memory but regular page sizes for the rest of your process. It's not like you can say, I want my hash table to use regular page sizes and my fixed-length data arrays to use huge pages — it's all or nothing. So in Postgres's case, they can shove the buffer pool off into shared memory and use huge pages there, where you do get a big benefit, and then all the internal data structures that may have smaller access patterns can run with regular page sizes. Yes?
Question: can we keep just the regions that are large enough backed by huge pages, so we avoid the stalls from splitting and merging? So your question is, can we give a hint to the operating system to allocate a particular region as a large page and have it definitely stay a large page, while other things use smaller pages, without the background splitting and merging. Again, if you turn on huge pages and you allocate one gigabyte, it's going to try to do that — it always tries. And if you have the defer option and it can't do it, it will let the allocation be split up, but then in the background it's going to try to put it back together. Follow-up: but if we use huge pages wisely, shouldn't that work out? The statement is, if we use huge pages wisely — and "wisely" is debatable — this should be fine. Yeah, in a perfect world this would work: if we could do this for our data blocks and have everything else not be huge pages, that would be perfect. But as far as I know, Linux doesn't let you do that. You can maybe push it off to shared memory. And I don't fully understand — I mean, I understand how shared memory works at a high level, but I don't understand how it's actually written in the kernel, and I don't understand the performance implications of using shared memory. Postgres uses shared memory because it's a multi-process architecture, whereas most other systems are multi-threaded and don't. So I don't fully understand the implications of shared memory. I will say shared memory comes up later on when we talk about in-memory checkpoints: Facebook Scuba writes everything into shared memory and restarts the process, and in that case shared memory actually provides a benefit that regular memory doesn't have. But in general, as far as I know, most in-memory systems don't run with shared memory, and I don't fully understand why. Okay.

So now we know how the operating system and the hardware are going to organize our memory pages. Now we want to talk about how we're actually going to store values in those pages. What I'm showing here is basically an overview of how an in-memory database represents different data types. At the top we have the integers — tinyint, smallint, mediumint, integer, bigint. For these, we just store them with the same representation you get in C or C++. If I declare an integer variable in C++, the hardware stores it as either little endian or big endian, depending on what kind of CPU I'm running on. In our database system we basically piggyback off of that and use the same representation, because the hardware can do arithmetic and other operations on it very, very efficiently. For decimal numbers — float and real versus numeric and decimal — we differentiate between whether we want to support floating-point numbers or fixed-point precision numbers. I'll go through more examples of these two in a second. But floats and reals are what you get directly from the hardware, the same way you do with integers, and how you represent them is defined by the IEEE 754 standard.
Numeric and decimal, on the other hand, depend on the implementation of the database system, because the hardware only supports floating-point numbers; if you need fixed-point numbers, you have to do something extra. For varchar, varbinary, text, and blob fields, as I said, typically what happens is that if the value is less than 64 bits you store it inline in the tuple in the fixed-length data block; if it's greater than that, you store a pointer to somewhere else. I'll show you an example of that next. For timestamps, datetimes, and similar fields, the newer systems typically represent these as the number of seconds, milliseconds, or microseconds since the Unix epoch. If you want a date field, some systems will store that as 32 bits versus 64 bits for timestamps. The older systems that came out in the 1980s, or things that run on Windows, obviously don't use the Unix epoch and have their own date types.

The one I want to spend time on is floats and reals. Again, this should not be news to anyone here — if you're in this class you should have taken an operating systems or systems course, and you should understand the difference between floating-point numbers and fixed-point numbers. The variable-precision numbers, the floating-point numbers, are what you get as the native data types in C and C++, and they're represented in memory the same way the hardware represents them. The hardware has instructions to take two floating-point numbers and add them together, multiply them, or whatever you want, and the way it expects those values to be stored in memory is the way we store them. So that makes it really easy for us to do arithmetic. As we'll see in a second, these are going to be much faster than the arbitrary-precision numbers, because we're just letting the hardware do all the math for us; we don't do anything extra to check for rounding errors and things like that. Floats are 32 bits; reals and doubles are 64 bits. And the reason these can suck is that the IEEE standard says you can't actually store these decimals as truly exact numbers. So here's a really simple program in C where I take two floats, 0.1 and 0.2. In the first case I add them together and print the result; in the second case I just take 0.3 and print that out to see what it looks like. In the printf format I'm asking for 20 digits of the floating-point number past the decimal point. What you see is that when you take x plus y, you get some long number with a bunch of extra stuff at the end. And for 0.3, just printing it out, you get 0.29999... followed by some other digits. This occurs because the hardware can't store these numbers exactly, so you hit rounding errors. If I just did printf without asking for 20 decimal places, I would get 0.3 and 0.3, and as a human being they would look the same to me — the text representations are the same — but internally in the hardware they're stored differently. So if I compare these two numbers, they come up as unequal. The way systems handle this, if you need exactness in your decimals, is to use fixed-point precision numbers.
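The program wasn't shown verbatim here, but a reconstruction of the demo described above looks roughly like this:

    #include <cstdio>

    int main() {
        float x = 0.1f;
        float y = 0.2f;
        printf("x + y = %.20f\n", x + y);  // prints 0.30000... with extra digits at the end
        printf("0.3   = %.20f\n", 0.3);    // prints 0.29999... -- not exactly 0.3
        return 0;
    }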
So the idea here is that the database system maintains some extra metadata for the value you're trying to store: things like where the decimal point is, what the sign is, what the scale of the number is. This is what you get with the numeric and decimal types. In some systems these are just synonyms for each other; in other systems, like Postgres, they're actually stored slightly differently. The way to think about it is that we store the value almost like a text representation of the actual number, and the metadata then lets us interpret that text: where the decimal point is, whether it's positive or negative, things like that. And as we'll see in a quick demo, the performance difference between these is actually quite significant.

All right, let's see if I can do this without crashing, because it depends which screen it shows up on. All right — of course, now I can't see because the font is so small. So what I did was: I have a Postgres database and I generated 10 million random numbers, and I loaded them into Postgres once as decimals and once as reals. So here's the table with decimals, and here's the one with reals. There are two fields, A and B, and I generated a CSV file with 10 million entries, 10 million numbers. What I'm going to do is just scan across all of them and add the two together — SELECT SUM(A + B). I'll do the reals first: it crunches for a bit, spits out the answer, and it took 1.2 seconds. If I run that same query now with decimals, you'll see that it took more than twice as long, and the value is actually different. This one has precision beyond the decimal point, whereas in the first case it actually gave me back an integer. If I go back and cast it as an integer, that's the same value I got; if I try to cast it as a numeric, I don't think it'll let me do that. Yeah, so it lost all the precision when it did the calculation. So again, if you don't want to lose the precision of your values, you store them as a numeric or decimal type. If you want performance and you're okay with rounding errors, you store them as reals or doubles. And internally, the database system represents these numeric and decimal types very differently — there's a lot more metadata to store. In the case of Postgres, this is how they actually store the numeric type internally. Postgres is written in C, so there are no classes, just structs, but it's basically the same thing. You can see that for one single numeric value, you have to store the number of digits, the weight of the first digit, the scale factor, the sign, and then you have this NumericDigit array, which is just a typedef up there, where you store the actual digits — plus all this extra stuff you have to maintain in order to interpret them. And when you look at the Postgres code, this is just the function to do addition between two numerics, and you see all these clauses in there that say, if it's this sign, do this; if it's NaN, do that.
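As a rough illustration — this is not the real Postgres struct, and the field names here are made up — a fixed-point decimal has to carry around something like this:

    #include <cstdint>
    #include <vector>

    // The kind of metadata a fixed-point decimal needs so its digits can be
    // interpreted correctly.
    struct FixedDecimal {
        int32_t weight;               // where the decimal point sits relative to the digits
        int32_t scale;                // number of digits after the decimal point
        bool    negative;             // sign
        std::vector<uint8_t> digits;  // the digits themselves, stored almost like text
    };

    // Even addition now means aligning decimal points, adding digit by digit,
    // carrying, and normalizing -- instead of one hardware ADD instruction.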
So whereas before, if it was just a real or a float, there's a single instruction to do the addition between those two things, and that's super fast. In the case of a numeric, I have to do all of this work just to add two numbers together, and that's going to be much slower. Again, the point is that storing a numeric type internally involves a lot more work than just using a real. Currently in Peloton, we just fall back to real and double; we don't actually implement a true numeric type. If someone's interested in doing this for project three, I'm definitely interested in helping out with that, because the interesting thing to think about is not just storing it the way Postgres does — it's whether we can use query compilation, and possibly SIMD, to speed all of this up.

All right, so now that we know how to represent the bytes for a type, let's talk about how we're actually going to lay things out, and then be able to access them. Here I have a really simple table with two fields: a 32-bit integer and a 64-bit integer. Again, I jump to some location in memory in my fixed-length block, and that's where my tuple starts. And again, we just think of this as a char array or byte array that we then interpret. We always have some header — in this case we'll assume it's 64 bits; in practice it's much larger — then the ID field and the value field. So if I want to access the ID field for this particular tuple, I know how to jump to the right offset in the fixed-length block, and then I look at the schema in the catalog and see that the first attribute is a 32-bit integer, so I jump to that offset from the starting location of the tuple. The size of the header is baked into the code, so I know how to jump to the right offset. The way you would access this in C++ is to use reinterpret_cast to convert the pointer to the type you expect to be there. This is resolved at compile time; it doesn't execute anything at runtime. Basically what it does is take an arbitrary memory address and reinterpret the type of whatever you're going to read at that address, say as an int32. Because again, this is just a byte array — the code doesn't know what the hell you're actually accessing. So it's up to us to write our code so that we access things in a type-safe manner, and reinterpret_cast lets us do that.

For variable-length fields, as I said, if the value is less than 64 bits we store it directly inline in the char array for the tuple, but if it's larger we store it in the variable-length data block. So let's say I have this table here with just one value field, a VARCHAR(1024). When I do my lookup, I jump to the offset I need, I find a pointer there, and that tells me to go down here into the variable-length block to find the data I want. And in that variable-length block, typically what happens is that the first field at that location is the size of the data, followed by the data itself, and possibly a pointer to another block if the value is broken up across multiple blocks. Because otherwise, if I'm storing a large value, I don't want to require one giant contiguous chunk of memory for it — by chaining pointers to the next chunk, I can break it up.
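A minimal sketch of that access pattern, assuming the 64-bit header and the two fields described above; the helper names are made up for illustration:

    #include <cstdint>

    // 'tuple' points at the start of a tuple inside the fixed-length block:
    // [ 8-byte header | 32-bit id | 64-bit value ]
    int32_t GetId(const char *tuple) {
        // skip the 64-bit header, then interpret the next 4 bytes as an int32
        return *reinterpret_cast<const int32_t *>(tuple + sizeof(uint64_t));
    }

    int64_t GetValue(const char *tuple) {
        // header (8 bytes) + id (4 bytes); a real system would likely pad the id
        // out to a word boundary, as discussed later in the lecture
        return *reinterpret_cast<const int64_t *>(tuple + sizeof(uint64_t) + sizeof(int32_t));
    }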
So although in theory you can store something like a 10-gigabyte file in your database, in practice you don't want to do that for performance reasons — logging and other things. There's some rule of thumb that changes every year as the hardware changes, but in general you don't want to store really large files, because at some point the overhead of managing them becomes problematic. But there are some files that are small enough that you actually do want to store them inside the database. In the case of SQLite, the developers say they've done experiments and found that, on your phone, it's actually better to store thumbnail images inside SQLite, in your database, because you can read them faster from disk than if they were separate files. In SQLite's case the file handle is already open, so you just jump to the location that has the thing you need and read it in; whereas if they're stored as separate files, you have to open the file, go through the inode, and fetch the data, and that's a syscall every single time. One optimization you can do here — the same way we saw storing a key prefix in the B+tree last class — is to use any extra space in the tuple to store a prefix of the value, so that you don't always have to follow the pointer and check whether you have a match. So if I'm trying to find all the fields where the first few characters are "Andy", I can check the prefix and avoid having to go look at the actual data if it doesn't match. As far as I know, HyPer does this; I don't know if anybody else does, but this is one thing they do.

All right, so now let's talk about how to store nulls. As far as I know, there are three ways to do this. The first is to designate a special value in the domain of the attribute's type to represent null. So in the case of, say, a 32-bit integer, you could designate the smallest integer you can possibly store as the special value for null. If you look at limits.h in libc, it has these #defines for INT32_MIN and INT32_MAX, and you basically say INT32_MIN represents your null. So now when you're scanning the table and looking at a tuple, if you see this value you know you should treat it as null and not as a real value. I like this approach because it doesn't require any extra memory to keep track of whether something is null — the value is always 32 bits, the same as a regular value. You do have to do some extra work in the upper levels of the database system to make sure that nobody comes along and tries to insert INT32_MIN thinking they actually stored that value, because when they read it back they'd get null. This is actually what we do in Peloton, although we don't do that check — we don't catch someone trying to insert the null value at the top levels of the system. This is what we used in H-Store and VoltDB, and it was also used in MonetDB. Probably the most common approach is to use a separate bitmap in the header of the tuple to record which columns in that tuple are null. So in that header you store a bitmap with one bit for every possible column in your tuple, and if any of those attributes is null, you just flip the corresponding bit in the header.
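A minimal sketch of the special-value approach, assuming INT32_MIN is the reserved sentinel:

    #include <cstdint>
    #include <limits>
    #include <optional>

    // Reserve the smallest int32 as the sentinel that means SQL NULL.
    constexpr int32_t kNullInt32 = std::numeric_limits<int32_t>::min();

    std::optional<int32_t> ReadNullableInt32(const char *attr) {
        int32_t value = *reinterpret_cast<const int32_t *>(attr);
        if (value == kNullInt32) return std::nullopt;  // sentinel means NULL
        return value;
    }

    // The upper layers of the system also have to reject user inserts of INT32_MIN,
    // otherwise a real value would silently read back as NULL.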
And so now when you access the tuple, you always have to check that header to see whether you're reading a real value or whether it's null. This is what pretty much everyone does — Oracle, Postgres, MySQL, DB2, TimesTen. The downside, obviously, is that you're storing extra memory: you have to store extra bits in the header of every single tuple just because something could be null. But the advantage is that you don't have to carve anything out of the value domain like you do with the first approach. This is also why you sometimes see arbitrary limits on the number of columns you can have in a table: you have to store a bit for every attribute. In systems like Postgres and SQL Server there's a hard cap on the number of columns per table, and that limit is largely imposed because you don't want this header to get too big. Oracle famously only allows a thousand columns per table. That's not done for any deep technical reason; it's just software engineering. Somebody at some point in the 1980s hard-coded something like a hundred columns, in the 1990s they went back and changed it to a thousand, and after that they decided it was too much work to change again.

The third option is to store a separate flag per attribute that could be null, as a prefix for that value. So instead of having it in the header, you put this little flag right in front of the individual attribute. The tricky thing with this is that although the null flag only needs a single bit, you can't actually store it as just a single bit — you have to pad it out to something much larger. The only system I know of that actually does this is MemSQL. If you look at the MemSQL documentation — say this is for integers — they list things like the min/max values, but then they show the sizes of the attributes both when they can be null and when they're declared NOT NULL. When it's NOT NULL, it's what you'd expect: a 32-bit integer is four bytes, a 64-bit integer is eight bytes, and so forth. But when it can be null, you see that in some cases the size actually doubles: a 32-bit integer, which is four bytes when it's NOT NULL, has to be represented as eight bytes if it can be null. They have to do this because they put the flag in front of the attribute to say whether it's null or not, and it can't be a single bit — it has to be word-aligned. Does everyone know what I mean when I say word-aligned? Who here doesn't know what word-aligned means? Who here is afraid to raise their hand? Okay, all right, I'll go through an example of this. Booleans are another good example: a boolean technically should be one bit, true or false, but they store it as a single byte when it's NOT NULL, and then they have to pad it out to two bytes to hold that single extra bit saying whether it's null or not. Yes? Question: couldn't they use that extra space to increase the range of what can be stored in a single value? No, because of the hardware — we'll see this when we talk about word alignment — the hardware doesn't want something like a five-byte integer; there's no instruction in x86 for that. Say you have three extra bytes next to the value — yeah, I understand what you're saying.
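A minimal sketch of checking and setting a bit in such a null bitmap; the layout is an assumption for illustration:

    #include <cstdint>

    // One bit per column in the tuple header; bit i is set when column i is NULL.
    bool IsNull(const uint8_t *null_bitmap, uint32_t column) {
        return (null_bitmap[column / 8] >> (column % 8)) & 1;
    }

    void SetNull(uint8_t *null_bitmap, uint32_t column) {
        null_bitmap[column / 8] |= static_cast<uint8_t>(1u << (column % 8));
    }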
But I'm saying that if your integer is now seven bytes, there's no instruction that lets you do addition on two seven-byte integers, and it's going to mess up your word alignment, which we'll get to now. So I'm going to explain what word alignment means in the context of an in-memory database and why we need to care about it. But I'll say up front that the example I'm about to give is not exactly how it's done: I'm going to describe word alignment in terms of 64-bit words, but in reality, in the hardware, what you care about is 64 bytes — a single cache line is 64 bytes, and that's the alignment that matters. But to make it easy, we'll just use 64 bits.

Okay, so say this is my table and I have four attributes. As we said before, we're going to store this all contiguously in memory in our byte array, and each of these little boundaries represents a single 64-bit word. Now when I insert a new tuple and go across, the size of each attribute I'm storing tells me how much space I take up as I go. The first thing I store is the ID field, and that's 32 bits, so it takes up half of the first 64-bit word. Then I have this date field, the creation date, and that's 64 bits. It starts where the ID ends, so it spans the word boundary: it takes up the second half of the first word and the first half of the second word. Then we have a char field that's eight bytes, so that takes up a little more space. And then we have our zip code, a 32-bit integer, and we have the same problem as before — it spans a word boundary. So now if I want to do a lookup on, say, the creation date, I jump to some location in memory and I want to read it back, but the problem is it spans two words. The hardware has to do extra work to get the data I actually need, and what happens depends on the CPU I'm running on. The first possibility is that the hardware does two reads across those two words, gets the data you need, and reassembles it in a CPU register back into its single form. This is what you get on x86, and this is what you get on newer ARM processors; I don't know about POWER. The alternative is that what you get back is essentially random — maybe you get the first half and not the second half, or the second half and not the first half. It's undefined behavior; essentially they're saying, don't do this. The last approach is that the hardware throws an exception back at you: it says, you're trying to do an unaligned memory access, I can't do this and I'm not going to try, and you have to handle that in your process. So ideally the first behavior is what you want — it's the nicest thing from a programmer's standpoint, the most friendly to us, because we don't have to change anything in our code; it just magically happens that things end up in the right place. But of course it's going to be slow, because what was one fetch to memory to read the thing I wanted is now two separate fetches. And you can see this when you run perf on your process: it reports the number of unaligned memory accesses, and you can use that to figure out where the bottlenecks are in your system.
So the way we can fix this in our really simple example is that for anything that is less than a single word, we pad it out with extra zeros so that the next thing we write is nicely word-aligned. In this case, the ID field is 32 bits, so I put 32 bits of padding after it so that when I start writing the creation date it fits nicely in the next word. Now when I do a lookup on creation date, I know from my catalog that I have a 32-bit integer plus 32 bits of padding, so I know how to do the right math and jump to the location that has the data I want. Say it again? Question: the zip code still isn't at the start of a word — is that word-aligned? Yes — again, think of my cache lines. I go fetch this, it comes in as a single fetch from memory into my cache, and as long as the zip code field sits within a single word, that's fine: I can take it out of my cache line, put it into a register, and do whatever I need to do. Whereas in the earlier example, to do anything with the creation date I had to do two cache-line fetches, reassemble the halves, and then put the result in a CPU register to do my operation — because I can't execute any instructions unless things are in registers. Follow-up: but shouldn't the zip code then start at the next word boundary rather than right after the char field, instead of wasting space on padding? So your statement is, this field is 32 bits, so couldn't it start at a different offset? The offset within the word doesn't matter — the hardware doesn't care where in the word it begins. All it cares about is that I can fetch everything I need in a single read, in a single word. As long as the value sits within a single word, that's all I care about, because I'm bringing that word into my CPU cache as a single cache line, it goes into L1, and then I take it out of L1 and put it into my registers to execute instructions on it. That's super fast. Going out to memory again because something wasn't aligned — that's terrible, and I want to avoid it. The fact that I have to jump four bytes versus two bytes to get the thing I need into a register doesn't matter; that's trivial. Yes? Question: is this the same thing that happens when you declare a struct and the compiler aligns the fields based on their data types? Yes — if you declare a struct and it's packed correctly, you get this alignment. But we're not declaring structs in our database system. We're malloc'ing large blocks of memory, and we need to know how to interpret the bytes ourselves. And again, I'm showing this with 64-bit words; in a real system this would be 64 bytes, so I don't care so much about aligning individual attributes — it's more about aligning the tuples.
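A minimal sketch of computing padded offsets like this, assuming 8-byte words for simplicity; the helper names are made up:

    #include <cstdint>
    #include <vector>

    uint32_t AlignUp(uint32_t offset, uint32_t alignment) {
        return (offset + alignment - 1) & ~(alignment - 1);
    }

    // Given each attribute's size in bytes, produce word-aligned starting offsets
    // so that no attribute spans a word boundary.
    std::vector<uint32_t> ComputeOffsets(const std::vector<uint32_t> &sizes) {
        std::vector<uint32_t> offsets;
        uint32_t offset = 0;
        for (uint32_t size : sizes) {
            if (size < 8) {
                // keep small attributes inside one word: pad up if they would straddle it
                if (offset / 8 != (offset + size - 1) / 8) offset = AlignUp(offset, 8);
            } else {
                offset = AlignUp(offset, 8);  // large attributes start on a word boundary
            }
            offsets.push_back(offset);
            offset += size;
        }
        return offsets;
    }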
So if, say, my tuple ended partway through a word with a little extra space left over, instead of having the next tuple start immediately, I want to pad it out and start the next tuple at the next word boundary. And again, I consult my catalog so I know how to do the math and jump to the right offset. Another question: in this case, could I be smart and recognize that the zip code is 32 bits and there are 32 bits of free padding space right here, and instead of storing it at the end, pack it into that hole? Absolutely yes — there's no reason you couldn't do this. As far as I know, no in-memory database actually does it, but there's no reason you couldn't. You'd have to do the bookkeeping to say: logically, when CREATE TABLE was called, my tuple was declared in this order, but physically it's stored in this other order. For a column store this doesn't matter, because a column store keeps all the values of a single column together; it's only an issue for a row store, and in that case you could argue the extra work of packing things in intelligently may not be worth it. Yeah — in this example, yes, it's still three words.

Okay, so now we know how we're going to organize memory, how we're going to store attributes, and how to interpret those attributes. Now we want to think at a higher level: how do we want to organize the tables themselves? There are three approaches: the n-ary storage model, the decomposition storage model, and the hybrid storage model that was in the paper you read. When you took an introduction to databases class, the way they described a database is typically always the NSM model, except they don't tell you it's called NSM. NSM is essentially a row store. It's all the examples I've been showing you so far, where you have the starting location of a tuple and then you write out all the attributes one after another, in the order they're defined in the CREATE TABLE statement. So this is the n-ary storage model, NSM. The NSM approach is ideal for transactional or OLTP workloads, because the queries in these applications typically access all the attributes of a single tuple. If I have a SELECT * query that does a lookup to grab one tuple, the SELECT * means I want all the attributes. With NSM I just go into my fixed-length data block, jump to my offset, and read all the attributes immediately after it. And that's really fast, because I get great cache locality — I'm reading everything in sequential order. So this is really good for OLTP, because a lot of these workloads grab the entire tuple, and only a small number of tuples at that. It's also great for insert-heavy workloads, because when I insert a tuple I want to write all of its memory out at once: I jump to an offset where I have a free slot, do a memcpy, and write everything together. And it's typically used with the tuple-at-a-time iterator model because, again, we just pass around single tuples, so it's really easy. The way we can physically store this in an in-memory database is with one of two approaches. The first is what are called heap-organized tables, which is what I've been showing here.
And this is essentially, again, where the fixed-length data blocks are called heaps. Any time we need to insert something, we just find a free slot in the heap and put it in. The heap doesn't necessarily define any order — wherever you can find a free slot, you put it there. Most database systems are organized in this manner. The other approach is what are called index-organized tables, where the tuples are stored inside the primary key index itself. The way to think about this: say I have a B+tree — the leaf nodes of the B+tree are the actual tuples themselves. This is what MySQL's InnoDB does, and this is what MemSQL does with skip lists. The advantage is that whenever I do a lookup in my index to find a tuple based on its primary key, instead of getting back a record ID and then having to go find the data in the heap, I get to the leaf of the tree and my data is just right there. All right? So again, the heap approach is probably the most common; index-organized tables are used in some systems. Some of the commercial systems, like Oracle for example, let you define whether you want one or the other, but most of the time it's either/or — you don't get both. Yes? Question: in index-organized tables, do you still need an extra lookup for variable-length fields? Yes — for the variable-length fields you still follow the pointer into the variable-length data pool. I'll also say, too, that in some systems — I'm saying primary key index here — if you don't declare a primary key, they'll make one for you. MySQL does this: if you don't declare a primary key, it creates an internal record ID. So there's no extra work you have to do to make this scheme work.

So the advantages of NSM are that it supports fast inserts, updates, and deletes, because I'm usually grabbing a single tuple and I can jump to one location in memory and do whatever I need to do in that one spot. It's really good for queries that need the entire tuple, and we're able to use index-organized physical storage, whereas, as we'll see with DSM, you can't really do that. The disadvantage is that it's not good for scanning large portions of the table, as we see in analytical queries, because most of the time those analytical queries only need a subset of the attributes. If I scan the whole table with the NSM approach, I'm jumping to different offsets and polluting my cache with a bunch of attributes I don't care about. So the NSM approach is fast for transactions but slow for analytics. Conversely, the decomposition storage model, DSM, is really good for analytical queries that only access a small number of attributes, because you end up reading only the data you need. The idea here is that instead of storing all the attributes of a single tuple together, we store the table as columns, where a single attribute for all tuples is stored contiguously in memory.
This has a bunch of other advantages that we'll see on Wednesday when we start talking about compression. But in general, this is what you want to use if you want to do analytics. For this, we can use the vector-at-a-time iterator model when we process queries, whereas the tuple-at-a-time model is harder to use here: you either have to pass along extra metadata saying "here's the column I'm pushing up to you in my query plan", or you have to materialize the entire tuple and pass it along, which is bad because it defeats the whole purpose of having this layout.

The DSM approach has definitely been in vogue over the last 10 years or so, if not a little longer, but it's actually been around for a long time. The first known system using more or less a column-store approach was a thing called Cantor that came out of the Swedish defense ministry, and there are only about two papers about Cantor that were published; they're not technically database papers, because they talk about processing files rather than databases, but the basic idea is there. Then in the 1980s there was the seminal paper that formally defined what the DSM approach looks like, but it wasn't until the 1990s that Sybase came out with a thing called Sybase IQ that used a pure DSM model. It was an in-memory query accelerator — an add-on to the main Sybase database system, Sybase ASE. The idea was, and we'll see this in a second, that you basically had a cached copy of your database in memory in Sybase IQ, stored as a fractured-mirror column store, and then at runtime, when you ran a query, the system could decide: should I run it on IQ, or should I run it on the regular row-store database? The 2000s is definitely when column stores really took off. There were a bunch of systems that came around that time, filling the data warehouse market, but the three most notable ones are Vertica, VectorWise, and MonetDB. Vertica was started by Mike Stonebraker. It was originally the C-Store project out of MIT and Brown, they made a company out of it, Vertica, and it was later acquired by HP. He came up with the idea because he was at Walmart Labs in the early 2000s and saw the struggles they were having with their Teradata installation, and he realized that what you actually wanted was columns to support those kinds of queries. So he came back to the East Coast and started Vertica. VectorWise is an optimized version of MonetDB. We saw VectorWise before, when we talked about query compilation, and how they pre-compile everything versus compiling things on the fly. Peter Boncz, the guy who worked on MonetDB, thought of a better way of doing it and started this. VectorWise was commercialized and then got bought by Actian, which is the holding company that now owns the original Ingres code. Then they killed the product. If you Googled VectorWise, or Vector as they renamed it, it would take you to a product page that would redirect to the home page. You could still download it, but it was hard to find, and they fired everyone that was working on it. But apparently they've since brought it back. VectorWise is really good; it's a shame what they did to it. Then there were a bunch of other systems like Aster Data — but that was a piece of crap — DATAllegro, whatever. In the 2010s, basically everyone realized that a column store is what you want.
So the big three database companies — IBM, Oracle, and Microsoft — now all sell column-store extensions for their systems, and you have things like Redshift, Impala, HANA, and MemSQL. These are all systems that now use a column-store approach, because again, for OLAP queries the performance difference is quite significant. So the thing you have to deal with in a column store is how to identify tuples. There are essentially two approaches. The first is to use fixed-length offsets, like we've been talking about so far: within each column we can do the arithmetic to find the offset that corresponds to the tuple at a given position. If I want tuple number two in each column, I look at my catalog and see that this column is this type and has this length, so if I want the tuple in the third position, I just do the math and jump to that offset, and now I have the value I want. For columns with different types, you just do different jumps in memory to find the thing you're looking for. The other approach is to embed an internal tuple ID with every single value, in every single column. So if I want tuple zero, I have to embed the ID zero with its value in every single column, in every single location, and then use some lookup table to figure out how to find the thing I'm looking for. You would do this if you had variable-length columns; if you have fixed-length columns, you use the offset approach, because it's much faster. As far as I know, nobody actually embeds tuple IDs — everyone uses fixed-length offsets, because that's the better approach. And we'll see on Wednesday, next class, when we start talking about compression, that there are techniques where, if I could have variable-length fields, I could get really great compression ratios, but because I need everything to be fixed-length, I'm willing to pay a penalty and not get as great a compression ratio.

All right, so again, the advantages of DSM are that we reduce the amount of work we have to do, because when we scan, we only scan the columns that have the data we need. We also get better compression, because all the values for a single column or attribute are stored contiguously and are of the same data type, so we can run compression on them to reduce the total size we have to store. The disadvantages, obviously, are that point queries, or any time we need all the attributes of a tuple, are going to be really expensive, because we have to stitch things back together from the different columns. Now, I'll say that HyPer actually claims to support fast transactions while storing everything as columns. Whereas almost all the systems we'll talk about keep a row store for the latest updates and a column store for the analytical stuff, HyPer stores everything as a column store, and they claim they're fast enough that the cost of stitching tuples back together isn't a big bottleneck. But we haven't really benchmarked them on transactions yet.
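A minimal sketch of the fixed-length-offset approach, with illustrative names: the value for tuple n in a column sits at n times the value size from that column's base address, so there is no per-value tuple ID and no lookup table.

    #include <cstdint>
    #include <cstring>

    struct Column {
        char     *data;        // contiguous array of fixed-length values
        uint32_t  value_size;  // bytes per value, from the catalog
    };

    // Fetch tuple n's value in this column with simple arithmetic.
    void ReadValue(const Column &col, uint64_t n, void *out) {
        std::memcpy(out, col.data + n * col.value_size, col.value_size);
    }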
So one of the things you can take advantage of, when we start asking whether a single system can support both analytics and transactions, is this fact about database applications: there's a notion of the age or hotness of data, and it allows us to decide whether to store things as a column store or as a row store. Think about when you access a website like Reddit or Hacker News: you read the latest posts, and maybe you upvote things or add comments on things people are talking about today. You very rarely go back four months and leave a comment, because no one's going to see it — so what's the point? With that same idea, if we can identify which data is most likely to be updated in the future, we can store it as a row store, because we're going to run transactions on it and we want that to be fast; and as data gets colder, and it isn't being accessed by transactions anymore, we want to convert it to a column store, because that's faster when we do analytics. And this matters because we want to be able to make decisions that affect the behavior of new users, new transactions, and new updates based on things we saw in the past.

The most common setup you see in large database applications is this bifurcated design, where you have separate front-end systems running your transactions, and then you have some extract-transform-load, or ETL, process that moves data from the front end to the back end, and you do all your analytics there. The idea is that the front end is a row store that can do updates really fast, and the back end is a column store that lets you do analytics very efficiently. But what a lot of people want to be able to do is take the things they learn in the back end and push them to the front end, so they can show people different ads, or different products they think they'll want to buy, to help them give you more money. This is a very common pattern. And the problem is that this ETL process in the middle can be really slow, because you don't want to start moving data out of your front end and slow down the transactions that are ingesting new information. So a lot of times people do this maybe once every hour or once a day, but ideally people want to do it in less time. So this is where a hybrid storage model comes into play, and this is the hybrid transactional-analytical processing, or HTAP, workload that you read about in the paper. The idea is that we have a single logical database — and by logical I mean it could run across multiple machines, but it appears to you, the user, as a single logical database; for every table you create, you only see one instance of it. And then what happens is we use the different storage models we talked about before, NSM and DSM, for the hot and cold data: the hot data we store as a row store, because that's faster for updates, and the cold data, where we want to do analytics, we store as DSM. And the idea is that we do this automatically, behind the covers, moving data from NSM to DSM without the user having to tell us anything.
So there are essentially two ways to do this. The first is that we maintain separate execution engines in our system, each designed to operate directly on either NSM or DSM data. Or we can have a single flexible architecture, a single execution engine, that knows how to operate on either layout. The separate execution engine approach is pretty common, and the idea is that you basically run two internal database management systems, one designed for NSM and one for DSM. When a query comes in, you have to figure out what data it actually needs to access and direct it to one or the other, or both, and then you stitch the answers back together to produce a single result for the end user. And to do this, if you update, you need two-phase commit to make sure both sides stay synchronized correctly. The two approaches here are fractured mirrors and the delta store, and I'll go through both of them. With the fractured mirror approach, we essentially keep a complete second copy of the database stored as a column store. All our updates always enter on the NSM side, where we make all our changes, and then in the background we ship those updates over to the mirror, which is stored as a column store. So now whenever an OLAP query comes in, I can run it directly on the DSM copy and not interfere with execution on the NSM side. Oracle does this; they call it the Oracle In-Memory Column Store. Their column store copy is not durable: it's just an in-memory cache of the data. If you blow away the system, that copy goes away, and when you bring the system back online it has to be rebuilt. But the row store is always persistent and always durable, because it's the primary copy of the database. Now, you may think it's kind of wasteful to keep a second copy like this, but when you think about it from the business point of view, Oracle has an amazing customer base, right? It's the most widely deployed database, except for SQLite, and they make a lot of money; Larry Ellison does very well for himself. And you don't want to disrupt people's applications by making them rewrite their code to figure out whether they should use the DSM or the NSM side. You get the whole Oracle ecosystem for free, and under the covers they manage this copy of the data for you. The alternative is the delta store approach. Here we have a front-end delta store stored as an NSM, and then any data that gets moved out of the delta store is stored as a DSM in the second part. All your updates go into the delta store, which is really fast for transactions because it's a row store, and then in the background you migrate the data over to the historical, archival copy. This approach is used in HANA, Vertica, Splice Machine, and SnappyData; it's a very common approach. And typically what happens is they build a sort of Frankenstein system. I don't mean that in a derogatory way; I mean they basically stick existing systems together to make this work. In the case of SnappyData, they took GemFire and Spark and put them together; Splice Machine is HBase plus Spark. That way they don't have to write two brand-new systems from scratch.
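Here's a minimal sketch of the delta store idea, with hypothetical names and nothing taken from HANA's or Vertica's actual code: inserts land in a small row-oriented delta, a background step migrates rows into the column-oriented archive, and any analytical query has to read both sides so it doesn't miss recent updates.

```cpp
// Sketch: row-oriented delta store in front of a column-oriented archive.
#include <cstdint>
#include <iostream>
#include <vector>

struct Row { int32_t a; int32_t b; };

struct Table {
  std::vector<Row> delta;                       // hot, row-oriented (NSM)
  std::vector<int32_t> col_a, col_b;            // cold, column-oriented (DSM)

  void insert(Row r) { delta.push_back(r); }    // fast transactional path

  void migrate() {                              // background ETL-like step
    for (const Row& r : delta) {
      col_a.push_back(r.a);
      col_b.push_back(r.b);
    }
    delta.clear();
  }

  int64_t sum_a() const {                       // analytics must merge both sides
    int64_t s = 0;
    for (int32_t v : col_a) s += v;
    for (const Row& r : delta) s += r.a;
    return s;
  }
};

int main() {
  Table t;
  t.insert({1, 10});
  t.insert({2, 20});
  t.migrate();                                  // cold data moves to the columns
  t.insert({3, 30});                            // newest row stays in the delta
  std::cout << t.sum_a() << "\n";               // prints 6
  return 0;
}
```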
They just have a little bridge code in the middle that knows how to move things over. And then there's a layer on top of the updates that says, if I have to run an analytical query, I scan the column store side and then also check the delta store to make sure any updates I would otherwise miss are included in the output. Now, I'm hand-waving about how you actually decide what data to move, but there are three approaches. The first is a manual approach, where a human DBA tells the database system what data should move and when to move it. In the case of MemSQL, they have the delta store and archive store approach I showed on the last slide, but they require a human to tell them what data should be a row store and what data should be a column store. Then there's an offline approach, where the system monitors a query log offline and makes decisions about what to move when it's appropriate. Or you can have an online approach, where the system watches how queries access data and makes decisions based on what it thinks is going to happen in the future, or on what data isn't being accessed anymore, to move it to one side or the other. All right, so just to finish up, I want to talk about what we do in Peloton. We decided to build a single execution engine architecture that can operate directly on NSM and DSM data, all transparently to the user. The nice advantage of this is that we don't have to store two copies of the data as you would with fractured mirrors, and we don't have to keep two different engines in sync because it's all inside a single code base. The system can still use the delta store approach we saw before, but the idea is that we don't have two standalone systems that have to be synchronized with each other; everything is always in sync. So the basic idea looks like this. Say this is our original data, originally stored as NSM. We look at the access patterns of the queries we're executing, and we see that in this case we're doing an update that accesses all the columns, and an analytical query that only accesses two of them. The idea is that we want to identify what data is hot and what data is cold based on how it's actually being used, and then split it up into a different layout so that the hot data is stored as NSM and the cold data as DSM. If you look in the code, you'll see this notion of tiles, and this is essentially how we're able to represent and do this. The execution engine processes tiles, but before it processes a tile group, it looks up the layout for that tile group: it does a lookup in the catalog, sees what attributes it has, and then knows how to jump to the right offset to find the data it needs. And in the case of query compilation, we can compile a query that operates on multiple tile groups at the same time, and it knows that different tile groups are stored in different ways, so we generate different code to access each of them. So here we have tile group A and tile group B. Tile group A is stored as a row store and tile group B is stored as a column store, and internally each is broken up into different tiles. Then we also have this thing called a tile group header, which is basically just extra metadata saying what's going on in each tile.
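To show what "one engine, two layouts" might look like, here's a minimal sketch, loosely inspired by the tile group idea but not Peloton's actual API: each group of tuples carries a layout tag, and the scan code picks different offset math depending on whether the group is row- or column-oriented.

```cpp
// Sketch: a single scan routine that handles both row- and column-oriented
// tile groups by checking each group's layout tag.
#include <cstdint>
#include <iostream>
#include <vector>

enum class Layout { RowStore, ColumnStore };

struct TileGroup {
  Layout layout;
  // Row layout: values interleaved as a,b,a,b,...  Column layout: all a's
  // first, then all b's.  `count` is the number of tuples in the group.
  std::vector<int32_t> data;
  size_t count;
};

// Sum attribute `a` across tile groups, using the right offset math per group.
int64_t sum_a(const std::vector<TileGroup>& groups) {
  int64_t s = 0;
  for (const auto& g : groups) {
    for (size_t i = 0; i < g.count; i++) {
      size_t offset = (g.layout == Layout::RowStore) ? i * 2 : i;
      s += g.data[offset];
    }
  }
  return s;
}

int main() {
  TileGroup a{Layout::RowStore,    {1, 10, 2, 20}, 2};  // tuples (1,10),(2,20)
  TileGroup b{Layout::ColumnStore, {3, 4, 30, 40}, 2};  // tuples (3,30),(4,40)
  std::cout << sum_a({a, b}) << "\n";                   // prints 1+2+3+4 = 10
  return 0;
}
```

A compilation-based engine would generate the specialized inner loop for each layout instead of branching at runtime, but the dispatch decision is the same.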
And then when we have a query that needs to access different tile groups, we know how to route it to the right offset that has the data it needs. Again, this is all done for you underneath the covers. So the last slide I want to show is the performance benefit you can get from an adaptive layout. This is a workload we derived from an online web application, and we contrived an example that represents what the workload would look like with a diurnal pattern: a bunch of inserts during the day when people are active, and then analytics at night when we want to figure out what happened during the day. So the workload is broken into discrete segments where we do a bunch of scans, followed by inserts, followed by scans, and so forth. If you store everything as a row store, what you see is that the scans take a long time and the inserts are really fast, because it's really fast to do inserts in a row store. For a column store, the scans take less time than the row store, because everything is in columns and we can just read the values we need, but the inserts take slightly longer, because we have to take a tuple, break it up into its attributes, and store them in separate memory locations. But if you have an adaptive layout, you end up getting the best of both worlds. At the very beginning, we perform the same as the row store for the scan queries, because we haven't seen any queries yet, so everything is still stored as a row store; we haven't adapted anything. In this environment, any time you insert a new tuple, it always starts out in a row store layout by default. But then we observe that the data being accessed would in fact be better suited to a column store layout, so over time we start migrating it, and that speeds up our scan queries. When we do our inserts, we match the speed of the row store, because again, we're always inserting into the row store. And then what you see is that over time, when we start doing scans again, we get the same or better performance than the column store, because we can be smarter about how we group columns together so that they fit nicely in a single cache line, so it's one fetch to get all the data we need for a single tuple. So this shows that when we adapt the data layout, it's a single database engine, a single user interface, a single copy of the database, and you get the best of both worlds of a column store and a row store. Yes? Is this about the second scan, this one here, where it still has the row layout? Yeah, so the reason it looks a little weird is: I do all my scans, and I start reorganizing the data into a column store, so I start performing like a column store. Then I do a bunch of inserts, but now when I start scanning that data again, I'm scanning the row store data I just inserted plus the column store data, and these are essentially full table scans. So my performance is slightly worse because I'm reading the freshly inserted row store data. Same thing here, but over time the majority of the data I'm reading is a column store, so I'm getting the benefits of the DSM model. Yes? How are your inserts better than the column layout? Are the inserts for a different table? Why do I insert better than the column layout?
Because I'm inserting into a row store. Yeah, so the scans and the inserts are for a different table? No, same table, but I'm inserting new tuples, and they always end up in the row store portion of the same table. Right, so again, it's not like it has to be all or nothing, where either everything's a row store or everything's a column store; within a single logical table I can definitely store different parts in different ways.