Thank you. You didn't buy your girlfriend a birthday present, I heard. Oh, yeah, I forgot. Yeah, that's why I left the phone for a week. Okay, let's get her something for Christmas, for the holiday, okay? She's a good woman. Okay, so what's up for you guys? So this is the last week of classes. Project 4 is still due. And then in Wednesday's class, we'll have the systems potpourri and the final review. We'll also be announcing the result of the paternity test about my kid and, well, whether or not it's mine, we'll see. So I'll announce that at the end of class on Wednesday as well. All right, before we switch over to the speaker, any questions about what's remaining for the rest of the semester for you guys? Yes? The extra credit is due the same day as Project 4. It should be on the website. We'll send out the feedback for the extra credit probably tomorrow or Wednesday. Okay? Any other questions? All right, guys. So we're super excited today to have Shashank from Oracle come give a guest lecture. And again, like I said, the reason why I like having him here is because it's going to make me not seem crazy, that I didn't just make up all these things I told you about databases. He's going to say, oh yeah, here's a real system, they're making lots of money, it does all the things that we talked about this semester. But he's going to talk about it in the context of an in-memory database, which is not what we talked about, but it applies the same concepts in a different environment. And if you're interested in the kind of things he's talking about today, this is what the advanced class, 15-721, will be entirely about next semester. Okay? So Shashank is a vice president of in-memory database technologies. Is that correct? That's correct. So he has an undergraduate and a master's degree from UC San Diego and he's been at Oracle for six years, nine years. Nine years? Nine years. And again, this is a real dude working on the system. This isn't some marketing ****. This is real. So by all means stop and ask him hard technical questions. Push him and see whether he actually knows what he's talking about. Okay? Damn. Let me fix the light. Awesome. All right. Hey guys, thank you for coming. So again, my name is Shashank Chavan. I come here every now and then. There's the PDL lab, and I come here for the retreats, and I think it's like winter and spring. And Andy's kind enough to invite me. So thank you again for inviting me here. So today's talk is going to be on the top five innovations of Oracle Database In-Memory. So this talk was initially given to folks that actually know about the Database In-Memory product. I changed it a little bit to give you guys a background on what Database In-Memory is. Like Andy said, please feel free to stop me at any point in time, any time you have a question, no problem. As I was telling Andy, I usually have more slides than I have time for in the talk, so sometimes I talk too fast. So just interrupt me anytime you want to, OK? OK. So let me see. I just want to make sure I'm following the slides appropriately. OK, great. So that's my next slide. All right. So I like to start off with this motivation slide. This is talking about how real-time enterprises need in-memory innovations now. OK. So this was the slide that I have currently. What I really wanted to show was this slide with our president tweeting in the morning, a call to arms: real-time enterprises need in-memory innovations now.
But I was told it would be too political. So I opted for this instead. But I actually like this one better because it's real. It actually is very true. Did he really say call to arms? He literally said that. Today? No. No, he didn't say it. This is actually when the impeachment trial started, I think, is when we put the slide together. But the reality is the following. Real-time enterprises are enterprises that need access to data now. They're data-driven. They are agile. They're efficient. They want to be able to react instantly to data. So you can imagine there are a bunch of enterprises that fit this category, right? Like insurance companies, retailers, manufacturing processes, financial services. People need to do fraud detection in real time. Actually, this just happened yesterday. I was buying some speakers at Best Buy because they're on sale. And literally, after I made that purchase, I got a text message that says, are you certain that you want to make this purchase? Did you make this purchase? Blah, blah, blah. Because they immediately know that I'm not the kind of person that makes a $500 purchase on the spur of the moment. That's just not me. So anyways, that's real time. And so in order to achieve that, we need a lot of things to come into place. So we see the demand that's coming from enterprise companies, right? So what's actually making this possible? So if you look at the hardware trends, OK, you're starting to have larger, cheaper memory. So DRAM, everybody's familiar with memory. PMEM, I don't know how many of you are familiar with persistent memory. I talk about that in the last bullet there, so I'll get to that then. You have larger CPU caches. So now we're talking about 32 megabytes of shared L3 cache on Intel's latest processors. You have larger multi-core processors, so 24 cores with Intel Cascade Lake. Larger SIMD vector processing units. Show of hands of folks who know what SIMD is. Have you heard of that before? Single instruction, multiple data. Excellent. OK, so now you have 512-bit SIMD registers that can parallelize your operations essentially in a single cycle. Faster networks. So you have 100 gigabit per second RoCE versus the 40 gigabit per second that you have with InfiniBand. NUMA architectures, so now you have to think about local memory versus remote memory. So a bunch of factors. Persistent memory is really the biggest; in my mind, at Oracle, and in our group, we think it is a big-time game changer in the memory technology space. So persistent memory is basically like DRAM. It's just like memory, except it's a lot larger, about 3x the size of DRAM. It's got availability because it's persistent. You pull the plug, you put the plug back in, it's still there sitting in these persistent memory DIMMs. And it's fast. It's not faster than DRAM, but it's way faster than flash. So it's got a lot of really cool qualities, and all of these things combined help us move towards meeting the requirements of real-time enterprises. Their last project is on logging and recovery, doing ARIES. So think about this: you're writing out dirty pages in your workflow. With persistent memory, you wouldn't want to do Project 4, if it existed. Exactly. Okay. Where do we have in-memory today? Okay, so we have different tiers. So the top tier here is the application tier. This is where you care about an immediate response time. So imagine you have an application and you're going to link this database directly into your application.
Shares the same process space, the memory space, everything is shared. This is extremely important if you're OLTP sensitive and you want to do an insert very quickly or a read very quickly, and you're just getting a value for this particular key. So that's TimesTen. TimesTen came out years ago. It was one of the first in-memory, I think it was the first in-memory database. I don't know, maybe 20 years ago. It was 1996? Okay, much longer than that. So that's TimesTen, that's in the application tier. The database tier is what I'm going to talk about. That's Oracle Database In-Memory. And this is where we really care about analytics, but also mixed workload performance. And it's embedded directly into a real working enterprise database. And then you're processing things at billions of rows per second when you're looking at analytics. Then there's also the storage tier. So I'll talk about that as well. And by storage tier, I mean where the data is actually residing, whether it be on disk in some persistent store or in flash or something close to basically a persistent store. So you have the application tier, database tier, storage tier, and I'll talk mostly about the database tier and the storage tier. Okay, so I'm going to start off with a background on what Oracle Database In-Memory is. Okay, let's start with some basics. So before we had columnar databases, which is the next slide, we basically had row databases. Row-wise databases are fantastic if you care about transactions. Okay, so as an example, you're an ATM machine, you have your ATM card, you want to deduct $25 or $20 from your account. That's a transaction that you're applying. You're going to quickly search through this gigantic database of millions or billions of users, not billions, but certainly many millions, and search for your particular account, find your balance and deduct $20 from it. So that's like looking for a specific row and potentially accessing multiple columns within that row. That's transaction processing. So row format is very good for accessing a particular row and then touching multiple columns within that row. So as an example, here, if you're running this query like select column 4 from my table, what you have to do is you basically have to process each row and hop to column 4 to get to it. So the problem, when it comes to analytics, is that you're basically visiting every single row, but on top of that you're going column by column, touching multiple cache lines to get to the column that you care about. So it's, again, fantastic if you want to get to a specific row and access a specific column. That's row format. Now, column format is fast for analytics. Analytics is where you say, I actually care about visiting all the rows, but I'm only interested in certain columns. So in this case now, if I just want to say select column 4 from my table, all I need to do is visit column 4. The data is stored in columns in contiguous memory. Each column here is in a contiguous piece of memory. Any questions so far? The difference between row format and column format? We all got that? Okay, great. Awesome. Okay, so the innovation, what Oracle developed for our in-memory product, is we decided that you can't just have one or the other. It's just not feasible really for enterprise companies. Enterprises are basically more interested in mixed workloads. Sometimes they're running analytics, sometimes they're running transactions. Sometimes it could be very mixed. Sometimes they run ad hoc queries.
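To make that row-versus-column difference concrete, here is a minimal C++ sketch, not Oracle code, just an illustration with made-up names, of why "select column 4" touches far less memory in a columnar layout:

```cpp
#include <cstdint>
#include <vector>

// Illustrative only: a four-column row laid out the way a row store keeps it,
// with all columns of a row adjacent in memory.
struct Row {
    int64_t col1, col2, col3, col4;
};

// Row layout: "SELECT col4 FROM t" still walks every Row object, dragging
// cache lines through the CPU that mostly hold col1..col3.
int64_t sum_col4_rowstore(const std::vector<Row>& rows) {
    int64_t sum = 0;
    for (const Row& r : rows) sum += r.col4;  // strided access: 8 useful bytes out of 32
    return sum;
}

// Column layout: col4 lives in one contiguous array, so the same query streams
// through memory sequentially and every byte fetched is part of the answer.
int64_t sum_col4_columnstore(const std::vector<int64_t>& col4) {
    int64_t sum = 0;
    for (int64_t v : col4) sum += v;
    return sum;
}
```

The columnar loop is also the shape that the SIMD scans described later can accelerate, since the values it reads are already packed contiguously.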
These gigantic queries, sometimes very simple queries. So it's a mix. And so we basically decided that you can't just choose one format. We basically want to have both formats. And so that's why we have this thing called dual format architecture. So with dual format architecture we basically maintain both the traditional row store that's sitting in your buffer cache as well as a columnar representation that's sitting in memory. Okay, we maintain both. And both of them are simultaneously active and consistent with each other. All the brains go into the optimizer. The optimizer decides when it sees the query which path it should take. So for example, if it sees a query that says I really want to get to this particular row, this key, and extract that value, and I have this OLTP index on it, the optimizer will say go to the buffer cache and fetch the index block for it and read that row, and that's how it'll get that row fast. If the query is an analytic query and you're doing some kind of crazy aggregation or group-by or sum or some joins or whatever it is, it'll use the column store. Okay? So, when we developed this Database In-Memory, we built it natively into the database, so it's not a separate storage engine. It's part of the existing storage engine. Think of the columnar representation as if it was an index, an index that resides in memory. Okay? Any question? So, like, you just said that you're using the best of both when you're doing reads, right? So if you have an OLTP thing, you do the row thing, and for a scan you have the column store, but say you're doing a write, then it's the worst of both worlds. So it depends on what the percentage of writes is in your workload. So I'll cover that in a second in terms of how we handle writes or updates or whatever in general. If you're doing, let's say, 1% of your workload is writes, okay? That's not a problem. That's basically a mixed workload. Even 5 to 10% is fine. When you start getting to a higher percentage, you're like 20, 30, 40%, maybe, let's say, this is not the right solution for you, because now, as you'll see, you're basically, oops, I'm sorry, I don't know how that advanced. You know, I'm probably messing with the timer. There you go. Okay. What you're doing is you're basically maintaining both, right? You're maintaining this column store and the row store, and that could become problematic. So I'll describe how we handle that very efficiently. Any other questions? Okay. Next one. Alright, cool. Awesome. Alright, let's go into some of the details in terms of how we store this data. So, what you see here is a table. We store the data in a pure in-memory columnar representation. So the sales table continues to sit on disk. Doesn't change, right? It's exactly as you guys know it to be. It can be pulled into the buffer cache if it's accessed. That doesn't change. But what you do is, if you say bring this table into memory, we will basically bring it into memory. So take the rows, transpose them into columns, and then store the columns into blocks of contiguous pieces of memory. Okay. There are no changes to the disk format. We're Oracle, so we support all platforms. You can enable in-memory at any level. At the tablespace level, at the table level, at a column level even, you can specify at any level. And the only thing you do need to do is you need to tell us how much memory you want to reserve for your column store.
That's, I'll talk about some of the future stuff we're working on, but that's the only thing you really have to do: tell us how much memory. Now, if you dive in deeper into this, we basically block out these rows in something called IMCUs, in-memory compression units. IMCU, in-memory compression unit. An IMCU basically has about half a million rows. Half a million to a million rows. And within each IMCU you have all the columns for that table. So here in this sales table you have the employee ID, name, department, salary. We also have this row ID column. Now, this is really important. This is kind of the little trick that we have. That row ID column maps to the actual locations of those rows on disk, which is this bottom part here. So on disk we basically store data in extents, which are basically contiguous pieces of blocks. And this extent says, extent number 13 has blocks 20 to 120. Extent number 14 has blocks 82 to 182. And in each block you have some hundreds, thousands of rows. And so this IMCU is mapped directly to the physical locations on disk. And why this is important is because when you have a modification to a row, it's very easy to say which IMCU that maps to. And I'll go through another slide that talks about how we utilize this when we talk about DML. So the only other point to make here is you can specify how you want to compress this column. So, you know, I don't think you guys talked about compression yet, or different data formats. You guys did some dictionary encoding before, or prefix encoding? Okay, so I think I have a few slides on this. But you can specify what compression levels you want. You're limited on memory, so you can bring it into memory and compress it whichever way you want to. Okay? So that's how we store things as IMCUs. Here's the slide on compression. So how do we actually store this data? Okay, so imagine this was your column. Okay, this is your uncompressed data. And I have cat, cat, fish, fish, horse, horse, dog, dog, cat, et cetera. Okay? It's not actually sorted. I have dogs here and cats here, but this is just my example. So the first thing we do is we dictionary encode this. So what dictionary encoding means is it identifies the distinct symbols in that column, pulls out those distinct symbols and then sorts those distinct symbols and then assigns a code to them. So you have cat, dog, fish, horse. That's all you have in this column. I sort them and I assign them codes of zero, one, two, and three. And then you just have to replace the values of cat, cat, fish, fish, et cetera with the codes themselves. And we take it one step further. We actually bit-pack those codes. So because there are only four distinct symbols there, I only need two bits to represent each one of those symbols. So zero-zero, zero-zero, one-zero, et cetera. Okay? So far so good. So that's dictionary encoding. The next thing we do is we apply RLE, or run-length encoding. Run-length encoding is basically saying, let me see if I can identify a run of the same symbol some number of times and replace all n copies with a single copy along with the run count. Right? So here you notice how we have cat, cat and fish, fish, or rather zero-zero, zero-zero and one-zero, one-zero. Well, I can replace that with just zero-zero and one-zero, and then I have a run count over there. Is this little button the- It should be, yeah. There you go. Perfect.
So then you can see that you just basically maintain some run counts that just identify how many runs there are for those symbols. So far so good. Okay. Then we take it one step further. We apply something called OZIP, or Oracle ZIP. Oracle ZIP is pretty much a fancy dictionary encoding algorithm. But it's fantastic because it's very simple and it's hardware friendly to decompress. I won't go into too many details, but I'll try describing what's happening here. So what it's doing is it's now finding patterns within the encoded values. So here you have zero, zero, one, zero, one, one, zero, one, and you see the same pattern of zero, zero, one, zero, one, one, zero, one. So you'll replace this set of eight bits now with a single code of zero, or a single bit of zero, and you'll replace zero, one with one, and now you've compressed it even further. So you're building yet another dictionary on top of the encoded stream. So far so good. Okay. So that's what we do. We take it to kind of an extreme, and then you can actually take this and put zlib on top of it or bzip or whatever, a higher-level compressor on top of it. Okay. All right. One more compression form I want to talk about is something called prefix encoding. So we talked about that dictionary, right, of cat, dog, horse, et cetera. Notice how they were all sorted. Well, once it's sorted, you can actually remove common prefixes from adjacent symbols and store them separately. So the example I have here in this dictionary is use, used, useful, usefulness, et cetera, right. And they all have "use", but some of the symbols can actually benefit from something longer, like "useful". So what we'll do is we'll basically take out maybe like a block of eight symbols, and from those eight symbols we'll find the common prefix across those eight symbols. In this example I think I'm using two symbols. So "use" is the prefix for use, and then "d" is the suffix for used. So I just maintain the prefix along with an empty suffix here for the first symbol and "d" for the second one. And then the next block will use "useful", because the third symbol here is useful, and the same thing, it grabs the suffixes. So this just gives you some more compression. So far so good. The problem with this stuff is the following, right. You have to decompress it, and that could take some time. You can't just point to "use" and "d", because you have to stitch them up at some point and give it back to whatever operator wants the actual symbol. So there's a cost associated with compressing and then decompressing as a result. So, the execution engine for the row store in Oracle is not compression aware, so you have to decompress it when you hand it off to it? Or is it specialized for the column store? For the column store we have specialized formats where we know what the format is. We do have to decompress it, but we have some tricks, for example for dictionary encoding, for prefix encoding, we have tricks that use SIMD that allow you to not have to ever stitch things back together when you're doing a scan. When you're doing a projection you do. The row store has a compressor very much like dictionary encoding, but it operates across multiple rows and it's far more complex. So you're right, when you actually have to project back, it's a complicated stitching algorithm to bring it back together again. Alright, so far so good. Okay. Alright, so we talked about how we bring the data into memory. We talked about how we format it and compress it.
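Here is a rough C++ sketch of dictionary encoding followed by run-length encoding, along the lines of the cat/dog/fish/horse example above. The struct and function names are hypothetical, and a real implementation would also bit-pack the codes:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Dictionary-encode a string column: sorted distinct symbols get codes 0..N-1,
// and the column itself becomes a vector of small integer codes.
struct DictEncoded {
    std::vector<std::string> dictionary;  // sorted distinct symbols
    std::vector<uint32_t>    codes;       // one code per row
};

DictEncoded dictionary_encode(const std::vector<std::string>& column) {
    DictEncoded out;
    std::map<std::string, uint32_t> lookup;          // std::map keeps symbols sorted
    for (const auto& v : column) lookup.emplace(v, 0);
    for (auto& [symbol, code] : lookup) {            // assign codes in sorted order
        code = static_cast<uint32_t>(out.dictionary.size());
        out.dictionary.push_back(symbol);
    }
    for (const auto& v : column) out.codes.push_back(lookup[v]);
    return out;
}

// Run-length encode the code stream: (code, run length) pairs, so
// "cat cat fish fish fish" becomes (cat, 2)(fish, 3).
std::vector<std::pair<uint32_t, uint32_t>> rle_encode(const std::vector<uint32_t>& codes) {
    std::vector<std::pair<uint32_t, uint32_t>> runs;
    for (uint32_t c : codes) {
        if (!runs.empty() && runs.back().first == c) ++runs.back().second;
        else runs.push_back({c, 1});
    }
    return runs;
}
```

On top of the encoded stream you could then layer something like the OZIP-style pattern dictionary described above, or a general-purpose compressor such as zlib.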
Now we're going to talk about how we scan, how we actually use that data and scan fast. How do we get to billions of rows per second, versus the millions of rows per second when you're looking at a buffer cache in a row store? So this is where SIMD comes into play. So SIMD, again, for those of you who don't know, stands for single instruction, multiple data. And what that means is most modern processors now, for the last, I don't know, 15, 20 years maybe, since '96 I think, have a vectorization unit in the processor where they have very fat registers, and not only do they have those fat registers, they have an instruction set that can be applied on those registers that allows you to parallelize data operations. So let's just go through an example to make things clear. Here you have, again, your column store and you have a column called state, right? And your query is: find sales in the state of California. So what you're going to do is bring the state column into a register. Now, you guys, we just talked about dictionary encoding, right? So these state values can all be packed into how many bits, do you guys think? Quick thinkers? Exactly. That's a good answer. That's right. So you've got 5 bits or 6 bits. Thank you. 6 bits. So 6 bits coming in here, but you have a register, let's say it's 512 bits. So I can bring in 64 of these, assuming all of them are 8 bits, let's just say. Right? I can bring 64 into a 512-bit register; 64 states can be loaded at once. That's one instruction. Some number of cycles depending on whether it's in your CPU cache or not, but let's suppose it is in your CPU cache. So it's one cycle, I've loaded them in here. The next step is to bring in California, whatever bit representation California is, and splat that across another register. California, California, California, California, etc. And that's one cycle. Then this is where I use the instruction set for vector registers, where I can do a vector compare of all values in one cycle. So this state will be compared to California, this state will be compared to California, etc., etc. And in one cycle it will apply 64 comparisons, resulting in a bit mask of 64 bits. And that tells me which of these 64 states were equal to California. In Project 3, you had to do a Volcano-style iterator. You're calling next on one tuple at a time. You cannot do that in a single cycle. To go do 64 tuples would be a next call 64 times, which is super expensive. This is why if you use a vectorized processing model, which is what they're doing here, you take a bunch of crap all at once and then do the filter, do the predicate evaluation, very quickly. That's why they're getting 100x faster over what you guys wrote in Project 3. Makes you feel really good about your work. That's why you guys are 100x slower. Learn. Okay. But Andy is right. We actually apply vectorization techniques to all of our operators. Because there's magic in this. It's complicated, and I'll have a slide on this, I think. So let's move on here. That brings me to this slide here. So we're really looking to improve all aspects of analytics. So we talked about scans just now with vectorization. Joins: we look at vectorizing joins as well. There are all sorts of nifty techniques when you're dealing with a column store and you're dealing with dictionary encoding. You can leverage the format, leverage the hardware, to again process at billions of rows per second.
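As a rough illustration of that scan, here is a hedged C++ sketch using AVX-512 intrinsics. It assumes the state column is already dictionary-encoded to one byte per row and that the row count is a multiple of 64; it is not Oracle's actual scan kernel:

```cpp
#include <cstddef>
#include <cstdint>
#include <immintrin.h>   // AVX-512; compile with -mavx512bw

// Count how many of the one-byte state codes equal `ca_code` (whatever code
// "California" was assigned in the dictionary), 64 values per iteration.
uint64_t count_california(const uint8_t* states, size_t n, uint8_t ca_code) {
    uint64_t count = 0;
    const __m512i needle = _mm512_set1_epi8(static_cast<char>(ca_code));  // "splat" California
    for (size_t i = 0; i < n; i += 64) {
        __m512i block = _mm512_loadu_si512(states + i);         // load 64 codes at once
        __mmask64 hits = _mm512_cmpeq_epi8_mask(block, needle);  // 64 compares, one instruction
        count += static_cast<uint64_t>(__builtin_popcountll(hits));  // GCC/Clang builtin
    }
    return count;
}
```

The 64-bit mask that comes out of the compare is exactly the bit vector described in the talk: a count is just a popcount over it, and a projection would use it to gather only the matching rows.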
Here we make joins faster by simply making something called bloom filters faster. Show of hands of folks who know about bloom filters. Excellent. Awesome. Excellent. Bloom filters are magic. I love them. So that's how we make joins faster, initially with our first release from five years ago. And then reporting: aggregations, group-bys, things like that. Okay. Yep. I mentioned that bloom filter optimization. That's the 10x right there. Yep. I mean, this is using SIMD on top of it. Okay. Let's move on here. Okay. So now we finally get to the top five, which is what I wanted to get to. So, top five: I just want to list five of the things that we have done in the last five years, I guess, that I think stand out in terms of what makes Database In-Memory kind of cool, for us at least. Okay. And by the way, a lot of these, most of these, resulted in academic papers that we submitted to various different conferences. So I'll point them out to you as we move along. So the first thing is the dual format architecture. Fast mixed workloads and faster analytics. Why we think this is pretty innovative. So let's explain how it works. Okay. So the dual format architecture enables fast mixed workloads and faster analytics. So if you look at your right, that's the drawing that I showed to you before. Now you can get very fast in-memory DML because the invalid row is logically removed from the column store. So I'm going to walk through this example. Let's suppose that a row was modified. Okay. So that row was modified for whatever reason, maybe a column in there or some number of columns, but it was updated or it was deleted, whatever it is that happens on the normal path. Okay. Like it normally does, like Oracle has been doing for the last 30, 40 years. Now when that happens, that tells us immediately, using that row ID, you remember how I told you we map from disk to the column store, we can immediately just set a bit that says that row is invalid. That doesn't take that long. You guys have all set bits; if you've created a bloom filter, it doesn't take long. You do have to find that IMCU, that's a little bit of a lookup, but not that long, and then you just set a bit. So that's how DMLs work. Super simple. You just say this row in that column store is invalid. So now what happens when you want to do a scan? You just ignore the invalid rows. Okay. So here's my example. Let's say I'm doing a full table scan. I just go whoop, and I just sidestep that invalid row and I keep going. Now what happens? I sidestep that row, but that's not giving me a consistent result now. I have to actually process that row. So just for that row I will go to disk or the buffer cache to fetch that row. Okay. Now in all likelihood it's probably in the buffer cache, because I've updated it or I've done something. Updating it brings that block into the buffer cache, and so I'm still technically processing at memory speeds, because the buffer cache is in memory, but I'm going against the row store. Right. Now, IMCUs not covering invalid rows are unaffected. So that's where breaking up your data into blocks has a lot of value. The blocks that aren't affected will still go through a nice proper vector scan, the SIMD scans that we just talked about. Okay. This is an important point. Mixed workload performance can suffer if the number of invalid rows accumulates in your IMCUs, which goes to your point over there earlier.
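A tiny C++ sketch of that scan-time behavior, with hypothetical names; the fetch callback stands in for the buffer-cache or row-store path that supplies a consistent value for an invalidated row:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical IMCU slice: a column of values plus a "row is invalid" bitmap
// that DML sets. Valid rows are scanned straight out of the column store;
// invalid rows fall back to the row store / buffer cache.
struct ImcuColumn {
    std::vector<int64_t> values;
    std::vector<bool>    invalid;   // one entry per row, set on update/delete
};

int64_t scan_sum(const ImcuColumn& col,
                 const std::function<int64_t(size_t)>& fetch_from_row_store) {
    int64_t sum = 0;
    for (size_t row = 0; row < col.values.size(); ++row) {
        if (col.invalid[row])
            sum += fetch_from_row_store(row);   // consistent value via the buffer cache
        else
            sum += col.values[row];             // fast columnar path
    }
    return sum;
}
```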
If the rows start to get really dirty, then I'm just going to the buffer cache all the time. I'm not leveraging the columnar representation. So this is where fast repopulation techniques save the day. So let's explain this for a second. So we do something called continuous intelligence. Okay. This is actually an analyst term. Continuous intelligence just means that we will track how dirty your IMCU is and how frequently it is scanned. We do a combination of the two, and I'll explain why. We care about both how dirty it is and how frequently it is scanned. If your IMCU is super dirty, lots of updates are happening, but you're never scanning it, I don't necessarily care to refresh it or repopulate it. No one's actually accessing it. So it's some combination of the two, that we have an algorithm for, that decides when we should refresh it. So the first technique we have is something called double buffering. So with double buffering, the idea here is, when you have to refresh it, we keep the dirty one around, and in the background we'll create a new one that brings in those dirty rows, repopulates it, and now has a nice fresh copy of the IMCU, and once it's ready we do that switcheroo, and the old one goes out and the new one comes in. Right. The main reason why we want to do this is because we don't want your queries to suffer by taking it offline. If you learn a few things, I guess, in database classes, you really want to try to keep everything online. You never want to bring something down and then suffer slower performance while you're refreshing an index, or in this case refreshing the IMCU. Everybody understand that, double buffering? Okay. The second thing is incremental repopulation, what we call incremental repopulation. The idea here is you can construct a new column leveraging the information from the old column. So for example, when you're actually going from row store to column store, it's very expensive. We call that population, to bring it into memory. It's very expensive to identify the distinct symbols. You basically are going to use a hash table or an ART, you guys have heard of ART, the adaptive radix tree, an index or something. Yeah, radix tree. You need something to identify, give me my distinct symbols. It's expensive to do that. And so what you want to do is leverage the fact that, hey, I've already created a dictionary before. I just have some dirty rows here. So you leverage that to build the new CU, the new column CU. Okay, so that's incremental repopulation. And the last thing here that we do as a little trick is, oftentimes when you are invalidating a row, invalid means you're updating or deleting or whatever, you're really just touching a column or some number of columns. You're not actually touching all the columns. And so if you run a query that's unaffected by the invalid columns, because you're only accessing those columns that were not affected, then you can still go columnar, right? You don't have to worry about accessing the invalid rows. Okay, so these three kinds of techniques here allow us to really run mixed workloads faster. And it gives us this best of both worlds. Okay? When you say, like, I want an IMCU on a column or a table, is it all or nothing? There's no paging, it's like either the table fits or it doesn't fit? No, so from the very beginning at Oracle we've always said that you can't assume that everything's going to fit in memory. It's just not realistic.
So what happens is, whatever limited memory you have, you'll populate that, bringing it into memory. Anything that doesn't fit will stay on disk. Now you, like I said, you have the ultimate control: you can say which columns you want, you can say which partitions you want, whatever it is. It's not like a buffer cache where, like, you have a... No, no. Exactly, it's a store, it's a column store. So it's not a cache. Now, I'll talk to you about how we're transitioning over to actually making it a cache when we're doing storage tiering and things like that. All right, so far so good. Doing all right on time. This is a very quick slide, I'm just going to skip through the animation here. And this is just telling you how, oh, question, I'm sorry, go ahead. Why do you need the row-based and the column-based? Because ultimately you are using the column store and processing your columns. So what you're doing is, like, bringing into memory the row-based and the column-based, which doesn't seem efficient for memory. Yes. Yeah, so let me. Okay, so I think the question was that it sounds like you're bringing both the row format and the column data into memory, and it sounds like that's inefficient, to basically have two versions of your data in memory. And you need to keep them in sync and up to date, right? Okay, so let me see if I can answer this appropriately. Depending on your workload, you might be an analytic-heavy workload. You're never going to bring the row-formatted data into the buffer cache. Makes sense? You're only going to leverage what's in the column store and you're done. If you're OLTP heavy, all you're doing is transactions: inserts, updates, deletes. You're not doing joins or aggregations or anything. You're never going to even bring it into the column store. That makes sense? So that's the two extremes. Now the middle ground is when you have some percentage of DML, some percentage of scans. Now where in-memory is going to be beneficial is, of course, if you're more heavily towards the scans than you are to the DML. Now the DML is going to bring blocks into the buffer cache, but it's not going to bring the whole table into the buffer cache. Does that make sense? And so you are going to occasionally go down the row store, because you're just looking for a row and you're going to use an index to get to that. Or you're going to do analytics and go through all those. So that's the difference. Got it? Okay, question. So you said you only bring in that portion of the table that fits in memory. Yes. So the question is, since we don't have to bring everything into memory, what happens when you're doing a full table scan and you actually access something that's not in memory? That's your question. So it's no different than what I was mentioning about the invalid rows. If you have an invalid row, you have to go to disk to go get it. It's the same concept. Imagine you have, so you filled up 50% of your data in memory and you have those in IMCUs. Imagine you have the other 50% as empty IMCUs or dirty IMCUs, just if you can envision that, then you're always going to disk to fetch those 50% of the rows. That's the behavior that you'll have. So you'll get everything that you can from memory, in the column store, and you'll get the rest from disk or the buffer cache. Got it? And you can compress the hell out of these things so you can fit as much as you can. So I didn't mean to say that.
But you can say what you want. I have to bleep it, because they make me bleep it. Okay. There it is. But you can compress it, bring everything you can into memory, if you can. Okay. Moving on. I'll just blaze through this, but this is a somewhat obvious point for me, because I explain it all the time, but it may not be so obvious to others. How do we get mixed workloads to run so much faster without doing much, but just having a column store? So normally, when you have a mixed workload, you'll have, here's a table, you'll have 1 to 3 OLTP indexes. Again, indexes on columns like key columns, primary key columns, that you're really quickly trying to get to the values of. And then you have 10 to 20 analytic indexes, because for all the columns that you want to run analytics on, you're going to have a separate index for that. Now, you can basically throw away those analytic indexes altogether and just maintain the column store. And anytime there's an update, as I mentioned, you can very quickly mark a bit that says this guy's been updated, he's been deleted. So in memory the updates very quickly indicate a DML has happened. Okay, and so now you don't pay the cost of updating all of these analytic indexes for every update that you have to your table. Alright, number two, vectorized analytics. So we touched upon this briefly with the SIMD scans. I'm going to walk through that a little bit more in detail in terms of how that works. So, parallelized predicate evaluation: load, eval, store, consume result. These are the steps that you take when you want to evaluate a predicate. So let's just go through a quick example. So let's say you have this select count star from table T where A is greater than 10 and B is less than 20, right? So the first thing you're going to do is you're going to load A, and you're going to bring four values into a SIMD register, right? Then, as I mentioned, you're going to splat 10 across that SIMD register. Then you're going to do a comparison, greater than, greater than, greater than, greater than, all in one cycle, and you've got a bit vector that says 1, 1, 0, 1. Then you do the same thing for B. You load B, you compare it to 20, and that's the bit vector that you have for it. And now you just need to AND those two bit vectors together, and now you have the final set, 0, 1, 0, 1. That is actually not completely done, because that's stored in a SIMD register. Four bits are stored in a 128-bit register. I don't want to waste 128 bits to store four bits. So you basically pack them, that's another instruction that the SIMD instruction set supports, that packs them into just four bits, literally four bits. And then you store that wherever, or utilize that for the next predicate. So that's a little more detail on how SIMD operations work. Okay, so not just scans, but we're also making joins faster. So bloom filters, you guys are familiar with bloom filters. The way we make bloom filters even faster, it's a combination of things. We apply the bloom filter on the dictionary first, that's the first thing, because you only need to run it on the distinct symbols, you don't need to run it on all the values, and once you run it on the distinct symbols you need to map them back to those symbols that map to those codes. The second way we make bloom filters faster is we vectorize the operations. We use SIMD to basically do a set membership lookup that says, within a bit vector, determine if this value is set in this bit vector. This slide doesn't talk about bloom filters.
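Here is a hedged C++ sketch of the "bloom filter on the dictionary" idea; `bloom_maybe_contains` is a stand-in for whatever bloom filter probe you have, and the names are made up:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical join-filter step: instead of probing the bloom filter once per
// row, probe it once per distinct dictionary symbol, then turn the per-row
// codes into matches with a cheap array lookup.
std::vector<uint32_t> filter_rows(
        const std::vector<std::string>& dictionary,   // distinct symbols
        const std::vector<uint32_t>&    codes,        // one code per row
        const std::function<bool(const std::string&)>& bloom_maybe_contains) {
    // 1. One bloom probe per distinct value (thousands), not per row (millions).
    std::vector<bool> code_passes(dictionary.size());
    for (size_t c = 0; c < dictionary.size(); ++c)
        code_passes[c] = bloom_maybe_contains(dictionary[c]);

    // 2. Per row it is just an index into code_passes, which is easy to vectorize.
    std::vector<uint32_t> passing_rows;
    for (uint32_t row = 0; row < codes.size(); ++row)
        if (code_passes[codes[row]]) passing_rows.push_back(row);
    return passing_rows;
}
```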
This takes it to the next level. This is saying, if you can tell me which two columns across two tables you plan on joining on, that's what we call a join group. If you say, hey, you're going to join between vehicles and sales on this name column, once you tell me that, or tell us that, we will encode both of these columns using the same dictionary. Without this, we don't know that these two are related, and so we'll basically have a separate dictionary for this guy, a separate dictionary for this guy, and they can have different codes depending on how the IMCUs are broken up, and once they have different codes they don't relate to each other anymore. Like code 5 here doesn't match code 8 here, even though 5 and 8 both map to BMW or whatever the vehicle name is, BMW or whatever. So once you tell us this, we will use the same dictionary to map the codes, and now you're just doing a code-to-code match. You're not doing a normal join. You guys implemented a join? Yeah. A hash join involves building a hash table: you've got to hash the key, you've got to insert into a hash table; when you do a probe you've got to again hash the key, you've got to do a key comparison, you've got to follow chains, all of that, right? Here it's just a code: you index into an array, it's not null, there's a match, and that whole thing can be vectorized. It's a very simple operation to do a lookup. Does that make sense? If sales name was a foreign key to vehicle name, would you do this automatically? Yes. Yeah, good question. So there are cases where we do things automatically, and when I talk about the future, we're going to go to an extreme on this. Okay. So that's joins. Aggregations, we do the same thing. Now this is very detailed, so just follow my words and don't necessarily look too much at this, but we have two forms of aggregation push-down: this one is single-table aggregation push-down, and this one is something called vector transformation, which is doing aggregations above a join, and this is complicated, right? You can't make much meaning by just looking at this graph, but let's talk about this first. What we're doing here is, imagine you have a select sum from this table with a predicate, right? Normally, again, with that Volcano model, right, you're sending back every passing row to another operator, and that operator is then adding and adding and adding and adding, one row at a time, maybe a batch of rows at a time, right? But it's doing a couple of things: one is it's copying the symbol into a buffer, the other operator is reading that buffer, then it's looking up an aggregate and it's adding to it. It's quite expensive. What we're doing here is we're saying, you know what, let's leverage the format. This was dictionary encoded, and if it's dictionary encoded we can use the dictionary codes and aggregate against the dictionary codes and the distinct symbols, and then we can do that all at the scan layer and then only project back a partial aggregate result to the higher-level operator. Okay, so those are the techniques that we're doing for aggregation. I have an example, this vector transformation. I'm not going to go into this. All I'm going to say is, again, this is a paper that we have in ICDE 2018, I think. It's a very novel technique that basically pushes aggregations down and joins down into the scan layer.
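A minimal C++ sketch of that code-to-code join, assuming a shared dictionary and, for brevity, at most one build row per code; the names are hypothetical:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical join-group sketch: both sides were encoded with the SAME
// dictionary, so "BMW" has the same code in vehicles.name and sales.name.
// The join probe collapses to an array lookup instead of hashing strings.
std::vector<std::pair<uint32_t, uint32_t>> join_on_shared_codes(
        const std::vector<uint32_t>& build_codes,   // e.g. vehicles.name codes
        const std::vector<uint32_t>& probe_codes,   // e.g. sales.name codes
        uint32_t dictionary_size) {
    constexpr uint32_t kNoMatch = UINT32_MAX;

    // Build side: remember which row (if any) carries each code.
    std::vector<uint32_t> code_to_build_row(dictionary_size, kNoMatch);
    for (uint32_t r = 0; r < build_codes.size(); ++r)
        code_to_build_row[build_codes[r]] = r;      // assumes codes unique on build side

    // Probe side: no hashing, no key comparison, no chain following, just an index.
    std::vector<std::pair<uint32_t, uint32_t>> matches;  // (probe row, build row)
    for (uint32_t r = 0; r < probe_codes.size(); ++r) {
        uint32_t b = code_to_build_row[probe_codes[r]];
        if (b != kNoMatch) matches.push_back({r, b});
    }
    return matches;
}
```

The probe loop is a plain array index per row, which is why it vectorizes so well compared to hashing keys and following hash-table chains.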
So basically, instead of actually sending all these rows back up, up, up across the operators, the join operator and the aggregation operator, we do everything in the scan operator, and that's that. Again, I'm not going to talk about that, but I am going to talk about this. This is kind of cool, something that you probably don't think about, but we have to think about it in industry: very large numbers. Anybody know how very large numbers are encoded? There are a variety of different formats, and by very large I'm talking about, let's say, 30-plus digits, you know, something that doesn't fit in 64 bits, in a 64-bit register. Oracle supports a software-implemented type for NUMBER. You might think of a number as just a register, an int or whatever, right, or a long. We treat it as a sequence of bytes, where each byte represents a digit up to 100, it's base 100, and the first byte indicates an exponent as well as the sign, plus or minus the numeric. Okay, so this is how it's implemented. So each operation on a NUMBER type is like hundreds of cycles, literally hundreds of cycles, as opposed to one cycle for doing a register operation. So let's walk through an example of how we support very large numbers. So imagine you have a select sum of A from table T group by J comma K. So you're doing a sum on A, grouping by J comma K, and let's suppose this is your table: these are your grouping columns J and K, and that's your measure column A. Okay, what we do is we create this frequency table, and in this frequency table, on the x-axis are the dictionary codes. Imagine these are dictionary codes, they map to actual symbols, just to illustrate this example here. So you're going to have 0, 1, 2, 3, and these map to 10,001 all the way to 51,819 here, et cetera, okay. That's the x-axis. On the y-axis you have the groups, the distinct groups. So the possible combinations of groups are 0,0, 0,1, 1,0 and 1,1, if you look at all these values over here, right. Now what we do is we walk through each row and we add a count. So if I go to 0,1, that maps to this group, I index into 51,819 and I bump up a count here that says I've seen it one time, and I do the same thing for each one of these values, 51,819 again for the other group, I bump it up, I do this, I bump it up, and in the end I basically have this table that says, have I seen these dictionary codes for these different groups. Anybody with me so far? Okay, now what we do is we will aggregate across each group by multiplying the frequency times that dictionary symbol. So it will say, code 0 shows up 0 times, so I don't need to multiply it, but this guy shows up one time, so I need to grab its code, look it up in the dictionary and multiply it, and so this is basically the operation: 0 times this symbol, 0 times the next symbol, and so on. Why do we do this? Question. If you're mapping, are you saying, if the number of groups is too large, or if the number of dictionary codes, there could be a lot of distinct numbers, then you have dictionary codes for each of those numbers. Got it, it may not be helping as much. Absolutely right. So you've identified some of the weaknesses of this. So the question was, I think, if I understand your question, there's a one everywhere here, for example, because it's so diverse, I guess that's the right word, and so every one of these codes only shows up in one particular group or whatever it is, and you basically end up multiplying everything, and that is a weakness of this. It depends, it depends on the data and how the data is actually organized, and also it
depends on how many groups you have. If you have many groups, and this table starts to become very large because you have many distinct symbols and many groups, it's going to be a very expensive operation to go through this for every single group. But just to finish off what the idea here is: we're basically replacing addition with multiplication. Instead of having to add each one of these values up, summing the same value over and over, I'd rather do 10 times 5 as opposed to adding up 5 ten times. And this is like an obvious thing you would think about in terms of factoring, but it's very important for software-implemented types, because each one of these operations is so many cycles, 80, 100, some number of cycles. So, as you were saying, it does depend on the workload in terms of when this actually kicks in or not. Does that make sense? One question again: the groups? So the groups can be analyzed by walking through the columns J and K. So as you walk through J you'll identify how many distinct symbols are in J, you walk through K, how many distinct symbols are in K, you multiply the two, kind of like a multi-dimensional array, to get the max possible code, and that's the table that you create. And then when you want to index into it, you take this and this, multiply them together, and that's how you index into the table. Good, okay. Alright, I'm going to skip through several slides here in the interest of time. So, numbers. Joins are made faster, and in-memory expressions I'll skip over here, dynamic scans I'll skip, and I'll move straight into this thing called In-Memory plus Exadata. So this is number three, where we're bringing in-memory into the storage tier. So for that, we have something called Exadata. Exadata is a database machine that we build from scratch, utilizing basically the best-of-breed hardware: the fastest networking, the best SSD drives, a lot of memory, a lot of flash, right. We build that from scratch, and we build it to run Oracle as fast as possible. How much is that box? That's a good question. It's definitely in the millions. We sell them in various different configurations, but it's definitely in the millions. But it's amazing how many people buy them. It's like, amazing. It's a billion-dollar business for Oracle. Okay. Ah, okay, so on Exadata you have the compute nodes and you have storage nodes. Okay, the compute nodes are where all the higher-level SQL operator processing happens: joins and aggregations and sorting and all that jazz happens there. The storage nodes are where you are closest to the data, where it sits in the SSD drives, so that's where you can actually do some fast filtering and then only send back the rows that pass, through the network, to the compute nodes. What is this in the context of the architectures we discussed last week?
This is shared-disk, distributed, correct. So what you had, without in-memory, is we also have some flash that's sitting on these storage nodes, and how we use that flash is we use it as a cache. And this goes to what Andy was talking about earlier: we actually have it as a cache in the sense that the hottest data that you're accessing will be moved from the SSD drives into the flash cache, and now it will also be stored in a columnar representation. Now, the beauty of this is that on the compute nodes you're limited to how much DRAM you have, like one and a half terabytes of memory, which you might think is a lot, but in real enterprises that's nothing, it's like a drop in the bucket. You really need the hundreds of terabytes that flash gives you. And once you have all of that memory, you can basically do some really nifty storage tiering. Now you can have most of your hottest data sitting on your compute nodes, and if you run out of memory, no problem, it will automatically get populated into the flash cache sitting on the storage nodes. Now how does a query work? So imagine you're doing a scan, a full table scan. It says select from my phone book all people in my address book that live in California, whatever. It's going to go through here, it's going to say, okay, I've exhausted everything that's over here, the rest of my table is sitting on my storage, and it's going to say everything is cached in my flash cache. It will utilize the same vectorization techniques that we've talked about, the same improvements that we make for joins and aggregations and so on. It leverages all of that, it's just doing it from flash, and so you have to bring it from flash into the DRAM that's local to that storage node, but then after that you do the same techniques and then you send it back up. Does that make sense? And this form of storage tiering gives you a much larger column store, and it allows you to place data where it belongs: hot data in the most expensive memory, that's the fastest, warm data in the flash cache, and cold data sitting on your persistent store, on your disk. This is just some performance numbers, I think we can skip that, and let's skip this. Anybody interested in fault tolerance? Okay, good, let me tell you about how we achieve fault tolerance. We basically maintain a cluster of nodes, we call it RAC. On RAC you can have many different nodes, let's say four nodes in this example, and you have a column store on each one of these nodes. Each one of these nodes has memory, you can put a column store in there. What you can do is you can bring your data into this column store and you can duplicate it on any one of these other nodes. So for example, this red IMCU is duplicated on this node, and this blue is duplicated on that one, the black is duplicated there, and you can have at least two nodes that you can duplicate to, so that if anything happens to this node, queries just need to be redirected to the node that has that IMCU, right? It's as simple as that. You can do even better by having full duplication, where this red IMCU is stored everywhere, and that gives you both availability as well as performance, because every access now, when you go to it, will be local. You're not doing a remote access to another node to fetch its contents and run that query, everything is local. So that's how we achieve fault tolerance. It's fairly simple and straightforward, at the IMCU level: every IMCU, which is about half a million rows, gets duplicated. I'm going to skip Data Guard. Andy, how are we doing on time?
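As a sketch of the redirection logic being described, purely illustrative and not Oracle's RAC code, the routing decision per IMCU is as simple as this:

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical placement record for one IMCU: a home node plus a duplicate
// node. A query simply redirects to the duplicate if the home node is down.
struct ImcuPlacement {
    int home_node;
    int duplicate_node;
};

int node_for_imcu(const ImcuPlacement& p, const std::vector<bool>& node_up) {
    if (node_up[p.home_node])      return p.home_node;       // primary, usually local
    if (node_up[p.duplicate_node]) return p.duplicate_node;  // fail over to the copy
    // Both copies lost: fall back to repopulating from the row store on disk.
    throw std::runtime_error("no in-memory copy available");
}
```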
Okay, alright, number four is intelligent automation. This is where we're trying to be a lot smarter. So far, everything I've told you, for the most part, the DBA or, you know, the user has to direct us in some way: says, hey, bring this table into memory, or hey, create this join group, right? We want to be a lot smarter. You guys all work with Andy, Andy's the creator of self-driving databases. Larry Ellison stole the name and made money out of it, made money out of it. So anyways, that's where we're moving towards, self-driving databases. I have a separate talk later on today that talks more about how we achieve this, but the idea here is, picture this DBA, poor DBA, and the DBA is saying, I have to manually manage what to put in this column store and what to keep out. I don't know what tables are hot, I don't know what's cold, I don't know what's warm, I have no idea what queries you're going to run. But as I mentioned earlier, it's a column store, so once this store becomes full of tables, that's it. The DBA has to decide what to bring in, and the desired outcome is you want to keep hot objects in memory and remove the colder objects. So that's what we had. What we have now is something smarter. Here we will observe the access patterns. We have something we call a heat map. The heat map basically says, at the very smallest granularity of access, a block, and the block has some number of rows, at the block level we will say how often it has been accessed, how hot it is, and how it's been accessed: was the access because of scans or DML, what, how. So we observe the access patterns. Once we observe the patterns, we'll classify the data, we'll say hot, intermediate, cool data, and once we classify the data, we'll remove the cold data. Very simple, very straightforward. We take it to an even more complicated level, where we'll do it not just at the table level or the partition level, but even at the column level. So we'll say something like, this column is being used to aggregate on, it's a measure column, and this column is being used to predicate on. Okay, now I know that this column is for predication, or for predicate evaluation, and this column is for aggregation. I could format them in a different way. I might use dictionary encoding for one and maybe another compressed format for the other, because dictionary encoding involves getting the code, looking it up in the dictionary, getting the symbol; it's a couple of levels of indirection and that's expensive. But it's really fast for scans, because I just need to look at the bit-packed codes, I never have to decompress it. Whereas for aggregation I need to actually go get that symbol from the dictionary and then I need to add it, so why should I dictionary encode that? Maybe I should just keep it uncompressed so I have the symbols right there. So this is where you can start becoming a lot smarter about knowing how your data is accessed, how the columns are accessed, how they should be formatted. Maybe these four columns are never accessed: get rid of them, compress them, or if you want to be safe, compress them, or evict them out of the column store altogether. Okay, so anyways, so that's how we're trying to remove the guesswork out of the picture and just start being a lot, lot smarter. And this gets complicated, because, you guys all worked with the buffer cache, imagine you have a mixed workload where sometimes you're going to run OLTP, sometimes you're going to run analytics. How much memory do you dedicate for the column store versus how much do you dedicate for the buffer cache, right? How do you
determine that? How do you know what the behavior is going to be today versus tomorrow versus the next day, and so on? So all of these factors have to come into play, and you have to be very elastic in being able to switch from one to the other, and you have to be very, very fast and accurate. No customer wants to move over to Oracle's Autonomous Database and all of a sudden, A, experience worse performance, and B, experience, like, inconsistent performance. Those are two terrible things. At the very least you want to be consistently bad, that would be my suggestion: if you're going to be bad, be consistently bad, so that the user knows, at least I know what to expect today and tomorrow and so on. Alright, so I'm going to skip through some more things we'll talk briefly about. Oh, question. There was a parameter to control, like, what value is that, the finer grain? This is saying, for example, when the column store is full, the action we could take is just evict out cold columns, that's it. We never will decide what should go in to replace it, that's just based on your query: whatever you touch, you bring in. That's one mode. Another mode is, now we'll be smarter, we'll evict and tell you what to bring in. And yet another mode is an extreme mode, that's where, like, hands off, we'll do everything: we'll decide from the get-go what should be in the column store, from the get-go what columns should be in there, how to compress it, and so on. Do you make any decisions about what's actually in the data, like look at the values, this says order dates that are beyond, you know, 3 months? That's your question. We just look at, we just look at frequency, NDV, right, that's all we really look at. You could look at, if it's a date column, for example, let's take a date column, if it's a date column we could be smarter in terms of how we encode it. We can encode the months separate from the year, from the dates, and get better compression. If it's a number column, we can compress them with a binary representation. JSON, we can do. So we do look at data types to compress them in a much more efficient representation, but we don't look at it to derive meaning out of it. Now, that's kind of another level. Once we have the queries together, we have a repository, and we have the data and we have access patterns, we can do things like that. I don't know if we're, I think we could do it. So we haven't actually done anything like that yet, because, we'll see, here's the thing: the data likely will be encrypted as it is, and so, you know, we can't do anything. NDV even is difficult; you can do it from the dictionary, but other than that you can't. Any other questions?
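Here is a toy C++ sketch of the kind of heat-map scoring and eviction just described. The metrics match the ones mentioned (scan frequency, DML, space), but the scoring function and knobs are made up for illustration, not Oracle's actual policy:

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Per-column heat-map stats, tracked at whatever granularity you observe access.
struct ColumnStats {
    std::string name;
    uint64_t scan_count   = 0;   // how often it has been scanned
    uint64_t dml_count    = 0;   // how often it has been modified (unused in this toy score)
    uint64_t bytes_in_mem = 0;   // space it consumes in the column store
};

// Toy heat score: more scans means hotter, more space for the same scans means colder.
// A real policy would also weigh DML activity and recency of access.
double heat(const ColumnStats& c) {
    return static_cast<double>(c.scan_count + 1) / static_cast<double>(c.bytes_in_mem + 1);
}

// Pick the coldest columns to evict until enough space has been reclaimed.
std::vector<std::string> pick_evictions(std::vector<ColumnStats> cols, uint64_t bytes_needed) {
    std::sort(cols.begin(), cols.end(),
              [](const ColumnStats& a, const ColumnStats& b) { return heat(a) < heat(b); });
    std::vector<std::string> victims;
    uint64_t reclaimed = 0;
    for (const auto& c : cols) {
        if (reclaimed >= bytes_needed) break;
        victims.push_back(c.name);      // coldest columns go first
        reclaimed += c.bytes_in_mem;
    }
    return victims;
}
```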
Question: are there any kind of game-theory considerations, like how much it would save to move this over here, and once you've hit a certain amount of time, you pull it back out? Great question. So that's kind of our expert system. It's not machine learning, it's an expert system. It takes into account, again, let's just take the simplest metrics: how often has it been scanned, how much space does it consume, and how has it been scanned? Those are the three very simple metrics, and from that you can design an equation that meets the threshold you optimize for. The other part that's missing is time: when has it been accessed? You take all of those into account and you've got a pretty smart system, without anything more complicated than that. You can go further: you can look at correlation between multiple columns, correlation across multiple tables, etc.

Question in the back? Good question. So the question is: how do you determine when to bring something into the column store automatically, and when to evict these objects? So I'll answer the first part. When to bring it into memory: again, it's similar to what I was saying with those attributes. Let's say there's a table that's not in memory and the column store is full. How often has this table been scanned? Not that often. How often have the tables that are already in memory been scanned? Very often. Then it doesn't make sense to replace them. Once the number of scans for this table has increased, then it makes sense to evict something from the column store and bring something in. But it's tricky, because you have to know: is that just for a short period of time, after which it won't be accessed again? I spent a lot of CPU cycles to transpose it and bring it into memory, and now I'm done; it's midnight and I don't need to run those reports again. So we try to be smart about identifying the access patterns before we take action. We're actually a fairly conservative system, and we will slowly make modifications. Like any expert system you design, you're going to have feedback mechanisms in here, right? You're going to ask: did I make the right decision just now? And if I didn't, I learn from that, I remember it, and the next time the situation comes up I'll be smarter about what I bring in. I'll be honest with you, this is not a perfect system. There's a lot of work that still needs to be done here, and that's where we're trying to figure out how to get 80% of the benefit with 20% of the work, so we can get there quickly.
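A minimal sketch of the kind of expert-system scoring described above, combining scan frequency, space consumed, how the object is accessed, and recency into a single heat score and a hot/intermediate/cold label. The weights and thresholds are invented purely for illustration; the real policy is not public.

```python
from dataclasses import dataclass

@dataclass
class ObjectStats:
    # Hypothetical per-object (table/partition/column) heat-map counters.
    scans: int                 # how often it has been scanned
    dml_writes: int            # how it has been accessed: DML vs. scans
    size_mb: float             # how much column-store space it consumes
    hours_since_access: float  # recency

def heat_score(s: ObjectStats) -> float:
    """Illustrative score: frequently and recently scanned, small objects score highest."""
    recency = 1.0 / (1.0 + s.hours_since_access)
    scan_weight = s.scans + 0.25 * s.dml_writes   # scans matter more for the column store
    return scan_weight * recency / max(s.size_mb, 1.0)

def classify(s: ObjectStats) -> str:
    score = heat_score(s)
    if score > 1.0:
        return "hot"
    if score > 0.1:
        return "intermediate"
    return "cold"   # eviction candidate when the store fills up

if __name__ == "__main__":
    print(classify(ObjectStats(scans=5000, dml_writes=100, size_mb=200, hours_since_access=0.5)))
    print(classify(ObjectStats(scans=3, dml_writes=0, size_mb=4000, hours_since_access=72)))
```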
What's that? Question over there? So if this is a 0-1 access pattern, and table 0 and table 1 each completely fill the column store, you have to evict the first table to bring in the next one; then I query 0, then I query 1, and what do you do then? Yes. So let's say 0 and 1 are both hot and you don't have space for both. What do you do? The first thing we do is ask: which of these columns are not being accessed? Normally you have a table that's a very fat table. Anybody familiar with TPC-H? TPC-H is a benchmark, the data warehouse benchmark. TPC-H has a table called lineitem, and lineitem has a column called L_COMMENT. L_COMMENT is never used in the queries of that particular benchmark, and for Oracle, L_COMMENT takes up about 20% of the space in memory; it's a gigantic varchar string. Oracle is very aggressive: we will dictionary encode it, which doesn't really make sense, because the values are all distinct symbols. Why the hell am I dictionary encoding it? I'm never going to leverage it. But we do that because we assume you might scan against it, and if you're going to scan against it, dictionary encoding is the smartest thing.

So going back to your example: L_COMMENT is never used. Lineitem is hot, that's a hot table, but that column L_COMMENT is never used, so we'll evict it out, or compress it just to be safe. That act of compressing or evicting it might allow you to bring the other table in. Does that make sense? So first we try to be very conservative and get both of those tables into the column store by compressing columns. If that doesn't work, we take the next bigger action, and if that doesn't work, we take the next bigger action after that. For example, tables are usually partitioned; you can partition by date. Maybe the old date partitions aren't needed, you only care about the hottest dates, so you keep that in mind and evict the old dates out. That makes sense. Now, if you're in a situation where both really are hot, then we stay conservative. We're not going to constantly populate, evict, populate, evict; that just burns more cycles doing the population. So we'll choose one and live with the consequences. Okay.
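Here is a rough sketch of that escalating sequence of actions for when two hot tables can't both fit: first compress columns that are never scanned (the L_COMMENT case), then evict them, then drop cold partitions, and only as a last resort pick one table and live with it rather than thrash. Everything below (the structures, the assumed 50% compression savings, the example sizes) is a simplified illustration, not the actual Oracle policy.

```python
def plan_actions(columns, partitions, needed_mb):
    """Return (action, target) steps, mildest first, until enough space is freed.

    columns:    dicts like {"name": ..., "size_mb": ..., "scans": ...}
    partitions: dicts like {"name": ..., "size_mb": ..., "is_cold": ...}
    needed_mb:  space we must reclaim to fit the incoming table
    """
    freed, plan = 0.0, []

    # Step 1: recompress columns that are never scanned (assume ~50% savings here).
    for col in columns:
        if freed >= needed_mb:
            return plan
        if col["scans"] == 0:
            plan.append(("compress", col["name"]))
            freed += col["size_mb"] * 0.5

    # Step 2: evict those unused columns entirely (reclaim the remaining half).
    for col in columns:
        if freed >= needed_mb:
            return plan
        if col["scans"] == 0:
            plan.append(("evict-column", col["name"]))
            freed += col["size_mb"] * 0.5

    # Step 3: evict cold (e.g., old date-range) partitions.
    for part in partitions:
        if freed >= needed_mb:
            return plan
        if part["is_cold"]:
            plan.append(("evict-partition", part["name"]))
            freed += part["size_mb"]
    if freed >= needed_mb:
        return plan

    # Step 4: still not enough -- keep what's in memory and accept the miss
    # rather than thrashing with populate/evict cycles.
    plan.append(("live-with-it", None))
    return plan

if __name__ == "__main__":
    cols = [{"name": "L_COMMENT", "size_mb": 2000, "scans": 0},
            {"name": "L_EXTENDEDPRICE", "size_mb": 800, "scans": 5000}]
    parts = [{"name": "sales_2015", "size_mb": 3000, "is_cold": True}]
    print(plan_actions(cols, parts, needed_mb=4500))
```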
All right, I know I'm running out of time here, so I'm just going to... oops, sorry. All right, persistent memory. Okay, let's talk very quickly about persistent memory. It's a new silicon technology whose capacity, performance, and price sit between DRAM and flash. There's a nice little picture here that shows, as you go from disk to flash to persistent memory to DRAM, you're going faster, faster, faster, and at the same time the cost per gigabyte keeps going higher. PMEM is really expensive right now; eventually it will come down in price, but flash is going to get even cheaper too, so it's very interesting how we put systems together with all this technology to get the best performance. Intel calls this Optane DC Persistent Memory; DC is data center, I believe. They're pretty much the only game in town right now; there are others, but Intel is the only real game in town right now. It reads at memory speed, much faster than flash, and it survives power failure, unlike DRAM. So you can imagine building, as Andy was saying, a brand new storage engine on top of persistent memory. It's tricky; Hideaki here is going to give a talk tomorrow on how you can leverage persistent memory. It can be tricky because it involves using some new instructions to make sure data is properly flushed all the way back to memory. I won't go into the details of that.

I will just bring up this slide on how we are planning on using persistent memory for the in-memory column store. Today, as a baseline, you cannot necessarily fit all of your data in memory; that's the example we've been talking about. So assume it doesn't fit in memory: you're going to have disk plus memory, and the queries then have to go against the column store in DRAM as well as the row store on disk. DRAM DIMMs can go up to 128 gigabytes, but no one buys them at that size; it's very expensive. So that's today's baseline. With PMEM you can conceivably fit everything inside a PMEM DIMM; it's roughly 3x denser. Then there's something called memory mode. This is really cool. Memory mode, which Intel supports, lets you have your persistent memory DIMM with a DRAM DIMM sitting on top, so when you access data, the DRAM DIMM serves as a cache: your hottest data sits in DRAM, and if it's not in that cache, then it's brought from the persistent memory DIMM into the DRAM DIMM. So the DRAM DIMM is a cache in the strict sense; it's not extra memory that you get to use, but the benefit is that you're accessing PMEM almost as fast as DRAM. That's memory mode: the hottest tables are cached in DRAM for the fastest access, and these persistent memory DIMMs can be quite large, up to 512 gigabytes.
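To illustrate the memory-mode idea, here is a small sketch of a DRAM tier acting as a cache in front of a larger persistent-memory tier: hot blocks are served from DRAM, misses are pulled up from PMEM, and the DRAM capacity does not add to the total, it only caches. This is a toy LRU model written for illustration; the real hardware manages this transparently below the software and does not literally work this way.

```python
from collections import OrderedDict

class MemoryModeSim:
    """Toy model: DRAM is an LRU cache over a larger PMEM tier (capacities in 'blocks')."""

    def __init__(self, dram_blocks, pmem_data):
        self.dram_capacity = dram_blocks
        self.dram = OrderedDict()          # block_id -> value, kept in LRU order
        self.pmem = dict(pmem_data)        # the full data set lives here
        self.hits = self.misses = 0

    def read(self, block_id):
        if block_id in self.dram:          # served at DRAM speed
            self.hits += 1
            self.dram.move_to_end(block_id)
            return self.dram[block_id]
        self.misses += 1                   # pulled up from PMEM, then cached
        value = self.pmem[block_id]
        self.dram[block_id] = value
        if len(self.dram) > self.dram_capacity:
            self.dram.popitem(last=False)  # evict the coldest block from DRAM
        return value

if __name__ == "__main__":
    # 512 "blocks" of data in PMEM, only 128 fit in the DRAM cache.
    sim = MemoryModeSim(dram_blocks=128, pmem_data={i: f"block-{i}" for i in range(512)})
    for _ in range(100):
        for hot in range(64):              # a hot working set smaller than DRAM
            sim.read(hot)
    print(sim.hits, sim.misses)            # almost every access is a DRAM hit
```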
And then this slide just talks about our latest Oracle 20c and some of its techniques. The reason I mention it is that we ran a benchmark, SSB, the Star Schema Benchmark; it's loosely based on TPC-H. There we showed that with about 384 gigabytes of DRAM, the table didn't fit entirely in memory; we processed 18 billion rows and it took 130 seconds. Then we used memory mode with persistent memory and it dropped by about 10x: it took 12 seconds to go through those 18 billion rows. And then we added Oracle 20c, which has some cool techniques for making joins faster, and we showed it can go an additional, you know, 4x faster than that. The main point I want to make is that this really is a game changer for us. We're looking at persistent memory as expanded memory, larger capacity, so now you can fit everything into memory. Gone are the days where you need to put everything on disk; you've got terabytes and petabytes of memory.

Okay, I'm going to skip the converged analytics and just open it up to questions, but I'll say one thing about this slide. One of the things Oracle is trying to push is that you don't need a separate storage engine for a document store, or a spatial store, or Neo4j for graph, or AI, or IoT, or treating the database as a file system. Oracle is trying to position itself as a one-stop shop, right? You can do everything on a single database, and that has a ton of benefits, one being security. You don't want to have to migrate your relational data out and bring it into yet another graph database just so you can run graph queries on it, right? There's a customer, where is it, a customer in Europe: they built a graph database, and they use a graph to represent transactions between different people on their PayPal-like system; every edge in their graph is a transaction. But they have to take their data out of the relational store, bring it into the graph store, and then run their query, and that takes, one, time, and two, there's a security exposure now, because you're taking unencrypted data out of the database. Same thing for a document store with MongoDB, but you can also store JSON nicely inside your relational database. Okay, so I'm not going to talk much more about this; I'll just skip through it and leave it at that.

Okay, so the last thing is this innovation summary. I want to make it very clear that 2014 is when we came out, and all of these little bullets here are really massive projects. Oracle is committed to in-memory, and our roadmap keeps going, whoops, I'm drawing it up higher, from 2014 and so on. We're super committed because we feel like now is the time: memory is becoming readily available, it's becoming cheaper, it's becoming larger, the hardware technology is all there, the requirements from real-time enterprises are there, and so our features need to be there as well. And the main drivers for us are the self-managing in-memory database, everything needs to be autonomous; looking at everything, not just relational but spatial, text, graph, etc.; and vectorizing all of our operators, not just the simple scans and so on. Okay, I think that's it. I'll skip that, I'll skip that, I'll skip that, and I'll open it up to any questions you may have.

A question about being able to support multiple types of data models: I was wondering, do you actually keep the information separately for the graph and for the relational data, or do you have it stored one way and make it work for both? Great question. So the question is: is there a single representation of your data when you want to use it for a graph query versus a relational query, or are there multiple representations? Did I get that right? Okay. So I went through those slides very quickly, but let me explain what we do for graph, or say JSON, or text. The techniques we use for text, spatial, JSON, whatever it is, are about representing the data in a much more efficient manner in memory, in the column store. So think of dictionary encoding: tricks like that to represent your data more efficiently.

All right, one quick question. So in addition to having a database systems background from taking a course like this, how important, if you're looking to hire somebody, would it be for them to have a background in, say, testing database systems or query optimization? Oh yeah, great question. Okay, so when we look to hire people, the first thing is this: I always look for people that are enthusiastic, smart, and excited about the technology. That's first and foremost, because we have mathematicians, we have a mechanical engineer, we have a chemist, we have all sorts of people working on database systems. The folks that actually have a solid background in database systems, however, they're, what's the word, the holy grail in some ways, because we're the core storage engine team, but we interface with the optimizer team, we interface with the Exadata team, we interface with the hardware team. I have a background in compilers and computer architecture, so I didn't have a background in database systems at all. To make it simple: you have two candidates with equal backgrounds, one is a JavaScript programmer and one is a query optimizer person. Who do you hire? I hire the query optimizer person, sorry.