Hey everybody, happy Tuesday again. Here we are. So today what we're going to cover is we're going to finally get down to the details of how we implement indexing on disk drives for database systems, right? We talked sort of in the abstract about indexes last time. And of course, you may remember when we discussed joins, we had this black box called an index in the middle of index nested loops join. All we knew about an index at that time was that you could do a lookup in it and get things back fast. So today we will, you know, delve in and demystify this. A quick review from last time: files, pages, and records. I think we actually did a pretty good job reviewing this last time, but I still want to touch on it one more time because it'll ground our indexing discussion. The abstraction of stored data is files, which are collections of records stored on pages. So there are files with pages, or blocks, of records; records live on pages. The physical record ID, the way that you go find a record, is (page number, slot number). That is the address of a record, and we know physically how to get it: it's at that particular page and then at that particular slot on the page, okay? If you have a file that spans multiple disks, there's probably a disk ID embedded in that page number as well. Variable-length data you should remember, because this is the general case: to store records with variable-length fields, we have that list of offsets at the front of the record, with little pointers into the middle of the record that tell us where to go find field one, field two, field three, and so on. And in a somewhat analogous manner, for records on pages, recall the slotted page organization, where at the end of the page we have the slot directory pointing into the storage area of the page where we have the records. If you're not familiar with those, go review them before you study this lecture on indexes.
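To make the review concrete, here's a toy sketch of a record ID and a slotted page in Python. The class and field names here are mine for illustration, not from any real system, and a real page would be a fixed-size byte array with the slot directory packed at the end of the page rather than a Python list:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class RecordId:
    page_no: int   # which page in the file
    slot_no: int   # which slot in that page's slot directory

@dataclass
class SlottedPage:
    # slot directory: slot number -> (offset, length) of the record
    slots: list = field(default_factory=list)
    data: bytearray = field(default_factory=bytearray)

    def insert(self, record: bytes) -> int:
        """Append a record to the storage area; return its new slot number."""
        offset = len(self.data)
        self.data.extend(record)
        self.slots.append((offset, len(record)))
        return len(self.slots) - 1

    def fetch(self, slot_no: int) -> bytes:
        """Follow the slot directory pointer to the record bytes."""
        offset, length = self.slots[slot_no]
        return bytes(self.data[offset:offset + length])
```

The point of the indirection is that records can be moved or compacted within the page, and only the slot directory entry changes; the (page, slot) record ID handed out to the rest of the system stays stable.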
Now generally speaking, we can be a little bit lazy about a bunch of the space management issues in the database: reorganizing tuples on a page, reorganizing pages in a file. We're going to be a little lazy about this and do it in batches, all right? You don't want to always be obsessive-compulsive about compacting everything, because there's a certain overhead to doing the compaction, particularly for things on disk, like reorganizing the pages of a file. So instead what will happen is we'll let things get a little sloppy, and then periodically, in the background, we'll go through and clean things up. So don't imagine that these organizations are always kept pristine. We talked about three file organizations. First, unordered files, also called heaps. Remember, when I talk about a heap file in this class, I don't mean a heap data structure with find-min; I just mean a collection of records in a file. Second, sorted files that we really keep sorted, and basically we're not going to use those, but we looked at them in our analysis, so I do want to remind you that we talked about them. They're expensive, and indexes are a more flexible solution, so today's lecture on indexes will replace our discussion of sorted files. And finally, last time we talked about kind-of-sorted files, which the database community calls clustered files. They're mostly in the order of some search key, but over time, in the face of insertions, some rows may be out of order. That'll be okay for a while, and then periodically we might want to go through and clean it up, either incrementally or in batch. We won't worry too much about that background cleaning, but just be aware that a clustered file is not guaranteed to be sorted; it's just likely to be sorted. It's more of a performance hint than a semantic guarantee.
And then we talked in the abstract about a file organization called an index that could speed up different kinds of accesses, and about the different classes of little queries we could ask of these indexes that they might be able to answer. Sometimes in database systems, particularly in the earlier research literature, indexes are referred to as access paths to the data. Just a phrase you may want to be aware of if you become a database insider. Okay, so today we're going to talk about tree-structured indexes, which are the versatile, widely used workhorses of most database systems. The selections an index supports are going to be of the form: field, comparison operator, constant. An example where the operator is equality: department equals "computer science". Either tree or hash indexes will tend to work there. We won't study hash indexes in this class, as I mentioned, but there's a nice chapter in your book on linear hashing which I encourage you to read. Range selections use one of less-than, greater-than, between, less-than-or-equal, or greater-than-or-equal: linear ordering queries. Hash indexes do not work for these, because they're based on hashing, and of course the whole point of hashing is that it makes a hash of any order. So hash indexes won't work for range predicates, range queries, but the tree indexes we're going to look at will work for linear range queries. And then, as we said last time, you can have fancier selections, for instance spatial containment or near-neighbor queries, and there are indexes, typically tree-shaped indexes, that work for these. We won't study them this year in this class. I've done it in previous years, but it's a bit esoteric, and I think there are other advanced topics we want to cover this year, probably NoSQL and things like that, which may be more contemporary.
So if you're interested in that topic, I gave you links last time. Tree-structured indexing techniques support range and equality selections, and we're going to look at two techniques today. The first, ISAM, is of historical interest, and the only real reason to teach it is that it's a nice warm-up for what we actually want to learn, which is B+ trees. ISAM is an old IBM mainframe data structure. It stands for the Indexed Sequential Access Method, and you will see why it's called that when we look at the structure, which is both indexed and sequential. We'll talk about it in just a second. A note of caution: ISAM is very old-fashioned. We're talking about 1960s technology. B+ trees are almost always better, maybe not always, and we'll see exactly when they're not better. But ISAM is a good place to start: it's basically the kernel of the idea of the B+ tree, which was invented in the early 70s, right around the same time that the relational model was invented, probably not coincidentally; they came kind of together. ISAM is easier to understand, and it's a way to get us toward B+ trees. We're actually going to go through history and learn the lessons in the same order that the community learned them. It's helpful, but you don't want to put ISAM expertise on your resume; that impresses a very small number of people. I will say that old 1960s and 1970s IBM mainframe technology still powers some of the systems out there in the world, and if you're looking for a niche expertise, you can probably still make good money knowing about things like ISAM and VSAM and the old IMS databases from IBM that predate relational databases. But it is a niche, and it is on its way out, over decades; it's very hard to kill a database system in deployment. They're still out there, but you don't want to brag about that.
You do want to understand them, though, and you want to understand how they trade off with B+ trees, because if you really understand B+ trees you'll be able to answer questions about how they differ from ISAM, and you'll know that you know the material. Let's think about range searches for a minute, because that's the extra power of a tree-shaped index over a hash index. We want to find all students with GPA greater than 3.0. If we had a sorted file, and we talked about this two lectures ago, the way we would do this is binary search in the file to find the first record, and then scan to the right: find the first GPA at 3.0, and as you scan right you get higher and higher GPAs. The cost of binary search in a database, though, is kind of a bummer. Why is it a bummer to do binary search in a database, as opposed to, say, in memory? Yeah, random disk access is very expensive, as we talked about. Binary search, while it's very nice, is binary: it's only carving things into two each time. Maybe we can carve things into a larger constant than two. Wouldn't that be nice? Because the constant factor of going to the disk drive is so high that we want to amortize that cost somehow. So the simple idea: if we have this data file, instead of doing binary search in the sorted file, we'll create an index file on top of it. It'll kind of look like this, and it's going to allow us to do a much more efficient search, because we're going to do binary search in our little index file. The index file contains pointers to the pages of the sorted data file, and it contains keys that tell us the lowest value on the page whose pointer is to the right. So K1 is the lowest value on page 2, which means anything less than K1 is on page 1. KN is the lowest value of anything on page N.
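A quick sketch of that one-level index file, under the convention just described (K1 is the lowest key on page 2, so everything below K1 lives on page 1). This is my own illustration of the idea, not code from the book, and pages here are 0-indexed:

```python
import bisect

# index_keys[i] is the lowest key on data page i+1 (pages are 0-indexed);
# everything below index_keys[0] lives on page 0, which is the implicit
# "negative infinity" entry at the left of the index.
def find_page(index_keys, search_key):
    # the number of index keys <= search_key is exactly the page to visit
    return bisect.bisect_right(index_keys, search_key)
```

With `index_keys = [3.0, 3.5, 3.8]`, a lookup for GPA 3.2 lands on page 1, and anything below 3.0 lands on page 0 without storing a negative-infinity key anywhere. A range scan then just starts at that page and reads to the right.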
We can do binary search in that index file. The index file may still take many disk blocks; each one of these inner rectangles is intended to represent a disk page, the outer rectangle the file. But the index file doesn't store entire records; it only stores the key values and the pointers to the pages. So you may have a record with 10 columns, and this is only storing, say, the GPA field, which is just a float. The index file is going to be quite a bit smaller to search than the data file. So that's idea number one: let's at least do binary search on something smaller. And once you have that idea, you say, hey, wait a second, I could do that recursively. Why don't I build an index file for my index file? That's roughly what an ISAM is: a hierarchy of these index files forming a tree, and then in the leaves we have a sorted file, sort of. That's where it gets a little bit fancy. Not at all fancy, actually, it's kind of stupid, but you'll see where it differs from a plain old sorted file.
So the leaf pages are going to contain the data entries, and the non-leaf pages are going to be a hierarchy of index pages that let us navigate down to the leaf pages. Let's see how this works. And again, 1960s technology. Each one of these internal nodes, these non-leaf pages, is a disk page; keep that in mind, all these little rectangles are disk pages, which contain, say, 64KB of data. An index page, a non-leaf page, is going to look like this: a set of index entries, (key, pointer). So (k1, p1), that pair, is an index entry, and it says that everything down pointer p1 will be k1 or greater, and everything down pointer p2 will be k2 or greater. So the things between key 1 and key 2 are down p1, and so on. Those are the index entries, and that's the structure of a non-leaf page. Here's an example of a dorky little ISAM that fits on screen. The whole idea is that these pages are very big: 64KB worth of these little pairs is a lot, easily hundreds or thousands of index entries on a single page. But just to make this visible on the screen, we've got three index entries and three pointers per page internally, and two values per page at the leaf level, so it's very small. Here's an example where each node can hold two entries, and you can see things that are less than 40 are to the left, things that are 40 or greater are to the right, and then recursively: less than 20 is over here, 20 or greater is here, 33 or greater is here, 40 or greater is here, 51 or greater is here, 63 or greater is here. You can see how you would navigate this tree pretty directly from the root to the leaves to find a value. These are the index entries, (search key value, page ID); I've highlighted an index entry there, and there are many of these on each page, and here's a path of index entries to get to a leaf. But I want to point out one thing, which is that implicitly there's an index entry at the left of each one of these as well, which is (negative infinity, pointer): anything greater than negative infinity and less than 40 is down to the left. We don't have to store negative infinity; it's implicit, right? It's as if we've compressed it away. Sometimes people get a little confused when we talk about an index entry being a (search key value, page ID) pair: we're getting that leftmost pointer for free, so there's one more pointer than there really are search keys on the page. Make sense? And obviously, every page you can think of as having a negative infinity at the left, right? Okay. Now, ISAM is a static structure, which is why it kind of feels like 1960s technology: the whole tree shape is created when you create the file, and it never changes again. Seems kind of crazy, right? So here's how this works. We create an ISAM file in batch, from a batch of data. You have a big table, which hopefully won't change very much in the future, and you take that table and sort it by the search key that you like, and you store that at the head of your file. So the first however many pages of your file are the data pages: they are the leaf level, in sequential order on the disk. In this picture off to the right, if you want to think of it as the blocks of that ISAM file, the first run of blocks is the data pages, the leaf level stored in sorted order, okay, up on the top right of the screen. And then we're going to construct the upper levels of the tree and append them to the end of the file, so the file will contain first the data and then the index levels. At the end of that construction, we've built the data in sorted order and appended the index pages after it. Anything that happens later on that needs to be added to this index will be appended at the very end of the file, on overflow pages, and I'll show you how that works. So that's the physical structure of the file, and you can see why
it's called indexed sequential: it is an index, but the data is actually stored sequentially at the head of the file. If you need to scan the data, you don't pay for the index at all; you just scan the data pages, and they come in order. It's actually a very dense representation up at the front: just the data, in sorted order. If you want to use the index, it's off in the middle of the file somewhere, and you can go find it; you probably have a pointer to the start of the index. All right, so search obviously starts at the root, so you need to know where the index pages begin, because that's where the root node is going to be, and then you use key comparisons to walk down to the leaf. Remember that every pointer in this picture is a disk block address; it's a disk-based data structure. So where you see a pointer here, that's a disk block address in that file on the upper right. All you need to know is where the root index page is, and then everything else you navigate by following these pointers, which are disk block addresses, accessing blocks of the index pages as you go. So search starts at the root and uses key comparisons to get to the leaf. Search costs log base F of N; the base of the logarithm is F, the fan-out of the page, the number of entries per page, really the number of pointers per page. That's much better than log base 2 of N, since F is going to be like 100 or 200, which is a big deal. Actually, it turns out it's not a big deal in big-O notation, but it is a big deal in practice: constant factors matter to us quite a lot here, because IOs are so expensive. So F is the number of entries per page, really the number of pointers, and N is the number of leaf pages. Now, there's no need for the leaf pages to be linked together. Why is that? We don't need to store pointers at the leaf pages. Why don't we?
If we want to scan across the leaves of this index, how do we do it? Okay, one idea: breadth-first search. We could not-scan the leaves of the index and instead wander around visiting the leaves using any search algorithm you like. You could do that, but it'd be very expensive: a lot of random IOs. Yeah? Exactly: we know where the leaf pages are. They're at the beginning of the file, and they're in sorted order, so we just scan them until we hit the first index page. They're physically sequential, so they don't need to have pointers: block one, block two, block three, block four, that's the leaf level. It's sequential. Indexed sequential access method. So unlike what we'll see in B+ trees, there's no need to pay for link pointers at the leaf level; the leaf level is physically contiguous on disk, left to right. Okay, to do an insert in ISAM, you find the leaf where the data entry belongs and you put your item in that leaf, and if that leaf is full, you allocate an overflow page at the end of the file and you stick a pointer to it in the leaf page. I'll show you a picture of that now. So insert works like this: let's insert 23. We walk all the way down: clearly it's less than 40, it's between 20 and 33, it needs to go between 20 and 27 at the leaf level, right? But the leaf is full, so what we'll do is allocate an overflow page at the end of the file and put a pointer to it on that leaf page. Note that leaf page, that sequence of leaf page plus overflow pages, is not sorted anymore. Too bad: we'll have to sort it, by reading in that whole linked list of pages, every time we look at that leaf. You can see why ISAM is kind of 60s technology, right? Nice hierarchical log-based access as long as you never insert anything, and when you start inserting things, you get these chains of overflow pages that are in insertion order, not logical order. That kind of stinks. Now, 48:
well, it's to the right of 40, so we navigate down to the right; it's to the left of 51, so we navigate down to that leaf node, between 40 and 46. It should go on that page, but again it doesn't fit, so that gets an overflow page. 41 should go between 40, 46 and 48, so we navigate to the leaf page, where there's room for it on that overflow page. And then 42 is going to create another overflow page. You can see that this data structure degrades pretty badly upon insertion. What's the worst-case behavior of this thing? In big-O notation, search in this thing in the worst case becomes what, after many insertions? O(N), right. In the worst case you built your tiny little index on your tiny little data, and then you inserted a whole bunch of stuff, creating a giant linked list at the bottom of, say, one leaf page. So almost all your data is in this linked list, and it's not even sorted. You've got yourself a pretty stupid data structure, and at that point you rebuild your ISAM. Now deletion, to go back to the instructions at the bottom here: deletion is just go find the item and delete it. And if you're deleting a tuple that empties an overflow page, well, that's nice: we can get rid of that overflow page, and make the page that used to point to it point to the next one. We can handle the linked list of overflow pages intelligently, so at least in the overflow pages we'll reclaim a little space. So let's do our insertions again, great, and then let's delete 42. When we do that, we can also delete the associated overflow page and give it back to the pool. Delete 51. Where'd it go? We just delete it; it's just gone, because it was in that primary leaf page, and that structure is never going to change. And delete 97. Oop, here's a point: after we delete 51, interestingly, there's still a 51 up above in the index level. We didn't delete it from up above. That's okay. Why is that okay?
It still acts as a pivot, nicely said. 51 in this case is just directing traffic, pivoting our search to the left or to the right. Sure, any value between 47 and 55 would be fine there, but 51 is a perfectly reasonable number to have there, so let's just leave it. Moreover, we're not going to change it in the index pages, because it's ISAM. But it's fine; nothing's going to break. We don't assume 51 is in the data, the real data; you have to go to the leaf level to look for it. All right, so unlike the B tree you might learn about in an algorithms class, in the data structures we're going to look at here, both the ISAM and the B+ tree, all the true data is in the leaves, and the superstructure is just navigation. Keep that in mind: we can have keys up there in the index pages that are not in the data, and that's okay. Yeah, good point: because we may need to chain these overflow pages, every leaf page and overflow page has to have room for one pointer, which is the next pointer in a linked list. So we have to reserve just a little bit of room on each one of these pages to hold a pointer to the next disk block. Absolutely right. Okay, and when we delete 97, we just delete it, once again. So what are some good, yes, question? Before I deleted the 42 there? Yeah, so the question was, and this is in the direction of could we make this silly data structure slightly less silly: suppose we deleted 41 here, wouldn't it be nice to compact that linked list of data? I don't actually know what IBM's implementation of ISAM does in that case; this is mostly just for our educational staging. But certainly it would be nice to compact that list, at least to look one page ahead and say, could we put two pages together? It's a good idea. I don't actually know if they do it or not, but you will not be tested on the details of managing overflow lists in ISAM, so implement it as wisely as you like. Actually, it's a bad data structure, so don't implement it at all.
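That said, if you did want to play with the idea, a toy Python model of ISAM, root-to-leaf navigation plus the overflow chains, might look like this. This is my own sketch of the concepts on the slides, not IBM's implementation, with a two-entry capacity to match the figures:

```python
import bisect

# Toy ISAM internal node: m keys and m+1 child "pointers";
# children[0] plays the role of the implicit negative-infinity pointer.
class Node:
    def __init__(self, keys, children):
        assert len(children) == len(keys) + 1
        self.keys, self.children = keys, children

class Leaf:
    CAPACITY = 2                       # tiny, to match the figures

    def __init__(self, entries):
        self.entries = entries         # static, sorted primary page
        self.overflow = []             # chained entries, in insertion order

    def insert(self, key):
        if len(self.entries) < Leaf.CAPACITY:
            bisect.insort(self.entries, key)
        else:
            self.overflow.append(key)  # static structure: chain, never split

    def contains(self, key):
        return key in self.entries or key in self.overflow

def isam_lookup(node, key):
    # walk from the root down to a leaf by key comparisons
    while isinstance(node, Node):
        node = node.children[bisect.bisect_right(node.keys, key)]
    return node

# the tree from the slides: root key 40, two internal nodes, six leaves
leaves = [Leaf([10, 15]), Leaf([20, 27]), Leaf([33, 37]),
          Leaf([40, 46]), Leaf([51, 55]), Leaf([63, 97])]
root = Node([40], [Node([20, 33], leaves[0:3]),
                   Node([51, 63], leaves[3:6])])
```

Here `isam_lookup(root, 23)` lands on the [20, 27] leaf, and inserting 23 there goes straight to the overflow chain since the primary page is full, which is exactly the degradation discussed above.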
It's just to give you an idea. Okay, so here we go: there are some particularly good things about ISAM, and there's one uniquely good thing about ISAM that we will lose when we do the B+ tree. Can anybody guess what that is? It's in the name. Sequentiality, yeah, the S: the ability to do a sequential scan on the disk with no random IOs and read off the data in order is a nice thing about ISAM, right? And that's what they were going for back in those days, I guess: they wanted to maintain that sequential scan and have some index on top. That's a nice thing, and we're going to lose that sequentiality when we do the B+ tree in a minute. For example, and this analogy becomes increasingly dated, supposing you were building an index and you were going to burn it to, like, a DVD. ISAM would actually be a reasonable thing to do, right? You get it all balanced, you get the data sequentially, you get your index, and it's never going to change. I mean, God forbid you should ever do index searches on a DVD, because it would be incredibly slow, but you get the idea: if you have write-never data, then ISAM is actually really quite nice because of that sequentiality. And the cons are obvious, right? We said it degrades to linear search in the worst case, and in general it's just going to have overflow chains in bad places. It doesn't handle dynamics gracefully; it doesn't handle insertion and deletion. Okay, so let's fix that. The B+ tree, humbly named by its inventor, Rudolf Bayer, who claims it was not named after him, is the workhorse of basically every database system. A B+ tree is going to be a balanced tree of blocks, much like the ISAM, and insertion and deletion are going to be log base F of N in cost. The tree is height-balanced, the fan-out is F, and the number of leaf pages is N in that expression. The clever thing, particularly when you think back to your 61B or 170, whichever you prefer to remember, is that this is a balanced
tree that is not actually width-balanced. It's height-balanced, but individual nodes will have fewer or more pointers on them. Interesting, how would you do that? The way we're going to guarantee our log is we're going to make sure that every node is at least 50% full of pointers. So if the number of pointers can range between, say, d and 2d, we'll make sure there are at least d pointers on every page, and because that's just a constant factor off of 2d, we'll be able to guarantee our log base F cost, while giving us some slop between d and 2d to play with. So each node will be 50% or more occupied, except for the root, which is kind of a degenerate case. Each node contains M entries, with M between d and 2d. If you want to talk B-tree talk, and I don't know who you would talk it to, d is sometimes called the order of the tree. I don't really know anyone who uses that term, but it's in the book and it's on the slide, so there you go. We will talk about the capacity of these pages being between d and 2d, though, and the performance of the structure. It's going to support equality and range searches, as in ISAM, and all the searches will go all the way to the leaves; again, those internal keys are not data, they just direct traffic. In the classic B tree, which Bayer also invented, the internal nodes contain data, so you have data hanging in the internal nodes as well as in the leaves. In the B+ tree, the data is all at the leaves, which is going to allow us to do scans more efficiently. The data is always at the leaves: you walk all the way down to a leaf to find items; you can't trust the keys in the internal nodes, they're just for routing you down to the leaves. So it's a lot like ISAM, but it's a dynamic structure: it's going to grow and shrink gracefully and maintain its log guarantees. So here's how it works. This looks an awful lot like an ISAM, except now we do have pointers across the leaf level. Why do we need those pointers across
the leaf level? To support, what, a range of values. If I say I want all values greater than 22, I guess in this case greater than or equal to 24, I need to be able to find 24 and continue to the right, and preferably I don't want to go bouncing up and down in the tree, because those are IOs I shouldn't have to pay for. So we're going to keep a linked list, a doubly linked list actually, so we can do less-than queries too, across the leaves. Search begins at the root, and key comparisons direct us to a leaf. For example, you walk down left of 13 and you find 5. If you search for 15: is it greater than 13? Yes. Is it greater than 17? No. So you walk down here, and in the leaf level there's no 15, so we know the answer is: no matches. If you're searching for all data entries greater than 24, you can do a little binary search on the root page to say, aha, it's greater than 24, so we go down here. And these comparisons happen in RAM; our buffer pool is where we see these pages. We access the root, the root page gets loaded into the buffer manager, and now we're looking at a page-sized block of RAM, and we can do binary search within the page to find what pointer to use for 24. It points us down here; we look for 24, we find it, and then we return 27, 29, we follow the leaf pointer, and we return 33, 34, 38 and 39. So it's down and to the right for greater-than, and down and to the left for less-than. Okay, just to give you a sense of the numbers in practice. These are old numbers, from the book I think, so maybe multiply by 16 or 20 or some round number like that to get typical numbers today. Let's say there's 100 to 200 items on a typical page. Typical fill factor: the same argument applies that we talked about last time, usually about two-thirds full after a random sequence of insertions and deletions. So the average fan-out, let's say, is 133 in one of these B trees circa 2000. So for typical capacity, just do your arithmetic:
a height-2 B tree holds 133 cubed entries, because we're going to get the root, the level below it, and the level below that. A height-2 tree is going to be over 2 million entries. That's a big number. Let's just make sure we understand this. This is a height-0 B tree, by the terminology of the book: 133 children. This is a height-1 B tree: 133 children, each with 133 children. This is a height-2 B tree, and down here are the pointers into the file, potentially. This thing contains over 2 million items. That's a lot, and you only do 1, 2, 3 IOs to get to the leaf level. That's pretty good. That's why, last time, when I talked about index nested loops join, I said imagine that a lookup in that index is, you know, 2 to 3 IOs. That's pretty realistic, actually: millions of tuples, and you can get to one in 2 to 3 IOs. Also, let's think about our buffer manager replacement policy from last time, and I think I might be jumping ahead. Suppose we're using LRU caching, so we're going to replace the pages that were least recently used, and we're doing a lot of B tree lookups. What's the likelihood that the root is in memory? It's really high, because every single lookup in the B tree accesses the root, right? So every time we touch the B tree, that page is going to be accessed. What about this guy? What are its odds of being accessed on a given lookup? 1 over 133, if the lookups are random, so maybe not so often. And down here it's 1 over 133 squared; it's getting very unlikely. But you can assume that the root, and actually often the whole second level of the B tree, are just sitting in the LRU cache because they're hot, which means that to do a lookup, those accesses were basically free: maybe one IO in the index. And if it's not an Alternative 1 index, if the tuples aren't actually here but you have pointers to them in some other
file, then it's another IO to go get them. So B trees: it's kind of like one IO to do a lookup in a B tree, which is pretty cheap, which is great. It's still a random IO, though, and remember, a random IO is 10 to 20 times more expensive than a sequential IO, so you don't want to do lots and lots of lookups in a B tree; at some point you'd rather scan. Keep that in mind too. And you can see the numbers get really big really fast, thanks to exponentiation. I have religious relatives who have like 6 to 8 children at a time, and when you go to weddings you get like 3 generations of those families walking down the aisle and it just keeps going. Exponentiation is amazing. Okay, inserting a data entry into a B+ tree. How do you do insertion? This is where it gets fussy, and you want to start paying close attention. First you find the place to insert it: you traverse the tree to find the correct leaf L, and you put the item on L, and if there's room for it, you're done. That's exactly the same as ISAM. Where it differs from ISAM is what happens if L is full. You don't create an overflow page. Well, you kind of do create an overflow page, temporarily. What we're going to do is split that leaf node: allocate a new page, take half the stuff that used to be on the old page and put it on the new page, and put the new item in the right place in between. Rather than read the pseudocode, which is annoying to read and really easy to see in a picture, I'm going to show you a picture; we'll bounce back and forth. Let's insert 8 into this B tree. You can see where it's going to go: the far-left leaf. We do binary search on the root, which tells us to go to the far-left leaf; we get to that node, and it's full. Let's pretend we're in ISAM for a minute: first we'd traverse to the leaf, then we'd allocate an overflow page. But what we're going to do, instead of just putting 8
there in a linked list, we're going to take the sorted list of values that's on that page and split them up evenly between the old page and the new page — splitting it in half. So 5 and 7 go there, and now we figure out where 8 goes. Now these two pages are both in sorted order. And then, rather than making this an overflow page, we want to shove it back into the tree — we want the tree to remain height-balanced, which means every path from root to leaf should be the same length. But we're going to have ourselves a problem: in order to insert a pointer to this new page in the root, we're going to need space in the root, and there is no space in the root. So we're going to split recursively — watch what happens. First of all, we need to know how to point to this new page. What would be a good key value for it? One idea would be 4, which actually would work, but is kind of subversive. Much easier to take a value we already have, which would be 5. So 5 would be a nice key value for this page — 4 would have worked, but because it's greater-than-or-equal-to logic here, we can say 5 is the lowest value on that page, which was the middle of all the values we had between the two pages; that's going to be the key value for this new page. So we need a pointer and a 5, but there's no room for that new index entry of (5, pointer). I want to put a 5 and a pointer there, but I've got no room. So we're going to recursively split the root. And when we split the root, we're going to play the same trick: allocate another page, shuffle the values over to make room for our 5, and move the pointers with them. So let's do that. The thing to the left of 24 pointed to the middle page, the thing between 24 and 30 pointed to the second from the right, the thing to the right of 30 pointed to the rightmost — those are all just going to move over. So we really just moved all
that stuff over — I just couldn't animate moving pointers, it was too hard. And then we want to stick the 5 up above, but now we've got a tree with no root, which is not going to be good. This is where the tree actually gets taller: when you split the root. So we're going to allocate a new root. Now we need to split up all these keys down here between these two children — where does everything go? Well, the root is going to split the tree into halves, so we want the middle-most key among 5, 13, 17, 24, and 30 to be pushed up into the root as our split key. In this case that's 17, if I'm not mistaken — the middle value, which is going to be the differentiator between these two pages. And now we have ourselves a nice balanced B-tree. Okay, so that illustrated the worst-case scenario in insertion: you go down to the leaf, you recurse all the way back up to the root, and the root has to split. That's when a B-tree gets taller. Usually it just gets wider; occasionally you split the root and it gets taller by one level. And because it splits upwards rather than downwards, you always know every path is the same length — make sense? When the root splits, every path goes from length k to k+1; all paths get longer together by splitting the root. That's one of those "oh, I never thought of that" ideas — all the trees I used to split downwards; it never occurred to me to split them upwards. But that's how you get the balancing argument in the B-tree. No rotations, no crazy stuff like that — just split upward. Now, there's an important distinction that got buried in here, which is the difference between the leaf-level nodes and the internal nodes, and then we'll talk it through in the text of the slide. So watch: when we split this leaf page, it goes there, and now we need to put a key up above. What key are we going to pick? We're going to pick 5, and we're copying that 5 — the 5 stays in the leaf node, because 5 is real
data, and all the real data is in the leaves. You never, never, never take anything out of the leaves unless the user asks you to delete something — so the 5 stays in the leaves. When you split a leaf node, you copy its key upwards into the next level. We did this, and the 5 ended up copied one level up. But watch what happens with the 17, which is an internal key: the 17 actually gets pushed up, not copied up, because it's just directing traffic, and the place it needs to direct traffic is there — things less than 17 are down that branch, but the 17 itself shouldn't stay at the level it was at before. So at the internal nodes, where the keys are just — "pivots" was the word used, which is a good word — the keys are pushed up; but at the leaf level the key is copied up, because it's data, and all the data has to stay at the leaf level. Remember that distinction when you're learning the algorithms for this. Now, if we go back to the previous page with the actual pseudocode, starting from the top: if there's space, put the data entry on L, and you're done. Else, you must split L into L and a new node L2: redistribute the entries evenly and in order, copying up the middle key to the next level, and insert an index entry pointing to L2 into the parent of L with that middle key. The pseudocode is a little sloppy, to be honest. And this can happen recursively — but when you split internal index nodes, you redistribute the entries evenly and push up the middle key. The proper pseudocode is in the textbook; I encourage you to read it. Splits grow the tree, and a root split increases the height. So splits kind of go like this, and eventually like that — the tree either gets wider or it gets one level taller. Okay, so we went through all this; that's the basic idea of B-tree insertion. All right, so that's what happens after we insert the 8: we got a root split, leading to an increase in height. This is the same slide as before, just without the
animation, I believe. Now, there's a weird thing, which is that if we really wanted to, there's another way we could have accommodated 8 — not unlike the earlier suggestion about compacting the ISAM pages. If we wanted to make room for 8, we could have shoved it into the second node from the left and changed the keys above it, or something like that. We could have done that, but it's kind of ugly, and in general you might have to slosh data from way far to the right to make space on the left. So typically, in practice, we don't redistribute entries on insertion. Okay, this is just to hammer home once again the difference between copying up and pushing up, and I'll just do it one more time. Observe how the minimum occupancy is guaranteed regardless: we always have nodes at least 50% full, at the leaf level and the index levels, because we take full pages and split them in half, so the halves are always at least 50% full. There's something very elegant about the constraints always being guaranteed here: there's no rotation, there's no restructuring of the tree. Just by dealing with things when they're full, and splitting them, we keep our 50% occupancy, things grow upwards, and all paths remain the same length. But notice: when we split a data page, it splits into two, and the entry that goes above is a copy of that value 5 — that's the entry to be inserted in the parent node, the index entry of (key, pointer). With an index page split, we also split into two, but we take that middle-most value, 17, and we actually don't keep it at the level it used to be at — it's been pushed up into the parent node. So the insertion of the index entry (17, pointer) takes the 17 away from the node below. All right: deletion. Deletion starts out just like ISAM. Again, you go to the leaf level, you find the item, you remove it. Now, unlike ISAM, we have to maintain our constraint of at least 50% full for it to
be a proper B-tree. So if the leaf is still at least half full, we're done. But if the leaf has fewer than d entries — which in this case will be exactly d−1 entries — here's what happens: at that point we will actually try to do what I said we don't do at insertion time. We can try to borrow some entries from our neighbors at the leaf level. You can borrow from a sibling — not a cousin, because with a cousin you'd have to traverse up to the grandparent to find a common ancestor — but we can borrow from a sibling if we share the same parent node, and then we adjust the keys in the parent. And if that doesn't do it, and we're still less than 50% full, then what we'll have to do is merge with our sibling to form a new node — it's the opposite of a split, essentially. And if a merge occurs, then the entry that used to point to one of the two merged nodes has to be deleted from the parent, which might make the parent under-full, and so this might recurse up the tree. In fact, if you merge the two nodes below the root, the root becomes redundant and the tree shrinks. I'm being a little vague because I don't want you to learn this. You could learn it, that'd be fine, but we don't implement this in most typical database systems. They actually cheat and allow pages to go less than half full; if a page gets really, really empty, sometimes you can reclaim the page and just leave the tree a little unbalanced for a while. Why would people do this? It seems horrible — it's just as bad as ISAM, only in reverse. Worst-case scenario, we've got a giant tree and no data, and lookups cost O(log of the size of the data we used to have), to look up nothing. Seems weird — why is this okay? In general, people don't have delete-heavy workloads in their database, and even if they do, they're usually followed by insert-heavy workloads. If you've grown a table to a million tuples or more, it's not going to
become a 10-tuple table for a long period of time. So rather than write this extra code and debug it, and rather than do all the IOs to maintain it, let's just keep the capacity in the tree there for future inserts — this table clearly has periods where it's big, so let's just keep the index big. So it's actually not worth implementing this stuff. As an academic exercise you might want to know how it works, but I will not quiz you on it in this class; it's not really that important. There are four hidden slides in the slide deck that you can go look at if you want, or you can read the book for fun. I googled for "B+ tree visualization" last night, and there's a pretty nice one on the web at USF — the University of San Francisco, right across the bay. It looks like this. Here's a little B-tree: it's got 20, 30, 40, and 50. It's got a degree of 5, so a node can hold 5 things, and we can insert something into it. Let's insert — what did we insert? — 10. Great choice. Hey, look at that. Let's insert a few more things: 200, 11, 100. Look at that. You can play with this; it's quite handy, actually. There's only one minor distinction that I found last night, goofing around with this versus our book and slides, which is deletion: when you delete a key from a leaf, it'll update the split key above it. Remember in ISAM we deleted something and there was a 15 left over above it, and I said that was fine? These guys will actually run around and fix things above to match. Let's delete 50 — watch the key value in the root. If we delete 50, it updates the root to have 100. There's no need to do that; it just adds extra write traffic to the database. At some point we have to flush that page with the 100 on it, because now it's dirty, and it really hasn't helped us any. So this is a little obsessive-compulsive about deletion. I think that's the only difference, though, between this animation and the algorithm in the book. So it's
actually pretty good — really helpful — and in fact I think we might make you play with it on your vitamin, just to get you to visualize these splits and insertions. All right. Typically, if you implement a B-tree, you want to get that fan-out as big as you can. What's the goal of making the fan-out bigger? Yeah — it makes the tree shorter, sometimes, right. It postpones the point at which the tree grows, would be another way to say it: you can absorb more insertions before you grow. And if you have a data set that rides the boundary between, say, 2 and 3 levels, and more fan-out lets you keep it at 2 levels, that can be a big win: "cache hit, cache hit, 1 IO" might become "cache hit, cache hit, no IOs." It can make a huge difference in your performance. So we try to make these things have as big a fan-out as possible. What can we do to increase the fan-out of our B-tree? Well, we could have bigger pages. But there's always a trade-off: the page size is going to be the same for lots of data structures in the database, and if it's really big you'll often fetch pages and only look at a subset of them; if you pick a page size that's really small, you get lots of IOs. So it's kind of a toss-up — page size usually gets tuned and then fixed. Let's say it's 64K and we don't get to adjust it. So we can't change the page size; I've just ruled that out. But there is more we can do to increase the fan-out. We can't change the physical storage, which means what we've got to do is shrink the stuff we're putting in it: we're going to compress the stuff inside the page, and use compression to jam more B-tree entries onto a page. So I'm going to teach you two little tricks for compression. The goal is to get more entries per page, and therefore shorter trees. The first is
what's called prefix compression — these are annoyingly named; prefix and suffix compression could easily have been named the other way around, and on exams we'll try to make sure we indicate which one we mean. Remember that these key values just direct traffic, so we really don't need all the details. Here are some key values from the book. They're maybe a little overly verbose — this would be just as good, don't you think? They're distinguished from each other, they're in alphabetical order, so why not just compress? Is it right? Did I mess anything up? Let me ask you a different question: is it the same? Is this thing the same as that — would it direct traffic exactly the same way? Let's take an example: David Jones. Where do I go to find David Jones in this tree? The — one, two — third pointer, between Davey Jones and David Smith. All right, and on this tree, where do I go to find David Jones? The — one, two, three — fourth pointer. So it's not the same. But is it bad? Not really — not as long as all the data is in the right place, as long as all the pointers point to where the data actually is. I don't mind if it's balanced slightly differently; we've still got our 50% occupancy. And sure, now some things are a little further to the right than they used to be — that's fine, it's all balanced anyway. So it's not the same exact thing, but it is just as good, and it takes less space. Here's the thing: we just need to make sure the data is in the right place, and it turns out that comes for free, because of when the compression happens — you do it when you split the leaf. When I split a leaf node and it's got lots of long strings in it, the key that I'm going to copy up is going to be the shortest thing I can copy up that still differentiates. Think about it this way — I don't think I have a picture of this in the slides. Let's say I have a node that I'm splitting, and halfway through this node there's a value,
George Gershwin, and the next value is George Washington. When I split this page, if I split it here, I'm not going to copy up "George Washington" — I'm going to copy up "George W." That's enough to distinguish, and from now on this is compressed up the tree; this is as long as it ever needs to be again — we'll never see "George Washington" higher up. So the compression happens upon the copying up, and the pointers all point the right way: the invariant that the pointers are correct is established the minute you do the insertion, so we're all good. You don't compress the existing tree; you compress as you copy up, and everything works out. The other thing you can do, which is just straight-up compression, is called suffix compression: if you have a whole bunch of things on the same page that share a common prefix, well, shove the prefix in the corner and then just store the suffixes. So instead of writing down McDonald, McDougall, McCleary, you write down "Mc" once, and then just Donald, Dougall, Cleary. The reason this isn't just a dorky hack is that we often have composite keys. Remember, we might have an index defined on, say, (last name, first name). Maybe this doesn't even merit a picture, but those are two different columns, and with composite keys in our index there's going to be (Jones, Davie) — that's one key — and (Jones, David) — that's another key — and so on, because there are going to be lots of Joneses. You pull that out as a separate prefix and then just store the suffixes. This happens all the time with composite keys, because the first field you index on has lots of duplicates — with (last name, first name), there are lots of duplicates on last name. So that's a pretty typical optimization you want to do. Make sense? All right, one more topic on B-trees — actually, maybe before I do that, let's take a
little stretch, and I have some announcements. This is sort of a different thing — I even took notes. All right, before you have too much fun — I don't want you to start a long conversation — homework 2. I would like to make a brief announcement about homework 2. Homework 2 is due Thursday night at 11:59 p.m. We've been known to have certain tolerances for lateness, on the order of a minute here and there, but the official deadline is 11:59 p.m. — to avoid confusion over what really is 12 o'clock and what day it's on. So Thursday night at 11:59 the homework is due; after that you start accruing slip days. In homework 1, a number of people struggled to get the correct incantations with Git to submit the homework properly. We were lenient in homework 1 — in fact, Michelle is in the process of painstakingly, and by hand, being extra lenient. We're not going to do that in homework 2. So if in homework 1 you got a 0 because you mis-submitted with Git, please learn your lesson, because in homework 2 you will simply get a 0, and no amount of complaining will save you. You have been warned; learn from any mistakes you made in homework 1. Autograder: by tomorrow morning — Derek promises, with a smile on his face and a late night ahead of him — there will be an autograder installed, in the cloud if you will, although it'll probably be in Evans. Evans, Soda — I'm really old; I went to grad school in Evans. When you do a git push, you'll be able to see if your tests ran properly. Tomorrow, when you do your git push, the autograder should be installed. It's going to run the same tests you run now — no additional tests — but it'll run the tests you have in the infrastructure and get back to you with email saying that your tests ran, and here are the outputs. You'll be able to tell if your Git submission is working. But we will have no patience for people who don't get that process correct when you submit, so it's really important to understand how you submit with Git. Do you have
a question? Yeah — the blue shirt, then the one behind you, and then you. Yes, there will be additional tests we're holding back, because in the real world, when you ship software, you don't get all the tests up front — it's your test suite. Your group repository is what's going to get graded; submitting elsewhere won't fire up the autograder. Right — you must submit to your group repository to get the autograder going. If you submit to your personal repository — sorry, your individual class repository — the autograder will not go. All right: partners for homework 2. If you do not have a partner, it's awfully late, and you're required to have a partner. So if you do not have a partner, you'd better be in touch with the TAs, like, prontissimo. There will perhaps be some way to do some speed dating, but boy, it's really late. It is required to have a partner to get a grade on this homework, and if you feel that you are in an exceptional situation for some reason, please contact me or email me immediately — and it had better be a pretty good reason, because it's not that hard to get a partner. And just for a minute: having a partner is good. Pair programming is good — you get fewer bugs, and it's good for your humility, because people actually point out your mistakes. All that stuff is good. I think as engineers we all need a little training and socialization, and being forced to work with another person, who maybe you don't even know, is good experience too. I know you do this in your other classes anyway; it's a good thing to do. We don't require it for all the homeworks, but I do require it for this one. "But you're an awesome Berkeley EECS major" — okay, I get it; I still want you to partner up. Piazza: in the interest of helping the staff prioritize answering questions on Piazza, if somebody asks a question, please don't respond with "yeah, I don't understand that either." Either respond with an answer, or don't respond. Because what happens is we tend to jump first to the questions that have no responses — oh, man,
nobody's ever answered that person — so we'll jump in and start responding. That filtering happens almost automatically: was there a response or not? Then we'll start reading through all the posts and seeing what's going on. But as a matter of prioritization, please don't fill in the responses unless it's actually an answer to the question. It just helps us prioritize responding on Piazza. So with that, I want to take a pause before we do bulk loading and return to lecture. Any administrative questions before we go on? There was one administrative question — I will answer it, but give me a minute. Did anybody have any organizational questions? Then we'll get back to B-trees. Yes — the eternal slip day question. The question was: suppose that you have no slip days left and your partner has many slip days left; how does that work? It's a good question. When you have a mixture of solo and group projects, it works the way the rules say: hypothetically, if you're late, you'll start losing points and your partner won't, which means that you and your partner should have a long talk. The incentive structure around slip days and points is individual, but it may incentivize you to work differently with your partner than otherwise. Other administrative questions? Okay — and feel free to post to Piazza if one occurs to you. So let me answer the question about B-tree insertion, which has a pretty short answer, and then I want to talk a little bit about join algorithms. The question was — well, let me give the setup and the principle behind the answer, then I'll ask the question and answer it. Remember, in a B-tree, an index entry is something like a 5 and a pointer. What that's saying — and let's say the next entry is a 10 — is that the things on this page are greater than or equal to 5 and less than 10, and the things on that page are greater than or equal to 10, etc. I probably shouldn't write these inside the page,
because this is a description of what's in the page; it's not actually the contents of the page. There's stuff in here, and that stuff is greater than or equal to 5 and less than 10; this stuff is greater than or equal to 10 and less than something. So this 5 is a greater-than-or-equal-to pointer. And the question was: on a split, how do we make sure to find the item that we copied up? It will always be just to the right — it'll be the first entry on the page under the split key. Let's go through the example in detail. We'll take a leaf page — here's our leaf page; it's got 3, 4, 5, 10 — and we want to insert, say, 11. So we want to put 11 here, and we'll split: we move 5 and 10 to the right, along with 11. And by the way, when we break ties — sometimes we'll have an odd number of things to split up — the extra thing goes to the right, as a convention, just so we all get the same answer when we do these examples. So if there are 5 things, we'll do 3 to the right and 2 to the left: it's 5 divided by 2, and we round up on the right. So the 5 is here, and the thing that's copied up is 5 as well — the new key that gets inserted in the root is 5, plus a pointer to this new page. Because of the greater-than-or-equal-to semantics, the item we copied up is always going to be right over there. Okay. So now I want to spend a moment revisiting index nested loops join — one of you came to office hours today and pointed out to me that the pseudocode in this slide is kind of ridiculously dorky and unclear, and now we know how indexes work anyway. Let me review quickly. Wow, blast from the past — I guess I used the board more than other people did. All right: index nested loops join. We are scanning one table. Let's do a concrete example. Here's our query: SELECT S.lastname, C.name FROM — and I know we haven't learned join syntax yet; you'll figure it out as I go — Students S, Courses C, WHERE the student ID in the Students
table is the same as the student ID — oh, you know what, this is not Courses, excuse me, this is the enrollment records table, Enrolled E, and the predicate is S.sid equals E.sid. I want to find all pairs of students and enrollment records where the enrollment record is about that student, and for each one I want the last name of the student and the name of the course they're enrolled in. Okay. I'm going to do this with an index nested loops join, and I'm going to have an index on the Enrolled table, on that sid field. It's going to be an Alternative 2 index from last time, which means: here's the Enrolled heap file — a plain old heap file, that's where the Enrolled tuples are — and here's the Students heap file, which is smaller, because each student has many enrollment records. And then I'm going to build a B-tree — Alternative 2 — on Enrolled.sid, so at the leaf level what I'll have are (sid, record ID) pairs that point into the Enrolled heap file. You'll be able to look up sids in here, and when you find the sid you like, there'll be a pointer to the tuple in the Enrolled table where you can go get it. That's an Alternative 2 index, and this thing is a B-tree, so it's going to be, let's say, 3 levels deep: it's got a root, it's got a level here, and a level here. So, index nested loops join — let me animate it, and then we'll write it down. Scan the Students table, and for every record of the Students table, do a lookup of the sid in that record in this index, to find matching Enrolled records. You get a student, you walk down the tree — one, two, three, four requests to the buffer pool, maybe only two of them are IOs — and then you follow a pointer to get a record ID out of here. Each student may be enrolled in many courses, so when we grab one tuple out of
here, there may be three or four matches over there, which we're going to fetch, with get-next, one at a time. So the algorithm for index nested loops join actually does have a nested loop in it, despite the index — it's a little nested loop over the matches. The algorithm should look like this: for S in Students: matches = BTreeLookup(Enrolled, sid = S.sid); for E in matches — of which there may be more than one — return (S, E). This lookup call here is not in the pseudocode in the slides, which seems like a pity. And what is this call, BTreeLookup? It's the walk down the tree. In this pseudocode we're finding all the matches and putting them in a set; in practice we implement this with iterators, and maybe that's an exercise we'll leave for a section or another day. But you should think about it: if you had to do this with iterators, how would you do it? It's the same question as for plain old nested loops join. Basically we're going to say: here's a join iterator, here's an index lookup iterator, here's a scan iterator scanning Students. The join is going to call next to get a student, pass that student's ID down here to get a next lookup, keep getting next lookups until it runs out of matches, and then get another student and do the same thing again. I'll let you think about how you do this with iterators, but hopefully the algorithm for index nested loops join is now clear. So, in the time remaining, let's finish up our B+ trees. The last detail is actually an important one. Remember how I said we built an ISAM by taking a full table, sorting the data, and creating the ISAM above it? This is actually a fairly standard thing: you say, you know what, I have this big table and I'd like to add an index — maybe an Alternative 2 index on one of the columns. So I already have a lot of data, and I want to
quickly build an index. It's very inefficient to repeatedly run the insertion algorithm for B-trees on each row of that million-tuple table. Why is that? Why would it be a bad idea to just run insert over and over and over? Well, it's slow, and it also has poor leaf space utilization. But why is it slow to do multiple inserts? Think about the IO pattern, particularly if the data is not sorted yet. You're walking down a B-tree — which in steady state is, say, half full — and you're doing random lookups in it to do the insertions. Those random lookups will hit cache at the root, but as you get down to the bottom you're doing random IOs, and inserting a single tuple on each one of those pages. So you're paying a random IO per tuple inserted. That's expensive: you want to get a whole page full of goodness out of that random IO, and you're getting one tuple of goodness. What we want instead is a scheme where, when we do an IO, we load that page full of data — if we're going to touch a leaf page of this index, let's put all the tuples that go on that page onto it at once. This is called bulk loading. Repeated insertion is going to be really slow — on the order of the number of tuples in IOs — while bulk loading done properly is on the order of the number of pages in the file, so a factor of 100 or 200 faster. Also — and this is something you may want to play with in that web animation — if you do insertions, it's very hard to get the leaf level to stay full; it tends to hover around half full, or really about two-thirds full, stochastically. If you really want to pack a B-tree nice and tight — because, say, the data set isn't going to change much — then you're going to have to control it some other way than insertion, because insertions tend to split the pages right when you didn't want them to split.
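To put rough numbers on that slow-versus-fast claim, here's a back-of-the-envelope comparison. The tuple count and leaf capacity below are illustrative assumptions, not figures from the lecture:

```python
# Back-of-the-envelope IO comparison: one-at-a-time insertion vs. bulk
# loading. Once the tree outgrows the cache, each insert costs roughly
# one random leaf IO, while bulk loading touches each leaf page about
# once. Both numbers below are assumptions chosen for illustration.
n_tuples = 1_000_000
entries_per_leaf = 100   # assumed leaf-page capacity

insert_ios = n_tuples                     # ~1 random IO per inserted tuple
bulk_ios = n_tuples // entries_per_leaf   # ~1 IO per completely filled leaf

print(insert_ios, bulk_ios, insert_ios // bulk_ios)  # -> 1000000 10000 100
```

With these assumed numbers, the ratio comes out to the leaf-page capacity, which is where the "factor of 100 or 200 faster" intuition comes from — and that's before accounting for the fact that the insertion IOs are random while the bulk-load IOs are largely sequential.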
All right — and you can play with this; you'll see it if you try doing insertions into that tool. Actually, I challenge you to do this: go to the web animation and make up values to insert so that you can keep your B-tree nice and compact. You'll find you have to start making up values in between things — and of course that's not how the real world works; you don't get to make up your data as you go, you have to deal with the data you're given. So bulk loading is going to be better. Here's the deal: take your file and run sort on it — the external sort algorithm that we all know and love. That's going to read the file, write out batches that are sorted, and read the batches back in. When you're reading the batches back in — that's the output of the merge phase — we're going to bring the pages into memory, and they're going to come in in order. Imagine we're reading the sorted pages back in, streaming in from the sort algorithm — but you can picture them sitting on disk in sorted order. We're going to take them a page at a time and start building the superstructure of a B-tree above them. We'll take the first page of sorted things and put a root above it, and that's how we get started — we haven't read any of the stuff in this box yet, because it's still streaming in from the sort. Then, as more pages come in, we'll be in a steady state: we'll have a partially built tree over the prefix of the sorted order, like on the upper right here. At some point we've read all the way up through 36 and built this structure above it, and then, by induction, we take the next page-load of stuff and start inserting it at the end, onto new pages. So 38 and 41 will go together, we'll insert those two into the parent right there, and then we'll just recurse up the right edge of this tree. And the leaf
level is always exactly as full as we want it to be. We can make it not totally full if we don't want it totally full, or we can make it totally full if we do, so we can control the compactness of the leaf level. And we're always inserting things to the right, and the tree only grows leaf pages when we can fill them up completely, so each leaf page gets visited exactly once in this scheme. What about the parent pages, though? Is it a problem that we're doing these insertions toward the right? Let's think about LRU. What pages do we tend to touch when we do an insertion to the right? Well, we'll touch the root and all the children on the far right, which means that the next time we do that, it's still going to be the root and the things on the far right, and we'll have a lot of locality on the right fringe of this tree. LRU will realize what's hot, and the left part of the tree gets colder and colder and gets paged out, which is great. So LRU is going to do the right thing here: as we insert these things to the right, the right fringe of the tree is already in cache, and life is good. So this has the behavior we want, which is that we visit each page on disk pretty much just once to fill it up, and then it gets written out when it's evicted from the buffer pool. This one's pretty good, and it happens to be in the book. Alright, so this exercise I kind of walked you through, but you may want to convince yourself that it's true: multiple inserts are slow, bulk loading is good. And here's a detail I didn't point out about how the leaves are stored: you can actually get the effect of an ISAM, because when you start streaming the data into the leaf level, you can store it on disk in a new file. You can make that leaf level be pages 0, 1, 2, 3, 4, 5, 6 of a file, and then you can write the index pages later, somewhere else in that file. So you actually get a file where, coincidentally, after doing this, the B-tree
organization looks a lot like an ISAM: the front of the file is a bunch of pages that are sequentially in order, which is great. Keep in mind that it's not a guarantee, so we still need our pointers, our linked list at the leaves. On a range query, say you're finding all things greater than 6, as you follow these pointers you'll say, hey, guess what, this is page 2 and I have a pointer to page 3, which is right under the disk head. Oh look, page 4, that's right under the disk head too. So you are following pointers, and if things are messed up and not sequential, that's fine, but if they are sequential, you'll get nice locality on the disk. So you can actually create a leaf level that is sequential on disk, and the only cost we pay then is these pointers, but the sequentiality is under our control when we do the bulk loading. So that's kind of cool. And then, as I said, you can control the fill factor on those leaf pages. If you want to leave room for future inserts and make them two-thirds full, great; if you want to make them one-third full, you can do that too. It's entirely up to you how you fill out that leaf level; if you want to make it really packed, you can make the pages 100% full, but it's under your control. Alright, a little detail: we said that in a tree of order D, every node would have between D and 2D entries, meaning pointers. But in practice, when we use key compression or variable-sized keys like strings, the keys on those entries will be variable size, which means you have more or fewer of them on a page depending on the data. You might have a long key here, and the index pages may hold lots more entries than the leaf pages because they're compressed. And by the way, remember alternative three from last time? That's where in the leaf page you have a key, and, suppose you're indexing popular words on the internet, you have the key Perry, and these days that has lots and lots of matches; we'll store them all in a set
of RIDs, right? This set's going to be super long, so this key is in essence super wide. So even at the leaf level, in alternative three, you get variable length: popular keys are wider than less popular keys. So we actually can't guarantee this order-D thing exactly; some pages will just have lots of items and some pages will have few items. So instead, as an approximation, we'll just talk about the page being physically half full of bytes. It may be half full of bytes because you have a few big things, or half full of bytes because you have lots of small things. Yes, that matters a little, but let's assume it's in the noise, and as I said, many real systems don't even promise to be half full anyhow; they'll let you get less than half full. But the nice thing about the order stuff is that it gives you the big-O guarantee you wanted, and it gives you the sense that you're doing something robust. Lots of times in practical computer science, after you prove something, you then cheat a little bit and it all works out in the wash experimentally. To summarize, and hopefully these summaries will be useful, so please don't get up and leave and disrupt your peers: tree-structured indexes are very nice for one-dimensional range search. They're good for equality searches as well, and because of that combination they're very widely used. ISAM is strictly of historical slash pedagogical interest. It's a static structure; the only things that change are the leaf-level pages, which have insertions, deletions, and overflow pages. Remember the organization of the ISAM file, where the data is stored sequentially at the beginning of the file, and I drew it vertically on the slide: here's a whole bunch of data rows, and then when you're done with the data, which is ordered, you start storing the internal nodes, or the index entries as they're called. And this is all fixed, it never changes. Well, the data pages can have insertions and
deletions, but those don't actually change the number of pages. The overflow pages are here, and this is where the action is in the file, right? Going from left to right, the overflow chains obviously can cause performance to degrade to as bad as linear. The B+ tree is a dynamic adaptation where inserts and deletes leave the tree height-balanced: with n items, lookups cost log base f of n, where f is the fan-out of a node of the tree. A high fan-out means the depth stays quite small in practice, which is great. This is almost always better than maintaining a sorted file, for any number of reasons, including that binary search is expensive and that maintaining a sorted file is more expensive than maintaining a B+ tree. On average, things work out to be about two-thirds full in a B+ tree, so you want to account for that when you think about what f really is: in practice it's about two-thirds of the possible occupancy. Almost always you want a B+ tree instead of an ISAM; it's a dynamic structure, so almost nobody uses ISAMs now. Here's a detail buried in the last bullet of the summary: we didn't talk much about alternatives one, two, and three today, so actually, let's take a minute and do that, since we do have a minute. Oh, we don't have a minute; we'll take a negative minute, and next time I will come back and elucidate the last bullet of the slide. Oh, and there's more, but I think we're good. See ya, good luck with your homework, get cracking!
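To put a number on that log base f of n claim, here's a quick back-of-the-envelope calculation. The page capacity and data size are illustrative values I'm assuming, not figures from the lecture; the two-thirds effective occupancy is the steady-state fullness mentioned above.

```python
import math

n = 1_000_000_000       # a billion entries (illustrative)
f_max = 300             # max entries per index page (assumed)
f_eff = f_max * 2 // 3  # pages run about two-thirds full in steady state

# Height of the tree: ceil(log base f_eff of n)
height = math.ceil(math.log(n, f_eff))
print(height)  # 4: only a few levels, and the top ones are usually cached
```

The point of doing the math with f_eff instead of f_max is exactly the summary's caveat: if you size the tree assuming perfectly packed pages, you underestimate its depth.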