All righty, welcome back to operating systems. Probably another fairly short one today. We get to talk about memory allocation briefly, which should be covered more in 454, but the jury's out on that one, at least for this semester. So we've all done memory allocation before. Static allocation, or global variables, is the simplest strategy. We've all done this, especially with your lab: you create a fixed-size allocation in your program as a global variable. You can put static in front of it so you can't use it in other C files, but other than that, it doesn't really matter. And how this works is, whenever your program loads, the kernel reads that ELF file and sets aside some memory for those global variables. That memory is valid for as long as your process exists; as soon as the process doesn't exist anymore, that memory is gone. So there's no need for you to free it, because whenever your process terminates, it's gone. The kernel does it for you. So we like this strategy, it kind of works. The problems with this strategy are that you have to declare your memory up front and it has to be a fixed size. So you have to know the maximum size, and it also doesn't cover the cases where you only conditionally require memory, because the global variable is there whether you use it or not, which might actually be wasteful depending on the kernel. Sometimes the kernel can be smart and just not give it a page, but it can be wasteful. So if you only conditionally require memory but have to statically allocate the complete maximum size, well, the rest of it is essentially just wasted. Also, for program reasons, you might not even know the size of the allocation when you compile the program.
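The static strategy looks something like this in C — a hypothetical sketch (the names and the size cap are made up for illustration):

```c
#include <stddef.h>

// Fixed-size allocation decided at compile time: the kernel sets this
// memory aside when it loads the ELF file, and it lives until the
// process exits -- no free() needed, ever.
#define MAX_ITEMS 128            /* must guess the maximum up front */
static int items[MAX_ITEMS];     /* 'static': invisible to other C files */
static size_t item_count = 0;

// Fails once we hit the compile-time limit -- exactly the downside
// described above: the size is fixed whether we use it all or not.
int add_item(int value) {
    if (item_count >= MAX_ITEMS)
        return -1;
    items[item_count++] = value;
    return 0;
}
```

If the program ever needs more than `MAX_ITEMS`, this simply fails, and anything short of the cap is reserved but unused.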
So you need to account for the maximum size and have a limit, or you have to do something else, or you have to do dynamic allocation. And for dynamic allocation there's the question: do I allocate on the stack or on the heap? We pretty much just use C because it does stack allocation for us. If we have a normal variable, int x, inside a function, where is x stored? On the stack, right? That's primarily the reason we use C. But we've used malloc before, right? Hopefully. Well, you can also request memory from the stack explicitly. There's a function called alloca that I'm sure none of you have ever used before. It's essentially like malloc: you ask for a number of bytes, but instead of getting them from the heap, you get them from the stack. And your C compiler essentially does this for you. If I declare int x, what's really going on under the hood — and the reason you can take the address of a local variable — is that it effectively does an alloca of 4, the size of an int. That goes on the stack and you get a pointer to it, just like malloc. The only difference is that with malloc, we have to free. With alloca, you do not have to free, because how long does a local variable last that's within a function? Anyone? Yeah, until it's out of scope. Once the function's done, that memory is probably going to be reused for something else. So we have the function foo. It returns an int pointer, and all we do is declare int x = 1 and print the address of x. That should give us some address on the stack, and then we return the address of x as the pointer. This will point to somewhere on the stack, right? All right, so here I create int *p and I call foo.
So p should have the same value that got returned from the function, which is just the address of x. If I print p, these two things should be exactly the same, right? I'm just returning the address of x and printing it. It would be insane if they were different, right? Like actually insane. All right — yeah, %p will print nil if it's a null pointer. Any guesses why? Yeah, the compiler essentially knows you're an idiot. If we look at this compiler warning, it says "address of stack memory associated with local variable returned", and that is technically undefined behavior. You're not allowed to return the address of a local variable, because the address is only valid as long as it's in scope. So if I try to return it, that's undefined behavior, and C can do whatever it wants when there's undefined behavior, including just changing the return value on you. The compiler — and this is actually new in the last couple of years — takes advantage of the undefined behavior by returning a null pointer instead. The hope is that instead of the program appearing to work while randomly changing stack memory, any attempt to use it just causes a segfault and you get to fix it. All right, so I've worked on compilers. I'm smart, I can probably outsmart the compiler. How do we outsmart it? Is there a way to return the address of x without directly returning the address of x? Save it in a string, or just save it in another variable? Probably not a string — I could save it in a string if I wanted, but the compiler isn't even that clever. I can just create a pointer to an int called y, give it the address of x, and then return y. If I do that, my compiler's happy. It can't detect it. So compilers are really, really smart tools that can also look really dumb.
In this case, I made it look dumb because, well, I skirted around the check quite easily. If you want to know why the compiler is dumb in this scenario, take the compiler course — they're very, very complicated pieces of software. All right, so now if I execute it, I get what I expect: both addresses are exactly the same, as they should be. Any questions about that? So yes, this is technically undefined behavior; when the compiler detected it, its chosen behavior was to write a null, and here it didn't detect it, so its behavior was to do nothing. Yeah — volatile for what? Where do you want me to put volatile? Next to x? No, volatile wouldn't do anything here. Volatile essentially forbids the compiler from caching a value; it forces a reread every single time. So in this case, volatile would not help us. All right, now if I do something silly like this and just read whatever value p is pointing to — segfault, or garbage? Any other contenders? All right, let's try it. One. Why is it one? Yeah, are you good? Oh — so cleanup here doesn't take time. All it does is reset the stack pointer. The frame was "cleaned up," but the memory just isn't used anymore, so the value is still there. In this case, I'm pointing to int x, it was declared on the stack, and when I returned, the stack pointer moved back down — but my pointer still points up there, and as long as nothing overwrites it, it's the same value. It's not going to be zero or anything, even though technically it could be, because, you know, undefined behavior. But if I change the value and run it again — guess what, now it's three. It's consistent, because that stack slot didn't happen to get changed by the time I called printf. Now, what if I print it a second time? Yeah.
So in this case, with two printf calls: printf has local variables of its own that get assigned values, and because I still hold an address into that dead stack region and then call printf, printf uses that memory — it's allowed to, since I shouldn't be using it for anything. It goes ahead, clobbers the value, and when I read it again it's some gigantic number, probably because printf wrote a bunch of one bits into some variable there. Imagine trying to debug this. So now you can see how it works: the first print is consistent, the second is garbage. All right, any questions about this, aside from "don't do this"? Taking an address of the stack and keeping it around is really bad, because technically that memory doesn't exist anymore. It's the same idea as using memory after you free it — the exact same idea. For any stack variable, you can think of it as an alloca: I'm getting memory from the stack, and it's essentially freed the moment the function returns. If I use it outside of foo, that's a use-after-free, same as with malloc, and bad things happen when you do a use-after-free. All right, any questions about that brief detour? It sets you up a little bit for the compiler course. All right. So we've used dynamic allocation before, but we haven't seen how it's implemented. There's the malloc family of functions. It's the most flexible way to use memory and also the most difficult to get right. I assume everyone has used malloc and had it segfault at some point in their life — you are not alone. It's difficult to get right because you have to handle your memory lifetimes. What does that mean? You have to allocate memory and you have to deallocate memory, and the rule for deallocating is: deallocate as soon as you are absolutely sure you're done with it and will never read it again.
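Those lifetime rules in miniature — a sketch of my own, not the lecture's code; the function name is made up:

```c
#include <stdlib.h>
#include <string.h>

// Dynamic allocation: the size can come from runtime data, which static
// allocation couldn't handle -- but now the lifetime is our problem.
char *copy_string(const char *src) {
    char *dst = malloc(strlen(src) + 1);  /* size known only at runtime */
    if (dst == NULL)
        return NULL;                      /* allocation can fail */
    strcpy(dst, src);
    return dst;  /* caller now owns this memory */
}
```

The caller must call `free()` on the result exactly once — as soon as it's certainly done with it, and never before the last read.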
Otherwise you have to keep it around. When you deallocate, you also have to make sure you free exactly once. If you free twice, that's also undefined behavior — you can't do a double free. And if you were to implement malloc yourself, there's also fragmentation, because you don't control any of the requests and you don't control the frees; you just deal with whatever comes in. If you actually implement a memory allocator, fragmentation is one of your top concerns, right up there with performance. It's a fairly unique issue for dynamic allocation because, at least in C, when you malloc something you can pick any size you want, the memory has to be contiguous, and the decision is essentially permanent. After malloc returns an address, you have to guarantee that address stays valid. You can't move memory around on the user — that would make debugging even more impossible than it already is. Remember what a fragment is: a small contiguous block of memory that cannot handle any allocation. Think of it as wasted space. And for fragmentation to happen — it isn't an issue with the other memory allocation strategies we had — you need three requirements, and all must be true. The first is different allocation lifetimes: two different variables might last different amounts of time, so one might have to be freed before the other, or vice versa. Stack variables don't suffer from fragmentation because all the local variables in a function have the same lifetime — they last as long as the function does. You allocate them contiguously on the stack, and since they all have the same lifetime, they all die together and you get rid of them all at once.
And this is also true for global variables: their lifetime is the lifetime of the process, so they don't have to deal with fragmentation either. The second requirement is different allocation sizes, at least for external fragmentation. If all allocations are the same size — like your file system's blocks, where one block is as good as any other — it doesn't really matter: you won't get external fragmentation as long as everything's the same size. And number three is the inability to relocate previous allocations. In C, an address you get back from malloc is, as I said, permanent. If you could magically move the user's memory, you could defragment: if there were a bunch of holes and you controlled all the memory, you could compact everything together and update all the pointers. This is essentially what Java is allowed to do. In Java, everything's a reference and you can't use pointers directly, so the Java virtual machine will actually move memory around to reduce fragmentation; its job is to update all the references so everything stays consistent and you can't even tell it's happening. So, like we discussed earlier, there's internal and external fragmentation. External fragmentation is when you allocate different-size blocks and there's no room for an allocation between them. Here, if my allocations are in red and the blank areas are free space, then if this little bit of free space is too small to handle any allocation, I'd call that external fragmentation. Internal fragmentation is when you allocate fixed-size blocks and there's wasted space within a block.
So if these blocks were like your file system blocks, all the same size, your file would consume this red space, and any other space in the block — shown in the dark color — is just wasted, because the file isn't using it. The file has to fit within a block, but it doesn't need the entire block. Any questions about those two? You've seen them before. All right, so our goal as a memory allocator, if we want to implement malloc — which we don't in this course, but you will in 454 — is to prevent fragmentation, that wasted space. The idea is to reduce the number of holes between blocks of memory, and when there are holes, to keep them as big as possible so they stay useful for other allocations. We want to keep allocating memory without wasting space. The typical implementation is a free list: it keeps track of free blocks of memory of different sizes and chains them together. It's just a linked list of free space, recording how many bytes are free in each entry. Because we need to handle requests of any size, on allocation we choose a block big enough for the request and remove it from the free list. If the block is too big — say the free block is 20 bytes and we only requested 10 — then we allocate the 10 and shrink the free entry to the 10 bytes left over. And on deallocation, you just add the block back to the free list.
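That split-on-allocate step can be simulated with just the block sizes — a toy sketch (a real allocator also tracks addresses and links the free blocks into a list):

```c
#include <stddef.h>

// Toy free-list simulation: free_sizes[i] is the byte count of free
// block i. first_fit() takes the first block big enough, shrinks it in
// place (the "split"), and returns its index, or -1 if nothing fits.
int first_fit(size_t *free_sizes, int n, size_t request) {
    for (int i = 0; i < n; i++) {
        if (free_sizes[i] >= request) {
            free_sizes[i] -= request;  /* leftover stays on the free list */
            return i;
        }
    }
    return -1;  /* out of memory -- or just too fragmented */
}
```

With a 20-byte free block and a 10-byte request, the block is taken and a 10-byte entry remains, exactly as in the example above.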
And if it's contiguous with another free block, you want to coalesce them — merge them into a bigger block — because if this is free and this is free, then the bigger chunk of memory is also free. That's what you get to implement in a later course, but we'll go over the general strategies and talk about them. One strategy is called best fit: choose the smallest free block that can satisfy the request. And yes, you have to search the entire free list in the worst case, unless there's an exact match, in which case that's the block you use. Worst fit does the opposite: it chooses the largest free block that fits the allocation. It also has to search the whole list, so both of these algorithms are fairly slow. With first fit, I just walk the free list and take the first block that fits — I don't care how big it is. In the worst case it still searches the full list, but generally you hope it terminates early and hands you some memory. So let's allocate some memory using best fit. I'll draw a big block here and assume this is all the memory managed by your allocator. Any colored block is an allocation in use — there's a red allocation. The blank background with a number on it is free space, so there's a 100-byte entry on our free list. Then this blue allocation — or teal, or cyan, I guess, if we're fancy — and then a 60-byte free entry. If I'm using best fit, which finds the smallest free block that fits, and I have this green allocation of 40 bytes, which free block do I use: the 100 or the 60? The 60, right?
We want the smallest one that still fits. They both fit, and 60 is smaller. So what do you do? You allocate out of the 60, and since the request was only 40, there are 20 bytes left over. Now the free list has a 100-byte entry and a 20-byte entry. Next comes this purple allocation of 60 bytes, and in this scenario I have no choice, right? I have a 100 and a 20; it won't fit in the 20, so I have to use the 100. And what's left over? 40. So I put it there — at the beginning or the end, it doesn't really matter. Now I have a pink allocation — magenta, maybe — of 60 bytes, and where does this one go? I can't put it anywhere, right? The request is for 60 bytes, and I have 60 bytes free, but they're not contiguous, so I can't make the allocation. Because of fragmentation, I can't fulfill this request and I've wasted memory. You'd get an out-of-memory error here, and it would look really weird: there are 60 bytes free, you ask for 60 bytes, and you don't get them because you're "out of memory." All right, any questions about this one? Now let's see what happens with worst fit. Doing the same allocations with worst fit, where does the green allocation go, the 100 or the 60? The 100, right — worst fit picks the biggest one. Putting the 40-byte allocation there creates a 60-byte free entry. Now for the purple allocation: where does it go? In this case it doesn't matter — both entries are 60 bytes, so one is as good as the other. I'll put it on the left first. And now the magenta allocation — I can actually fulfill it this time. I can fill all of memory. So here the name is a bit counterintuitive: it says worst fit, but it gave me the best outcome.
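The two walk-throughs above can be replayed in code — a toy simulation over the free-block sizes from the example (addresses and coalescing omitted):

```c
#include <stddef.h>

// best_fit picks the smallest free block that still fits the request;
// worst_fit picks the largest. Both shrink the chosen block in place
// (the split) and return its index, or -1 if nothing fits.
int best_fit(size_t *s, int n, size_t req) {
    int pick = -1;
    for (int i = 0; i < n; i++)
        if (s[i] >= req && (pick < 0 || s[i] < s[pick]))
            pick = i;
    if (pick >= 0) s[pick] -= req;
    return pick;
}

int worst_fit(size_t *s, int n, size_t req) {
    int pick = -1;
    for (int i = 0; i < n; i++)
        if (s[i] >= req && (pick < 0 || s[i] > s[pick]))
            pick = i;
    if (pick >= 0) s[pick] -= req;
    return pick;
}
```

Replaying the lecture's sequence — requests of 40, 60, and 60 against free blocks of 100 and 60 — best fit ends up with a 40 and a 20 that can't hold the last request, while worst fit satisfies all three.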
So any questions about that? All right, we are definitely going to end early today — sweet. The problem with both of these is that they're slow, and if you simulate them — run a program, record all of its calls to malloc and free, and replay them against each algorithm — what tends to happen for real programs is that best fit leaves either very large holes or very small holes, with nothing in between. Those small holes are generally just pure fragmentation, wasted space. Worst fit, if you simulate it, is where the name actually makes sense: it's the worst in terms of storage utilization, with the worst fragmentation problems. And first fit, the one that's much faster and much less smart, tends to leave average-size holes, which depending on your program might actually be good, because those holes can serve other allocations and then they aren't really external fragmentation. All right, any other quick questions? We are storming through this one. If there's nothing else, we can wrap up in record time. What we'll see next time is that the kernel has to implement its own memory allocation. If you remember all the way back to lab zero, when you wrote some kernel code — it didn't really do anything, but if it had, and you required memory and tried to malloc, guess what: malloc does not exist in kernel code, because it's in the C standard library, and the kernel doesn't have the C standard library. The kernel is what loads the C standard library and lets your programs use it. So the kernel has to implement its own memory allocation, in some other ways that we'll discuss next lecture — but the concepts are the same, and user space can use those strategies for memory allocation too.
Once you start caring about the performance and memory utilization of your programs, you may at some point actually implement your own memory allocator. And guess what — Chrome comes with its own memory allocator. Python has its own memory allocator. A lot of programs have their own, because malloc has to be as general as possible; if you can tailor the allocator to your specific application, you can obviously make it somewhat better. So, what we saw today: there's static and dynamic allocation, and dynamic allocation is where all our problems arise. Fragmentation is one of the biggest concerns — wasted space between blocks is called external fragmentation, and wasted space within a block is called internal fragmentation. We saw that when we talked about file systems as well. Then, for our three general malloc strategies, we maintain a free list and do either best fit, worst fit, or first fit. All right, any other questions, concerns, comments? It differs a bit in kernel space, and a lot of the optimization does too. If everything is a fixed block size, you can write your allocator to deal with those specific sizes. For instance, in the kernel there's a special memory allocator just for inodes, because they're all the same size. In fact, if you've done or started lab six, that bitmap is also kind of a memory allocator: it keeps track of inode allocations, and it's super efficient. It doesn't have a linked list or anything — just one bit per inode, allocated or not, because everything's the same size.
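The bitmap idea can be sketched like this — a hypothetical example, not the lab's actual code; the slot count and names are made up:

```c
#include <stdint.h>

// Fixed-size allocator in the style of an inode bitmap: one bit per
// slot, no free list, and no external fragmentation, because every
// object is the same size -- any slot is as good as any other.
#define NSLOTS 64
static uint64_t bitmap = 0;  /* bit i set => slot i is in use */

int slot_alloc(void) {
    for (int i = 0; i < NSLOTS; i++) {
        if (!(bitmap & (1ULL << i))) {
            bitmap |= 1ULL << i;  /* mark allocated */
            return i;
        }
    }
    return -1;  /* every slot taken */
}

void slot_free(int i) {
    bitmap &= ~(1ULL << i);  /* one bit flip frees the slot */
}
```

Freeing is a single bit flip, and a freed slot is immediately reusable — no splitting, no coalescing, no list traversal.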
So generally, if you can do that, it's the fastest option, but next lecture we'll see another approach that's better and more general. All right, so with that, just remember: we're in this together.