All right, welcome back to 353. Today is probably another short one; we get to play with memory allocation, so we're definitely wrapping up. When we want to allocate memory in our program, the easiest way to do it is to just create a global variable. That's called static allocation. If I know the size of the memory I need ahead of time and it's going to last the entire duration of my program, I can create a global variable, say a buffer that is a char array of 4,096 bytes, the size of a page. How this works is that your compiler picks a virtual address to put that buffer at, and the executable file describes that you need 4,096 bytes and what virtual address it wants them at. Whenever you load your program through the execve system call, the kernel's job is to go ahead, set aside those 4,096 bytes for the program, and make sure that memory is actually valid. If the kernel runs out of memory loading your program, then too bad, your program probably isn't going to load. If your program loads successfully, you have access to that memory: you don't have to worry about allocating it, running out of memory, or anything else. That memory is going to exist as long as your process does, and you don't need to free it or manage it. Whenever your process exits or gets terminated or killed, that memory goes away; the kernel frees it for you. But our programs do not use only static allocation. Most of them have at least some aspect that needs dynamic allocation. We've probably all used it, but we haven't talked about how it could be implemented. So often we need dynamic allocation; sometimes we might only require memory under some conditions.
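As a concrete sketch of the static-allocation example (the 4,096-byte `buffer` matches the lecture; `buffer_size` is just an illustrative helper, not from the slides):

```c
#include <stddef.h>

/* Static allocation, as in the lecture's example: the executable
 * records that it needs 4,096 bytes at some virtual address, and
 * the kernel sets that memory aside during execve. It lives as
 * long as the process does; nothing to allocate or free at runtime. */
static char buffer[4096];   /* one page */

size_t buffer_size(void) {
    return sizeof(buffer);  /* known at compile time */
}
```

The size is baked into the executable, which is exactly why it can't grow later.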
With static allocations, we have to know the size ahead of time, and for it to make sense, the memory would need to be live as long as our program is. At some point you probably don't need that memory anymore, so it's just wasting space, because it's around until our process dies. Also, with static allocations, you have to know the size of the allocation ahead of time, or at least a maximum size, and then you have some kind of limit. So if I don't know the size ahead of time, I have to do dynamic memory allocation through malloc or something like that. Then the other question is: where do I allocate memory? We know malloc can grab us memory on the heap. Well, there's also a call to grab memory from the stack, and that's mostly done for us in C: if we want it, the compiler inserts it. Think of a normal variable: if you just have an int x in a function, that has to live in memory somewhere, and the compiler internally inserts the equivalent of a call to alloca (the function that actually allocates space on the stack) with size equal to the size of the data type. For an int, that would be four bytes. Then it has a pointer to that memory, which is on the stack, which it can go ahead and use. You can use alloca yourself if you want; you've probably never had to, because the compiler does it for you. And the rule behind it is that you don't have to call free or anything: whenever the function that called alloca returns, all the memory it allocated on the stack is freed, no matter how much memory was allocated.
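A hedged sketch of calling `alloca` directly (`alloca` is the real call, declared in `<alloca.h>` on glibc; the `shout_length` helper is invented purely to show the lifetime rule):

```c
#include <alloca.h>
#include <string.h>

/* Allocate a temporary buffer on the stack with alloca. The memory
 * is freed automatically when this function returns, so we must be
 * done with it by then; we return a computed length, never the
 * pointer itself. */
size_t shout_length(const char *word) {
    size_t n = strlen(word);
    char *tmp = alloca(n + 2);  /* stack allocation, no free() needed */
    memcpy(tmp, word, n);
    tmp[n] = '!';               /* e.g. "hi" becomes "hi!" */
    tmp[n + 1] = '\0';
    return strlen(tmp);         /* use it before the function returns */
}
```

Note that returning `tmp` itself would be exactly the bug discussed next.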
It's also really, really fast, because all freeing does is restore the stack pointer to its previous state from before you called the function, and then that memory just gets reused by other functions, on and on. This obviously isn't going to work if you actually try to use that memory after returning. So, let's go to our fun example. Look at this code, I'll give you a minute to read it, and then you tell me what happens when I run it, for these two printf statements. I call foo, and it returns an int pointer. Inside of foo, I declare int x, I print the address of x, and I return the address of x as the return value of foo. Now it's captured in p, and I print off the value of p, which is a pointer. Should the addresses in the two print statements be the same or different? What should actually go on here, and is this a great idea? All right, we got guesses of a seg fault and of them being different. But I'm not accessing the memory, just printing the pointer, so it's the same address each time, right? I print off the address of x, it'll be some address, I don't know, let's call it 0xA00, and then I return 0xA00. I should get a copy of 0xA00 back, store it in p, and when I print off p, it should also be 0xA00, or whatever that address happens to be. So they're both the same; it would be surprising if they were different. Anyone care to explain? Yes: after I return, that address points to some variable on the stack that should only exist as long as foo does, right? As soon as foo is gone, that memory is no longer valid. It'd be like using something after freeing it with malloc. And what's everyone's favorite C word? It is undefined behavior, right?
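The code on the slide isn't in the transcript, so here's a sketch of its shape: `bad_foo` returns a local's address, routed through an intermediate pointer (the trick the lecture uses to hide the compiler warning), and `good_foo` is my added contrast showing the heap version that is actually valid:

```c
#include <stdlib.h>

/* bad_foo returns the address of a local variable. The intermediate
 * pointer silences gcc's "function returns address of local variable"
 * warning, but using the result after bad_foo returns is still
 * undefined behavior. */
int *bad_foo(void) {
    int x = 1;
    int *p = &x;   /* indirection hides the warning */
    return p;      /* dangling the moment we return */
}

/* My added contrast (not from the slide): heap memory stays valid
 * until we free it ourselves, so returning this pointer is fine. */
int *good_foo(void) {
    int *x = malloc(sizeof(int));
    if (x)
        *x = 1;
    return x;
}
```

Only the caller of `good_foo` has to remember to call free exactly once.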
So a use-after-free is undefined behavior, and trying to access an invalid address is also undefined behavior. And see, the compiler is smart enough to know that the address of x isn't valid outside of the function foo. If you return it, that's undefined behavior, so the compiler can do whatever it wants, and what it decided here is to just set the value to null. But it could be anything; it doesn't have to be the same thing every time. Let me get rid of that, put in a newline, and recompile. See, it even tried to warn me: "warning: function returns address of local variable." So I shouldn't be doing that. It's just a warning, though, and it's technically undefined behavior, so the compiler does whatever it wants. If you want to get around it and fool the compiler (I worked on compilers, so I know they're not terribly smart), you can do something like this: take the address into a local pointer first and return that. If I do that, suddenly the warning goes away. I fixed it, right? So now if I run this, it's actually the same address both times, because, well, I outsmarted the compiler, and it's just what you said before: it's just an address, it just keeps the same value. So I use it, it's fine, it's the same address. Now that it's the same address, what happens when I uncomment this line that dereferences the pointer? What value am I going to see? One? Not one? Any guesses as to what, if not one? All right, let's see what we got. One. So that's one. Why? We got lucky, right? x was one before. We returned the address of it, and the stack just gets smaller whenever we return. The compiler could zero that memory out, but it's not going to, because it wants to be lazy and fast.
Returning just moves the stack pointer down, so the value doesn't change. It's still one, and then we print it out: still one, we haven't modified anything, we're all good, and nothing changes. But we only got one there because, well, we got lucky. In fact, you would think that if I just print the same thing twice, it should be the same value both times, right? One, one? Makes sense, right? We get one, zero. Why? Yeah: printf uses the stack for whatever it does internally. It just didn't happen to touch that memory before it printed the value the first time, but by the second time it had played with that stack memory and happened to leave a zero there. So it just ended up being zero, because we're pointing at invalid memory; it doesn't really matter. And if you want it to be really weird, if I do something like this and compile it, it just prints really weird values, because, well, I'm getting lucky or unlucky, however you think about it. Any fun questions with that? Yeah, the question is whether that even prints an integer. It will, because it's a printf with a %d format specifier: it takes whatever the pointer points to, uses those four bytes, and tries to make them an integer, whether or not that makes sense. It's just C, it's just memory, right? I tell it it's an int, but if I told it it was a character, it would try to interpret the memory as a character instead. All right, anything else we want to do with this fun thing? I think that's the most fun we can have. Yeah, each thread has its own stack; they should have their own independent stacks. If you could get addresses into other threads' stacks, you could access them, because it's all the same virtual memory.
But in order to function correctly, threads should have their own independent stacks, because sharing would get weird and probably be impossible to debug. Yeah, anything else with this? This is good fun, right? So just don't do that. Oh yeah, a question: why doesn't C prevent you from doing this? Because C is all about going fast, and checking would be slow. So C lets you write whatever the hell you want, which is the great thing about C and the not-so-great thing about C. All right, so that was a fun adventure in stack land. Back to dynamic memory. There's the malloc family of functions everyone likes. That's the most flexible way to use memory, but also the most difficult to get right. We have to properly manage our memory's lifetime: whenever we're done using memory, we have to remember to call free on it, exactly once; after we call free, we're not allowed to use it; and we have to make sure not to call free on it twice. Lots of rules like that. And there's also a new concern if you were to write the memory allocator itself: fragmentation. That is a unique issue for dynamic memory allocation. Fragmentation is essentially wasted memory, like when we talked about file systems, but it also applies to memory. It's mainly an issue for dynamic allocation because I can allocate memory in different-sized contiguous blocks: you just say, hey malloc, I want 100 bytes of contiguous memory, or four bytes, or 4,000 bytes. It doesn't really matter; you can request whatever you want. And you also can't move memory around. Ideally, you might be able to move memory and compact it together when there are holes. So if you allocate, I don't know, 100 bytes, then 10 bytes, then another 100 bytes, and then free the 10 bytes, well, that 10-byte hole might be useless, because nothing fits in there anymore.
But if you could move things, if I could move that 100-byte allocation over into the gap and make everything contiguous, maybe I wouldn't have any fragmentation issues. You can't do that in C, though, because every allocation is essentially permanent: if you call malloc, you get a pointer back, and the allocator can't move the block, because you're going to access memory through that pointer. It would be especially rude if the allocator moved your data out from under you and you suddenly seg faulted through no fault of your own; you used the pointer you got back from malloc, and then, seg fault. Java does not have this issue, because Java has a garbage collector and references, and references are just fancy pointers that can change. What Java will do, which is one reason it's slow, is this: say you have a reference to an object, which is just a pointer. Whenever Java decides to run its garbage collector and clean up memory, it's allowed to move objects, but it has to be careful, which is why it's slow, because if it moves an object in memory, it has to update every single pointer to it so they all agree with each other. It has to track all the pointers and make sure it changes them all consistently. One simple thing it can do is just pause the entire program, update all the pointers, and then let your program resume. That's something you can do in some systems, but C does not have this luxury. So a fragment is just a small contiguous block of memory that cannot handle an allocation; it's a hole in memory, wasting space. And there are three requirements for fragmentation, and they all have to be true. Your allocations have to have different lifetimes, meaning the period from when they're valid to when they're invalid differs from one allocation to another. They also have to have different sizes. And the third one is that you can't relocate them after you've already allocated them.
So for stack variables, they all have the same allocation lifetime: they all exist as long as the function does. So there's no problem with fragmentation for stack variables. If I need four bytes, I get four bytes on my stack. If I need another 20, I get another 20, all contiguous, right beside each other, and then I get rid of them all at the same time when the function exits. So I don't have any problems with fragmentation. Likewise if all my allocations are the same size: think of a page allocator, or your inode array in lab six. They all have the same sizes, so there's no fragmentation, because one is as good as any other. If I need a page, any page is as good as any other page, and I'm not wasting space, because they're all the same size. And the third condition goes away if you can relocate previous allocations; you might hear that termed defragmentation, just moving memory around to make everything contiguous, everything close together. I guess this might date me, but did anyone have their parents run the defragmenter on the family computer and tell you not to touch it because it's doing stuff? Or am I too old? No one? All right, never mind then; new generation. So, there are two types of fragmentation: internal fragmentation and external fragmentation. External fragmentation occurs when you allocate different-sized blocks and there's unusable room left between the blocks. That's basically the fragmentation that the memory allocator actually cares about and can control. For example, if I have a red allocation here and another red allocation here, maybe every other allocation I have is bigger than this blank space between the two red allocations.
And suddenly I can't use that memory, because nothing will ever fit between those blocks. I have no other choice; that space is just wasted. It's external fragmentation because it's between blocks. Internal fragmentation is when you allocate and waste memory within a block: the amount of memory the memory manager sets aside is bigger than the amount of memory you actually need, so there's wasted space inside the block. If this is our big allocation, we can split it into two parts: the red part is what the user actually cares about and is using, and the wasted space is the dark box, the rest of the block that isn't being used. This generalizes to any type of memory; your file system is still an example of it. For malloc, maybe it only allocates in powers of two and you ask for five bytes or something like that; same thing. So you can still have internal fragmentation even with fixed allocation sizes, because it's just wasted space within a block. Maybe I get rid of external fragmentation because everything is the same size, but I can't get rid of internal fragmentation as long as requests use less than the memory the block provides. In terms of what you saw before, those were the conditions for external fragmentation, from the memory allocator's point of view. For internal fragmentation, if you don't use as much memory as you get, it's a trade-off: you trade external fragmentation for internal fragmentation in that case. So our goal is to minimize fragmentation, to prevent wasted space. Whoops, sorry, back up; a question. In practice, we usually create a very large array and don't use all of it, since C has no ArrayList, so wouldn't it be very common to have internal fragmentation?
For that case, a big array where everything is the same size, you generally don't consider elements you're not currently using to be fragmentation, because you could still use them for useful things. Just because you're not using them does not make them fragmentation. So having a too-large array, yeah, you might be wasting space because you don't need it all yet, but in terms of the memory allocator, it wouldn't be considered fragmentation. So we want to minimize fragmentation, wasted space, which we should prevent. Now we're putting ourselves in the shoes of the people writing malloc. We want to reduce the number of holes between blocks of memory, and if we do have holes, we want to keep them as large as possible so we don't have fragmentation and can use that memory for more useful things. Our goal is to keep allocating without wasting space. Whenever you're implementing a memory allocator, the usual approach is a linked list called a free list, which keeps track of the free blocks of memory by chaining them together. Initially the program just asks for a bunch of heap space, and the entire heap is free: one giant block of memory that the allocator can use for whatever it wants. Then whenever a request of any size comes in, you choose something from your free list that's large enough to hold the request, remove it from the free list, and maybe split it. If I had 4,000 bytes and a request came in for 20, then I'd split that 4,000 into 20 and 3,980. Did I do my math right? Yeah, boom. All right, I shouldn't do math live; that's really a bad call. So I split it, I use the 20 bytes, and the other part stays free.
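The split just described (4,000 bytes becoming a 20-byte allocation plus 3,980 free) can be sketched with a toy free-list node. This is a simplified model: a real allocator stores this header inside the free block itself rather than in a separate struct.

```c
#include <stddef.h>

/* Toy free-list node: how big a free block is, and a link to the
 * next free block. */
struct free_block {
    size_t size;
    struct free_block *next;
};

/* Split a free block to satisfy a request: carve off `request`
 * bytes and return how many bytes remain free,
 * e.g. 4000 -> 20 used + 3980 still on the free list. */
size_t split_block(struct free_block *b, size_t request) {
    if (request > b->size)
        return b->size;   /* block too small; take nothing */
    b->size -= request;
    return b->size;
}
```

The remainder stays on the free list and can serve a later request.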
I put the remainder back on the free list, and then I can do another allocation with it later. Over time it can get pretty messy. So for allocation, whenever someone requests more memory, you choose a block in your free list large enough for the request, remove it from the free list, and maybe split it into parts. For deallocation, you just add the block back to the free list, and if it's contiguous with something else that's free, you can merge them together, because they now form a larger chunk of contiguous memory. In general, there are three general heap allocation strategies, which are ways to choose which entry in the free list to use for an allocation. The first is best fit: choose the smallest block that can satisfy the request. That means I need to search through the entire free list, unless there's an exact match, in which case that is the best fit and I shouldn't bother looking any further. The second is worst fit: choose the largest block, the one with the most leftover space. I also have to search through the whole list, except now I'm after the biggest block instead of the one that barely fits or exactly fits. The third, first fit, is kind of the FIFO of these algorithms: instead of searching the whole list, I just scan the free list until I find the first block big enough to satisfy the request, and I use that one; I don't really care what I get. So let's set up this situation. Here, anything that has a color on it is an allocation in memory that I can't change, and this is the region of memory we're managing: think of malloc managing this giant block. Someone has made the allocation in red, and then a bigger allocation, the one in blue or teal or whatever that color is. The blank boxes are free space, so there are 100 bytes free here.
And the other free-list entry says there are 60 contiguous bytes free here. Now a green allocation comes in that's 40 bytes, and my question is: which of these do I use? Should I use the 100-byte free space or the 60-byte free space to hold this 40-byte request, if I'm using best fit? We've got votes split between the 100 and the 60. The 60, right. Best fit wants the smallest block that can still satisfy the request, and between the 100 and the 60, that's the 60. So I put the green allocation there (at the beginning or at the end, it doesn't really matter), split the block off, use 40 of the 60 for the allocation, and now there's another entry in my free list that's 20 bytes. So I have 120 bytes free in total. Now a purple allocation comes in that's 60 bytes (it's a bit hard to read), and where do I allocate it? Well, I have no choice, right? It doesn't fit in the 20-byte block, so I have to put it in the 100. I could put it at the beginning or the end; I'll put it at the beginning. So I split the 100 into 60 for the allocation and 40 for the free list, and now our memory looks like this. We have 60 bytes free, but a new allocation comes in requesting 60 bytes and we have nowhere to put it, right? We have 60 bytes in total, but they're not contiguous, so we actually can't do the allocation. Best fit didn't work out for us in this scenario. Both of those holes would be considered lost to fragmentation; they're now useless. All right, now let's try worst fit and see if things go better. With worst fit, when that 40-byte allocation comes in and I'm choosing between the 100 and the 60, which do I use? The 100, right: worst fit picks the biggest free block there is, so I allocate in the 100-byte block.
So I use 40 of it for the allocation, and the remaining 60 bytes go back on the free list. Now when the purple allocation comes in that's 60 bytes, where do I allocate it? It doesn't matter: there are two 60-byte blocks, and one is as good as the other. Maybe I use the first one. Then the pinkish allocation comes in, another 60 bytes, and boom, it fits exactly in the remaining space. So worst fit, despite its name, worked better than best fit here. Turns out names can be deceiving, right? Well, it also turns out that no one actually uses either of those, because they're both slow: you have to search the entire free list on every single allocation. And if you go ahead and simulate these strategies, best fit tends to leave very large holes and very small holes, and the small holes would probably become fragmentation. Worst fit, if you simulate it, actually turns out to be the worst in terms of storage utilization, so its name is accurate despite that silly example. First fit tends to leave average-sized holes that probably would not be considered fragmentation, and in terms of implementation it's a lot faster, because it doesn't have to scan the whole list every time; it just finds the first entry that fits and uses it. So if you wanted to implement malloc for fun or for profit, you could: it just takes a linked list, first fit, and the sbrk system call to get heap space. This will come up in other courses if you're more interested in systems programming, but for now we'll leave it as is. All right, any questions for today? Sweet, we're definitely getting near the end of the course, because the lectures are getting very short, which I guess is a good thing, right?
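The three strategies can be sketched as selection functions over an array of free-block sizes (the array stands in for the linked free list, and an index stands in for a list entry; -1 means no block fits):

```c
#include <stddef.h>

/* First fit: scan until the first block big enough, then stop. */
int first_fit(const size_t *blocks, int n, size_t req) {
    for (int i = 0; i < n; i++)
        if (blocks[i] >= req)
            return i;
    return -1;
}

/* Best fit: the smallest block that still satisfies the request
 * (requires scanning the whole list unless an exact match appears). */
int best_fit(const size_t *blocks, int n, size_t req) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (blocks[i] >= req && (best < 0 || blocks[i] < blocks[best]))
            best = i;
    return best;
}

/* Worst fit: the largest block that fits, leaving the most leftover. */
int worst_fit(const size_t *blocks, int n, size_t req) {
    int worst = -1;
    for (int i = 0; i < n; i++)
        if (blocks[i] >= req && (worst < 0 || blocks[i] > blocks[worst]))
            worst = i;
    return worst;
}
```

With the lecture's free list of a 100-byte and a 60-byte hole and a 40-byte request, best fit picks the 60 and worst fit picks the 100, exactly as in the walkthrough.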
Imagine if I'd saved difficult things like threads for the very end of the course. Yeah, I intentionally push the easier stuff back so we have time to breathe. So, oh yeah, the question is: how does realloc work? Realloc will try to extend the block in place if it can, because then it doesn't really have to do anything; it just grows the allocation a little, which makes life easier. Otherwise realloc has to copy the memory too, so if it can just grow the block, it avoids the copy, and it would prefer that. Realloc checks: hey, is the space right after you free? Can I just use that space and extend you, instead of actually grabbing another block and copying the memory and all that? All right, any other questions? Oh yeah, sorry, what do you mean by linking? So, for this: the rules of C for realloc and the rest say an allocation has to be contiguous. You can't divide it into two chunks; otherwise your arrays suddenly wouldn't work, because indexing just does pointer arithmetic and assumes everything is contiguous. If you wanted to split something into two pieces like this, you'd pretty much have to use a linked list or something similar to keep track of the breaks. And in terms of maintaining the free list here, there's some hand-wavy stuff, because if you freed blue, you wouldn't keep a free 60, a free 100, and a free 60; they're all contiguous, so it would try to merge them into one free 220. We'll go over how to merge in the next lecture; for now it's hand-wavy: you scan everything and see what's contiguous. Yeah, so how do you get a split in the first place?
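Going back to the realloc question for a second, here is a sketch of the safe calling pattern (`grow_array` is a made-up wrapper; whether realloc extends in place or copies is entirely up to the allocator):

```c
#include <stdlib.h>

/* realloc either grows the block in place (if the free space right
 * after it is big enough) or allocates a new block, copies the old
 * contents, and frees the old block. Either way the old data is
 * preserved. Never assign realloc's result straight onto your only
 * pointer: on failure it returns NULL and the old block is intact. */
int *grow_array(int *arr, size_t new_n) {
    int *bigger = realloc(arr, new_n * sizeof(int));
    return bigger ? bigger : arr;   /* keep the old block on failure */
}
```

The contiguity rule in the answer above is exactly why realloc must copy when it can't extend: the grown array still has to be one contiguous block.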
Yeah, so basically the question is, how would I end up in this situation, with blue in the middle? This could happen because the picture is from halfway through running a program. Initially we're just managing this big block of memory. Maybe we get a 40-byte allocation first; it goes at the start, makes sense, right? Then we get a 100-byte allocation, which goes after red. Then we get another allocation, which is blue, and then we free that 100 in the middle, and suddenly we look like this, right? And yeah, this is contiguous in terms of virtual memory, but we know the kernel can map whatever physical pages it wants. It'll be contiguous in virtual addresses but maybe not physically, and that doesn't really matter, because it's random access memory; the only thing holding us back is the TLB speed, really. Contiguity in virtual memory is an easier illusion to pull off than in physical memory. For physical memory, that's technically memory allocation too, but the kernel only cares about pages, and pages are all the same size, so one page is as good as any other. And for malloc and the rest, your allocation can span multiple pages; it doesn't matter. Actually, here's a bit of a performance detour. Some people have done performance analysis on their programs, and the same program got way slower for what looked like the exact same execution: same parameters, same everything. It turned out it got slower because someone's username was slightly longer, and because that longer username was stored on the stack at some point, it crossed a page boundary, and crossing the page boundary made things a lot slower. So yeah, weird things can happen because of that.
So if you want your programs to go fast, I guess the TL;DR of that one is that shorter usernames are better. Mine's only three characters, just J-O-N, boom, my programs go fast. If you use any more than that, yeah, you're screwed. And if you want to be better than me, you could have your username be X, even though that's a terrible name too, or just J would probably be better. All right, any other fun questions? Otherwise, I'll be around to answer things. So here, let's wrap up. The kernel has to implement its own memory allocation, which we haven't gone into, because the kernel wouldn't have malloc or anything like that; the kernel exists before malloc exists, right? But the concepts are the same for the kernel as for user-space memory allocation; the kernel just doesn't lean on virtual memory the same way (well, it can, but its memory can actually be contiguous). In terms of allocations, there's static and dynamic: global variables versus using the stack or the heap. If I'm doing general dynamic allocations like malloc, fragmentation is our big concern. In fact, web browsers and programs like that will ship their own version of malloc instead of the standard malloc, because they really, really, really care about memory. Chrome, Firefox, all those browsers don't use libc's malloc; they have their own. Dynamic memory allocation returns blocks of memory that have to be contiguous, in virtual memory at least. Fragmentation between blocks is external; that's what your memory allocator mostly has control over. Fragmentation within a block that the memory allocator manages is internal. And in general, there are three general allocation strategies.
If you just have general different-sized allocations, there's best fit, worst fit, and first fit, and we'll see better ones tomorrow. So just remember, I'm pulling for you; we're all in this together.