Good Monday, everyone. Let's get ready to continue. Any questions before I start? Project 4 is due today? You're all ready for a rough day and night? The next 10 hours may be a problem. Yes. So is it hard? Well, I guess you're not done yet. Maybe we'll talk about that afterwards. Harder or easier than Project 3? Harder. Yes. Cool. What's that? 2 hours a day for 4 weeks? Cool. I don't want to do any extensions, because then it'll cut into your Project 5 time. Project 5 is also hard, so you'd just have less time there. Cool. Your midterm 2s are being graded. They should be back quicker than midterm 1. We'll see. No promises. I don't want to spoil anything before I give them back. Nope. I'll tell you about it later, after it's all graded. I can't promise, but I'm highly confident it will be faster grading this time around. And I think it'll be cooler for you guys, too. But we'll see. It could also blow up in my face and you won't get them back until the end of the semester. That's, you know, a large range of possibilities. You have to prepare for anything. Cool. Okay, so let's recap. On Friday, we were looking at how we can actually simulate pass by name when all we have is pass by value. So here we have our original code, where we have a function p that takes in an int y. We set a local variable j equal to y. We increment global i, and then we return j plus y. And we have this q function that has a local variable j, sets global i equal to zero, and then calls p, passing in i plus j. And remember, the cool and weird and counterintuitive thing about pass by name is that everywhere y is used, the expression i plus j is going to be substituted there. So in our main function, we just call q. And we can actually simulate this. So on the left is our pass by name program.
But we can get these exact same semantics with pass by value if, instead of y being an integer (because in pass by value that would be a separate copy of the integer), we evaluate the expression that was passed in every time y is used. In essence, that means we turn the i plus j expression into a function. Here I've done it as a global function, and the idea is that every time this function is called, we compute i plus j brand new. This is what we pass into p, so p takes in a y parameter which is a function pointer. And instead of an int, every time we were going to use y, we call y as a function. In this way we can actually compile this, and we get pass by name semantics: the same output, 5, which is what we got with pass by name. Any questions on this part? How many people know Java? I know it pretty well. Only a third of you raised your hands. That feels like a very low percentage. Don't you learn Java in your intro classes? So you've forgotten it since then? Haven't used it since then? Cool. I don't know how I feel about that, but I guess good. So now that you've learned parameter passing semantics, what are the parameter passing semantics of Java? Pass by reference? Pass by copy of the reference? Variables in Java are references, but do you see any pointers or pointer operations? It's complicated. Yes, like a Facebook relationship status. Is it pass by value, pass by reference, or pass by name? So let's look at an example. We have a class Testing with a field foo. We have a class ParameterPassing with a main method. We create a new object bar, we create a new object snap. We set bar.foo equal to zero. We set snap.foo equal to 10. We call a function passBy-question-mark with bar and snap, and then we print out bar.foo and snap.foo. So what does passBy-question-mark do? It takes two Testing objects, a and b. It sets b equal to a new Testing. It sets b.foo equal to 100.
And it sets a.foo equal to 42. So what is this going to print out? Zero and 42? Ten? 42 and 10? Yeah. Why? foo gets changed when b does? Effectively. What do you mean? So a and b are local parameters there. So does foo get changed when snap does? Or does bar get changed when snap does? foo is the field. Okay. So what gets changed? Yeah, the field in its class, right? Its member variable. This foo gets changed. So we set a.foo equal to 42. Right. And then b. So what happens to snap, though? There's a new object created. That changes the reference. So should it print out 100, if we're saying it's pass by reference? Because we have b equals new Testing. It shouldn't print it, because there's a new object defined inside the function, so it shadows the earlier one. Why? Because if it were pass by reference, then this b equals new Testing would change snap to be equal to this new object, right? Can you just make up new things? You kind of can, actually. So you can see how this is changing our notions of how we talked about pass by value and pass by reference. Is it pass by name? No, we can emphatically say no. It's definitely not. So what's the problem with this combination here? Why is it not purely pass by value? Well, everything's an object, right? We're passing in an object. We're passing in bar, and yet its field gets changed. So what it is not doing is passing in a new copy, a new value, of that object bar, right? So we say, okay, it's not pass by value, but is it pass by reference? Why is it not pass by reference? Because snap doesn't change to this new object that's created, right? snap.foo is not 100. Yeah, so it's kind of a weird combination of both, right? And that is part of the problem. How do you explain this to somebody if all they know is Java? So we talked about how, yes, it's pointers underneath, but Java doesn't have pointers, so you don't have to learn pointers to learn Java.
So how do you explain the semantics of this? Oh, they're just pointers underneath, and it's pass by value, but all the objects are really pointers to objects. And then it makes sense, because you're passing in a copy of the pointer to that object, so setting it equal to a new thing doesn't change the original pointer, but altering a field through that pointer, dereferencing that pointer, does change that object. The other way to think about this: remember we talked about assignment semantics? What are the assignment semantics of Java, right? So it's kind of pass by value with sharing assignment semantics. Here we're passing in by value, and this is not standard terminology. This is the problem, I feel: when you break it down, it's very clear what it's doing. You're saying these are pointers, and we're copying those pointers in by value, except you have to remember that primitive types are not pointers, they are actual values, so that throws another wrinkle into understanding Java. But to go back to the very first answer: it's complicated, right? That's probably not something you'd expect to say about Java. It's much simpler than C and C++, but I think in this instance it definitely is a bit more complicated, because you have to do more mental work to figure out what this actually means. Cool. Let's talk about something crazy. Let's talk about local functions. What do I mean by local functions? So what did we learn when we talked about type systems about functions? The type system that we used to discuss types and type equivalence, and also Hindley-Milner type inference. What was the important thing about that type system with respect to functions? A function is a type. A function is a type just like any other type, right? There's no difference between functions and classes and structs and ints, right?
It's a type just like any other, and so it can be an argument to a function, it can be returned from a function, it can be a field in a structure. So similarly, in C and C++ we have global variables, we have local variables, and we have global functions. Local variables are only accessible within their function. So why can't we have local functions as well? What makes functions so different that they're so limited compared to all these other constructs we've been talking about? So how would we actually implement a language that allows defining local functions, and what are the challenges there? The key thing is, these are functions that are only valid in their scope. So scoping is pretty easy to deal with. Would you agree with that? We do it just like with variables: it's valid only in that scope. We've all done this; you know how to do this. So let's look at an example. Let's say we have this C-like language where we want to allow local functions. We have a function foo, with an integer x. What's the scope of x? Yeah, just the body of function foo. Then we have a local function bar, and inside that a local function baz, and baz increments x. That is, it sets x equal to x plus one, and then it says if x is less than 10, call bar. So can I do this? What's the scope of the function bar? Yeah, bar is valid essentially from here to the end of foo. What about inside its own function body? Yeah, that's how we get recursive functions, right? Functions can call themselves. Actually, foreshadowing a little: when we get to lambda calculus, we'll see you don't actually need a name in order to do recursion, but here this is an easy way for us to do recursion.
The function name bar is visible inside its own body, so it can call itself, and likewise any functions defined inside it, so baz can call bar. So which x is this changing? There's only one x, and we're doing static scoping here, so this x maps to this declaration of x here. So bar, the very first thing it does, is call baz. That's all it does. And foo sets x to be zero, calls bar, and then prints out the value of x. So what would be the value of x? Does baz get called every time? So we call bar. What's the first thing bar does? It calls baz. So what does baz do? Increment x, so x is one. Is x less than ten? Yes, call bar. We call bar; what's the next thing that happens? We call baz. The first thing it does is increment x, so x is now two. Is x less than ten? Yes, call bar. What's the first thing bar does? Call baz. So it increments x to three, and it's going to keep going while x is less than ten. So when this is false, that will be when x is equal to ten. And when all these bazes return, what happens after them? This bar returns, which returns here, which returns to this baz here, which returns to this bar, and so on and so forth, all the way down to here. We finally get out of the first one and print out the value ten. Cool. So if we were to compile this with a magic version of DCC that supported local functions, it should print out ten. So how do we deal with accessing local variables inside of a local function? You take the one that's closest in scope to where you're trying to reference it. So how did we normally, before we had local functions, access this variable x when we set x equal to zero? For global variables, we gave them a fixed memory location. For local variables, what do we give them?
A fixed offset from the base pointer, into our function frame. Our function frame must contain this variable x at a fixed offset. But now what happens when we call this local function bar? We're calling a function, so we have to add another function frame. But then that calls another function, and then it needs to access variable x. But whose function frame is variable x in? Foo's! So how can we do that? Can we calculate a fixed offset and say, okay, I know there'll be this many function frames before me, so calculate from my base pointer? Let's look at the problem. Can the cdecl calling convention that we've been using actually support local functions? The problem is here, right: we have foo, then we call bar, and then that calls baz, and inside baz we set x. The problem is that this x is inside foo. In the cdecl convention, all we can access is our local variables and the parameters to our function. But here we are in baz, and we essentially need to walk up two calls on the function chain to get to x, because x is in foo. The question is, could we just walk up twice? Because we do have, remember, our previous function frame's saved base pointer. We have bar's base pointer; we could follow that, and from there we can find the base pointer of foo, and from there we can use the offset. So would that work? We know that we have bar's base pointer, right? Saved on our stack is bar's base pointer. We can just access it; it's on the stack. We can read it. So now we know bar's base pointer. We don't install it as our base pointer; we treat it as a value. We say this is that base pointer, and then we know 8 bytes or 4 bytes above that is going to be bar's caller's saved base pointer. So we can get that, and then say, aha, that pointer plus 4 is going to be x, and we can use that to access x.
What's the problem with that? Will that work here? We could do it; we have all the information. But what happens now: we say x is less than 10, we call bar, we put a bar frame on the stack, that one calls baz again, and now we need to access x. Now how many function frames up the stack is x? One, two, three, four. Yeah, four or five, depending on how you count. So before we only had to go up two, but now we have to go up four, and the problem is: how do we know how many to go up? The base pointer we save here stores the base pointer of the function that called us. So this link means that function bar called function baz. That's the way to read this stack of function frames: baz was called by bar, that bar was called by baz, that baz was called by bar, and bar was finally called by foo. So all of these links only give us the calling structure. But what is the scope of x decided by? Is it decided by the call structure? What's it decided by? The scope, yeah, the brackets, right? The scope rules. So instead of this, what if, when we called a function, we passed in its static parent's base frame, its lexical parent scope's base frame? For baz, that would be bar's, and bar would get passed in foo's base frame. In this way, we add what we call an access link. This gives us a way to access our static parent's scope, rather than our caller's scope. So we'd have to change our calling convention. This is why this is tricky, right? This is a language feature where, well, I guess I didn't start with this, but do you think that local functions would be useful? Or do they just make for complicated questions that I can ask you? It's one of those things. I don't know if you've ever heard this term. It's called, what do they call it? Blub.
Basically the idea is that it's very hard to appreciate how useful a programming language feature is until you've actually coded in a language that has it. Once you start thinking in that feature, you go back and go, oh man, how could I write code in Java, which doesn't have these local functions, or whatever. So there's this idea that it's actually hard to understand in the abstract. I can try to tell you how cool macro programming is in Lisp and how macros give you all these awesome metaprogramming benefits, but you'll just look at me like I'm crazy, until you actually take the time to do something in it and see the benefits for yourself, and then you look at other languages and go, man, this is garbage, why can't I do this cool stuff? So what we can see here is that even if this is a feature we wanted, it's actually difficult to implement, right? We need to fundamentally change the calling convention, because each function does not have enough information to support this. But we can do it: we can add an access link, and a function can follow the access link to find the correct lexical scope for all its variables. So if we were here, we would see that baz's parent lexical scope is what? Bar. And bar's parent lexical scope is what? Foo. So statically we know x is in whose function frame? Foo's. So how many access links do we have to travel up to get to x's function frame? Two.
And there it doesn't matter: as long as baz has its parent's function frame as its access link, and bar has its parent's, then it doesn't matter how far down the call stack we are; we can always get back up with two jumps of the access link. So I'm going to put this on the left: baz would have an access link to bar, and bar would then have an access link up to foo. So now we can hard-code this. We can say: follow this link twice, and then x will be at a fixed offset from that second value, no matter how deep our call stack is. This baz will have a pointer to this bar, and that bar will have a pointer to foo, and all the way down, however crazy the call stack gets, each of the saved frame pointers points up the call stack, and we also have our access links, where everyone points to their lexical parent. So it looks something like this. If there were an x declared inside bar, then every reference to that x would just travel up one access link. When we reach here, has the scoping rule already been applied? Yes, because the scoping happens statically; this is just how, at runtime, you reach the variable that you statically know you want. So statically we can tell: if there is an int x here in bar, we say, I know it's one access link up, so follow the access link up one, and here's the offset. This x is two up, so I follow two. Yes, so, JavaScript. Well, I should say this is one way to implement it, but yes. This is part of JavaScript: if you're not aware or familiar, its scoping rules are not braces; its scoping is at the function level. And in JavaScript you can have anonymous functions, which are super awesome. So what people do to create their own scope, when you want your own variables and you don't want them to leak out to other JavaScript scripts, is create an anonymous function, so that all the variables declared there are local to that function. You set up whatever handlers you need,
and then you immediately call that function. So you declare the function and call it, basically all on one line, and that gives you your own lexical scope. In JavaScript you can also do event handlers this way. You can say: when this button is clicked, increment this variable by one, and that variable can live outside the scope of the handler function. So it does, in essence, something similar to this. This gets into the whole topic of closures, which we're not going to go into here, but you can think of it as: the function sees lexically which variables it's accessing, and it packages up that information with the function so that when it's called, it can access those variables. Questions? More questions here? Okay, time for another choose-your-own education. Do you want to learn about heap memory management, or start on lambda calculus? Okay, raise a hand for the first option. Okay, now put your hand down. Two for the second option? I actually don't know which is better; it's probably about the same. So how do you decide which one you want? I kind of want to do lambda calculus because it's cool, and this would give us a lot of time to talk about lambda calculus. My justification would be: there are a lot of people who aren't here because they're finishing Project 4, which sucks for them, but I think as a general consensus we probably know less about lambda calculus than we do about heap memory management, and because we're more familiar with the latter, we'd go faster. Can you guarantee those students will be here on Wednesday for lambda calculus? Not at all. I'm just trying to be altruistic and assert that I care about you kids. I do care, but I care about you who are all here. Are we comfortable either way?
Yeah, we'll just have, like, half a lecture less on lambda calculus. I mean, lambda calculus can go for as much time as we allow it. So alright, I'm going to make the executive decision: we'll do this cool low-level stuff, get through as much as we get through, and then do lambda calculus, because I think we'll still have plenty of time for it. Cool. Okay, so we've seen different types of memory allocation. We've seen global allocation. How are variables declared and allocated globally? Statically, yes. The compiler just says: I'm using this much global memory; when you load this program, make sure that at this specific memory address you reserve this many bytes, because I will be writing to those bytes. Stack allocation we already saw, right? When we call a new function, we allocate space on the stack for its local variables, and when that function returns, we clean up that stack space, so it's automatically deallocated. So now we're going to look at how C heap allocation actually works. What are the semantics of heap allocation? You are the programmer; what do you have to do? You have to explicitly ask for memory by using malloc or one of its family of functions, and when you're done, what do you have to do? Free. You have to call free, right. So that's the big difference here: for globals you don't have to do anything, for the stack you don't have to do anything, but with heap allocation you're saying, hey, I need some memory, and I will be responsible for freeing it when I'm done. Okay, so C heap allocation is defined in libc. This is actually a library facility; it's not part of the C language itself, and the malloc family of functions is declared in stdlib.h. So we have malloc. What's calloc? You use this, I think, in the Project 4 code. Does anybody know what it does? Yeah, I don't
know, is that right? Yes, I think it's the zeroes-everything-out part. It's also a different style: you don't request a number of bytes directly; you ask for a number of objects and the size of each. Does it automatically zero the memory? I think it automatically zeroes it, right. So this is nice if you have structures with pointers, because you know those pointers will be null when you call calloc. What is realloc? Yeah, so this is you telling the heap memory system: hey, I know I asked for 50 bytes, but now I need 100 bytes; give me a new pointer with space for 100 bytes, and copy the previous 50 bytes over to that new memory location. So this is how you should be doing it; I should find some way to enforce this in the future on Projects 2 and 3. This is how you do dynamically expanding arrays in C: instead of hard-coding the size of your arrays, you track a capacity, you check the length, and when you want to add more, you double the capacity and use realloc to get more memory. Cool. Free returns the memory and says: hey, I'm no longer using this. The very cool thing about these being defined in the C standard library is that there are many different heap allocators, many different algorithms for how you manage and return heap memory. You can actually write your own heap management if you want to; it's actually pretty cool, and you can go read the source code. It's not part of the operating system; it's part of the library. This enables you to do cool stuff: if you're ever in, say, an embedded systems environment where you have specific heap requirements, you can use your own version of malloc and free. Let's look at this program. We have our main function, we have a test pointer, we're passing in argv[1] on the command line, and we're calling atoi. What does that do? ASCII to int, that's what I remember
about atoi. So it's going to parse the string. Remember argv is an array of character pointers, so atoi takes a string and returns the integer value. Then, for i equals 0 up to i less than 10, we're going to malloc that size that was passed in, and print out, as a pointer, the address of the memory it gave us, and then return 0. So this is our way of black-box testing and trying to reverse engineer the malloc library. We're saying: hey, show us the pointers that you're actually returning. Yes? Right, garbage: when test goes out of scope, we lose all reference to that memory. Yes. The other way to think about it: when test goes out of scope, the program ends, so we don't really care. Garbage only matters while your process is running; as soon as your program terminates, all memory that was allocated to your program goes away. Long-running processes are where garbage really comes into play: a desktop application, a web server, any kind of server application. If it's accumulating garbage, it will grow over time and crash. Okay, so we can compile this and run it with 4, so we're asking malloc to give us 4 bytes each time. So what is the other guarantee that malloc needs to give us whenever it returns us 4 bytes? What do we think should be true about these pointers? What must be true? You can dereference them. So it has to be memory that we can actually access. Well, unless, if you read the man page of malloc, it'll say that malloc returns null if it cannot allocate any more memory; in that case you couldn't dereference it, but you can check for null. We're not doing that here. What else? Why at least 4 bytes apart? We cannot have overlapping buffers, right. If we go back to even the box-and-circle diagrams, when we called malloc we assumed we got a brand new box with a new value inside of it,
and that writing to one of those boxes doesn't change any of the other boxes. If they were overlapping, we'd have to worry about that. So this is part of the thing: if you're trying to reverse engineer or understand how something behaves before you run it, you want to state your assumptions. My assumption is: if I'm running this with size 4, these pointers had better be at least 4 bytes apart. If they're less than that, something has gone horribly wrong. So on one run of this, it was 804A008, then 804A018. How far apart is that? 16 bytes. Is it overlapping? No. Is it more? Yeah. It's kind of cool, or interesting: 28, 38, 48, 58, 68, 78, 88, 98. I get the sense that they're 16 bytes apart. We would only technically be allowed to use 4 bytes starting at each of those addresses; it's not guaranteed that writing beyond that won't break anything. We're peeking under the hood into the heap. Yes, for your program to be semantically correct, you may only write to the 4 bytes of that pointer. If you ever write to the 5th byte, you could, A, have some random crashes, or B, it can actually be a security vulnerability, so you need to be very careful. Just like buffer overflows allow you to control execution, heap overflows are another type of vulnerability that can allow you to control program execution. It's more difficult to actually accomplish, but it's definitely possible. So yeah, you don't want to say, oh, I'm asking for 4 but it's giving me 16, we'll see if that works; that's a problem. Cool. So now we ask for 8 bytes, and we look: 008, 018, 28, 38, 48, 58, 68, 78, 88, 98. So now let's go bigger: 24. We said it's, like, giving us 16, so what if we ask for more than 16? 008, 028: so now how much is it using? 32. So it looks like it's handing out memory in 16-byte chunks. That's interesting. 4096, right, we'll go really big. So this is a lot; I can't do the hex math in my head. What would it be? No, it's not exact. What is
it? Let's see, we have a hex calculator there: B010 minus A008 is 4104. Interesting, so it's bigger than 4096, right. Now a huge number: D7FEC008. So the first one works, the second one works, the third one works, the fourth one works, the fifth one works, until it finally outputs 0. It returns null, because we've allocated all the memory that we can to this program. It returns null twice. I can't remember what exactly this value is; maybe null at the top there. I don't remember how many bytes. Oh, that's half a gigabyte each. So we actually got, let's see, about 3.5 gigabytes. That's pretty good. And we could actually use this, and my machine doesn't have to have that much physical memory. If you want to know why that is, take operating systems and pay attention in my class. Okay, so how does the heap work? As we saw, the way that we've drawn the stack, the stack starts at high memory addresses and grows down. So we have this process address space, all of our memory from 32 bits of all 1s down to all 0s, and our program thinks it's using all that memory. As we know, that's not actually the case, but it thinks it is. So we have the stack growing down, and we have a heap that we also want to grow. So where would we want to put that heap? At the bottom, to grow up, right. It doesn't make sense to have both growing down, because then there'd only be a fixed amount of heap you could ever use. You may not use much stack, you may use the heap more, and this way the program can deal with that. So the heap lives at lower memory addresses and grows up. Stack grows down, heap grows up. There's also a ton of other stuff here: the program's code has to live somewhere, the static variables, the global variables we talked about have to live somewhere, but we're not going to go into those for now. How do we keep the stack from growing over into the heap? Does the operating system know where the stack ends and the heap stops? It knows, is the short, very
very short answer. Really, actually, that is the truth: it knows exactly how much. We'll see how this works, but it allocates precisely: okay, here's your stack, it goes from here to here, you have this many bytes. And as we'll see, for the heap you actually make requests to the operating system to get more memory. So, right, yes, the heap is a libc construct; free and malloc are libc calls. But where does that memory actually come from? That's the important thing: malloc and free are library functions, but they have to actually get memory from somewhere. sbrk is the Linux system call to increase the size of the heap, and it tells the OS, hey, I need more heap space. This is kind of cool, because normally you don't have to worry about this; that's why we have nice libraries. But it's there, and you can call it yourself. So malloc doesn't conjure up new heap directly, but calls sbrk to ask the OS to increase the heap allocation, and this is the primitive that all of this is based on. So we can look at this function; we can actually do this ourselves. We can get a size, call malloc with that size, and ask for the current break: sbrk(0) will return where the current top of the heap is located according to the OS. Then we can print out these values. If we run this with 4, we can see that the program has actually been allocated heap up to 806B000, so malloc is being a little bit smart here: it's asking the OS for a lot and using just a little bit of it, so it doesn't have to make a system call every time you ask for memory. You can think of it as caching some space for you. We can increase the size, and now we can see that it's all still within that space at 1024. We can crank that up to 4096, and the very first time we do that, it'll ask for, let's see, more, so 6C00: it asked the OS for more memory, which just gave it to us. We can ask for 65536 and it'll
We can see here that in between these calls it had to ask for more memory from the OS. So this is another cool way we can debug and look at this: malloc calls sbrk to request more heap memory from the OS.

The question is, how is the memory deallocated when we call free? If you think about it, malloc seems incredibly simple: just call sbrk to get more free memory for the program. The tricky part comes in with what the programmer gives us when we call free: a pointer. It's just an address. How do we know whether we should free it? Maybe that was memory we never allocated. What if they accidentally pass a pointer into the stack? Should we free that? No, probably not, right? That would be bad.

So now we can do our test again, and this time free each pointer before we call malloc again. Running this with 4 is actually kind of cool: we keep getting the same pointer back. This makes sense, and it matches our semantics of what memory management should do: heap allocation shouldn't disturb anything else, and as soon as we call free, that memory is good to be reused, so we can get that same address back. Now let's only free half of the pointers, and we can see it's reusing every other one.

So we have malloc, which takes in a size and returns a pointer, and free, which takes in a pointer. How would you build these? I'd normally go into this more slowly, but I want to finish today and we've got 3 minutes. We know that malloc has to call sbrk to increase the size of the heap, and it needs to return the pointer that sbrk returns. Free could essentially sbrk by the negative of the allocated size. But part of the problem is that when you call free, you just have a pointer to memory. We told the programmer: however many bytes you asked for, that's how many bytes you can write. So when we call free, we're supposed to free that many bytes.
But that size was given to us in the call to malloc, right? So where do we put it? What actually happens is really cool: you allocate the size they asked for plus extra bytes for your housekeeping information. You use those extra bytes at the front to store the size of the memory that was allocated, and then you return a pointer just past that metadata. When they call free, you get that pointer back, decrease it by that many bytes, and now you can access the size, so you know exactly how much to free. Yeah, it's actually really cool.

This mirrors the actual source code of malloc; you can go read it, it's pretty complicated, but it basically looks like this: it stores the chunk size in front of the buffer. Okay, no, this one is my version, a super simple one. Your code would say: sbrk the size of a malloc chunk header plus the size you gave me, because I'm going to store the chunk size. I set the chunk size to whatever size you asked for, and I return a pointer offset into this memory to point right past the header, at the buffer. It's kind of crazy: if you look back, you'll see there's a bunch of metadata sitting there. And then free would be very easy: you take the pointer they give you, go back by the size of a size_t, which gets you back to the start of the chunk, and use a cast to say, hey, this is a malloc chunk structure, and now you can access the size. So you can allocate, free, all that cool stuff.

Okay, so part of the problem with the heap is that we have no control over when the programmer is going to free memory, right? They could free things in any order. So what we do is keep metadata in the heap that says what's free and what's being used. At the start, we'd say all of this memory is free, and then, say the programmer asks for four bytes: we split that chunk up.
Every chunk will have a pointer to the next chunk, along with a bit that says whether it's free: so this chunk is not free, and this chunk is free. Then when I need to allocate more memory, I can follow those links, essentially turning the heap into a linked list. I think it's technically a doubly linked list, so we can go backwards too. So we have a pointer from the free chunk to the next bit of memory; when we allocate some, we mark that piece of memory used, and when we free it, we flip that back, and now we can allocate more, so it gets reused.

Here's where it looks crazy: say we first allocate 16 bytes, then free it, then allocate only 8 bytes. Now we can reuse that region, but we only use 8 of its bytes, so we have to split the free range into a used range and a free range and add more pointers. And then they could ask for a lot more, and so on.

You probably never thought about it, but the heap allocator is incredibly important: if it took a couple of seconds to get memory, that would not be very good for your program. Modern malloc is super interesting. There are all different types of heap workloads; there can be lots of small, frequent allocations, so allocators actually keep different size classes in different areas of memory. If you ask for 4 bytes, it comes from one area; if you ask for 4K, it comes from a different area, because they know you're more likely to reuse and free those small chunks. Anyway, if you're interested in this stuff, I highly recommend you look into it. That's kind of a crash course on heap management.