All right, so we are talking about how C code is actually compiled into assembly. Let's say we have a function foo. It takes in two variables, a and b, and it returns a plus b. Then we have our handy-dandy main function. (You'd have types here in main for the parameters.) Here we'll have local variables int x and y, and a third one, temp. Then we will say x is equal to 20, y is equal to 40, and then we'll say temp is equal to foo of x, y, and return temp. So what should the value of temp be? It should just be x plus y. Super simple, right?

So now the question is: when this is compiled into assembly code, what does that assembly code look like? We'll start with main. And really the idea here is we want to understand not only how the stack layout works, but how the compiler compiles this C code into x86 assembly.

So if we recall our stacks (I'll draw here in a green color that may or may not show up), we draw stacks as growing down, so we know that up here all of this stack space has been used when main is called. There's a couple of things we've got to do here, but we'll build up to that.

First thing is, we know we've got to save some space on our local stack for the local variables, right? All local variables are stored on the stack. So how many bytes do we need in order to have these three variables x, y, and temp? It would be four bytes for each one. Yeah, so then we need 12 bytes. Cool.

The other thing to remember is we draw the stack from high, all F's, to low, zero zero zero. So we need to move the stack. What do we need to add or change in ESP to adjust the stack and allocate 12 bytes on it? Since we're going down, we have to subtract 12 from it. So at some point in this code (and since I know it's not the very first instruction, I'm going to leave a little bit of space) we know we need sub $12, %esp. With the syntax that we're using, this is going to subtract the hard-coded value 12 from ESP. We know it's hard-coded.
We know it's a constant because of the dollar sign. The result goes back into ESP: when you only have two operands to a subtraction, it's "subtract 12 from this and put the result in this."

So it's actually subtracting from the address? Yes, because ESP just contains an address. The value inside ESP is a pointer to the current location of the stack. So by moving it down 12, every time we use ESP we will now be pointing 12 bytes further down the stack. So we have space. These are now our bytes, right? We don't know what was there before.

But now we run into a bit of a problem. Eventually we're going to get to the point where we call this function foo and we pass in the values x and y. So we know, assembly-wise, we will eventually have an instruction that does this. (We're recording, by the way.) We're eventually going to get to the point where we have literally an assembly instruction that says call foo. Right before that we're going to have to push. The important thing is that in the cdecl calling convention we have to push onto the stack all of the arguments from right to left, then call the function.

All right, so the first thing we're going to push onto the stack is y. And on our stack here, which one is y? If we draw a line every four bytes, we've allocated three of these: we subtracted 12, so four, four, four. One of these is y. It doesn't matter which one; we just have to specify which one is which, just like the local variables x, y, and temp. Easy way to do it: let's just do x, y, and temp.

But now, when I start pushing these values, the stack is going to change. So how then do I reference temp when I need to store the return value in it? Right now, it's pretty clear.
I can use ESP. I can say at ESP is temp, at ESP plus 4 is y, and at ESP plus 8 is x. But as a function executes the stack can change, and it can change dramatically. So the idea is, instead of dealing with a constantly changing stack, let's use a different register, which I'll draw in red. We'll use what we call the base pointer, EBP, also called the frame pointer. This is a separate register that points to a fixed location on the stack. So as the function executes the stack pointer can change, but every single local variable will be at some offset from EBP.

But we need to set that up, right? When this function is called, none of this exists. So we actually want to do this before we subtract anything, because when we get here we know we're good: we have a bunch of stack space and everything above us is safe. So now we can move ESP into EBP.

If we didn't do the move from ESP to EBP, we wouldn't know where to return, would we? We're not at returning yet. This is only about referencing local variables.

What's the difference between main and other functions? There is no difference. It's exactly the same as other functions; main is just another function that gets called. So if we're saying that this base pointer, this EBP register, points into the stack at our currently executing function's frame, what was in EBP when main gets called? It would be whatever its caller's base pointer was. Let's say it's some function bar, so bar calls main. (This is probably the worst naming.) When main starts executing, the base pointer for bar, whoever called it, is in there. And suppose we were to just overwrite it with the current stack pointer, right?
This is great for us, but now when we return from main back to bar, bar's base pointer has been completely overwritten. And we wouldn't want that. We wouldn't want any function that we call to do that either: when we call foo, foo shouldn't change main's base pointer.

Exactly, so we need to store it. And where can we store data? On the stack. So we store bar's base pointer? Exactly, but we don't know that it's bar. That's the interesting thing: we actually don't care who called us. We just know that at this point in time, as soon as main starts executing, somebody else has called us, and so we need to make sure we save their base pointer. This is actually the first instruction in almost every single function: push %ebp. So we save our caller's base pointer.

Now I must actually draw the stack over there, because this will be important. So we draw a stack. We're up here, ESP is here, and when main starts executing the very first thing it does is push EBP onto the stack. So the first thing on the stack will be what we'll call the saved EBP, because we're saving it for whoever called us.

Then what's the next thing we do? Move ESP into EBP. This subtract instruction is your golden ticket to help you remember the operand order. It's impossible to copy or subtract ESP into 12, right? 12 is a constant value, so there's no way we could put ESP into the constant 12. So all the copying is happening this way, from the left operand into the right one, right?
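As a rough sketch of the prologue being built up here, the following C snippet models the two registers as plain integers and the stack as an array of 4-byte slots. The names struct cpu, push32, and prologue are made up for illustration, and the addresses are arbitrary small numbers chosen to fit the toy array, not anything a real process would use.

```c
#include <stdint.h>

/* Toy model (an assumption for illustration, not real hardware):
   byte addresses 0..255, stored as 64 four-byte slots. */
enum { STACK_SLOTS = 64 };

struct cpu {
    uint32_t esp;               /* stack pointer, a byte address      */
    uint32_t ebp;               /* base (frame) pointer, byte address */
    uint32_t mem[STACK_SLOTS];  /* stack memory, one entry per 4 bytes */
};

/* push: move ESP down four bytes, then store the value there. */
static void push32(struct cpu *c, uint32_t value) {
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

/* The three-instruction prologue from the lecture. */
static void prologue(struct cpu *c, uint32_t local_bytes) {
    push32(c, c->ebp);      /* push %ebp       : save the caller's base pointer */
    c->ebp = c->esp;        /* mov  %esp, %ebp : our frame starts here          */
    c->esp -= local_bytes;  /* sub  $12, %esp  : reserve room for the locals    */
}
```

Starting from ESP = 200 and a caller EBP of 240, calling prologue(&c, 12) leaves the old 240 stored at address 196, EBP = 196, and ESP = 184, so x, y, and temp would live at EBP - 4, EBP - 8, and EBP - 12.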
So we're moving ESP into EBP. Now that we've saved the base pointer, whatever was in there, we can create a new base pointer that points right where the stack currently is, at the saved EBP. Now we can subtract 12 from the stack pointer.

We don't know what's inside ESP? Well, that's not important right now. We know that it will point to somewhere valid in memory. That's the only thing we know about it, and we know that any place above ESP is allocated memory that we shouldn't touch. But we know that we can bring ESP down as much as we want and that will be fine. We're using this as scratch memory, as a stack that grows down. So we can call these x, y, and temp.

Now that we've got that, we need to set x to be 10. So how do we do that? It's EBP minus four. Up here, above EBP, we should not be overwriting any of that, because all of this is saved by whoever called us. The system won't warn you; it's possible to do that, and you can actually write code that does it. So minus four points us at x, right? But we want to make sure we're actually dereferencing it, because we want to copy 10 into the location pointed to by EBP minus four. Otherwise that's just moving the pointer. So we actually need to write the offset in front and put the register in parentheses. In the syntax that we're using, -4(%ebp) means take EBP, subtract four, and the parentheses mean dereference it. So: take 10 and write it into the memory at EBP minus four. Got it? Nice.

Well, what about the next one? How many local variables do we have in main? x, y, and temp. Fundamentally it actually doesn't matter, right? This could be 12, it could be 16, it could be 24. It doesn't have to be the minimum amount of space that's necessary. The compiler can do whatever it wants, because extra space there doesn't affect the execution semantics of the program. It depends on the compiler.
That's why you're going to send out that email, because compilers will often set this to a value that doesn't seem to make sense, like 16. That's about stack alignment and all kinds of stuff.

Oh, and we're not doing 10, we're doing 20. Okay, so then how do we do this next line? Same as before, but you move 40 to EBP minus eight.

So when you're analyzing binary code, when you're looking at the objdump output, and you see a value being moved into EBP minus an offset, or EBP minus an offset being moved into some other register, what do you know this should correspond to in the C code? Well, this would tell you it's a 32-bit thing. It's not a character. You don't necessarily know if it's an integer; it could be a float. You don't know at this point, but you know it's a local variable. It's definitely not a global variable. If it were a global variable, there'd be a fixed, hard-coded address in here. A local variable will always be at a negative offset from EBP. A global would look like a move into a hard-coded address.

Could a global still use EBP minus 8 or something? No, it would not do that. That would not be a global address; that would be local. Every invocation of this function will have its own space on the stack, its own function frame on the stack.

Okay, we did that. So now we've done all the easy stuff, and we need to get to the hard stuff. We want to call this function foo, which is our target. So what did the cdecl calling convention say we need to do? Push y, then x. So I need to push y onto the stack. So what do I push onto the stack, the address of y?
Which is it: are we pushing the address, or the value? The value of y. I'm passing in an integer. The type says that this is a value, and C has pass-by-value semantics, so it's going to copy the value that y currently has into foo. It would be different if the address-of operator were there; then we would pass in a pointer to y. Or if we were using C++ with pass-by-reference semantics, then it would pass in the address. Because if we passed in the actual address, we could modify the value? Exactly, and the return would add the address of y to x.

So what do we push here? Do you reference the EBP value? What's the exact thing? All right, so where's EBP right now? Is it pointing at temp, or at y? The negative eight just means "go there", but EBP itself stays put. Nothing changes EBP; it needs to stay fixed throughout our whole function. So we write minus eight and then, in parentheses, percent EBP: -8(%ebp). That will give us y. That will push the value of y onto the stack.

In our diagram, we didn't update it from before. Here we move 20 into EBP minus 4, so 20 goes here, and then we move 40 into EBP minus 8, so 40 goes here. And then we do push -8(%ebp). We take whatever is at EBP minus 8. Remember, the parentheses dereference and get the value. EBP minus 8 points here, right at the 40, and the dereference takes the value, so that's the 40 we push, and ESP now points here. Then we push -4(%ebp), and ESP changes again.

So now we've almost done everything we need to do, according to the cdecl calling convention, to call the function foo. There are some things we're missing. So we have foo up here.
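To make the two pushes concrete, here is a small sketch that models them in C under the same toy assumptions as before: registers are integers, the stack is an array of 4-byte slots, and the addresses are arbitrary. The helper names push32, load_ebp, and push_args_for_foo are hypothetical.

```c
#include <stdint.h>

enum { STACK_SLOTS = 64 };  /* toy stack: byte addresses 0..255 */

struct cpu {
    uint32_t esp;
    uint32_t ebp;
    uint32_t mem[STACK_SLOTS];
};

/* push: move ESP down four bytes, then store the value there. */
static void push32(struct cpu *c, uint32_t value) {
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

/* offset(%ebp): read the 4-byte value stored at EBP + offset.
   EBP itself is never modified. */
static uint32_t load_ebp(const struct cpu *c, int32_t offset) {
    return c->mem[(c->ebp + (uint32_t)offset) / 4];
}

/* cdecl pushes arguments right to left: y first, then x. */
static void push_args_for_foo(struct cpu *c) {
    push32(c, load_ebp(c, -8));  /* push -8(%ebp) : value of y */
    push32(c, load_ebp(c, -4));  /* push -4(%ebp) : value of x */
}
```

With EBP = 196, ESP = 184, 20 stored at EBP - 4 and 40 at EBP - 8, the two pushes leave 40 at address 180 and 20 at address 176, with ESP = 176 and EBP still 196. The copies on the stack are values, so nothing foo does to them can change main's x or y.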
So we're going to draw foo up here. One thing that we're missing is memory addresses. What does that mean? Say 0x0804 something. Honestly, I want to make these not super big addresses. The instructions will each be at different addresses. And that's what you're looking at when you see the objdump of this function: on the left it shows you the offset in the program where this code is located, in the middle column it shows you the actual bytes, and on the right-hand side it shows you the assembly instructions. So we'll say this one is at two, then four, six, eight. Those are the actual addresses in memory; this is where these instructions live.

In our imaginary program, the addresses go by twos? They don't go by four? No, it'll vary depending on the size of each instruction, which can be anywhere from one up to 15 bytes. I'm just doing this for simplicity, so those numbers are not going to be exact or anything like that. A, B, C, D, E. So those represent the addresses in memory where those instructions live? Yes.

We have something after the call, right? Let's actually ignore that for now and think. Okay, so this calls foo. According to the cdecl calling convention, how does foo return its value? In the EAX register, exactly. So foo will put the return value into EAX. Now what are we doing with that? How do we complete the semantics of this statement? We've just done this call. We've called foo; now we need to take its return value and put it where? EAX? It's already in EAX. Minus 12 from EBP: EBP minus four, eight, twelve. So we want to move EAX into -12(%ebp).

Now we need to do this final statement, return temp. So where do we put our return value? Into EAX. And where do we get it from? temp. I'm going to put the value currently in temp back into the EAX register.
At this point we're basically done with the logic. But if you look, the stack is all the way down here, and when we were called the stack was actually up here. And we put this additional base pointer on the stack. If we tried to return to whoever called us, we would still have this saved EBP on the stack. And when you call a function, the stack shouldn't change from your point of view: the stack should come back to you exactly the way you left it.

So do we need to do the reverse of what we did at the beginning of main? That is exactly what we need to do, basically working our way back up. We take the current stack pointer and change it back to where EBP is. So we need to move %ebp into %esp. That will effectively move the stack pointer up here. Which does what to all of this? It deallocates it, in essence. It doesn't actually change the contents of memory; the contents still stay there. But as far as the program is concerned, everything else can now use this space. So now we've gotten rid of that.

What's the next thing we need to do? We need to take the saved EBP and put that into EBP. It's the opposite of a push: pop %ebp. And now we need to return, which is just a ret instruction. We'll look at that in a second; we're going to come back to it.

Okay, so all this is correct. Let's rewind our program back to this call of foo, because we need to step through and basically write foo's function.

What about leave? We haven't got there yet, but the leave instruction simply does both of these in one. It does move EBP into ESP and pop EBP, all in one instruction. I assume it was some optimization they made: hey, every function is going to be doing this, so rather than having two instructions for that, which is going to be longer, we have just one instruction. I think it's just one byte, too, so it would shorten the length of every program.

Let's go back to here.
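The effect of leave can be sketched in C with the same kind of toy model (registers as integers, stack as an array of 4-byte slots). This is only an illustration of the two steps described above; the name do_leave and the addresses are assumptions.

```c
#include <stdint.h>

enum { STACK_SLOTS = 64 };  /* toy stack: byte addresses 0..255 */

struct cpu {
    uint32_t esp;
    uint32_t ebp;
    uint32_t mem[STACK_SLOTS];
};

/* leave = mov %ebp, %esp followed by pop %ebp */
static void do_leave(struct cpu *c) {
    c->esp = c->ebp;              /* mov %ebp, %esp : drop the locals; memory is untouched */
    c->ebp = c->mem[c->esp / 4];  /* pop %ebp       : restore the caller's base pointer    */
    c->esp += 4;                  /*                  and move ESP up past the saved slot  */
}
```

If the frame was set up with the caller's EBP (say 240) saved at address 196, EBP = 196 and ESP = 184, then do_leave puts 240 back in EBP and leaves ESP at 200, exactly where it was before the prologue ran.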
So let's rewind time. Our EBP is still pointing here. Right before this instruction, this call foo, the stack pointer is pointing here. Now we need to call foo, and foo is at, let's say (the compiler knows), I'll put it at 0x80440. So actually, when you look at the disassembly, you won't see call foo. You'll see call 0x80440, because that's what it's going to call. Essentially the call instruction says jump and start executing at 0x80440. So the instruction pointer will go here and start executing foo's instructions.

But when foo's done executing, where should it go next? It should go back to main. What exact instruction in main should it go back to? Where it was called, so that it can call foo again? No, it should go to the instruction after the call, which is 0x80421.

So the question is, how does foo know to come back here? It can't be hard-coded, because somebody else besides main can call foo. Let's say we call bar, which calls foo; foo needs to know to go back to bar's invocation. Even inside one function, main, we can have multiple calls to foo. You don't want them all to come back to the same place. You want each one to come back to the instruction that comes right after its call.

The way this is done in the cdecl calling convention is that the call instruction does two things. It sets the instruction pointer to 0x80440, or whatever the target is, so we'll start executing there. And it pushes the address of the next instruction that would have been executed. So when this call instruction executes, the stack changes and the value 0x80421 is pushed onto the stack. Automatically? It does that automatically.
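A minimal sketch of those two effects of call, together with the matching ret, might look like this in C. It is a toy model, not an emulator: the struct, the function names do_call and do_ret, and the stack address are assumptions, while 0x80440 and 0x80421 are the made-up code addresses from the lecture.

```c
#include <stdint.h>

enum { STACK_SLOTS = 64 };  /* toy stack: byte addresses 0..255 */

struct cpu {
    uint32_t eip;               /* instruction pointer */
    uint32_t esp;               /* stack pointer       */
    uint32_t mem[STACK_SLOTS];
};

/* call target: push the address of the next instruction, then jump. */
static void do_call(struct cpu *c, uint32_t target, uint32_t next_instruction) {
    c->esp -= 4;
    c->mem[c->esp / 4] = next_instruction;  /* the saved EIP */
    c->eip = target;
}

/* ret: effectively "pop %eip". */
static void do_ret(struct cpu *c) {
    c->eip = c->mem[c->esp / 4];
    c->esp += 4;
}
```

Calling do_call(&c, 0x80440, 0x80421) with ESP = 100 stores 0x80421 at address 96 and sets EIP to 0x80440. A later do_ret reads that slot back, so EIP becomes 0x80421 and ESP returns to 100, regardless of who the caller was.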
It's what the instruction does. What does a move instruction do? It moves this into there. What does a push instruction do? It copies a value, say from EBP, to where the stack pointer is and moves the stack pointer down. Well, technically it moves the stack pointer down first and then copies the value there. All of these have very precise meanings of exactly what they do, and it's the same thing with call. So don't think about it as "automatic." The call semantics are: set the instruction pointer to 0x80440 and push 0x80421 onto the stack. It pushes ESP down and then copies 0x80421 there? Yep. It's important not to think of it as magic. There's no magic here. These are all just specifications; you can look at the x86 manual and it'll tell you exactly what every single one of these instructions does.

How does it know to allocate four bytes? Because it knows it's a memory address, and addresses on 32-bit systems are four bytes. If it were a 64-bit system, it'd be eight.

So now when foo executes, what's the first thing foo has to do? Push its values? No, save its caller's base pointer. And then what's the next thing? For this one, we need to set up our own base pointer. Yes, we don't have any local variables in this function, but what do we have? Arguments. Exactly, the parameters, so we need to be able to access them.

So we first save our caller's base pointer. We can look at it here: this will execute, and we push the saved base pointer, so this would be main's EBP, main's base pointer. The next instruction executes: we move the stack pointer into the base pointer. Now the register no longer holds our caller's base pointer, and our base pointer for foo points here.

Now we need to do the actual functionality. We don't need to subtract anything from the stack pointer.
There are no local variables. But how do we get a and b, the parameters? You have to take the two arguments that were passed to you. How do you reference the arguments from here? You just do EBP plus something. Four bytes gets you here, eight bytes gets you here, 12 bytes gets you there, so it depends on what you want. If you want the first parameter, how many bytes above EBP is it? Eight. So if we move 8(%ebp) into, let's say, EAX, that would be a. Exactly, so the value inside EAX at this point would be 20.

So why eight? Why do we need to add eight? Because the stack grows down? But why specifically eight? What's at the addresses above us? There are two things above us, counting from the current base pointer. There is the saved EBP, and above that is what we'll call the saved EIP. That's the instruction pointer for where we're going to go after we're done executing this function foo.

Using that logic: we know that this was the base pointer of main while main was executing. So what's right at it? The saved EBP. And what's above that? The saved instruction pointer, for whatever called main. And then what's going to be above that? What if main had arguments? It'd be any arguments passed to main. Exactly, and that holds for any function. That's the important thing to remember: in the body of any function, anything at EBP plus an offset of eight or more will be an argument to that function. EBP plus eight is always going to be the first argument, EBP plus 12 will be the second argument, and so on and so forth.

So to round this out we can then do our addition. There are multiple ways to do this; a quick way would be add 12(%ebp) into EAX. So this is "add that plus that and put the result in EAX." With the dereference? I get it. What's at EBP plus 12?
What's the value currently there? 20? Well, it's pointing at the address that points to 20, so then we need to dereference? No, no, plus 12. At plus 12 is 40. Yes, so 40. And what's the value currently in EAX? It's 20. So this is: take the value at EBP plus 12 (four, eight, twelve: 40), add that to EAX, so 40 plus 20 is 60, and then store that in EAX. It's an implicit store, just like the subtract: the result goes into the last operand.

So now here in foo we've put our return value in EAX. Is there anything more, logic-wise, that this function needs to do? How do we return the value? How do we get the return value from foo in main? That is the cdecl calling convention: a function puts its return value inside EAX. So after I call a function, I know that this next instruction will be executed, and I know I can get that function's return value from the EAX register. Here our return value is already in EAX. That's pretty sweet.

So are we ready to just return back to main? First, move EBP into ESP; this moves the stack pointer up to the base pointer. Then we pop EBP. It takes that value there and puts it into EBP, so EBP will now point back up to here, and the stack pointer will then point here.

Now that we've done that, how do we get back? ret. Which is going to do what? How does that affect our stack? It deallocates? Yes, it pops. A return is essentially a pop EIP: take whatever address is there and start executing from that address. So this is: go to 0x80421 and set EIP, the instruction pointer, there, so that's where it'll start executing. And as a side effect, the pop changes ESP.

So now the instruction pointer is here. From main's perspective, between when we called here and when we got back here, how did the stack change?
From the point right before the call to here: say you put a breakpoint right here and look at the stack, and then you put a breakpoint here and look at the stack. How has the stack changed? No change. The stack is at exactly the same location. What's the only thing that changed? The EAX register changed, because foo changed EAX, which we know it's going to do because of the cdecl calling convention.

So, the important things to remember: negative offsets from EBP are all local variables; positive offsets from EBP are all parameters. Every function basically will have this prologue (I may have spelled it wrong). Every function will have a prologue and an epilogue, because every function needs to deal with "hey, I was just called": save the base pointer, create a new base pointer, and subtract for the local variables. This one doesn't have the subtraction because it has no local variables. This one has the same thing too. You should be able to do this.

So essentially, every time you call, say you have nested function calls, what you're doing is leaving a trail of breadcrumbs for yourself to get back to where you originally were. As the stack goes down, every function call will have not only the previous one's base pointer, which will point higher up on the stack, but also these pointers into the code to go back to and execute as you return.

So yes, exactly, and this is actually how recursion is done, so that every invocation of a function has its own separate copy of its local variables. This is why, if you have a recursive function that doesn't stop, you'll get a stack overflow error: you can't just keep creating stack frames for all these function calls. You run out of space, exactly.

Yeah, exactly, so it depends on the compiler how you do it.
Obviously you can be super conservative and just do the prologue and the epilogue every single time. Or you could do the prologue and, if there are no local variables, just skip the subtraction. Sometimes the compiler (I don't know what flag turns it on, or whether you have to specify inline specifically) can take a function like this and inline foo directly into main, so it compiles it right there and doesn't use the stack at all to pass any parameters.

Can you go over one more time where ESP and the base pointer were right before the return back to main from foo? Yes. Where's ESP? Let's say right here, before that one. When I put my hand here, I mean this has not executed yet; it's the next thing to be executed. The stack pointer will be here. And where's the base pointer going to be? The base pointer will be pointing at the same thing.

Is that because that function hasn't added any local variables to the stack? Right. If foo had any local variables, the stack pointer would be pointing down here, and we'd need this additional step of moving the base pointer into the stack pointer to move the stack pointer up to our base pointer.

Okay, so then we pop EBP, so that means it's going to go back to the saved EBP: the stack pointer moves up and EBP goes here. So EBP gets main's EBP. This would be an actual address, whatever memory address that was. So pop does two things: it puts the value back into EBP and then moves ESP up one slot, four bytes. Yes, a stack pop here is a destructive operation; basically, it removes that item from the stack.

Okay, I just didn't know how pop knew to put EBP back. Well, it doesn't know; it just puts back whatever value was saved here. We got that from this push EBP, right?
When we first got to this function, the stack was here and EBP was up here, and so we did a push EBP, which moved the stack pointer down and copied whatever address was in EBP onto the stack right here. It depends on what these addresses are; this would be like 0xBFFF-something. Whatever this address is gets copied onto the stack here. And then when we're here, trying to get back, we pop EBP: we take that value and put it back into the base pointer.

Okay, so it pops the saved EBP back into EBP? Yes. And it bumps ESP up one slot, which is part of the pop. And then when you execute the return instruction, does it take whatever ESP is pointing at and just start back off from there? Yes, another way to think about return is that it's just a pop EIP. Just like here with pop EBP, where the stack moves up and you take that value and put it in EBP, it's the same thing here: take that value, put it in the instruction pointer, and move the stack pointer up one slot. Then we get back here, we do some stuff, and then we do the exact same thing for main.

So then above main's saved EBP you would also have a saved EIP? Yes, for the previous calling program, exactly. And above that we would have any arguments for main. I think every main, even if you say there are no arguments, will still be called with three arguments: argc, argv, and the environment. So if nothing is being passed, it'll still have three spaces there? Even if you're declaring main like this, I think it always does.

Would we have to do something like that if the question came up on the midterm? No, you just go through the function. That's more of a tidbit for when you're analyzing functions.

So then once we go back, that's when you would put the result from EAX into temp? EAX into temp, then temp back into EAX, right, because the other thing to remember is that we're a super dumb compiler, right?
So we're just going to compile each line exactly as it is, without doing any optimizations. It's clear we don't need this step, but a dumb compiler will do that step. Now our return value is in EAX, and our job here is done. So now we're back up there. We move the base pointer into the stack pointer, so the stack pointer jumps up to where EBP currently is. Then we do a pop EBP, so EBP goes back to the value belonging to whoever called main, and the stack pointer moves up there. And then with this return we use the saved EIP here to return to whatever called main.
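For reference, here is a reconstruction of the program from the lecture with the hand-compiled, unoptimized assembly written alongside as comments (32-bit x86, AT&T syntax). It is a sketch pieced together from the walkthrough, not real compiler output: an actual compiler would likely pick different stack sizes, and it would typically also clean up the pushed arguments after the call. The function named main in the lecture is called lecture_main here, purely so the snippet can be linked into a test file that has its own main.

```c
/*
 * foo:
 *   push %ebp            ; save the caller's base pointer
 *   mov  %esp, %ebp      ; set up foo's own base pointer
 *   mov  8(%ebp), %eax   ; eax = a (first argument)
 *   add  12(%ebp), %eax  ; eax += b (second argument)
 *   mov  %ebp, %esp      ; epilogue (these two could be one leave)
 *   pop  %ebp
 *   ret                  ; pop the saved EIP and jump to it
 */
int foo(int a, int b) {
    return a + b;
}

/*
 * main:
 *   push %ebp
 *   mov  %esp, %ebp
 *   sub  $12, %esp        ; room for x, y, temp
 *   movl $20, -4(%ebp)    ; x = 20
 *   movl $40, -8(%ebp)    ; y = 40
 *   push -8(%ebp)         ; push y (arguments go right to left)
 *   push -4(%ebp)         ; push x
 *   call foo              ; pushes the return address, jumps to foo
 *   mov  %eax, -12(%ebp)  ; temp = return value
 *   mov  -12(%ebp), %eax  ; return temp (redundant, but unoptimized)
 *   mov  %ebp, %esp
 *   pop  %ebp
 *   ret
 */
int lecture_main(void) {
    int x;
    int y;
    int temp;

    x = 20;
    y = 40;
    temp = foo(x, y);
    return temp;
}
```

Calling lecture_main() returns 60. Comparing this listing against the output of a real compiler with optimizations turned off is a reasonable way to see where the simplified model and actual practice differ.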