Okay. Wednesday, any questions before we get started? We're going to keep going with the cdecl calling convention. So that different programs correctly interface with each other? Compatibility, so that we can call functions written by other people and compiled by other programs. Yeah. What else, why? So cross-program, what else, Anne? Being uniform between programs, and maybe having a compiler that supports more than one convention. Yeah, so we could maybe think about being able to compile our C code to two different calling conventions, different architectures, right? We need to know what these calling conventions are. And so we're going to focus on the x86 Linux calling convention called cdecl, that's what it's called. So when you look at your programs, it's exactly what they're using to call all the functions in your programs. So the responsibilities, right? This is what is spelled out in this convention. This convention says, okay, if you want to call a function, what do you have to do? You have to do X, Y, and Z in this order. And if you are called as a function, you need to do X, Y, and Z. This is your responsibility. So the caller first does what? You're deciding this. What should functions have to do? Or sorry, what does somebody who wants to invoke a function have to do? Figure out where the location of the function is. Yeah, so technically the linker will take care of that. When we compile our code, it says, hey, I want to call this function. Then the linker, before it creates an executable, says, oh, this function, I'm going to put it at this memory address. So in the code, we'll see a call to a memory address, not a call to a name. But that's resolved essentially at compile time or at link time. So that's already done. The code has a call to this function. But what has to happen before it actually calls? Allocate space for the function? Ooh, possibly. Right?
So somebody has to allocate space for the function. What else has to happen? Update the base pointer. We have to update the base pointer. What else? The arguments need to be passed somehow. Yeah, somehow we need to pass the arguments to the function. Let's see. Is there anything else? Save the frame pointer. Save the base pointer. So yeah, we talked about it. Somebody has to save the base pointer. And these things, who does what, are essentially arbitrary. Right? That's what I wanted you to think about. But the point is we need a convention that we all agree on. And we say, yes, this is what we're going to use. So for here, first, the caller pushes arguments onto the stack in right-to-left order. So the very first thing pushed onto the stack is going to be the rightmost argument. And then the second thing onto the stack would be the second rightmost argument, and so on and so forth, until we have the last argument, which is the leftmost argument. Does it make sense that this is the first thing that the caller has to do? So why does it make sense to do this first? Why can't the called function do this? Yeah, you're calling a function. You are passing arguments to the function, right? I guess you could develop a super weird way where the function that gets called somehow knows what the arguments are. But we need to give those arguments to the function. And we don't want to call the function yet, because it doesn't have any arguments yet. So the very first thing we do is set up the stack correctly with arguments. Then we put down what you can think of as a breadcrumb. And we'll see how this comes into play later. We're going to push the address of the instruction after the call. Why do we do this?
Yeah, so that the function that we call knows how to come back to us, right? It can't just magically happen. We have to actually invoke this function. The CPU just keeps executing things, whatever the instruction pointer says it should execute. And it needs to know, how does it go back? Where does it go back to? And that's it. So calling functions is actually fairly easy. We push arguments onto the stack. And then, as we'll see, there's actually one x86 instruction that takes care of pushing the address of the instruction after the call. So now the function gets called. Now what does it have to do? Right? We saw that, okay, this is what the caller has to do. We have to have some kind of convention. The callee has to do everything else. So we have to save the previous function's base pointer, or frame pointer. And do we have to save it onto the stack? That's actually a harder question than I thought. Ideally, no, but then the question is where else would we put it? Right? You need to store it somewhere. You can't store it at a fixed location, because then, if your function gets called again, it will overwrite that. It's the same thing. It's essentially a local variable. You could maybe store it in a file somewhere. It doesn't matter, as long as you put it back before you return. So you need to save the previous frame's base pointer. So now what does the stack look like at this point? Going from the bottom up, we have the saved base pointer, then the saved instruction pointer, which is where we're going to return to, and then the arguments. Right? Then we need to create space for our local variables. So now we create space on the stack for our local variables. And remember, as we continue executing, it had better be the case that the stack is consistent when we return. So what does that mean? Where should the stack be pointing when we return? Yeah, so it should be right there. We'll see exactly where it's going to be.
But yes, in essence, we need the stack to be returned to exactly the same point where we left it. And we put the return value in the EAX register, right? This is part of the reason why we call functions, so we get return values from them. And the convention states that the return value gets put into the EAX register. Questions? What exactly is the EAX register? It's a general-purpose register. The E stands for extended. So is it different on different operating systems? Yes. In some calling conventions, you put the return value on the stack; in some calling conventions, it's in a different register. So it all depends. Yes. Wouldn't you also need to push some of the saved registers onto the stack so that they can be restored? Yes. I left that part out because I didn't want to get too in-depth. But yes, there are certain registers that a function we call can overwrite and do whatever it wants with. And there are some registers that we're supposed to save, so we have to do that in there. The compiler makes sure of that. I don't want to get too deep into the weeds there. Yeah. So that's a good question. Say you're calling a void function and your compiler is writing the code for the first part, to call the function. What does it do with the return value? What was that? The function could put some value in the register, but we never use it. It could put values in that register. It can change that register, EAX. It could put values in there, but we've already said that's a void function, right? So our compiler shouldn't allow us to assign the return of a void function to a variable. So we're never going to use it. We're just going to ignore it. It's the code that we write, the calling code, that does that. The other case to think about is what happens when a function does return something but we don't use it. Same thing.
This function has no idea that we don't use the return value. It just puts it in EAX, and the compiler never generates code to read from the EAX register. As you pointed out last time, what if my return value is a float? It's different. I don't remember exactly how. You have to look at the spec. Yes, also if you return a struct, right? You can return structs. Those could be arbitrarily huge. It's in the spec somewhere. If you look up cdecl, I'm sure it's documented somewhere. I think in that case you put it on the stack, but the important thing is that both the function being called and the calling function know what to do in that case, right? And the caller can know because it has the type signature of the function, and it says, aha, this function is returning something that's more than four bytes. That means I shouldn't look at EAX. I should look on the stack. So the important thing is that you can establish all this information ahead of time, when you're compiling the code. Cool. Any questions here?
All right, let's go through an example. So now we have a main function. We have an integer a. We're calling some function callee, passing in 10 and 40, and then we're returning a from main, right? Then in our function callee, we take in two integers and return a plus b plus one, right? Very simple function. You can all step through this and see exactly how it executes. I hope. So what would main return? 51. Okay, there you go. Took a long time there, guys. Okay, 51. 40 plus 10 plus 1. Two additions. Okay, so when we compile this code, we can then see the exact x86 code that our compiler generates. We can see that in main, the first thing is push EBP. So what is it doing? It's completing its part of the calling convention, right? What every single function has to do is save the previous base pointer. So we push the base pointer of the function that called us. We save that on the stack.
We then move the stack pointer into the base pointer. So what does this do? Yeah, it sets our base pointer, right? For our function, our base pointer is going to be wherever the stack pointer currently is. Then we create our space on the stack. So here we're subtracting, what's hex 18? I just tried to do it in my head and I failed. 24. 24? Is that correct? Cool, I see some nodding, so that's good. Nobody wants to check. So moving down 24 bytes seems like a lot. How many bytes of local variables does main use? How many bytes? Yeah, we'll actually see why. It's because compilers are tricky, is the short answer. We'll see it in a second. So now we're moving hex 28, which is going to be what? 40, onto where? ESP plus 4. So it's going to be 4 above the stack pointer. Then we're moving 10 to where the current stack pointer is. And then we're going to call the callee function. So now, why did it take 24 bytes? What do I have to do if I want to call the callee function? What's the very first thing I have to do? Push the arguments onto the stack. What am I doing here? Yeah, but I'm not really pushing, right? Because I've already allocated that space with that subtraction. And then I'm just moving this into ESP plus 4 and moving this directly to ESP. So that when this function is called, the first thing on the stack is the leftmost argument, which is going to be 10. And the one above that is going to be 40. We'll see when we see it all laid out. Then when we're done, we move EAX into EBP minus 4. What is EBP minus 4? Yeah, the local variable a. It was never initialized before, so we didn't see it. But now we're moving into it. So EAX is now...? Yes, the return value of callee. So we move that into a, and now we need to move EBP minus 4 into EAX. Why do we do that? Yeah, we're compiling this return a. That translates directly to moving it back there. We can see, because we're humans looking at this, that this is redundant.
But the compiler needs to be very precise. If you don't enable any optimization, it's going to do what it knows is correct without any kind of optimization. So then we'll see what the leave instruction does, and then we have a return. Cool, so we call these bits at the top the prologue. So this push EBP, which saves EBP, then putting the stack pointer into the base pointer, and creating space for local variables: these have to be done for every single function. This is not functionality that's specific to main. Every function has to do this. So we call this the function prologue. And afterwards we have the function epilogue. So this is leave and return. You can think of it, and we'll step through it in a second, as doing the opposite of these things. That way the stack is all 100% good when we leave. And every function will have these. Cool, so if we look at the callee function, we have push EBP, same thing, right? We don't know who's calling us. We have to save their base pointer. We then have to move the current stack pointer into the base pointer to set up our own base pointer. Then we move EBP plus C into EAX. What's EBP plus C? So right before this function is called, what does the stack look like? Going, let's say, bottom up, what's the first thing on the stack? The return address, right? It has to be the return address of where we go back to. And then above that, what is it? The parameters, which in this case are what? In which order? Ten and forty. Ten and forty. So then the very first thing we do is push EBP. So now we have the saved EBP, the saved EIP, ten, and forty. Then we move the current stack pointer into the base pointer. The base pointer points right there. So now what's at EBP plus C? Ten? Ten, right? Okay, yeah. So that will be the second parameter, which would be b in this case, so forty. We're moving that into EAX. Then moving EBP plus eight, which is the first parameter, a, into EDX. And now we have again this load effective address.
So we're doing EAX times one, which is EAX, plus EDX. Move that into EAX. So we're adding a plus b. Then we add one. And then we do pop EBP and then return. So then what is this going to return? EAX, which is going to be the result of b plus a plus one. So what's different about the epilogues in these two functions, and why? And the prologues, let's say. First thing, are they the same? What's different about callee compared to main? It doesn't have any local variables? Yeah, callee doesn't have any local variables. So why waste the time and the space to create space on the stack for something you know you're not going to need? That's the only difference here. This prologue does not create any space, and therefore the epilogue is slightly different to compensate for that fact. I still think you could use a leave, but we don't need to. So then let's look at this thing execute. So now we have callee and main, and these are actual memory addresses that I took from a compilation of this program. And we'll notice here, this is where the compiler put each of these instructions, where this code is located in memory. This code is just bits and bytes, and the CPU reads those bytes, interprets them as an instruction, executes it, and then goes on to the next one. So for instance, this is why, after the linker, instead of having callee here, it's call 8048394, which is the address of callee. So this is what is actually going to be executed by our program. So we have our handy stack, high, low. We have eax, edx, esp, ebp, eip. These are all the registers that are used in this code. What is eip? The instruction pointer. So it has the address of the instruction that we're going to execute. So when I ran this one time, the stack was at fd2d4, which means that the stack pointer has the value fd2d4 when we start this. And the instruction pointer is at main, at 80483a5. And let's say ebp was, is this right? Yeah, fd2c0, so somewhere above, no, it's wrong, that's below us. Doesn't matter.
Some value. It should be above us, but for the purposes of this example, it doesn't matter. So, okay. So now we're going to execute push ebp. We first move our stack pointer down four bytes, and then we copy the value in ebp onto the stack. Push ebp, and now our instruction pointer points to a6. So how many bytes is push ebp? You can tell based on what it says here. Now we move the current stack pointer, which is fd2d0, into the base pointer. So now we're setting up our base pointer and saying that our base pointer points to here. Then we subtract 24 from the base pointer, sorry, from the stack pointer. So now all this space is our local variables and, as we saw, also the arguments that we're going to pass to this function. And our base pointer still points up here. Questions? Okay, now we're going to set up the call, right? Now we want to call this function. And remember, all of this code, these four lines here, maps back to that one line of a = callee(10, 40). And yet, there are a lot of instructions that have to happen in order to enable this function call. So, what happens? We move hex 28, which is decimal 40, into esp plus four, which is going to be here. Then we move 10 to esp. So now we've done exactly what we needed to do, right? We didn't necessarily push 40 and then 10 onto the stack, but we copied them to the right locations, so that when we call this function, it's going to have its arguments in order: if you're going from the bottom up, it's left to right. If you go from the top down, it's right to left. That makes sense, it's just super confusing. Why use offsets, why not just push them? Is it a compiler thing? Yes, it's a compiler thing. I honestly don't know why it decided to do that. I bet it could be faster somehow to just reserve that space, because this subtraction here, instead of doing, what would this be?
Instead of doing sub 16 and then a push, push, maybe for whatever reason these are shorter instructions, so overall your code gets shorter. Compiler code optimization is crazy, crazy. I think even compiler people don't fully understand why it does certain things, because it's not only trying to optimize speed; part of fast code is code that is smaller. If you can do the same thing in a smaller number of bytes, now you've cached more of your instructions, so your code should be faster because you're not pulling in as much code. But maybe it turns out that for certain CPUs, longer instructions are actually faster. How do you balance that optimization, and how do you do that? They do a lot of profiling, and they try to determine what's the best option. Now it executes this call instruction. So what's the difference between a call and a jump? With a call, we're coming back from the call, so according to our calling convention, what do we have to do? Push the instruction pointer. Push which instruction? The next instruction. The next instruction, which is what? bf. bf. Why the next instruction? Otherwise it's going to be the call again. Yeah, otherwise we keep coming back to ourselves and doing this infinite call thing. I mean, I guess you could deal with that somehow, but it would be super annoying, right? So yeah, this is particular to x86. Some architectures don't have a call instruction. You have to push the correct address onto the stack, and then you jump. But here we have a call instruction. So the call instruction does two things. It sets eip to exactly this value that we have here, 8048394, and it pushes 80483bf onto the stack. So it does two things. It's actually kind of cool. So when we call this, we're going to move the stack four bytes down. We're going to copy 80483bf onto the stack, and then we're going to jump and start executing here. So now we have a new function that gets called, right? And it has its prologue.
It says, I don't know who called me, but I have to save their base pointer. So the very first thing it does is save the base pointer, push that base pointer on, and then it sets up its own base pointer and says, OK, my base pointer is going to be here. So why can the compiler hard-code the fact that b is at offset EBP plus C, and a is at offset EBP plus 8? So it's based on the number of parameters, right? But why is the first parameter always at EBP plus 8? Because according to the calling convention, whoever called me must have put the arguments in order from left to right, or pushed them in order from right to left. So the one closest to the top of the stack is going to be the first argument. Then I know they must have pushed the return instruction pointer, and I just pushed the base pointer on there, and then I set this up as my new base pointer. So now I know 8 bytes above that is my first parameter. And 4 bytes above that is the second parameter, and 4 bytes above that would be a third parameter. Yeah, sorry, I'm counting too many. And that's why it's able to hard-code this. So this is part of the thing. When you're looking at binary code, if you have negative offsets from the base pointer, you're looking at local variables. And if you're looking at positive offsets, you're looking at parameters, which makes sense given the calling convention. OK, so then we move EBP plus C, which is b, into EAX. And so on our stack, by looking at this, we can see that we have callee's function frame, which includes all of its parameters. This is everything that callee needs in order to execute. And then above that we have main's function frame, which is everything that main needs. And this is what we use the stack for. So if callee called other functions, they would each get space on the stack for their own function frame.
And then if we ever ended up calling main again, it would have a separate function frame on the stack. And so this is how we can have local variables for every function and local parameters for every function. Yes? So it looks like you wouldn't even push the variables? We would still end up pushing, technically; it depends on how it compiles down. After this section, we're going to look at pass by value, pass by reference, pass by name, and see how that happens. But it's really just a compiler technique. I believe in most pass-by-reference implementations, you're essentially converting that into a pointer, transparently. So you pass the address of something into the function, and then instead of just accessing it normally, you dereference it. Cool. Okay. So we're going to move hex 28 into EAX. We're going to move 10 into EDX. We're going to add them together to get 50. And then we're going to add one more to get 51, which is hex 33. Now we need to clean up. So what do we have to do in our epilogue? Restore the base pointer of the caller. We need to return the base pointer of whoever called us to the correct location. So that's exactly what happens next, right? We're popping the value that's on the stack into EBP. So that's going to put main's base pointer back, right? Because that was our job, to save it. So now we have to put it back before we return. Otherwise we're a bad function, and we're not conforming to this calling convention. Then we need to do a return. A return, you can think of it as the opposite of a call. Essentially what it does, exactly what it does, is pop EIP. So take the value currently on top of the stack, pop it, and put it into the instruction pointer. That means the next instruction you execute is at that address. So here, the instruction pointer will be 80483bf and our stack will have moved up 4 bytes. And now we go back here, right?
So from main's perspective, everything's the same, right? The base pointer is the same, the stack is still in the same location as when it made the call, right? Everything's good. And we got a value in EAX. That value that we wanted from callee is in EAX. So now we can finish the rest of our job. We can move EAX back into, or not back, but into EBP minus 4, which is going to be a. Then we move EBP minus 4, which is a, into EAX again. So leave, you can think of it as the opposite of these three instructions. It does two things. It first sets the current stack pointer to the current base pointer. So in essence it gets rid of this sub hex 18. It doesn't matter where the stack is; it's now going to be at the base pointer. Then it does an implicit pop EBP. So it takes the value that's currently on the stack here, fd2c0, and puts that into EBP. So you can see in callee it only needed to do half of that, right? It only needed to do pop EBP. That's why it used that, because there are no local variables. So this leave instruction is pretty complicated. It's going to move the stack pointer to the current base pointer, and then it's going to pop that base pointer, so we're going to get fd2c0, whatever is on the stack there, into the base pointer. And then finally, now we're going to return. Where are we going to return to? We don't know, whoever called us, right? That's off the part of the stack we've drawn. But we'll return, and they'll do their stuff. In the callee function, when exactly do the pop and return happen: at the return statement, or at the curly brace? Here? In actual code, in C code. Let's say we have written a plus b plus 1, and then the right curly brace. Yes. So the return statement, if you're returning a value, will set up that value. So the fact that this value is in EAX is because of the return.
But when exactly do the pop and return happen, at the return statement itself, or does it go to the right curly brace and then the pop and return happen? There is no difference, right? The right curly brace doesn't compile to anything. That's just syntax. I can have a function which has multiple return statements. Yeah, so the first return reached is going to return. It'll probably compile each of them to a jump. So each of those returns will put whatever value you're going to return into EAX, then jump to the epilogue, which will do pop EBP and return. That's usually at the very end of the function, yeah. And then all the exit points of the function will jump to that one location in the assembly.
Now you get to a point where you get to choose your own education adventure. I didn't get to do this part last semester because we went not as fast as we're going now. I like where you guys are, so I think we have time to cover this. What I'd like to cover is a little bit of how this cdecl calling convention leads to security vulnerabilities like buffer overflows. And then walk through an example where we have a buffer overflow, to see how that blows stuff up. So if I get a show of hands, like, yes, I would like to learn that. Sweet, all right. For those watching at home, everybody raised their hands. I think there were some that raised two hands. It was overwhelming support. Okay, so, yes. Yes, well, no, we can only return one thing, because that is what C dictates. Right, C says each function only returns one thing. It can be a struct, which has multiple fields, but it is still only one thing. So if you wanted to change C to enable returning multiple values, you'd have to figure out a way to do that, and you may have to change this calling convention, which would break lots of stuff. So Python, Python's calling convention has a different design.
Yes, and Python also does not compile down to C code, right? But Python does have a foreign function interface so that you can call C code from Python, and it knows how to translate those things. But yeah, I think under the hood you could maybe do it with tuples, basically, like you would with structs. You could create a custom struct for everything that you wanted to return, but it would not be easy.
Okay, so, as we just saw, right, when callee was called here, we're trying to return to whoever called us, right? On the stack we have that caller's base pointer and the instruction pointer of where to go execute. Does anything do any validation that says yes, that base pointer that we saved for main is the proper base pointer? Does main do any checking to make sure that the base pointer wasn't changed? Is there anything that prevents us from writing to the stack? No, here's proof. We're pushing things on here, right? The stack is writable memory. That's the whole point. It is a stack. We are writing to it. So, we restore the base pointer. Now what does this return instruction do? It pops something off the stack and just starts executing from it. Is there anything that says this has to be exactly where we came from? No. There are no checks being done here. So, think of it this way. How many of you know the story of Hansel and Gretel? You need to work on your fairy tales. Okay, Hansel and Gretel, they were little kids and they had this great idea of, oh, we're going to go play in the forest, but the forest is really scary. So what we'll do is leave a trail of breadcrumbs from our house into the forest. That way they can always find their way back, right? So they're going along and dropping breadcrumbs every three feet, and that way when they're super deep in the forest, they can always find their way back. Right? And so that's the same way to think about these saved EIPs. Right?
These are breadcrumbs that we're leaving through the stack that tell us how to go back. But, just like we talked about, there's nothing that prevents, hmm, what if I change this, or what if I overwrite this, or what if your program has a bug and I can control that? Now I can change where you and your breadcrumbs go, into the evil witch's house. I think they live, well, at least in the version I know, so don't feel too bad for them. So, we're going to look at what happens if someone did. We're going to look at this function. We have some code. We have a myCopy function. We're copying the parameter string into a local buffer foo. How are local buffers like this allocated? Why am I calling it a buffer? Yeah, it's a local array. So I'm saying I want four bytes, and it's going to be allocated on the stack. What are the semantics of string copy? Copy from the source, which is string, into the destination, which is foo, until you see a null byte in string. Is there anything that limits the number of bytes that I'm going to be copying? So, I have my main function. You can see I used this a year ago. Clearly didn't update it. I actually just copied this right before class. We're going to call it like this. We're going to print after. We're going to return zero. So, we can look at how this is compiled, just the same, right? Move the stack pointer to the current base pointer. Subtract 16 bytes for our local variables. Move 8048504 onto ESP. What do we think this is? Well, then we're calling myCopy, so what do we think this is? Yeah, it's where the string is, right? The compiler has to put the string ASU space, CSE space, 340 space, fall space, 215 space, rocks, exclamation point, and then a null byte. That has to exist somewhere in memory. And so, the compiler just chooses: I'm going to put this at 8048504. And so, this is the address that it passes in. Then we're going to move 804-8507 into eax, move eax onto the stack pointer, and then we're going to call printf.
What's at this address? What happens directly after calling, oh, sorry. Oh, I changed this to callee. It should be myCopy. So what happens right after you call myCopy? Printf. We have a printf call. So this is the string "after", because this is being passed as the parameter to printf. Then we're moving zero into eax, which gets us our return zero, and we do a leave and return. So, in the myCopy function, just the same. We push the base pointer, then we move the stack pointer into the base pointer. We subtract hex 28 from the stack pointer. Then we move ebp plus 8 into eax. ebp plus 8 is the parameter, the string, str. So we're moving that into eax, then we move that into esp plus 4. And now we get to our load effective address, and here it's actually loading an address: we're taking ebp and subtracting c from it. So this is an address, and we're moving that into eax, then we're moving that onto the stack. So what's ebp minus c? A local variable. What local variable? Foo. So that's our foo. So we put it onto the stack. We call string copy, then we leave, and then we return. So let's look at this. We have eax, esp, ebp, eip, and we have all of our code. We have our stack. So the stack is currently here. The base pointer is somewhere else. The instruction pointer is currently at main. I'm going to walk through this a little quickly, as we just did this. We're going to push the base pointer, move the current stack pointer into the base pointer, subtract 16 from the stack pointer, then we're going to move 8048504 onto the stack, then we're going to call myCopy. And remember, calling pushes the address we're going to return to, 8048423. Now we push ebp, so we're going to save main's base pointer. We're then going to move the current stack pointer into the base pointer. We then have to subtract, what did we say hex 28 was? Did we already say that? No? A lot. We'll go with a lot. From esp, hex 28. I guess that's the easy way to say it, right? So it's hex 28.
Then we move ebp+8, which is 0x8048504, the address of that string that we passed in, into eax. Then we move that onto the stack at esp+4. So once again, right to left: we're doing strcpy(foo, str), so the rightmost parameter is 0x8048504. Okay, so we did that. Now we're taking ebp minus 0xc, and remember, load effective address means we are calculating the current value of ebp minus 0xc. We're not dereferencing that value.
"Professor, why are we using two instructions to move ebp+8 to esp+4, first putting it in eax?" I don't know, it's the compiler. I think, yeah, it did it, and it's correct.
Okay, so then we load the effective address into eax, so we get fd2ac, which is right here. So it's 12 bytes down from our base pointer, and we know that it's a local variable because it's a negative offset. Then we're moving that onto the stack at esp. So now we're calling strcpy. We're saying, hey, copy the string that's at 0x8048504 into fd2ac. And strcpy is stupid. It's supposed to be stupid, right? It does exactly what it says it does in the man page. It says, I'll keep copying from source to destination until I get a null byte in source.
So we call strcpy. What is it going to do in there? Starting at fd2ac, it's going to first copy "asu ". And remember, it's a little tricky. I believe I did this correctly. You'll notice, if you look at the bytes, we have space, and I think this is u, s, a, right? 0x61 is 'a'. So it's in reverse order when you look at it as an int. Why? What was that? Yeah, the endianness. x86 is little endian. So when we're copying byte by byte, we are literally copying the first byte, 'a', at fd2ac, and then the next one at fd2ab? Is that right? No, we're going higher, so it'll be fd2ad, right? We're copying up, not back down. So we do 61, 73, 75, 20. So we're going to copy the bytes up that way, and if you look at that as an integer, it reads as the string backwards.
Then we copy "cse ". We just keep going, right? strcpy has no qualms about where it's supposed to copy. It copies here, copies here, copies here, copies here, and then technically it also copies the null byte into this next thing here.
So then it returns. We get back, and we execute leave. What does leave do? "Remove the subtraction." So it removes the subtraction: it sets the current stack pointer to the current base pointer, so it's going to change the stack pointer to be fd2b8, and then it's going to pop ebp. So it's going to put 0x6c6c6166 into the base pointer. Then we execute the ret instruction. So where are we going to start executing? 0x31303220. The CPU will try to fetch an instruction there and execute from there. I would say that is highly unlikely to actually be a real instruction, so that will fail, we get a segfault, and our program will terminate. And so I can run this, and I get a segfault. I can run it in gdb, and it will say it's starting, and it will say it got a segfault at 0x31303220. And I can look at all the registers here and see that, yep, the base pointer had the value 0x6c6c6166.
So the idea is that an attacker can overwrite and change that saved EIP value, especially with very vulnerable C functions that don't take in sizes, right? All C arrays have to have sizes, right? C strings are zero-terminated strings, and the arrays holding them have to have sizes. But if I copy from a string that's potentially unbounded, or very large, into a buffer that's smaller, I'm likely to overwrite adjacent memory. So in this one we just caused it to crash. But if we can control that 31, 30, whatever those bytes were, we could make the program execute wherever we wanted and do whatever we wanted. And so there are lots of other techniques to take this to full execution, but using these as basic building blocks, you can fully execute and
take control of a program. This is one of those common vulnerabilities in C and C++. They're still finding buffer overflow vulnerabilities today in modern C and C++ code. So, cool. Questions?
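The overwrite from the walkthrough can be reproduced without crashing anything by modeling the relevant stack bytes as an ordinary array. This is a minimal sketch under stated assumptions: the layout (foo 12 bytes below the saved ebp, the return address 4 bytes above it) is taken from the lecture, the string literal is reconstructed from the byte values that were read out, and `simulate_unbounded_copy`, `read_le32`, and the 64-byte model are hypothetical names and sizes chosen for illustration. The model starts zeroed, where a real stack would hold main's ebp and 0x8048423.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    FOO_OFF       = 0,  /* foo sits at ebp - 0xc; model starts there */
    SAVED_EBP_OFF = 12, /* saved ebp is 12 bytes above foo */
    RET_ADDR_OFF  = 16, /* return address is 4 bytes above that */
    MODEL_SIZE    = 64  /* arbitrary size, large enough for the demo */
};

struct overflow_result {
    uint32_t saved_ebp;      /* what leave would pop into ebp */
    uint32_t return_address; /* what ret would jump to */
};

/* Read 4 bytes as a little-endian 32-bit value, as x86 does. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Copy src (including its null byte) to foo's position in the model,
 * going upward in memory, then report the two slots above foo. */
struct overflow_result simulate_unbounded_copy(const char *src)
{
    unsigned char stack_model[MODEL_SIZE] = {0};
    size_t n = strlen(src) + 1;

    assert(n <= MODEL_SIZE - FOO_OFF); /* keep the model itself safe */
    memcpy(stack_model + FOO_OFF, src, n);

    struct overflow_result r;
    r.saved_ebp = read_le32(stack_model + SAVED_EBP_OFF);
    r.return_address = read_le32(stack_model + RET_ADDR_OFF);
    return r;
}
```

Calling `simulate_unbounded_copy("asu cse 340 fall 2015 rocks!")` yields a saved ebp of 0x6c6c6166 (the bytes of "fall") and a return address of 0x31303220 (the bytes of " 201"), matching the values seen in gdb. A string of three characters or fewer leaves both slots untouched.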