Good morning everyone. Alright, so some good news. Well, almost good news. So the assignment three server is up. Yay. You all have accounts on that server. I just haven't sent you your username and password quite yet. I have them all. I was doing that and it almost made me late to my class this morning. So as soon as I'm done with this you'll get an email with your username and password for the homework three site and how to access the homework three site. So homework three, I also just had to write it up. So you all have accounts on this server. There are a bunch of you. Anybody who requested to be obfuscated has a numeric thing, the zero underscore. So you all have accounts on here. The goal is, there are a series of challenges in /var. There's a challenge folder which you can't actually access directly. In there, there's a subdirectory for each of the challenges. Okay, so on the server in /var/challenge, in that directory there's a number of folders, challenge one through challenge fifteen. So each of these challenges is one challenge, one level. You have to break one level before you get to the next level. So for instance, if I look at my ID, we can see that I'm in the group Adam D and the group level one, which means that I can ls the /var/challenge level one directory. Because what are the permissions on this directory? Yeah, so challenge, the owner of this directory, can read, write, execute, but level one, the people in the group level one, can read and execute. And then inside here there's a binary called level one, so what are the permissions on this guy? Read, write, execute for challenge, sticky, set UID, or set group ID in this case. So what does that mean? This level one program, when I run it, what's it going to run as? As root. As level two. Why level two? The group. So it's going to specifically run as the group level two, right. The set UID bit does not necessarily mean it's going to run as root.
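The permission bits being read off the listing here can also be checked in code. The following is a minimal sketch, assuming Linux semantics; `exec_identity` is a hypothetical helper name, and the modes in the usage note are made-up examples, not the actual modes on the class server.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: which identity is substituted when a file with
 * these mode bits is executed? (Sketch, assuming Linux behavior.)
 * Returns 0 = the caller's own user and group,
 *         1 = the file's group (set group ID),
 *         2 = the file's owner (set user ID),
 *         3 = both. */
int exec_identity(mode_t mode)
{
    int result = 0;

    /* An s in the group execute slot: run with the file's group.
     * Linux only honors this when group execute is also set. */
    if ((mode & S_ISGID) && (mode & S_IXGRP))
        result |= 1;

    /* An s in the owner execute slot: run as the file's owner. */
    if (mode & S_ISUID)
        result |= 2;

    return result;
}
```

For instance, `exec_identity(02750)` reports a group-only switch, which is the situation described for the level one binary, while `exec_identity(0750)` reports no switch at all.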
Here the s bit is on the group execute bit, which means that you inherit the group of that file. So this means when you execute it, it's as if you had group level two permissions. Challenge, so there is no set UID on challenge, so it's not going to run as challenge, it's going to run as you, the user. If there was an s on challenge's execute bit, then it would be running as the user challenge. So I don't know. Am I going to tell you who I am? Clear. So your goal, you have to figure out how to do that. You can look at how everybody is doing with the score command. That shows you what level everyone is at. So then once you break that level you go up to the next one. There's a nice helper command called the leap function, which is for when you break a program, right. So once you break that set UID program, you call this leap function to set your group to that group. So that gives you the permissions of that level two group. That's how you add yourself to that next group. Questions so far? So is this pretty much a clone of, like, Smash the Stack or the wargaming sites? It's similar. Yeah. So you can see all the challenges in here, level one, level two, three, four, five, six, seven, eight, nine, ten, 12, 13, 14, 15. Yeah, they're similar in style, different in substance. These are supposed to be a little bit easier, right, but they get harder. So some of them you'll have source code to, some of them you will not have source code to. So do it responsibly. Do what you need to do. Yeah. Okay. Okay. So some important, important comments. Right. So A, there's how many of us in this class? Anybody know? A hundred and twenty three. Right. It's a decently big server. I don't know. It's eight cores with, what was it, 16 gigs of memory. So I figured that was fine. We'll see how it goes. I may have to split some of you off onto another server. Hopefully I don't have to do that. But I think it should be fine. So, you know, brute forcing things.
I'm pretty sure none of the levels have to do with brute forcing. So yeah. Do we have a coding directory, like a temp directory or something like that? You have your own home directory, and yeah, there's a temp directory, I believe. So A, be mindful of the resources on the system, right. Don't be hogging all the resources. B, so this is actually a very important one. So while I want you to have fun, and if you want to try to attack the server, I'm fine with that. But only this machine. So this is running inside our OpenStack cloud for my lab, right. And so I've tried to restrict it so you can't access any of the other servers inside, because there's experiments running and all kinds of stuff happening. So we're not assuming that there's malicious internal cloud traffic in our lab network, so don't do anything malicious on the inside of the network. That makes sense. So you can totally attack this machine, you can try whatever, I don't know, if there's any root exploits or whatever. But only on this machine. Don't nmap other machines in the network or try to sniff traffic or anything like that, because that's definitely not how it works. Cool. Okay, I think that's it as far as safety, that kind of stuff, ground rules. So what you'll turn in, I'll be able to tell exactly how you did based on where your score is on this thing. And then you'll have to submit a README of exactly how you broke each of the levels, right. So as you're doing this, keep a walkthrough for each level of exactly how you did it and the commands. Hey, this is how I broke this level, so that you can do it again and so that we can know that you actually did it. Any questions? I'll give you the logins after I talk, because otherwise you'd all just do that instead of listening to the lecture. Okay, cool. I think you'll enjoy it.
Because essentially you're going to be applying all the stuff, all the attacks, the binary attacks we've been talking about. So, you know, one of my big philosophies is that understanding how a vulnerability works is, you know, good. It's the first step, right, but actually creating an exploit that works on a real program, a real system, is something else entirely. So it gives you a lot better understanding of that vulnerability. Okay, so right now we're going through talking about the stack, and we're talking about how the layout of the stack looks, because we want to try to understand buffer overflow vulnerabilities, right, and what they can do. So I want to get through all this stuff today. So I'm going to go a little bit quickly because this is kind of background-ish material. So like I mentioned, this is stuff that I teach in my 340 class. If you took 340 with me, you should be a total expert on this. Okay, so functions, right, they need space to allocate for their local variables. And so we want to use the stack, right, so we want to use the stack to store each of the local variables of the function. And so we saw the stack, right, we saw that it moves and we can push things on and pop things off, right. And so we want to use the stack for storing these local variables of our function. But as our function executes, right, what's going to happen to the stack pointer? So ESP, right, is going to store the location of the top of the stack in memory. So as our function executes, what can happen to the stack pointer? It has to go up and down depending on things being pushed, things being popped. The stack's also used if you're doing a computation that needs more values than you have registers. Some of the registers will spill over onto the stack, so the stack will be changing based on that.
So while we could use the stack pointer, so we could say, okay, every local variable is some offset of the stack pointer, but the stack pointer is going to be changing throughout the execution of our function. So we introduce a separate concept, which is a frame pointer, or in x86 terminology it's called the base pointer. So the base pointer is going to point to a fixed location for each function invocation, and that's going to define the function frame on the stack. And this way, every local variable is going to be some offset from this base pointer, right. And that's exactly how local variables are implemented in functions. An important concept is that you can have multiple invocations of the same function on the stack at once. That's how we have recursion. And so because of this, every different function frame is going to have its own base pointer, and they'll all be at different memory locations. And so in x86, the frame pointer is the base pointer and it's stored in the register EBP. So let's look at what this looks like. So we have a C program, we have variables A, B, and C, we have A is equal to 10, B is equal to 100, C is equal to 10.45, and A is equal to A plus B, and we're returning 0. So what the compiler does is, the compiler looks and sees what local variables are used in this scope, and then it's just going to define every local variable as some offset of the base pointer. It just decides. It knows exactly how many variables are being used. So it knows exactly the size that it needs to reserve, and then it assigns each variable some offset. So it can say that variable A is at EBP plus A, B is at EBP plus B, C is at EBP plus C. And so looking at this code, translating this into kind of pseudo-assembly, this would be something like the memory at EBP plus A, that offset, is equal to 10, and that's what this instruction does.
And the next instruction is going to be EBP plus B, the offset B, is equal to 100, and this is EBP plus C is equal to 10.45. And then finally we're going to say, okay, memory at EBP plus A is equal to memory at EBP plus A plus memory at EBP plus B. And so what it does then is it assigns actual concrete values to these, so it could be 4, 8, and C. And specifically, as we'll look at on x86, the local variables are going to be below the base pointer, so they're referenced with negative offsets here, whereas parameters are going to be positive offsets. So we'll see this specifically. So it's going to look something like this. So it's going to say move the stack pointer into... Oh, so this is actually this function compiled into x86. So we'll see this function prologue, this will come up later. So we're first moving the stack pointer into the base pointer. So now whatever was in the stack pointer is now going to be in the base pointer. Yes, okay. So now the base pointer points to where the current location of the stack is. Now we're moving the stack pointer down 10 in hex, so down 16 bytes. So the base pointer is going to be where the stack pointer originally was, and the stack pointer is going to be down 16 bytes. So now by moving it down, it's essentially allocated 16 bytes for our function to use. So that way at base pointer minus C, that's where variable A is. So it says move hex A, which is 10, into EBP minus C. And then move hex 64 into EBP minus 8, which is not quite as far down as C. And that's where B lives. And then it's going to move 0x41273333 into EAX, which, as we all know, is 10.45 in IEEE floating point format. Then we're going to move EAX into EBP minus 4, which is the location of C. So why does it do this? So what's the effect there? Those two statements, what's the ultimate outcome of those two statements? Those last two move instructions. Yeah? So the float, they are two bytes' character as a four byte. That's why it takes two memory space.
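The constant loaded into EAX for 10.45 can be checked directly. The following is a minimal sketch, assuming `float` is a 32-bit IEEE 754 value (as it is on x86 Linux); `float_bits` is a hypothetical helper name introduced here for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern stored for a float.
 * Assumes float is IEEE 754 binary32, 4 bytes wide. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy the bytes, no conversion */
    return bits;
}
```

Calling `float_bits(10.45f)` gives `0x41273333`, the same constant that appears in the compiled code, which is why the float fits in one four-byte slot at EBP minus 4.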
So we are showing in our register that floating. So what's the effect? What are we doing there? Move bytes into one register, like EAX. And then what? And then we move the bytes again into that memory location. Yeah, exactly. So yeah, so into EBP minus 4, and EBP minus 4 is the location of C. Now that C in the C program is originally a float. So those two instructions just take that constant value and move it into EBP minus 4. Why does it use that register EAX? I honestly don't know off the top of my head. I don't know why. The only reason I can give is compilers. So it's some kind of compiler optimization. Either the instructions were shorter doing it in two steps like this, or maybe it's faster, one of those two options. Then the last thing we're going to do, so this takes care of these three statements. Now we need to do the addition. So we're going to first move EBP minus 8 into EAX. And EBP minus 8 is B. So we move the variable B into EAX. Then we add EAX to EBP minus C. And the add, remember, takes the first parameter and adds it into the second parameter. And where does it store it? It stores it in the second one. So actually the way I remember, especially when we have an example like this, is I look for the constants. Like here, subtract 10, ESP. So what's this doing? It's subtracting 10 from what? From ESP, the second one. And then where's it storing it? Is it storing it into 10? No, that doesn't make sense. It has to store it in the second one. So that's how subtract works, and add works the same way. So it takes EAX, which is B. And it says add it to whatever is at EBP minus C, which is A. And then store that back into A. So that in effect does both of those. Okay, so let's visualize this. Let's see how this looks. So we have that code we just looked at on the right. We have our awesome stack. High to low. It's at some stack pointer when we start. We don't know exactly where it is. Every function invocation can be different depending on our program.
Even the same function invocation run at different times, as we saw. The environment will shift our stack. And so our stack can change based on just where our program is run. So let's say we have hex 10000. And now we have registers EAX, ESP, and EBP. So these are the only ones that are used here. So originally, we're saying, okay, the stack is originally going to be at 10000. And so this first instruction is going to move the value inside ESP into EBP. So that's going to move that there. And then we're going to subtract 16, hex 10, from ESP. So it's going to be FFF0. And then that's going to move the stack down now. But where's EBP pointing to? The original 10000, 16 bytes up, the original location of ESP. So this is why, while this function executes, the stack pointer can change. But the base pointer is going to remain fixed for this invocation of this function. And so this proceeds just as we saw. So it's going to move 10 into EBP minus C. So remember this syntax here says take the value of EBP, right? Subtract C from it, which is 12. Yes, 10, 11, 12. 12, which is going to be some offset down here. So these are four bytes. It'll be what? 1, 2, 3 here. And it will say, OK, move 10 into there, into that memory location. So the important things here are this offset, the negative 0xC. That, whatever it is, is added to EBP. And then it's dereferenced. So the parentheses are a dereference. So I have all the addresses here. So that's going to put A right here. So 12 bytes down from EBP is going to be the value of A. Then we have the same thing. So EBP minus 8 is FFF8. And we're going to move the constant value of 0x64 into there. Then we're going to move this crazy, the IEEE floating point representation of 10.45. We're going to move this into EAX. And then we're going to move this into EBP minus 4. EBP minus 4 is FFFC. So that value is going to be there. And then finally we're going to move EBP minus 8.
So the memory at EBP minus 8, which is A, 0xA. We're going to move that into EAX, or A. Yes, good, 64. I was just checking this to make sure you were paying attention. So we can see on here that A is here, B is here and C is here. And this is how they are laid out on the stack. And we know that because we can look at this x86 code and we can match it to the C code and say, hey look, I know that I set 0xA into integer A. So then I know from here I'm moving the constant A into EBP minus C. That means in this function, EBP minus C is the variable A in the original program. So then we move hex 64 into there. And now we're going to take EAX, right, hex 64, and add it to EBP minus C. And we know that that is, anyone good at hex addition? It should be 110, right? 110 in decimal. Which is? 6E. Okay, yeah, so we've done this program, we've stepped through and we've seen exactly how the stack is being used for this function frame. So the idea here is that this function, which I can't remember what we called it, it owns essentially this memory location, right? And this is how it's storing the local variables of the function. Questions on this example? So in the second step we subtracted 10 from the stack pointer. Yes. So why is it the constant 10? Because it depends on the size of the program, that's why we are subtracting the constant 10. Not necessarily the size of the program, but it does depend on the program. Yes. So the compiler looks and it says, how much memory do I need for the local variables? Right, so that's why the compiler knows the exact amount. So in C, this is why, for performance reasons, right, and probably others, you have the sizeof operator that can tell you exactly how many bytes every type uses in your program. So this way the compiler can tell, for this function it knows all the local variables that are declared.
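The walkthrough above can be replayed in a small simulation. This is a sketch only: it models the top of the stack as a byte array, uses the starting ESP of 0x10000 from the example, and mirrors each instruction in a comment. The names `store32`, `load32`, and `run_frame` are made up for illustration, and the assembly in the comments is a reconstruction of what the lecture describes, not verified compiler output.

```c
#include <stdint.h>
#include <string.h>

#define STACK_TOP  0x10000u  /* starting ESP used in the walkthrough */
#define STACK_SIZE 0x100u    /* only addresses 0xFF00..0xFFFF are modeled */

static uint8_t stack_mem[STACK_SIZE];

/* Write a 4-byte value at a simulated stack address. */
static void store32(uint32_t addr, uint32_t value)
{
    memcpy(&stack_mem[addr - (STACK_TOP - STACK_SIZE)], &value, sizeof value);
}

/* Read a 4-byte value from a simulated stack address. */
static uint32_t load32(uint32_t addr)
{
    uint32_t value;
    memcpy(&value, &stack_mem[addr - (STACK_TOP - STACK_SIZE)], sizeof value);
    return value;
}

/* Replay the instruction sequence; returns the final value of a and
 * reports where ESP and EBP ended up. */
uint32_t run_frame(uint32_t *esp_out, uint32_t *ebp_out)
{
    uint32_t esp = STACK_TOP, ebp, eax;

    ebp = esp;                 /* mov  %esp,%ebp                    */
    esp -= 0x10;               /* sub  $0x10,%esp   reserve 16 bytes */
    store32(ebp - 0xC, 0xA);   /* movl $0xa,-0xc(%ebp)      a = 10   */
    store32(ebp - 0x8, 0x64);  /* movl $0x64,-0x8(%ebp)     b = 100  */
    eax = 0x41273333u;         /* mov  $0x41273333,%eax     10.45    */
    store32(ebp - 0x4, eax);   /* mov  %eax,-0x4(%ebp)      c        */
    eax = load32(ebp - 0x8);   /* mov  -0x8(%ebp),%eax      eax = b  */
    store32(ebp - 0xC, load32(ebp - 0xC) + eax);
                               /* add  %eax,-0xc(%ebp)      a += b   */
    *esp_out = esp;
    *ebp_out = ebp;
    return load32(ebp - 0xC);
}
```

After `run_frame` returns, ESP is 0xFFF0, EBP is still 0x10000, and the slot at EBP minus C holds 0x6E, which is 110 in decimal.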
So it can look and see, okay, what's the type of each of these local variables, so how many bytes do I need to store these here? And then sort of determine, so here it was 12, right, 12 bytes, but then why does it move 16? Return address. The return address? We haven't seen the return address. Yeah, return address. We'll see that. We'll see that. Oh, our return value. Did we talk about cdecl? Wait a second. Did we talk about cdecl yet? Maybe we're getting to that now. Okay, we'll see. The return value is actually put into EAX. So the register EAX is used for return values. And what if the program is dynamic, like we are allocating memory dynamically in the program? With malloc? Yeah, with malloc, like we are making some dynamic program, so that you don't want the program to move to this side almost. Right, so kind of as we saw, we're just looking right now at the stack allocation, right? So on the left, right, you have the stack that's growing down, and below that you have the heap that's growing up. So all the malloc is happening on the heap, right? Which is something completely different, which we're not worrying about now. But you actually can't, this is why in C, you can't write an array that has a variable length, like a stack allocation of a variable length array. You have to use malloc in order to dynamically create that, because the compiler must reserve the space for you, like this. So why 16 and not 12? What was that? Isn't 12 a multiple of 2? Yeah, it's not a power of 2. Yeah, it's not a power of 2, it's a multiple. What was that? A power of 2 is 2 to the n. So but why? Because we're dealing with binary, I like that. It's probably convenient because it's a byte and so it probably does it in bytes. In increments of 16? Yeah, so it actually goes back to my original cop-out answer. It's compilers. The compiler decided that this is the optimal way to do it based on its knowledge of offsets.
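The 12 versus 16 question can be worked through numerically. The following is a small sketch, assuming 4-byte `int` and `float` (true on x86 Linux) and assuming the compiler rounds the frame up to a 16-byte multiple, which is what the example's `sub` of hex 10 suggests; `round_up` and `locals_size` are hypothetical helper names.

```c
#include <stddef.h>

/* Round n up to the next multiple of align.
 * align must be a power of two for the mask trick to work. */
size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Bytes needed for the example's locals: int a, int b, float c. */
size_t locals_size(void)
{
    return sizeof(int) + sizeof(int) + sizeof(float);
}
```

With these, `locals_size()` is 12 and `round_up(locals_size(), 16)` is 16, matching the 16 bytes the compiled function reserves.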
Because on some systems, if you're not addressing bytes at an aligned offset, you can actually get a segfault or some type of an exception. So that's basically why. So the compiler people have hopefully done the measurements or whatever to say, hey, even though we're technically using more memory, right, it's actually faster to use that because it's byte-aligned. Exactly, so it is because it's a, I think it's even a power of 4-aligned. Cool. Yeah? Why are we storing the offsets? What do you mean? We're storing them, you mean? So when the compiler uses minus 0xC plus EBP, why are we using offsets and not the actual memory address? Ah, because this is a function, right? The function can be executed anywhere. We don't know exactly where the stack is going to be when it executes. But we know that that stack pointer, when it executes, is going to be at memory we can use. We know we can move the stack pointer down to get more memory. So that's exactly what we do, is we say, okay, then I'm going to use this stack, this part of the stack is mine, and I'm going to use it. So, you know, the fact that it had A at minus C and B at minus 8, that's just the compiler, it could have swapped those. But the important point is that it compiles it like this, right? So that's constant. Every time in the original program you reference A, here you reference EBP minus C. If you have a pop command, would it pop the value of A? If you use a pop command, it will pop whatever is at FFF0. So pop is, take that value in there, put it into somewhere else, and then increment ESP. But because ESP here is pointing, actually not even at A, it would put garbage somewhere. If you have to, for example, you don't, like you would expect a pop would produce a, like some file extra. Ah, no, if you're the programmer, you're trying to do a pop, right? So you're trying to write x86 instructions? Yes. You have absolutely no idea what the stack is going to be at that point, right?
Unless you actually put something on the stack. So because you, as a programmer, you can't make any assumptions about what's on the stack, right? There's nothing in the C standard that says this is exactly how it has to go. This is just how it goes based on convention, essentially. It could allocate tons more memory for frames. It doesn't change your program at all. Your program semantics are still fine. It's probably a little bit slower. So function frames are great, right? Absolutely necessary. One of the fundamental parts of writing recursive programs. And so it automatically allocates memory for us for the local variables, so we don't have to do that. But there's other things we need when we call a function. So when you call a function, what are all the things, so we talked about one of them, right? We talked about the return value. What are some other ones? Parameters, right? We have to pass parameters, right? What else? So we're going to invoke some other function. Yeah. The location to call back to? Yeah. So, right? We're going to invoke some other function, which means we're letting the CPU start executing code at some other location. But presumably we want that function to do something and then come back to us, right? So we need some way to say where to come back to. What about if we're using a base pointer, right, and the other function, however it's written, is probably using a base pointer to do its offsets. And so we need our frame pointer. We'll see, we need to have our frame pointer be the same when that function returns. The return address, right? And as we saw, the local variables and any temporary variables. So all of these things are stored on the stack for every function invocation, or can be, depending. So let's look at this. So this is where it gets into. Okay, I should have looked ahead. So the calling convention.
So this is what specifies, so the idea is, to invoke a function, right? We have to store all of that information. We have to store the parameters. We have to store the local variables. We have to store the return address, right? Where do we want that function to return to? We have to store our base pointer because that new function is going to use its own base pointer, right? So we need to make sure that doesn't get clobbered. And so the calling convention basically establishes who is in charge of storing that information, right? Does the caller do that, or does the function being called do that? The caller. So the answer is both. So the convention dictates who does that. So the caller does some things, but the callee does other things. And so this is actually something that's really interesting because it varies based on the processor. So in x86, the convention is to use the stack to pass parameters. On ARM, the convention is to use the registers to pass parameters. But even not thinking about the processor, it's also based on the operating system. So Windows has a different calling convention than Linux. And then even on Linux, your Linux programs use the cdecl calling convention, which we'll see, which is different than how you make function calls to the kernel with syscalls. So those are a different calling convention. And so yeah, it could even depend on the compiler or the type of call, like syscalls versus normal user function calls. And all this is, is a convention, right? So the important thing is that you're able to read this and understand this, and specifically you want to be familiar with the cdecl calling convention because this is the standard on x86 Linux. So the way this works, the caller first pushes arguments onto the stack from right to left. So if you look at the stack and you go up the stack, the arguments will be left to right. Just how they are in the function call in C.
So you take the rightmost one, you push that, and then push the second from the right and then the third from the right. So you've pushed all the arguments. So first is going to be all the arguments. Then the caller has to push the address of the instruction after the call. So this is going to be stored on the stack so that that function knows where to return to, because otherwise it has no idea of how to get back to the function that called it. And so the callee, what it does at this point, the base pointer is still the caller's base pointer. EBP is still the caller's base pointer. So if the callee is going to use it, which for some functions it may optimize out, it doesn't need to use a base pointer, there's maybe no local variables. So the callee can push the previous frame pointer onto the stack, and then it will create space on the stack for local variables. And then it has to ensure that the stack is consistent whenever it returns. Why is this? Why should it care? It's just doing its thing. If the calling function pushed something onto the stack and then calls a function, the stack has to be at that same place afterwards so it can pop those things off the stack and use them. This is actually one of the key things. If you're ever doing this by hand, you have to make sure your stack is consistent when you exit a function, because otherwise everything goes crazy. And then finally it puts the return value into the EAX register. So this is how in C programs return values are done, through the EAX register. So what does this mean when you call a function? What can you expect? When you call a function as a caller, can you store some important information in EAX? No, because it's going to get overwritten. Exactly, what this says is, hey, this EAX register is going to be overwritten. Furthermore, I think there's probably a little bit more depth here, because just like with EAX, the callee can obviously overwrite EAX.
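The order of operations just described can be traced with a toy model. This is only a sketch of the convention as the lecture states it, not real machine behavior: the stack is an array of 32-bit words indexed instead of byte-addressed, the return address 0x1234 is made up, and all names (`struct machine`, `callee_sim`, `call_example`) are hypothetical.

```c
#include <stdint.h>

#define WORDS 64

/* A toy machine: a word-indexed stack that grows toward index 0. */
struct machine {
    uint32_t stack[WORDS];
    int      esp;  /* index of the current top-of-stack word */
    int      ebp;  /* index of the current frame base */
    uint32_t eax;  /* return-value register */
};

static void push(struct machine *m, uint32_t v) { m->stack[--m->esp] = v; }
static uint32_t pop(struct machine *m)          { return m->stack[m->esp++]; }

/* Callee side. Returns the return address it popped on the way out. */
static uint32_t callee_sim(struct machine *m)
{
    push(m, (uint32_t)m->ebp);          /* save the caller's frame pointer  */
    m->ebp = m->esp;                    /* set up this frame's base         */
    uint32_t a = m->stack[m->ebp + 2];  /* first argument  (ebp+8 in bytes) */
    uint32_t b = m->stack[m->ebp + 3];  /* second argument (ebp+0xC)        */
    m->eax = a + b + 1;                 /* return value goes in eax         */
    m->esp = m->ebp;                    /* discard any locals               */
    m->ebp = (int)pop(m);               /* restore the caller's frame base  */
    return pop(m);                      /* ret: pop the return address      */
}

/* Caller side: performs the equivalent of callee(10, 40). */
uint32_t call_example(int *esp_restored, int *ebp_restored, uint32_t *ret_addr)
{
    struct machine m = { .esp = WORDS, .ebp = WORDS, .eax = 0 };
    int esp_before = m.esp, ebp_before = m.ebp;

    push(&m, 40);      /* rightmost argument first                     */
    push(&m, 10);      /* then the argument to its left                */
    push(&m, 0x1234);  /* made-up address of the instruction after call */
    *ret_addr = callee_sim(&m);
    m.esp += 2;        /* caller removes its two arguments             */

    *esp_restored = (m.esp == esp_before);
    *ebp_restored = (m.ebp == ebp_before);
    return m.eax;
}
```

`call_example` returns 51, and both flags come back set, which is the "stack is consistent when the function returns" requirement in miniature.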
And so I think for each register it's defined whether the callee has to save it before using it, so push it onto the stack and then use it, or if it can just use it directly. I actually don't know it off the top of my head but it's something to look up. And the compiler will handle all of those constraints. All right, so let's look at an example. So we have our function main, we have a variable a, and we are calling callee, passing it 10 and 40, and setting that return value to a, and then we're going to return a. Then we have our function callee, which is going to return a plus b plus 1. Pretty simple. Pretty simple program. So our main function, when we compile this, looking at the assembly, the main function is going to look like this. It's going to say push EBP. So main is a function just like any other. So at the start it's the callee, so if it wants to use a base pointer it's got to save the base pointer. So here, by pushing the current value of EBP onto the stack, it's now saved the previous frame's base pointer. Then we're going to move the stack pointer into EBP just like we saw, so we're setting up, okay, our base pointer is now where the stack pointer currently is, and now we're going to move the stack down here, 0x18. Then we're going to move 0x28, this is interesting, yes. Okay, so we saw this before and this was the local variable thing, but we'll see that this is a, not weird, but this is a compiler optimization. So it's actually moving the stack down so that it already has room for local variables and for the parameters to this function call. So it's going to move 0x28, so what's 0x28? 40. It's going to move 40 to ESP plus 4, and then it's going to move hex A, which is 10, to ESP. ESP plus 4 is above ESP, so here we have 40, 10, so it's in reverse order. So it's put onto the stack from right to left, so 40 is higher up on the stack and 10 is lower down on the stack.
Then it's going to call callee, and then it's going to say, okay, you're going to do whatever you're going to do, but I know when you return you're going to put the return value into EAX. So I'm going to move EAX into EBP minus 4, which is going to be a, which we refer to as a, and then we're going to move EBP minus 4 into EAX. So why are we doing this step? Yeah, so we're returning it, so this is the return a. But this seems crazy, right? It's moving it onto the stack and then moving it from the stack back into EAX. So could it just optimize this out and say, just leave it in EAX and return? You could do something else with your return value and then maybe do nothing. You could, but we don't do it here, right? Shouldn't it be smart enough? That's a tricky question. So this EBP is the memory location and this EAX is the register, so you put that thing in the register and then take it into the memory? Yeah, put this thing in the memory location and then take it into the register for the addition? We're not doing it. So this is just the function main, so we're doing no addition here. If you look at the function main, there's no addition being done. So over here you're storing this thing in the EBP, but you're not changing the value of the EBP after the next step, so you're storing this thing in the EBP? Yes, we're storing it onto the stack, at EBP minus 4, and then we're taking what's in EBP minus 4 and putting it into EAX. Yeah? Is it because, well, the first one is moving the return value, and it doesn't necessarily know that it's going to instantly move that back, so it wouldn't be maybe a very helpful optimization to pull this step out, because almost every other time you're going to be moving something else of value to the return from main? If I did a return callee(10, 40), would that be better?
So if I just wrote a return instead of, I think there would be better optimization. So if you wrote it like that, then yes, it would do that, because there are no local variables, so it will output it like that. But looking at the assembly code, you know as a human that you can optimize this by actually just removing these two lines, because you know that callee is going to place the return value in EAX and you're never using a otherwise, so you just delete those two lines. Right. I guess they are doing this thing due to multiple processors sharing the same memory stack, so if a context switch happens, the second processor might change the EAX value. So very close. So on context switching, the OS stores all the values of the registers, so you don't have to worry about your registers going away. But if we think about a multi-process environment, what if this a variable was being shared between two processes, with memory mapped between them? Then by not copying it back into there, that other process will never know that we changed that value, so by optimizing it out we're potentially changing the behavior of the program. Now in this case you could prove pretty conclusively that that's not going to happen in this program, but that's why it doesn't do these kinds of optimizations at the base default level, because it would rather be correct. This maps very clearly to each line and we know that that's correct, as opposed to applying the more aggressive optimizations that could change behavior. Do the -O optimizations actually remove this?
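The two ways of writing main that are being compared here can be put side by side. This is a reconstruction from the spoken description, so treat it as a sketch; the functions are renamed `main_with_local` and `main_direct` (hypothetical names) so both variants can live in one file.

```c
/* The callee from the example: returns a + b + 1. */
int callee(int a, int b)
{
    return a + b + 1;
}

/* As described in the lecture: the result goes through the local a,
 * which is where the store to and reload from ebp-4 comes from in
 * unoptimized code. */
int main_with_local(void)
{
    int a;
    a = callee(10, 40);
    return a;
}

/* The variant asked about: return the call's result directly, with no
 * local variable for the compiler to store into. */
int main_direct(void)
{
    return callee(10, 40);
}
```

Both variants return 51. Comparing the assembly a compiler emits for each, with and without optimization flags, is one way to explore the question raised above; the exact output depends on the compiler and version.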
I don't know; I was actually thinking it would be really interesting to look at that. So this was on a CentOS 6.7 compiler, so on the latest Ubuntu GCC it will probably generate different code, which is something I've seen: it's going to do something else even on simple, crazy examples like this. Okay, good.
Then we have leave. The leave instruction is the exact opposite of the push EBP and the move of the stack pointer into the base pointer; it's the opposite of these first two lines. It sets the stack pointer to the current base pointer, which gets rid of all this subtraction and moves it back to where it was, and then it does pop EBP to take that value that's on the stack (remember, we stored EBP at the start) and put that into EBP, which is going to be our caller's base pointer, which is going to point somewhere else up on the stack. And then ret says take the value on the stack and jump to it, and, as we'll see, start executing from there.
So a bit of terminology, not super important but a bit important. This part right here is not actually part of the function; it doesn't do any of the computation that the function needs to do. So we call this the function prologue, which is the part that's going to be on essentially every function to set up and take care of the calling convention. The same thing with the epilogue here: this is the part that has the least to do with the actual function itself, but it's important because it takes care of all of the calling conventions.
So if we look at callee, callee is going to do the same thing. It has its own prologue, so it's going to push EBP and move the stack pointer into the base pointer. Then it's going to move EBP plus C (plus now, because parameters are going to be above the base pointer) into EAX, and it's going to move EBP plus 8 into EDX. Then this is an add: the load effective address is going to take EAX times 1 plus EDX and move that into EAX. And even though there are parentheses here, there's no memory dereference, and that's what load effective address means, which is also confusing because you're not actually loading from a memory address. Then we add 1 to that, and then we pop EBP, which is the opposite of the push here, and we return. Notice here we don't have any local variables, so we're not changing the stack pointer at all; that's why there's no subtraction there. So here the prologue is a little bit smaller, with a slightly different epilogue.
Let's walk through this. So we have our callee, we have our main function. Now I need to know exactly where these instructions are going to be in memory. As we saw, all the code, all the data is going to be in memory somewhere, so at runtime these are going to be given memory locations. Actually at compile time, well, at link time: if it's not a relocatable executable, it's going to have fixed memory locations. So every instruction here is going to have one; these are just from my run. And actually kind of the interesting thing you can see from the differences between them is the size of each instruction: a push is just one byte, whereas a move ESP, EBP is two bytes.
Okay, so we have our stack, we have all the registers. We start at the top; let's say it's fd2d4, I think that was the value when I ran it one time. So the stack pointer is going to have the value fd2d4 where I start in main. We're going to first push EBP. I have EIP here, and EIP points to the next instruction to be executed, right, so this is how I know the next thing I'm going to do is push EBP. And when I do that, let's say EBP is something; this is supposed to be above us, right, higher than us. It's the base pointer of somebody else, who called function main. So we're going to first push that onto the stack, and the instruction pointer is going to change.
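As a reference point for the walkthrough, here is a minimal C sketch of what the source behind this disassembly plausibly looks like, reconstructed only from what is described above (two arguments, 10 and 40, an add via lea, then plus 1, and a local a that is stored and returned). The parameter names, and the name `caller` standing in for the lecture's main, are assumptions and not the code from the slides.

```c
/* Assumed reconstruction: two int parameters, result is x + y + 1,
   matching the lea (EDX + EAX*1) followed by the add of 1. */
int callee(int x, int y)
{
    return x + y + 1;
}

/* Stands in for the lecture's main (hypothetical name): the result of
   callee is stored in a local, the slot at EBP minus 4, then returned. */
int caller(void)
{
    int a;
    a = callee(10, 40);   /* 0xa and 0x28 in the disassembly */
    return a;             /* 51, which is 0x33 */
}
```

Compiling something like this as 32-bit code with optimization disabled (for example `gcc -m32 -O0`) is the kind of setup that tends to produce the store-then-reload pattern discussed earlier, though the exact output varies by compiler version.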
So we've done that: we pushed it onto the stack. Now that we've saved it, we're free to overwrite EBP, right? We can make our own base pointer. So now we're going to move the stack pointer into the base pointer, so now we have both of them pointing to this memory location, fd2d0. Then we're going to subtract hex 18 from the stack pointer, so now the base pointer points up to the top and the stack pointer points hex 18 down. What's hex 18? 16 plus 8, 24, so that's going to move 24 characters down, or bytes down, sorry. Then we're going to move hex 28 to ESP plus 4. ESP points right here, plus 4, so it's going to be here; we're going to move hex 28 into there. And then we're going to move hex A, or 10, into where ESP points.
So now remember, the calling convention said: hey, if we want to call a function and we want to pass parameters, we need to put them on the stack. Our parameters are 10 comma 40, so the rightmost parameter needs to be pushed first. So this is 40, and the next parameter is 10, which is pushed second on the stack. So if we look at the arguments on the stack, if we go up it's 10, 40, which is left to right, and if we go down it's right to left, 40, 10. So where we currently are, the stack is in a good state to call this function.
So the call instruction takes an address: it says start executing from there, set EIP to 8048394, which is the start of callee. But it also has a side effect, where it says also push the address of the next instruction to be executed, which is 80483bf, and that way this function knows where to return to, right? So it's going to first push that on there and then it's going to call the function callee, so it's going to change the EIP. And this is really important, because we're ceding control of our function to some other function. That function has complete control of the CPU, but we want it to come back to us; we want it to do some kind of computation for us.
I kind of think of it like Hansel and Gretel. Do you guys know the story of Hansel and Gretel? Kind of, some, like four of you. So it's a horrible, horrible fairy tale, but the basic idea is these little kids want to go exploring in the woods, so they took a loaf of bread and they left a trail of bread crumbs behind them, and that way however far they got, they knew they could go home by following the bread crumbs. These addresses are just like that: they're the bread crumbs of the functions, to know how to get back. We'll revisit Hansel and Gretel's fate in a bit.
Okay, so now we're in callee. So we do the same thing, right? Right now the base pointer is main's base pointer, but we're in a new function, so we have to save that base pointer. So we're first going to push EBP, which saves that base pointer, and then move the stack pointer into the base pointer. So now we're saying: okay, good, for callee we now know this is the base pointer, so when we execute this code every offset is going to be from this base pointer, but we can get back to main's because we've stored it on the stack. Then we move EBP plus C. So EBP plus C is what? 1, 2, 3: it's going to be hex 28, right? So which parameter number is this going to be?
40, the second parameter, right. And see, the compiler knows because of the calling convention: hey, I know that right at the base pointer is the saved base pointer, because I put that there, and then above that is the saved return address, and then above that is the first parameter, and then above that is the second parameter. So that's three up, 1, 2, 3, so that's EBP plus C to get to that second parameter. So it's going to move that into EAX. And so here we can see this is the function frame for main and this is the function frame for callee, right? And as your program executes, every call is going to continually change this layout.
So we're going to move hex 28 into EAX, we're going to move hex A into EDX, we're going to add them together to get hex 32, and then we're going to add 1 to EAX. Now we're done: we've done our computation and we've put the return value, hex 33, in EAX. Now we need to reverse the prologue. We're going to pop the saved base pointer into the base pointer, because we're restoring our caller's base pointer. So once I pop that, my base pointer points up to the top, and because I've popped, the stack pointer moves up. And now ret is essentially the same as pop EIP: take the value that the stack pointer is currently pointing at and put that into EIP, so it'll start executing from that value. The pop moves the stack pointer, so when we do that the stack moves up and EIP becomes 80483bf, which just happens to be the address of the next instruction in our program, and EAX happily now has the value that we want.
So that's the only thing that changed. Our base pointer is still the same, we're executing right at the next instruction, so to us nothing happened; we just called some function and it computed some value for us. So then we move EAX into our local variable, into EBP minus 4 (and now EBP is our EBP), and then we move it back into EAX. And then we leave, which is going to set the stack pointer to the base pointer, which moves that up, so the stack pointer is now good; we're not changing the stack anymore. And then finally, oh yeah, leave also pops, that's right: it moves the stack pointer and then pops into EBP, so that's the same EBP we had before, which was fd2c0, and now the person who called us has their base pointer put back into place. Does that make sense?
Okay. This is critical. It's kind of low-level stuff, but it's actually very critical to understanding stack overflow vulnerabilities. We're going to see how, as an attacker, we can take advantage of this in order to corrupt and change the control flow of the application.
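To make the bookkeeping above concrete, here is a rough C sketch that replays the walkthrough on a toy model of the stack and registers. It is only an illustration under stated assumptions: the `Machine` struct, the helpers `slot`, `push`, `pop`, and `run_walkthrough`, and the stack bounds are all hypothetical, and the instruction comments are a reconstruction of the steps described above, not the actual disassembly. The addresses fd2d4, 8048394, and 80483bf are the ones from the walkthrough; the caller's base pointer and return address are passed in as parameters because their real values are not pinned down here.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BASE  0xfd200u   /* assumed lowest address of the toy stack */
#define STACK_WORDS 64u        /* covers 0xfd200 up to 0xfd2ff */

typedef struct {
    uint32_t esp, ebp, eax, edx, eip;
    uint32_t mem[STACK_WORDS];
} Machine;

/* Map a toy stack address to its 4-byte slot. */
static uint32_t *slot(Machine *m, uint32_t addr)
{
    assert(addr >= STACK_BASE && addr < STACK_BASE + 4u * STACK_WORDS);
    return &m->mem[(addr - STACK_BASE) / 4u];
}

/* push: the stack grows down, so decrement ESP and then store. */
static void push(Machine *m, uint32_t v) { m->esp -= 4; *slot(m, m->esp) = v; }

/* pop: load from where ESP points, then move ESP back up. */
static uint32_t pop(Machine *m) { uint32_t v = *slot(m, m->esp); m->esp += 4; return v; }

Machine run_walkthrough(uint32_t caller_ebp, uint32_t caller_ret)
{
    Machine m = {0};
    m.esp = 0xfd2d4;                      /* ESP on entry to main */
    m.ebp = caller_ebp;                   /* base pointer of whoever called main */
    *slot(&m, m.esp) = caller_ret;        /* return address into main's caller */

    /* main: prologue */
    push(&m, m.ebp);                      /* push EBP */
    m.ebp = m.esp;                        /* move ESP into EBP */
    assert(m.ebp == 0xfd2d0);
    m.esp -= 0x18;                        /* subtract hex 18 from ESP */

    /* main: set up arguments, rightmost at the higher address */
    *slot(&m, m.esp + 4) = 0x28;          /* 40 */
    *slot(&m, m.esp)     = 0xa;           /* 10 */

    /* call: push the address of the next instruction, then jump */
    push(&m, 0x80483bf);
    m.eip = 0x8048394;

    /* callee: prologue, no locals so no subtraction */
    push(&m, m.ebp);
    m.ebp = m.esp;

    /* callee: body */
    m.eax = *slot(&m, m.ebp + 0xc);       /* second parameter */
    m.edx = *slot(&m, m.ebp + 0x8);       /* first parameter */
    assert(m.eax == 0x28 && m.edx == 0xa);
    m.eax = m.edx + m.eax * 1;            /* lea: address arithmetic, no dereference */
    m.eax += 1;

    /* callee: epilogue */
    m.ebp = pop(&m);                      /* pop EBP: main's base pointer is back */
    m.eip = pop(&m);                      /* ret: effectively pop EIP */
    assert(m.eip == 0x80483bf && m.ebp == 0xfd2d0);

    /* main: store the result in the local, reload it, then return */
    *slot(&m, m.ebp - 4) = m.eax;
    m.eax = *slot(&m, m.ebp - 4);
    m.esp = m.ebp;                        /* leave, part 1: ESP = EBP */
    m.ebp = pop(&m);                      /* leave, part 2: pop EBP */
    m.eip = pop(&m);                      /* ret to main's caller */
    return m;
}
```

Calling `run_walkthrough` with any placeholder caller values should end with EAX holding hex 33 and with EBP and EIP restored to exactly what was passed in. The saved base pointer and return address live on the stack right next to the function's own data, which is the property the attack discussion builds on.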