All right, so let's dive right in. Where we left off a week and a half ago was buffer overflows. We covered a number of attacks, and in your next homework assignment you'll be trying to find more of them in some programs. We'll talk about that on Wednesday, so sit tight. That will be an individual homework assignment. Part of this is studying one of the most classic security problems, which is the buffer overflow, or overwrite. It's actually an incredibly common vulnerability, so we're going to look at it historically and follow the progression of buffer overflows: not only how to find them, but how to exploit them to gain code execution inside of a process. And as we'll see, we'll then ratchet up the security mechanisms. We'll say, okay, what about ASLR now? How do we get around that? What about a non-executable stack? How do we get around that? For now we're taking a historical approach, but even though we're looking at binaries without all of these fancy protection mechanisms, there are a lot of binaries out there running on routers or other embedded devices that don't have these mechanisms, so these are incredibly common. And even most heap overflows, all of these memory-corruption-style vulnerabilities, stem from the same idea of writing past the bounds of memory. So it's incredibly important to understand the basics, because, you know, you've got to walk before you can run, right? I feel like I already did all this. Okay, so, in order to understand what a buffer overflow is, we first need to completely understand the stack. All right, so a stack is essentially scratch memory for functions. But more specifically, what is the stack, and how does it differ from a stack? It's memory, right? A stack is a data structure, abstract.
Yeah, so a stack is the abstract data type of a stack. What are the operations you have on a stack? Push and pop. What order do things come out? First in, last out, right? You push things on, you pop things off, as long as you get the order straight in your head. The stack in a program is specifically a memory location. It's actually a segment of memory, defined in the ELF headers that we saw, that essentially gives a program scratch memory. So why does a program need scratch memory? In other words, why don't we know in advance exactly how much memory a program uses and just allocate that and use it? The size of the input data could vary, right? Depending on whether you're parsing a one megabyte file versus a gigabyte file. What else? What if your data size is completely constant? Why would you still need scratch memory? For function calls. Do you need a scratch memory location to do that? You don't know how many function calls there will be, right? Every function call, as we'll see, has to execute, and then after a function is done executing, what happens? Does the program just terminate? No, that'd be a horrible way to program. What happens when a function is done executing? It returns. Where? To the function, or specifically the code, that called it. So it returns to right after where it got called. A function can be called from a hundred different places in a program, but when it returns, it only returns to the specific place where it was called, and that changes at run time depending on who made the call. And then, when you think about all the functions that execute starting from the beginning of the program, can you have multiple instances of the same function on the call stack, i.e. can you have a function call itself?
You can, but you need memory for all of those things, because each of those functions, if they have local variables, has its own local memory. So the stack is the answer to all these questions. It's actually a very basic component of low-level architectures. It starts at a high memory address and grows down. And we are going to draw the stack like normal people: we're going to start at all Fs at the top and go down to all 0s. Yes? Isn't that architecture dependent, whether the stack starts at a high address and grows down? Yes, that part is how the stack operates here. It needs to start one way and go one way, right? When you push things on, it goes, in this case, down, and when you pop, it goes up. But yes, definitely architecture dependent. And it's architecture dependent what the ISA, like ARM or x86 or MIPS, gives you in terms of manipulating the stack. For instance, x86 has a built-in push operation, which decrements the stack pointer and stores the value of a register onto the stack. So there's a dedicated stack pointer register. Whereas on other architectures you have to manage it yourself, but you can still do it. It's the same idea. And this is super important: in x86, the register ESP holds the address of the top of the stack. These ideas are very closely linked. The stack doesn't exist without whatever value is inside ESP, and the value inside ESP dictates where the stack is. So if I tell you the stack is at location 1,000, then you know that inside the ESP register is the value what? 1,000. Yeah, it has to be. Otherwise it doesn't make sense. So there are two basic instructions here, push and pop. They have very specific semantics that you just need to drill into your head. It's not hard; you can all learn and remember this. Push EAX means: decrement the stack pointer.
So change the ESP register, then store the value inside the register EAX onto the stack. Pop does completely the opposite. Working backwards, pop takes the value at the location ESP is pointing to and copies it into a register. So for instance, pop EAX takes the value where ESP is pointing, puts it into EAX, and then increments the stack pointer. Good? Stack experts? Do you say increment because it's building down on the stack? Increment, yes. When adding things, you're decrementing the stack pointer and going down the stack, and you're incrementing when going up the stack. We'll see examples of this. So, stack example: all Fs at the top. Clearly you could draw this either way. You could start at zero. It makes a lot more sense to me going down. I don't know why; I think I've just looked at it so much. Any time I look at it the reverse way, it just looks wrong. And I've even heard of some monsters who start the stack on the left and have it grow to the right. So the idea is, just like we said, if the stack pointer is pointing somewhere inside the stack, at memory location 0x10000, this means that ESP must have the value 0x10000 inside it. So even just from this, the stack pointer has the value 0x10000, which means the current top of the stack is at 0x10000, what do we know about the memory addresses that are greater than that? It's fair to say that's probably used data from function calls, right? It's used data, because we know that 0x10000 is the quote-unquote top. "Top" here is tricky, because the top of the stack keeps going down. If we add things by decrementing and going down the stack, then if we go up from the stack pointer we're in memory that is being used, right? Those are things that were pushed onto the stack.
So conversely, what do we know about the things that are lower than 0x10000? Garbage, right? Is it all zeros? No, not necessarily. As we saw, pushing and popping things on the stack only changes the stack pointer and maybe puts values on there. But when you pop things off, you don't erase what was there. So you have garbage, which in some cases could be useful garbage, depending on what you're trying to do. We'll get into that later. So say we have the instructions push EAX and then pop EDX. And then we have our registers. What's the value inside EAX? Let's say it's the value 0xA. We have EDX; it has the value, let's say, zero. And we know, because we talked about it, that the value inside the ESP register is 0x10000. If that's not the case, we've completely messed up. When I put an arrow on push EAX, think of it as the program counter, which in x86 is called EIP, the extended instruction pointer, pointing to that instruction, which means it's the next instruction to be executed. So if you're in gdb and you put a breakpoint there, then when you do si, for step instruction, it's going to execute that one instruction and stop on the next instruction. So it's going to do push EAX. What's going to happen? The stack pointer gets decremented. By how much? Four bytes. Why four bytes? The register is four bytes large. The register is 32 bits, which is the same as four bytes. It has to make enough room on the stack for the value of EAX, so it has to decrement the stack pointer by four. So ESP will be 0xFFFC, which moves our pointer down. Remember, this blue pointer is just imaginary. The blue pointer that was at 0x10000 now moves down to 0xFFFC, because that's what's inside ESP. Then what happens? Are we done executing this? We still have to store it.
Yes, we need to store. We only did half of our push EAX instruction. We decremented the stack pointer, and now we need to copy the 0xA. It's four bytes, so it's going to be 00 00 00 0A, or maybe, with the endianness, 0A 00 00 00, I don't actually remember. But when we look at it as an integer, we'll interpret those four bytes as 0xA, which is 10. Okay, now we're done. Now we've executed that instruction. Now we pop EDX. So what happens? Increment it? What was the order when we did the push? We moved the pointer and then stored. So when we pop, we need to do completely the opposite. Copy into the register first: take whatever is at 0xFFFC and copy those four bytes into the register EDX, which changes EDX to 0xA, and then increment the stack pointer by four. So now the stack pointer is going to point back up at 0x10000. And just like we said, the value we put onto the stack still remains in memory, that 0xA at 0xFFFC. But in our program's mind it's considered garbage, right? We can reuse and overwrite that address at will. So what essentially did I do, semantically, with these two instructions? A copy. A copy of what? Kind of a swap? You moved the value from A to B. Careful, it's not a swap; I think a swap would actually switch both of them. But yes, we essentially moved EAX into EDX. And that's the stack. This is the basic idea, and if you really understand this and internalize it on a deep level, you will find all of these stack-based vulnerabilities, and actually all of these memory corruption and overflow vulnerabilities, a lot easier. One of the main uses of the stack is what we talked about: function frames. So can somebody refresh us: what is a function frame? It's like a picture frame you put around your awesome function that you wrote.
Yeah, it's essentially everything that a function needs to execute, right? So when you're writing C code, if you want memory, how do you get memory? If you want, let's say, 10,000 bytes of memory, how do you get those 10,000 bytes? malloc. So you can call malloc. And then what do you have to do with those 10,000 bytes when you're done with them? You call free. As we'll see, malloc and free are libc functions you're calling into. But how come, when you're writing a function in C, this is your function foo, and you declare a character buffer of 10,000 characters, you write char bar[10000], do you have to malloc that? Do you have to free that? Why not? Yeah, so it's a local variable of the function, which means the lifetime of that variable is only during that function's execution. When that function returns, that data is no longer used. When we malloc 10,000 bytes, we're specifically telling the computer: I will worry about how long this memory is valid, and I will be responsible for freeing it after I'm done. Because maybe you return a pointer to those 10,000 bytes from that function, and it gets passed to another function and another function while you're using it. If you only need those 10,000 bytes for one function, then why use malloc? The reason would be if you don't know the size in advance, but that's another issue. So, yes? One of them goes on the stack and the other one goes on the heap? Yes, essentially that's the main difference. And the key thing to understand is that local variables don't strictly need to be on the stack, but they are on the stack essentially for performance reasons, as we'll see. It's incredibly easy to free these variables, whereas the heap has to do a lot more work.
So, the idea is, do you use, let's say, the stack pointer? Well, another question is this. Okay, we can say we use the stack to store local variables, but when I write a function and I have my local variables, int a and char b and my huge array that we just called bar, how does the compiler actually compile that so that every reference to a, b, or bar grabs that local variable from my stack? So you're a compiler. You see that I have this function defined with variables a, b, and c, local variables. And you see that I have a is equal to a plus b. How do you translate that into assembly code? Yeah? It would be an offset from, like, something allocated by the compiler, or from ESP. Yeah, so one thing is I could try to use the stack pointer as a base for offsets, right? I know ahead of time exactly how many bytes I need: each function uses a different, but known at compile time, number of bytes. For instance, with an int, we know it's four bytes. You can just run sizeof on all of these data structures to see what you need. So you could move the stack pointer and then say, okay, a is always at this fixed offset from the stack pointer, and variable b is at a different offset. Why is that not a good idea? Because it changes when you push and pop. Exactly. The stack pointer changes during function execution because of pushes and pops. Because how many registers, roughly, do we have? Does anyone know the exact answer? That's a question I don't know the exact answer to. General purpose registers? Six. Six. So what if I'm doing a complex calculation that involves, say, 10 intermediate values and variables, which is more than the registers I have? Do I just give up? No, I store stuff on the stack, right? As the compiler, I can store temporary values on the stack. I know exactly where they are.
I can pop them off when necessary. So for this reason we don't want to use an offset from the stack pointer, because the stack pointer is possibly changing. This is where the idea of the base pointer comes in. It's called the base pointer in x86. The other way of thinking about it is as the frame pointer: it points to a fixed location inside the frame of the function, and each local variable is at a different offset from that frame pointer. So I know that a will always be at EBP, and which direction would it be? Down. It should be down, right? We'll see why in a second, but EBP minus 4 will be a, EBP minus 8 will be b, and something like EBP minus 10,008 will be my huge buffer bar. So this register, EBP, just like the stack pointer, ESP, holds the value of the current function's frame pointer. Now we're going to go through a real example of a C program, a small C function that I've compiled, so you can see these different things. Here I have main, and I have int a and b and float c, so these are all local variables declared in this function. I set a equal to 10, b equal to 100, c equal to 10.45, a equal to a plus b, and return 0. So we go through, and we're the compiler. We need to calculate fixed offsets for a, b, and c. If we have a and b, which are both ints, can I have a at EBP minus 4 and b at EBP minus 6? Why not? They're four bytes each, so those would be overlapping bytes. That would be an interesting problem, but yeah, that would be very bad. That would cause massive problems. So I can say that a is at EBP plus some offset a, which as we'll see is going to be a negative number, then EBP plus offset b, EBP plus offset c, and then I can easily translate each of these statements. I can say, well, set the memory at EBP plus a to 10. Roughly; this is pseudocode, not real assembly code.
Similarly, set the memory at EBP plus b to 100, set the memory at EBP plus c, and so on and so forth, and then set the memory at EBP plus a to the memory at EBP plus a plus the memory at EBP plus b. And when this gets compiled, it's actually very straightforward. Depending on your compiler options, depending on the compiler version, depending on, I don't know, the cycle of the moon or something, you'll get different results for the x86 assembly here. For this example, the compiler said a is at EBP minus 0xC, so minus 12, b is at EBP minus 8, and c is at EBP minus 4. That's just where it decided to put things. I'm going to ignore the prologue for now, but the exact code is: move the constant 0xA into EBP minus 0xC. This is essentially the x86 assembly instruction that's exactly equivalent to the pseudocode we had. And similarly, move 0x64 into EBP minus 8, which is b, and then move, what's this crazy value? It's 0x41273333, into EAX. Yeah, it's 10.45 in floating point format. So then we move EAX. Why is this two instructions and not one? The second instruction is move EAX into EBP minus 4. Why two instructions, when the other two assignments are just one instruction each, like move 10 into EBP minus 0xC? Is it trying to fetch the value from that address? No, there's no dereference. If there were brackets around EAX, that would be a dereference. There's no dereference here, so it's literally moving 0x41273333 into EAX, and then that value is moved into EBP minus 4. Good guess, though. Is there a limit to the size of immediates when moving into memory? That's the right kind of thing to think about. But the real answer any time someone asks why a compiler does something is: because the compiler decided to do it that way. We'll see many examples where a compiler chooses one instruction over a semantically equivalent instruction, and the choices change over time.
One thing to think about in this case is that the instruction has to be encoded as x86 machine code. If you want to use a move with an immediate value, that 0xA has to be inside your instruction, right? So for move 0xA into EBP minus 0xC, you have to specify that it's a move instruction, you have to specify the immediate value, and you have to specify that you want EBP with a minus 0xC displacement. If you want to specify the whole 32 bits of 0x41273333, that's already four bytes of your instruction, plus the move opcode and the target with its displacement, so it gets to be a long instruction. It's not that x86 has no way to encode a 32-bit immediate store to memory. The compiler could do this in a hundred different ways, and it just chose to do it this way. All right. So then we can see the next move: EBP minus 8 into EAX. Sorry, I got those confused for a second. Okay, so take b, move it into EAX, add EAX to EBP minus 0xC, and store the result where? In EBP minus 0xC, yeah, onto the a value. That's it. That's the body of our function. And then we can step through this. I'm going to do this pretty quickly because I don't want to get too bogged down in the details here, but you should be able to walk through each of these instructions and draw and think about how the stack is changing. It may seem silly, but this is something I do on almost every CTF challenge where I have a stack and I need to figure out, okay, where am I, where am I overwriting, and what else is above me? Drawing it out really helps me every single time. So, okay, the idea is that this function starts executing.
So, we know the stack pointer is at 0x10000 because I told you that it is, and it is in this diagram, and we know it is because 0x10000 is inside the register ESP. What's currently in the EBP register? How did we get here? Not here in this room, don't rethink all the life choices that brought you to this room on this Monday, but how did we get to start executing this move instruction? Yeah, somebody else called us. Even though we're the main function, there are actually a lot of things that happen in your binary before you get to the main function. So somebody else called us. So what's inside EBP? The base pointer of the previous function, the one that called us. So, yeah, okay, sorry, this clicker is not working. Okay, so before we get here, we don't know what's inside EBP. Maybe it points to something above us. I'm actually missing part of the prologue here, and I'm upset about that, but we'll get to it later. What we need to do is set up our base pointer, because we need it for all those offsets we just calculated. We know where the top of the stack is right now, so we move our stack pointer into the base pointer. We just copy the value that's in the stack pointer into the base pointer. That puts 0x10000 into EBP, which means we now have two blue arrows, one at 0x10000 for the stack pointer and the other at 0x10000 for the base pointer. Was this in any way involved in any of the statements that we actually wrote? No. Did we say we want a base pointer? No, we just said we want a to be 10, b to be 100, c to be 10.45, and a to be a plus b. But as we'll see, the program has to have this prologue in order to set up the stack correctly so it can execute. The next instruction is sub 0x10, so subtract 16 from ESP.
So it's going to move the stack pointer down 16 bytes, essentially doing what for our function? Making space for local variables, exactly. It's essentially allocating 16 bytes on the stack for this function. How does it know 16? Well, the compiler decided that it needed 16 bytes. We had a and b, which are both four bytes, so eight bytes, plus c, which is a float. How many is that? A float is four bytes. A double is eight, I think. Pretty sure. Whatever, we'll say it's four. So we have four, four, four: we're at 12. So how do we get to 16 bytes? Is it for a equals a plus b, the addition operation? Well, no, we store that result directly onto the stack where a is. What was that? The compiler decided. Yeah, the compiler decided. For historical reasons, some architectures couldn't access memory on a non-four-byte boundary, and so compilers usually try to make sure that things like the stack pointer end up on aligned boundaries. So, I don't know. But it's not strictly necessary. You could write a compiler that didn't do this, or your compiler could decide that the variable c is never used, right? We just set c to 10.45 and never use it, so it could completely optimize that out. So there could actually be less space on the stack than we'd think from looking at the C code, which is why we always look at the assembly code. The C code can lie to you: you can think things are laid out on the stack one way, but the compiler ultimately decides what goes where. Okay. So we subtract 0x10, which is 16, from the stack pointer, which means that ESP now points down at 0xFFF0, while the base pointer still points up at 0x10000. We're then going to move 10 to EBP minus 0xC, which is going to be at what? 0xFFF4. And then we're going to move 0x64 to EBP minus 8.
We're then going to move that crazy 10.45 in floating point into EAX and then move that from EAX into EBP minus 4. So now we can see the exact memory locations of our local variables a, b, and c on the stack. And now we actually do the calculation. We take EBP minus 8, which is the b variable, which is 0x64, and move that into EAX. We then add EAX, which is 0x64, to EBP minus 0xC, which is at 0xFFF4, which is our variable a. So that adds 0x64 to a, which gives 110, or 0x6E. And we're done executing. You should be able to take this code and draw the stack, so that when I ask what the stack looks like at this location, or this location, you can answer. This is a key skill. All right, cool. So that's review for some of you, or an introduction, to function frames. As we talked about earlier, function frames are key. I just wanted to ask, do you know the time frame for the exams yet? No, that's a good question, but I'll decide very soon. Not before break. I'll say not right after, either. We'll see. That's a good question. Okay, yes. So, we talked about this. We need local variables, which need to be stored on the stack. What other information does a function need when it gets called? Arguments. The arguments to the function. How does the function return its value to the function that called it? What was that? Yeah, so we need a return value. We need parameters. And the other thing we need is a breadcrumb. We need the return address. Because you can think of every instruction as having complete, exclusive access to the CPU. The CPU does not know that it's inside of a function executing. That's just an illusion that we've created. All it knows is: I'm going to parse that next instruction. I'm going to do whatever it says. I'm going to execute it. I'm going to go on to the next one.
And I'm just going to keep doing this over and over again, because that's what I do. That's my job. So when we finally get to the end of a function, and like we talked about, there are a hundred invocations of that function throughout the program, which is the caller that we actually return to? I think of it as a breadcrumb, some way to say, hey, this is where you go after you're done executing. That's the return address. And then local variables and temporary values; as we saw, the stack is super handy for those and is used all the time. So in an abstract way, this is all the information that a function needs in order to not only execute properly, but to return to whoever called it and give back those values. How those values are passed and expressed is a separate question. Do you push the arguments on the stack? Do you put the arguments in variables? I'm sorry, not variables. Do you put the arguments in registers? What does a called function guarantee about certain registers? Does it say that it won't change certain registers? Does it say that registers can be changed when the function is called? For all of this, there's no technical reason that one choice is better than another. I mean, maybe there was back when things were first designed. But essentially, the important part here is the convention. It's a convention, and if you follow this convention, you can call a C function. Which is actually how you do super cool things, like call a C function from Python code. Python knows how to call into that C code, and it can do that and then transform the values back. You can do the same thing with Java. You can call from Java into C code because of this calling convention. So this is also something you need to burn into your brain. I mean, don't hurt yourselves, but this calling convention is incredibly important. And actually, saying this information must be stored on the stack is a misnomer.
It doesn't have to be stored on the stack, but in the calling convention that we're talking about, it will be. And the important thing is that the convention varies based on the processor, so different architectures have different calling conventions, x86 versus ARM. It varies by operating system, so on x86 the calling convention is different on Linux than it is on Windows. It varies by the compiler, and by the type of call. Even on Linux x86, if you're calling another function, you use the cdecl calling convention, the one we're looking at, but if you're calling into the kernel, making a syscall, you use a different calling convention. I think it's called the syscall calling convention, but I actually don't know. So these are all different, and you need to know what language, essentially, you're speaking, otherwise you can't call a function. The main one we're going to look at is the cdecl calling convention. The caller has to do these things in this order, otherwise everything burns. And this is again arbitrary, the ordering, all this stuff. It's like learning any kind of standard, like IP or TCP. It's not handed down to us from on high. People came up with this and decided, yes, we'll use this. So, the caller pushes the arguments onto the stack in right-to-left order. Think of the stack moving down. The rightmost argument is the first argument pushed onto the stack. The next argument, underneath that, is the second from the right. And so on, until the leftmost argument is the last argument pushed onto the stack. The nice thing about this is that it allows you to have functions with any number of arguments, right? Because as long as you've got the memory, you can keep pushing arguments onto the stack. So why right to left? There actually is a good reason for right to left, but I don't want to get into it now.
So, think about why right to left as opposed to left to right. The next step is that the caller pushes the address of the instruction after the call. This is used as that breadcrumb, so that when the function we're calling is done executing, it knows where to jump to. And that's it. That's all you have to do to call a function. You just do this stuff, bang, call, good. The callee, the function that gets called, has to, as its part of this interaction, save the previous frame pointer, because it's going to create its own base pointer. So it saves the base pointer of the function before it, because, as we can see in this calling convention, the caller doesn't have to worry about the base pointer. The caller doesn't have to save it; the callee has to. So the callee pushes the previous frame pointer onto the stack, then creates space for all of its local variables, and then, after it's done executing, ensures that the stack is consistent upon return. Why is that important? As the caller, I don't want to care about the implementation details of your function. I don't care if this function uses 10 bytes of memory on the stack versus 1,000 versus 1,000,000. As long as you give the stack back to me in exactly the same place as when I left it, I'm good and I'm super happy. There are actually a lot of interesting design decisions at each of these levels about abstraction: if you want to call a function, how much information do you need? And finally, the callee puts the return value into the EAX register. This is how a function returns a value to its caller. Cool. Questions? Yes? Are all the registers saved on the stack before the function is called? The register values? Honestly, the answer is I don't know. There's a contract for each register.
So clearly you can see that the callee is going to clobber EAX, right? So before you call a function, if you have any important data stored in EAX, you've got to move it, because it's going to go away. For the other registers, I believe it depends. For every register, I think it's defined whether a function has to save it before using it and restore it afterwards. The details I don't know. As we'll see, it's the same thing we do with the frame pointer: we store it on the stack at the beginning, use the register, and then before we return, put the value that's on the stack back in the register. So you can think of the registers as basically cubbyholes or something: they already have stuff in them and you want to put your stuff in there. So you take what's there, you put it on the stack in a certain order, and when you're done, you put it back in the right location so that it looks like you were never there. There are instructions, pushad and popad, that push and pop all of the general-purpose registers to and from the stack in case you need that, but whether they get used depends. Yeah, I was going to say the compilers are very good about optimizing this kind of thing. If they can use only the EAX register, they'll only use that, so they don't need to store any other registers, which you can actually see in our little snippet, right, because we don't use any registers besides EAX. But yeah, there are all kinds of functions and all that stuff. So ESP is for the compiler then, and we're using the frame pointer all the time, right? Say again? We're not using the stack pointer at all, we're using the frame pointer. Right: for referencing our local variables and, as we'll see, our arguments, we use offsets from the frame pointer. Yeah, not ESP, correct.
But there are weird things; compilers will sometimes optimize. I don't want to get into it, but sometimes you'll see weird things. You'd think that before a function call you'd see a bunch of pushes, pushing the arguments onto the stack, but oftentimes the compiler uses moves instead, because it has preallocated space on the stack, so it knows it can write the values there. It doesn't actually push anything; it's just a move into that space. Yes. All right, so we will walk through this. We have very short C code: int main with a local variable int a; we say a is equal to callee, which is a function call passing parameters 10 and 40; then we return a. Then in our callee function we have int callee, int a, int b, and then return a plus b plus 1. So this is just a very simple function to show exactly what happens. Here I'm going to show all the assembly code for main and for callee from when I compiled this, whenever I did this. The main function is: push EBP; move the stack pointer into EBP; subtract hex 18 from ESP; move hex 28 into the location at ESP plus 4; move hex A, which is 10, into the location at ESP, which is actually an example of what I was just talking about, moving rather than pushing; then call callee; then move EAX into EBP minus 4; move EBP minus 4 into EAX; leave; and then ret. The first three instructions, push EBP, move the stack pointer into EBP, subtract hex 18 from the stack pointer, are the function prologue. Almost every single function, unless the compiler can optimize these out, will have essentially these same three instructions. So what's the very first one, what's push EBP doing? Okay, why is it doing it? That's my test. It's saving its caller's base pointer onto the stack so that it can restore it later.
Then moving the stack pointer into the base pointer sets up the base pointer for this function, and subtracting from the stack pointer allocates memory on the stack for local variables and, as we can already see, essentially preallocates space for future stack operations when calling functions. We also have an epilogue, the leave and ret at the very end here. We'll go into the exact details of what they do, but leave is essentially a move of the base pointer into the stack pointer. So it puts the stack pointer back where the current base pointer is, which undoes the subtraction at the end of the prologue. So after this instruction, the base pointer and the stack pointer are going to be at the... oh, and then it does a pop EBP, that's right. It does a pop EBP, so it actually undoes all three of these. Yeah, that's right. It puts the base pointer back to what was saved on the stack earlier. We'll look at it in a second. And ret, as we'll see, transfers execution back to the function that called it. callee is similar: push EBP, move the stack pointer into the base pointer. Interestingly enough, it doesn't have a subtract from ESP. Why is that? What was that? There are no local variables. We don't need any local variables here, so why allocate, why emit the instruction at all? Then we're going to move EBP plus hex C. So what is EBP plus C likely to be? If EBP minus something was what? A local variable. Then EBP plus something? Arguments. Yeah, it's going to be arguments. So move EBP plus C into EAX. Move EBP plus 8 into EDX. And then this is where, as I said, you sometimes see weird things. This is a load effective address, lea, of EDX plus EAX times 1 into EAX. What is that doing? Yeah, so it's essentially take EDX, add EAX to it, and then add 1 to it, if you do that base, displacement, offset nonsense.
So this is literally just add EDX to EAX and then add 1, combining it all together in one instruction. And then... oh, sorry, I was wrong. Okay, you're right. Whoever said it was right: this one just adds EDX to EAX and stores the result in EAX. So why is it not just an add of EDX to EAX? That's a good question. I tried this in a newer version and it did compile that out. Who knows? Systems are weird. So then we have add constant 1 to EAX, and then pop EBP and ret, because we're done. So again, the prologue at the top of callee is push EBP, move ESP into EBP, and the epilogue here is pop EBP and then ret. We're going to walk through it. Yes, so why not a leave instead of a pop? You could have a leave, but the stack pointer and the base pointer are already at the same location. Part of what leave does is move the base pointer into the stack pointer to undo the subtraction, the allocation there. Here they're already at the same location, so all you need is the pop EBP, because that's the other part you want from it. Okay, so on this optimization point: in main, the last two move lines store EAX onto the stack and then load the same value right back into EAX before returning. So those last two moves cancel out? Oh, yeah, yeah. So this was not compiled with any optimizations or anything. If you compiled it with -O3, it would probably compile out a lot of these things. But the shorter epilogue is some kind of optimization that the compiler decided it can do, and my guess is that's because it's not actually our code at all. It's not optimizing our code, it's optimizing essentially its own code, because it controls the prologue and the epilogue. So if it sees that this function uses zero local variables, it can emit the shorter prologue and epilogue. That's my guess. Cool. Okay. So let's look at this.
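To make the callee's body concrete, here is the C function as described in the lecture next to a register-level model of the unoptimized instructions. The model function `callee_model` and its local variables named after registers are illustrative assumptions; the instruction text in the comments follows the spoken walkthrough rather than an actual disassembly listing.

```c
#include <stdint.h>

/* The C function from the lecture. */
int callee(int a, int b) {
    return a + b + 1;
}

/* Instruction-by-instruction model of the unoptimized body, with C
 * variables standing in for the registers. */
int callee_model(int a, int b) {
    uint32_t eax, edx;
    eax = (uint32_t)b;     /* mov eax, [ebp+0xc]  ; second argument */
    edx = (uint32_t)a;     /* mov edx, [ebp+0x8]  ; first argument  */
    eax = edx + eax;       /* lea eax, [edx+eax]  ; address arithmetic used as an add */
    eax = eax + 1;         /* add eax, 1 */
    return (int)eax;       /* the result is left in eax for the caller */
}
```

With the lecture's arguments, `callee(10, 40)` and `callee_model(10, 40)` both give 51. The `lea` line computes an address expression without touching memory, which is why it can stand in for a plain addition.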
We know these instructions are all going to be at fixed memory addresses. You can load this program in GDB, or you can use objdump to see them, because we know the ELF header specifies where all the instructions will be loaded into memory. So if you look at the file, it will tell you the exact memory location inside the program of each of these instructions. Here I've essentially laid out the addresses next to all the instructions of callee and main. And if you're super detail oriented, you can start calculating the deltas between them to see the variable-length instruction encoding here. Okay. So then we have our beautiful stack, and we have our registers. We actually use five registers in this whole program: EAX, EDX, ESP, EBP, and EIP. I think I actually tried this on a real operating system, and the stack pointer was at 0xfd2d4 when I did this, so that's where the stack pointer is here. So now we've just started executing main. Main just got called from somewhere. The other thing is that EIP will constantly change based on what we're about to execute; I update it just to be completely consistent, but I'm not going to talk about it anymore. Well, maybe when we get to the call. Okay. So EBP is going to be something above us. We don't know what it is. We don't actually care if it's null or whatever. We just know that somebody called us and we've got to deal with their base pointer. So we first push EBP, and we know how pushing works: that moves the stack pointer down. We then move the stack pointer into the base pointer, essentially getting rid of the previous stack pointer, sorry, the previous base pointer, which is fine because we've just saved it on the stack, so we can get it later if we need it. Then we subtract hex 18 from the stack pointer, creating space on our stack for our local variables. We are then going to move hex 28 into ESP plus 4. So what's hex 28? 40.
So what was it in the context of our program? The second argument to callee, right: it was callee(10, 40). Then we have a move of hex A to where the stack pointer points. So now, even though we didn't do any pushing, at the current stack pointer location is the leftmost argument, hex A, which is 10, and four bytes above that is the second argument, hex 28, which is 40. So then we call. But what do we need to do to call this function? Yeah, we need to save the address of the next instruction that would be executed, which in this case is 0x80483bf. That's the next instruction. We want to push that value onto the stack. And what else do we simultaneously want to have happen? Yeah, change the instruction pointer. To where? The address of callee, 0x8048394. So x86 has jump instructions, which I believe you got pretty familiar with on the last homework assignment. I'm seeing some dirty looks, yes. There was a lot of weird jump stuff, which was fun to play with. A call is similar to a jump, where all a jump does is change the instruction pointer to whatever the target is. In this case the target is 0x8048394, but if we just jumped and started executing from there, we'd have no way to get back, because the CPU doesn't magically know that it was called from somewhere. So call does two things. It pushes the address of the next instruction that would be executed, 0x80483bf, which, because it's a push, decrements the stack pointer, and then it sets the instruction pointer to the target address. That's all it does. It's not smart. There are no smarts going on anywhere. It's all very discrete, simple operations. So now callee executes, and this is why we teach recursion: exactly the same thing happens from callee's perspective as from main's perspective, right? When main started executing, it had no idea how it got there. It didn't care who called it.
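The difference between a jump and a call can be written out as a small C model using the addresses from the walkthrough. The `Cpu` struct, `STACK_BASE`, and the helper names are assumptions made for this sketch, and the starting `eip` of `0x80483ba` is inferred by assuming a 5-byte call instruction that ends at `0x80483bf`.

```c
#include <stdint.h>

#define STACK_BASE 0xfd2e0u      /* assumed: first address above the modeled region */

typedef struct {
    uint32_t eip, esp;
    uint32_t stack[32];          /* stack[0] holds the word at STACK_BASE - 4 */
} Cpu;

static uint32_t *word_at(Cpu *c, uint32_t addr) {
    return &c->stack[(STACK_BASE - addr) / 4 - 1];
}

/* jmp: only the instruction pointer changes, so there is no way back. */
void do_jmp(Cpu *c, uint32_t target) {
    c->eip = target;
}

/* call: push the address of the next instruction, then jump. */
void do_call(Cpu *c, uint32_t target, uint32_t next_insn) {
    c->esp -= 4;                         /* the push moves the stack down */
    *word_at(c, c->esp) = next_insn;     /* the breadcrumb back to the caller */
    c->eip = target;
}

/* Helper to inspect the word currently on top of the stack. */
uint32_t top_of_stack(Cpu *c) {
    return *word_at(c, c->esp);
}
```

Starting from `esp = 0xfd2b8`, `do_call(&c, 0x8048394, 0x80483bf)` leaves `esp` at `0xfd2b4`, `eip` at `0x8048394`, and `0x80483bf` on top of the stack. `do_jmp` would change only `eip`.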
All it knows is: I've got to save this other person's base pointer, because I need my own base pointer. So I'm going to push EBP onto the stack, saving main's base pointer. I'm going to set up my base pointer by moving the stack pointer into the base pointer. And now, as far as we're concerned, main doesn't even exist. We don't care anything about main. It essentially exists only as a function frame on our stack. Another way to think about it is that the function frame contains all the information needed to execute that program, or that function, I should say, not program. Okay, now we're going to look at EBP plus hex C. So what's at EBP plus C here? The base pointer is here, so plus C is 4, 8, 12. This is going to be hex 28. What value is that? 40. So in our function callee we had, I believe, two arguments, a and b. a was the first one, b was the second one. How does it know that EBP plus C is the argument b? The compiler knew how many parameters there were, it set them up on the stack, and it controls everything. Yeah. So where's the first parameter? Yeah, plus 8. Why is it plus 8 and not plus 10 or 12 or 4? What are those 8 bytes in between? The saved base pointer that we pushed ourselves, and above that the saved EIP, the saved instruction pointer that whoever called us put there. And we know that has to be there because of the cdecl calling convention. Then the thing right above that on the stack has to be the first argument to our function, four bytes above that is the next argument, and so on and so forth. Right? So this is how we can easily calculate, for any argument to our function, exactly how many bytes up to look. Local variables are all at EBP minus offsets that the compiler determines; for arguments it's not even really the compiler that gets to decide, because the cdecl calling convention specifies exactly how many bytes above EBP they are.
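The offset arithmetic can be written down directly. This is a small sketch that assumes every argument and local is 4 bytes wide and that the function saves EBP in its prologue; `arg_address` and `local_address` are hypothetical helper names, not part of any toolchain.

```c
#include <stdint.h>

/* Layout above a cdecl frame pointer:
 *   [ebp+0]  saved EBP (pushed by the callee)
 *   [ebp+4]  saved EIP (pushed by the call instruction)
 *   [ebp+8]  first argument, [ebp+12] second argument, ...
 * n is 0-based: n = 0 is the leftmost argument. */
uint32_t arg_address(uint32_t ebp, unsigned n) {
    return ebp + 8 + 4 * n;
}

/* Locals sit below EBP; which local lands in which slot is the
 * compiler's choice. k is 1-based: k = 1 is the slot at ebp-4. */
uint32_t local_address(uint32_t ebp, unsigned k) {
    return ebp - 4 * k;
}
```

Inside callee, EBP is `0xfd2b0` in the walkthrough, so the first argument is at `0xfd2b8` and the second at `0xfd2bc`, which is EBP plus 8 and EBP plus hex C.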
So this is super important to remember when you're looking at x86 code: when you see EBP minus something, that's a local variable. When you see EBP plus something, that's an argument to the function. Cool. So we can even see that callee's function frame is essentially all its arguments and all its local variables, and main also has its own function frame. So now we're going to move that hex 28 into EAX. We're going to move EBP plus 8, which is the first argument a, into EDX. We're going to do the load effective address, essentially adding EDX to EAX and storing the result in EAX. And then we're going to add one to EAX, and now we're done, right? Because we know that under the cdecl calling convention the return value of our function will be in the EAX register. But now we need to reverse everything, right? The stack pointer needs to be back at 0xfd2b8 in order for the calling function to continue executing. So we pop EBP, and again, we don't care whose EBP this is. We're just putting it back. Whatever value is in there, we put it back. And now we return. You can think of a ret as the opposite of a call instruction. Or the other way to think about it is as a pop EIP. If it was a pop EIP, what would that mean? [Student comment, inaudible.] It is, but it's semantically equivalent. So what would a pop EIP mean? Yeah: take the element at the top of the stack, put it in the EIP register, and add four to the stack pointer. That's exactly what happens. A ret instruction does literally exactly that. Take that value, put it into EIP. And yes, you cannot actually write pop EIP, because that's an illegal x86 instruction, but if it helps you think about what a ret does, that's exactly what it does. So now we're here. From main's perspective, what just happened? EAX changed. That's the only thing we care about.
And maybe global memory: if any function we call, or anything it calls, changes global memory, those changes are there now. But fundamentally, in our world, the only thing we care about that changed is EAX. Pretty sweet, right? This is super awesome. So now we're going to get that value out of EAX and put it into EBP minus four, which was our local variable a, and then we do this very silly thing where we take EBP minus four and move it right back into EAX. So why are we doing that? What does that correspond to in the program? So that main returns it in EAX. Right, and what was the C code? return a, exactly. Yeah, because we said a equals callee. That's this first instruction, right? We called it, and then we have to set our local variable a, which is at EBP minus four, to that value. Now that we've done that, we can do return a, which is move that value into EAX. Clearly a smart compiler could remove these, but this is just a non-optimizing compiler here. So now we're done with the semantics of the program itself, and we have the super important epilogue. We're going to do a leave, which essentially sets the base pointer, sorry, sets the stack pointer equal to the base pointer: move the base pointer into the stack pointer. So it moves the stack pointer up and then pops EBP. Wait, does it do the pop first? No, the move comes first, then the pop. Okay. Sorry, why was it doing that non-optimized thing again, with EAX and EBP minus four? The second move instruction is the return a. To do return a, we know the return value has to go into EAX. So the straightforward compilation of that is: wherever that variable lives in memory, in this case EBP minus four, move that into EAX. It's just a straightforward translation.
You can think of the compiler as compiling essentially line by line, without doing a global optimization or even a local optimization to ask, does this make sense? Okay, so at leave, we move the base pointer into the stack pointer, changing the stack pointer to be 0xfd2d0. So both the stack pointer and the base pointer are pointing at the same location. What are they pointing at? The previous function's base pointer, right, the saved EBP. And then we essentially do a pop EBP: we take that value and copy it into the EBP register, which restores the caller's base pointer. So what is the stack pointer pointing to right now? In my diagram I have it at the very top, at 0xfd2d4. What is it pointing to? We already loaded the saved EBP, which was at 0xfd2d0, so it's the slot right above that. [Student answer, unclear.] I think that is technically true. Well, what actually is it pointing to? What was that? The actual what? The file? It has no concept of files. I don't even know what a file is at this level. What's the next instruction that's going to execute here? Is it the return 0? A return. What's a return going to do? A ret instruction. Let's think about this: is this ret instruction in main any different from the ret instruction in callee? No. So it's the address of the instruction to go to next? Exactly. It belongs to whoever called main. There's some function that calls main, and when we return from main we're going to go and start executing there. Your process's exit code has to get set and exit has to be called, all that fun stuff, and that all happens there. So yeah, there's some address that you go to after main to do the cleanup of your program, and that's the address stored here. All right, cool.
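Main's epilogue can be modeled the same way as the other instructions. In this sketch the `Cpu` struct, `STACK_BASE`, and `poke` are assumptions for illustration, and the saved EBP value `0xfd2f8` and return address `0x8048300` are made-up placeholders, since the lecture does not say what main's caller left on the stack.

```c
#include <stdint.h>

#define STACK_BASE 0xfd2e0u   /* assumed: first address above the modeled region */

typedef struct {
    uint32_t eip, esp, ebp;
    uint32_t stack[32];       /* stack[0] is the word at STACK_BASE - 4 */
} Cpu;

static uint32_t *word_at(Cpu *c, uint32_t addr) {
    return &c->stack[(STACK_BASE - addr) / 4 - 1];
}

/* Setup helper: store a 32-bit value at a stack address. */
void poke(Cpu *c, uint32_t addr, uint32_t value) {
    *word_at(c, addr) = value;
}

/* leave is equivalent to: mov esp, ebp ; pop ebp */
void do_leave(Cpu *c) {
    c->esp = c->ebp;                 /* drop the locals in one step */
    c->ebp = *word_at(c, c->esp);    /* restore the caller's base pointer */
    c->esp += 4;
}

/* ret behaves like "pop eip", which cannot be written directly */
void do_ret(Cpu *c) {
    c->eip = *word_at(c, c->esp);    /* follow the breadcrumb */
    c->esp += 4;
}
```

With `esp = 0xfd2b8` and `ebp = 0xfd2d0` as in the walkthrough, `do_leave` moves `esp` to `0xfd2d4` and reloads `ebp` from the saved slot; `do_ret` then loads `eip` from `0xfd2d4` and leaves `esp` at `0xfd2d8`.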
So that was all background to get to understanding stack overflows, but I think we've built up enough here that you can kind of see where we're going. Stack overflow, buffer overflow, buffer overrun: these are all more or less synonyms for the same thing. The idea, which I know we've talked about on some Wednesday long past, is that in C, if you want a local variable that's a buffer of a certain length, you have to specify the size of that buffer in advance. And there's no automatic bounds checking on that array. If you make an array or a buffer of size 50, you can set values at index 51, 52, 100, or even 1000, because nothing is doing that checking. So a stack overflow, or a buffer overrun, exists when you can trick the program into writing to memory outside of the memory that was allocated for that data. I think this is a good general definition, because think about a heap overflow: if you have a heap array of size 1000 or whatever and you write past it, you can cause problems. If you have a local buffer allocated on the stack of 1000 bytes or something, and you can write past those 1000 bytes, that causes problems too. Normally this will cause a segmentation fault, the segfault that we all know and love, right? We're trying to write to some memory that doesn't exist or is not allocated to us. But... okay, I ask this every time. How many people know the fable of Hansel and Gretel? Very few, yes, okay. I should have one of you tell it, but I guess I'll tell it. This is a super abridged version; you can go look it up and read it. It's really gruesome, especially for kids.
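Since C does no bounds checking on its own, the check has to be written by hand. The following is a minimal sketch, assuming a 50-byte buffer as in the example above; `copy_checked` and `BUF_SIZE` are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 50

/* Copies src into a BUF_SIZE-byte buffer only if it fits, terminator
 * included. Returns 0 on success, -1 if the input is too long. */
int copy_checked(char dst[BUF_SIZE], const char *src) {
    size_t n = strlen(src);
    if (n >= BUF_SIZE) {
        /* An unchecked strcpy(dst, src) would write past the end of dst here. */
        return -1;
    }
    memcpy(dst, src, n + 1);   /* n + 1 copies the terminating '\0' as well */
    return 0;
}
```

The vulnerable pattern is the same function without the `if`: the language compiles it without complaint and the extra bytes land in whatever memory follows the buffer.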
You're going to leave out the witch? No, no, no, the witch is a great part of it. I guess we're going to tweak the tale a little bit. So the idea is these kids, Hansel and Gretel, live right by a forest, this dark, deep, scary forest, and they go exploring in it. You're a little kid, and all you have with you is, like, a loaf of bread. So if you want to go wander around this forest, how do you find your way home at night, when it's super dark and disorienting and you don't have a compass? With breadcrumbs. You take a piece of bread and every so often you drop a crumb, leaving a trail, and that way no matter what path you take, you can always find your way back by going from one breadcrumb to the next. So we're good. And the saved instruction pointers on the stack are like the breadcrumbs of our program, right? As each function executes, it leaves a breadcrumb to say, hey, when you're done executing, come back this way. So if we were an evil witch and we wanted to trick these kids into coming to our witch's hut, one thing we could do is erase those breadcrumbs and leave our own breadcrumb trail leading right back to our place. I think I've butchered the fairy tale, but the idea remains. We are going to overwrite the return address in such a way as to trick the program into doing whatever we want, to perform arbitrary computation. The first thing we're going to do is jump to user-defined executable code. If we can write to an executable segment in memory, and we can write valid x86 instructions there, and we can overwrite one of those saved EIPs on the stack with that address, we can get the program to start executing code that we injected into the program.
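The erased breadcrumb can be shown safely in a toy model where the whole frame is one byte array, so the overflow never leaves the object. Everything here is an assumption for illustration: the `ToyFrame` layout with an 8-byte buffer, the helper names, and the clamp to the frame size, which exists only to keep the demonstration well-defined C.

```c
#include <stdint.h>
#include <string.h>

/* A toy frame, lowest address first: an 8-byte local buffer, then the
 * saved frame pointer, then the saved return address. */
#define BUF_LEN   8
#define FRAME_LEN (BUF_LEN + 4 + 4)

typedef struct {
    unsigned char bytes[FRAME_LEN];
} ToyFrame;

void frame_init(ToyFrame *f, uint32_t saved_ebp, uint32_t saved_eip) {
    memset(f->bytes, 0, FRAME_LEN);
    memcpy(f->bytes + BUF_LEN, &saved_ebp, 4);
    memcpy(f->bytes + BUF_LEN + 4, &saved_eip, 4);
}

uint32_t frame_saved_eip(const ToyFrame *f) {
    uint32_t v;
    memcpy(&v, f->bytes + BUF_LEN + 4, 4);
    return v;
}

/* Unchecked copy into the buffer: nothing compares len against BUF_LEN.
 * The clamp to FRAME_LEN only keeps the copy inside the toy array. */
void unchecked_copy(ToyFrame *f, const unsigned char *src, size_t len) {
    if (len > FRAME_LEN) {
        len = FRAME_LEN;
    }
    memcpy(f->bytes, src, len);
}
```

Copying 8 bytes or fewer leaves the saved return address alone. Copying 16 bytes replaces the saved frame pointer and then the saved return address with the last 4 bytes of the input, which is the breadcrumb being swapped for one of the attacker's choosing.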
And as we know, the key benefit here is that this code runs with the privileges of the running program. So if it's a setuid-root program, you're now root and have root privileges, and we can do all kinds of fun stuff. All right, I guess I'll just stop here.