Exam or anything? Yeah? Just for clarification, is the final exam comprehensive? Yes, the final exam will cover everything that we've talked about in class. Will we be getting something similar to last time? No. Is it going to be more focused on newer stuff, or is it going to be an even distribution? That will be up to you to figure out — what's my mindset? This is the game theory part: you need to think like me about what I'd do and what I have to say. Anything else? All right, let's rock and roll. What we're going to look at today is overflows and overwrites. So how do you declare an array of characters in C? You tell it the type the array holds, and you also give the size. So you'd write char, bracket, size, and the variable name — or maybe the other way around, I can't remember exactly which one it is. And what's to stop you from writing past the end of that buffer? It seg faults. It may seg fault. Yeah, so your program crashes. The fact that it crashes means you must be writing over the end of that buffer, as we'll see. So this is one of the key flaws. And actually, think about all the software that was written in C or C++. Start shouting out things — maybe I'll throw candy. What else? Minecraft. Minecraft, what else? Hasn't it always been Java? Yeah, Java — but now there's a C++ version too. OK. What else? Like most command line tools? Almost all command line tools. What else? Think about software. Drivers. Drivers, which run in the kernel. The operating system itself — all modern operating systems are written in C or C++. Microsoft Office — basically all of the Office suite, Excel, Word, PowerPoint, those are all written in C or C++.
So the real key flaw with C and C++ is that you can declare an array of a fixed size, but nothing prevents you from writing past it — you have to be careful when you write characters to that buffer that you write only the amount that fits. And this is really the cause of the buffer overflow problem. It's been known since about the mid-70s, and it's still one of the most exploited and discovered classes of vulnerabilities today, with various variations. So we'll study these, as we'll see. They're architecture and operating system dependent. They can be exploited locally and remotely. And really the core idea is they can modify either or both the data of the application and the control flow of the application. So what do I mean by control flow? Yeah — which instructions are executed, right? When you look at some C code and you see that function foo calls function bar, and then after that function returns, function baz is called — that's the expected control flow of the application. But if you, an attacker, can subvert that and make the code execute whatever instructions you want, then you can fundamentally take over the application and make it execute arbitrary code of your choosing. Why is being able to modify data important? If there was a path that was hard-coded, you could change that path to have it execute something different. Exactly. So maybe changing paths in the program, or maybe there's a variable that's a 1 or a 0 that says whether we're admin or not. If we can change that variable to make us admin even though we're not, now we've bypassed the authentication of that application. There's a lot of research in this area, which is very cool — some of which we're doing here at ASU is automatically identifying and exploiting these buffer overflow vulnerabilities. But I won't really have time to talk about that.
The key takeaway, though, is that this has been a cat and mouse game. Attackers find that, as we'll see, by exploiting a buffer overflow they can make the program jump to shellcode on the stack. So the defenders, the people writing software, say, OK, let's make the stack not executable, because you don't ever want to execute code on the stack. So then attackers had to come up with new techniques to still achieve their goals. The key thing that we're talking about here is the stack. So what is a stack generally? Algorithms, data structures, yeah? Is it last in, first out? Yeah, last in, first out. The data structure: you push things on, you pop things off. OK, cool. So that is the conceptual idea of a stack. What do we mean when we say the stack? Just a pointer, and you keep decrementing it as you go to different layers. What's the purpose of the stack? Why do we need a stack? Yeah? All the processes that get called, you push onto it? Close — not process, because a process has its own memory space, but function. So think about function calls. You have a recursive function call; every instance of that function call has different local variables, right? What's a recursive function you've all probably written — Fibonacci or something? Yeah. So when you calculate Fibonacci, you have all of these function calls, and every single instance has its own local variables: which Fibonacci number to calculate, and so on. All of that is stored on the stack, because you need somewhere to hold that information. So you can think of it as scratch memory for functions. And really, the stack is what allows us to have recursive function calls, because otherwise you wouldn't be able to do them.
You would not be able to create new instances of calls. Almost every single architecture — MIPS, ARM, x86, x86-64 — provides this stack functionality. So it's a concept general to all architectures. We will, no matter what Yan says, start our stack at high memory addresses and grow down. And just like the data structure, the stack operations are push and pop — the exact same ideas. And actually, x86 assembly language supports the stack natively, which means the register ESP holds the address of the top of the stack, which is just some location in memory. And we saw before how that's going to grow down. A push instruction — push eax — says: move the stack pointer down four bytes, then take the value that's inside eax and copy it onto the stack. That's how you put things on. And you pop them off exactly the same way in reverse: pop eax says take whatever value is pointed to by the stack pointer, copy it into the eax register, and move the stack pointer up four bytes. Pretty simple. Two operations — there are only two operations. It seems complicated; it's not. Well, it kind of is, but you can all get this. So our stack in this example starts at high memory addresses and goes down toward zero, and we'll just conceptually think of it as this stack. At the start of the program, let's say the stack pointer points here, at 0x10000 — it doesn't matter exactly what the value is. So which way is the stack going to grow when we push things onto it? Down. So the stack pointer is going to move down, and when we pop things off, it's going to move up. So what does that tell you about the memory that's greater than 0x10000? Yeah — more generally, it's in use. That's the way a stack is.
So if the stack is growing down, the things at higher memory addresses are data in use by the stack, whereas everything below is garbage — we don't care what those values are. Cool. So, garbage on the stack. We're going to look at a super simple example right now of two instructions: push eax, pop ebx. What's going to happen at the end of that, semantically? Yeah, it's just the same thing, right? It's using a stack: if you push something onto a stack and pop it off, you get the same thing back. So this will push whatever is in EAX onto the stack — we'll see that happen — and then we'll see it copied into EBX. The registers that are important here are EAX, which let's say has the value 0xA, EBX, which has the value 0, and ESP, which points to the top of the stack. And we can just walk through and visualize exactly what happens after each instruction. This is a good tip if you're trying to understand x86 or any kind of assembly: break it down and think about exactly what happens to the CPU and all the registers with each instruction. So, if push eax is the next instruction to be executed, that means EIP points to whatever memory location that instruction is at, which we'll look at in a second. Push eax will first decrement — yeah, it decrements, moving the stack pointer down, and then copies 0xA into that memory location. Are we OK with that? So we saw the ESP register change: it got decremented because the stack is moving down. And one thing to notice is that each of these slots is at a four-byte offset — a word, four bytes, 32 bits. That just makes it easier to visualize, rather than having to think about every individual byte. Now we pop ebx. When we execute this instruction, we look up where ESP points — it points to this memory location here.
We copy whatever those four bytes are, move them into EBX, and then increment the stack pointer by four. Cool — and this makes sense, right? We did one push and one pop, so at the end the stack pointer should be exactly where it started. An interesting note: do we clean up that value, 0xA, on the stack? What do we know about that value? It's garbage. It's garbage at this point, right? But it actually stays there. So an interesting side note: if you write a program incorrectly that handles passwords, you can end up leaving passwords on the stack. And if an attacker can read out memory, they can recover passwords and all this stuff. So writing correct programs is difficult, yeah. Is there any way to clean up that memory? You have to manually zero it out. So before your function returns, you have to set those memory locations — any sensitive data — to zero. It doesn't happen automatically by the CPU or the compiler. OK, so this is just a basic example of how the stack works. Now we need to understand function calls. To understand that, we're going to look at the concept of function frames. Some of you are currently in 340, so this should be a bit of review, but we'll see how it actually happens on the x86 platform. So functions want to use the stack to allocate space for their variables, and here's why. When you call a function that has a local variable and that function returns, can you still reference any of those local variables? Think about it another way: if you call a function, it calls malloc to allocate some memory and returns a pointer to that memory — can you still reference that memory location? Yes. What about local variables? Have you ever tried to return the address of a local variable from a function?
Yeah, horrible things will happen, as we'll see, because you're pointing into the stack — functions use the stack for their local variables. So all parameters and local variables are stored on the stack. We'll use the stack for this, but the problem with using the stack pointer to address things is — how many general-purpose registers are there in x86? Something like six or eight usable ones; the specific answer is not a pop quiz. The point is it's a small number. Have you ever done a complex mathematical calculation that involves many temporary variables? You take a variable, add one to it, multiply it by two, multiply it by three, take that value, multiply it by some other complex calculation. In all of this, you still only have a handful of registers to use, so at a certain point the compiler can't keep the values in registers and needs to use the stack. So the stack pointer can change while the function is executing. Rather than use the stack pointer to address local variables, we use what's called a frame pointer. This frame pointer is another register, as we'll see, that points at the start of the function's frame on the stack. The compiler then addresses local variables at different offsets from this frame pointer, and that's how, conceptually, each function that executes gets its own copy of its local variables, without the compiler generating different code per invocation. Let's walk through an example of this — I think this will clear it up. In x86, we'll look at EBP; frame pointer and base pointer are both terms for the same thing. So let's look at a C program. We have main, with three local variables: A, B, and C. A and B are ints, C is a float. We set A equal to 10, B equal to 100, C equal to 10.45, A equal to A plus B, and then return zero. So — complicated program.
So what the compiler needs to do when it generates the x86 code is figure out: when I say something like set A equal to 10, what does that actually mean? What do I actually set, right? I can't necessarily use a register, because if I call another function, it may reuse that register and mess things up. So it needs to come up with offsets from the frame pointer for all of these local variables. What it'll do is say: A is at some offset from EBP, B is at some other offset from EBP, and C is at yet another. It just decides. That way, the memory operations that we saw in the C code are easily translated. In high-level pseudocode — we'll look at the real x86 instructions in a moment — the memory at A's offset from EBP is set equal to 10. That's a direct translation of the line on the left to the line on the right. Then the second line: the memory at B's offset gets 100, and the memory at C's offset gets 10.45. What about the next one? What do we have to do? Exactly. And then the return stuff, which we won't get into too much right now. So the compiler just decides on these offsets — and we'll see why it uses negative values as offsets from the base pointer. It'll say: A is at EBP minus 0xC, and B is at EBP minus 8. It doesn't actually matter which variable gets which offset, because these are just local memory addresses — as long as they're not overlapping, right? Because then you'd get weirdness happening with your variables. How big are these variables? Four bytes — A, B, and C are each four bytes. And it says C is at EBP minus 4. So that's just how it happened to do it: minus 4, minus 8, minus 0xC. This is from a real example where I compiled this and looked at the assembly instructions. And actually, the weird thing is, depending on your compiler version, it'll generate different, more optimized code.
We'll skip this part for now, but essentially the instructions we'll see are: move 0xA into EBP minus 0xC. That's exactly this operation, which is exactly A equals 10 — it sets the memory location at EBP minus 0xC to 10. The next instruction moves 0x64 — 100 — into EBP minus 8, and we know EBP minus 8 is B, because that's what the compiler decided. And finally we get a slightly more complicated one: we move 0x41273333 into EAX. What's this value? Is it 10.45? It doesn't look like 10.45 to me — what is it? It's actually the IEEE floating-point representation of 10.45. Well, probably not perfectly precise — it's the nearest representable float. So we move that into EAX, but we haven't stored it into memory yet; we then move EAX into EBP minus 4, which is the variable C. Fun fact, if you look into this: it's done in two steps because of how this constant gets encoded in the move instruction — the compiler stages it through a register and then moves the register to memory. The compiler knows this and figures it all out, so you don't have to. So now we've done everything here, these three instructions. Then we look at this last line: we move EBP minus 8 into EAX — what's at EBP minus 8? The variable B. So we move B into EAX, then we add whatever's in EAX into EBP minus 0xC — which variable is that? A — and store the result in A. So we just did B plus A into A. That's the A equals A plus B line, and then other stuff. Cool? If you look at this, how much space does the compiler need in main's function frame? 12 bytes? At least 12 bytes, right — four for each of the local variables. So we'll see how this function frame actually looks. What we'll do is go over these exact instructions with this little stack.
Again, we'll start our stack at 0x10000, and here the only registers that matter are EAX, ESP, and EBP. So, when a function is called, it doesn't know who called it — it could have been called from anywhere. What does a function know about when it's called, conceptually, when you write a function? If there are any arguments, it has those — the arguments to that specific invocation — but you don't actually know where you came from, right? Otherwise it would make programming incredibly weird: if you had to think about the context of who's calling you and when, debugging would become very difficult. So if we walk through this, the stack pointer is at 0x10000. This base pointer is whatever the base pointer was of whoever called us, so we don't actually know what that value is. But what we're going to do is set the current base pointer to where the stack is. That's this next instruction: move the stack pointer into the base pointer. Now they're both pointing at 0x10000. Then we make space for our local variables by moving the stack pointer down — in this case, 16 bytes, 0x10. So that happens, but the frame pointer stays at 0x10000, and this way, throughout this execution, all of the offsets for our variables stay constant no matter what happens to the stack pointer, even if the stack moves around. Does that make sense? Wait — you first moved the stack pointer to the base pointer first, or? We're setting up our base pointer — the person who called us has their own base pointer.
If you think about this very carefully, you'll realize this means that whoever called us needs to save their base pointer onto the stack, because we're clearly going to overwrite it. Then we move the stack pointer down by however much the compiler decided — in this case, 16 bytes; it doesn't actually need all that space, that's just what it decided. So you first moved the stack pointer to where the base pointer is, right? The other way around: we move the content of the stack pointer into the base pointer. That was the very first thing we did, so now the base pointer points to the same location as the stack pointer. Now we do our computation exactly as we saw. We move 0xA into EBP minus 0xC — the CPU calculates EBP minus 0xC, which is 0xFFF4, and copies 0xA into those four bytes. Then it moves 0x64 into EBP minus 8, which is 0xFFF8. And then we have our two-step process: move 10.45 as IEEE floating point into EAX, then move EAX into EBP minus 4. Now we actually do the computation — what was the original computation in the C code? A equals A plus B, exactly. So we know that when that's done, A, which is here, should be 0xA plus 0x64 — which is 110, since it's 10 plus 100. Whatever that is in hex, we'll find out. So we move EBP minus 8 — the 0x64 — into EAX; we see the EAX value change; and then we add EAX into EBP minus 0xC. EBP minus 0xC is here, the A variable, and it gets the value 0x6E. So this means every single function really needs what's called a prologue here.
Every single function needs to set up its own frame pointer, because this way, if a function calls itself, a new function frame gets pushed onto the stack, and that keeps going as long as the program is executing and calling functions. When functions return, the stack moves back up, and then grows down again as other functions are called. Questions? It's most important to understand conceptually what's going on here, because next we're going to add in return addresses and how the CPU actually executes things. Cool, OK, so I'll skip this, but basically: when you call a function, what do you provide to that function, conceptually? Arguments. And if you provide the arguments to that function, what are we expecting in return? The return value, right? And we're also expecting, at the CPU level, that it doesn't mess up our function frame. We don't expect that we call a function and it changes our local variables — unless we passed in a pointer to a local variable, in which case that function could change that value. So we need the return value and the parameters. Also, as we saw, the function that we call is going to mess up our frame pointer: the very first thing it does is move the stack pointer into the frame pointer, which overwrites ours. That means when control flow returns to us, we'd have no frame pointer — we'd be pointing somewhere down the stack. So we need to save our frame pointer. And then this is key — and it becomes really key when we talk about exploitation and buffer overflows. Now, what else do we need to know? How does the CPU work — how does it know what instruction to execute next? Yes, there's a program counter. It has a program counter that keeps track of the next memory address to fetch and execute instructions from. That's all it does.
Very stupid. How do you change that program counter? Depending on the architecture, you may be able to write the register directly — but when you think about it in terms of instructions, how do you do that? Possibly a return. We'll talk about that in a second, but first think of branches. Usually it automatically increments. So it will automatically increment, yeah — it'll just keep going and executing forever. What if you have an if statement? Yeah — a branch or a jump instruction. A conditional branch says: depending on this condition, change the program counter. An unconditional jump just says: change the program counter to this location. Otherwise it just keeps executing. So then — say we're executing the main function; we'll ignore how we got there. Now we call a function foo. We set it up, we call that function, we store all these things. And now foo's done executing. How does it know to go back to main? Yeah — fundamentally, it must store where it should go back to. The CPU itself doesn't have any built-in support for "go back to whoever called you." Because when you write a function foo, you can call it from many places in your program, and you wouldn't expect that calling foo from main makes it jump to bar afterward. This changes at runtime. So we need a way to store where execution should continue after this function finishes, and this is called the return address. This is going to be key. You can think of it — OK, good, you're undergrads — you can think of it like the story of Hansel and Gretel. Everyone heard that story? The little kids who leave breadcrumbs as they go into the forest — and why do they do that? So they can find their way back at night. They follow the breadcrumbs and it leads them back to where they want to go.
This is exactly the same thing. This return address, as we'll see, is information on the stack that tells the CPU where to go next — because it's just executing along, and when a function is done, it asks: where do I go? Oh, there we go — that's where I go, and it starts executing there. And just like if you were an evil witch, you could change that breadcrumb path to go wherever you wanted, and a dumb child — like a CPU — will just follow it. I mean that in the sense that a CPU doesn't think or have logic; it just does whatever it's told. So we need some kind of convention about how to do this — it doesn't really matter which — and Linux for x86 uses the cdecl calling convention, which defines exactly how you call a function. First, the caller pushes all the arguments onto the stack in right-to-left order, and then pushes the address of the instruction after the call. This is the breadcrumb of where to go back to. And the callee pushes the previous frame pointer onto the stack, creates space on the stack for local variables, and keeps the stack consistent. I guess I lied — we'll see this in an example. OK, so to walk through an example: we have a function main with a local variable a, which is set to callee(10, 40), and main returns a. callee is a function that takes two integer parameters and returns a plus b plus one. Y'all have written code like this — functionally the same thing; you call functions. So when the compiler compiles this code, main first has to save its caller's base pointer. So I lied a little bit with that earlier program — this one does that. It saves the base pointer, moves the stack pointer into the base pointer, and subtracts 0x18 from the stack pointer to make room for main. Why that much? It's kind of a tricky thing; we'll go over it a little. But fundamentally it's going to move the value 0x28 onto ESP plus four — note there's no minus there, it's ESP plus four — then move 0xA onto ESP, then call callee.
And as we'll see, this call instruction is the key here, because a call instruction says: jump to this memory address and push the return address onto the stack. Then move EAX — so in the cdecl calling convention, once you call a function, the value in EAX is the return value. So EAX gets moved into EBP minus four, which is the local variable a. Then it moves EBP minus four back into EAX, and then leave and ret. We'll ignore leave for now — it cleans up the stack — and then the return instruction goes back to whoever called main. So we're doing a return a: we just put the value of a in the EAX register and return, so that whoever called us can get the return value of this function. callee has the same structure. So this is the prologue and the epilogue. These are going to be mostly the same, and you'll see them in almost every function you look at, because they all need to do this kind of bookkeeping in order to do what they're supposed to do. callee pushes EBP, moves the stack pointer into EBP, then accesses EBP plus 0xC. So why is it a plus instead of a minus? What was EBP minus something in our other example? Going down, right? Yeah, it's going down. What was that in the program? Local variables, exactly. So if we're going up, it's our parameters — these are the parameters that were passed into our function. We'll see exactly how that works. Move EBP plus eight into EDX. Then it's basically: add them together plus one — add them together and then add one — then pop EBP and return. So again: prologue, epilogue. And now we'll watch this in action as it actually happens. Here I actually traced each of these memory offsets, where one compilation put all of these addresses. Because when the program is actually executing, we're not going to call some symbolic "callee" —
We're going to call an actual memory location and go execute there. An interesting thing here: if you look at the deltas between these addresses, they tell you the size of each of the instructions, which is kind of cool — you can see that push ebp is one byte. I like that stuff. So now we're going to go through everything, basically simulating an x86 CPU. The registers used here are EAX, EDX, ESP, EBP, and EIP — those are the only things we care about here. The stack will be at some value; one time that I ran this, it was at 0xfd2d4. And you can do this by just setting a breakpoint on main in gdb, running the program, and seeing what the memory locations are. That's how I got all these values into this walkthrough. So main starts executing. Does main know who called it? No — there's actually usually another function that calls main, from libc, after setting up all the dynamic libraries, but for our purposes we don't care. We just know that somebody must have called us, so we have to do all of our bookkeeping. We save their base pointer by pushing EBP. So we'll push EBP. We then set up our frame pointer by moving the current stack pointer into the base pointer — so now the base pointer, instead of pointing into our caller's function frame, points to ours. We subtract 0x18 from ESP to set up our local variables — well, not just our local variables, but also the call to this function. Remember, the call here was a = callee(10, 40). So we move 0x28 to ESP plus four, which is here — 0x28 is here. Which of the parameters is it? What were the parameters? They're 10 and 40. So this is the second parameter. And then the next one, 0xA, is the first parameter.
So again, going back to that calling convention: we've effectively pushed the arguments onto the stack from right to left. The rightmost argument, 0x28, went first, and then the next argument, 10. Or, if you read the stack going up: the very first one is the first argument, then the second, third, fourth, whatever. This is how the function we call knows how to access those variables. Now — important — what is this call instruction going to do, semantically? Think about the end of this call instruction: what's going to happen to the CPU state? It's going to redirect the instruction pointer to the callee function — exactly. It's not symbolic "callee" anymore; it's 0x8048394. It's also going to save the return address. What's the return address — the current memory address of the call instruction? Or would it be the next one? The next one, yeah. So it figures out the next instruction and basically does a push of 0x80483BF, that value, onto the stack. That's a two-part process: it decrements the stack pointer, copies 0x80483BF there, and then it starts executing — it changes the instruction pointer to 0x8048394. Now, at this point, does callee know who called it? No, right? It has absolutely no idea, which matches how you're used to writing functions: you don't care who called you; you just do your job and return. So here we do the same thing: we push EBP, we move the stack pointer into the base pointer. Now we've saved main's base pointer on the stack, which means we can use EBP. So we're going to use it. Now if we look at this, we can basically see that all of this memory belongs to main and all of this memory belongs to callee. With EBP as the base pointer, EBP plus eight is the first argument and EBP plus 12 is the second argument — and that's the same for any function, no matter what. So we move EBP plus 0xC, which is 0x28, into EAX, and we move EBP plus eight, which is 0xA, into EDX.
We will add those two together into eax, and then we'll add one to eax. So this should be the value 51: 10 plus 40 plus one. Now we have to return, so we need to undo everything we did in the prologue for callee. We need to restore the base pointer of main, so we're gonna pop ebp, which is gonna set the base pointer register, which is up there. How do we know that this is actually main's base pointer? We don't. Who said that? Good, we don't. Why not? Think about right here: when we do pop ebp, all that we're doing — the stack pointer was here — is take whatever's at that memory location and put it in the base pointer. There's no check to verify that what we saved at the start is what is put back. Similarly, now we'll look at a return statement, which essentially you can think of as a pop eip. So this means take whatever we're looking at here, copy 0x80483bf into the instruction pointer, and then start executing from there. So return will change eip to 0x80483bf. And again, how did this return instruction know to return here? It was told by memory on the stack, right? It's just that at that very specific memory location on the stack is where callee will return to. There's nothing in here that we saw that verified that that's actually the function that called callee. So then we'll do our cleanup — I'm gonna skip this, you can go over it; leave is a fancy instruction — and then a return, and where are we gonna return to now? Whoever called main, right? They'll be right above at fd2d4. We can't see it, but there's some instruction that called us, and we'll return right after that. Questions on that? So when it pops ebp in callee, does the base pointer go back up there, or something? So when it pops ebp, it just takes whatever the stack pointer currently points at and copies that value into ebp.
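The argument fetch and the add can be checked with a toy model of the frame layout — [ebp] holds the saved ebp, [ebp+4] the return address, [ebp+8] and [ebp+12] the two arguments. The frame base address here is arbitrary:

```python
# Sketch of callee's view of its own frame after the prologue.
def callee_frame(saved_ebp, ret_addr, arg1, arg2):
    ebp = 0x100                          # arbitrary frame base for the sketch
    mem = {ebp: saved_ebp, ebp + 4: ret_addr,
           ebp + 8: arg1, ebp + 12: arg2}
    edx = mem[ebp + 8]                   # mov edx, [ebp+8]   -> 0xa  (10)
    eax = mem[ebp + 12]                  # mov eax, [ebp+0xc] -> 0x28 (40)
    return eax + edx + 1                 # add eax, edx; inc eax

assert callee_frame(0xdead, 0x80483bf, 10, 0x28) == 51   # 10 + 40 + 1
```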
So it doesn't actually know anything about that value, but I need to keep trucking. And so now, let's say, okay, here. Let's say we have some buffer on the stack that's our local variable — a local character array of 50 characters here. Let's imagine it being 50 characters down. So what happens if we write 58 characters? We have here 50 characters down. What's the 51st character? This one, fd2b0. So we'll overwrite one of these bytes, and then the next write will overwrite the next one, and the next one the next one, and so on. Because of the endianness it will actually land on the low byte first, but that's okay, we don't mind that at all. Then the next four bytes will be these four bytes. So this is the core idea of a buffer overflow: once we can overwrite a buffer on the stack, we can change these stored values. Specifically, we'll be interested in overwriting this saved return address. That way, when the program returns, it just returns to whatever was there. So this is the core idea, and as we'll see, we can maybe jump to user-defined code, or we may be able to jump to other functions in the program, so we may be able to redirect the control flow — let's say there was a function that said make me an admin, you could redirect to there, and that would be great. Or, as we'll see, we can redirect it onto the stack, which has our code in it, so we can have it start executing code of our choosing. And if your stack is executable, the CPU will just do that. It has no idea it's executing off the stack. It's just memory and bytes. Let's look at an example of how this would crash. We have a super simple example: a function my_copy that does a strcpy from the parameter string onto foo. What are the semantics of strcpy?
It takes the string you want to copy into first, followed by the string you want to copy from. So the destination, then the source. But how does it know how many bytes to copy? It goes until the terminator — it goes until a null byte. So it takes the first byte of string, and if it's not zero, copies it into foo; it looks at the next byte of string and copies it to foo plus one, and so on and so forth, until all of the characters of string are copied. So — you can see I made these slides a long time ago. And we can see the size of this array: four. It's still a character pointer that we're passing in to strcpy, which is totally fine. But strcpy doesn't know that the destination is only four characters, right? It has absolutely no way of knowing. All it has is a pointer to some memory location. So fundamentally, if we control the string that gets copied, we can do whatever we want. So if we look at this, main will do all these things. It will move a memory address, which is a pointer to that string, onto the stack. It will call my_copy, it will then call printf, and it will leave. Whereas my_copy pushes ebp, moves the stack pointer into the base pointer, subtracts 0x28 from the stack pointer, sets everything up, calls strcpy, leaves and returns. So now let's step through and see what actually happens here. We have all that code, same thing we looked at before. We'll go a little bit quicker this time to get to the good bits, but I encourage you to step through this with the code and look at the actual instructions. So we're gonna push ebp, do the prologue, create our stack frame. We're gonna move 0x8048504 onto the stack, and then we're gonna call my_copy. And remember, the call instruction will push 0x8048423 onto the stack — the next instruction to be executed after my_copy.
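Before stepping further, strcpy's copy-until-null, no-bounds-check behavior can be sketched in a few lines. This my_strcpy is a stand-in to illustrate the semantics, not the C library function:

```python
# Minimal model of strcpy: `mem` plays the role of the stack region, `dst`
# is an offset into it (our 4-byte foo starts at offset 0), and the copy
# stops only at the NUL byte -- it has no idea how big the destination is.
def my_strcpy(mem, dst, src):
    i = 0
    while src[i] != 0:                   # stop only at the terminator
        mem[dst + i] = src[i]            # no bounds check on dst at all
        i += 1
    mem[dst + i] = 0                     # copy the NUL too
    return i                             # bytes copied, excluding the NUL

mem = bytearray(32)                      # "foo" is supposedly only mem[0:4]
n = my_strcpy(mem, 0, b"asu cse 340\x00")
assert n == 11                           # 11 bytes went into a 4-byte buffer
assert mem[4:11] == b"cse 340"           # bytes past foo's end were clobbered
```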
So now we'll push ebp, move that, create our stack frame, move — what's this? Oh yeah, move our parameter. So move 0x8048504, which is our string, into eax, and move that onto the stack plus four. Then this next instruction — what's this ebp minus 0xc? That's the address of our buffer. So now we're gonna call strcpy. We can read these as the first argument and the second argument: the first argument is the destination, where we will copy the string, and the second one is the source. So strcpy is gonna read one byte at a time from this string and copy it starting at fd2ac, which is here. So how big was the character array in our C code? Four. If we wanna overwrite the saved instruction pointer — where's the saved instruction pointer on here? What's this value? The saved return address of whoever called us. So if we want to overwrite that, we actually have to overwrite not just four characters; we have to overwrite all the way from fd2ac to fd2bc and then four more bytes. Is that greater than four? Yes. Why? Because the compiler just decided to do that. The compiler decided it needed 0x28 bytes for this function frame. This is why it's always important, when you're doing this kind of analysis, to look at the actual binary rather than just looking at the C code and what the C code tells you to do. All right, so now we call our copy function, and what's gonna happen? Again, we know that's our constant string at 0x8048504: "asu cse 340 fall 2015 rocks". And what strcpy's gonna do is copy a byte at a time, starting at fd2ac, which is this pointer — one, two, three, four, this is "asu ", and then "cse " — and this is where you can see the endianness. So 61, 73, 75, 20: it looks like it's written in reverse order, if that makes sense.
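That apparent reversal is just little-endianness — the lowest byte of a 32-bit word sits at the lowest address — and you can check it directly with Python's struct module:

```python
import struct

# The first four bytes of the string, "asu " (0x61 0x73 0x75 0x20), read
# back as a single little-endian 32-bit number look "reversed".
word = struct.unpack("<I", b"asu ")[0]
assert word == 0x20757361

# Packing the other way restores the in-memory byte order.
assert struct.pack("<I", 0x20757361) == b"asu "
```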
61 is a, 73 is s, 75 is u, and 20 is space. We're just writing a byte at a time, but if you look at it as a 32-bit number, the byte values are reversed from what you'd expect, because of the endianness — this is where that comes in. And it'll just keep going. And again, at this point we've already overwritten data that we shouldn't have, right? That buffer was only size four, but now we've overwritten at least eight bytes. But we keep going, because strcpy has no conception of the size of this buffer. So we'll keep going, keep going, keep going. Is it gonna crash? Not yet — why not yet? Right, we haven't written anywhere illegal: a segfault happens when we write to memory that we don't own. But this is the stack, right? We own all this memory. We can write to every memory address on the stack, up to some extent; if we got all the way up, we would cause a segfault. But this just returns, and strcpy is happy. It did its job: it copied bytes from the source to the destination. This is the curse of computers, which hopefully you're — well, not hopefully, you're definitely learning: they do exactly what you tell them to do and nothing more. So strcpy copies bytes from the source to the destination. That's it. So then we do a leave. And the leave instruction, as we'll see, will change our base pointer to be 0x67676166, because that's the place where we overwrote the saved base pointer. Is it gonna crash? Not yet. Why not yet? The memory's not accessed yet. Yeah, it would crash if we tried to dereference that memory location — if we tried to access the memory at that address — because it likely doesn't exist. Now it will return, and now it's gonna happen. It's gonna access that memory: again, return is like a pop eip. It's gonna start trying to execute memory at location 0x31303220.
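Those four bytes are recognizably our own input. Decoding the faulting address little-endian recovers the fragment of the string that landed on the saved return address — a quick sketch:

```python
import struct

# The faulting eip was 0x31303220. Unpacking it little-endian gives back
# the four input bytes that overwrote the saved return address -- a handy
# trick for telling how far into your input the overwrite happened.
fragment = struct.pack("<I", 0x31303220)
assert fragment == b" 201"               # a slice of "... 2015 ..." from the input
```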
So it'll do that, and it will then get a segfault. If you run this — this was one of my examples; you can take that example and do these exact commands yourself. If we run a.out, it will say: segmentation fault (core dumped). And then we can run gdb on it, run it, and see that it actually gets a segfault because it's trying to access memory at 0x31303220 — exactly the memory we copied over the saved return address of my_copy. Now what? Can we control where this code goes? We were the ones that gave it the input, so we could give it a different input. Wait, did I let you give the input? Oh, I guess not. Yeah, so it's a hard-coded example, right, which doesn't quite work as an attack. We can't change this string without recompiling the program, and like we said, for local attacks, if you can change the code, you've already won. But this is an example, and the cool thing is, if you look at info registers in gdb, you see all the values. We'd see the base pointer has exactly what we thought it would, and when we get down to eip, it has that value that we do not want. So if this was our user input — what are some ways that we can give input to a C program? Command line parameters. Command line parameters — how does that show up here? argc and argv, yeah. So if this was a my_copy of argv[1], then you could just pass command line arguments of different sizes until you crash it. What else? Say again? Standard input. Standard input, yeah — if they're reading it, gets is one of the classic overflow functions. gets, strcpy or strcat, sprintf, sscanf, and any custom input routines. So the key is: how do we actually exploit this? There's a super famous paper that I highly recommend.
If you're interested in this, read it: it's called Smashing the Stack for Fun and Profit. I don't have the year it was published off the top of my head, but it's still a good article for understanding stack-based exploitation. So let's say we can control this instruction pointer. The question is: where do we go? What do we want to get out of this program? What do we want it to execute? Like you said, we could want it to execute a function that would make us admin. Yeah, a function that makes you admin, or maybe you want to execute a /bin/sh, to get a command shell with the privileges of that process. That's kind of our standard goal. The traditional way of doing this is called shellcode. The idea is you write some custom code that gets written onto the stack, and then you have the program jump onto the stack to execute it. Now the problem is that modern binaries have address space layout randomization, which means every time you run the application the stack is at a different location. And because your input is being copied onto the stack and overwriting that return address, you can't predict what that address is going to be. In addition, they've now made the stack non-executable. So if you try to jump onto the stack, the CPU throws an exception, because you're not allowed to — just like writing to memory that's read-only, the CPU actually enforces that. So modern exploitation uses this concept of return oriented programming, which we'll look at very quickly. Basically, we're gonna use little snippets of the program itself — reuse tiny bits of its own code — to change registers and do whatever we want. There's a really good paper on this, The Geometry of Innocent Flesh on the Bone. And we'll look again at a very simple program.
So now this is a vulnerable program that you can compile and play with: strcpy of argv[1] onto foo. Is this any different from the previous example we saw? No, not really. And if we look at main, it looks essentially the same — these instructions are basically the same. To follow along back home, you need to disable certain protections, but if you use this compile command, you'll be able to build a binary that does this. And what we want, going back to when we talked about system calls, is to call execve with /bin/sh. That's our end goal. So how do we do that? Well, we need to get the value 0xb into eax, we need to get the address of the string /bin/sh into ebx, and we need ecx as well. When we call execve, eax is the syscall number, ebx is the program we want to execute — that's our /bin/sh in this example — and then edx is going to be the environment pointer, so we can put null in there. So that's our goal, and we need to somehow write memory in this application. We can't necessarily write it — well, maybe we can write it on the stack, but the stack moves. Instead we can look at the binary — and this is a modern compiled binary — and see which memory locations are at fixed addresses, locations that I know are not going to change. All these addresses are not going to change; the stack will change, and I'm not sure where it is here off the top of my head. We look, and we see: man, this is awesome — there's a .data section that is writable, and I know it starts at 0x080ea060. So what we need is a little snippet of code that will do one thing for us and return. Because with a buffer overflow, we basically control the stack. So just like the breadcrumbs I mentioned, essentially we're going to set up a series of breadcrumbs, hopping around the program itself, in order to do arbitrary computation.
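The target register state before the syscall can be written down as a small table. This is a sketch; the argv/envp offsets (+12 and +8 into .data) are the ones used later in the walkthrough:

```python
# Goal state before `int 0x80` on 32-bit Linux: eax holds the syscall
# number, ebx/ecx/edx hold execve's three arguments.
DATA = 0x080ea060                      # writable .data address from the binary

goal = {
    "eax": 0xb,                        # execve is syscall 11 on i386
    "ebx": DATA,                       # pointer to the string "/bin//sh"
    "ecx": DATA + 12,                  # pointer to the argv vector
    "edx": DATA + 8,                   # pointer to a NULL envp
}
assert goal["eax"] == 11               # 11 increments from zero gets us there
```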
So in this example there's a gadget which will move eax to wherever edx points, and then return. It's a tiny gadget, and we control where it returns to, so we control the next snippet of code that gets executed. So if we have eax be the string "/bin" and we have edx be the address of .data, which we saw was 0x080ea060, then if we get the program to execute this gadget, it will copy "/bin" to that location. Everybody agree with that? Let's assume we can do that. But what do we need? What's our requirement to be able to use this? Yes — we need "/bin" in eax. So we need to be able to put our data into eax. And what about edx? Same thing, right? We need to be able to control the value of the edx register. So we need more gadgets. We need to go gadget hunting some more. There are nice tools to help you do this, but I kinda wanna walk us through an example. This is not to say that you absolutely have to do this by hand, but it shows the kind of modern exploitation techniques you can get into. So we need to get our data into edx. Turns out there's a super nice gadget that pops something off the stack into edx and then returns. Assuming we control the contents of the stack — and do we? Yes, because we're overwriting and controlling the contents of that stack. Because of that, whatever we put on the stack where this gadget's pop will look, it will pop that value into edx and then return to whatever's at the memory location after the value we wanted in edx. Cool, so that helps us a lot. Wait, what about eax? Oh no, we need another gadget. Okay, let's assume we have found a pop eax gadget — this is just an example, cool. So, looking at the program when I did this: there are 50 A's to get up to the base pointer, so that gets us up to the current frame pointer. The next four bytes are the saved ebp.
So that's just four bytes of whatever. And then we have the value 0x0806e91a, and then we have something else, the placeholder bytes. So if we break right after this strcpy, we can see that we had 50 A's copied from the buffer up to bfff688. The next four bytes are the placeholder, which was our value for the saved ebp slot. And then we have 0x0806e91a — this was the location of the gadget that pops something into edx. So if we look at this, it's gonna do the leave and then finally the return. And that return is now gonna execute our gadget: pop edx, then return. So what value is gonna be in edx when this next instruction finishes? Yeah — 62, 63, 64, 65. Who put that data there? We did, I did, right? And if we go back, we can see that that data is here. So we can control the value that's now in edx: we put in 0x62636465 as a placeholder, and we can change that to the address of .data. And now, what's the next thing that's gonna execute? Where is this control flow gonna go? Yeah, whatever's here. There's a zero here because of the last byte of our strcpy — we copied all of our data plus a zero, and that's why there's a null byte there. But fundamentally, if we had four more bytes, we would control that value, which means wherever this goes, we can then control where that returns to. Now we need a gadget for the eax register, so we look for more gadgets. We'll see that there's a pop eax; ret at this memory location, a pop ebx; ret at this one, a pop ecx; ret, and an xor eax, eax; ret. Why is that last one useful? Yeah, it sets eax to zero, which can be nice, depending on what we need to do. Or actually, what do we need? What's our goal for the value of eax? We want it to be 0xb, I believe, because we want to make the execve system call.
So if we reset eax to zero and then increment it with an increment gadget 11 times, we'll finally get our syscall number, and we know that there's an int 0x80 instruction at this fixed memory location. So the super cool thing is: all of this code already exists inside the binary. We're simply changing the control flow to do all of these tiny things that, put together, allow us arbitrary computation and, in this case, calling /bin/sh. So now we can build our payload. We'll write a Python script. Are we using pwntools here? I guess not, but that's fine. So we first send 50 A's and then BCDE. BCDE fills the saved base pointer slot. Then we put the address 0x0806e91a. The nice thing about this pack command is that it handles the endianness for us, so we don't have to worry about that, which is nice. Then we put the address of .data. So after this executes — we're writing up the stack — it will pop the value 0x080ea060 into edx. So now we have the address of .data in edx. Next we need to control eax: we need "/bin". So this next gadget will pop eax — it'll pop the next value on the stack, which is "/bin", into eax. So now we're at the point where the eax register has "/bin" and edx has the address of .data. Now we need that gadget to move eax to wherever edx points. So we do that, and at this point we've copied "/bin" to .data. Now we have a problem: we need to copy "//sh". Why like that? We need four bytes. Yeah — all these copies and moves are four bytes at a time, and on Unix you can add as many slashes to a path as you want; it doesn't matter. So that's a little trick that we use here. So we do exactly the same thing. Only — well, two things are different.
One: do we wanna copy this to the address of .data? No, because there are already four bytes of "/bin" there. We wanna use the address of .data plus four. And we're controlling this, so that's the gadget, and then we have the address of .data plus four — so this one ends in 64 where the other ended in 60. Then we do our pop eax gadget with "//sh", and then we do our gadget again to move eax to where edx points. Now we finally have the string "/bin//sh" in memory. We have a problem though, because our string needs to be null terminated — "/bin//sh" — and we don't know what's after it. So again, we need to zero out the address of .data plus eight. We do pop edx; ret, and we put in the address of .data plus eight. We do xor eax, eax. We move eax to where edx points. And now that string is terminated. So now we have a null-terminated string /bin//sh at 0x080ea060. Next we need to build up the argv vector, so we have to do a little bit more stuff, using the address of .data plus 12 for the argv vector. It depends on the architecture whether you absolutely have to set up an argv vector for /bin/sh; I like to do it, because I skipped it once and it was a huge pain. So this is slightly more complicated, but we're setting up a pointer at .data plus 12 whose first entry is a pointer to the address of .data. That's an argv vector — an array of character pointers — so now we have the first one as a pointer to /bin//sh. And then we need to zero out the next entry, because those are the semantics of argv: it's null terminated. Same gadgets. And now we have everything. Now we need to call execve with the address of .data, the address of .data plus 12, and the address of .data plus eight. The idea is we're reusing the null bytes that are already there at .data plus eight for the environment parameter. So this is pretty simple: we use our gadgets, we set up edx, and now we need eax to be 11. We don't do anything fancy.
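A quick aside on the double-slash trick used above — the writes are four bytes wide, "/bin/sh" is only seven bytes, and repeated slashes are meaningless to path resolution, so "/bin//sh" pads to exactly two words:

```python
import os

# "/bin" and "//sh" are each exactly one 4-byte write, and the kernel
# treats "/bin//sh" and "/bin/sh" as the same path.
assert len(b"/bin") == 4 and len(b"//sh") == 4
assert os.path.normpath("/bin//sh") == "/bin/sh"
```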
We zero out the eax register and then we just increment it 11 times (the PowerPoint is slow here), and then int 0x80. And now, no matter where the stack is — we actually don't care at all where the stack is — because the .data section is constant and the text section, where all of these gadget addresses live, is constant. So I can run this with my Python script. I can first set a breakpoint at 0x8048b67, which is right before this return instruction, and I'll see this insanity on the stack, right? Because I've written over a bunch of bytes of the stack. And if you step through this — I should have demoed this live — you can step through each of these gadgets. So even though this looks contiguous, all of these memory locations are different memory locations; everything between rets is at a different address. And so the CPU is gonna go through pop edx; ret, then pop eax; ret, then the move of eax to where edx points, continuing through these gadgets just as we laid out, until finally — let's see, before I get to the int 0x80 — okay, yeah, now you have a bunch of increments, which I didn't want to step through. So now you have the case where, right at the int 0x80, you have 0xb in eax, which is execve. You have 0x080ea060 in ebx, which is the string /bin//sh. You have the argv vector in ecx, which is at .data plus 12. And then you have edx, which is .data plus eight. With all this set up, you hit int 0x80, and the operating system says: great, I will execute /bin/sh for you, because you asked so nicely. So if you set a breakpoint there, you can examine that memory location in gdb as a string, and it will tell you there's a string /bin//sh there. You can look at the second argument, and you'd see that it is indeed an argv vector: the very first entry is argv[0], pointing at the string, and the second entry is null, terminating the vector. So there's only one entry in this argv — it's like executing /bin/sh with no arguments.
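The chain just traced can be modeled with a toy ROP machine. This is an illustration of the mechanism, not real x86: only the pop edx gadget address comes from the walkthrough, and the other two gadget addresses are invented:

```python
import struct

# The "program" is just the sequence of addresses we wrote onto the stack;
# each gadget does one tiny thing and then "returns" to the next address.
def run_chain(stack, gadgets, regs, mem):
    while stack:
        gadgets[stack.pop(0)](stack, regs, mem)   # ret into the next gadget

def pop_edx(stack, regs, mem):
    regs["edx"] = stack.pop(0)           # pop edx; ret

def pop_eax(stack, regs, mem):
    regs["eax"] = stack.pop(0)           # pop eax; ret

def mov_to_edx(stack, regs, mem):
    mem[regs["edx"]] = regs["eax"]       # mov [edx], eax; ret

gadgets = {0x0806e91a: pop_edx,          # from the lecture's binary
           0x41414141: pop_eax,          # invented address
           0x42424242: mov_to_edx}       # invented address
regs, mem = {}, {}
chain = [0x0806e91a, 0x080ea060,         # pop edx <- address of .data
         0x41414141, 0x6e69622f,         # pop eax <- "/bin" as a LE word
         0x42424242]                     # write eax to [edx]
run_chain(chain, gadgets, regs, mem)

assert mem[0x080ea060] == 0x6e69622f                    # "/bin" is now at .data
assert struct.pack("<I", mem[0x080ea060]) == b"/bin"    # same bytes, spelled out
```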
At .data plus eight we know there's zero. And so if you continue, it will say: process is executing a new program, /bin/sh — which is the equivalent of calling execve with /bin//sh, and it's a fully ASLR-proof ROP payload. ASLR? Address space layout randomization. It means that no matter where the stack is, we don't care, because we only used the fixed addresses of the text — the code — and the .data segment. Now, you should not do this yourself by hand; it's a huge pain. I've done it a few times. There are automated tools to find gadgets. pwntools is a super comprehensive library that's used by most of the top CTF — capture-the-flag — teams. It's a really cool library to start playing with. It lets you build an exploit script that interacts with a remote application or a local application in the same way. It actually has all kinds of features that I don't even understand all of. There are tools, ROPgadget and Ropper, which will analyze a binary and show you interesting gadgets, and some of them will even automatically create that ROP chain for you as a Python script — ROPgadget has that for sure. Cool. Questions? I know, that was a lot. How do you deal with ASLR when you can't find fixed gadgets like that? Two different ways. One way is to do it like this, using only fixed addresses. Another way is you use another vulnerability that leaks a pointer to the stack. If you had a vulnerability that let you read a value off the stack, like a frame pointer, that would tell you where the stack is, so you could defeat it that way. Actually, modern browser-based exploits will all chain two or three different vulnerabilities together in order to go from JavaScript to executing their payload. When you say it leaks stuff off the stack — is that like what you were talking about, where it doesn't zero things out after it pops them?
Yes, that's one way, or you could have an arbitrary read. You could have something that takes an index into a buffer and says: read me negative 10, which would be the other direction, or read me the value at plus 50. So you can have out-of-bounds reads, which aren't necessarily the same as out-of-bounds writes. Cool — this was something I definitely wanted to share so you can get an overview of the state of binary exploitation: buffer overflows and ROP. Oh, I guess we're out of time. Okay, I guess we'll stop. This is a hot, awesome research area — a lot of stuff to do, a lot of software to save. Thank you, it was a pretty fun semester.