Sorry for the late phones, I don't need any of them randomly, so if you're with us, all right. And also, let's go, awesome. There are a couple of people in level 5, and a whole lot of people in level 1. I hope that means you're struggling with level 1 and not that you've never logged into the system before. Sorry. I've got a job, I've got a job. Okay, cool, let's start now.

Okay, so back to shellcode. On Monday we went over what exactly our shellcode should look like. So what is shellcode? What does it do? Shellcode is literally just code of our choosing that we want to execute in a different process. Why do we want to execute it in a different process? The process has different permissions, exactly. So we want to take over somebody else's process to get it to execute our code.

So we talked a little bit about buffer overflows and how they work. We talked about shellcode, which is the code we actually want to execute. Now we're going to put everything together. How do we actually transfer control from the program's code to our shellcode? What we need to do is overflow a buffer with a string that contains what? What do we need inside the process's memory to execute shellcode? We need the shellcode itself, exactly. We need the shellcode somewhere in memory, and the memory region it's located in has to be executable. If it's not an executable memory region, you can't execute that code. Then we need to change the saved EIP on the stack to be the address where our shellcode is located. So essentially, most buffer overflow exploits are going to look like this: your shellcode first, then however many junk bytes you need to get up the stack to the saved EIP, and then four bytes for the address of the shellcode. So let's put it all together. Let's use a super simple program.
We just have a buffer named foo. Then we're going to copy argv[1] into foo. What's argv[1]? The first command-line argument. So when we run this on the command line, the first argument that we pass in, that string, will be in argv[1]. And then we'll return 10. I'm not going into the details of the whole compilation process; you should be able to more or less do this yourselves, so we can just walk through it.

Main looks like: push EBP, move the stack pointer into EBP, subtract 0x3c from ESP. So this is all the prologue of the function, right? Setting everything up. After that we have move EBP+0xc into EAX. What's at EBP+8? argc, the first argument. What's at EBP+0xc? The second argument to this function, argv. The type of argv is a pointer to a pointer; it's a character pointer pointer. So we're moving what's at EBP+0xc into EAX, which copies the address in argv into EAX. We then add 4 to EAX. Why are we adding 4 to EAX? Yes, we're trying to get the second element: we're accessing argv[1]. When we add 1 to a pointer here, we move it forward 4 bytes, because pointers are 4 bytes. We then dereference EAX and move the result into EAX. Here we have the parentheses, which, remember, mean dereference. So now we go to wherever EAX points, and what should be there? It should be argv[1], right? So this should be an address that points to the string we passed in as part of argv.

Then we move this EAX value, this string pointer, to ESP+4. Why ESP+4? Yes, we're setting up the arguments for strcpy. argv[1] is the second argument, so it goes on the stack first. We can assume the next thing we'll do is take the address of foo and put it as the first argument, right at ESP. And if we look, that's exactly what happens. We load the effective address of EBP minus 0x32 and copy that into EAX. As we said, this is where foo is located, right? That's where foo is located.
We know it's a local variable because it's at a negative offset from EBP. We move that into EAX, and we move that onto the stack. Finally, we call the strcpy function, which we'll treat for now as a black box. Then we move 10 into EAX, and we do a leave and a ret. Pretty simple program. Any questions?

Why is the compiler moving the address of foo into EAX and then EAX into ESP? Can't we just move the address of foo directly to where ESP points? I don't think you can do two memory references in one move. You can't go from memory here into memory there; you always have to move it into a register first and then put it back into memory. You may be able to do a push. I don't know if you can only push registers or if you can push memory locations. If they hadn't already pre-allocated this stack space and you needed to actually push it onto the stack, you may be able to do a push. No, no, no, that still won't get you the address. That's why you need the load effective address. The load effective address is actually performing a computation: it takes the value inside the EBP register, subtracts 0x32 from it, and puts the result in the EAX register. Any other questions on this? It's super easy.

So you compile it like this. You throw it in a... So this is actually how most of the programs are compiled on the HANA server. Basically, we are eliminating all of the protections that we haven't talked about yet; we're taking it back to like the mid-90s. So we compile this, then we can run GDB on the program that we output here. You'll notice I didn't put a source file here, so you can obviously put that code into a file. Then we break on the address of main. So this puts a breakpoint exactly at 0x080483fd, which is the first instruction of main, as we'll see.
Then we're going to try to throw some shellcode in here. We run the program in GDB, and the first argument we're passing in is what? What is the program going to read when it asks for argv[1]? Is it going to get python, space, -c, space, print, and then all of that? No, it's going to be the output of that. Why? It's surrounded by ticks. Backticks, yeah. Backticks are command substitution: execute this command, and whatever its standard output is becomes the argument at this location. This is actually a super cool way to pass non-ASCII characters in the command-line arguments of a program. So here I'm running a Python script. The -c means don't execute a Python file, just use this string as the Python program, and all I'm doing is printing out these characters. The nice thing, and other languages do this too, Python isn't the only one, is that if you write \x followed by two hex digits, it prints the character that has that hex value. So it prints \x31, \xc0, \x50, \x68, and, assuming we have no nulls in here, the string inside the single quotes will be the input, the argv[1], to our program.

Instead of using load effective address, if we use move, what's the difference in semantics? Move dereferences. Move would take the four bytes located at that address; right now there's nothing there, so let's say zeros, and it would move zero into EAX. Load effective address does not do a dereference. That's why it's called load effective address, right? It doesn't actually move and dereference something. It takes the value inside the EBP register, subtracts 0x32 from it, and copies that value into EAX. Questions on this part?

Don't strings work differently in Python 3? Yeah, I know, you're all doing Python 3.
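For anyone following along with Python 3 rather than the Python 2 used in the lecture, here is a minimal sketch of why the print trick behaves differently. In Python 3, print() encodes text before writing it, so a character like \xc0 would typically leave a UTF-8 terminal as two bytes. Writing a bytes object to sys.stdout.buffer avoids that. Only the first four shellcode bytes read out above are used; the helper name emit is made up for this illustration.

```python
import sys

# First four shellcode bytes read out in the lecture; the rest is not reproduced here.
prefix = b"\x31\xc0\x50\x68"

def emit(raw: bytes) -> None:
    """Write raw bytes to stdout with no text encoding and no trailing newline."""
    sys.stdout.buffer.write(raw)
    sys.stdout.buffer.flush()

# What the same characters turn into if they are treated as text and encoded as UTF-8,
# which is what print() would do on a UTF-8 terminal in Python 3.
as_text = "\x31\xc0\x50\x68".encode("utf-8")

# \xc0 became the two bytes \xc3\x80, so the argument would be one byte too long
# and would no longer contain the intended opcode bytes.
print(len(prefix), len(as_text))
```

On the command line, the equivalent one-liner would be something like python3 -c 'import sys; sys.stdout.buffer.write(b"\x31\xc0\x50\x68")' inside the backticks.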
My students are trying to convince me to upgrade, to put parentheses around my print statements, but that's about it. I'd still need to deal with the strings and the byte strings and stuff. I'm sure it would be better in the long run, and at some point I will. I use 2.7. There you go. You can just change python to python2, and then you're good.

So, when we run the program, once we type in this run, we'll hit the breakpoint right at 0x080483fd, which is the address of main. We can get all of this information from objdump, right? I literally ran this on our submission system. I compiled it, I ran objdump to get this, and then I did all the GDB commands to pause the program here and look at exactly what the stack looked like when this program started. So you can see that on this run of the program the current stack pointer is at 0xbffff69c, which is right here. So what is it that's at that location? Saved EIP. It's the return address into whoever called main, right? As soon as the function is called, there was a call to get here, which means that whatever is right at the stack pointer is going to be the saved EIP. So this is our goal. This is what we're trying to overwrite. What's above that? argc. And what's above that? argv. So we can see all the things there. Any questions on this?

It's not always going to be like that, right? The addresses, no. The addresses are going to change, and we'll talk about that. No, I mean, is it always argc followed by argv? Well, look at the main function. If I have a main function that looks like this, it had better look like that. Because with the cdecl calling convention, if you're going up the stack, the arguments go left to right. So the first argument will be argc, and then above that will be argv. And if there are more arguments?
And then more arguments will be above that. So if I had the environment pointer, envp, that would be the third argument, above that. And then it will, of course, depend on the function. Main usually has standard arguments, but for any function you write, when it gets called, the thing right at the stack pointer is the saved EIP, and above that are all of its arguments. And that is constant. I mean, that is 100 percent. Okay.

So when we execute this, we push EBP, so we store zero onto the stack, and the stack pointer moves down. So now we've almost set up our frame. Next we move the stack pointer into the base pointer, so now we'll have a red base pointer and a blue stack pointer, because we're setting up our stack frame. So we saved the previous base pointer, we copied the stack pointer into the base pointer, and now we're going to create space for our local variables. So what is 0x3c in decimal? Can somebody do it, so I don't have to switch programs? 60. 60? Okay, cool. So we move down 60 bytes. I think I did this right. The only things changing here are the stack pointer, whose value is now 0xbffff65c, and the instruction pointer, which now points to the next instruction.

On the test, will we have hex numbers like this? Will we have some method to convert them? I mean... Okay. So we'll just have to do it on paper? Nope, okay. Okay.

All right, so the next thing we're doing is moving EBP+0xc into EAX. So here's EBP, right? We have 4, 8, 12; 0xc is 12. So we're copying this value, remember? And we're doing a dereference here with this move. This is the important difference between this and a load effective address, right? We have very similar instructions at this line and at this line. But on this one, we're moving the value that is at EBP+0xc.
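The prologue arithmetic above is easy to check with a few lines of Python. This is only a sketch of the register bookkeeping, not an emulator. The entry value of ESP (0xbffff69c) is inferred from the EBP and ESP values read out in the lecture, so treat it as an assumption.

```python
MASK = 0xFFFFFFFF  # keep values in 32-bit range

esp = 0xBFFFF69C             # assumed ESP on entry to main: points at the saved EIP
esp = (esp - 4) & MASK       # push %ebp      (saved EBP now sits just below saved EIP)
ebp = esp                    # mov %esp,%ebp
esp = (esp - 0x3C) & MASK    # sub $0x3c,%esp (60 bytes of locals and argument slots)

argv_slot = ebp + 0xC        # where main's second argument, argv, is stored

print(hex(ebp), hex(esp), hex(argv_slot), 0x3C)
```

The same approach answers the hex conversion question: int("3c", 16) and hex(60) convert in either direction.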
So EBP+0xc is whatever 0xbffff698 plus 0xc is. We take the value there and copy it into EAX, so EAX will now be 0xbffff734. This is argv, so we're copying argv into EAX. We then increment it by 4, because we're doing pointer arithmetic. Remember, argv points to the first entry, right? But it's a list. It's a character pointer pointer; it points to a list of character pointers. So to get to index 1, we have to increment the pointer 4 bytes from where it currently is. So we add 4 to EAX, and you'll notice that these addresses are off the diagram, somewhere further up the stack. I guess I could have drawn that, but it would be too much. It exists up there. Now we dereference EAX and copy whatever is at the memory location inside EAX into the EAX register. So here we look and say, hey, what's at memory location 0xbffff738? Copy that value into EAX. We do that and we can see it's 0xbffff87b.

And here are some cool GDB commands. In GDB syntax, x means examine. Examine does a pointer dereference and looks at a memory location, so: examine the memory location on the right. The part after the slash tells it how to interpret the data: as a hex value, as a string, as an integer, and at what size; you can give different sizes. So x/wx means print out one word in hex. You could put x/20x and it would print out 20 values from the memory locations starting at 0xbffff738. Super useful for analyzing and looking at the memory of a running program. And so we can see that, yes, GDB is telling us that at this memory location is this value, and that's why this value is now in EAX.

Then we move EAX into ESP+4, so we're moving it onto the stack here. Now we load the effective address: we calculate EBP minus 0x32 and put that into EAX. So EBP minus 0x32 is... I put it here because there's a lot of space here.
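To make the mov versus lea distinction concrete, here is a toy model in Python. It is a sketch, not an emulator: memory is a dictionary holding only the two 4-byte values read out in the lecture, and the helper names mov_load and lea are invented for this example.

```python
# Toy 32-bit memory: address -> 4-byte value, using the values from the GDB session.
memory = {
    0xBFFFF6A4: 0xBFFFF734,  # EBP+0xc holds argv
    0xBFFFF738: 0xBFFFF87B,  # argv[1]: pointer to the argument string
}
ebp = 0xBFFFF698

def mov_load(addr: int) -> int:
    """Model of a mov with a memory source operand: it reads memory (a dereference)."""
    return memory[addr]

def lea(base: int, disp: int) -> int:
    """Model of lea: pure address arithmetic, memory is never touched."""
    return (base + disp) & 0xFFFFFFFF

eax = mov_load(ebp + 0xC)  # mov 0xc(%ebp),%eax   -> argv
eax = eax + 4              # add $0x4,%eax        -> address of argv[1]
eax = mov_load(eax)        # mov (%eax),%eax      -> argv[1]
argv1 = eax

foo = lea(ebp, -0x32)      # lea -0x32(%ebp),%eax -> address of the buffer foo

print(hex(argv1), hex(foo))
```

Calling mov_load(ebp - 0x32) instead of lea would try to read whatever is stored in the buffer, which is exactly the dereference that lea avoids.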
The stack is not to scale. There's a lot of stuff in there, and somewhere within it we have 0xbffff666. But exactly how many bytes is that from the base pointer? 0x32, which is how much? 50. That's exactly 50 bytes below, and I believe it's actually because of one of the arguments that I passed to GCC. There's an alignment argument that caused it to put the buffer exactly 50 bytes below instead of adding padding or additional space there. Remember, the buffer was size 50. Then we move EAX into ESP, so now we have the value here.

And this is another way to examine memory: you can print it out as a string. What's at 0xbffff87b? We can see that there is this string. So why does it look so weird? It's not ASCII characters there, right? It's actually just bytes. You can actually do something really cool. You can do x/20i, lowercase i, and GDB will interpret whatever is there as x86 instructions. That's a cool way to debug your shellcode: you run that on the address, and you should see the shellcode instructions that you assembled actually sitting there. That's a super helpful technique if you run into problems; I've seen super weird problems going on.

So now we're about to call strcpy. What's the source argument to strcpy, the first or the second? The second argument. And in this diagram, which memory address is the second argument? 0xbffff87b. Yeah. First argument, second argument, right? And what is this going to do? It copies one byte from 0xbffff87b into 0xbffff666, then it increments both pointers by one, then it copies the byte at 0xbffff87c into 0xbffff667, and it keeps doing that until it hits a zero byte, which is probably right here. Right? Because that's how GDB knows where the string ends.
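The byte-by-byte copy just described can be sketched in Python over a bytearray standing in for process memory. This is an illustration of the loop only: the offsets 0 and 64 are toy stand-ins for 0xbffff666 and 0xbffff87b, and the 33 A's are a placeholder for the 33-byte argument.

```python
def strcpy(memory: bytearray, dst: int, src: int) -> int:
    """Copy bytes from src to dst up to and including the first zero byte.

    Returns the number of bytes written. Like the C function, it never
    checks how much room there is at dst.
    """
    copied = 0
    while True:
        byte = memory[src + copied]
        memory[dst + copied] = byte
        copied += 1
        if byte == 0:
            return copied

# Fill the toy memory with 0xff so the written null byte is visible.
mem = bytearray(b"\xff" * 128)
DST, SRC = 0, 64                           # toy offsets, not real addresses
mem[SRC:SRC + 34] = b"A" * 33 + b"\x00"    # placeholder 33-byte string plus terminator

written = strcpy(mem, DST, SRC)
print(written)
```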
It just prints out characters until it hits that zero byte. So what do we expect to happen? The string is copied into that buffer. Am I doing a buffer overflow here? Am I going to take control of this process? No. Why not? The number of characters passed in is too small. How many characters do I need to get from the buffer to the saved EIP? And how do you know? 50, plus the 4 for the saved EBP. Where do you get 50 from? From the source? No, wrong. Well, it's also aligned, like you said, or whatever. Yes, this number right here tells you. This is why the source code can lie to you and you should not trust it. You always want to look at the assembly. Yes, you're correct, it is size 50. But really we know from looking here that the buffer foo is at EBP minus 0x32, and we know that's 50, so we know that from here until here there are 50 bytes. And what's after that? 4 bytes of saved EBP, which we don't care about and are going to erase, and then we get to our beautiful saved EIP that we want to destroy.

So how many bytes on this stack get overwritten when we copy this? There's another cool thing here: GDB can call strlen and other functions. So this is print; instead of x, which is examine, meaning treat it like an address and show the memory there, print prints out the actual value of the expression. So print out the string length of this buffer. So how many characters did I write here? Is that 0x21? 33? 34? Why 34? There's a null byte at the end. Yes, it's always easy to forget the null byte. We're copying a string from one place to another, so strcpy copies all the bytes in the string and also places a null byte at the end. So we're technically copying 34 bytes in there, but this is all well within the buffer we have allocated. We have 50 bytes here.
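The counting argument above fits in a few lines of Python. This is only the arithmetic from the lecture written out; the variable names are chosen for this sketch.

```python
BUF_OFFSET_FROM_EBP = 0x32   # from the disassembly: lea -0x32(%ebp),%eax
SAVED_EBP_SIZE = 4           # the saved EBP sits between the buffer and the saved EIP
ARG_LEN = 33                 # strlen of argv[1] as reported by GDB

# Number of bytes that must be written before the next byte lands on the saved EIP.
bytes_to_saved_eip = BUF_OFFSET_FROM_EBP + SAVED_EBP_SIZE

# strcpy writes the string plus its terminating null byte.
bytes_written = ARG_LEN + 1

reaches_saved_ebp = bytes_written > BUF_OFFSET_FROM_EBP
reaches_saved_eip = bytes_written > bytes_to_saved_eip

print(bytes_to_saved_eip, bytes_written, reaches_saved_ebp, reaches_saved_eip)
```

With 34 bytes written and 54 needed before the return address, this first run cannot change where main returns.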
So we move 0xa into EAX, then we leave, which, remember, sets the stack pointer equal to the base pointer and then pops EBP. The culmination of that is that the stack pointer points here, and the base pointer gets whatever is on the stack as the saved EBP, which is 0. And then we return. What's the instruction pointer going to be? It's 0xb7e3faf3, right? And the stack pointer also moves up, and then the program goes on from there.

So what went wrong? Yeah, I didn't actually trigger a buffer overflow. I just threw shellcode into a program, right? We must overwrite the saved EIP on the stack with the address of the shellcode. That's the other critical component, because we want control to go not back to the function that called main; we want it to go to our shellcode that we just put onto the stack. So how do we do this? How big is the buffer, and how many bytes are there from where the buffer is located to EBP, the base pointer? 50, and 0x32. They happen to be the same, but they don't always have to be, right? So we have 50 bytes, then 4 bytes for the saved EBP, and then we're at the saved EIP. The buffer is exactly at EBP minus 0x32, so we need our 33 bytes of shellcode. Remember, we can forget about the null byte now, because we want to create a string that will not only inject our shellcode into the process but also overwrite the saved EIP. So we need our 33 bytes of shellcode, then 17 random bytes. Why 17? Because it needs to add up to 50, right? We're going to fill up that buffer. Do you really want to use random bytes? No. What if you get zeros, or shell characters like newlines? Use something simple like a's, or you can be fancy if you want; pick your character. A is the first letter of my name, so I like to use it, but that's just personal preference; almost everyone does this. Then we need 4 bytes of what?
Yeah, 4 bytes of more junk, which is going to become the saved EBP, and then 4 bytes for the address of the shellcode. Back in our example, what was the address of the shellcode? 0xbffff666. Cool. We don't have to compile it again; we can just debug it. So now we run this program with our shellcode and then 17 lowercase a's. This is literally the only time I've seen Python's string multiplication be useful. It's like the world's dumbest feature: yeah, we should definitely be multiplying strings by integers, because that makes sense from a type perspective. But this is the one time it comes in handy, and it's actually super handy, so I hope they never take it away, even though it doesn't make any sense. Anyway, we have the shellcode there, we have 17 a's, then I'm going to put bcde here as what's going to end up in EBP, the saved base pointer, and then we have our address, \xbf\xff\xf6\x66.

Just like before, but this time I'm going to put a breakpoint on the call to strcpy. We're not going to step through every single instruction up to the call; you should be able to do that on your own. So here we are at the call. What's different? Let's play a game, because we can spot the differences. You know those games with two pictures where you have to find what changed between them. Is the memory different? All of it? Did my program's location change? What about the stack? Do we all agree the stack changed? Or is it just a trick of your mind?
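Here is a sketch of how the argument for this attempt is laid out, using Python 3 bytes. The real 33-byte shellcode is not reproduced in this transcript, so a placeholder of 33 bytes of 0xcc stands in for it; that placeholder is an assumption for illustration only. The address bytes are written in the same order they were typed on the command line in this attempt.

```python
SHELLCODE_LEN = 33
shellcode = b"\xcc" * SHELLCODE_LEN   # placeholder bytes, NOT the lecture's shellcode
padding = b"a" * 17                   # fills the rest of the 50-byte buffer
saved_ebp = b"bcde"                   # lands on the saved EBP
address = b"\xbf\xff\xf6\x66"         # lands on the saved EIP, in the order typed

payload = shellcode + padding + saved_ebp + address

# Offsets inside the payload correspond to offsets from the start of foo.
print(len(payload))        # total bytes, before strcpy adds its null terminator
print(payload[50:54])      # what ends up over the saved EBP
print(payload[54:58])      # what ends up over the saved EIP
```

The payload contains no zero bytes, which matters because strcpy stops copying at the first one.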
0xbffff646 used to be 0xbffff666. Yeah, so it seems everything moved by 32 bytes. What about 0xb7e3faf3, I think that was the value? Yes, that value is exactly the same, the value of where to return to. That makes sense, because if you looked into it more closely you'd see that it's in a library function, and the library always gets loaded at the same location in memory. So our code doesn't change and the library code's location doesn't change.

You've detected that the stack changed. Why? We put a longer argument above the main function, so it pushes everything down. It actually is above. I'm at the main function here, and I have my arguments here: argc, the argv vector, and the environment pointer. So where is that data? Where is the string that I pass in? This is the address of argv[1], but where are the actual bytes located in my process's memory? On the stack, above me, right? If we go back to that diagram we had of the stack, we said that all of the environment variables and the argv parameters are put on the stack, and then there's the environment pointer, then the argv pointer, and then these arguments. So what did we do differently between when we ran it then and when we ran it now? We know exactly how much longer the argument is: it's 17 plus 8. So if you think about it, when the operating system created this process and set everything up to call main, it had to put more data at the top of our stack, in argv, wherever the arguments live. It had to put those extra 17 and 8 bytes at the top, which pushed the entire stack down.

What else lives in the environment? This actually doesn't live in the environment; this is in the argv vector. But all of the environment variables are there too. So what else is in there? There's everything passed in to the program, so the argv vector, and there's also... is there anything sensitive? Yeah.
Before we get to that, though, one thing: env prints out the environment. There we go. What kind of things do I have here? I have my shell, I have which user I am, I have this LS_COLORS thing, I have the fact that I'm running this as a sudo user, I have my path, and I also have my present working directory. So this means that my environment will change depending on where I run the commands from, which means that the stack will change depending on where you are running the program from and what arguments you are passing to it. It's something that can be incredibly annoying when you're trying to do a buffer overflow and get it very precise. We'll talk about ways to deal with that.

So when I copy this over now, instead of just filling the buffer, I'm overflowing it with my shellcode at this address, 0xbffff646: shellcode and then 17 a's. The first time I did this I did it wrong. I calculated it based on 32 instead of 50 or something. I think I had... whatever, it doesn't matter. This should be right. It should be 17, right? Let me do this now: 33 plus 17, yeah. Yes, that's right. Oh, that's right, I was using 0x21 and I used 21 as the number of bytes in the shellcode. Look at that; you never would have known.

So what else changed? Yeah, we can see that the saved EBP changed. And how did it change? What was the value that I put on the command line for EBP? bcde. What's the byte for b? 0x62. By process of elimination, let's see, c is one more, 0x63, then 0x64 and 0x65, right? But I wrote them as bcde on the command line, so how did they end up when I'm interpreting what's at 0xbffff678 as an integer? Which one is the most significant byte there? 0x65 is the most significant byte, then 0x64, then 0x63, then 0x62. This is where the endianness comes in and messes everything up. We're writing this string going up, right? We're writing one byte at a time, starting at 0xbffff678. We first
write the byte for lowercase b there, which is 0x62; then at 0x79 we write 0x63, at 0x7a we write 0x64, and at 0x7b we write 0x65. But when you read that as a number, the most significant byte is at 0x7b, so that's why it shows up in hex as 0x65646362. So what did we do wrong with our address? It's backwards. What was the address we wanted to jump to? 0xbffff666. It's in exactly the opposite order from what we wanted, and this is directly because of the endianness of the system.

So should this work? No, and there are two reasons why. The first one is what we just talked about. What address is this going to try to execute from? 0x66f6ffbf. There's probably nothing allocated there, so it's going to segfault. What's the second reason it's not going to work? Yeah, the shellcode doesn't start there anymore. 0xbffff666 is going to be somewhere in the middle of the a's or something weird. I don't know exactly what it's going to be, but it's going to jump into the middle of things, and I need to jump exactly to the start of the shellcode to get this right. So if we go forward, it will get here and it will say exactly that: hey, I couldn't access this memory, I tried to access 0x66f6ffbf. If you do info registers, it will print out all the registers and you can see that EIP has this value. So we almost got it correct, right? But the problems were, A, the stack changed, and B, we didn't take the endianness into account.

So let's fix it. Now we use \x46\xf6\xff\xbf, because we're trying to get to address 0xbffff646. So is this going to work if we run it from the same directory? Yes, hopefully, assuming I'm running the same GDB session and I haven't changed anything. Also, your environment is local to you, but GDB also has its own environment variables that it adds, so your environment when you're debugging in GDB will be different than on the command line. Oh, also: what if you run one of those levels, like level 2, in GDB and you exploit it there? Are you done? Why not? Yeah, GDB drops privileges. When you try
to use ptrace, which I believe is the debugging interface for programs on Linux, it drops setuid. So you can debug a setuid binary, but it won't have the permissions of that application, which makes sense, because when you're debugging you can literally do anything you want. Okay, so let's see this in action.

So we get to here. How do things look? Let's go back. There's also something else in this diagram that changed after we made the copy. What was that? argc. From what to what? 2 to 0. Why? That seems weird. It's the null byte again. We copied our whole string, and our string was exactly long enough for the shellcode, the a's, the saved EBP, and the saved EIP, but there's always that one extra byte that gets copied, and it gets copied to exactly this byte. And remember, because of the endianness, the least significant byte is there, which was a 2 and now it's a 0. So that was the other important thing I wanted to point out here. This can actually mess you up when you're doing buffer overflow stuff.

The other thing to think about is this: the strcpy does the overflow. Is it game over? Have I started executing code yet? No. What instructions do I need to execute in order to get control of this program? The leave and then the return, right? So there can be weird things that happen if there are other local variables here that you clobber, pointers that get used by the program, which then tries to dereference them and segfaults. There are all kinds of super complicated things that can happen between when you've done everything you're supposed to do to exploit this and when you actually take control. So those are other things to look out for.

All right, now let's look at this program. We're going to copy from 0xbffff861, which is now our hopefully good string, into 0xbffff646 on the stack. Now, between where the overwrite happens and where the return happens, are we worried that we
clobbered this EBP value? There's a saved EBP value on the stack. Is that going to crash everything? Why not? Exactly right. Does our shellcode use EBP? No. Our shellcode calls exec; we're going to execute a completely new process here, right? Everything's going away, including the EBP register. So we don't care that we got rid of that value. But this is the key thing: even though we've overwritten memory, we still haven't taken control of this process yet. It's still going to move 0xa into EAX, then do the leave, which is fine; that puts our 0x65646362 into EBP. And now it returns, which means our instruction pointer moves from here to here, the very first byte of our shellcode. Now our shellcode runs, and we should be getting a shell.

So what was the key to doing this? There are a couple of keys. One is making sure you're actually getting... I would very much urge you to think like a surgeon. You want to be super precise; you want to provide only as many bytes as you need in order to exploit this. That is super cool, right? Anyone can come in with a sledgehammer and break something. But somebody who can be very precise and make just the incision they need to get what they want, that to me is much cooler. So we need to know exactly how many bytes, and we can know that just by looking at the objdump output. The code tells us exactly how many bytes we need. Our local variable, our buffer, will be at some offset from EBP, and that's what we use, plus four for the saved base pointer, and bang, we're right at the saved instruction pointer. We need that.

What's the other critical piece of information? Why did it take us three tries to do this? We need to know where to point EIP. That is a critical issue. So we need to either guess or know. There's actually one really cool way to do this: you can
actually write a wrapper program that calls execve on the program you're trying to target with a null environment pointer. That ensures that on every run the environment will be exactly the way you want it to be, and you can pass in your argument just like there, as a C string. So it actually becomes really simple that way, and it will be repeatable 100% of the time. You can also kind of guess. If you noticed, even though we added more arguments, we were still around 0xbffff646, and that's because of the memory layout we saw: the top gig of memory is reserved for the kernel, so the process's memory tops out at 0xbfffffff and goes down from there, and that's where the stack starts. We're assuming, of course, no stack randomization here. So you can do it in GDB, and that can give you an idea of where to go. But from what we've seen, we need to be super, super precise: we have to hit the first byte of our shellcode in order to execute it. So, yeah: the same environment is super important, and the size of the command-line parameters is really important, because those will change the stack. You don't have to guess the offset; you can know the offset exactly by looking at the assembly code. That's the key.

So we can use a super cool trick. Essentially, the problem is that we need to hit this one byte, the very first byte of our shellcode, to start executing. So the idea is, why don't we increase that window from one byte to n bytes, like 10 bytes? What do we need from these bytes? Let's think. If we used random bytes and then jumped somewhere in them, say to the 10th byte, what's going to happen? It'll interpret those bytes as x86 instructions and probably crash. Or maybe not, I don't know actually, but it could do something random, which is not what we want; we want it very controlled. You're on the right track. There's a specific instruction on x86 called a NOP. It's
hex 90. It's so common in x86 that I've memorized the exact opcode: it is hex 90. It is literally a no-op: it tells the processor to do nothing. So if we prepend 100 bytes of NOPs to our shellcode, now instead of landing exactly on that first byte, we only need to land somewhere within that NOP sled. That's why it's called a NOP sled: as soon as we land there, we just slide right into our shellcode.
So the problem is that the padding is at the end. We've got to think: when it's copied in, we need to jump right at 0xbffff646, which is the bottom of the shellcode, which is the very first instruction. The A's are at the end of the shellcode, so they're not going to give us anything, and they should never be executed anyway. So we need to somehow extend the front of the shellcode down so that it makes this nice sled, where if you land anywhere on it you slide right into the shellcode.
Yes, but why would you ever put A's at the end? If you put NOPs at the end, it'll hit the A's and then try to execute 0x65646362 as an instruction, because it's executing up. Just get rid of all the A's, because it's advantageous to put the NOPs at the front. Right. Remember, we execute up in this diagram: we start executing the shellcode and then we execute up. I just need to flip the shellcode up. I see what you're saying. Okay: flip it, put the A's below the shellcode, and turn them into NOPs. Yes, that's definitely what we're doing.
Yes, that would depend on the buffer size. Yes, very important point: it depends on the buffer size. In this case we only have 50 bytes, and the extent of the sled is about 29; we'll talk about that in a second. That's still better than one; it still gives us a little bit of leeway. And how many times do we need to be correct?
Once. Once; we only need to be correct once. You could write a super easy Python script that just keeps calling the program and changing the address you're trying, stepping it by, let's see, 16 bytes, or however long your buffer is, to look for that NOP sled. That's a cool technique. You don't want to start too far down; you can start pretty high and then just keep going down, and eventually you'll hit it. Well, my sled is 29.
So this is exactly the technique that we use: a NOP sled. It's super cool, \x90. So basically you'd have the NOPs, and you'd fill the buffer all the way up to the return address, so you can maximize your NOP sled. If you're being super imprecise you could just say "overwrite the buffer address", but I really hate this wording; I don't want to change it now, but what you want to overwrite is the return address, the saved EIP on the stack.
So, exactly this problem: what about small buffers? What is the requirement for our shellcode? Is that one of the requirements? What are the requirements?
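The sled layout described above (NOPs first, then the shellcode, then the four bytes that land on the saved EIP) can be sketched in C. This is only an illustration under assumptions: `build_payload`, its parameters, and the guessed address are hypothetical names and values, not something from the lecture, and the "shellcode" is expected to be placeholder bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOP 0x90 /* x86 single-byte no-op */

/*
 * Hypothetical helper: lays out [NOP sled][code][4-byte address].
 * ret_offset is the distance from the start of the buffer to the saved EIP,
 * which is read off the disassembly rather than guessed.
 * Returns the payload length, or 0 if the pieces do not fit.
 */
size_t build_payload(unsigned char *out, size_t out_cap, size_t ret_offset,
                     const unsigned char *code, size_t code_len,
                     uint32_t guess)
{
    if (code_len > ret_offset || out_cap < 4 || ret_offset > out_cap - 4)
        return 0;

    size_t sled_len = ret_offset - code_len; /* everything left over is sled */
    memset(out, NOP, sled_len);              /* sled sits at the lowest addresses */
    memcpy(out + sled_len, code, code_len);  /* code ends right before saved EIP */

    /* x86 is little-endian, so the low byte of the address goes first. */
    for (int i = 0; i < 4; i++)
        out[ret_offset + i] = (unsigned char)(guess >> (8 * i));

    return ret_offset + 4;
}
```

Any `guess` that points anywhere inside the first `sled_len` bytes works equally well, which is the whole point of the sled. Because the bytes are copied with a string copy, a real payload would also have to avoid zero bytes, since the copy stops at the first one.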
The shellcode needs to go into executable memory somewhere. It needs to be in the process's memory space, and that memory needs to be executable, but it doesn't have to be in that same buffer. I mean, if the buffer is big enough then yeah, we want to use it; it would be silly not to. But we can also use the environment. As we said, we can put variables into the environment, and we control the environment that a process executes in. That was the whole basis of PATH attacks: the process uses the environment and the attacker controls the environment. So we could put a huge NOP sled, as big as we want, plus the shellcode into an environment variable. Then the overflow is basically all junk followed by just the address we want, and we have that address point inside our NOP sled somewhere.
This can actually make your life super nice, because with an environment variable your NOP sled can be as big as you want. You can literally have it be like a gig of just NOPs, which makes guessing a lot easier. Don't do that on the server, please. Like 512 bytes is totally fine, or 1024, or 2048; as long as we're less than a meg I think we're fine.
There's also the whole thing where you can put it in other buffers. This trick is great for exploiting a local process, a local Unix process that's setuid with elevated permissions. But if the process is running remotely and you're accessing it over TCP, you can't control the environment variables there. So maybe the buffer is too small to put the shellcode in, but you're interacting with this process: maybe it's part of the user name you log in with, and you can put the shellcode there, or maybe you can put the shellcode somewhere else in memory that's executable.
Why is the environment variable space executable? Because the stack is executable, and it's on the stack. Stacks aren't usually executable, not these days. How do you do it in
general? Very cleverly. You have to understand these techniques to do the advanced stuff, so we're walking through that. We're first doing the standard buffer overflow. Then we're going to get into the fact that the saved EIP is not the only thing you can overwrite, so there are a lot of different ways to exploit a process to get control. And then we're going to do return-oriented programming, which is the advanced way of getting around ASLR and non-executable stacks.
So, very generally, how do we classify all of these? If you talk to security researchers, we don't really think about buffer overflows so much anymore. They still happen, though they're not quite as prevalent. We think of them more broadly as memory corruption: any vulnerability where an attacker can overwrite arbitrary memory in your process. A buffer overflow is clearly one, but there are other instances of this, and it depends on what is overwritten.
So what are we overwriting? Are we overwriting the return address, the saved EIP? There's actually a lot of cool stuff we can do with the saved EBP: we can overwrite the saved EBP to completely control the process if we can't reach EIP. It's a cool one. We can overwrite a pointer to a function. C and C++ have the ability to pass around function pointers, so if we overwrite function pointers we can control the program.
Pointers to data: this is actually a really cool, more recent area that is getting attention. The idea is, let's say there's a variable on the stack that says whether you're the admin or not, like a 0 or 1. Normally you're not the admin. But if you give the program a big enough buffer, maybe you can't touch EIP or EBP or anything, but if you can overwrite that value and say yes, 1, I am an admin now, then from then on you're an administrator. So you haven't corrupted the control flow of the program, but you've corrupted the data of the program in order to get the program to execute a different control flow that you want. Variable values and pointers to data are a little the
same. So one thing is that you classify these all under what is overwritten: what is the thing that you're changing? Then, what causes the overwrite? Is it unchecked copying, the classic overflows? Maybe you can't just copy everything, but maybe you can index out of bounds of an array. There's integer overflow: if you add too much to an integer it wraps around and becomes negative, so by using it as some kind of offset you can get memory corruption out of that. There are loop overflows. Also, where are we overwriting? Up to this point we've only looked at the stack, but there's also the heap, the BSS segment, and the data segment. These are all places where you could overflow, overwrite, and corrupt memory, and they can all lead to security vulnerabilities. BSS is the uninitialized (zero-initialized) data section, like global variables with no initial value; the data segment holds the initialized ones, and the global offset table we'll see later.
So this is where it gets really tricky. We said that to understand whether something is a security vulnerability, we need to understand the program. It's pretty clear that for any program, if you can control the saved EIP value, that's a clear security vulnerability. Unless, I guess, your program is literally "execute random code in my process space"; then you don't really get any extra privileges. But most of the time that is clearly a vulnerability. This is where it takes you thinking: I can see there's a vulnerability here, so if I corrupt this memory, what does that actually allow me to do? This is where your critical thinking skills, analyzing the program and understanding what it does, really come into play. The buffer overflow is super simple and stupid: just smash it, change it, execute code. Changing one variable to get the program to do what you want requires much more understanding and skill.
So for instance, what if you overwrote a pointer to the name of a file that was going to be displayed to the user, and instead of being /tmp/t.txt it became /etc/shadow, and now it's
outputting the /etc/shadow file, which I can then try to crack. Or an integer value: maybe there's a way I can overflow some value passed to setuid to change it to zero. Saved base pointer: we'll talk about this later, so I'm not going to talk about it now.
Function pointers: this is changing any function pointers. Normally, in a lot of programs you don't actually use function pointers. Is there any reason C code uses function pointers? It happens, and there are good reasons to do so, but in your day-to-day life, if you pick an average C program, chances are it doesn't have function pointers. But think about every process: how does dynamic linking work in C on ELF binaries? With static linking, the code is included. What about dynamic linking? It's decided at run time and loaded at run time, and there's a global offset table entry that holds the pointer to where that function is actually located. So if we went all the way back to the program we were looking at, in main, when we saw that call to strcpy, that's actually not strcpy itself. It's jumping to a trampoline (the procedure linkage table stub) that gets the actual address from the global offset table and then jumps to it. So there are function pointers in memory that we can overwrite in essentially any dynamically linked program to control the program's control flow.
Put something in there? Yes. So for instance, if we have a program where we could overwrite the global offset table entry for printf, we could put our buffer with our shellcode in memory and have that call jump to our shellcode. Is that also called the registry?
No, the global offset table. Okay, so for what we can overwrite, one cool thing is long jumps. What are long jumps used for in C? Does C have exceptions? No, not really, right? It just has signals. When you get a segmentation fault you get a signal, and your process can try to handle that, but fundamentally there's no way to say "hey, throw an error". So how does error handling work in C? Return like an error code? Yes, so every function has to return an error code. Usually how this is done is that a negative value is returned if there was an error, and then you have to interrogate another variable (errno) to figure out exactly what the error was.
But in a language like Java, let's say you know that the function you're calling is going to call some other function that's going to open a file, and that might throw some exception. You can put a try/catch block around that, and when it throws that exception, execution goes all the way from where it was thrown back to your try/catch block. So each function in between is not returning the error value. C doesn't have that by default in the language, so they had to add it in the standard library: long jumps.
Long jumps are a way to essentially do this kind of exception handling: hey, there was a problem, jump back to whoever set this long jump. There may be three or four things on the call stack since then, but that's fine; jump all the way back to there. So it's similar to a goto, but it restores the program state to whatever it was when you called setjmp. The pair of calls is setjmp and longjmp: setjmp saves the context of the program, and then somebody calls longjmp and jumps back to that setjmp. It's kind of like fork: after a fork you have two processes running, and the difference is in the return value. Here, the first call to setjmp will set up the context, and then when somebody calls
longjmp, it jumps back to the setjmp return, but the return value has changed, so you know that you're coming back. So setjmp, called with this context variable, returns zero, and a longjmp goes back to it. When you call longjmp with an environment and some integer x, x will be returned from that setjmp. It's used to perform exception and error handling, threading, all kinds of cool stuff.
So let's look at an example. Here we have just a main program, and we're using a jump buffer. I don't have the includes here, but if you look at man setjmp, the man page will tell you exactly which things to include. One of the things that adds is jmp_buf. You can think of it as an opaque object that is storing our context. So we're creating this on the stack and calling setjmp. The first time you call this, which branch is going to execute? The else branch, right, because it's going to return zero; it's not going to print out an error, because i was never set. Then it's going to call this other function f1 with the environment passed in. f1 says: if there is an error, then longjmp(e, ERROR1); else call f2, which is going to do stuff, check, and otherwise longjmp.
So the idea is, whenever this longjmp with ERROR1 is called with this e, the code will jump back up here, except the return value will be ERROR1. And if it's called from here, it'll jump back here and the return value will be ERROR2. No matter how many nested calls deep we are, we'll still jump back to that code, the stack will be clean, and the registers will all be put back to where they should be.
So when we call longjmp, if we look at the code, what is it actually doing? It's putting the value i into EAX. You have to put the return value in EAX because this looks like a function call returning, so the return value needs to
be in EAX. Then it's restoring the base pointer: it's accessing this jump buffer, and the base pointer offset inside it, through EDX. So it's restoring the base pointer, it's restoring the stack pointer, and then it's jumping to wherever the saved program counter is.
So fundamentally, here we have a structure on the stack that is storing the base pointer, the stack pointer, and the instruction pointer of where we want to go execute. If we're able to use a buffer overflow to overflow that environment, you can think of it as an extra saved EIP on the stack where that jump buffer is. If we overflow that and then trigger a longjmp, we'll jump to whatever code we want. It's more complicated, but this is one that you can definitely do, and it's very cool. So that's a different type of thing you can overflow.
Yeah, so you need to make sure sensitive data structures can't be overwritten. This is a data structure with which I can literally control the instruction pointer, the EIP, and the ESP, so it must be protected. Okay, so we'll continue this on Wednesday... make that Friday.
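The setjmp/longjmp example walked through above was only on the slides, so here is a minimal sketch of what a program of that shape might look like. `f1`, `f2`, `ERROR1`, and `ERROR2` follow the names used in the lecture; `run` and the two `fail_*` flags are hypothetical additions so the error paths can be triggered on purpose.

```c
#include <setjmp.h>

#define ERROR1 1
#define ERROR2 2

/* Innermost call: on failure, unwind straight back to the setjmp site. */
static void f2(jmp_buf env, int fail)
{
    if (fail)
        longjmp(env, ERROR2);
}

/* Middle call: either fails itself or hands the same context down to f2. */
static void f1(jmp_buf env, int fail_in_f1, int fail_in_f2)
{
    if (fail_in_f1)
        longjmp(env, ERROR1);
    f2(env, fail_in_f2);
}

/* Hypothetical driver standing in for main in the lecture's example. */
int run(int fail_in_f1, int fail_in_f2)
{
    jmp_buf env; /* the saved context lives on this function's stack frame */

    switch (setjmp(env)) {
    case 0:      /* direct return: the context was just saved */
        f1(env, fail_in_f1, fail_in_f2);
        return 0;
    case ERROR1: /* arrived here via longjmp(env, ERROR1) */
        return ERROR1;
    case ERROR2: /* arrived here via longjmp(env, ERROR2) */
        return ERROR2;
    default:
        return -1;
    }
}
```

Calling `run(0, 0)` takes the normal path and returns 0, `run(1, 0)` comes back through the `ERROR1` case, and `run(0, 1)` comes back through `ERROR2` from two calls deep. The layout of `jmp_buf` is implementation-specific, but the fact that `env` is an ordinary stack object next to other locals is what makes it a target for the overflow described above.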