Welcome to this last day of DEF CON. I'm happy to see that so many of you managed to get up this early on a Sunday morning. I've been asked to show you this part of today's schedule. As you can see, this is the first presentation, and I'm going to present and discuss some design issues for buffer overflow exploits, mainly focusing on small payloads. My name is Anders Ingeborn and I'm from Sweden.

Just before I begin, I'd like to give away some free stuff. If you ask me a question I cannot answer, then I'd better throw the microphone at you. Actually, I'll save the t-shirt for the one who asks a question that I can answer. I guess everyone can hear me, right? So few people here. Last year it was like four, OK. As you can see, I tend to lose them quite quickly.

OK, anyway: why would anyone be interested in buffer overflows? I can see two reasons. First, these attacks are very effective and very flexible. You can run your own code on the target system; you can run basically whatever you want if you manage to exploit an application. And if you exploit an application that uses an allowed protocol, say HTTP or FTP or something else that's allowed to pass, the attack will go straight through firewalls and IDSes. So that's the first reason: effective and flexible attacks. Really cool stuff.

The second reason anyone would be interested in buffer overflows is that they are very common. I read a report, two years old now, from the Oregon Graduate Institute of Science and Technology, I think, which stated that the buffer overflow has been the single most common computer security flaw for ten years. So that's another reason to be interested in these things.

And why would anyone be interested in me, besides the fact that I'm easy to trick t-shirts out of? Well, why would anyone be interested in small payloads?
The reason I myself got interested is that I was dealing with an exploit for an application that used a mismanaged bounds check. They were doing some kind of bounds check, but they didn't get it perfectly correct. So by playing with small payloads, I think you can add to the number of applications that are possible to exploit. As a side benefit, it turns out that the design issues I'll present make it possible to evade some intrusion detection systems, both network-based and host-based, and I'll get back to that later, of course.

But first: for those of you who aren't perfectly sure how a buffer overflow works, would you like a little reminder? Would you like a little reminder? Yes, at least one. I won't go into much detail this time.

OK, let's say an application does a string copy operation without a bounds check. That means the function will copy data from one place in memory to another and go on until it reaches a null byte. If it doesn't reach one, it will go on copying and write outside the memory that has been reserved for that variable, which means you can overwrite the stored instruction pointer and jump into your own code.

Excuse me? Speak louder? There's no volume control, but OK.

Here's a picture of it. Let's say you do a function call: the original instruction pointer is stored on the stack, space is reserved for some local variable, and then a string copy is performed without a bounds check. You overwrite the stored instruction pointer and then you can jump back into your own code, or wherever you want in memory. So this is pretty cool.

But let's say some vulnerable server performs a bounds check. Say it receives data over a network connection and implements some sort of bounds check on the receive call, but not in an internal copy function or such. Then we have a restriction.
Then we have a bounds check that we somehow have to get around to exploit this application. I see two possible ways of doing this. We can write a simple, small exploit where the payload itself is small enough to fit into this restricted memory and still do whatever kind of exploit we want to do. It's a little naive, but it's OK, and that's what we did last winter. Another approach, which might be better, is to change the design concept somewhat and make a double injection. That is basically what I'm going to present here today.

But first, let's have a quick look at the straightforward, simple, small exploit. It implies at least two functional requirements for the payload itself: it has to be able to listen for requests over the network, and it has to execute those requests as system commands. That's basically what we want to do when we exploit an application. We also have the requirement to keep the number of bytes as low as possible. Last time we did this, we came up with an example that was just above 250 bytes. We used two libraries, Winsock and kernel32. We used a datagram socket instead of a stream socket, because with UDP you don't need to do any listen or accept calls, and you save a few bytes. Then we went into a loop, receiving commands and executing them with WinExec from kernel32. If you want to have a look at this one, you can download it from SecurityFocus; it's about half a year old now.

The second approach, which I think is much cooler, is to change the design concept and do a double injection, and of course I will explain that on the next slide. Besides changing the design concept, we can also try to use the existing network connection and reuse the already loaded libraries to further minimize and optimize our code and get even smaller. So that's two parts: change the concept, and optimize. So what do I mean by double injection?
Let's say you have a server process that listens for connections over the network, and when it receives a request from a client, it calls an internal parse-and-execute function to, well, parse and execute it. It looks like this. The server is running, say at this place in the server code, and some client over the network calls the server. When the server receives the first call, the first command or request or whatever, it will call the parse-and-execute function and store the current instruction pointer on the stack, on top of its own stack frame. Stack frame number one belongs to the server function. Then execution jumps into the instructions of the parse-and-execute function. The parse-and-execute function gets a stack frame of its own, which is placed above the server's stack frame in stack memory, and then it performs a string copy call.

So let's say we have a restriction on the number of bytes for the first call from the client to the server, but no bounds check within the parse-and-execute function's call to string copy. Then the first payload we send through the first call will overflow within the second stack frame and overwrite the stored instruction pointer.

Excuse me? Yes, yes: in this picture the stack grows upwards, and variables are written downwards on the stack. On the next couple of slides I have written out the memory addresses so you can see them, but not in this very fine animation. It's the first PowerPoint animation I ever made. Well, anyway.

The second part of the double injection is when the first payload executes and listens for a second injection, a second call from the client. It reads the second payload and stores it higher up on the stack, in free memory at lower addresses than the first payload. When the first payload has received the entire second payload, it jumps up to it, and execution continues in the second payload.
We're still executing on the stack, and the second payload can then perform the actual exploit, and by that I mean open a shell or add an account or whatever we want to do to this host.

The benefit of doing this is of course that we get lower memory requirements for the first payload. And it's basically only the first payload in this approach that is interesting, because if the first payload is small enough, then the first stack frame, the one that belonged to the server function, will be preserved. If it's preserved, we might be able to do a clean return to the calling function. If we can do this clean return, the server function will not crash. If it doesn't crash, we get no log entry on the system. And if we get no log entry, we evade some host-based intrusion detection systems, mainly those that just parse log files.

Trick number two is to use the existing connection. That means we let the server open a new socket and receive the first payload as usual, and then we get the first payload to receive the second payload over the same session, using the same socket. The benefit is that we don't need to set up any new connection, and if we don't set up any new connection, we might evade some network-based intrusion detection systems, the ones that look for TCP handshakes or for unrecognized 'leet' ports on the network. We're using the same, still-allowed protocol and the same, still-allowed port number; it's the same session. That makes it harder for network-based intrusion detection to see it. Of course, we also get even lower memory requirements for the first payload, and that's a benefit too.

OK, I hope this sounds cool. But how can we do it? I have made a small proof-of-concept implementation that I'm giving away after this speech, or after DEF CON.
I think I will put it up on the post-DEF CON pages together with my slides if you want it. I have implemented a small vulnerable server that accepts connections, receives commands with this size restriction, and calls a vulnerable parse-and-execute function, and it's possible to exploit it. I will also provide, of course, a remote client with the exploits. They cannot be used against anything but this particular server, of course.

OK, I'm a little surprised that you didn't ask this before I got to the slide: how will we manage to use the existing connection? We have to find the socket descriptor and use it. The way to find it is by finding the accept call. The accept call can be found by disassembling the server, maybe by looking for an error message or something, then setting a breakpoint at that address and debugging the program, looking at where the return value from the accept call is stored in memory, because accept returns the socket descriptor for the session socket. There are plenty of disassemblers available on the internet; this is just one example, and that one's free. If you run a disassembler on the server that I'm providing, you will get this. If you look through the disassembled code, you will find an error message that is quite useful: it shows us that the accept call is probably that one. Then we know that we can set our breakpoint at this address. We debug the program and we get this: we see where the session socket is stored.

OK, so we have found it. Then, to use it, we need to know that the socket's position within the stack frame does not change. The stack frame's position in memory might change, but the layout within it does not. So if we know a fixed position within the stack frame, we can calculate where the socket descriptor is stored and reuse it.

This is the second animation, and it shows what the first payload basically does.
When the first payload starts, the base pointer has been overwritten and the stack pointer points towards the end of the first payload. The first thing to do is to change the base pointer and move the stack pointer upwards, towards free memory. Then I add some arguments for the receive function: I set a pointer to the buffer in free memory where we want to store the second payload, and then the socket, because I know that the distance from the base pointer to the socket will always be the same, and the base pointer is set during runtime. So I can always find the socket descriptor. That's using the socket.

To further minimize and optimize the code of the first payload, I suggest using the functions that have already been loaded by the server. The server obviously needs some functions from the Winsock library, after all it received the first payload over the network, and the addresses of these functions are stored at known places within the instructions of the server process, as long as it hasn't been relocated, while the library itself may be loaded at an unknown place in memory. This is what happens when the server calls a function in the Winsock library: the server goes through its jump table. If we disassemble the server code one more time, we find this. This is the jump table, and just by reading the code, or by debugging, we can deduce that these addresses correspond to these functions. So if we want to call the receive function, we can do it like this. Are you following?

This is just a picture of the first payload in memory. The addresses are printed to the left, and this is the code itself. As you can see, I start by subtracting at least as much memory from the stack pointer as I need to store the second payload.
Then I push the arguments for the receive function onto the stack, the new stack frame, and then I basically call the receive function through the jump table within the server. And then I jump to the second payload. Do you have any questions on this first payload? I have some pens.

Anyway, OK, so what to send the second time? I mean, you can send whatever you want, your code of choice. You don't need to obey any restrictions on the size of the code you send the second time. You can send code that opens a shell, or performs whatever action you want to do to exploit this server. And remember, you may still use the same socket, you may still use the jump table for the functions that have already been loaded, and you can also try to load your own libraries and look up new functions.

Another important thing concerning the second payload is that we do not need to XOR protect it, because it is written to the stack as raw data received over a socket. The first payload is written to the stack by a string copy operation, and as you probably know, it cannot contain any null bytes, because the string copy function would stop there, so we need to XOR protect it. But for the second payload, we don't need to bother.

The proof-of-concept implementation that I have made does not exploit this server in any particular way; it just confirms its success by sending a short message back to the client.

OK, so this is a picture of what the second payload does. When the second payload starts running, the base pointer still points to the base of the first payload, and the stack pointer points to the top of the second payload. The instruction pointer is of course at the top of the second payload, because we have just made the jump. I push the arguments for the send function onto the stack, and then I call the send function.
And then I finish up. When I'm finishing up, I want to do a clean return, because, as you remember, if I do a clean return I won't get any entry in the system log, and I can use that to possibly evade a host-based intrusion detection system. To be able to do a clean return, I need to reset the base pointer and the stack pointer to their originally intended positions with respect to the first stack frame, the stack frame of the calling function, and I need to reset the instruction pointer to point back into the server's instructions. Of course, if the parse-and-execute function is supposed to return anything special, I have to set that up as well.

Since I haven't changed the base pointer throughout the execution of the second payload, it still points to where it did before I started, so I can simply set the registers back. And since I know that the stack frame of the calling function is always the same size, I can calculate what to set the base pointer to. All of these stack frames may change position in memory, and I don't know their positions until runtime, but what I do know is that the size of the stack frame is always the same, and thereby I can calculate what to set the base pointer to. Then I can do a clean return back to the server's original instructions, with both the stack pointer and the base pointer reset to their original values and the stack frame of the calling function not disrupted in any way.

Do you have any questions? Yeah. You want a pen first? No, sorry. OK, the question was: can you do this without having the source code for the server? And the answer is yes, if you manage to read the disassembly. If you have the binary, you can still disassemble it. If it's only a remote situation, no, you need to be able to disassemble the server.
But since you're only supposed to evaluate your own programs, you will probably have access to the binary, right? You want a black one or a purple one? OK, cool. Any more questions?

The first payload overwrites the original return address, yes. OK, how do I know what to set the instruction pointer to for the clean return? I get it from the disassembled code, and it will be the same — excuse me, excuse me. I don't know if I got the question right, but let me try to answer something and see if it fits. When I disassembled the server, I could find the address of the call, and I know that the intended return address is the next address after it. As long as the server hasn't been relocated in memory, this address will be the same. Ah, you're taking pictures — thanks. As long as the server hasn't been relocated or anything, its base address will always be this 0x00400000. That's set when you compile your binary, and you will see it when you disassemble your binary, too. You want a purple or a black one? You don't want any?

Yes, Steven. Yes, I think this can be done on other servers as well, but I'm only providing a proof of concept, something I made myself, and I don't even know if it's legal to disassemble other people's programs here. It is probably legal to disassemble your own programs. I mean, there are a number of conditions that have to be met for this to work out: the first payload cannot be allowed to disrupt the first stack frame, and that may or may not be possible in a given case. There are some other conditions that have to be met too. But I think the concept, the idea, will be possible on other servers as well.

Yes, you. OK, the question was: how effective is a non-executable stack?
I would say it's quite effective, but one possible way to get around it might be to jump to an argument that is stored on the heap or something instead. Basically, I am kind of amazed that stacks aren't being protected from execution everywhere. You want the last one? Sorry.

OK, so I've run out of pens, run out of free stuff, and I'm basically running out of slides too, so I'm going to finish off. This is just the payloads in memory; as you can see, I just send this simple, stupid message back from the second payload to the calling client. This is the code for the second payload. It's basically just a number of push operations to push the arguments to the send function, then a call to the send function through the jump table of the server, and then resetting the stack pointers and returning. The return address in this example is hard-coded into the second payload; you can see it on the second-to-last line, to the right.

To summarize: I've discussed a double injection, and by a double injection I mean that we send one payload that uploads a second payload and then jumps to it, and the second payload executes the actual exploit, like a shell or something. I've suggested trying to use an existing network connection by finding an existing socket descriptor and supplying it to the receive and send functions. I have also suggested trying to use the existing, preloaded functions through the jump table of the original program. By doing this, it might be possible to evade a network-based intrusion detection system: there are no 'leet' ports and no new TCP handshakes, since we are reusing the existing network connection, and, at least for the first payload, there are no large amounts of data being sent.
By doing this, we force a network-based intrusion detection system into interpreting the application protocol in order to detect the difference between my payloads and the original application protocol. That interpretation will definitely increase the complexity of the IDS, and it will probably decrease its capacity and slow down analysis. Have you read Caesar's ASCII-encoding paper? If application-protocol interpretation by the network-based IDS is used as a countermeasure against this attack, that ASCII encoding can be seen as a counter-countermeasure.

So you can continue along this line: by not setting up any new connections, and by returning cleanly without a crash, there will be no log entry, and that forces a host-based intrusion detection system toward some kind of strange-behavior awareness, because the server won't function properly, but it won't crash either. To be able to recognize the difference between those two, you need some kind of abnormal-behavior detection, and that adds complexity there as well.

Almost finished. I really recommend you to read these. The first one is a chapter in the book Hack Proofing Your Network, by Greg Hoglund; it's really good. The second paper I really recommend is about Windows buffer overflows, from Phrack, by Barnaby Jack, also known as dark spyrit. Are any of you guys here? No? I'll give you credit anyway.

If you have any questions about this, if you want more information, a demonstration, or a copy of the code or the slides, you can probably get them from the post-DEF CON pages, or you can try to catch me here during the day. I will be here all day and all night, probably, and I leave for Sweden tomorrow morning. Or you can drop me an email at this address. Any last question? Yes, with TCP and UDP, but I think the concept is possible for... OK.
Thank you very much for attending, and please stay for the Cisco routers talk as well. Thank you.