All right, well, props to you guys for showing up; it's the first talk of the morning, so I appreciate you being here. My name is Tiller Beauchamp. I work at SAIC doing security consulting, pen testing, and some other interesting stuff, and I did this research with Dave Weston, who used to work there with me and is now at Microsoft. Today we're going to be talking about RE:Trace. We'll start with tracing applications at runtime and show how we can use that for vulnerability pinpointing, go into programmatic debugging with a Ruby-based toolkit we wrote, and end up with some generic kernel and application stuff.

First, let's talk about the background of DTrace in case you're not familiar with it. DTrace is a kernel-based dynamic tracing framework that Sun built many years ago and open-sourced under their CDDL license. It was released with Solaris 10 back in 2005, and it has since been ported to Leopard; October of 2007 was the first time we saw it on a mainstream desktop operating system, and that's when we started to get curious about how we could use it for exploit development. We're also seeing it ported to FreeBSD: it's currently in the FreeBSD 7 -CURRENT branch and will be in the FreeBSD 8 stable branch; John Birrell ported it there. We're mostly going to use OS X for our examples, but really this is generic to any system that implements DTrace: if you've got a Ruby interpreter and you've got DTrace, you can use our toolkit with a little bit of customization. One interesting thing, which I'm not sure about, is that I'd heard DTrace may be included in upcoming iPhone releases; I know Instruments is supposed to be included, which would implicitly tell you there's some DTrace functionality underneath. I'd be very interested to see how DTrace performs on a mobile device. We've also seen a partial implementation on QNX Neutrino.

The basic overview is that DTrace is a performance tracing framework: it enables you to observe what an application is doing at runtime. This may remind you of other tracing utilities like strace or ktrace, which are single-purpose tools. DTrace gives you a framework to implement your own tools, either to reimplement things like strace or to specify exactly what you're interested in tracing. You do this by defining probes, and then actions that go along with the probes.

This is what the architecture looks like. There's a lot of cooperation from the kernel; that's the bottom of the diagram here. You get categories of probes, like system call probes, or function boundary tracing (FBT) for tracing functions within the kernel itself. One useful way to think about it, as an analogy: consider an entire operating system with breakpoints already planted in every strategic place, where it's just a matter of enabling them to get instrumentation in that area. It really is like having a system with a debugger attached from top to bottom: kernel subsystems, the kernel itself, I/O, networking, user applications. At any time you can say, hey, I want to know what happens when this particular point of instrumentation is hit. Then there's a userland library, libdtrace, which is used by a number of applications, generic OS applications, but also the dtrace(1) userland binary itself.
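Just as a reference point for how much is exposed, the stock dtrace(1) binary will enumerate every probe on the system for you; a couple of illustrative invocations (standard flags, nothing custom):

    # list every probe DTrace knows about (typically tens of thousands)
    sudo dtrace -l

    # just the system call probes
    sudo dtrace -l -P syscall

    # just the kernel function boundary tracing (FBT) probes
    sudo dtrace -l -P fbt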
What we're interested in doing is writing DTrace scripts, which run through this whole environment. We write these little scripts in a language called D; the authors describe its syntax as a subset of C. You basically specify your actions, which get compiled into an intermediate form and run with the cooperation of the kernel as the application is being traced. Now, D actually turns out to be quite limiting, because it has no control flow: there are no loops, which really limits the logic we can express. It's really meant for taking measurements, doing comparisons, and a small amount of recording. And there's good reason for this, right? You don't want the kernel to accidentally hand off control to some script that's hooking a function that never returns, because then the kernel never gets control back and you're completely screwed. So you'll see that safety and performance are the top priorities in the design of DTrace. Sometimes that works to our advantage, because we care about performance when we're tracing; sometimes it works against us, because you'll end up dropping probes and such.

That's probably the main problem solved by our talk and our framework: of course we've got this really cool framework that traces the entire system, but it's primarily designed for troubleshooting, performance work, and overall debugging of production systems. Sun created it as a way to say, hey, we can send our consultant to your live production system and he can tell you what's leaking memory or what's sitting on the CPU too long, without you having to set up a completely separate system recreating your production environment, which you'd normally have to do because it's too dangerous to attach a debugger to Apache or your SQL server live. That design gives engineers a lot of power, but because of the safety and performance considerations, you'll see that we have to MacGyver our way around some of those roadblocks. The bottom line for performance is that there's really no impact when DTrace is not active, and when it is active, it's been streamlined to be as efficient as possible.

Okay, so there are a million different things you can do with DTrace, and a lot of them have been implemented before as single-purpose tools like strace and truss. You can reimplement those if you want, or focus in on exactly what you're curious about in an application. The real takeaway is that DTrace combines system performance statistics, debugging information, and execution analysis into one small framework. It turns out to be very useful for us as reverse engineers, because we want to answer very general questions about behavior, but we also want the ability to zoom in on a specific situation or condition and inspect exactly what the application is doing. We also have the ability to trace the application as it interacts with the rest of the system, or to trace multiple applications at once. And before you run off and write your own DTrace scripts, there's a lot of work already out there that you can reference.
There's this thing called the DTrace Toolkit, which includes pretty much every kind of script you'd want for monitoring performance: I/O, disk I/O, system call tracing, things to snoop file input and output, which is interesting, and monitoring file activity. Just to give you an idea of how powerful DTrace actually is: in 10.4, the Tiger release, there was a utility that had been around since the creation of OS X called ktrace, and that was your main system call tracing tool. That's been replaced in Leopard with roughly a 50-line D script, and it performs about 60% better. So you can actually re-create any of the tools we showed on the previous slide. The nice thing is that you have a single, uniform interface to all of them: instead of writing a custom bash script where you launch strace and then switch to GDB, you have all of this in one very elegant framework that you can use to create monster tools, and you'll see some really great examples of that in the DTrace Toolkit.

Okay, let's take a look at some simple examples. This is a one-line DTrace script; you have the probe definition and the action itself. What it does is trace every application across the system, recording the number of times each application makes a system call, regardless of what that call was. The output from running the script for a short period shows us that, you know, syslogd made one system call, cupsd made four system calls, and VMware made almost 7,000 system calls. What we might be interested in doing here is drilling down and focusing on VMware: okay, which system calls is it making, and what are the arguments to those system calls?

Here's another example. This shows a small DTrace script that hooks the open() syscall at its entry point and simply prints out the executable name and the first parameter of that function, which is the file being opened. Output from this might look like this: Finder opening a plist file, VMware opening /dev/urandom, and so on. Configuration files and resource files being opened by these different applications. The cool part is you can start to use what are called predicates, which we'll explain in a second, to say: when this system call happens with this particular argument, which files are then opened? You can build some really deep, nested if-else-style logic to zoom in on the particular part of the application you're interested in, all without access to the source code.

Okay, to review some of the terms we've heard: there are probes, which are your points of instrumentation, or you can think of them as your hook points. There are providers, which are just logical categories of probes. And then there are predicates, which are conditional statements: they're evaluated to true or false, and then the action is carried out or not. So you can hook something and then decide not to do anything. If you've seen awk, the syntax is much like awk: you specify some condition about the line awk is processing, say that the line has four fields, and then the action that goes along with it, like if there are four fields in the line, print out the first one. So it's a hook, a predicate, and an action. The syntax looks like this.
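For reference, the syscall-counting one-liner is more or less the classic DTrace example, and the open() snoop is just as short; these sketches are the standard idioms rather than our exact slides, and the predicate here is purely for illustration:

    # count system calls per process, system-wide (the one-liner)
    sudo dtrace -n 'syscall:::entry { @counts[execname] = count(); }'

    /* the open() snoop as a script clause:
       probe description, then a predicate, then the action */
    syscall::open:entry
    /execname == "Finder"/
    {
        printf("%s %s\n", execname, copyinstr(arg0));
    }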
You've got your probe description, which consists of the provider (the category), the module (usually a library), the function name, and then a last field that's usually the entry point of the function, the return point, or every instruction within that function. The predicate is the conditional statement that's evaluated to true or false, and then the action is carried out depending on that predicate.

Okay, so let's talk about how we utilize this for reverse engineering tasks. You probably already have some ideas of how this can be used. You can get all sorts of information about a particular application you're interested in, whether you're testing its security or just building a general understanding of how it interacts with the system as a whole: what syscalls it's making, when it's doing I/O, what files it's touching, all sorts of things. The idea is that really quickly, either by using a script you've created beforehand or by tailoring a script to that particular application's behavior, you can get an idea of how it works at the top level, and then zoom in on the areas that are part of the attack surface or that will give you some general sense of the security of the application itself. Very useful.

The other thing is that you can use predicates; for example, you may want to trace until a function gets input from your fuzzer, right? If strcpy is getting 5,000 A's, maybe you want to trigger the debugger at that point and say, hey, I want to see what's going on here. That releases you from having to set a million breakpoints, which kill performance and change the memory layout of your application; the debugger only triggers when the specific condition you're interested in occurs.

So the way we like to think of it is as a rapid development framework for reverse engineering tasks and tools. We're lazy reverse engineers, right? We're lazy hackers. If something takes too long, if you have to attach the debugger and write all these conditional breakpoints, you're going to get tired. The idea is that we've really sped up exploit development time and vulnerability assessment time by giving you tools that can tell you the things you're interested in very quickly. You can script things; you can attach to and detach from an application as it runs, and because the tracing runs in the kernel, you're not interfering with the process's memory at all. If you've ever experienced this with GDB: you set up an exploit and get it working, and the second you detach GDB the memory layout is different, so now your shellcode doesn't execute and it crashes instead. You don't have those problems with DTrace. You can refine what you're interested in, or what you're doing, without any performance penalty or slowdown at all.

Let's look at some of the simple things we get with DTrace that we'd have to implement ourselves if we wrote our own system tool: control flow graphs, symbol resolution, call stack traces, and the ability to examine function parameters in both user space and kernel space. Here's an example of control flow: the arrows to the right indicate that a function is being called, and the arrows to the left indicate that a function is returning.
From here we can get a sense of which functions are calling which functions and when they return; this is the dynamic call graph. Here's an example of a stack trace with all our symbol information: this is ftpd's main() function at the bottom, and its call stack all the way up to the top, where it calls strcmp. So not only do you see the library, but also the function name and the instruction offset within that function, which gives us the ability to pinpoint exactly where we are in the execution. We can also conveniently record function parameters as well as return values. This example records the first argument of the function: hooking strcmp, it simply prints out what the first argument is.

The syntax there actually gives you a little bit of insight into how DTrace works. You'll notice it says copyinstr, because what you're actually doing is asking the kernel to copy information in from userland. Compare that to how a breakpoint works: a debugger modifies the target process's memory, writing an INT 3 instruction, and that's what causes execution to pause; the CPU traps back to the debugger, which asks you what you want to do. This is very different. It doesn't modify any instructions in any way or pause the execution of the program at runtime. You're getting a great view of everything that's going on in the application from the kernel's perspective, without disrupting it at all. What you see is what you get, and that becomes really useful: when you're attaching a fuzzer and annihilating the thing, causing tons of threads to spawn, a debugger would be hitting breakpoints every two seconds and you'd be hitting continue and really slowing things down. With DTrace you don't have those problems; you can monitor and instrument a fuzzing run with essentially no performance cost at all.

We also have a convenient way to reference the CPU context. With the global uregs array that DTrace provides, we can print out values like EIP, EAX, and all the other CPU registers. That also works on more architectures than just x86: the register names are abstracted, so it works on PowerPC, SPARC, x86, and a couple of other architectures.

Here's a cute little example that hooks the uname system call and then changes what it returns. It would be kind of a pain in the ass to write some little rootkit-style thing that hooks this function and modifies the return value, but DTrace lets us do it pretty easily: we record the pointer to the buffer at the uname entry point, and then at the return point of the function we just overwrite that buffer with our own information. So this shows us that Windows is running on a PowerPC architecture in 2010.

Here's another small example that shows file and process snooping. It hooks the write() entry point and the write() return point, same idea: record the buffer, then print it out when the function returns. This would allow me to sniff all of the output from David's terminal, or any other application. Okay, hopefully this gives you the context of what we can use this for. We're thinking about monitoring the stack, about doing code coverage metrics; we could use those metrics to automate feedback to the fuzzer, telling it to adjust its inputs in certain ways. Monitoring for heap corruption, that's a really useful area.
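As a footnote, the uname() spoof from a moment ago looks roughly like this in D. This is a sketch: the field offsets assume OS X's five 256-byte utsname fields, so check them for your platform, and it has to run with dtrace -w since writing into the target is a destructive action:

    #pragma D option destructive

    /* on entry, squirrel away the caller's utsname buffer pointer */
    syscall::uname:entry
    {
        self->buf = arg0;
    }

    /* on return, overwrite sysname and machine before the caller reads them */
    syscall::uname:return
    /self->buf != 0/
    {
        copyoutstr("Windows", self->buf, 256);           /* utsname.sysname */
        copyoutstr("PowerPC", self->buf + 4 * 256, 256); /* utsname.machine,
                                                            assuming 256-byte fields */
        self->buf = 0;
    }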
So we're going to go over some of these now. I've already talked a little about the difference between debuggers and DTrace as a framework for monitoring an application, but just quickly: it's really important that you don't think of it as a debugger. There are things GDB does that DTrace will never be able to do, and vice versa. One thing DTrace doesn't do well is exception handling. If you've got a D script attached to an application and you trigger a buffer overflow, DTrace is not going to say "memory read or write error at this address." It's just going to stop, because remember, it's designed for safety and performance: as soon as something goes bad, it detaches from the process, drops the probes, and says, I don't want anything to do with this. That being said, it's infinitely more flexible for conditional breakpoints, or conditional instrumentation, than GDB is. What we find is that using the two together gives you the optimum bang for your buck. What we'll do is trace to a condition we're really interested in; when we see that a buffer overflow is about to happen, we send the stop signal to the application, attach GDB, and let GDB continue it, and there you go. We find that's a really flexible setup for exploit development or for looking at a vulnerability up close.

The other interesting comparison is against strace or ltrace or some of the other tracers you know. The main problem there: I don't know how many of you have attached truss to a really complicated application, let's say Apache. The number of syscalls happening per second or per minute is just devastating; it's going to drive your CPU up and make the system almost unusable. You don't have that problem with DTrace. Truss and some of the other tracers work by actually setting breakpoints where the system call is made, or by instrumenting procfs so the process pauses and allows you to grab the arguments; DTrace doesn't have to do any of that. Brendan Gregg did a bit of a study, you can check it out on brendangregg.com, where he compared the performance of a truss implemented in DTrace versus actual truss, and the DTrace version performs about 68% better. You hardly notice system call tracing when you have a binary instrumented.

Again, some of the limitations: DTrace is designed as a real-time command line tool, so it really works only on standard in and standard out. We had to do a lot of trickery with Ruby to parse the standard output, get things into arrays, and retrofit data structures onto the output of DTrace. If you're hoping to just get back an array of the arguments that were touched by your fuzzer, or something like that, you're not going to get it without our tool; plain DTrace won't hand you rich data structures. Like we said, it would be really hard to use DTrace as a serious security tool. We'll talk a little later about creating a HIDS tailored to a custom application; the problem there is, remember, it's designed for safety and performance, so if you start to generate a lot of errors, DTrace will just back off and start to drop probes. If you were depending on it as a "hey, I need to know" alarm, DTrace might just detach and the attack would go undetected. Those are some of the things you have to deal with when using DTrace.
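That trace-to-a-condition-then-stop pattern is easy to express. A sketch; the function and the 4,096-byte threshold here are made up for illustration:

    #pragma D option destructive

    /* freeze the target the first time strcpy sees a suspiciously long source,
       so a real debugger can attach at exactly the interesting moment */
    pid$target::strcpy:entry
    /strlen(copyinstr(arg1)) >= 4096/
    {
        printf("strcpy of %d bytes in %s; stopping for the debugger\n",
            (int)strlen(copyinstr(arg1)), execname);
        stop();
        exit(0);
    }

You'd run it as something like sudo dtrace -w -s stop_on_strcpy.d -p <pid> (the -w flag permits destructive actions like stop()), then attach with gdb -p <pid> and continue from there.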
Let's talk a little bit about what we did with DTrace, combining it with Ruby, with some other Ruby-based tools, and with a programmatic debugger framework we've also written into the combo. We hit certain roadblocks where DTrace alone just couldn't do the things we needed, so we started looking at combining it with Ruby to get better logic processing, so that we weren't just doing data recording. We came up with RE:Trace, which is a combination of these two things. It utilizes the libdtrace Ruby bindings library written by Chris Andrews to let us access DTrace from Ruby. What we do is write common tasks in Ruby that build these little D scripts, send them off to run, and then handle the results.

Another aspect of this is combining IDA and its static disassembly power with this runtime dynamic tracing; now we get both the dynamic world and the static world together. There's a Ruby plugin called idarub, which has two parts: the Ruby client module and the IDA server plugin. We run IDA on a Windows box in VMware, and over the network we communicate with it from our Ruby-based framework, saying things like "color this instruction," "annotate the disassembly in this way," or pulling down the disassembly information itself. This gets around the lack of IDA GUI support on certain operating systems and architectures: if you're using Solaris, unless you have a Windows system next to you, you're not going to be able to combine static disassembly with runtime tracing. By using XML-RPC over the network between IDA and RE:Trace, our framework, we're really opening up the merging of static analysis and runtime analysis, and it makes a really powerful combo.

The other thing is that we skipped over Metasm, but Metasm, which is a Ruby-based assembler and disassembler, has some incredible power itself. It's part of the Metasploit framework, and you can do things like write PE files from memory, create assemblies and disassemblies in Ruby, and write them out to ELF files. So you can do quite a bit, and that really comes into play and becomes really powerful with malware analysis, automated unpacking; there are just so many applications. That's why we really picked Ruby as a framework: we feel like the integration of all these toolsets is really the future for automating a lot of the tasks that used to be really painful.

What we found is that using this framework, both the tracing and the static analysis, has allowed us to decrease the amount of time it takes to do the analysis and write successful exploits. We also had limitations with just doing tracing; we wanted to do things that were more like debugging. So we went ahead and wrote C Ruby bindings that wrap OS X's Mach API, the debugging API, with the ability to use it from Ruby. Now we can do things like set traditional breakpoints, search through memory, change the permissions on different memory segments, and search for specific opcodes in segments we know are executable. That provides the third element: we've got the tracing part with RE:Trace, the disassembly with tools like IDA and Metasm, and then this debugging capability.
I guess I sort of covered this already, but the debugger really provides a wrapper around the native debugging API, which allows us to do this stuff: walk memory segments, catch exceptions, do symbol resolution, and all that. A lot of people have the idea that Mac OS X is really just BSD, but that's not the case. You don't have the standard ptrace API like you would on every other POSIX system; it's really Mach at the core. Even though there is a BSD subsystem, most of the intensive system-internals programming you're going to be doing is in Mach, so we had the fun job of learning how to send Mach messages back and forth to the kernel. But we've packaged that up for you here, so you really have a flexible marriage between the best parts of debugging from Ruby and the best parts of DTrace.

So here's how it all comes together. There was this format string vulnerability. We ran into Rob Carter and Nate McFeters back in DC; they were showing us this thing, and we got curious about how we could use DTrace to pull out the information we needed from the application at runtime in order to understand how to write the exploit. So we threw all these tools together and were able to trace through all the format string calls that happen in the application until, say, the 2,001st one, the one we were actually interested in: trace until we hit that spot and then do the analysis there, whereas just setting a breakpoint would have taken forever, iterating through each stop. So we used the predicate feature to say: only stop the application when one of the arguments has this "%25" in it.

The way we envision this with the framework is sort of like this: at the top here is the code using the DTrace Ruby library to say, hook this vfprintf function, and if the argument has this "%25" in it, stop the application. We go ahead and kick that off, then we connect to our idarub server, and once the application is stopped we pull down some information about the function. We want to see the disassembly of the function itself so we can find the return instruction and set a breakpoint there. Then we continue the application, with a callback function for when our breakpoint is hit. At the end of the function we hit the breakpoint, and we inspect memory at that point to say: okay, is the overflow aligned properly? The cool part about this is that it's not quite automated exploit development, but it's about as close as you're going to get.

I kind of glossed over the fact that otherwise you'd have to sit there and hit continue a thousand times on every printf that could possibly happen in iPhoto. Imagine doing that throughout the exploit development process; this script could literally save you several dozen hours over the lifetime of developing an exploit. In fact, when Nate and Rob brought this to us, they were like, you know, we can't even stand GDB anymore, this is killing us. We actually came up with this script sitting together at a bar one night, and it's worked out pretty well. When you're done with this script, you know exactly which instruction you need to overwrite, and you can then use our opcode searching feature to find an instruction in executable memory that will point you right back there. I mean, two steps beyond this and you have arbitrary code execution.
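The DTrace half of that workflow fits in one clause. A rough sketch: which function to hook and which argument holds the format string are assumptions about where the vulnerable printf-family call lives (here vfprintf and its second argument), and it assumes your DTrace has the strstr() subroutine:

    #pragma D option destructive

    /* halt the target the moment a format string containing "%25" shows up */
    pid$target::vfprintf:entry
    /strstr(copyinstr(arg1), "%25") != NULL/
    {
        printf("suspicious format string: %s\n", copyinstr(arg1));
        ustack();   /* who called us with it */
        stop();     /* freeze so the Ruby side can attach and pull disassembly */
    }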
Alright, so here's another interesting idea. We know there's something funky with ASLR on OS X, and we may be interested in trying to analyze that. Say you've found a bug in an application and you're trying to figure out how the memory is laid out so you can use specific values in a return-to-libc-style attack. What you can do is start up the application and do some simple symbol resolution to figure out what locations the interesting functions live at. But those addresses will be randomized, or they should be, after each run. So you search memory for some sort of function pointer to one of those addresses, then you restart the application after it's been re-randomized and search again for references to those same functions, trying to find some spot in virtual memory that always contains a pointer to that randomized address. That's the sort of task you can also automate with this framework. Our framework is up on poppopret.org; we're going to stick it up there tonight. RE:Trace is already up there, we just need to add redbug.

Okay, let's look at monitoring the stack. What we're going to have here is a classic stack buffer overflow. We're going to trace our application until that instant and then print out all the information we care about; we want to do this to pinpoint exactly where the vulnerable function is. This is the situation where maybe you're fuzzing an application and you get an exception from whatever debugger you have attached. You look at the stack and it's filled with A's, right, but what overflowed? Where was it? And how can you leverage that, either to understand the vulnerability itself, so you can protect yourself and understand how important the patch is, or to leverage it for arbitrary code execution? It's definitely not a one-two-three step from getting a stack overflow to understanding the exact nature of the bug itself, whether it's new, and what its consequences are.

Alright, so here it is in one probe: we hook every function return in every module, and if our EIP, the next instruction pointer, is 0x41414141, the encoding of a string of A's, then we stop the application and say, okay, something bad happened here, let's check it out. That works in the generic case where you're overflowing with long strings of A's, but that's not always the case, right? We want a more general solution. So our approach here is: at function entry, record the return address that's on the stack, and at function return, check what EIP is about to be. Is our next instruction equal to that saved return address? It should be; if it's not, we know something went wrong, right, there's some sort of overflow that corrupted the return address.
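Sketched in D, the two versions look roughly like this. This is simplified: it ignores recursion and threads (the real tool keeps depth-indexed state), assumes a 32-bit target for the pointer size, and, as the tip below explains, uses the architecture-independent uregs names:

    #pragma D option destructive

    /* naive version: at every return site, peek at the saved return address,
       which sits at the top of the stack just before the ret executes */
    pid$target:::return
    /(*(uint32_t *)copyin(uregs[R_SP], sizeof (uint32_t))) == 0x41414141/
    {
        printf("about to return to 0x41414141 from %s`%s\n", probemod, probefunc);
        stop();
    }

    /* general version: remember where the return address lives on entry... */
    pid$target:::entry
    {
        self->slot = uregs[R_SP];
        self->ret  = *(uint32_t *)copyin(self->slot, sizeof (uint32_t));
    }

    /* ...and verify that stack slot is untouched on the way out */
    pid$target:::return
    /self->slot && *(uint32_t *)copyin(self->slot, sizeof (uint32_t)) != self->ret/
    {
        printf("return address smashed in %s`%s\n", probemod, probefunc);
        ustack();
        stop();
    }

    pid$target:::return
    {
        self->slot = 0;
        self->ret = 0;
    }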
We're going to give you a little tip that might save you the several hours it cost us. You see that second bullet point there: it's uregs[R_SP] and not uregs[R_ESP]. That is a big one. On the Core 2 Duo, which is a 64-bit architecture, the register isn't called ESP, it's called SP, and if you confuse those two it's going to cost you several hours; R_SP is the architecture-independent reference. We also had some problems with tail-call optimizations that happen at compile time, and with some functions DTrace can't trace, which kind of threw a wrench into the problem. I won't bore you with the details of tail calls, but the basic idea is that some functions don't require their own stack frame, so although you're calling a new function, there's no new stack frame, and that messes up how you reason about the return pointers and all that. DTrace has a particular way of dealing with it, and you just have to know what it chooses to do; if you're curious, I'll tell you all about it afterwards. The upshot is what we said: instead of comparing EIP, when we enter a function we record the return address on the stack, and when we exit the function we look at that same spot in memory and see if it's still the same.

Then we have this problem of functions we can't trace. Our solution is just to ignore them, because they're usually not the functions we're interested in anyway. It turns out DTrace has problems tracing functions in certain cases, like when they're inside jump tables: it will trace the entry point, but it doesn't know how to do the return point. Our basic way to determine which functions can't be traced is to say: show me the entry probes for all the functions you know about, and show me the return probes, and let's look for mismatches, because some will be in one list and not the other. We just exclude those functions with predicates.

So let's take a look at how this works. What we're going to do is run a QuickTime RTSP exploit; we've seen a lot of these this year. It's listening on localhost, and we're going to start up QuickTime and have it connect to this malicious server, and the stack overflow is going to be delivered. But what we're going to do is attach RE:Trace to the application so we can watch the overflow as it happens. So we load up QuickTime here and attach RE:Trace. We start the Ruby environment; this is actually just standard IRB, so you can go ahead and write whatever Ruby code you want, and there are all our commands, like searching memory, attaching a debugger, and so on. We run the stack overflow tracer on QuickTime, specifying the PID. Now we go ahead and connect to the malicious server, which delivers the payload. A lot of things happen at once there: the application, you know, opened up a window, the payload was delivered down here, and then RE:Trace stopped the application, so it's actually halted. We get all this information printed out about the context of the application's current state: a basic message about the stack overflow occurring, the exact module where the overflow occurred, the particular function, what we expected the return address to be and what it actually was. In this case we expected 0x17-something, but it was actually 0xdeadb-something; probably a problem. So here are all the CPU registers, and you can see in this situation a number of them were
overwritten: the base pointer and so on. Here's our stack trace; you can see, again, module name and function name, followed by the exact instruction offset within that function. From here we might be interested in doing something like attaching a debugger and examining the memory space, searching for instances of particular strings we want to use in a return-to-libc-style attack.

What I'll do is show how that happens with a different application; let's use Adium. Adium is a chat client, right, I just picked it randomly. Imagine we're going to do a simple return-to-libc and we need to use /bin/bash as a parameter to something like the system() function. We need to know a location in virtual memory, at this moment, where the string /bin/bash is located. So what we're going to do is dump all the virtual memory segments and then search through them. In RE:Trace this works by dumping memory to disk; with redbug we've actually implemented it so you don't need to dump to disk. Here we search for the string /bin/bash through all the memory segments, and we get about five or six locations. So let's attach a debugger and check the value at each of these locations, just to make sure it's usable for our situation. In RE:Trace, what attach-debugger does is simply stop tracing and attach GDB to the process; here we can see it's really in the same state. Now let's inspect memory at the first location: we can see this is a string that starts with /bin/bash, but it's got all this other junk on the end of it, so it's not going to be useful to us. We inspect the next memory location, and it turns out to be /bin/bash, null-terminated, so this is what we want to use; and the other locations are all /bin/bash as well, so those are potential candidates too. The idea here is that to get around data execution prevention, you manipulate things to make a call like system("/bin/bash") and you get a shell, a bash shell. That's why this is so interesting: we can't just run code off the stack anymore, so we have to return into mprotect() or system("/bin/bash"), something like that, to actually leverage getting a shell on the box.

Okay, so some other things we can do with RE:Trace: instruction-level tracing, which gives us the ability to do code coverage. We're very interested in that, so let's briefly talk about it. The idea is pretty simple: DTrace gives us the ability to print out the address of the instruction at each point, so we just print them all out and map them into IDA, and we can visually represent what code coverage looks like during a particular run of an application. That's the whole approach; it's really pretty simple. There are things we have to deal with when we're tracing instructions in a library that's mapped deep in the virtual address space while our disassembler has it mapped at, say, 0x1000: we just have to deal with rebasing those offsets. It's not really a problem; you can either tell your disassembler exactly where the base address is, or do the offset math in Ruby itself. You do have some issues with performance, of course.
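The per-instruction probe itself is about as simple as it gets. A sketch; "TargetApp" is a placeholder module name, and leaving the last field of the probe description empty matches every instruction offset (plus entry and return) in the module:

    /* one probe per instruction in the target module; log each address hit */
    pid$target:TargetApp::
    {
        /* for instruction probes, probename is the hex offset within probefunc */
        printf("%s`%s+%s\n", probemod, probefunc, probename);
    }

Piping that to a file and feeding the module, function, and offset triples to the idarub side to colorize is all the coverage map amounts to; if you only care about which instructions were hit, you could aggregate with @hits[probefunc, probename] = count() instead.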
Tracing every instruction can be a heavy load on the system, so you scope it to trace only the particular libraries or applications you're interested in. What this turns out to look like, the real fruit of it, is this: you get a function call graph, and the particular blocks that were traced, that actually ran during the application, are highlighted. So you can say, cool, I missed this whole section of code, let me figure out how to tweak that jump so I can access and test the rest of it. Maybe per fuzzer run you change the color, and you end up with a really cool rainbow, or a mandala, but on every sequential fuzzer run you can see whether you're hitting code you weren't hitting before. You can also do stuff like this with simple DOT graphs, showing what the function call graph or system call graph looks like at runtime. So you may know that something was patched here: there was a vulnerability, the patch was applied to this function, and they're doing some sort of input sanitization here; but there's another code path that gets there, so let's look at how we can get down here and reach that vulnerable function again without going through the patched function.

Now I'm going to tell you about how to use DTrace to help write heap overflows. How many of you out there are familiar with the term heap spraying and that type of stuff? Show of hands. Heap spraying is all about heap determinism, right: controlling the allocation patterns on the heap so you can set the heap up such that your overflow gets you arbitrary code execution. If you want to know more about it, the Immunity guys, who have a booth here, wrote a great paper on it, "Debugging with ID," that tells you everything you want to know about advanced heap exploits. The idea is that in the past there was a primitive called write4, which essentially gave you a generic technique for exploiting a heap. Well, that's gone: there are heap protections now, and it never existed on BSD anyway, because they keep their metadata out of band; that is, there are no function pointers you can just overflow into to leverage arbitrary code execution. What that means in plain English is that you have to understand a lot about how the application allocates things on the heap in order to exploit something successfully. You're looking for function pointers, for any kind of application-specific data you can overwrite with a heap overflow that's going to let you own the application.

On other platforms there are lots of tools to do this: Immunity Debugger is a great one on Windows, and Gerardo Richarte from Core wrote a great OpenGL heap visualizer based on truss that I think works on Linux, any of the BSDs, and Solaris. So you have tools there; the problem is that before this you really had nothing on the Mac that could help you do that. The idea here is that we have these arbitrary hooks on any function in the system with DTrace, right, so let's use them to keep track of what the heap looks like and what its contents are. Quickly: you can of course use Apple's standard Instruments tool for memory leaks, identify double frees and double mallocs that you could maybe leverage for arbitrary code execution, spot off-by-one errors, and do all kinds of heap visualization. The comparison to something like ltrace, which can also hook malloc operations, is that this is like Bonds on the Giants: it's ltrace on steroids. And the idea here is that you can create a heuristic for your specific application.
Remember, I said it's all about application-specific data. You could create a heuristic that says: if this pointer is ever on the heap next to somewhere I know I can already overflow, then there's my exploit. You can do all that kind of really interesting stuff. There are tons of tools on OS X that will help you here, that will set up guard pages on the heap and tell you when you've corrupted it, but none of them give you the application-specific intelligence you need to really write these sorts of advanced heap overflows. Quickly, this is what a directed graph looks like: if you were just interested in the allocation patterns, you could quickly create a graph of them with DTrace itself. Beyond that, we've created an automatic heap smash detector that will annotate IDA. Basically, how that works is we keep track of every allocation made on the heap, and if anything ever strcpy's or memcpy's onto a chunk we know about, where we know the size and the location, and that write is too large, we know there's going to be a heap overflow, and we write it into IDA. The probe would look something like this: in this case we're looking at the entry to strcpy, we're looking at the arguments, and we check them against our known heap locations to see if they overflow.

So I'll show you a quick video. Real quick, essentially what we're going to do is show you a vulnerable C application. It's just a standard bug: they malloc something that's not large enough and then strcpy onto it. If you can see through the transparency in the background, we have a copy of IDA running. We're interested in owning this application, so what we're going to do is pass a bunch of A's on the command line, and we hope that triggers the heap overflow. Just to show you, that's what it looks like: we've attached the heap smash detector and sent a bunch of A's in via the command line. This is our IDA disassembly of the same program; we're just looking around there, and we're going to fire it off, and if all goes well, any heap overflows that are detected will automatically show up in the IDA disassembly, so we'll know everywhere that has a problem in this app and can use another DTrace script to exploit it. Here we're starting up the idarub server, which gives us a connection from the OS X side; this is all in a VM, you can see it's VMware Fusion. We set up our server, go back to the command line, fire off those A's, and hope it triggers the heap overflow.

And amazingly, it does catch the heap overflow. You can see there, that's the probe that found it; it tells us what EIP was and what the destination was. The interesting thing to note is that the size of the buffer was only, what is that, 15 bytes, and we copied 30, so obviously we smashed the heap there, and now we can go into our IDA disassembly and it will pinpoint that. Where that would be useful: if you had a browser or something like that and you're running a fuzzer against it, you can let it run for a couple of days, or hours, or whatever, come back to the IDA database, and it will tell you everywhere there was a problem; now you can sort of vet those and figure out which situations would be exploitable. There's a comment and the marking in red there. So it's really powerful for combining static analysis and runtime analysis.
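A rough sketch of the core of that detector in D; the real tool also handles calloc, realloc, and free bookkeeping properly, hooks more copy functions than just strcpy, and feeds the results to idarub, none of which is shown here:

    /* remember the size of each live malloc'd chunk, keyed by its address */
    pid$target::malloc:entry
    {
        self->size = arg0;
    }

    pid$target::malloc:return
    /self->size/
    {
        chunksize[arg1] = self->size;   /* arg1 is malloc's return value */
        self->size = 0;
    }

    pid$target::free:entry
    {
        chunksize[arg0] = 0;
    }

    /* flag any strcpy into a known chunk that's too small for the source */
    pid$target::strcpy:entry
    /chunksize[arg0] != 0 && strlen(copyinstr(arg1)) + 1 > chunksize[arg0]/
    {
        printf("heap smash: %d bytes into a %d-byte chunk at 0x%x\n",
            (int)(strlen(copyinstr(arg1)) + 1), (int)chunksize[arg0], arg0);
        ustack();
    }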
We're kind of running out of time here, so let me shift gears. We have some sections here on using this defensively: things you can do to stop an exploit by recognizing system call patterns the application isn't supposed to exhibit; again, you have the performance issues there. You guys can check out the rest of the slides. We also talk about some kernel-level stuff, not particularly related to DTrace: you see this screen, you have no idea what happened, and you want to figure it out. It turns out there are these panic logs that get dumped, and they give you a lot of information, so you don't actually need to do kernel debugging the traditional way; if you do, you have to set up a remote machine, and it's kind of a pain.

Let me skip to the end and make one point about tracing higher-level application probes, because this is a cool idea. What you can do here is trace some action that's a high-level thing, like a whole page load, or an SQL query, or a DNS lookup. These are not particular functions; they're actions composed of lots of functions. That makes sense for I/O performance tracing, but as pen testers we can use it for our own purposes too. For instance, we want to fuzz an application that has a web front end and then examine what's going on with the SQL database in the back: look at the particular SQL statements and whether they're sanitized or not. The idea is that on the OS X platform, because DTrace is implemented in the kernel, it's really trivial for anyone shipping an app on OS X to add DTrace probes that will allow us to hook them. So we could hook JavaScript, SQL injection, all kinds of stuff beyond standard memory corruption. And I'll show you real quick where you can get the code: poppopret.org. Email us if we don't put it up in the next couple of days; we've been known to lag. And thank you for your time.