Okay, how's everybody doing? All right, let's give a big hand and welcome our presenters here, Danny Quist and Lorie Liebrock. Thank you very much.

So this is Reverse Engineering by Crayon. I'm going to talk a little bit about some hypervisor-based malware analysis and visualization tools that I've developed here. So my name is Danny Quist. I founded Offensive Computing. I'm also a PhD candidate at New Mexico Tech. I do reverse engineering and I teach reverse engineering.

I'm Lorie Liebrock. I'm an associate professor and fortunately get to work with Danny, and I run the Scholarship for Service program at New Mexico Tech. And Danny's going to present essentially all of this because it's really his work.

Okay, well, 30 seconds into the talk and there's already booing. Thanks, guys. I do have a fire extinguisher here that I was told not to use, so that makes me want to use it. So yeah, okay, what we're going to talk about: I'll get started talking about the reverse engineering process, then I'll talk a little bit about hypervisors, and then specifically Xen and Ether, which are pretty awesome, and how we modify the reverse engineering process to work with these. Then I'll show my visualization tool, VERA, then I'll show you some real reversing that I did with this, and then some results of this process.

Okay, so let's get started by looking at the reverse engineering process. This is just my process, the one that I teach and the one that I've had a lot of good luck with. It's by no means everybody's, but this is just what I use when I'm doing it, so that's the big caveat there. So the first thing that I do is set up an isolated environment. This is something like VMware or Virtual PC, or some sort of dedicated hardware. Then we get into the initial analysis, and this is using the real high-level tools like Sysinternals to look for system calls and monitor any sort of operating system state change.
The next one is deobfuscating and de-armoring a program. So this is unpacking using debuggers, Saffron, or Ether. Then we get into disassembly, so this is with IDA Pro or OllyDbg, or dumpbin if you're so inclined. And then the next part is identifying relevant and interesting features, and this is where a lot of newbies have trouble with the process, and something that I wanted to address. So the two things that are specifically going to be addressed here: we're going to look at de-armoring or deobfuscation, and this is using a tool called Ether, which is completely cool. And are Paul and Artem here? Oh, okay. All right. So good. I know Artem bears you. And then identifying the interesting features is the thing that the visualization tool is going to do.

Okay. So, setting up an isolated runtime environment: the point of this is just to protect yourself from the code. And this is, you know, assuming that it's pretty difficult to actually break out of the VM. It also gives you a known good baseline environment that lets you roll back if something bad happens.

So, execution and initial analysis. This is just to get an extremely high-level overview of what the code is doing without looking at assembly. And so this is looking for changes on the file system, changes in the behavior of the system, network traffic, and overall performance.

Okay. So now, removing software armoring. Software armoring is just protections to prevent reverse engineering, or make the program smaller, or protect it in some manner. And this is done via packers, which are just self-modifying code. And there's a whole lot of research out there on this: OllyBonE, Saffron, PolyUnpack, Renovo, Ether, and Azure. This research is going to use Ether. Okay. So let's talk a little bit about the packing problem.
All a packer is, is a self-modifying piece of code that has a small decoder stub that decompresses the main executable, restores the imports, and then allows the program to execute normally. And this just plays tricks with the executable. It does the bare minimum amount of work to get things loaded up, hides imports, and that sort of stuff. And it's basically just to trip up any of the tools. So the way this looks is that we have a PE file right here, a normal unpacked file, and you apply a packer to it. And what happens is that you get the compressed or obfuscated code with a small decoder stub, on the right.

Okay. There are some other troublesome protections. The first one is virtual machine detection. There are some of these things that have been put out: Joanna Rutkowska built Red Pill, and we also developed a tool called OCVM detect. But generally, detecting that you're inside of a VM is very, very easy. There was an excellent paper that, if you haven't read it yet, you should. It's called "Attacks on Virtual Machine Emulators" by Peter Ferrie. He goes through every single VM that was available at the time and gives, you know, roughly 10 assembly instructions to detect each of them. It's all for the 16-bit mode of a DOS program, but it's pretty easy to translate into the 32-bit equivalent.

Debugger detection: this is just looking at the process environment block and checking to see if the BeingDebugged flag is set, and also whether the EFLAGS trap flag is set. Timing attacks deal with the time stamp counter: you poll it before, run some instructions, poll it after, and then see if the difference is inordinately large. And this turns out to be a very effective tool. Okay, so all this is very annoying for the reverser. And there are basically two methods for circumvention.
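
As a rough sketch of the timing check just described: poll a clock, run some work, poll again, and compare against what the work should cost. This is Python for readability, with time.perf_counter_ns standing in for the CPU's time stamp counter. The function names, the baseline, and the factor of 10 are illustrative assumptions, not taken from any particular packer.

```python
import time


def measure(work, clock=time.perf_counter_ns):
    """Poll the clock before and after running `work`; return the difference."""
    before = clock()
    work()
    after = clock()
    return after - before


def looks_monitored(elapsed, baseline, factor=10):
    """Flag a run that took inordinately longer than an expected baseline.

    `baseline` and `factor` are hypothetical tuning values for this sketch.
    """
    return elapsed > baseline * factor


def busy_work():
    """A few cheap instructions to time."""
    total = 0
    for i in range(1000):
        total += i
    return total


if __name__ == "__main__":
    elapsed = measure(busy_work)
    print("elapsed ns:", elapsed)
```

Single-stepping or emulating the timed region inflates the elapsed value, which is why a check like this catches a lot of analysis tools unless the monitor also controls what the clock reports.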
The first way around these protections is to know about all of them and get into the cat-and-mouse game of trying to remove them; the second is to make yourself invisible. The second one is a lot less work. So being a good grad student, I chose the lazy way. So there's been a couple. Sorry, I just have water here. So can we get security here? This guy's causing problems. Thanks, thanks, honey. Okay, so software-based VMs. I always wanted to be that guy at DEF CON too. So.

All right, so software-based virtual machines. These are things like Renovo and PolyUnpack and the Zynamics Bochs unpacker. The problem with these is that detecting them is actually pretty easy. The Intel CPU was never really meant to support virtualization, and these don't emulate it bug for bug.

Okay, so OS-integrated debugging. This is Saffron and OllyBonE. Both of these systems abuse the page fault handler and set the supervisor bit on running pages so that you can get some idea of the execution inside of there. One of the problems I had with Saffron is that it destabilized the system. It was very good at unpacking something once. But if you wanted to keep running the system or do anything else with it, you basically had to reboot and do the process again. So it wasn't very flexible for implementing an automated unpacking system. The other problem is that you couldn't actually do any sort of fine-grained monitoring. You basically got things on page boundaries.

Okay, so now we get into the awesome. There's a full hardware virtualization monitoring system called Ether. It was built by Artem Dinaburg and Paul Royal from Georgia Tech. This is a Xen-based hypervisor system. Ether has a couple of things: it monitors system calls, instruction traces, and memory writes, and basically all the interaction with the OS is done via the shadow page table inside of the Xen hypervisor.
So the problem, which isn't really a problem, is that it requires dedicated hardware. But what it buys you is an actual VM environment that you can restore back to, so it's actually very flexible.

Okay, so back to the reverse engineering process: disassembly and code analysis. This is one of the most nebulous portions of the process. And when I'm teaching this, this is the one that students get a little bit frustrated with, because this is something that you only get after you've done a lot of reversing. So how do you get a lot of reversing experience? By reversing. And it largely depends on intuition. And you get into this issue where a lot of times you focus too much on the actual assembly code and not the overall view of the program. So fighting analyst fatigue is something I'd like to address: get to the meat of the problem before it sets in, so you can avoid those issues.

Okay, and finding interesting and relevant portions of the program. Just like the disassembly portion, this requires a lot of experience. Some typical starting points are looking for interesting strings, looking for API calls, and looking at any sort of interaction with the operating system. And so this is a fundamentally imprecise process. And beginners typically have problems with this because, until you get to where you understand this a little bit, it's hard to know where to start.

Okay, so let's talk about hypervisors for a little bit. There's been a lot of hype about this over the last few years, and I'm not trying to propel it any more, even though the talk title probably suggests it. But one of the things is that a lot of the new hypervisor-based rootkits, you know, like Blue Pill and that sort of thing, have led a lot of the defensive tools. And just like always, the offensive need to hide yourself from a running process, or from any sort of introspection by a piece of code inside the operating system, works well in this environment.
So that's what we're going to utilize here. And one big feature that we're trying to take advantage of is that detection of hardware-based virtualization is not widely implemented. That's not to say it's impossible. It's just that I don't see too much of that out in the wild.

Okay, so there are a couple of hypervisor implementations. VMware ESX Server, I've had some good luck running. This is a commercial solution from VMware, and it mostly avoids a lot of the VM detection issues. Linux Kernel Virtual Machine: this is what I use as my base environment when I'm reverse engineering. And that's just to prevent me from running malware on my home system, or my base system, which is not to say that never happens. And then Xen. Xen is really nice because it's got an excellent set of tools for introspection. And a lot of this has really been led by Georgia Tech, so hats off to you guys there. What's nice about Xen is that it uses the standard QEMU image format that's been established for a long time, and it has an API that's fully controllable via Python, so you can integrate it into tools. Oh, yeah, Python.

Okay, so let's talk about Xen and Ether. Ether is a set of patches to the Xen hypervisor that's used to instrument a Windows system. It has a bunch of base modules that allow you to do instruction tracing, API tracing, and unpacking. And one of the best papers I've read in a couple of years is this one, "Ether: Malware Analysis via Hardware Virtualization Extensions." This was again written by Artem and Paul and a couple of other people, at the ACM CCS conference. So if you haven't read this, it's extremely high quality and very useful, especially compared to other academic research. You should definitely check it out.

So, Ether's event tracing. This is what it uses to detect events on the system. We talked about system call execution, instruction execution, memory writes, and context switching.
But what's nice about it is that it also gives you covert monitoring. There are no modifications to the base system, which means that it's very hard to detect. Instruction tracing inside of Ether is implemented by setting the trap flag inside of the EFLAGS register. Modifications to that are handled via the PUSHF and POPF instructions, which are intercepted by the hypervisor and used for watching that. So any attempt by the program to look and see if it's being single-stepped is effectively hidden.

Okay, memory and system calls. Memory writes are tracked by manipulating the shadow page table, so this gives you access to all the memory that's read from and written to. System calls: this modifies the SYSENTER_EIP register to point to an address that isn't paged in. When any access hits that, Ether logs it, and it also logs the INT 2E instruction to catch the older system calls.

Okay, so the architecture. You start out with a Linux dom0 management system. This is basically a Linux system that you run all your tools on. It's going to have the VM disk image and the Ether management tools. That works with the Xen hypervisor, which has the Ether patches, and that in turn works with an instrumented Windows XP Service Pack 2 guest.

Okay, so I made a couple of extensions to Ether. The first thing is I moved some of the unpacking code from the hypervisor into user space. I put a little bit more user-mode analysis in this. And I also repaired the portable executable rebuilding system, and this is just to allow you to actually disassemble it inside of IDA. And then there's a lot of other monitoring for executables that I'll be releasing a little bit later. So, the user-mode unpacking: all we're trying to do is watch for and monitor all the memory writes and executions. We keep all the memory writes inside of a hash table, and we watch for an execution inside of one of those written locations.
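
The write-then-execute tracking described here can be sketched in a few lines. This is only an illustration in Python, not Ether's code: the class and method names are hypothetical, and clearing the table after each hit (so that each unpacking layer yields one candidate) is an assumption of this sketch rather than something stated in the talk.

```python
class WriteThenExecuteTracker:
    """Toy model: remember written addresses, flag execution of any of them."""

    def __init__(self):
        self.written = set()      # the "hash table" of written addresses
        self.candidates = []      # candidate original entry points, in order

    def on_write(self, address, size=1):
        """Record every byte address touched by a memory write."""
        self.written.update(range(address, address + size))

    def on_execute(self, address):
        """Return True if the executed address was written earlier."""
        if address in self.written:
            self.candidates.append(address)
            # Assumption of this sketch: start fresh for the next layer.
            self.written.clear()
            return True
        return False


if __name__ == "__main__":
    tracker = WriteThenExecuteTracker()
    tracker.on_execute(0x401000)      # original stub code, never written
    tracker.on_write(0x40BB08, 4)     # the stub decodes some bytes
    if tracker.on_execute(0x40BB08):  # ...and then jumps into them
        print("candidate OEP:", hex(tracker.candidates[-1]))
```

A real monitor would take a memory dump at each hit; here the hit is simply recorded.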
When a written address gets executed, we assume that it's a candidate original entry point and take a dump of the image. This isn't perfect, but it's a decent solution, and it certainly helps you get some unpacked code. The other nice thing about moving this out of the hypervisor is that there's a little bit of scaffolding for future modifications.

PE repair: to implement this, there were a couple of issues. First of all, the sections weren't aligned. The address of the entry point was invalid. And it wouldn't load inside of IDA correctly. So basically all I did is take the code from OllyDump and use that to fix the section offsets and repair the resources as much as possible. And then setting the address of the original entry point to the correct place allowed it to be loaded inside of IDA.

Okay, so the result of this is that you get close to a truly covert analysis system. Ether is nearly invisible. It's still subject to Blue Pill detection, so we can start playing that game, but I think it's a game that's worth playing. The next thing that we get is fine-grained resolution of program execution. And since you get memory monitoring, we can start doing some cool stuff with it. And so we can start looking at these files inside of IDA and whatever other tools.

Okay, so I want to give a quick demo of how Ether works, or at least the unpacking portion. We're going to start by loading the file up inside of IDA. And just to verify that this is actually a packed executable, we can see that we get the standard warning about the import segment being destroyed. And if we allow IDA to continue, we can see that it's generally not finding anything. The start function is pretty miserable. It's missing strings, so there's not actually much inside of here. And the functions are pretty much non-existent. And the import address table just has the suspicious GetProcAddress and LoadLibrary. So in this case, things are pretty bad.
Also, this top visualization bar inside of IDA doesn't have a lot of blue code, so that's a good indication that things are all hosed.

Okay, so what we're going to do is actually start running this inside of Ether. To do that, this is just going to be a Linux box here. So this is my Ether system, and I'm just going to mount the QEMU image. There's this handy command in the NTFS-3G driver that you can use to copy the file over, so this is pretty straightforward. So we get that copied in. And even though this is a video, you can see how slow I type. Okay, so we're copying this over. Once we get the file copied over, what we want to do is start the virtual machine up. So I'm going to use the standard Xen commands to get this started. This is going to be xm create, and then you give it the config file of what you're looking for. Okay, so once that gets started, you get this message. And once you've got that, you should be able to open up a VNC session. And we can see here that our Windows machine has actually started. Yeah. All right. So, Windows. Wait for it.

Okay, so once we have that, we actually get the screen and everything's happy. And we've got our virus right here, so everything's ready to go. One thing you want to be careful of when you're using Ether is that you want to have the Task Manager loaded so that you can actually kill the program while it's running. This turns out to be a pretty important thing to do, unless you like your box rebooting. Okay, so what we're going to do is an xm list to get the ID out of here. In this case, it's nine. And then we're going to actually start running the Ether tool. So this will be ether: give it the ID that you want, the user-mode unpack mode, the executable that you want to watch for, and then a local copy of that executable. And once you've got that up and running, we can see a good sign that Ether's happy: it's set the filter name to be mew_netbull.
So whenever it sees a context switch into this process, it's going to pull it out. And at this point, what we're going to do is go over to the system and run the virus. So we're running NetBull. And we can see that NetBull is going to start taking about 100% of the CPU, and it gets a little bouncy, or rather a little sluggish, inside of there. So in this case, we've fixed a dump image to have an RVA of 21340, or 421340. So this is a good candidate dump. And we'll allow this to execute a little bit longer until we see the actual icon for the program pop up. One of the reasons I use NetBull so often is that it's pretty benign, but it's still got the malicious component to it, so you can watch and see that things are executing. This actually takes a little bit of time, so I've got a little bit of time dilation going on. All right. So the next thing we get is a fixed dump image with an OEP of RVA BB08, or 40BB08. And this is actually the original entry point for the program, which I verified by manually reversing it.

Okay. So it's very important to kill the process inside of Windows before you stop Ether. Otherwise hilarity ensues. So we've got that going. Then we stop Ether. And then inside of this we should have a listing of directories. For each one of these files there's the original address. So we've got the mew_netbull image and then the address that it stopped at. And then this one with -fixed is the corrected image.

Okay. So now what we can do is take this image and load it up inside of IDA Pro, and we should start seeing some good things here. So what we're going to do is pull the fixed image up and load it inside of IDA. And once we start loading it, we can see that things are a little bit happier. IDA is making noises like it's found the compiler signatures that it's used to. It's found a lot of the unpacking code and that sort of stuff. So we get actual visualization.
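
For reference, the two forms of the entry point quoted in the demo (RVA BB08 and address 40BB08) differ only by the image base. A small sketch of that arithmetic follows, assuming an image base of 0x400000; that value is inferred from the two numbers in the talk and is also the usual default for 32-bit Windows executables. The function names are illustrative.

```python
IMAGE_BASE = 0x400000  # assumed; inferred from the addresses quoted in the demo


def rva_to_va(rva, image_base=IMAGE_BASE):
    """Relative virtual address -> virtual address in the loaded image."""
    return image_base + rva


def va_to_rva(va, image_base=IMAGE_BASE):
    """Virtual address -> offset from the start of the loaded image."""
    return va - image_base


if __name__ == "__main__":
    print(hex(rva_to_va(0xBB08)))    # 0x40bb08
    print(hex(va_to_rva(0x40BB08)))  # 0xbb08
```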
The functions exist inside of here. We get our references and everything that we've come to expect from this. So we get these functions. Strings: this is a good indication that we've got NetBull unpacked. So in this case we've got the HRH is netbol, which I presume is what the AV companies named the virus after, and all this other stuff. Okay. So again, we can follow links and that sort of stuff. All of that works and we're all set to go there.

Okay. So this is an example of MEW's NetBull. And one person brought up a point, like, so what, it's MEW, it's easy to do. But this unpacker is really awesome. It works on pretty much everything. I ran Themida through this and it does a great job. So if you could just clap for Artem right now, it's pretty cool. Thank you.

All right. So with that said, we'll talk a little bit more about what's going on. There are a couple of open problems. The unpacking process produces a lot of candidate dump files, so this algorithm that we have is a little bit imprecise. What we want is a better way to find the original entry point. The other thing is that import rebuilding is still an issue. That's something I'm working on at this point, but I haven't had a chance to integrate it yet. And the other thing I'd like to do is some analysis, so this is going to be the next thing here. We can integrate this into our process by letting somebody know what to look for inside of here, because this is the thing that people have trouble with most of the time. Once you have an idea of the execution flow of a program, it's really nice to be able to look at this. So removing software armoring pretty much becomes a trivial process and is extremely easy.

Okay, so let's get to the visualization portion. One of the goals I set out with when I was doing this visualization was to be able to quickly subvert the software armoring process.
So in the case of NetBull, we only had two images that we were trying to detect; there were two candidate dump files. But in other samples, especially with more complex packers, you end up getting thousands of them. So we want to make that process a little easier. The other thing is that, for various portions and phases of the program, I wanted to identify initialization, main loops, and the end of the unpacking code. So: figure out where the self-modifying code ends, which is the OEP detection, figure out some of the dynamic runtime behavior of the program, and also integrate it with some of the tools that we're used to, like IDA.

So I made this program called VERA, which is Visualization of Executables for Reversing and Analysis. But really it was just a contrived name to name something after Jayne's gun in Firefly, and so I'm glad to finally have a program that I could do that with. When you're looking at this, the code's pretty simple. It's kind of a Fisher-Price My First MFC and OpenGL application, but generally it uses a lot of the OpenGL rendering and integrates with IDA Pro, and it's fast, small, and has a small memory footprint. Okay, so with that I'll turn it over to Lorie.

So this is just a graph preview. You know, the fortunate part is there's now a graph in here at the beginning, instead of going without one. The bad news is it's zoomed out so far you can't see nodes versus edges. But the idea here is that we're going to use a graphical representation so that you can see what the different phases of computation are and where your time is being spent.
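
To make that kind of graph concrete, here is a minimal sketch of turning an instruction trace into a weighted transition graph: each distinct address is a node, each observed step from one address to the next is an edge, and the edge weight counts how often that step happened. This is Python for illustration only; the function names and the sample addresses are hypothetical, and it is not how VERA or its graph generator is actually implemented.

```python
from collections import Counter


def build_transition_graph(trace):
    """Return (nodes, edges) for a sequence of executed addresses.

    nodes: set of addresses seen in the trace.
    edges: Counter mapping (src, dst) -> number of times that step occurred.
    """
    trace = list(trace)
    nodes = set(trace)
    edges = Counter(zip(trace, trace[1:]))
    return nodes, edges


def hottest_edges(edges, count=3):
    """The most frequently taken transitions, i.e. where time is being spent."""
    return edges.most_common(count)


if __name__ == "__main__":
    # Hypothetical trace: a short loop between 0x1002 and 0x1004, then an exit.
    trace = [0x1000, 0x1002, 0x1004, 0x1002, 0x1004, 0x1002, 0x1004, 0x1006]
    nodes, edges = build_transition_graph(trace)
    for (src, dst), weight in hottest_edges(edges):
        print(f"{src:#x} -> {dst:#x}: {weight}")
```

Heavy edges show up as loops, while chains of weight-one edges are straight-line code that ran once.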
So the vertices here actually represent addresses. It could be the address of a specific instruction, or it could be, as you see at the bottom, the address of the beginning instruction in a basic block, either one of the two. And the edges represent an execution along a path: you're going from one instruction to the next, and the path you took is an execution. The more times a transition happens, the heavier the weight on the edge, so that gives you an idea of where you're spending your time in the execution.

The basic blocks, for any of you who have done the low-level compiler stuff and that sort of thing, are just straight-line code, so there's no transfer of control except from one instruction to the next in the stream. There are no conditionals, none of that stuff. So basic blocks give you a really simple way of getting a hold of a little bit more code at a time than a single instruction. That makes this potentially useful for commercial codes, but as you're going to see, some of the next steps in this will be looking at functions and handling function-level sorts of things. Because, again, you saw that that graph is actually a very small graph compared to what graph theorists look at, but they get kind of big and complex, so the basic blocks give you a way of compressing some of this.

The instruction view, where every single instruction becomes a node in the graph, gives really pretty graphs. Danny likes those, nice big swoopy graphs, as he calls them. So they may have some aesthetic value because they're pretty to look at, but again, with any kind of commercial code they're just too big to be able to figure out what's going on. So again, those transitions between addresses are going to give us more information about where we're spending time in the graph. But there's an assumption here that I just need to point out, and that is, the assumption is that during this single execution, so this is for a single execution
of the code, during that single execution of the code, the thing you're looking for happened. So if you've got something that only does its thing every 50th time it's executed, you're not likely to find that using this approach. And so some future work can look at doing something more statistically based, with different executions where you don't change the input file, that sort of thing. But at the moment, you run it once and you get these graphs, and they tell you where things are executed more. So looking at these edges, wherever you see multiple lines coming out from one node, that means there was a decision point there, and there are multiple paths you can take to the next instruction. The thing that's really interesting, although somewhat arbitrary, is how the colors are assigned and what they mean. So I'm going to have Danny step back up here and talk about what these different colors in the graph mean when you're trying to find information.

Okay, so again, these colors were chosen sort of arbitrarily. What they represent is the relation of the file that has been dumped from Ether to what's actually on disk. So in this case, we're trying to identify execution in sections where code doesn't exist, or that sort of stuff. Yellow is going to be the standard, normal compiler-generated code. Dark green is where the section is not present in the packed version, so this would be something like executing inside of the PE header. Light purple is where the memory has been allocated at runtime but isn't present in the actual executable. And dark red is one that I really wanted to point out, which is actually the high-entropy portion of the code, so this would be execution inside of something compressed or encoded or encrypted or whatever you want to call it. The other thing that I wanted to point out is light red: these instructions are not in the packed executable at all. And then lime green, this is where the operands aren't going to actually
match, so this will be any sort of shifting decode frame type attack where you're executing code over and over.

Okay, so these colors were chosen arbitrarily. I kind of like them, I think that they look good, but I did show them to one of my friends, Kazmir, and he looked at it and said, that's great, Danny, but why are all your colors brown? So it turns out he's red-green colorblind. And so I sat down with Kazmir, and he was really nice enough to help me go over things and pick out a set that looked okay, but I didn't really like the way they looked. So inside the code there's actually a set of colors available for colorblind people, and I'm also going to make this available, but for right now it's not compiled in by default. Any feedback on this would be really appreciated, because I do want to make sure that it's a useful tool all around.

Okay, so the architecture is: we've got the Ether analysis system, which plugs into a program that takes those trace files and generates the graphs with OGDF, the Open Graph Drawing Framework. OGDF is the thing that makes the decisions about where to actually place the vertices and locate them and that sort of stuff. This is a really awesome library, and if you get a chance to use it, I highly recommend it. It's very similar to Graphviz, but what's nice about it over Graphviz is that it works with large data sets. I've had millions and millions of vertices inside of here and it hasn't choked on it. It will take all your memory, but that's okay. Finally we get to VERA, and VERA is the OpenGL component of this program. What VERA is going to do is actually display this and allow you to start analyzing it.

Okay, so the basic steps are to run an instruction trace with Ether, transfer that trace file to the analysis box, then run gengraph on that output, and then those resulting GML files, the graph language files, are going to be opened inside of VERA, and then you can use that to actually
correlate instructions.

Okay, so let's go ahead and do a demo. We start back at our same place with Ether's instruction tracing, and the VM's already started. So what I'm going to do is an instruction trace, give it the name of the executable to look for, and then just redirect the output of this to a trace file. So this is going to run, and then over on the VM side I'm just going to start executing the program. And again, I want to make sure that I have that Task Manager up so that I can stop this here. Okay, so NetBull gets executed, gets started, and then we wait for a little bit of time to elapse. It might not be easy to see, but when you move the mouse around it gets a little jumpy, so you know that's actually the case where things are going on. But generally, when you get this icon down here, it means it's running. Okay, so now we can stop it from executing, close this, and stop Ether. At this point we can look at the trace file and see how big it is. It's roughly 250 megs.

And so at this point what we want to do is transfer it to our analysis system. And this is a good time to point out that, you know, I know live demos are the way to do it, but I decided to save you all the intrigue of watching 250 megs transfer over the network, so that's why I did this. You're welcome. Okay, so it's copying, it's done, it's the world's fastest network. We run gengraph on the program. gengraph just handles the graph layout and builds the GML files. We give it the trace file that we copied over, the original executable so it can do its analysis, and also the output location that we want to put this at. This too takes a little bit of time to run, and in this time I'm generating two basic representations: the first one is the instruction-based graph and the next one is the basic blocks. So once this gets done... what's this? Okay, yes, it's actually backwards. So from there, what we want to do is actually load this inside of VERA now, and this is the actual interface right here.
So to navigate and move around and look at this graph, you simply drag, the same way you would with Google Maps, and use your scroll wheel to go in and out. The initial execution is going to be at this blue address right here, and then any sort of transition is going to move around inside of there. What we're looking at right now is the basic block view. Okay, so we can zoom in and look at these sorts of things. The thicker the line between these, the more that's going to indicate loops that are executing; thin lines mean there's only one transition. And it turns out that MEW actually has two stages of unpacking. This initial loop that we were looking at is actually going to be the first one, which was identified via the unpacking process, and then the secondary decoding, which actually brings the rest of the program out, I will identify more.

And what we're looking for are transitions in colors. So when we see a color change here, especially from this green to purple, this purple here is going to be representative of the address. This is the basic block view, so this should be the original entry point that we're looking for, or at least pretty close to it. The original one was BB08 and this is BB8B, and if you're some kind of math genius you can calculate that in your head. Okay, so then we get over to the actual execution of the program, and inside of here, these are all the transitions that the program makes.

Okay, so the next thing that I want to show you is the all-instruction trace, and this is a little bit more laid out. One of the comments that we got when we showed this and did a user study is that it was a little bit information-dense. So this is the actual instruction-level view. We see that first set of loops inside of here, we come to the central section, and inside of this we can see that there's one function that's making a lot of the decisions about what the program is
doing. So this is this address right here. So if you're looking for the main decision point of a piece of malware or something like that, you can find that. But once we zoom in here, we can see that this address is actually going to be the original entry point that we're looking for. And so we validated it and looked at this, but just by doing a quick glance at this we can figure out what the OEP is. But as far as the overall functionality of a program, any time we see these sorts of long lines of single execution, this is a good indication that this is the initialization phase of the program, and any time that we get these branching operations right here, this is actually going to be the part where it's making a decision or actually doing the branching. Okay, but what's nice about Netbull is that it's got these features inside of it that are pretty common. So it's always got that three-way branch at the beginning and a little pigtail up there. This internal side right here, this is the main loop that the Trojan uses to execute and basically control the system. So just moving around inside of here, the next thing that we can do is we can actually zoom in and we can take these instructions right here and then open up IDA, and we can use this to start correlating things. So once we find some interesting portion of execution, we can use that to find the address that we're looking for. So in this case it was 403C80, and if we zoom in and just find those addresses we'll find that there. So you can use that to correlate back and forth with IDA. Eventually there will be an IDA module, but I ran into a couple of bugs that prevented me from releasing it on time, but you can use this as an initial stopgap solution. Okay, so that's Vera. Let's go here. Thank you. Okay, so now the next thing that I did is I took a bunch of pictures, which were basically the same version of Netbull packed with a whole bunch of packers. But in this
case this is Netbull, and Netbull has the same structure. So we saw this initialization portion here, this branch structure, and then this tight knot of execution inside of here. So zooming in, this is the same view that we were looking at in the demo. So now we get into some of the packers. So this is UPX, and UPX starts out with this red region, or the darker red region, which is this highly compressed code area, and then we transfer into this light purple, or lavender. Okay, I'm not good with colors. But lavender actually means the size of data is zero, so this is where the code is actually unpacking itself and then running the program. We see that the original entry point is immediately after the high-entropy code, and we get into the size of data is zero. Next is ASPack. Here we start out with the blue instruction right here, it goes into these yellow instructions, and then it unpacks itself, and then we get a transition into this red portion of the program, which is where the unpacker was compressing itself. So this is a good way to profile these sorts of things. Okay, same thing with FSG. It's got a smaller or more compressed set of instructions, and then we transition to the higher portion of the program. And this graph here looks a little bit different, and the reason it looks different from the other ones is that there's a nondeterministic portion of the OGDF rendering system, but you can still pull out some of the various features here. Okay, MEW. MEW is really nice. This is the one that we looked at right there. It's got a whole bunch of different transitions of colors, but the same general format. And then tElock. tElock actually has a long set of instructions, and I'm assuming that it uses some shifting decode frame to actually encrypt itself and rerun a lot of these executions. Okay, so one of the things that I did with this is I took a sample from OC's collection called Mebroot, which is a pretty hardcore virus, and analyzed it inside of
Vera. And so when I was initially looking at this, it seemed to be idling for long periods of time. So actually executing this, I knew it was actually executing because my friend, and I'm really glad I made friends with him, the network manager at my school, said, "Hey, do you know you've got malware on your box?" Yes, but thank you, that validated that. So what Mebroot is, is a hybrid user-mode and kernel-based malware. So when I initially executed it for maybe 30 seconds and stopped it, I saw that there was this basic execution right here, where it just sat inside of this, but it wasn't really doing anything. So it essentially just sits in here for 30 minutes waiting for us to get bored. So I accidentally left it running overnight and tried to look for any sort of unpacking program that it was running after. So once it got through that unpacking program, it transitioned into this main area of execution, which was the top portion of this loop, and got going. So this is the entire graph of execution. We start out with the initialization, this is a 30-minute busy loop, secondary initialization right here, and then there are three main portions of the program. So reverse engineering it, this was the main unpacking loop, and these are two other secondary decoders that I didn't figure out, but this kernel code insertion actually occurs at this point, and that's when the executable stops and your system is then infected. Okay. Okay, so Danny's been teaching these reverse engineering classes using the more traditional approach, not using Vera, his version of the traditional approach anyway, for quite a while. And recently at one of his classes, after spending a week on reverse engineering, he took Vera to show the people that were in the class and see what they thought of it. So the first comment is this is a very small user study, in part because there's not a lot of really good reverse engineers out there you can
get together for a reverse engineering user study. Okay, the second thing is this was a training class, so you might say, "Well, but you're taking all newbies to do this." Well, there were two people in the class that hadn't done reverse engineering before, there were I think two people that had a lot of experience, and the other two had quite a reasonable amount. So although this is a very small sample, it's a pretty good representation of what we would expect in the reverse engineering community. So he had them analyze two different packed versions of Netbull, and I'm just going to go through this very quickly because we're about out of time. Basically we're trying to see whether the things that he's just walked you through in the demo help new users or not. They're supposed to find the original entry point, then find these different parts of interest, the packer code, the initialization, the main loops, and if they can, that's great, especially if it's a little faster. So original entry point: you'll see user one and user three were the two newbies who had never before this week done any reverse engineering. We've got the two different packers; one of them finds the OEP with one packer, the other one doesn't find it with either one. For the initialization, in one case they all recognize it, in one case the two new people can't find the initialization code. Main loops, a little better performance: we have one person with one packer who can't find the main loops of the program, which I find hard to believe. Then overall evaluation: they were asked were they likely to use this again, and that's in the light lavender, and would they recommend it to others, and that's in the pink. So both the new people and the people with a lot more experience thought this worked pretty well and they would recommend it. The basic gist that comes out of this is there's more work to be done. Yeah, PhD students, talk about the work you need to do
yet. Okay, thank you. Nice. So we basically need a better way to identify the beginnings and ends of loops. That was a good observation. A lot of these loops became overlapped and were a little bit convoluted, and it would really be nice to be able to search for these memory addresses and see the basic blocks they matched. So the future work that we're going to get into is general GUI and bug fixes. Right now it's pretty unsophisticated, so I want to make that a little bit better. I'd like to start doing some memory access visualization. Ether provides an awesome way to do that, so I definitely want to get there. Integrate system calls, look at the function call boundaries, and then get a little bit more interactivity with the unpacking process. The other thing that would be cool is if you could use something like WinDbg or Olly or IDA's debugger inside of the Ether system, and all the functionality is there, it's just a matter of putting it together. So, conclusions. Visualization makes it easy to identify the OEP. There's no actual statistical analysis needed. You can actually identify program phases. The graphs are pretty simple. And the preliminary user study shows that other people don't think we're full of it. Okay, there are some tripping hazards that are right here. Basically, use the 64-bit version of Debian Sarge. Follow the instructions that Artem and Paul have worked so hard on to the letter. I was guilty of not doing this, and then I reread it and it was much easier. So anyway, it has to be a 32-bit Windows XP Service Pack 2 image, and you have to disable DEP, large pages, and multiple CPUs, and for God's sake, whatever you do, kill the program inside of Windows before stopping Ether. Okay, so closing thoughts. Ether is awesome. Many, many thanks to Artem and Paul. This is some excellent, excellent work. If you get anything out of this, I think that more people should be using Ether, because it's just so completely cool. Source code, tools, and the latest
slides are on offensivecomputing.net. I do have an initial version of this out. Source code will be up later. If you use this tool, please give me some feedback. And a more formal treatment of this is going to be at VizSec 2009. So again, okay, and then thanks to all these people. They were very instrumental in making this whole thing happen. So with that, I believe we're going to the Q&A room, which is 106. Goon? Yeah, okay, so Q&A room 106, and then we'll be available after the talk for a little bit to talk to you. So thank you very much for coming out.
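
The line-thickness idea from the demo (a thicker line between two basic blocks means that transition ran many times, so it probably belongs to a loop) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Vera's actual code: the trace format (a flat list of basic block start addresses in execution order), the function names `count_transitions` and `hot_edges`, and the addresses are all made up for the example.

```python
from collections import Counter


def count_transitions(trace):
    """Count how many times execution moved from one basic block to the next.

    `trace` is a hypothetical list of basic block start addresses in the
    order they executed. Each (source, destination) pair is one graph edge,
    and its count is what a viewer could draw as line thickness.
    """
    return Counter(zip(trace, trace[1:]))


def hot_edges(trace, min_count=2):
    """Return only the edges taken at least `min_count` times.

    Edges seen once would be thin lines (straight-line code such as
    initialization); repeated edges would be thick lines (likely loops).
    """
    counts = count_transitions(trace)
    return {edge: n for edge, n in counts.items() if n >= min_count}


# Hypothetical trace: some setup, a small loop between two blocks, then an exit.
trace = [0x1000, 0x1010, 0x1020, 0x1010, 0x1020, 0x1010, 0x1020, 0x2000]

for (src, dst), n in sorted(hot_edges(trace).items()):
    print(f"{src:#x} -> {dst:#x} taken {n} times")
```

Keeping the counts per edge rather than per block is the design choice that matters here: a block visited often is not necessarily in a loop, but an edge taken often almost always is.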