Hi, my name is Alexei Bulazel, and I'm here to present my research on reverse engineering Windows Defender's antivirus emulator. A little about me before we get started. I am a security researcher for ForAllSecure. You may know the company from its victory at the Cyber Grand Challenge two years ago at DEF CON 24 with the Mayhem CRS. I also do firmware reverse engineering and cyber policy at River Loop Security. And I'm a very proud alumnus of RPI and RPISEC. They're playing over in the CTF right now, and I want to say good luck, guys. This is my first time speaking at DEF CON, so it's great to be here. This work is my personal research, and these are my own views, not those of my employers or anyone else I've previously worked for.

Before I get started, I do want to say this presentation is a deeply technical look at reverse engineering Windows Defender's binary emulator and, as far as I know, the first conference talk to really look at reverse engineering the antivirus emulator for any AV product. It's not an evaluation of Windows Defender. I'm not going to tell you whether this is a good product that you should use on your network or not. I'm not going to tell you whether it catches viruses effectively relative to other AVs or anything like that. Also, this talk does not address Windows Defender ATP or any other technology under the Windows Defender name. This is about Windows Defender Antivirus, the traditional endpoint AV product.

So, an outline of this talk: I'm going to go through an introduction, then talk about my tooling and process, how I did what I did, then reverse engineering, the real meat of the presentation, then a bit on vulnerability research, and then we'll conclude.

So why look at Windows Defender Antivirus? This is Microsoft's built-in AV product that is installed by default on all Windows systems, and on Windows 10 it runs by default, which means that over 50% of Windows 10 systems have Windows Defender Antivirus running.
The Defender name now seems to cover a variety of mitigations and security controls built into Microsoft's OS, so you have Control Flow Guard, EMET, ATP; all these different things now get lumped under Windows Defender: Windows Defender Device Guard, Windows Defender Application Guard, Windows Defender Exploit Guard, and so forth. Again, here we're focused on Windows Defender Antivirus.

It also runs unsandboxed as NT AUTHORITY\SYSTEM, meaning that if you found a vulnerability inside Defender and could exploit it, that would give you initial RCE. It would also give you a privesc up to SYSTEM, and you'd be running inside an AV process, so the AV would be unlikely to catch you doing anything malicious, because it's not going to flag itself for, say, writing a file, injecting into another process, and so forth.

It's also surprisingly easy for attackers to reach. I've not tried this myself, but friends of mine at Google Project Zero have told me that you could send an executable to someone who has a Gmail account open, and if they have that Gmail open in a background tab in Chrome, the Chrome browser will cache the downloaded file as it hits the inbox. That write will hit a minifilter driver on the Windows OS, and then the file that's written to disk will be passed off to Defender to be scanned. So you can actually reach this in a remote fashion, even though you would think this is a traditional host-based protection system.
My motivation came from this tweet from Tavis Ormandy at Google Project Zero, who about a year ago found some vulnerabilities in Defender's JavaScript engine with Natalie Silvanovich, also of Project Zero. I had a background in reverse engineering antivirus software; I did some work we called AVLeak with Jeremy Blackthorne, who's here in the audience, a couple of years ago and presented that at Black Hat and WOOT. But I had never actually analyzed Windows Defender, and I always wanted to, and I also had this interest in JavaScript engines. So I took on Defender and looked at the JavaScript engine for about four months, then presented that work and moved on to reverse engineering the Windows emulator, which I'm here to talk about today.

So our target is mpengine.dll. This is the main DLL that provides Windows Defender's scanning functionality. It's a very large binary, about 12 megabytes. Again, this is not the part of Defender that's, say, hooking system calls or filtering disk writes. This is the main scanning engine: you give it a buffer of data, and it says this is malicious or it's not malicious. That's its purpose. Inside mpengine are a variety of scanning engines; I'm focusing today on the Windows binary emulator, which is one of many.

Before we go into my work on the Windows binary engine, I just want to quickly recap what I did reverse engineering the JavaScript engine. This bitly link there will take you to that presentation, which was given at REcon Brussels in Brussels, Belgium, back in February. Windows Defender has a JavaScript engine that's used for analysis of potentially malicious JavaScript code. I reversed it from binary, and I used a custom loader and shell for dynamic experimentation, with help from Rolf Rolles, so thanks, Rolf. Throughout the JavaScript engine I found AV instrumentation callbacks that inform the heuristic antivirus portion of Defender about actions that the potentially malicious JavaScript is taking that it
uses to determine whether this is malicious JavaScript or not, say, for example, an exploit. I also found that the developers seem to prioritize security at the cost of performance, so the JavaScript engine is very pared down: it doesn't have JITing or many of the other features and optimizations that make modern JavaScript engines fast. On the other hand, I found it to be relatively secure and the attack surface to be relatively pared down. You'll see some common themes like that throughout this presentation today.

As far as related and prior work goes, there's really only a handful of prior publications on reverse engineering antivirus software at all, let alone the emulators within them. There is, of course, the work I mentioned, AVLeak, which I did with some collaborators at RPI, some of whom are here. There's also book work from Joxean Koret touching on this, there's Tavis Ormandy's work at Google Project Zero, and there actually are some talks from the AV industry itself, such as Mihai Chiriac's talk from, I believe, hack.lu, I think 10 years ago, where an AV industry developer talks about how Bitdefender's emulator works. But really there's not been a lot of offensive work, or work from people who don't work in the AV industry, looking at these systems. I'd also mention that patents are a great source of open source intelligence about how AVs work. Chris Domas called that out in his presentation looking at patents on x86 processors; similarly, you can find a lot of patents that describe undocumented functionality within AVs or how these particularly complex mechanisms work.

All right, moving into a background on emulation itself. There's this traditional AV model, and I think a lot of people have this idea about how AVs work, which is that they scan files and look for known malware signatures such as file hashes, sequences of bytes, or file traits. They might have some heuristics about, say, imports, or they recognize a static MD5 hash, or they recognize a particular
snippet of code that's known to be associated with a given malware family. But this is really an outdated model; it was already outdated 15 or 20 years ago, because malware could evade these hard-coded signatures by creating novel binaries with packing and obfuscation. You heard a lot about polymorphic viruses back in the early 2000s. So the solution that the AV industry came up with, again 15 to 20 years ago, was runtime dynamic analysis on the endpoint through emulation: actually running these unknown binaries in a virtualized environment and looking for signatures there. This technology goes by a number of names. You may hear it called sandboxing, heuristic analysis, dynamic analysis, detonation, virtualization, and so forth. At the end of the day it's all emulation, and that's what we're talking about today.

So, an overview of emulators in general. You begin by loading a potentially malicious unknown binary that you can't identify with less expensive analyses such as hashing or heuristics based on imports. You then run the binary in an emulated environment, so you're going to have a CPU emulator for the particular architecture of the binary, generally x86. You terminate emulation at some point, based on something like the length of time it has run, the number of instructions that have been executed, the number of API calls, or the amount of memory the malware has used. Throughout the run you're collecting heuristic observations about the malware's behavior that inform detections. You might also look for things like the malware calling CreateFile and then writing a known malware signature: you'd hook that implementation, and on every CreateFile you would look for, say, a known malware signature or a known malware hash.

At that point I'm moving into talking about tooling and process, how I
did what I did. Reverse engineering-wise, I used pretty standard industry tools like IDA, and BinDiff for patch analysis, so as Google Project Zero was discovering vulnerabilities I was able to diff updates of the DLL and find what had changed and how Microsoft tried to mitigate vulnerabilities inside Defender. Overall there are about 30,000 functions across this massive 12-megabyte DLL. This is enormous, probably one of the largest binaries I've ever taken on reversing. Obviously people look at firmware images that are much larger, but this is really monolithic for a single Windows DLL. What does make this job a lot easier is that Microsoft publishes PDBs, which are debug databases that have symbols and sometimes type information for the binaries.

Dynamic analysis-wise, AVs are generally harder to look at than traditional software, and dynamic analysis does require some work on the part of the reverse engineer. In Defender's case it's a protected process, meaning that even if you're SYSTEM or admin on your local system, you cannot attach to the process to debug it. Even if you have SeDebugPrivilege or anything like that, you still can't attach; it's protected by the OS. The solution to this is to go into a kernel debugger and, for example, debug an entire VM and then attach to the process from the kernel, but that's very expensive and just annoying to do. Introspection is also challenging: even if you can, say, pause on a breakpoint, actually understanding what's going on in the emulator state can be difficult even though you have a debugger running. Scanning on demand can be difficult to trigger. If you want to scan a binary, you might have to go into a GUI, click a couple of buttons, select something, choose it; it's a pain to do that. You want an automated command line interface where you just say scan this file, scan that file, scan the other file. And code reachability may be
configuration or heuristics dependent, meaning that local settings about, say, how aggressive the scanning is or what time limits you allow the scanner to have can all get in the way of effective scanning. The solution is to build a custom loader for these AV binaries, and it was nice that I was able to start with some work that Tavis Ormandy of Project Zero did on building his own custom harness for Defender, which I then extended extensively.

So first off I'm going to talk a little bit about Tavis's existing work, which he called loadlibrary. Tavis built a PE loader for Linux, so this is able to take a Windows DLL on Linux, load it up, and then actually run it. This is not a full replacement for something like Wine or any other Windows emulation. This is just enough to get Windows Defender itself running, shimming out the system calls that Defender would be making on Windows to Linux implementations.

So, talking through how Tavis's tool works, and the link here will take you to the GitHub project. We begin with the Linux binary, just a standard user-mode binary, and it's going to load mpengine.dll and resolve its imports. This is just the process of taking the DLL, relocating it in memory, doing the standard DLL loading process, and putting it in a read-write-execute memory buffer there on Linux. Then, in the IAT, the import address table, you're going to go through and shim out the implementations of various Windows APIs with Linux replacements. So, for example, CreateFile is replaced by a call to open or fopen, and WriteFile is replaced by a call to fwrite. Inside this engine you have an emulator, and for now just remember that there's a table called g_syscalls, which is a table of function pointers to various emulations of Windows API functions. On the outside we have our malware binary; here we have the standard MZ header on the binary. We're going to call a function exported by Defender called __rsignal, and this is the main entry point to
Defender. We give it a buffer of data, and it's going to come back with a malware classification. We then go through a process of selecting a scanning engine. Defender may do some initial analyses with things like static hashes; if those fail and it can't determine whether this is a malicious binary or not, it's ultimately going to route it into the emulator. The emulator will run, make its determination whether this is a malicious binary or not, and then come back with a virus identification, or it might say this is just benign.

So, a quick demo: I'm going to show you scanning with mpclient. This is Tavis Ormandy's unmodified harness for Windows Defender. Here we're scanning the EICAR test file, which is an industry standard test file for any AV. We scan the file, and it comes back and says we found eicar.com. So that's a demo where we're actually taking this Windows code, running it here on Linux, and seeing what happens when we scan a binary.

In addition to using this harness from Tavis, I did some dynamic analysis with customized code coverage tools developed by Markus Gaasedelen of Ret2 Systems, a fellow RPISEC alumnus. Markus made a tool called Lighthouse that lets you run a binary under DynamoRIO or Pin, collect coverage information, and visualize that in IDA Pro. You can see here in this control flow graph that the blue basic blocks are those that have been hit during a given scan. I found this to be an extremely powerful and useful tool when I was doing my reverse engineering. I did find it interesting that Halvar Flake, just about a month or two ago, gave a keynote at SSTIC where he talked about the challenges of introspectability with binaries, how it can be very difficult to introspect, analyze, and debug binaries, and how ultimately that's a hindrance to security. He explicitly called out the challenges of analyzing Windows Defender as one example of this, where because Defender is in a
privileged process on Windows, you can't analyze it under a tool like Pin or DynamoRIO. Of course, because we're running on Linux, we sidestep the whole issue of the protected process, and we can actually run and visualize coverage.

Okay, now moving into the meat of the presentation, talking about reverse engineering the emulator itself. First off I'm going to talk about startup of the engine, then we move into CPU emulation, instrumentation, and then the Windows environment and emulation.

The first thing that has to happen when we want to emulate a given binary is that we have to load it in, initialize the emulator, and get everything started up. So we're going to call the __rsignal function, which provides the entry point to Defender scanning. We give it this buffer of data to be scanned and classified, and it will return the malware classification. These results can actually be cached as well. There's lots of stuff going on in the back end that we don't really care about; we ultimately care about just going into the emulator itself. The emulator has to be initialized: we have to allocate memory for execution, and we have to initialize various C++ objects that are involved in the emulation process itself and various subsystems in Defender. For example, we have to create an object manager instance, we have to set up the virtual file system, and so forth. We're going to load the binary that's to be analyzed, resolve its imports and things like that, and then initialize virtual DLLs in this emulated process memory space. These are akin to the real DLLs on a real Windows system, and they provide Windows API functionality inside the emulator. Throughout this process Defender is collecting heuristic observations about the binary, and you can see these on the right side here, for example things like pea_suspicious_section_size. These may inform some heuristic classifications in Defender: because there's a suspicious section size, maybe this is malware. We'll also be
doing things like what you see in the bottom right, some MinWin resolution, resolving api-ms and some of the API set DLLs. Here in the bottom left I have this example from when we're setting up a name for the binary to be emulated. You can see that if the binary is a Windows executable, it'll be called myapp.exe. This is something you could use to write evasive malware that says, if my name is myapp.exe, I won't run, because I know that I'm running inside Defender. And indeed, if you Google this string, you will find malware binaries online that explicitly look for the name myapp.exe and choose not to run if they see it.

After startup and initialization, we're going to move into talking about CPU emulation. Technically what Defender does is not so much emulation as it is dynamic translation. This is akin to what QEMU, the Quick Emulator, does, which is basically taking machine code of a given architecture, lifting it up into an IL, or intermediate representation, and then taking that IL and dumping it out with a JIT engine into executable code. Defender supports a number of architectures, which you can see here in the enum on the right, ranging from x86 of three different flavors up to ARM and even VMProtect, so they can take VMProtect opcodes, lift those into an IL, and dump them out into sanitized x86 to be run and analyzed, as well as ARM. Now, this subsystem is incredibly complicated and not really a primary focus of my research, but I'll give you a brief overview of it in the next few slides.

We begin with the architecture-to-IL lifting process, which happens in these giant functions that are named architecture_to_IL. You can see an example from the x86-to-IL translator, just an absolutely massive, ugly switch statement with thousands of cases; IDA gets super slow when you load this in. Basically what they're doing here is grabbing a byte of an x86 opcode, determining what it is, and then emitting the corresponding Windows
Defender intermediate representation for that binary operation. You can see an example here in the bottom right, where all push instructions lift to 13 in the Windows Defender IL.

After we lift to the IL, there is also an IL emulator that runs in software, so we can actually run binaries in software. I never observed this being run during my research; I did some code coverage analysis and never saw it being hit. My intuition is that this is there to support analysis of x86 binaries on non-x86 hosts. So, for example, if you're running Windows Defender on Windows for ARM, you don't have to have an IL-to-ARM JIT engine; you can just run it in software.

Now, as far as the IL-to-x86 JIT translation goes, we're taking IL code and translating it a basic block at a time, similar to the way QEMU does things, and I did observe this JIT being used during my research. Defender handles unique instructions that it can't handle with the JIT through software-based emulation: if it can't JIT an instruction out, it'll generate a call directly into a function that emulates it. We're going to show that on the next slide, but you can see here, circled in red on the left, the opcodes being constructed. They're constructing a mov of an immediate and then a call of the immediate, calling directly into a function that handles a particularly unique architectural instruction or event. Over here on the right you can see the LEA opcode being emitted. The LEA opcode on x86 is 8D, so as you're dumping out from the LEA IL instruction down to x86, you emit 8D and then OR in a register value to create a valid x86 instruction.

Microsoft actually documented this in 2005 at Virus Bulletin with a paper called "Defeating Polymorphism: Beyond Emulation," and it's definitely worth checking out. It's really remarkable that Microsoft was experimenting with this technology
almost 15 years ago. ILs are so hot right now; everyone's playing with ILs for things like Binary Ninja or various program analyses, but Microsoft was doing this on the endpoint, on your computer, your grandma's computer, everyone's computer, 15 years ago. They were lifting to ILs, JITing them out, and doing analyses on them. I found that very impressive.

So then we have these architecture-specific escape handlers for the unique architectural events that we can't emulate with the JIT engine. You can look at this offline and see an exact listing of some of these enums. An example of one of these functions would be this software-based emulation of the x86 CPUID instruction. This is an instruction that provides unique information about a given x86 CPU, and here it's emulated in software. As shown here, I wrote a malware binary that does CPUID with this argument, hex 8001, and when we run this binary inside Defender's analysis engine we get this code coverage. We see that we pass through the block where that same immediate is compared, and then we go down the true branch, because the immediate that our code used matches up with the immediate here in software. They can then emulate CPUID by setting register state accordingly.

All right, moving into talking about my instrumentation, which is a big enabler for the rest of my research. The problem with analyzing Windows Defender, as I said, is that there's very little introspection. It's very difficult to tell what's going on inside of it; all you really get out of it is a virus identification. Now, you could exploit virus identification as a side channel to extract information from inside the engine, and indeed that's what I did with the AVLeak project a couple of years ago: exploiting malware identifications as a side channel to get information about what's going on inside various AV emulators. But this is really slow and inefficient, so a smarter technique is to go in and sort
of give ourselves a malware's-eye view of what's going on inside the engine. mpengine.dll has various functions that are invoked when various Windows APIs are called by malware running inside of it. We can hook those emulation functions and provide our own implementations, so we can create a one- or two-way I/O path to share information with the outside and also, in turn, inform the malware binary inside about what actions we want it to take.

So let me give you a diagram of that. This is the original loadlibrary diagram I showed you. This is Tavis Ormandy's tool in an unmodified state; this is how it works. I went in and hooked the g_syscalls table, which is the table of about 120 functions providing emulations for various Windows APIs. I hooked it and replaced those implementations with my own implementations of various common functions like OutputDebugStringA or WinExec, so when these functions are called by our malware binary inside the engine, our functions are invoked instead. Here's an example of our OutputDebugStringA hook and the process we have to go through, which is resolving the relative offsets of these functions and then setting hooks in the read-write-execute DLL buffer in our Linux process.

What this looks like is this. Here in the top right we have our IDA Pro decompilation of Windows Defender's emulation of OutputDebugStringA. It's basically a no-op: all it does is retrieve a single parameter off the virtual stack and then bump the tick count, so it bumps the time a little bit in the emulator. Here in the center of the screen I have my reimplementation of this function, and we're going to walk through it step by step. First off we have our declaration, which takes a void pointer. pe_vars_t is a massive, about half-megabyte structure passed to all these Windows API emulations. We don't need to know an exact definition of that structure, so we just take a void pointer and say we're not going to worry about
it; it's just a pointer. Then we have this local variable to hold parameters to the function. The function has parameters passed to it in the virtualized, emulated environment, and we want to interact with those, so we have to make some space for them. We're going to use a function internal to Defender to pull one parameter off the virtual stack, so we're looking at the virtual ESP and EBP state in this virtual memory space and pulling off the 4-byte value that was there. I'm actually calling back into Defender from my hook function to do that. Then I'm calling a get-string function that's going to translate a virtual address inside the emulator to a real address that we can interact with locally, and now we can just print that string to standard out.

This sounds like a lot, but let me show you a quick demo of the interaction. Here I have a malware binary that's going to say "hello DEF CON." When we run it, it just calls OutputDebugStringA with "hello DEF CON." We're now going to scan that binary inside my hooked and modified version of Tavis's loadlibrary tool, and you'll see here it says "hello DEF CON." Now, going back to Visual Studio, we're going to add a new line, "this is a live demo." Of course, this is a prerecorded video, because the DEF CON organizers this year wanted us to do prerecorded videos, but I was doing this live. I just rebuilt the binary, and scanning it again, it's now going to say "hello DEF CON" and then also "this is a live demo." What's happening is that inside the emulator our malware binary is calling this function, and because we've hooked the implementation of the OutputDebugStringA emulation in Defender, our function is being called instead. We're going to run it one more time, I believe, with some more information. You can see here we have richer debug output, and we can see things like the exact addresses passed to it from the virtual memory space. So this is a big enabler for the rest of my research, the fact that I had this sort
of window into what's going on inside the emulator. I can have my malware binary inside take observations and then post them out to the outside world.

As far as my malware binary goes, I call it myapp.exe; again, that's the name of all binaries running inside Defender's engine. It does this I/O communication through OutputDebugStringA and some other functions. On the right side you'll see a list of factors that I found could impede emulation and the ways I got around them. I had to really massage the linker, optimizations, and imports in order to get binaries that were consistently emulated by Defender, and I'll be releasing some code at the end of this talk that includes a very simple Visual Studio project that I was able to get consistently emulated when scanned with loadlibrary.

Finally, as far as the reverse engineering goes, we're moving into the Windows emulation and the Windows environment, which I think is the most interesting part of this presentation. I'm going to start off by talking about the user mode environment, the emulation of a fake Windows user mode. In Windows Defender there is a virtual file system. Just as any real system would have a file system and files that malware might look at, Defender virtualizes one. There are about 1,500 files on the virtual file system, and you'll see a variety of things in there. Mostly it's fake executables that are there for malware binaries to, for example, infect, or do different things to that could be indicators that they are in fact malicious binaries.

So I'll do a quick demo of dumping the file system. Again, using that mechanism that I showed you of posting data out with OutputDebugStringA, we're able to enumerate the entire file system and dump it in just a few seconds. I did actually use a slightly more sophisticated hook here, using WinExec, and I'll show some examples in my backup slides; it's not as simple as just OutputDebugStringA-ing them. But you can see here, in just a second or two, we dumped the entire
virtual file system from inside Windows Defender. We had a malware binary go inside there, enumerate all the files that it could see, and then dump them out. After we dump them out, we see that there are about 1,500 of them in this virtual file system, and you'll see things like this: the word "goat" repeated thousands of times over in a file called aaa_TouchMeNot_.txt. My intuition is that this file is right there on the C: drive so that a malware binary might read it in and, say, send it over the network or encrypt it or do something else that indicates it is indeed malware. So if you touch it, that might be an indicator that you're malicious. The reason it has the word "goat" pasted thousands of times over is presumably that it is a goat file. That's an AV industry term for a sacrificial file, like a sacrificial goat, that you let get infected or changed or encrypted by malware in order to have the malware show its true intent. So that was an interesting artifact. Again, this is also something you could use to write malware that says, if I see the word "goat" thousands of times over in a file called aaa_TouchMeNot_, I know I'm running inside Defender, therefore I'm not going to run; I'm not going to do anything malicious.

We'll see fake config files, and you can see that these were very clearly written by a real human, with comments like "blah blah" and generic SQL queries. We have a virtual registry that has thousands of entries, and enumerating the whole registry and dumping it out, we'll see things like this. For example, there's a registry entry for World of Warcraft. Presumably there's malware that looks for the World of Warcraft registry entry and touches it, so if we saw a call to, say, RegOpenKey on World of Warcraft, that might be an indicator of potential malicious intent. We'll see various fake processes running on the system. These are not real processes; it's just that when you call the function
to enumerate all processes, it'll give you this fake listing, and highlighted at the bottom in yellow there is our process, myapp.exe. A quick demo of that: dumping the process listing, again using the same mechanism that I developed. There you can see, in real time, it took less than a second and we dumped the entire process listing.

All right, back to the presentation. In addition to this environment, we have Windows user mode code that runs to provide emulations of various Windows API functions. There are generally two types of Windows API emulations, akin to the Windows API functions on a real Windows system: those that stay in user mode, which are ones that stay in the emulator, and those that resolve into a syscall, which here in Defender is like a trap to a native emulation. Symbols indicate that these emulated virtual DLLs in the emulator environment are called VDLLs. Because they are simply DLLs, once we have a file system dump we can just reverse them by throwing them in IDA; they're standard Windows PE files. When we look at them, they're definitely not the real implementations of things like kernel32 that you would see on a real system. We'll see things like this: in kernel32, if we call GetUserName, it'll return a hard-coded string of JohnDoe. This is again something we could use to create evasive malware that says, if I see the username JohnDoe, I'm not going to run. We'll see a computer name of HAL9000, ostensibly an Arthur C. Clarke, 2001: A Space Odyssey reference. So again, you could write malware that looks for HAL9000 and knows it's running inside Defender. We'll also see very simple implementations of functions like this RtlGetCurrent function. All that function needs to do is grab a value from the FS segment at fs:18, and they support memory segmentation at the architectural level, so they can just execute that actual instruction inside the emulator. Or we'll see complex functions like RtlSetSaclSecurityDescriptor just stubbed out; they just return zero. Many more functions are just stubbed out, returning zero, negative one, and so forth, just so that calls to them don't interrupt emulation. So lots of complex functions are not fully emulated by Defender. We'll also see things like this, again more unique strings and identifiers that tell us we're running inside Defender, like these German IP addresses and references to German websites. Maybe a German programmer developed this particular DLL emulation.

So that covers some of the user mode code and the very simple emulations, those that just return hard-coded names like JohnDoe or HAL9000. How about the user-kernel privilege boundary, and how do we get into more complex emulations such as those requiring access to the virtual file system? These functions are implemented with a hypercall-like instruction called apicall. This is, of course, not a real x86 instruction. It has the opcode 0F FF F0, followed by a 4-byte immediate describing the particular function to be invoked. When this instruction is executed on the virtual CPU, it generates a call into a native mpengine.dll function that provides the emulation. These are complex functions that modify system state or require particularly complex handling. So in CopyFileWWorker we have an apicall to kernel32 CopyFileWWorker; the virtual CPU sees that instruction and generates a call directly into the emulation of that function, and it's emulated there in software in mpengine.dll. This is great attack surface: if you found any vulnerabilities in these native emulation functions, you could use them to break out of the emulator and infect the native host. The disassembly here is provided by an IDA processor module, and I'll have an article coming out in PoC||GTFO issue 19 describing exactly how this IDA processor extension module works.

Once we have these apicall instructions running, they trigger a call to a function that looks at the g_syscalls table, which is a big table of function pointers and hashes. It looks for the 4-byte immediate from the apicall instruction and then dispatches to the function that matches it. So here's a workflow of what this looks like inside the emulator. Here we have kernel32 OutputDebugStringA. It's going to do things like log the number of times it was called, so if it's called more than 900 times that might trigger some unique behavior, but ultimately it's going to resolve down into this function, apicall kernel32 OutputDebugStringA, which uses the apicall instruction; you can see the bytes 0F FF F0 BB 14 80 B2. The virtual CPU sees that instruction, and then the hypercall steps in and transitions us into native emulation, out of this managed dynamic translation context, and we hit the native emulation for OutputDebugStringA. Of course, this is what we hooked, and we had our own OutputDebugStringA implementation that I was using to post information out of the emulator.

Enumerating the emulated functions that have native emulations, these are them. The yellow functions are those that are not found on real Windows systems, so they're specific to Defender, for example for debug functionality or unique backdoor management. Here are more of them, including a number of VFS functions, which are for low-level access to the virtual file system. All these native emulation functions take a pe_vars_t, a very large, half-megabyte structure containing everything about a given emulation context. Then we have templated parameter functions that are used to retrieve parameters to the function from the emulated stack, and programmatic APIs for manipulating return values, register state, and the CPU tick count or time; all that sort of stuff can be programmatically managed through manipulations of the pe_vars_t structure. Virtual memory can be interacted with through an API similar to that found in
many emulation engines, such as Unicorn Engine, where we can memory-map virtual memory into our real memory space and manipulate it there. There are wrapper functions for common operations like reading a single byte, writing a single DWORD, or reading or writing wide strings or regular char* strings; these all have utility functions wrapped around them to make them easier for developers. Moving into kernel internals: we've talked about the user-mode code and how the user-mode code gets into kernel mode, or the native emulations, so let's look at how those native emulations are themselves implemented. The Windows kernel provides a number of facilities to any binary; this is ntoskrnl.exe and associated drivers, and these are really the core of the Windows OS, or really the NT kernel. Examples include the object manager, process management, file system access, the registry through registry hives, and synchronization primitives for IPC. First off, we're going to talk about the object manager. This is an essential part of the Windows executive that provides management for handles, so any time you are opening a file, a socket, and so forth, it's going to go through the object manager. Defender supports five types of objects with its object manager: file, thread, event, mutant (which is a synonym for mutex), and semaphore. These are stored in the big object manager map here in mpengine.dll. They're stored in memory as C++ objects, and they all inherit from a common parent class, the object manager object. We then have subclasses like file object or mutant object, and you can see I've made the font a little larger for the traits unique to those particular C++ objects, such as the file handle member in the file object, or the wait count variable for a mutex, since various processes can wait on a given mutex. C++ RTTI is used to cast between these subclasses and their parent class when they're retrieved, and the object manager can be
interacted with programmatically through these various functions. So if we open a mutant, they're going to grab that object and then work with it. If we open a file object, it actually calls the object manager's get-file-object routine, which will first check the type and then explicitly use RTTI to cast to a file object, and fail if the retrieved handle is not indeed a file handle. We'll also see things like the pseudo-handle for the current process being emulated as hex 1234, again a trait of the emulator we could use to write evasive malware, based on seeing that our own handle is 0x1234. We have a virtual file system that provides emulation of and access to a file system. This is accessed through the standard ntdll APIs, NtWriteFile, NtCreateFile, and so forth, as well as these lower-level VFS functions, which provide sort of a backdoor, unsanitized access to the file system emulation. Finally, moving into talking about AV instrumentation: the heuristics and analyses the AV is doing throughout the runtime. There are some internal functions that are exposed through the hypercall apicall interface; I've summarized them here, and we're going to look at a few of these. First off, MpReportEvent, which is used to communicate information about a malware binary's actions to Defender's heuristic detection engine. These calls are in some of these user-mode emulations, such as GetUserName or GetComputerName. Those don't require trapping into a full native emulation, and it would increase the attack surface greatly if they all did, but we do want to inform Defender that the given function was called. So if GetSystemDirectory is called, it'll report event 12331, or if you create a process and you do it suspended, it'll report hex 3018, but it'll also say "create suspended," specifically noting that a process like that was created. MpReportEvent can be called in more cases; you can see here just more examples. This is called thousands of times throughout these VDLLs, and a more concrete example of how
this might play into AV identification of potentially malicious binaries is here, where we see that if we call TerminateProcess on a PID in the 700 range (and you'll note that all these various AV processes are in the 700 range), it'll trigger a call to MpReportEvent 12349, but it'll also say "AV." So if you try to terminate an AV process, that's probably a good indicator you're malicious. NtControlChannel is sort of a backdoor interface for administering the engine. This is something Tavis Ormandy hit, and I went in here and reverse engineered the 32 switch-case options of this function to show you what they all do. These do things like rewrite microcode, manipulate register state, all sorts of stuff. Great attack surface, and definitely something that shouldn't be open to malware binaries running inside the emulator. We're going to conclude by talking about vulnerability research. We'll start off by trying to understand some prior vulnerabilities discovered by Tavis Ormandy at Google Project Zero. Tavis discovered this apicall instruction that I talked about, and he was able to call directly into native emulations of functions, rather than passing through their apicall stubs, by just generating the apicall instructions on the fly, as you can see here. Tavis was then hitting internal debug functions like NtControlChannel, which, when you give it option hex 12, goes to rewrite microcode. This code here lets the user specify the count in a tight loop. We only have, I think, a thousand elements allocated for the new microcode information, but the user can say two thousand, and we have a linear buffer overflow. Microsoft patched this by adding a check that the count is no greater than one thousand; if it is, the function returns zero and doesn't run. Tavis also looked at the virtual file system, and by calling directly into these unsanitized functions to access the virtual file
system, was able to basically get a linear heap read and write primitive by creating a file with these strange sizes, and this sequence of calls could crash the engine with an out-of-bounds write. Now, I looked at the mitigations that Microsoft put in for the abuse of the apicall instruction. The issue was primarily that Tavis himself was generating the apicall instruction on the fly from the malware's .text section, so Microsoft added a check that asks: is the apicall instruction coming from a VDLL page? If it's not, it's going to deny the user the ability to invoke a native emulation function. This means that these apicall instructions can only be invoked from code pages that are associated with a given VDLL; they cannot be called from the malware binary. In fact, if you call them, it'll do MpSetAttribute, which will basically set a heuristic that you tried to call the apicall instruction from your .text section. This is really, really weird, and probably a strong indicator of malicious intent. I found that I could bypass this mitigation by simply finding the apicall stubs in memory in our VDLLs, which I can reverse engineer; I can just bounce off the apicall instruction there and hit these interfaces with my own controlled arguments. This is not good. I did report this to Microsoft, and they told me this is not a trust boundary, kind of a classic Microsoft response to a lot of vulnerability disclosures. So it's not a trust boundary unless you find an actual vulnerability, like an actual buffer overflow, in there. The fact that there's this logical flaw, that I can hit internal debug interfaces and do things like stop emulation right then and there or change microcode in the emulator, is evidently not a vulnerability according to Microsoft. So, an example of a bypass here, doing something pretty benign: we're just going to hit OutputDebugStringA. I found in
kernel32 the offset of OutputDebugStringA, and I can resolve that address, treat it as a function pointer, and just bounce off this emulation. When this runs, we hit OutputDebugStringA. Now, more maliciously, we can hit NtControlChannel, again that internal debug interface left in by developers, maybe to debug or administer the engine, and we can set our own heuristics; for example, if we set "virus body found," it will trigger immediate malware detection. So, a quick demo of that. In this video you can see we're calling OutputDebugStringA in the legitimate way, and then calling it with our OutputDebugStringA abuse through this unintended interface left there in the VDLL code page. Once we compile and run this binary, we'll also hit NtControlChannel, and we're going to use NtControlChannel to check the exact version number of the engine. This was done on the February 2018 build of the engine. So with our kind of ret2apicall technique, we run this binary and we'll see we hit OutputDebugStringA the normal way, then through the apicall with the bypass for Microsoft's mitigation, so we have a controlled argument going in there, and we also show that we can hit NtControlChannel with a controlled argument as well. Again, the implication of this is that we can hit these internal debug interfaces with attacker-controlled arguments. Probably not a good idea. Finally, I want to talk a bit about fuzzing. I was then able to fuzz emulated APIs, basically working out some more complex mechanisms to allow our channel to be a two-way I/O channel, not just an output channel. I took MWR Labs' OS X kernel fuzzer, which generates random values to fuzz the OS X kernel, and I folded that in with my code, generating random values each time and then passing those into the emulator. I was able to do things like fuzz NtWriteFile, and actually reproduced Tavis's crash, but in a unique way that got
around the sanitization that NtWriteFile normally does. I reproduced his crash in VFS_Write, but through NtWriteFile, without having to abuse the apicall instruction. You can see in this demo here we're going to do that: we're going to resolve the address of NtWriteFile and then fuzz that. This whole mechanism here with the params is a more complex interface that I have for passing information in and out of the emulator. Basically, on the outside of the emulator we're generating fuzz input to give to the inside of it, and we're calling NtWriteFile with those fuzzed parameters and seeing what happens. Running this, you're going to see it just run for quite a while; it's just going to keep running. In my experience, it took about seven minutes running single-threaded, at around eight thousand system calls per second, to reproduce Tavis's crash. Again, this is not a smart fuzzer: there's no AFL, there's no code coverage information, it's just dumbly throwing random values at Windows Defender in order to fuzz it. There's our demo. Moving into the conclusion: we covered tooling and instrumentation, CPU emulation basics for x86 binaries, and a bit on vulnerability research and fuzzing for Windows Defender. We didn't cover a whole lot of other stuff, for example x64 emulation, ARM emulation, VMProtect emulation, and the 16-bit emulation. There is a full DOS emulator; aside from the Win32 emulator for a modern Windows system, there's a 16-bit emulation built into Defender. Really interesting attack surface as well, and probably not as well looked at as the 32-bit one. We didn't look at the threading model, how you could do multithreading for binaries inside emulators; that's always a source of problems for AV emulators at large, so worth looking at. We also didn't cover analysis for .NET binaries; we were primarily looking at Windows PE binaries that are just compiled x86 code. Also inside mpengine we have unpackers, parsers, a JavaScript engine, which
you can see in my REcon Brussels talk, other scanning engines, and a .NET engine. Now, I want to say that people love to talk about AVs and what they can and can't do, where they may or may not be vulnerable, but there's not a lot of ground truth about AVs in the public, and I think there should be more. I think they're a really fascinating target to analyze. I think they're a lot of fun. This is much more interesting, to me at least, than looking at malware: actually seeing how malware gets caught, mitigated, and detected. You also learn a whole lot about, say, NT kernel internals and object managers and things like that; it gives you an impetus to look at all these different technologies. A lot of claims about AV vulnerabilities, and how AVs may or may not be vulnerable, are based on Tavis Ormandy's work and a bit on Joxean's work, but there's really not a whole lot else out there. I really like this tweet from Joxean, where he said that if you Google antivirus internals, all you find is me, him, and then Tavis Ormandy. I would say, if you like this sort of work, definitely grab a copy of his book. It's an awesome book and really underappreciated by people; some really incredible work went into that. I'll be releasing some code later. Here's my GitHub; I'll also tweet about this, so no worries, you don't have to take a picture of the slide. I'll be sharing some of the harnesses that I built and an IDA disassembler for the apicall instruction. I'll also be publishing an article in PoC||GTFO issue 19 describing more of the technical details of some of these technologies. And that concludes the presentation. I'll have a whole lot more slides being released online after this; this is only about 50 percent of the material that I prepared for today. My JavaScript slides are available there at that bit.ly link. I want to thank all my friends, Tavis, Mark, Markus, Joxean, and the numerous other friends who helped me edit this presentation and get it here to DEF CON.
Hit me up on Twitter if you have any questions; I have open DMs. Thanks very much.
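As a rough illustration of the apicall mechanism described in the talk, the Python sketch below recognizes the three opcode bytes 0F FF F0, reads the 4-byte immediate that follows, and dispatches through a lookup table, loosely mirroring the table lookup the talk describes. Only the opcode and the example byte sequence 0F FF F0 BB 14 80 B2 come from the talk. The names `decode_apicall`, `SYSCALL_TABLE`, `dispatch`, and `native_output_debug_string_a` are hypothetical, and reading the immediate as little-endian is an assumption.

```python
import struct

# Opcode bytes quoted in the talk; apicall is not a real x86 instruction.
APICALL_OPCODE = bytes([0x0F, 0xFF, 0xF0])


def decode_apicall(code: bytes, offset: int = 0):
    """Return the 4-byte immediate if an apicall starts at offset, else None."""
    end = offset + len(APICALL_OPCODE) + 4
    if len(code) < end:
        return None
    if code[offset:offset + len(APICALL_OPCODE)] != APICALL_OPCODE:
        return None
    # Assumption: the immediate is little-endian, like ordinary x86 immediates.
    (immediate,) = struct.unpack_from("<I", code, offset + len(APICALL_OPCODE))
    return immediate


def native_output_debug_string_a(message: str) -> str:
    """Hypothetical stand-in for a native emulation routine."""
    return f"[native] {message}"


# Hypothetical dispatch table: immediate value -> native handler.
SYSCALL_TABLE = {0xB28014BB: native_output_debug_string_a}


def dispatch(code: bytes, *args):
    """Decode an apicall and hand control to the matching native handler."""
    immediate = decode_apicall(code)
    if immediate is None:
        raise ValueError("not an apicall instruction")
    handler = SYSCALL_TABLE.get(immediate)
    if handler is None:
        raise KeyError(f"no native emulation for {immediate:#010x}")
    return handler(*args)


if __name__ == "__main__":
    stub = bytes.fromhex("0ffff0bb1480b2")  # byte sequence quoted in the talk
    print(hex(decode_apicall(stub)))
    print(dispatch(stub, "hello from the emulator"))
```

The dictionary stands in for the table of hashes and function pointers; the real engine also checks where the instruction was executed from before dispatching, which this sketch leaves out.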