All right, thanks for coming. I know it's lunchtime and no one likes to miss lunch. My name is Josh Pitts. I got out of the Marines right before 9/11. I wrote Backdoor Factory and Backdoor Factory Proxy. Backdoor Factory is a file infection framework that injects code into PE, ELF, and Mach-O binaries on x86 and 64-bit, and it also does ARMv7 for ELF. Because of that work, I found OnionDuke, which was Russian malware infecting downloads over Tor. Why would you do that over Tor? I have no idea. I also co-authored an environmental keying framework that makes environmentally keyed malware, called EBOWLA, and it has to be spelled correctly. I work at Okta, where I do red teaming, design reviews, everything you can think of: pen testing, reverse engineering of anything we put on our network. There's my Twitter handle, and there's my GitHub, where a lot of this code is.

So why this talk? Well, I think writing shellcode is fun, believe it or not. My wife thinks I need new hobbies, and that's fair. I want to talk about the current state of public Windows shellcode, how it works, and how to update it. So we've got three parts: history, further development, and mitigations and bypasses.

So, the first part. Metasploit shellcode right now uses Stephen Fewer's hash API, or the Metasploit payload hash. The basic concept is that it uses a 4-byte hash, built with a 13-bit ROR (rotate right) instruction, to find Windows APIs in the export table of a DLL, any system DLL. It was introduced in August 2009. Some of this has roots going back to Matt Miller's, or skape's, Win32 shellcode paper. These slides are online and I have a link to the paper. And skape now works for Microsoft in the mitigation department, so just keep that in mind.

So this is how it works. You do a call over the payload.
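As a rough illustration of the hashing idea described above, here is a minimal Python sketch of a 4-byte rotate-right-by-13 hash over a DLL name and an API name. How the two parts are combined (an upper-cased UTF-16 module name plus an ASCII function name, each hashed with its terminator and then added together) is an assumption based on the publicly available Metasploit stub, not something stated in the talk. The names `ror32`, `ror13_hash`, and `api_hash` are made up for this example.

```python
def ror32(value, bits):
    """Rotate a 32-bit value right by `bits` positions."""
    value &= 0xFFFFFFFF
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def ror13_hash(data):
    """Fold bytes into a 32-bit hash: rotate right by 13, then add the byte."""
    h = 0
    for byte in data:
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


def api_hash(module, function):
    """Combine a module-name hash and a function-name hash into 4 bytes.

    Assumption: the module name is upper-cased UTF-16LE and the function
    name is ASCII, and both include their terminating null.
    """
    module_bytes = (module.upper() + "\x00").encode("utf-16-le")
    function_bytes = (function + "\x00").encode("ascii")
    return (ror13_hash(module_bytes) + ror13_hash(function_bytes)) & 0xFFFFFFFF


if __name__ == "__main__":
    for dll, api in [("kernel32.dll", "LoadLibraryA"), ("kernel32.dll", "WinExec")]:
        print(f"{dll}!{api}: 0x{api_hash(dll, api):08X}")
```

Because the result is only 32 bits wide, different names can produce the same value, which is the property a collision-based defense relies on.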
You push everything onto the stack, and then the shellcode, the hash API, parses the export address table, jumps into a Windows API, and returns back to the payload logic. It continues until there's no more payload logic and your payload has been executed. What was kind of cool about this is that it allowed payloads to be portable across all Windows platforms.

And of course some mitigations came out. You have EMET, which was obviously by Microsoft. Then you have Piotr Bania, who has a Phrack article on beating some of these payloads. And there's also the HAVOC mitigation; the concept was released in PoC||GTFO 12:7, amen. HAVOC stands for Halting Attacks Via Obstructing Configurations. It was funded by DARPA Cyber Fast Track, and it's by Digital Operatives. What it did was throw a DLL up into the loaded module list, and that DLL basically carried pre-computed collisions for this 4-byte ROR hash. It made these collisions, and it actually worked pretty well.

Then there are the EMET caller and EAF protections. Remember, the shellcode was introduced in August 2009, and the EMET EAF protection was introduced in 2010. What it does is protect kernel32, ntdll, and kernelbase from code trying to read their export tables. Then you also have the caller protection, which is really more aimed at ROP-like calls, but Stephen Fewer's hash API would actually trigger it too. Basically, you cannot go into a Windows API via a ret or a jump. It had to be, basically, an indirect call.

As you may have heard, EMET is end of life. It's supported through July 31st next year. It still works. But it's being reintroduced into Windows 10 this fall, and we're going to talk about that. And it does still work. This is the Tor exploit that was targeting some people that were doing illegal things on Tor.
Nobody does that. Basically, EMET flagged it on the stack pivot mitigation. So EMET still has some relevance in the field, right?

So, bypassing EMET. There have been a couple of public bypasses. SkyLined used a method to make EMET think that the code reading the export table was valid. Then you have Piotr Bania: when EMET used hardware breakpoints, he would essentially erase them. And Offensive Security had a bypass that reused code within EMET itself. Now the caller bypass. Like I said earlier, if you have the handle to a Windows API, there are a couple of ways to do it. You move it into a register, dereference the register, and then you can call it directly. Or you can do an indirect call. I think it's shorter to do an indirect call directly; I think it's only two bytes.

So knowing this, back in May 2014, I put import address table based payloads into BDF. They worked just like compiled code, because I realized the closer you are to working like compiled code, the harder it is to stop you with any mitigations. It also makes you harder for antivirus to see. So I added these in May. Then in December, I wanted to see if I could use the concept in exploitation, and I wanted to use the import address table.

So I looked for some prior work. skape's paper, again: I looked at that. He had an import address table payload, but there were some issues with it, because he used the export table to get to the LoadLibraryA API, used that to get to a DLL, and then used the import table of that DLL. And I think it was hard-coded to a certain version of a system DLL. Then Piotr had an import address table parser, and I decided to use that. But just for reference, the first known use, I guess, of the import address table for something malicious was a virus from 1997. It was not really an exploit using the import table directly.
It was a file infector that just used the import table. So this is Piotr Bania's proof of concept. I could not get it to work on a modern Windows OS. I think I was testing on Windows XP Service Pack 3 and also Windows 7. So I decided to update it with some tolerances, and I was playing with that.

The way it works is this: you find the image base in the PEB, you get the PE header, and then you get the import table's relative virtual address. Then you loop through and find kernel32 via an ASCII match. Next, I would find LoadLibraryA and GetProcAddress. But I added a check to set the bounds of readable memory, because if you tried to read memory that wasn't readable, you would crash. After this is done, LoadLibraryA is in EBX and GetProcAddress is in ECX. I bolted on a reverse TCP shell, and it bypasses the caller and EAF checks really easily.

Then I emailed the EMET team, and this was the response, essentially. So they knew about it. Obviously Matt Miller, I don't know if he was consulted, but they knew about it. I mean, they get tons of crashes and they analyze those crashes. My PoC just used LoadLibraryA and GetProcAddress from the import table, so it was pretty limited. If the binary did not have them in its import table, it was useless at the time.

So this code sat from December 2014 until February 2016, when subTee came along. If you don't know subTee, you should get to know him. He's the person who executes code in things that shouldn't be executing code, not through exploitation, but through hidden features and other things. These are binaries signed by Microsoft, and you can run code with them, and it's great. It's pretty awesome. So he tweeted that he was having problems with EMET because of EAF, and I knew exactly what his problem was.
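The import table walk described earlier (PE header, import table RVA, match the DLL name, then match the API name) can be sketched in Python. This is only an illustration over an in-memory byte buffer, assuming a 32-bit PE layout where RVAs equal buffer offsets; real shellcode does this in assembly starting from the image base in the PEB. The helper names and the tiny synthetic image are invented for the example.

```python
import struct


def u32(img, off):
    """Read a little-endian 32-bit value at offset `off`."""
    return struct.unpack_from("<I", img, off)[0]


def cstr(img, off):
    """Read a null-terminated ASCII string at offset `off`."""
    return img[off:img.index(b"\x00", off)].decode("ascii")


def find_iat_slot(img, dll, api):
    """Return the RVA of the IAT slot holding dll!api, or None if absent."""
    pe = u32(img, 0x3C)                        # e_lfanew points at the PE header
    if img[pe:pe + 4] != b"PE\x00\x00":
        raise ValueError("not a PE image")
    desc = u32(img, pe + 0x80)                 # import directory RVA (PE32 layout)
    if desc == 0:
        return None
    while True:
        names_rva, _, _, dll_rva, iat_rva = struct.unpack_from("<5I", img, desc)
        if dll_rva == 0:                       # all-zero descriptor ends the list
            return None
        if names_rva and cstr(img, dll_rva).lower() == dll.lower():
            index = 0
            while True:
                entry = u32(img, names_rva + 4 * index)
                if entry == 0:                 # end of this DLL's name list
                    break
                by_name = not (entry & 0x80000000)   # high bit means ordinal
                if by_name and cstr(img, entry + 2) == api:   # skip 2-byte hint
                    return iat_rva + 4 * index
                index += 1
        desc += 20                             # next 20-byte import descriptor


def build_demo_image():
    """Build a tiny synthetic 'loaded image' with one import descriptor."""
    img = bytearray(0x400)

    def put(off, data):
        img[off:off + len(data)] = data

    put(0x00, b"MZ")
    struct.pack_into("<I", img, 0x3C, 0x40)            # e_lfanew
    put(0x40, b"PE\x00\x00")
    struct.pack_into("<I", img, 0x40 + 0x80, 0x100)    # import directory RVA
    struct.pack_into("<5I", img, 0x100, 0x140, 0, 0, 0x180, 0x160)
    struct.pack_into("<3I", img, 0x140, 0x1A0, 0x1C0, 0)            # name table
    struct.pack_into("<3I", img, 0x160, 0x77001000, 0x77002000, 0)  # fake IAT
    put(0x180, b"KERNEL32.dll\x00")
    put(0x1A2, b"LoadLibraryA\x00")
    put(0x1C2, b"GetProcAddress\x00")
    return bytes(img)


if __name__ == "__main__":
    image = build_demo_image()
    slot = find_iat_slot(image, "kernel32.dll", "GetProcAddress")
    print(hex(slot), hex(u32(image, slot)))
```

The sketch has no readable-memory bounds check of the kind mentioned in the talk; in Python an out-of-range read raises an exception rather than crashing the process.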
So we started to collaborate. I sent him the only stuff I had at the time and he went crazy with it. He was using it pretty much everywhere. The only problem was PowerShell: PowerShell did not have LoadLibraryA in its import table. So we talked about looking elsewhere for LoadLibraryA and GetProcAddress, in a loaded module, and he wrote an addition to do this. He borrowed code from Stephen Fewer's hash API stub and changed it around a little bit. He decided to use a four-byte hash, the ROR algorithm, to find just the DLL name, versus DLL and API. That means it uses the hash to find anything in the loaded module list that matches. So if you're going to do something like HAVOC, where you cause a collision, you would need to know all the possible combinations of DLLs that might be deployed with any application. So yeah, he wrote that stub and we were excited about it. Now we have two stubs. Okay, that's cool.

Then we knew that you could use GetProcAddress anywhere in the memory space, and if you do that, you can get LoadLibraryA as long as you can locate kernel32. So we used a four-byte hash algorithm, like I mentioned, and this is how we would get LoadLibraryA. Pretty simple, right? Then we'd push the handle of LoadLibraryA onto the stack, move ESP to EBX, and do an indirect call through EBX. So now we have four stubs that we could use.

But we didn't know where we could use them. So I went through and enumerated system binaries across all five operating systems for LoadLibraryA and GetProcAddress, or just GetProcAddress, in the import table. And this is the output of that.
I saved the output in JSON format, so when you're looking for an import to use, the tool walks everything that is loaded, recursively, in a Dependency Walker style, to find anything that might statically be in memory. So we had a lot of opportunity to use this. And if you look, you can see they made an effort to decrease the use of LoadLibraryA and GetProcAddress: Windows 10 has less than XP, obviously. But then you see there's more GetProcAddress. So either way it was a pretty good find.

So we were going to submit to a CFP, to a conference. This was about June. And then this came out: a Flash exploit, discovered by FireEye, that used GetProcAddress from user32's import address table to load a payload. And this was pretty depressing, because what we had wasn't as good anymore. Well, not that it wasn't good; it just wasn't as exciting. So we wrote a blog post about it and released a PoC that would pick which stub to use for you. And we released a reverse TCP shell with it, with no exit function, so it would crash every time. That was trolling a little bit. But we wanted to do more. We wanted more payloads. We wanted to get this into Metasploit somehow, and we knew it would be a lot of work. And we had a couple of ideas.

So that brings us to part two. Two ideas: remove Stephen Fewer's hash API and replace it with something else completely, or build something to rewrite the payload logic for use with an import address table parsing stub. So I decided to rewrite all the things, and do it automatically and magically. This is how Metasploit payloads work, for x86: you push everything onto the stack, then call the hash API with a call EBP. So I had a workflow. I used Capstone for disassembly and Keystone for reassembly.
The only thing I had issues with was protecting the saved LoadLibraryA and GetProcAddress from being clobbered, because conditional instructions would pick a path and then I'd lose context. I worked on this for five days straight, 12 to 15 hours a day, over Christmas. Not on Christmas, but over the Christmas holidays. When I solved one problem, more appeared. And there was a point where I'd crossed the line where I could probably have just rewritten everything by hand, or at least have had a good PoC to show for my work. But really I had nothing. So I decided to burn it down.

So the next idea was to take the hash API and the actual payload logic, remove the old hash API, and use one of my import address table stubs, then an offset table. I had some requirements. I needed to support read/execute memory, just in case I wanted to use this in places that weren't read/write/execute memory. I wanted to keep it as small as possible. And I wanted to support any Metasploit payload that uses Stephen Fewer's hash API. So I had the same four steps. I would take the input, disassemble it, capture the blocks of instructions, capture all the APIs, and build a lookup (offset) table. Then I would find the appropriate import address tables for the executable, and there would be output to different types. So even if the user does not provide the DLL they're looking for, this does it automatically because of all the JSON output that I've saved.

So this is what I came up with. You take the four-byte hash that the hash API used to parse the export table. But now it points to the APIs and DLLs that are going to be called. Those are in a string, and they're null terminated. So for example, you have the first hash, right? It's four bytes. Then the DLL offset, if the hash matches, points to kernel32. And the API offset points to WinExec.
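To make the table layout concrete, here is a small Python sketch of such a lookup table: each entry is a 4-byte hash followed by two offsets into a block of null-terminated, de-duplicated strings. The one-byte offsets, the entry layout, the function names, and the placeholder hash values are assumptions for illustration and are not taken from the tool's source.

```python
import struct

# One table entry: 4-byte hash, 1-byte DLL string offset, 1-byte API string offset.
ENTRY = struct.Struct("<IBB")


def build_lookup(calls):
    """Build (table, strings) from a list of (hash, dll_name, api_name)."""
    strings = bytearray()
    seen = {}

    def offset_of(name):
        # Store each name once; repeated names reuse the same offset.
        if name not in seen:
            seen[name] = len(strings)
            strings.extend(name.encode("ascii") + b"\x00")
        return seen[name]

    table = bytearray()
    for api_hash, dll, api in calls:
        # Offsets must fit in one byte here; struct raises an error otherwise.
        table += ENTRY.pack(api_hash, offset_of(dll), offset_of(api))
    return bytes(table), bytes(strings)


def read_cstr(blob, off):
    """Read a null-terminated ASCII string at offset `off`."""
    return blob[off:blob.index(b"\x00", off)].decode("ascii")


def lookup(table, strings, wanted_hash):
    """Scan the table for a hash and return (dll_name, api_name), or None."""
    for pos in range(0, len(table), ENTRY.size):
        api_hash, dll_off, api_off = ENTRY.unpack_from(table, pos)
        if api_hash == wanted_hash:
            return read_cstr(strings, dll_off), read_cstr(strings, api_off)
    return None


if __name__ == "__main__":
    # The hash values are arbitrary placeholders, not real API hashes.
    calls = [
        (0x11111111, "kernel32.dll", "WinExec"),
        (0x22222222, "kernel32.dll", "ExitThread"),
    ]
    table, strings = build_lookup(calls)
    print(table.hex(), strings)
    print(lookup(table, strings, 0x22222222))
```

In the stub itself, the two names found this way would be handed to LoadLibraryA and GetProcAddress; the sketch stops at returning the strings.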
Then you have the next hash, which points to kernel32 again, and then to the next API, ExitThread, and so on and so forth. Because I unique the strings, you do get some overlap and you can save some space.

So this is the logic for parsing the lookup table. Basically, I jump over the table, move to the first hash in the lookup table, and continue until I find a match. If found, I move the DLL offset into AL, normalize it for memory, and use LoadLibraryA to load the DLL. Then I save the DLL handle. I put the API offset in AL, normalize, and use GetProcAddress to get the Windows API handle. I prepare the call to the Windows API by clearing the stack. I save EAX down the stack for recovery on POPAD. Then I save the return address to EBP, and then I call the Windows API. On the API return, I fix up EBP to point back to the beginning of the import address table stub, and then I return back into the payload logic.

So this is how it works. I jump over everything into the payload logic that comes with Metasploit. Then I do a call into the import address table stub, the lookup table goes into a Windows API, the API returns back into the lookup table, and then I go back into the payload logic. I just continue until done.

The initial PoC to write the lookup table took about 12 hours. Adding all the workflow and the stubs took about another 12 hours. It took a while to get the tool where it's at, but I'm really happy with it.

So now these API hashes no longer hold any reference. After the tool goes through and figures out what they are, they're meaningless. And I found that AVs depend on them for signatures. So what happens if we just randomize them? It's pretty fun, actually. So I've got a demo of that. Let me do this properly. Everybody see that? Okay, so all I'm doing here is outputting a reverse TCP shell from msfvenom to a file.
Then I'm going to cat that out into FIDO. That's the name of this tool, FIDO. I'm going to do LoadLibraryA and GetProcAddress, and I'm going to use those out of the import table of the main module. And I'm outputting that to a file. The output is fairly useful; it tells you what's going on, pretty straightforward. Now I'm going to use Backdoor Factory to infect TCPView, and I'm going to put that on Windows 10. And there's Windows Defender. Of course it was detected. Now I'm going to do the same thing, but I'm going to use M, for mangle. I use Backdoor Factory again to infect TCPView, and I drop that onto disk. Then I set up a netcat handler. And then, no detection.

All right, so here's an example of using FIDO, just as you saw in the video. I was having some issues with a couple of DLLs. And when I say a couple, I mean all the system ones. I was running into a problem where I had to build a blacklist to avoid reporting them as having GetProcAddress or LoadLibraryA in their import table. I decided to look at this because it was getting to the point where I was just blacklisting everything. I figured out that it was just Windows 7 through 10, so I looked into it even more. And this is when I discovered the effect of MinWin.

Typically these are used in system DLLs. They're for portability, and they've been in use since Windows 7. If one is in a DLL's import table in your process space, you can use the exposed APIs, such as GetProcAddress. And in GetProcAddress's case, it's everywhere. It's in every single process, because it's in kernel32. So this is a view of kernel32's import table. kernel32 imports GetProcAddress via its import table through one of these MinWin DLLs, and then exports it back out so it can be used through the normal API. But if you're parsing the import table, you can use it. So let me explain what this means.
We just need GetProcAddress in any import table of any DLL to access the entire Windows API. We just need GetProcAddress, period. Since Windows 7, GetProcAddress has been in kernel32's import table. That means we've had a stable EMET EAF and caller bypass since Windows 7. And it still works on Windows 10. So I'm going to demonstrate that with the Tor exploit I talked about earlier.

All right. What I'm showing here is that I turned off the stack pivot mitigation, just to show that EAF will flag as the mitigation. Now I'm going to drop the exploit and payload into Tor Browser. You see the mitigation pops up and tells you what was flagged. It's very nice. Now I'm going to run FIDO using Firefox as the target binary. What you're seeing here, where it says number of lookups to do, is the tool recursively parsing through and figuring out which DLLs are loaded. Then it continues and tells you where you can use which import table. It will automatically use GetProcAddress; you can see it at the bottom where it says GPA. But I'm going to use the MinWin DLL, because you can see it's in kernel32, and it's the api-ms-win-core-libraryloader DLL. What I did here is I had an encoder put the output to stdout so I can reformat it into proper JavaScript. Just showing that. I already have it in the exploit example; I just need to uncomment it. This is a calc payload. Restarting Tor Browser.

So these payloads were introduced at REcon Brussels back in January 2017. For DEF CON 25 I'm releasing 64-bit payloads. And that brings us to mitigations and bypasses. I opened a GitHub issue to incorporate these import address table payloads into Metasploit. Part of what I was offering to do was release these 64-bit stubs to help with that process. And my talk would have ended here, right here.
However, three months after my GitHub issue submission, and five weeks before this talk, we learned that the EMET protections are being added back into Windows 10, implemented via the kernel. Additionally, Matt Graeber pointed out to me that there's now an import address table mitigation. And this is my reaction. Just flipping tables for days. Yeah.

So how does the import address table filter work? Well, I had to download the preview edition. This is coming out officially in the fall, by the way, but you can get it in the preview edition. First off, the protections are not enabled by default. You have to go through and click to enable everything. What the filter does is this: there's a pointer to the import name, and they zero that pointer. The thunks are still there, so compiled code can still work. But at this point, you're driving blind. And if you're driving blind, you're probably going to crash at some point. And yeah, it's pretty awesome.

But the funny thing is, I knew that something like this might happen, so I had an ace in my pocket. This is the kernel32 entry for GetProcAddress. And on the next line is GetProcAddressForCaller. It was introduced in Windows 8. It's exported by kernelbase and imported by kernel32, which means it's in every process. It works very similarly to GetProcAddress, and it's not filtered by the import address table filter yet. And this is how it works: you just add a zero on the end and that's it. Yeah. Way to go, guys. So I've added this to FIDO, and you use it through ExternGPAFC.

So I've got a demo of that real quick. Sorry. All right, so what we're doing here is I'm taking a 64-bit reverse shell from Metasploit and I'm using just GetProcAddress. Then I'm passing it into Backdoor Factory, patching whois64. And then I'm going to throw that onto the operating system.
Now, I do not have the protection enabled yet, just to show you that it does work. There you go. Now I'm going to enable the protection. It's going to take a second to crash, but it will crash. Then I'll run it again; it should be faster the next time. All right, now I'm going to use GetProcAddressForCaller. Put that on disk. And there you go.

So I did let Microsoft know about this, and they're going to have a patch for it whenever it comes out officially. Since they were making an honest effort to fix this, I decided to let them know.

So now what? Now we can't parse the export table or the import table. Is it possible that you could find more APIs that are not filtered and that could give you some useful information to get to GetProcAddress? Yeah, probably. Or what if we didn't use the import table or the export table at all? Let's think about this real quick. In modern user space Windows exploitation, you have to bypass ASLR, DEP, and other protections. On top of that, your exploit is most likely tailored to a specific version or versions of software and operating systems. So why shouldn't your payload be targeted too? Why does a payload have to work on every operating system from XP to Windows 10? Why not target it to that specific version of the application?

The way you would do that is to go to GetProcAddress directly. You take the PEB image base, which is easy to find, and then you get the pointer at the offset for GetProcAddress. That offset is version dependent, and it can be found during exploit or payload development. It can be in the main module of the program, or in any application-specific DLL. I would not target system DLLs, because those change very often.
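The offset idea above can be sketched in a few lines of Python. This is an illustration only: the version strings, the offsets, and the fake module contents are hypothetical, and a real payload would read the pointer from process memory at image base plus offset rather than from a byte buffer.

```python
import struct

# Hypothetical offsets from the module base to the IAT slot that holds
# GetProcAddress, recorded per application version during development.
GPA_SLOT_OFFSETS = {
    "2.0.1": 0x1F0,
    "2.0.2": 0x1F0,   # same offset as 2.0.1: one entry covers both builds
    "2.1.0": 0x2A8,
}


def read_get_proc_address(module, version, pointer_size=4):
    """Read the GetProcAddress pointer at the known offset for `version`."""
    offset = GPA_SLOT_OFFSETS[version]
    fmt = "<I" if pointer_size == 4 else "<Q"
    return struct.unpack_from(fmt, module, offset)[0]


def make_fake_module(offset, pointer, pointer_size=4):
    """Build a zeroed buffer with a pretend resolved pointer at `offset`."""
    module = bytearray(0x400)
    fmt = "<I" if pointer_size == 4 else "<Q"
    struct.pack_into(fmt, module, offset, pointer)
    return bytes(module)


if __name__ == "__main__":
    module = make_fake_module(0x1F0, 0x76001234)
    print(hex(read_get_proc_address(module, "2.0.2")))
```

The point of the design is that no table parsing is needed at run time: a single read at a known offset replaces the whole import table walk, at the cost of tying the payload to specific builds.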
So if there's a DLL that has the same offset across a number of versions, say they depend on OpenSSL for something and they just don't update the OpenSSL binary, then even though you're not exploiting OpenSSL, you can use the GetProcAddress offset in that DLL across multiple versions. And it does save code. This is the import table parsing code to find GetProcAddress. Just to get GetProcAddress. It's a fair bit of code. But if you know the GetProcAddress offset going in, that's how much code it takes. Obviously. Right?

So what if you couldn't find a single GetProcAddress offset in one DLL? Well, what you would do is find a DLL that was consistently present across all versions and had GetProcAddress, then diff the DLLs across the different versions, and then make a similar lookup table, so that you could use the differing sections to figure out what version you're in, and that would be associated with the appropriate offset table. I have an example, not of the actual diffing process, but of using an offset for GetProcAddress either within the main module of the program or as an external offset in a DLL. So if anyone wants to help me develop or engineer that system, please find me. I'm easy to contact. I'm on Twitter and I have an email address. That sums up my talk. Any questions?