How's it going, everybody? Thanks for sticking around. Two o'clock on Sunday, I thought it would just be Greg and me talking to ourselves in a room, which would suck because I don't even really like Greg. So today we're going to talk about a technique that you can use to decrypt information that Trojans have in their binary. I would compare the technique to using a calculator to solve a complex math problem. Sorry, I got a piece of coffee bean stuck in my throat. OK, so using a calculator to solve a complex math problem saves you the hassle of going through all the math yourself. I know there are probably a few people out there who could tell me very quickly, just using their head or using paper, what 46.009 divided by 8,432.007 is. But I can't. But I do know where to find my calculator. So am I screwed? No, I can still solve the problem. I can do it just as quickly. So that's what this technique is based on. We're going to use what's readily available to us to solve problems that would otherwise be a little bit difficult. So at this time, I'm going to turn it over to Greg. He's going to take you through some intro, some theory, and then we're going to hop into a bunch of real-life demos using a few malware samples. We're going to show you some cool stuff like the disassembly rather than just boring slides. So I hope you enjoy it.
OK, so as Mike said, we're basically going to use a debugger to make... Is this better? All right, sorry about that. There we go. All right, so basically what we're going to talk about is how to use a scripting debugger to make your life easier when it comes to reversing malware. As we all know, malware obfuscates pretty much everything it can possibly get its hands on, like strings, configurations, the protocols it uses to communicate back to the C&C, as well as the data that is stolen.
So what we're going to do is use the functions that are already in the malware for encryption and decryption against the malware itself. And then we're going to go through a couple of examples. All right, so why do malware authors obfuscate things? Well, as we all know, this is kind of a game between us as analysts and the malware authors, who want to make our lives more difficult and make their code more difficult to understand. Also, they've worked hard to steal the data, so they don't want other people to take their data away from them or to see what has been stolen, and they want to make forensics more difficult. And also, they don't want people knowing what's going to and from the command and control. They don't want you knowing what commands are coming down. They don't want you knowing what data is being sent back. As a result, this also makes IDS and IPS signatures, and antivirus for that matter, more difficult to write. So what we want to do is the opposite of this. We want to make the reversing easier by making it clear text. We want to know what data was stolen and what the configuration of the malware is, and we also want to know what data is being sent to the command and control server and what instructions are being received from it. And also, we want to have a way to protect ourselves, a way to write better signatures for IDS, IPS, and antivirus.
So there are three basic ways you can write a decryptor for this kind of data. The first way is by far the most difficult and requires you to understand the most, which is doing it by hand. You're throwing it into a disassembler or debugger, figuring out what each function does, and going that way. This approach is slow and painful, and no one likes to do it. And the more code you have, and the more complex that code is, the longer you're going to sit there and work on this problem.
Now, recently, within the past year or so, I guess, there have been advances in decompilers that turn assembly language back into something that resembles C. And as we all know, C is a whole hell of a lot easier to read than straight assembly. But this also has its problems. It's still kind of in its infancy. Things don't get typed right. You have pointers that look funny. You have data structures that just don't make any damn sense. So this isn't the all-encompassing solution. The third one, the one we'll spend most of the time talking about today, is using a debugger that you can script to do all the work for you. The malware has actually made the job easier for you. It has functions that encrypt and decrypt its data, so we can use those against it to decrypt the data. Now, depending on what you're trying to accomplish with this, no one solution is gonna be the catch-all for you. If you're trying to write a Wireshark filter, you're not gonna be able to do it with a scripting debugger. You're gonna need to get in there, figure out what the code's doing, and write something that you can compile into Wireshark.
All right, so there are three basic classes that we came up with when it comes to doing scriptable analysis. The first one is what we call active. This is where you're not really letting the malcode do its job, you're not letting it run natively. You find the functions that do the encryption and decryption, and you attach to those. You only run that function to recover what you need. An example would be trying to recover strings, because as we all know, malware loves to obfuscate strings so you can't see what is actually being done just by looking through strings output. So you find the function that decrypts them, and then you run every string you can find through that. Another way is called passive.
This is where you more or less let the malware run as it normally would, with the exception that you're running it in a debugger at this point. And you're monitoring key aspects of it, either the API calls that it makes or its internal functions. And we'll go through an example of this as well, where you're monitoring the Kraken traffic. Basically, Kraken encrypts the network traffic going across the wire. So what you can do is catch it a couple of lines before the point where it actually encrypts the traffic. It's going to have the data in plain text before it encrypts it. So you just capture that, and you can see what's going on with the network traffic. And the last one's kind of your catch-all solution when it comes to scripting. We call it recon and utility scripting. It's kind of the kitchen sink solution, where you're doing things like taking an arbitrary file, like a JPEG or PDF or whatever, and you want to see if it possibly has shellcode in it, or you want to see if any shared libraries have been hooked. Another example we have here is that you can use it to unpack malware without actually running it.
So what are you going to need to do this? Well, obviously you need something to work on, you need something to analyze. You can go to Offensive Computing and grab any number of samples. And probably more important than that is a scripting debugger. We're going to talk mostly about Immunity Debugger, which has Python built into it, but you can use other things like OllyDbg with OllyScript. IDA has IDC as well as a plug-in for Python. And it would be really helpful if you understood the concepts of reverse engineering. You need to understand how to unpack things, set breakpoints, and generally follow the flow of a piece of code that you're reversing. Optionally, these things kind of come in handy.
Some sort of disassembler, IDA being the obvious choice. And unless you really enjoy infecting yourself on a routine basis, we recommend some sort of virtual environment. So at this point, I'm going to hand it over to Mike, who's going to go through several demos using real malware.
I was just kidding about not liking Greg, by the way. Greg's awesome. Okay, so I'm going to skip this slide because it's a little boring, and we're going to go into the first disassembly. So this is a copy of the SilentBanker Trojan. Imagine you open up this binary, go to the strings pane, and you see a bunch of stuff that you can't read. So you pick one, hop over to the first cross-reference, and find out where that string is used in the binary. Right now, we can see through this little graph pane that the function is more or less linear. It just goes from the top to the bottom and doesn't do a whole lot, except for several repeating actions. I'm going to use this one as an example. It moves the address of this encoded string into the EAX register, and then it pushes EAX twice before calling the same function, over and over again. So before even looking into this function, we can pretty much determine that this is probably the decoding function. And we also know that it takes two arguments, a pointer to an input buffer and a pointer to an output buffer. Since both are the same address, that indicates that SilentBanker, in this case, is going to decode the string in place. It's not going to move it to the heap or do anything like that first. So if we take a look inside this function, it begins with a loop right here, and inside the loop it calls a function which looks like maybe some base64 derivative with a special alphabet, and it calls a few other sub-functions, and there's a bunch of stuff down here.
Do we need to manually go through each of these instructions and reverse them into C or Python in order to decrypt the same strings that it's decrypting? No, we don't need to do that. So our technique is going to be using this small Python script right here. I hope you guys can see that. If not, it's on the CD. We're going to hard-code the address of that decoding function that we saw. We're going to loop through and get each cross-reference to that function. And for each of the cross-references, we're going to disassemble backwards until we meet that mov eax, offset instruction that references the encoded string. So we're essentially finding that mov eax instruction so that we know the address of the encoded string in memory. At that point, we're going to read the string at that address, so that when we spit out a table of these results, we have the encoded strings on the left and the decoded strings on the right. And then we're just going to step over the decoding function, which is the equivalent of hitting F8 in Immunity Debugger or OllyDbg. So once again, we're just playing the Trojan's own internal function, using it to decrypt this information, and extracting the results from it so that we have a pretty table like this.
So I'm going to do a quick demo of how this is done. Can you guys see that? Yeah, you should be able to see that, all right. Let's do this. Okay. So we're going to start by loading the SilentBanker DLL into Immunity Debugger. We're going to play it until the debugger's DLL loader takes us to DllMain of the SilentBanker Trojan. And then immediately, we're going to run silentbanker_strings at the command bar down there. So on disk, there's a file called silentbanker_strings.py, which contains the code that we just showed you on the last slide.
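The steps just described (hard-code the decoder address, visit every call site, walk back to the mov eax, offset instruction, let the decoder run, and tabulate the before and after) can be modeled in a few lines of plain Python. This is only a sketch under stated assumptions: it does not use the Immunity Debugger API, and the XOR decoder, the addresses, and the instruction listing are all hypothetical stand-ins rather than SilentBanker's real code.

```python
# Toy model of replaying a decoder at every call site. The decoder, the
# addresses, and the instruction listing are hypothetical stand-ins, not
# SilentBanker's real code or the Immunity Debugger API.

DECODER = 0x401000  # hard-coded address of the (pretend) decoding function


def toy_decode(memory, src, dst):
    """Stand-in decoder: XOR each byte with 0x5A until the NUL terminator."""
    i = 0
    while memory[src + i] != 0:
        memory[dst + i] = memory[src + i] ^ 0x5A
        i += 1


def read_cstring(memory, addr):
    """Read bytes from addr up to, but not including, the NUL terminator."""
    end = memory.index(0, addr)
    return bytes(memory[addr:end])


def decode_all(memory, listing):
    """Visit each call to DECODER, walk back to the mov, and run the decoder."""
    rows = []
    for i, (mnemonic, operand) in enumerate(listing):
        if (mnemonic, operand) != ("call", DECODER):
            continue
        # Disassemble backwards until the "mov eax, offset <string>".
        j = i
        while j >= 0 and listing[j][0] != "mov eax, offset":
            j -= 1
        if j < 0:
            continue  # no string reference found for this call site
        addr = listing[j][1]
        encoded = read_cstring(memory, addr)
        toy_decode(memory, addr, addr)  # same pointer twice: decode in place
        decoded = read_cstring(memory, addr).decode("ascii")
        rows.append((addr, encoded, decoded))
    return rows


def build_sample():
    """Build a tiny fake image with two encoded strings and their call sites."""
    memory = bytearray(64)
    listing = []
    for addr, text in ((0x00, ".exe"), (0x10, "Content-Type")):
        encoded = bytes(b ^ 0x5A for b in text.encode("ascii"))
        memory[addr:addr + len(encoded)] = encoded
        listing += [("mov eax, offset", addr), ("push eax", None),
                    ("push eax", None), ("call", DECODER)]
    return memory, listing


if __name__ == "__main__":
    memory, listing = build_sample()
    for addr, encoded, decoded in decode_all(memory, listing):
        print(f"{addr:#06x}  {encoded.hex():<26}  {decoded}")
```

In the real script the decoder is never reimplemented: the debugger points execution at each call site and steps over the Trojan's own function. The toy decoder above only exists so the model can run on its own.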
As you can see there, the table in the middle is printing out a list of the encoded strings, and then it's printing out the corresponding decoded string. At the beginning, these strings aren't that interesting. They're just file extensions, but that's because we're going in order. As we move on, as you can see down in the bottom right-hand column, we're decrypting the strings in place, and they now become the HTTP headers, so you can see the Content-Type. And here's a quick screenshot of what you might see down near the end of the strings list, showing you the exact text that the Trojan uses, which might help determine what the Trojan's functionality is. So in this case we can probably determine that it's some type of HTML injector that steals passwords and steals some type of account information from you.
So here's the interesting thing about this method, before we get to the Q&A session and somebody runs up and says, why the hell don't you just let this Trojan run, then dump it out of memory with LordPE or something, and then run strings on that binary? There's a really good reason why we don't do that. Actually, there are two good reasons. The first one is this: imagine we've got conditionals in the Trojan that only decode strings if a particular trigger has occurred. So for example, the Trojan gets a packet on port 80, decode string A. The Trojan gets a packet on port 82, decode string B. But we don't really have the time to go through and find out what all those triggers are. And if we don't find out what all those triggers are, then we don't decode A and B, okay? So the way that we're doing this is we're finding all the cross-references to the decoding function and working back from each one. So there's no way that we're gonna miss anything. We don't even have to let the Trojan run in its native environment. This is an example of what Greg was talking about, being active.
So we're not letting the Trojan run from start to finish, we're making it run through each call to the decoding function. The other case where you might not get a comprehensive list of strings by just dumping the Trojan once it's running is if it does something like moving the encoded string to the heap first, decoding it there, using it, and then freeing it. In that case, if we dump it while it's running, we'll probably only get a few strings, because they're being cleared every time they're used. Whereas in this case, we get them all no matter what.
So the next example is also with SilentBanker. SilentBanker uses what's known as runtime dynamic linking. If you've ever analyzed malware, one of the first and easiest things to do to determine what the malware's capabilities are is to just look at the import address table. So if it imports URLDownloadToFile, we can pretty much assume that it's gonna download some file off the internet. But if we don't have that capability, it's much harder to determine what the Trojan does in less than 20 seconds. So what malware will do is use runtime dynamic linking, which consists of just calling LoadLibrary on the DLL and then calling GetProcAddress to find the address of URLDownloadToFile. But the problem with that for malware authors is that the URLDownloadToFile string is still gonna be compiled into the binary. And if I run strings on it, then I'm still gonna find out. So it's just as easy as looking at the import address table. So what SilentBanker does is use a hash-based import address table resolver, similar to what you might see in shellcode. And that presents a slightly more difficult scenario for us. Here is an example of the import address resolving function. I've renamed it GetProcAddressFromHash. And as you can see, this function takes three arguments, the third of which is a 32-bit hash. It corresponds to a function name such as URLDownloadToFile.
So that gets sent as an argument to this GetProcAddressFromHash function, which is gonna go in here. It's gonna begin iterating this loop, which processes each export from the loaded DLL. And for each exported function, it's going to compute a hash using the same algorithm that the original hash was computed with. And then it's gonna make a comparison. Is the hash of that function equal to what I just got passed as an argument? If it is, then I've found the function that I'm trying to look up, and I'm going to save the address. EAX is being moved into ESI plus 18 hex right here. So at this point, we can determine that ESI points to the base address of the call table in memory. And what we wanna do is follow ESI up to the beginning of the function and try to find out where it came from. ESI comes from arg zero. So the base address of the call table in memory is sent as the first argument to FX table resolver. So to script around this, we want to set a breakpoint at the beginning of FX table resolver. We wanna read arg zero from the stack to find out what the address in memory is. Then we're gonna set a breakpoint at the end of this function and just play all the way through. We don't care about any of the individual instructions in between. And the ability to do this is based on these, what, five or six lines of Python code, which includes the table print statement. So it's pretty easy. And what we've got now is a way, after the SilentBanker Trojan resolves all of these addresses itself, to reverse that and turn it back into something that looks like this. So instead of reading something like call dword ptr [eax+20] in the binary, after this you're going to be able to read eax + GetPrivateProfileInt, which makes the IDB a whole lot more readable. So this code can be reused no matter what Trojan you're dealing with. It's not limited to SilentBanker.
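A small, self-contained model of the two halves of this idea may help: the resolver walking the exports and comparing hashes, and the analyst mapping the filled-in call table back to names. It is a sketch under loud assumptions. The talk does not say which hash SilentBanker uses, so the rotate-right-by-13 additive hash below is just a variant commonly seen in shellcode, and the export addresses and table offsets are invented for illustration.

```python
# Model of hash-based import resolution and of labeling the resulting table.
# ASSUMPTIONS: the ROR-13 hash is a common shellcode variant, not necessarily
# SilentBanker's algorithm; addresses and table offsets are made up.


def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def name_hash(name):
    """Hash an export name: rotate the accumulator, then add the next byte."""
    h = 0
    for ch in name.encode("ascii"):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h


def resolve_by_hash(exports, wanted):
    """Walk every export, hash its name, and return the matching address."""
    for name, address in exports.items():
        if name_hash(name) == wanted:
            return address
    return None


def label_call_table(call_table, exports):
    """Analyst's side: turn {offset: address} back into {offset: name}."""
    by_address = {address: name for name, address in exports.items()}
    return {offset: by_address.get(address, "unknown")
            for offset, address in call_table.items()}


if __name__ == "__main__":
    exports = {"URLDownloadToFileA": 0x7C001000,
               "GetPrivateProfileIntA": 0x7C002000}
    # The Trojan only carries hashes; it fills the table at run time.
    table = {0x18: resolve_by_hash(exports, name_hash("URLDownloadToFileA")),
             0x20: resolve_by_hash(exports, name_hash("GetPrivateProfileIntA"))}
    for offset, name in sorted(label_call_table(table, exports).items()):
        print(f"+{offset:#04x}  {name}")
```

The point of the scripted approach is that the analyst never needs name_hash at all. The Trojan fills the table itself, and the script only has to read the table afterwards and ask the debugger which export each address belongs to, which is the role label_call_table plays here.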
It works for any Trojan that uses runtime dynamic linking or hash-based import address tables like this. And this is a really quick demo. We're gonna start by loading the SilentBanker DLL into the debugger just like we did last time. We're gonna hit play until we get to DllMain. And then right here, you can see that execution is paused at the function that goes through and resolves all of those hash-based imports. So we're simply going to F8 over that until we hit the next instruction. At that point, we're going to run that api_resolution script, which contains those six lines of code that you saw on the last slide. And immediately it's gonna spit out these results right here. So the next step is simply to copy and paste this table. You can see it's got the offset on the left-hand side. And we'll just patch the IDB with that using IDAPython, or you can use IDC or whatever. You can even do it manually if the list is small. So it's just a really quick five-minute way to patch up your IDB and make it a whole lot more readable.
So on to the next one, which is Kraken. Kraken is a spam bot, you've probably heard of it. The difficulty with Kraken comes up if you run a network, maybe an MSS solution, and you've got a lot of customers that are worried about getting infected with this. They want to block in their firewall the hosts that Kraken will try to connect to for its command and control. So how do you preemptively know what hosts a Trojan might connect to if they're not hard-coded as IP addresses in the binary, not hard-coded as host names in the binary, and not encoded in any form in the binary? In this case, Kraken has an algorithm that generates host names dynamically based on this formula. I've written the C code right here that shows you essentially what it does. There is a loop that goes through and passes this getDomain function two arguments, the first of which is a pointer to the output.
So that's where the host name gets written. The second one is a simple integer that iterates from zero up to however many hosts you want to generate. So that's how the Kraken host name generation works. Now, do we really want to let this Trojan run for potentially hours, potentially days, before we know what host names it's going to connect to? I want to know in an instant what host names this is going to connect to, so that I can tell my customers to make sure they block such-and-such host. So let's quickly go to the Kraken example. The way that we found this function is by getting a cross-reference to the gethostbyname import, which is right here. As you can see, the first argument to that is lpHostName, which obviously contains something like moo.com or whatever. And we want to track this back and try to find out where that host name came from. So if you go up here, it comes from a strcpy call, where it's used as the destination and the EAX register contains the source. And that source is filled in after the GetRandomHostName function is called. Obviously I've named that, so it's pretty easy to see. But before I named it, how was I supposed to know to name this GetRandomHostName? Well, it's because of that backtracking that I just did, finding out that the EAX register at this point contains the host name. So if you want to take a quick look at the code inside GetRandomHostName, there's a lot of math, there are a few conditionals, a lot of constants, a lot of multiplication, a lot of rotates, a lot of pushes. Do we need to write a formula in C or Python to mimic this function and spit out a list of host names? That might take a little while, even if you're a really good reverse engineer. So what we did was simply look at the input to this function. It takes two arguments, like I said before. One is the pointer to the output buffer.
The other is a number, at EAX plus four, which gets incremented through each iteration of the loop. So all we need to do is script around that function. We're gonna replay it a thousand times. Each time we're gonna pass it a new number for that second argument, one, two, three, four, five, and just keep going. And then every time it's finished executing, we're just going to read the returned host name. So we're gonna do a quick demo of that. We're gonna wait until the Trojan is executing in memory. It starts to run. We're gonna attach to it with the debugger and type kraken_hostnames. And as you can see right here, it sets several breakpoints. One of them is at the very beginning of the function. It's gonna reset EIP to point to that location. And it's also got a breakpoint set at the end of the function, which is right here, at that return. And then it's gonna hop back up. It's gonna do the same thing. Do it again. And through each iteration of that, it's going to output the host name that it generates into our table. And of course, you just let that run for a little bit. About a thousand times or so. Go get lunch or something. You come back and you've got a list of all the host names that Kraken is going to connect to with its command and control protocol.
So there's also something interesting that we found during the research with Kraken: there's a vulnerability, or not necessarily a vulnerability, but kind of a weakness in its cryptography model. If you're here, you most likely know that when you're encrypting something and sending it over the network, if it's going to be secure, you don't put the encryption keys in the same message as the encrypted data. Okay? So maybe the Kraken authors didn't know that. I'm not sure. Maybe they were just lazy. But they actually decided to derive the encryption keys from the IP address of the infected host machine, the CPU details, and the hard drive serial number.
So it's pretty unique for each machine that it infects. But does that make it harder for us to go through an arbitrary network's packet capture and decrypt the information? No, because of the weakness. They give us all the information that we would need to decrypt it. So based on that, this is an example of something that you can't really use the scriptable debugger method for. We wanted to write a Wireshark plug-in so that network administrators could analyze their still and live network captures. When you see this UDP 447 and HTTP POST traffic that Kraken generates going across the network, what's in it? This Wireshark module that we released a few months ago can tell you that. It parses the packet, extracts those keys that are in the header, and uses them to decrypt the information. So we've got a little tab down here that shows you what it has. It's gonna be IP addresses, the spam templates, anything that the Trojan needs to do its nefarious acts. I don't have a demo for that one. Yeah, I can show that. It's not actually a demo, it's just a packet capture, a still one, after we've installed the Kraken module. You can click on any of the packets that Kraken generated and go over to the decoded payload tab. And as you can see here, it's a list of the IP addresses that Kraken is going to connect to. Actually, not for the command and control, but for sending spam and stuff. So that's what it looks like.
I'm going on to the next demo. Laqma is an information-stealing Trojan. And there are several components to Laqma. There is a client DLL that gets injected into windowed processes like Internet Explorer, Explorer, or Firefox. And when that gets loaded by the process, it hooks multiple API functions, such as PFXImportCertStore and HttpSendRequest, so that it can snoop on the information that the process sends to those API functions.
So in the case of PFXImportCertStore, it's going to be certificates. For HttpSendRequest, it's probably going to be the POST payload of any login that you're trying to make over the web. Once those API functions are triggered, the code that the Trojan inserts takes the information that it's stealing, maps a global shared memory segment called temp_exchange, and writes the stolen information to that shared memory segment. And then it uses SendMessage to basically alert that remote process, which I'm going to talk about in a second, that hey, I've stolen some information, I put it right here for you, go ahead and fetch it now. So over there in the other process, which is just the standalone executable, it has that axelsrv window loop waiting for messages from the clients. Once it gets one, it says okay, thanks for letting me know, I'm going to map this shared memory segment now, I'm going to read what you've stolen, and I'm going to determine if there's anything that I should do with it, such as uploading it to a drop site server for stolen data. So this is some source code that we reversed out of the client part, the DLL. This is the code that gets executed after the trampoline hook is triggered. The processor is redirected to this function instead of executing the legitimate function. And as you can see, like a normal trampoline function does, it simply calls HttpSendRequest, the legitimate version, first. Then it maps that shared memory segment, copies over the information that it's stolen, and uses FindWindow. This is actually a typo, it shouldn't be __intsrv, it should be that axelsrv thing. And then it calls SendMessage to let that remote process know that it's time to act. So if we take a look at the client DLL right here, this HttpSendRequestA, this is the exact same thing that I just showed you in C, but this is the disassembly version of it.
You can see down here where it pushes that global temp_exchange string onto the stack and then calls MapViewOfFileAndWrite, which is just an internal function that I named based on the code that happens inside of it. So we're trying to determine, during execution of Internet Explorer, when somebody logs in, what exactly is this stealing? And I wanna know in real time. I wanna see it. I wanna kinda snoop on what data it writes to the shared memory segment so that I can extract all the information and have a better idea of what this Trojan is capable of doing. So can I hook MapViewOfFile in order to extract the information that I want? It seems like a likely function to hook, but in this case it's not, because at the time MapViewOfFile is called, the stolen information hasn't actually been transferred over yet. It calls MapViewOfFile first, gets the address of that shared memory segment, and then calls these two internal memcpy-like functions to transfer the data. Do we wanna hook those functions? Probably not, because their addresses are gonna be different in memory for each version of Laqma that gets released, and we don't wanna have to go through and find them every time and then hard-code them into our script. It's a little bit more work. But it's very convenient that right afterward it calls FlushViewOfFile, which doesn't actually delete the contents, it's just like flushing the contents of a file to disk. And the first parameter is the base address of the shared memory segment. So that is exactly what we want. We'll hook that FlushViewOfFile function, we'll get access to the address of the shared memory segment, and then we can parse it and find out what information is stolen. So we're gonna go to a demo, and on this workstation I've installed Laqma. There is another component to Laqma that I didn't discuss there. It actually comes with a kernel rootkit as well, which hides the processes and files on disk.
So if that rootkit is enabled, I can't open the process that I wanna debug in the debugger, because the debugger can't find it. But it just so happens that the kernel rootkit that Laqma comes with is a little bit generic, and it's kind of funny too, because you can actually disable the kernel-mode rootkit from user mode by simply sending a signal over the named pipe that it exposes to user mode. That's what you're gonna see us doing right here. We wrote a little program called WriteToLanman. There are the arguments. So if we run WriteToLanman -x, it's gonna deactivate the rootkit and unhide all the items. So we'll do that. And all of a sudden we're gonna have a few new processes. So that lanmanwrk just showed up, and the kernel driver, lanmandrv, down there at the bottom just showed up. So we're gonna open the debugger, we're gonna attach to those processes, and we're going to open up an Internet Explorer window. Go to Gmail, just some site where you can type in a username and password. And I'm gonna let this one play. We're just gonna type laqma_spy, and then we're gonna go back to the Gmail window, enter a username and password, and hit enter. Yeah, thanks. So this is an example of the passive method, because we're simply letting the Trojan run from start to finish and we're essentially just monitoring. We're not modifying the path of execution. And as you just saw, we're able to get a real-time view of all the information that it steals out of the API hooks, which includes the POST payload, your usernames and your passwords and everything.
Okay, so on to the next one. Torpig, another information-stealing Trojan. Just recently, well, not recently, maybe nine months ago, it incorporated the MBR rootkit. And the challenge with this one was that it's the first one I personally ran into that did the decryption in the kernel-mode driver. Immunity Debugger is obviously a user-mode debugger, which isn't really gonna let you debug that kernel driver.
So the kernel driver gets installed, it starts to run, and the kernel component downloads a DLL and the Trojan's configuration from the web. If you're viewing it with a packet capture, you're just gonna see a bunch of encrypted data. It does the decryption in kernel memory. But then the stupid thing that it does is make all of that decrypted content available to any user-mode process through a named pipe. It's !win$. However, you can't just connect to that named pipe and do ReadFile, because there is an internal command and control protocol. You need to send it the exact same sequence of bytes, with the exact same values, that the kernel component expects. You need to send that from user mode in order to get back the data that you're interested in. So we wanted to write a script around the Internet Explorer process, because the kernel component decrypts the DLL that Torpig downloads and then injects it into any process that starts. So we want to wait until Internet Explorer loads that DLL, and we wanna have a breakpoint set on CreateFile. We wanna wait until that DLL automatically calls CreateFile for the named pipe. And after we detect that it has opened the connection to that named pipe, we wanna set the breakpoints for ReadFile and WriteFile so that we can inspect the exact information that it's sending back and forth. We don't wanna enable the ReadFile and WriteFile hooks before the CreateFile call to !win$, because if you've ever monitored an Internet Explorer process as it's loading into memory, it calls CreateFile like a million times, so that would be a whole lot of overhead. So we just wanna wait until it calls CreateFile on that !win$ pipe, and then we're gonna set the ReadFile and WriteFile hooks. And this is the Python down here to do it. It's equally small. There's no disassembly for this one. So the goal of this is to extract that information from the kernel, which is the configuration file that Torpig downloads.
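The staged hooking described here (watch CreateFile only, and start recording ReadFile and WriteFile once the pipe has been opened) can be modeled without a debugger as a small state machine over a stream of API events. This is a sketch, not the script from the slide: the PipeMonitor class, the event format, the handle values, and the byte payloads are all hypothetical, and the pipe suffix is simply the spoken name written out.

```python
# Model of staged hooks: ignore read/write traffic until the target pipe is
# opened, then log only the transfers on that handle. The class, event shape,
# handles, and payload bytes are hypothetical illustrations.


class PipeMonitor:
    def __init__(self, pipe_suffix):
        self.pipe_suffix = pipe_suffix.lower()
        self.handles = set()   # handles returned for the target pipe
        self.rows = []         # (direction, size, hex data), as in the table

    def on_call(self, api, **args):
        """Feed one observed API call into the monitor."""
        if api == "CreateFile":
            # Stage 1: the only hook that is always active.
            if args["path"].lower().endswith(self.pipe_suffix):
                self.handles.add(args["handle"])
        elif api in ("ReadFile", "WriteFile"):
            # Stage 2: only meaningful once the pipe has been opened.
            if args["handle"] in self.handles:
                direction = "read" if api == "ReadFile" else "write"
                data = args["data"]
                self.rows.append((direction, len(data), data.hex()))


if __name__ == "__main__":
    monitor = PipeMonitor(r"\!win$")
    # Unrelated file activity during process start-up is ignored.
    monitor.on_call("CreateFile", path=r"C:\WINDOWS\system32\urlmon.dll", handle=0x10)
    monitor.on_call("ReadFile", handle=0x10, data=b"MZ")
    # The injected DLL opens the pipe, then talks to the kernel component.
    monitor.on_call("CreateFile", path=r"\\.\pipe\!win$", handle=0x44)
    monitor.on_call("WriteFile", handle=0x44, data=b"\x01\x02")
    monitor.on_call("ReadFile", handle=0x44, data=b"cfg")
    for direction, size, data in monitor.rows:
        print(f"{direction:<5}  {size:>4}  {data}")
```

One difference from the approach in the talk is worth noting: the model filters by handle, while the real script avoids the overhead by not setting the ReadFile and WriteFile breakpoints at all until the CreateFile breakpoint sees the pipe name. The recorded write sequence is what a standalone client would later replay to request the decrypted configuration.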
That configuration file contains the list of targets that Torpig is going to try and steal information from, which is pretty essential for knowing what this Trojan is capable of doing. So we essentially just started the debugger. We began an Internet Explorer process and we're waiting for it to call CreateFile on bang win money. And when it does, we're gonna immediately start getting some results here. So the left-hand column says read or write, which lets you know which of the ReadFile and WriteFile hooks is being activated. It's telling you the size of the information that it's reading or writing. And it's also telling you the exact data that is being sent over the pipe. So what we did in that case, taking that information, was write a C program that you can run on an infected machine. It does exactly what the Trojan does, except that in this case we're able to simply put the output on the screen. And so now we've got a list of all of the Trojan's targets. Even though the decryption occurs in kernel memory, and even though we're not able to debug the kernel driver itself, just by inspecting what's happening in user mode we're able to extract the most critical information that this Trojan's trying to hide. So that's pretty cool. So the last one is called Trojan X encryption keygen. Can't actually tell you the name of this Trojan because, well, for various reasons, none of which we can tell you. But this Trojan's really interesting because it's also an information stealer. And the information that it steals, it encrypts before posting to the drop site. Now if the Trojan just generates a random encryption key and then encrypts the data with it and sends it to the drop site server, how is the attacker himself going to know what the encryption key is if it's generated randomly? He needs some way of knowing. And these guys aren't exactly as dumb as the other people, but they're pretty dumb.
They didn't include the encryption key in the same message as the encrypted data. What they did was expand, using arbitrary math, the 16-byte encryption key that the Trojan generates randomly into a 32-byte number. And it sends the 32-byte number along with the encrypted data. So when you recover these files from the website, you first need to extract the 32-byte number, and then you need to somehow reverse engineer that back to the 16-byte encryption key. And then you need to reverse the encryption function so that you can decrypt it. So the encryption function itself turned out to be pretty easy. And the Trojan doesn't actually have the decryption function in its own binary. It's different from the encryption function. It's not reversible. So we actually did need to do that by hand, which, like I said, wasn't too bad. But then we're still troubled by the fact that we needed to generate a 32-byte key from a 32-byte number. Sorry, a 16-byte key from a 32-byte number. So let me jump over to the disassembly real quick. Here's keygen 16 primary, as I've named it. This function generates, using their custom pseudorandom number generator, a 16-byte key. Now there happens to be this particular vulnerability in the keygen that makes that pseudorandom number not so random, which, just keep that in mind for a second. After that, it sends the 16-byte primary key, which is used for encryption, to the keygen 8 secondary function, which outputs an 8-byte secondary key. And then that 8-byte secondary key is passed to this function, which generates another 8-byte key. That then gets sent to this function, which takes that 8-byte tertiary key and turns it into a 32-byte number. And if you wanna just take a look at why I don't wanna reverse this manually, there's a whole lot of crap in there. Okay, so that's a whole other world right there that we don't necessarily wanna take the time to reverse. But are we screwed? No.
So what we can do is, given that the 16-byte key is always gonna be the same, I'm sorry, the 16-byte key is not always gonna be the same. It's gonna be generated pseudorandomly, with that vulnerability of course. But given a particular 16-byte number, this generates the same 8-byte secondary key. This generates the same 8-byte tertiary key. And this generates the same 32-byte final key. That's the one that's transmitted in the uploaded packet. So here's an example of one of the log files that we recovered from a raided drop site. In yellow right there is the 32-byte key. In blue right after it is the size of the encrypted data. And in green is the encrypted data. So what we did, based on that vulnerability in the keygen, here's demo X, was simply make a wrapper around these few functions. So we call keygen 16 primary. We let the Trojan run until after this reorder big num function. And at that point we control memory. We can read whatever values we want. So we wanna save the 32-byte number that it generated, and we wanna associate that with the 16-byte number that was input to this entire sequence of events. And then we're gonna do it again. And then we're gonna do it again. And we did that 5,000 times. And I will show you how this works. Based on that vulnerability, 5,000 keys is enough to decrypt 100% of the records that we recovered from the website. So what we just did was log into a website with Internet Explorer while the machine is infected with the Trojan. We have a breakpoint right here at the encrypt main function, which is what I just showed you in the disassembly. The reason we did that is because there's actually a second parameter to that tertiary function besides the 8-byte secondary key. It's a number that the Trojan downloads from the internet, decodes from a configuration file, and then passes to this function.
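The wrapper idea, recording each 32-byte final key next to the 16-byte primary key that produced it and then looking keys up from recovered log records, can be sketched as follows. This is a minimal sketch under stated assumptions, not the speakers' script. In the talk both steps are performed by running the Trojan's own code under the debugger; here `candidate_primary_key` and `derive_final_key` are hypothetical stand-ins built from standard hashes so the example runs on its own. The record layout follows the description (32-byte key, then size, then data), but the 4-byte little-endian size field is an assumption.

```python
import hashlib
import struct


def candidate_primary_key(seed):
    # Hypothetical stand-in for the Trojan's weak 16-byte key generator.
    return hashlib.md5(struct.pack("<I", seed)).digest()


def derive_final_key(primary):
    # Hypothetical stand-in for the primary -> secondary -> tertiary -> 32-byte chain.
    # The only property relied on is determinism: same input, same output.
    return hashlib.sha256(primary).digest()


def build_table(count):
    # Map each 32-byte final key back to the 16-byte primary key behind it.
    table = {}
    for seed in range(count):
        primary = candidate_primary_key(seed)
        table[derive_final_key(primary)] = primary
    return table


def parse_record(record):
    # Assumed layout: 32-byte final key, 4-byte little-endian size, encrypted data.
    final_key = record[:32]
    (size,) = struct.unpack_from("<I", record, 32)
    return final_key, record[36:36 + size]


def recover_primary_key(table, record):
    # Returns the 16-byte encryption key, or None if the key was never generated.
    final_key, _ = parse_record(record)
    return table.get(final_key)


table = build_table(5000)
sample = derive_final_key(candidate_primary_key(1234)) + struct.pack("<I", 3) + b"abc"
recovered = recover_primary_key(table, sample)
```

The lookup only succeeds for keys that were actually generated, which is why the weakness in the key generator matters: it keeps the set of possible keys small enough to enumerate.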
So do I need to download that configuration file, decrypt it myself, and then pass it to that tertiary function? No, I'll just let the Trojan run naturally, like I just did. I had a breakpoint set. I just logged into the website, and the Trojan did all of that itself by the time I get to this point. And then we're going to select the function here. It's gonna loop through 1,000 times or 5,000 times. And it's gonna generate this binary file on disk, which contains a mapping of the 16-byte primary keys to the 32-byte final keys. And then all of those logs that we recovered from the website, we're gonna process them, look for the 32-byte number, and say, hey, is this 32-byte number included in that list of 5,000 that we generated? If so, give me the corresponding 16-byte encryption key. And at that point, we've already got the decryption code written, so we'll just send it through. And like I said, that was capable of decoding 100% of the logs that we found. So that's all for the demos. And Greg's gonna wrap up with some utilities and some other things that you can do that are fun using a scripted debugger. Okay, so I mentioned earlier in the presentation utility functions and recon scripts. One thing you can do is look for shellcode in an arbitrary file. So let's say you have a JPEG or PDF, a doc file, whatever you think might have shellcode in it. Well, shellcode's gonna have certain characteristics, such as it's gonna jump somewhere within the memory that it has and then start executing. It's kind of a trampoline. So one thing you can do is write a quick script that starts at byte zero and disassembles the file in memory. Check to see if it's a jump or a call or whatever. If it is, then follow it to that location and see if that location's inside the memory that was loaded for the file. See if that instruction makes sense.
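A rough version of that scan can be written without a debugger at all. The sketch below is a simplification and not the script from the talk: instead of a full disassembler it only recognizes three x86 branch encodings (`E8` call rel32, `E9` jmp rel32, `EB` jmp rel8), and "does the instruction make sense" is reduced to rejecting targets whose first byte is a privileged opcode such as `in`, `out`, `hlt`, `cli`, or `sti`. The function names are hypothetical.

```python
import struct

# First bytes of instructions that user-mode shellcode is unlikely to start with:
# in/out variants, hlt, cli, sti.
PRIVILEGED = {0xE4, 0xE5, 0xE6, 0xE7, 0xEC, 0xED, 0xEE, 0xEF, 0xF4, 0xFA, 0xFB}


def branch_target(data, off):
    # Return the file offset a branch at `off` would land on, or None.
    op = data[off]
    if op in (0xE8, 0xE9) and off + 5 <= len(data):   # call/jmp rel32
        (rel,) = struct.unpack_from("<i", data, off + 1)
        return off + 5 + rel
    if op == 0xEB and off + 2 <= len(data):           # jmp rel8
        (rel,) = struct.unpack_from("<b", data, off + 1)
        return off + 2 + rel
    return None


def find_candidates(data):
    # Try every offset; keep branches that land inside the file on a
    # byte that is not a privileged opcode.
    hits = []
    for off in range(len(data)):
        target = branch_target(data, off)
        if target is None or not (0 <= target < len(data)):
            continue
        if data[target] in PRIVILEGED:
            continue
        hits.append((off, target))
    return hits


# jmp short +2 at offset 2, landing on "mov eax, fs:[0x30]" at offset 6
blob = b"\x00\x00" + b"\xEB\x02" + b"\x90\x90" + b"\x64\xA1\x30\x00\x00\x00"
candidates = find_candidates(blob)
```

On real files a scan like this will report many coincidental hits, so the output is a list of places to look at, not a verdict.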
So if you do jump within memory and you land on an out or an in or some other privileged instruction, odds are that what you have is just happenstance. It's a curiosity. But if it does jump to a valid instruction, then odds are you have shellcode in your file. So here's what that looks like, real quick. All right, so this is the script Mike wrote. What did it do? It went through a, I think this was a JPEG, right? Oh no wait, it was a malicious PDF. Oh sorry, PDF. It went through a PDF and did this process. It started at the top and disassembled. If it was a jump or a call, it checked to see where it went, and if that was within the memory range, it checked to see what instruction was next. As you can see, it'll produce a lot of data, not necessarily all of it useful. But if you look at the highlighted portion here for this demo, you can see that the jump landed within the memory range of the file that was loaded, and it looks like a valid instruction. It's doing a move from a known NT data structure. So odds are that's your shellcode. So if you jump to that location using this script, you can disassemble it, and you can see that odds are pretty good that's shellcode. So I think we also put this on the disc too, right? This script's on the disc? Yeah, I think so. All right, we're running out of time. Is this yours? Yeah, sure. Another cool thing you can do, since you've got access to that user-mode memory: if you want to look for trampoline hooks, you can obviously do that with GMER or some other rootkit detection tool, but assume that you want to find out exactly what the purpose of those API hooks is. So GMER will tell you, hey, there's an API hook on InternetReadFile and it jumps to 0x8d0000. But does that really tell you what the purpose of the hook is? So with some scripts in the debugger, you can, number one, enumerate the same trampoline hooks as GMER. And you can also set a breakpoint on the destination of the trampoline hooks.
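Enumerating trampoline hooks comes down to reading the first few bytes of each API function and asking whether they transfer control somewhere else. The sketch below shows that check on raw bytes, assuming 32-bit x86 and only three patterns: `jmp rel32`, `call rel32`, and `push imm32` followed by `ret`. The function names and the sample addresses are hypothetical, and in a debugger script the prologue bytes would be read from process memory instead of being passed in. The returned destination is where a breakpoint would go.

```python
import struct


def hook_destination(prologue, address):
    # Return the address the prologue redirects to, or None if it looks unhooked.
    if len(prologue) >= 5 and prologue[0] in (0xE8, 0xE9):    # call/jmp rel32
        (rel,) = struct.unpack_from("<i", prologue, 1)
        return (address + 5 + rel) & 0xFFFFFFFF
    if len(prologue) >= 6 and prologue[0] == 0x68 and prologue[5] == 0xC3:
        (dest,) = struct.unpack_from("<I", prologue, 1)       # push imm32; ret
        return dest
    return None


def find_hooks(functions):
    # functions maps a name to (address, first bytes of the function).
    hooks = {}
    for name, (address, prologue) in functions.items():
        dest = hook_destination(prologue, address)
        if dest is not None:
            hooks[name] = dest
    return hooks


clean = b"\x8b\xff\x55\x8b\xec\x83"     # mov edi, edi; push ebp; mov ebp, esp; ...
push_ret = b"\x68\x00\x00\x8d\x00\xc3"  # push 0x008d0000; ret
hooks = find_hooks({
    "unhooked_function": (0x77000000, clean),
    "hooked_function": (0x77001000, push_ret),
})
```

This is a heuristic: a legitimate function can occasionally begin with a jump, so a hit is a reason to set a breakpoint and look, not proof of a hook.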
And you can just use the browser as you normally would, wait for those hooks to get triggered, and then you can debug it. Find out exactly what it's doing with the parameters that get sent to InternetReadFile. That's pretty easy; it's just the normal way of detecting trampoline hooks, looking at the prologue of the legitimate function. Does it contain a jump to some arbitrary location in memory, or does it call some arbitrary location in memory? So it's pretty easy to do that. This is an example of, I think this is Silent Banker, again. As you can see over there under opcode, that's the instruction at the prologue of the respective function on the left-hand side. So wininet.HttpSendRequest down there at the bottom would normally contain probably a push ebp or something like that. But in this case, it contains a push of the random address, well, not random, but the address of the hook code in memory that you're gonna wanna follow if you're trying to debug it and find out what it does. So we're pretty much done here. But there are a few things you need to be careful of whenever you're scripting. First and foremost, as we pointed out, scripting may not actually be the best solution for you. The wirestrikes, for example. So know what you wanna do, know what you want to accomplish, and that will tell you if you can script it or not. So if you're giving something to a customer, odds are you're gonna have to do some manual or assisted reversing, and good luck with that. The next thing is, kernel space doesn't script well. You can try, but we highly recommend that you look at things more in userland. Certain calls to kernel space actually happen in user space, so hook those instead. And it's important to understand that you're dealing with a script here. You didn't compile this. It's interpreted, and interpretation takes time.
Some of the demos we had, like, I think, the Kraken domain name one, actually run for quite a bit of time, so don't be surprised if your script does run for a while. Like we said, go to lunch. Yeah. And finally, as you are running your malware in a debugger, it may be aware of that, and may be a little pissy about it. So there are tricks, obviously, that you can do to make your debugger a little less visible to the malware, so you might wanna try those. So at that point, we are done. So, can you guys feel that? Go.