So, welcome back everyone for the second talk of the day. Just before I make the introductions, a few reminders. If you have questions, you can ask on app.sli.do and the code is hashtag nsec20. Questions can be upvoted, so make sure to read the questions before you ask your own. Also make some noise on Twitter, the handle is at nsec underscore io. We always like to have attendees make a little bit of noise for us. And also make sure to take note of the code of conduct. It's something really important for us that we enforce quite strictly, and we like people to read it because we spent a lot of time writing it. So without further ado, let me introduce our next speaker. His name is Holger Unterbrink and Holger is working for Cisco Talos, the threat research organization of Cisco. Their goal is to find and reverse engineer new unknown malware campaigns, and his team uncovered attacks like NotPetya, WannaCry, DNSpionage, Sea Turtle and many more. He frequently presents at international, internal and external conferences, for example the Microsoft Digital Crimes Consortium, Google's annual reverse engineering meeting, ISC, the fourth International Conference on Cybersecurity and Privacy Balkan, BSides Munich, secIT Germany, Cisco Live and many more. His talk is going to be Dynamic Data Resolver, an IDA plugin extending IDA with dynamic data. So thank you, Holger, and please proceed.
Hi, and welcome to the live stream of my presentation, Dynamic Data Resolver. My name is Holger Unterbrink and I'm a security researcher at Cisco Talos, mainly looking into malware research, threat hunting and tool development. I'm based in Germany and if you'd like to follow me on Twitter, you can find me at hunterbr72. Today I would like to introduce you to a new tool which I have developed in the last months and which I'm releasing next week, the Dynamic Data Resolver.
Dynamic Data Resolver, or DDR, is mainly an IDA plugin whose goal is to resolve dynamic values, like registers or memory values, at runtime by using instrumentation. So if you see something like call ESI and you want to know the value of ESI, you can use DDR to resolve it. To guide you through the main features of DDR, let me proceed with the one which I've already mentioned, finding dynamic values. You can not only resolve the absolute values which are stored in a certain operand or in a certain register, like you have seen before. You can also resolve pointers and pointers to pointers which are stored in these operands, and you can get the memory they are pointing to. So if you're analyzing an unpacking routine, for example, and you see that EDI is pointing to a buffer which has MZ at the beginning, then you might be lucky and have found the next stage of the unpacking routine, which constructed the PE header. This is just one example where it can help you to analyze a malware sample statically. Of course, there are hundreds of other cases, like crypto routines and many other things, where it is pretty handy to get these dynamic values inside of your static analysis.
As DDR is instrumenting almost every single instruction of the executable, depending on which command you have picked, you can also easily use it for code coverage. And DDR is not only marking the instructions which were executed, it is also marking them with a certain color depending on how often the instruction was executed. If you see an instruction which is marked light green, you know it was executed once. If you see an instruction with, let's say, dark red, a much warmer color, then you know that it was executed many more times. With this, it helps you to find, for example, crypto routines, which are often looping through the same basic blocks, et cetera.
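To make the coverage colouring a bit more concrete, here is a minimal Python sketch of how execution counts could be mapped to a colour ramp. It only illustrates the idea from the talk (light green for one execution, dark red for the most executed instructions). The RGB values, the logarithmic scale, the function name `heat_color` and the example addresses are all assumptions and are not taken from DDR itself.

```python
import math

LIGHT_GREEN = (144, 238, 144)  # assumed colour for "executed once"
DARK_RED = (139, 0, 0)         # assumed colour for the hottest instruction


def heat_color(count, max_count):
    """Return an (R, G, B) tuple for an execution count, or None if never executed."""
    if count <= 0:
        return None  # not executed: leave the instruction uncoloured
    count = min(count, max_count)
    # Log scale (an assumption), so one very hot loop does not flatten all other counts.
    t = math.log(count) / math.log(max_count) if max_count > 1 else 0.0
    return tuple(round(a + (b - a) * t) for a, b in zip(LIGHT_GREEN, DARK_RED))


# Hypothetical hit counts keyed by instruction address.
hits = {0x401000: 1, 0x401005: 1, 0x401010: 4096, 0x401013: 4096}
hottest = max(hits.values())
colors = {addr: heat_color(n, hottest) for addr, n in hits.items()}
```

In this toy data set the two instructions executed 4096 times end up dark red, which is the kind of visual cue that points at a tight loop such as a decryption routine.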
It is also collecting all the jumps, calls and similar instructions which are touching API calls. So any time one of these instructions tries to access an API call, it is written into this table. And the table, of course, is searchable. So if you're looking for a certain API call like VirtualAlloc or LoadLibrary, you can just hit Control-F and search for it. If you then double-click on the line, it brings you directly to the location in your disassembly where this API call was accessed. So you can see exactly the instruction which tried to access this API call.
Pretty much the same applies for strings. DDR tries to collect interesting memory locations which look like strings, and it is putting all of them in this table, which means you can search for certain strings like MZ or anything else you're interested in. And then again, double-click on the line, and it brings you to the instruction which accessed this string at runtime.
You can also dump buffers in a smart way. You just have to hand over three parameters to DDR: the buffer size, the buffer address, and the location where you want to dump the buffer, or rather when you want to dump the buffer. You're doing that by marking the operand which stores the size, for example, and then you are selecting use marked operand to get buffer size. So at runtime DDR will read the value from this operand, no matter if it is an absolute value, like the C8 you can see here on the slide, or if it is a register or something like that. You're doing the same for the buffer address and for the location where you want to dump the buffer. So if you have something like a VirtualAlloc, you know that the function returns a pointer to the allocated buffer in RAX. So you would mark RAX and pick the use marked operand to get buffer address menu item. Then you're looking in your disassembly for where this buffer is filled with something you're interested in.
You're marking the line, you pick the mark address to dump buffer to file menu item, and you are done. The only thing which is left is that you have to execute the sample and dump the buffer. And of course you can do that multiple times in the executable, so you can dump multiple buffers in one step.
Of course, malware today often comes with anti-analysis checks, and if you want to disable and patch these anti-analysis checks, you have three different options. You can use the NOP out functionality by marking the instructions which you want to disable, and then they are getting NOPed out at runtime. Or if you want to manipulate the control flow, you can just patch certain flags in the EFLAGS register. So if you have a conditional jump, for example something like jump not zero, you can mark the line and click toggle EFLAG at runtime. You would manipulate the zero flag, for example, and toggle it, which means that at runtime it gets the exact opposite value and the jump does the exact opposite of what it would do if you executed the sample in the normal way. And last but not least, you can also completely skip functions if you like. If you have a function, let's say something which detects a virtual machine, you can completely skip it and return a fake return value. That means the other parts of the sample are getting a fake return value which is telling the rest of the sample something like, okay, we didn't find any virtual machine. We can proceed. We don't have to exit. We haven't found any analysis stuff, so no debugging or anything like that. We can just execute all malicious functions. So you can skip the function and fake any return value which you would like.
If all of that is not enough to do a static analysis of the sample, you can also create an x64dbg script.
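The effect of toggling a flag on a conditional jump, as described above for jump not zero, can be illustrated with a small Python model of the relevant EFLAGS bits. This is only a sketch of the x86 semantics (ZF is bit 6, SF is bit 7, OF is bit 11; JNZ is taken when ZF is clear; JLE is taken when ZF is set or SF differs from OF). The helper names are made up for illustration and this is not how DDR implements its patches.

```python
# Bit masks of the flags in the x86 EFLAGS register.
ZF = 1 << 6   # zero flag
SF = 1 << 7   # sign flag
OF = 1 << 11  # overflow flag


def toggle_flag(eflags, mask):
    """Invert a single flag, which is what a 'toggle flag at runtime' patch does."""
    return eflags ^ mask


def jnz_taken(eflags):
    """JNZ (jump not zero) is taken when ZF is clear."""
    return (eflags & ZF) == 0


def jle_taken(eflags):
    """JLE (jump less or equal) is taken when ZF is set or SF differs from OF."""
    zf = bool(eflags & ZF)
    sf = bool(eflags & SF)
    of = bool(eflags & OF)
    return zf or (sf != of)


# A comparison of two equal values sets ZF, so JNZ would fall through.
flags_equal = ZF
jnz_before = jnz_taken(flags_equal)                    # jump not taken
jnz_after = jnz_taken(toggle_flag(flags_equal, ZF))    # jump taken after the toggle

# cmp 3, 5 gives a negative result without overflow: SF set, OF and ZF clear.
flags_less = SF
jle_before = jle_taken(flags_less)                     # jump taken
jle_after = jle_taken(toggle_flag(flags_less, SF))     # jump not taken after the toggle
```

Which flag has to be toggled depends on the jump type: for JNZ it is the zero flag, while for a signed comparison such as JLE it is the sign flag (as long as the zero flag is clear).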
And when you're creating the x64dbg script with DDR, it builds a script which is also applying all the patches which you have set up before, all the patches which I talked about on the previous slide. It is writing them into an x64dbg script, using the x64dbg script language to implement these patches, and then it is breaking at the address which you have highlighted in IDA. So you can just execute the script inside of x64dbg and it will automatically break at the point which you have marked in IDA. And then you can proceed working with x64dbg like you always do.
If you don't like x64dbg, you can also create an executable with an endless loop at the marked address. So you mark a certain address in your disassembly, you create an executable, and the original executable gets patched with an endless loop. DDR is overwriting two bytes, so when you're executing the executable, it is looping forever. That gives you time to attach a debugger to it, whichever debugger you prefer, and then you just have to replace these two patched bytes. You can use the DDR output for that, like you can see here at the bottom of the slide. You put the original bytes back and then you can just proceed debugging with your favorite debugger.
OK, so this is what you can do with DDR. Let me talk a little bit about the architecture of DDR. It is highly recommended to use the plugin on one machine and to use the DDR server and the DynamoRIO client on a separate machine. Keep in mind that we are doing instrumentation and we are really executing the malware. And you probably don't want to execute the malware on the same machine where your IDA and your IDA license are running. So we would highly recommend using two virtual machines, for example, one where you have IDA running and another one where you run the server component and the actual instrumentation DLL.
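As a side note on the endless-loop patch mentioned above, the idea can be sketched in a few lines of Python. The sketch assumes the two overwritten bytes are the common x86 jump-to-self encoding `EB FE`; the talk only says that two bytes are overwritten, so the exact bytes DDR writes are an assumption here, as are the function names and the example byte string. Translating a virtual address into a file offset is left out.

```python
JMP_SELF = bytes([0xEB, 0xFE])  # x86 short jump to itself: an endless loop in two bytes


def patch_endless_loop(image, offset):
    """Return (patched_image, original_bytes) with an endless loop written at offset."""
    if offset < 0 or offset + len(JMP_SELF) > len(image):
        raise ValueError("offset outside image")
    original = image[offset:offset + len(JMP_SELF)]
    patched = image[:offset] + JMP_SELF + image[offset + len(JMP_SELF):]
    return patched, original


def restore_bytes(image, offset, original):
    """Put the saved original bytes back at offset."""
    return image[:offset] + original + image[offset + len(original):]


# Toy code bytes: push ebp; mov ebp, esp; sub esp, 0x10; ret
code = bytes.fromhex("5589e583ec10c3")
patched, saved = patch_endless_loop(code, 3)
restored = restore_bytes(patched, 3, saved)
```

In a real session the restore step happens in the memory of the running process through the attached debugger, not in the file on disk, which is why the original bytes need to be recorded when the patch is made.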
Of course, you can run it on the same box, but again, it is not recommended. The way it works is that the IDA plugin is sending the commands to the DDR API server, and the DDR API server is controlling a command line tool, the DynamoRIO client. This is a client which you can also run standalone. For example, if you want to analyze the sample on a completely air-gapped system and you don't want to access it via IDA, you can just install the DynamoRIO client on that box, which is pretty easy, and you need nothing else on this physical box. It's completely air-gapped. The malware sample gets instrumented, and the client is collecting all the interesting data and writing this data to a JSON file. Later on, you can copy the JSON file to the analyst machine and read it into IDA if you like. And then you can use the IDA plugin pretty much the same way as you would have used it from the beginning. So these are the two options. Either you're running it manually on the command line, or you're doing everything from the IDA plugin and the whole process is fully automatic.
As you have seen on the slide before, I'm using the DynamoRIO instrumentation platform for doing all the instrumentation. The reason for that is that DynamoRIO has an extremely rich API for instrumentation. It comes with a lot of different tools and a lot of different API functions which are helping you a lot when you are trying to implement something like this. So you don't have to think about a lot of underlying issues, for the different architectures, for example. No matter if it is 64-bit or 32-bit, DynamoRIO is able to instrument the binary, whatever kind of sample you have. DynamoRIO also comes with a BSD license, which is pretty nice, so you can easily use it inside of your tools. Another really important point is that it supports self-modifying code. And it can also trace files which are starting threads. It's multi-thread capable.
And it can even trace new processes which your sample is executing. And probably the most important thing is that it is really well documented. I don't really want to compare it to Intel PIN, but it is doing a pretty similar thing, except that I like it a little bit more. And it is, at least from my point of view, a little bit better documented. The installation is also super simple. You just have to unpack a zip file, and that's it. So it has a lot of advantages. And as we already have a PIN tracer inside of IDA, I didn't want to reinvent the wheel, so I picked DynamoRIO for this implementation. DynamoRIO has been around for at least 10 years, I would say. It's pretty stable and, again, comes with a lot of features. And one of the most important features, if you want to analyze malware samples, is probably that it was built from scratch with the idea of being totally transparent to the instrumented sample. Even with this, you can still detect DynamoRIO if you're actively looking for it. But so far, I haven't seen much malware which is actually doing that. Hopefully that doesn't change even after this presentation, but I'm keeping my fingers crossed.
So the way it works is that you are executing a DynamoRIO client by running, for example, drrun.exe, the drrun tool. And you can use your own DLL, which is the main engine of the instrumentation. The DLL is collecting all the data, it is controlling the instrumentation, and it is also writing the data to the JSON files. The way you would execute that on the command line is what you can see here on the slide: drrun.exe, then the DDR DLL, then the DDR DLL config parameters, and then the sample which you want to analyze. And the result, as I mentioned before, is a JSON file. Then you can either import the JSON file into IDA, or you have done this whole process automatically by using the IDA plugin.
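For readers who want to script the standalone run, the command line described above can be assembled like this. DynamoRIO's `drrun` takes the client library after `-c`, followed by the client's own options, then `--` and the target application. The paths below are hypothetical placeholders, and the DDR client options are deliberately left empty because the talk does not spell them out; the helper name `build_drrun_command` is likewise just for illustration.

```python
def build_drrun_command(drrun_path, client_dll, client_options, sample_path, sample_args=()):
    """Build the argument list: drrun -c <client dll> <client options> -- <sample> <sample args>."""
    return [drrun_path, "-c", client_dll, *client_options, "--", sample_path, *sample_args]


# Hypothetical locations; adjust to wherever DynamoRIO, the DDR DLL and the sample live.
cmd = build_drrun_command(
    r"C:\tools\DynamoRIO\bin64\drrun.exe",
    r"C:\tools\ddr\ddr.dll",
    [],                       # DDR's config parameters would go here (not shown in the talk)
    r"C:\samples\sample.exe",
)

# To actually launch it (only inside a disposable analysis VM):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the command as a list rather than one string avoids quoting problems when a path contains spaces.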
This is what the JSON file looks like. You can see that it is collecting pretty much all the registers and all the interesting memory locations or pointers at more or less every single instruction. And as this is, of course, generating a certain overhead, you have several options inside of DDR to only do the instrumentation for certain basic blocks, for example, or for certain instruction ranges inside of your executable. With this, it's still extremely fast, so fast that even time-measuring anti-analysis routines are often not detecting that anything was instrumented.
The whole workflow works like you can see here on the slide. The first thing you have to do, if you want to use the plugin, is to start the DDR server, which is the API server you have seen before. Then you are launching IDA on a different machine and you're picking a command inside of DDR. You just do a right mouse click and execute, for example, a light trace, which is tracing through the whole code segment and collecting a certain amount of data there. This command is sent over an encrypted channel to the server. The server is then executing the DynamoRIO command line and the DLL, and the DLL is generating the JSON file. Finally, the JSON file gets sent to the IDA plugin by the server. And then you can use this data inside of your static analysis by just right-clicking on an operand or on an instruction line and picking one of these menus, so get value for source operand, for example, or get the value of a certain register, whatever you are interested in.
By the way, on the right side of the screenshot you can see something like xax = no data. That doesn't mean that we haven't found any data. It just means that the absolute value which was stored in EAX is 27B6, and this is just an absolute value which is not pointing to anything.
So you know that there was no pointer or pointer to a pointer stored inside of this register. It is just an absolute value which was used in the instruction.
Before we are moving to the demo, let me quickly warn you about a pretty nasty behavior of Windows when it comes to executing Python scripts inside of a command window. Unfortunately, if you're marking any text inside of this command window, Windows will freeze the Python application. So it's not getting executed until you hit Escape, which means that your DDR server is pretty much frozen. And of course, it will not accept any commands from the plugin anymore. So if you are receiving a timeout error or something like that on the IDA plugin side, it is very likely that you accidentally marked or highlighted something in the DDR server output and that has frozen the server. You can either hit Escape a couple of times and the server runs again, or, if that doesn't help, you just do a Ctrl-Z and restart the server. Both work, and then you can proceed with the commands from the right mouse click in the IDA plugin and everything works like before. But just be warned about that one. Try not to mark anything, or if you're marking text in the output window, make sure that you hit Escape a couple of times afterwards.
So a quick final disclaimer. Of course, DDR is not replacing your brain. Of course, you can do stupid things with it. It is quite powerful, so keep in mind that something like patching might be dangerous and could crash the sample. I've also seen malware which is, for example, killing the whole process chain up to explorer.exe. That would also kill the Python server, and the communication would obviously not work anymore. Nevertheless, like any tool, it doesn't fit all samples, but it will fit most of them, and hopefully it will help you with your analysis. Okay, enough about that.
Let's move to the demo and let me show you how the thing looks in real life. Actually, before we are switching to IDA, let me quickly show you the important parts of the source code of the sample which we are going to analyze. The first thing is that the sample is comparing its process name with evilmalware.exe, which means that if its process name is not evilmalware.exe, it will copy an instance of itself to the temp folder and then execute this instance. So the first instance is just exiting, and the second one is recognizing that it is running under the process name evilmalware.exe. So the comparison is not true, and it will just print out this message, new instance running from temp folder, and then proceed with the rest of the code.
Again, I don't want to go through all of the source code, just the important parts for the demo. One thing you should recognize is that A can never be bigger than five. We are doing a mod five here, so there's no chance that A can ever be bigger than five, which means that the following comparison will always be false and it will always print out the message A is not greater than five. At least in theory, because we will see later on during the demo that we can patch this behavior with DDR. The last thing which you should keep in mind for the demo is that after this comparison, we have a dialog box which is asking for a value. This value is getting assigned to A, but the really important part to remember is that there's a dialog box which would stop the execution of the sample, and of course we don't want that. We want to get it executed in one go. So later on we will also NOP out this dialog box and skip it. Okie dokie. With this, let me switch over to IDA.
Okay, so we've loaded the sample into IDA and we move to the location where it is comparing the process name with evilmalware.exe.
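The two checks from the sample's source code that matter for the demo can be restated as a short Python sketch. This is a loose reconstruction from the spoken description only: the original sample is not shown in the transcript, the spelling of the process name is as heard in the talk, and the function names are invented for illustration.

```python
EXPECTED_NAME = "evilmalware.exe"  # name as heard in the talk; exact spelling is an assumption


def needs_relaunch(process_name):
    """True if the sample would copy itself to the temp folder and start that copy."""
    return process_name != EXPECTED_NAME


def takes_greater_than_five_branch(value):
    """Model of the check in the sample: a is reduced mod 5, then compared with 5."""
    a = value % 5   # in Python this is always in the range 0..4
    return a > 5    # therefore never true without patching


# No input value ever reaches the "A is greater than five" branch.
reachable = any(takes_greater_than_five_branch(v) for v in range(-1000, 1000))
```

In C the remainder can be negative for negative inputs, but it still never exceeds four, so the branch stays unreachable unless the comparison result is manipulated at runtime.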
So usually the sample would be executed with a different process name and it would go down this path, where it is just copying itself and launching a second instance of itself. But of course we are not so much interested in this path. We are more interested in the other path and the rest of the sample code. So what we want to do is patch this comparison, and we can do that by toggling the zero flag, for example. It's a jump not zero, so if we are inverting the zero flag, it is doing the exact opposite of what it is supposed to do. So we go to the DDR patch menu, toggle EFLAG, we are picking the zero flag, the ZF flag, click okay, and that's it.
The next thing which we want to do is get rid of the dialog box, as I mentioned before. So we are moving over to that part of the code, we are marking the code, and we can just NOP out all these instructions at runtime. Again, we are moving to the patch menu and we are picking the NOP out marked instructions at runtime option.
So now we can execute another DDR command. For example, we can run a trace if we want to get code coverage. Now we do that. We have to wait a little bit while it sends the command to the DDR server. The DDR server has executed it and sent back the analysis in JSON format. So you can see here, the light trace is done. If we are now moving back to the process check and we highlight all the instructions which were actually traced, we get a nice code coverage, and we can also see that the other path was executed. Even though the process name was not evilmalware.exe, it went down this path and it has executed all these instructions here. And it has also skipped that part at runtime.
Before I'm moving to the next feature, I would like to come back to the warning which I mentioned during the presentation, that you should be very careful with marking the output text of the DDR server.
So for example, if I'm moving over to the DDR server and I'm marking something, it can even be just one character, then as long as something is marked in this window, the process is frozen. And if we now try to execute a trace like we did before, for example, then it will run into a timeout. It takes a little bit and then you see, failed to run trace for segment. If you are running into this and you want to test whether the communication between the server and the IDA side is working, you can always test that with your browser. If you're going to the root directory of the server in your browser, you see that it is timing out. So if we are moving over to the DDR server again, I can hit Escape a couple of times, and you see all the commands are coming back in and the application is now unfrozen again. If we are doing the test again, you see that this counter is now counting up and the communication is working again. So I could now move back to IDA and run the trace again, or whatever I want to do, and the trace will be executed as you've seen before. Done.
Okay, so after we have analyzed the sample and found an interesting buffer which we want to dump to disk, we can do that the way I mentioned during the presentation. So we have, for example, VirtualAlloc and we know that this parameter is the buffer size. So we can just go to the DDR dump menu and we choose that one as the buffer size. Then we are marking RAX, because we know that this contains the pointer which is pointing to the buffer, and we are handing that over to DDR with get buffer address. Now we need to find the location in the code where this buffer is filled with something interesting, which is in this case here. So we are just marking the line, handing over the third parameter to the dump, and we are done. We can now see that DDR has all the parameters which are necessary to dump this buffer.
So now we can finally execute the sample and write the buffer to disk. So again, the sample gets executed and the buffer gets dumped to disk. We can write it somewhere on the desktop, for example, and you can see now that it is the buffer, or rather the string which we have copied into the buffer.
The last thing which I would like to show you is how you can generate x64dbg scripts. As I mentioned before, these scripts include all the patches which you have set up before. To demonstrate that, let me manipulate this comparison where we are checking if A is bigger than five. Again, we are going to the EFLAGS menu, and as this time it is a jump less or equal, we have to toggle the SF flag. Okay, so now we are done and we can generate the x64dbg script. Now it has generated this script and sent it over to the malware side, where the server is running. We can now load the x64dbg debugger, go to scripts and load this file. You can see that it is doing all these patch tricks which you have seen before, and we can execute this script and then move through the different breakpoints. So here we break right before this comparison, and you can see, if I'm stepping over it, that it has manipulated the SF flag and the comparison now thinks that A is greater than five, even though that is theoretically impossible.
Okay, that's it regarding the demo. Let me switch back to the PowerPoint presentation. So I hope you liked the demo, and if you want to see more features, you can watch the YouTube video which I have here at the bottom of the slide. As I mentioned before, the plan is to release the tool next week, and if you don't want to miss the release you can follow me on Twitter. With this we have reached the Q&A section and I'm handing back to the NorthSec folks.
Well, thank you very much, Holger, that was a very technical and in-depth talk. Maybe a bit too technical for me, because I am not so knowledgeable in reverse engineering.
So we'll leave a few seconds, a few minutes, for people to ask questions. The link has already been posted, it's app.sli.do. While waiting, just a quick message: if for whatever reason, and not just for this talk but for any talk, you feel overwhelmed because it's too hard for you, take it as an opportunity to raise your curiosity and learn about these things. I know for a fact that you can quickly get overwhelmed and have a feeling that you're not smart enough for these things, but as you dig in and try these things, there are not a lot of feelings as great as popping your first buffer overflow and having a calculator appear on your screen. So without further ado, let's start with the first question.
So the first question, from an anonymous user: is there any plan to make it work with Ghidra too?
Not at the moment, because my time is limited. I wish I could do that, but it's not too realistic at the moment, to be honest.
Okay, thank you. Also from an anonymous user: does it also support processes which are launching other processes and threads?
Yes and no. The command line tool is supporting that, and the problem is that you cannot log all these files at the same time. Well, you can, but the IDA plugin can't consume them. So what you can do is run the command line tool, and it will generate separate JSON files per process. It will also track all the different threads of the processes. Then you can later on load these JSON files manually into IDA.
Okay, thank you. So I guess this brings us to the next question. What happens if I apply two different patch functions to the same instruction?
That is something you should usually not do, because only one will win. It will probably not crash the DDR process, but it has results which can't be foreseen. So I would not recommend doing that.
Oh, well, I guess it's good to know. Thank you. So the next question, from Humblepian.
Aren't you afraid that malware authors will now try to detect it?
Yeah, a little bit. But it's always a question of whether you want to release something in public or not. Of course, if I had kept it private, it would be more comfortable for myself. But I put a lot of time and effort into it, and a lot of spare time. So I thought it would be nice to share it with people, even taking this risk.
Yeah, no, that's very good. Thank you. So I guess the next question actually follows on from that. Why don't you release it today?
Because there are some organizational issues which I still have to overcome, like how I can release it on the Talos GitHub and so on. So it's more organizational stuff than technical stuff.
So I guess, would you release it with an open source, free license?
Yes, yes, absolutely. I will release it as an open source tool. And it's very likely that I will release it in the middle of next week.
Okay, so not today, but...
Exactly.
Very shortly, thank you. So the next question, from an anonymous user. Can I use DynamoRIO also for fuzzing?
Yes, absolutely. For absolutely every instrumentation task, it's a really recommended tool, from my point of view.
Perfect, thank you. So I'm sorry, they're telling me that the time is over. There is one other question, I'm going to ask it very quickly. Is there a way to make it work with packed malware, or would you need to unpack the malware first?
Absolutely, it's actually the main goal to use it for unpacking malware. What I'm usually using it for is to crack the first stage of the unpacker. Of course, if you have code which is unpacking something and copying something to a buffer which you can't see in your disassembly, then you probably can't use it for the second or third stage. But that heavily depends on the case and on how the malware has implemented the second and third stage. So you can absolutely use it for everything.
And again, this is definitely the goal of the tool, to use it for unpacking malware.
Perfect, that makes it very powerful. Well, thank you, that's all for the Q&A. Again, a warm thank you for your participation in the online event. I wish you a great day and I hope that you can enjoy the rest of the conference.
Thank you, you too, bye-bye.
Thanks everyone. We'll see you in a few minutes for the next talk by BX.